Hello,
I am new to RNA sequencing work. I have raw fastq files (from both extracted RNA and DNA) for the bacterium "Paucibacter toxinivorans strain IM4". I am not able to find a whole reference genome for this strain, but there is a partial 16S sequence available at NCBI:
https://www.ncbi.nlm.nih.gov/nuccore/1031488746
Do I first need to process the DNA fastq files into a whole-genome assembly and then move on to the RNA sequencing analysis? If not, can I use the available partial 16S sequence for alignment?
Can anybody please guide me?
Original title: Whole genome sequencing for a bacterial strain for RNA sequencing
If you have the full genome sequenced, then it'd be preferable to first assemble the genome, annotate it (i.e. determine where the gene regions are) and then align the RNA-seq reads to that annotated assembly. This seems like a pretty good run-down of the different ways to sequence and assemble bacterial genomes; I'm sure there are many more on PubMed.
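To make the order of the steps concrete, here is a rough sketch that only builds the command lines for assemble, annotate, index and align; it does not run anything. The tool choices (SPAdes, Prokka, Bowtie 2) are my assumption and are not named anywhere in this thread, and so are the paired-end layout, the file names and the `work/` directory. Treat it as a starting point to adapt, not as the recommended pipeline.

```python
import shlex


def assembly_cmd(dna_r1, dna_r2, out_dir):
    # De novo assembly of paired-end DNA reads with SPAdes (assumed tool).
    # --isolate is the mode intended for single bacterial isolates in recent releases.
    return ["spades.py", "--isolate", "-1", dna_r1, "-2", dna_r2, "-o", out_dir]


def annotation_cmd(contigs, out_dir, prefix):
    # Gene annotation of the assembled contigs with Prokka (assumed tool).
    return ["prokka", "--outdir", out_dir, "--prefix", prefix, contigs]


def index_cmd(reference_fasta, index_prefix):
    # Bowtie 2 index of the assembly; the index directory must already exist.
    return ["bowtie2-build", reference_fasta, index_prefix]


def align_cmd(index_prefix, rna_r1, rna_r2, sam_out):
    # Alignment of paired-end RNA-seq reads against the indexed assembly.
    return ["bowtie2", "-x", index_prefix, "-1", rna_r1, "-2", rna_r2, "-S", sam_out]


def build_pipeline(dna_r1, dna_r2, rna_samples, work_dir="work", prefix="IM4"):
    """Return the commands in the order they would be run.

    rna_samples maps a sample name to its (R1, R2) fastq pair.
    All paths and names here are hypothetical placeholders.
    """
    assembly_dir = f"{work_dir}/assembly"
    contigs = f"{assembly_dir}/contigs.fasta"  # SPAdes writes contigs.fasta into its output dir
    index_prefix = f"{work_dir}/index/{prefix}"
    cmds = [
        assembly_cmd(dna_r1, dna_r2, assembly_dir),
        annotation_cmd(contigs, f"{work_dir}/annotation", prefix),
        index_cmd(contigs, index_prefix),
    ]
    for name, (r1, r2) in rna_samples.items():
        cmds.append(align_cmd(index_prefix, r1, r2, f"{work_dir}/{name}.sam"))
    return cmds


if __name__ == "__main__":
    samples = {
        "control": ("control_R1.fastq", "control_R2.fastq"),
        "treatment": ("treatment_R1.fastq", "treatment_R2.fastq"),
    }
    for cmd in build_pipeline("dna_R1.fastq", "dna_R2.fastq", samples):
        print(shlex.join(cmd))
```

Printing the commands first lets you check paths and options before handing them to a shell or to `subprocess.run`. Read counting per gene and the control vs. treatment comparison would come after the alignment step and are left out here.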
If I interpret correctly, you're saying that you have bulk RNA-seq data (= potentially all transcripts of that one bacterial strain), which is very different from ribosomal DNA (!) sequencing, which is usually applied to a MIX of different bacteria and is typically used simply to identify the different species present in the mix. I don't see how your data set would benefit from focusing on rRNA genes.
That being said, why are you looking at that data set to begin with?
Thank you very much for the reply. I have RNA-seq data (in fastq format) from a control and a treatment group for the strain. I also have the DNA sequence data for the same pure culture. Do you mean (please correct me if I am wrong) that I have to first assemble and annotate the DNA sequences (obtained in fastq format) of the strain, and then use that assembly as the reference genome to align the RNA sequences (obtained in fastq format) to?
There is one genome available for this bacterium. It may not be your exact strain but it may work for a start.
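One way to judge whether a related-strain genome "works for a start" is to align a subset of the RNA-seq reads to it and look at the overall alignment rate. The snippet below is a small sketch that assumes the aligner is Bowtie 2, which ends its summary (written to stderr) with a line like `87.25% overall alignment rate`. The function name and the example log text are made up for illustration, and I am deliberately not suggesting a cutoff, since what counts as good enough depends on the data.

```python
import re


def overall_alignment_rate(bowtie2_summary: str) -> float:
    """Extract the overall alignment rate (in percent) from a Bowtie 2 summary."""
    match = re.search(r"([\d.]+)% overall alignment rate", bowtie2_summary)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found in the summary")
    return float(match.group(1))


if __name__ == "__main__":
    # Hypothetical summary text; real numbers will differ.
    example_summary = (
        "10000 reads; of these:\n"
        "  10000 (100.00%) were paired; of these:\n"
        "87.25% overall alignment rate\n"
    )
    print(overall_alignment_rate(example_summary))
```

A rate that is clearly lower against the related genome than against your own assembly would suggest that the strains differ enough to justify assembling your own reference.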
Contact the project PI for the scripts/pipelines used in the DSM 16998 genome assembly and its annotation (https://genome.jgi.doe.gov/portal/PautoxDSM16998_FD/PautoxDSM16998_FD.info.html), or team up with a local bioinformatician.