Question

Cross-species RNA-seq analysis in bacteria


7.7 years ago

devikaparvathy ▴ 50

Hi,

I am working on microbial genomics, and there are a couple of datasets in SRA/ENA that I can use for my work. I want to combine these datasets in a single study, but the problem is that they were all generated from different subspecies of S. aureus.

I tried creating a common reference genome annotation following the methodology of LoVerso and Cui (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4668955/), but only around 200 genes mapped in common between the two species.

Is there any other method by which I can map the homologs of a second genome to the primary reference annotation and carry out an integrated analysis in a single go? (That would give more replicates under the same condition and increase the reliability of the study.)

Or am I supposed to run a separate differential expression analysis for each dataset and then compare the resulting gene lists?
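
To make the second option concrete, here is a minimal sketch of comparing per-dataset results, assuming each dataset already has its own list of differentially expressed genes and a table assigning each strain-specific locus tag to a shared ortholog-group ID. The function name, locus tags, and group IDs are all hypothetical and only serve as illustration.

```python
def shared_de_genes(de_lists, ortholog_maps):
    """Return ortholog-group IDs called differentially expressed in every dataset.

    de_lists:      {dataset: set of strain-specific locus tags called DE}
    ortholog_maps: {dataset: {locus tag -> shared ortholog-group ID}}
    """
    translated = []
    for dataset, genes in de_lists.items():
        mapping = ortholog_maps[dataset]
        # Genes without an ortholog assignment are dropped at this step.
        translated.append({mapping[g] for g in genes if g in mapping})
    return set.intersection(*translated) if translated else set()


# Hypothetical locus tags and ortholog-group IDs, for illustration only.
de_lists = {
    "strain_A": {"A_0001", "A_0002", "A_0003"},
    "strain_B": {"B_0101", "B_0102"},
}
ortholog_maps = {
    "strain_A": {"A_0001": "OG1", "A_0002": "OG2", "A_0003": "OG3"},
    "strain_B": {"B_0101": "OG1", "B_0102": "OG3", "B_0103": "OG2"},
}
print(sorted(shared_de_genes(de_lists, ortholog_maps)))
```

This only intersects gene lists, so it ignores effect sizes and direction of change; it is the simple comparison, not an integrated analysis.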

RNA-Seq multiple datasets bacteria • 2.0k views



Hi, thank you for your reply. But I have a doubt: if I create a new consensus reference genome and carry out the analysis, will all the corresponding genes be mapped correctly?

7.7 years ago by devikaparvathy ▴ 50


My aim is to do an integrative analysis of certain public RNA-seq data available for a particular bacterial species. But each experiment was done in a different strain/subspecies.

What I plan to do is align the reads to their respective reference genomes and, for further analysis, create an annotation file (GFF/GTF) based on one selected subspecies (the chosen "target" for liftover), combined with the mapped annotation of the other subspecies (the "source" for liftover).

Is this procedure right? Or are there any alternatives? I do not wish to do all the RNA-seq analyses separately and then simply compare the lists of differentially expressed genes.
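
The combination step in the plan above could be sketched roughly as follows, assuming per-sample gene counts are already available for each strain and that the liftover yields a table from "source" gene IDs to "target" gene IDs. All names and IDs here are hypothetical, sample names are assumed to be unique across strains, and keeping only one-to-one mappings is my own simplification.

```python
from collections import Counter


def one_to_one(source_to_target):
    """Keep only source genes whose target gene is hit by exactly one source gene."""
    hits = Counter(source_to_target.values())
    return {s: t for s, t in source_to_target.items() if hits[t] == 1}


def combine_counts(target_counts, source_counts, source_to_target):
    """Merge per-sample counts onto target gene IDs.

    target_counts: {sample: {target gene ID: count}}
    source_counts: {sample: {source gene ID: count}}
    Returns (genes, matrix), where matrix is {sample: [counts in gene order]}.
    """
    mapping = one_to_one(source_to_target)
    # Rename source genes to their target IDs; unmapped genes are dropped.
    lifted = {
        sample: {mapping[g]: c for g, c in counts.items() if g in mapping}
        for sample, counts in source_counts.items()
    }
    all_samples = {**target_counts, **lifted}
    # Keep only genes that are present in every sample.
    genes = sorted(set.intersection(*(set(c) for c in all_samples.values())))
    matrix = {s: [c[g] for g in genes] for s, c in all_samples.items()}
    return genes, matrix


# Hypothetical counts and liftover table, for illustration only.
target_counts = {"T_rep1": {"T1": 10, "T2": 5, "T3": 0}}
source_counts = {"S_rep1": {"S1": 7, "S2": 3, "S3": 9, "S4": 1}}
source_to_target = {"S1": "T1", "S2": "T2", "S3": "T2", "S4": "T3"}

genes, matrix = combine_counts(target_counts, source_counts, source_to_target)
print(genes)
print(matrix)
```

Since strain and study are confounded in a design like this, the combined matrix would presumably need a strain or batch term in whatever model is used downstream.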

7.7 years ago by devikaparvathy ▴ 50

score 0 · Answer 1 · 2017-10-16

Hey,

I had a similar design in a recent study on a different bacterial species.

One program that works quite efficiently and produces good results is Rockhopper: http://cs.wellesley.edu/~btjaden/Rockhopper/

Rockhopper will allow you to de novo assemble a consensus genome from whatever data you provide, and it then also performs differential expression analysis. Its output is actually just a FASTA sequence plus expression levels and various statistical parameters. You can then BLASTx these FASTA sequences in order to infer functionality.

Trust that this helps, Kevin