Hey all,
I have a few RNA-seq samples (healthy and diseased). I have already aligned them to the human hg19 reference using tophat2. Now I am trying to align the reads to a bacterial genome to find out whether the samples also contain bacterial sequences. Which tool should I use for this? I did try tophat2, but is there a better one?
Do you already know the particular bacteria that you want to align against, or are you still trying to determine that? For what it's worth, bacteria tend to not have splicing, so you can often get away with directly using bowtie2/bwa/etc.
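In case it helps, here is a minimal sketch of what the direct, unspliced route could look like with bowtie2, wrapped in Python so the commands can be inspected before anything runs. The file names (`FM211187.fasta`, `diseased.fastq`, and so on), the index name, and the thread count are placeholders rather than anything from your setup, and single-end reads are assumed (`-U`); paired-end data would use `-1`/`-2` instead.

```python
import shlex


def bowtie2_commands(fasta, index_base, reads, sam_out, threads=4):
    """Build the argument lists for an unspliced bowtie2 alignment.

    All paths are supplied by the caller; nothing is executed here.
    """
    # Step 1: index the bacterial reference (bowtie2-build <fasta> <index_base>).
    build = ["bowtie2-build", fasta, index_base]
    # Step 2: align single-end reads against that index.
    align = [
        "bowtie2",
        "-p", str(threads),   # number of alignment threads
        "-x", index_base,     # index prefix produced in step 1
        "-U", reads,          # unpaired reads (assumption: single-end data)
        "--no-unal",          # keep reads that failed to align out of the SAM
        "-S", sam_out,        # SAM output file
    ]
    return build, align


# Hypothetical file names, for illustration only.
build_cmd, align_cmd = bowtie2_commands(
    "FM211187.fasta", "spneumoniae", "diseased.fastq", "diseased_vs_spn.sam"
)
print(shlex.join(build_cmd))
print(shlex.join(align_cmd))
```

To actually run the steps, each list can be passed to `subprocess.run(cmd, check=True)`. bowtie2 prints its alignment summary to stderr, so that is where the overall and unique alignment rates show up.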
Yes, I am using Streptococcus pneumoniae ATCC700669 (FM211187). I downloaded the FASTA file, built the index with bowtie-build, and then mapped the reads using tophat2. The result that I got for the diseased sample is:
And for the healthy sample (I was just experimenting to see what comes out):
Just from looking at these results, would you say that bacterial sequences are present in the samples?
I just saw this paper mentioned on Twitter (it literally just came out). The paper and some of the references therein may be of interest to you. It describes one of the iobio tools, which are always really slick.
Our internal threshold for calling a sample contaminated is 0.5% unique alignments, so I guess the diseased sample is borderline. I don't know where the samples were sourced from, so you might not expect a high amount of the bugs in the samples, even if the patient had them.
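For what it's worth, the check itself is just a percentage comparison. A minimal sketch, assuming "unique alignments" means uniquely aligned reads as a percentage of all reads in the sample, and assuming a sample sitting exactly at the cutoff counts as contaminated (both are assumptions, not a fixed rule). The read counts in the example are made up for illustration and are not taken from this thread.

```python
def unique_alignment_rate(uniquely_aligned, total_reads):
    """Percentage of all reads that aligned uniquely to the bacterial genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * uniquely_aligned / total_reads


def is_contaminated(uniquely_aligned, total_reads, threshold_pct=0.5):
    """Apply the 0.5% cutoff (assumption: at or above the cutoff counts)."""
    return unique_alignment_rate(uniquely_aligned, total_reads) >= threshold_pct


# Made-up read counts, for illustration only.
examples = {
    "sample_A": (60_000, 10_000_000),
    "sample_B": (10_000, 10_000_000),
}
for name, (unique, total) in examples.items():
    rate = unique_alignment_rate(unique, total)
    print(f"{name}: {rate:.2f}% unique, contaminated={is_contaminated(unique, total)}")
```

The unique and total read counts can be read off the aligner's summary output for each sample.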
You could use SNAP or Bowtie2 to align the reads against bacterial genomes from NCBI. There are pipelines built for this, but setting one up would be tedious if identifying pathogens in the data is not your main goal.
http://chiulab.ucsf.edu/surpi/