8.2 years ago
dazhudou1122
Hey guys,
I am trying to map RNA-Seq reads of bacteria in human tumor tissues to a reference genome. Then I realized (from published data) that many of the bacteria are clinical isolates, and their genomic sequences might be drastically different from the ATCC reference genome. Therefore, I am wondering whether I can cluster all of the genomes that belong to this genus (e.g. Clostridia cluster IV) and map my RNA-Seq reads to them. If that is possible, how do I obtain read counts and run a differential expression analysis?
Thanks guys, any input is appreciated!
Is there a precedent for that? If the sequence is drastically different then it may not be the same species. How did you separate the bacterial reads and are you mapping those to just the bacterial reference?
"Drastic" might be a bit of an exaggeration, but the differences are more than a few base pairs, and some genes that these bacteria have are not present in a single reference genome (for example C. symbiosum). But if you BLAST those genes, you will fish out matching genes in other Clostridia species.
Do you expect to have a mixture of bacterial species in your sample? I am not sure how you are going to be able to uniquely align and count reads if that is the case.
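To make the counting problem concrete, here is a minimal sketch of what happens when reads are mapped against several related genomes at once. It assumes you already have one (read, genome) pair per reported alignment, for example parsed from the aligner output. The function name `count_reads` and the read and genome names are hypothetical, and splitting a multi-mapped read evenly between genomes is just one simple choice for illustration, not what any particular tool does.

```python
from collections import defaultdict

def count_reads(alignments):
    """Count reads per genome from (read_id, genome_name) pairs.

    Returns two dicts:
      unique     - reads that hit exactly one genome
      fractional - every read, split evenly across the genomes it hits
    """
    # Collect the set of genomes each read aligned to.
    hits = defaultdict(set)
    for read_id, genome in alignments:
        hits[read_id].add(genome)

    unique = defaultdict(int)
    fractional = defaultdict(float)
    for genomes in hits.values():
        if len(genomes) == 1:
            unique[next(iter(genomes))] += 1
        # Illustrative assumption: share a multi-mapped read equally.
        share = 1.0 / len(genomes)
        for genome in genomes:
            fractional[genome] += share
    return dict(unique), dict(fractional)

# Hypothetical example: read2 aligns equally well to both genomes.
example = [
    ("read1", "C_symbiosum_isolateA"),
    ("read2", "C_symbiosum_isolateA"),
    ("read2", "C_symbiosum_ATCC"),
    ("read3", "C_symbiosum_ATCC"),
]
unique, fractional = count_reads(example)
print(unique)
print(fractional)
```

The more similar the genomes in the reference set, the larger the gap between the unique and fractional counts, which is why uniquely assigning reads in a mixed sample is hard.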