10.5 years ago
jobinv
This article seems to have an interesting approach; if I understand it correctly, they first map their human reads to the human genome, and then take the unmapped reads to find out whether they map to virus databases. If they do, then they seem to suggest that one could theorize that the virus could be implicated in the pathogenesis of the disease studied. Is this kind of conclusion warranted from such a finding?
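For concreteness, the "take the unmapped reads" step can be sketched in a few lines of Python. This is only an illustration, not the paper's code: the function name `unmapped_read_names` and the toy SAM records are made up. The one fact it relies on is that a SAM record whose FLAG field has the 0x4 bit set is an unmapped read.

```python
def unmapped_read_names(sam_lines):
    """Return names of reads whose SAM FLAG has the 0x4 (unmapped) bit set."""
    names = []
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x4:
            names.append(fields[0])  # column 1 is the read name
    return names


# Toy records, invented for illustration: read2 failed to align to the host.
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCGGTA\tIIIIIIIIII",
    "read3\t16\tchr2\t500\t30\t10M\t*\t0\t0\tGGCATTACGA\tIIIIIIIIII",
]
candidates = unmapped_read_names(toy_sam)
```

Only the reads collected this way would go on to be aligned against the virus database; a hit there is a candidate for follow-up, not evidence of causation.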
Well, it's a first step. Obviously a lot of follow-up experiments and validation are needed to establish the link. As an initial screening step it is reasonable (and has been used many times by various groups).
I haven't read the paper, but is there an explanation for why they use bowtie(2) and not tophat(2)? AFAIK bowtie is not capable of resolving splice junctions. Therefore some of the unmapped reads will be human reads that span those junctions...
True, or you could throw in filtering against a cDNA reference as well. Bowtie2 is much faster than TopHat2, so maybe that's the reason.
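A rough sketch of what chaining those filters could look like, assuming single-end reads and prebuilt Bowtie2 indexes. The index names (`hg_genome`, `hg_cdna`, `viruses`), the output file names, and the helper `subtraction_commands` are hypothetical, and this is not the pipeline from the paper. It only builds the command lines; the Bowtie2 options used (`-x`, `-U`, `-p`, `-S`, `--un`, `--no-unal`) are standard ones.

```python
def subtraction_commands(reads_fq, host_indexes, virus_index, threads=4):
    """Build bowtie2 commands: subtract each host reference in turn, then map to viruses."""
    cmds = []
    current = reads_fq
    for i, index in enumerate(host_indexes, start=1):
        leftover = f"unmapped_after_host_{i}.fq"  # hypothetical file name
        cmds.append([
            "bowtie2", "-p", str(threads),
            "-x", index,         # host index (genome first, then cDNA)
            "-U", current,       # unpaired input reads
            "--un", leftover,    # write reads that failed to align here
            "-S", "/dev/null",   # host alignments themselves are discarded
        ])
        current = leftover       # the next stage only sees the leftovers
    cmds.append([
        "bowtie2", "-p", str(threads),
        "-x", virus_index,
        "-U", current,
        "--no-unal",             # keep only reads that hit a viral sequence
        "-S", "virus_hits.sam",
    ])
    return cmds


cmds = subtraction_commands("reads.fq", ["hg_genome", "hg_cdna"], "viruses")
for cmd in cmds:
    print(" ".join(cmd))
```

Each command could be run with `subprocess.run(cmd, check=True)`. Putting the cDNA index second means that reads spanning splice junctions, which the unspliced genome alignment misses, are removed before the virus step.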
I am also trying to understand the pipeline used. They mention that it works with RNA-seq data. However, RNA-seq data will have all the introns spliced out. So what about viral integrations that occurred in intronic regions?