What are the most common viral DNA sequences found in human or other mammalian resequencing samples? I have seen that there is an NCBI RefSeq viral genomes link, but what I would like to know is which viruses commonly turn up in normal next-generation sequencing experiments. http://biostar.stackexchange.com/questions/13679/ncbi-refseq-viral-genomes
Cool question. That's the kind of test I ran a few days ago. I ran BWA with my reads (samples taken from blood) against RefSeq. My main hit was EBV. Running BWA against EBV alone returned a few good but non-overlapping (depth=1) paired-end hits.
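For anyone who wants to repeat this kind of check, here is a minimal sketch of how the per-virus hit counts could be tallied from an aligner's SAM output. It is not the script used above: the function name `count_hits_per_reference`, the `min_mapq` cutoff of 20, and the toy reference names `virus_A` and `virus_B` are all assumptions made for illustration.

```python
from collections import Counter


def count_hits_per_reference(sam_lines, min_mapq=20):
    """Tally primary, mapped alignments per reference name from SAM text lines.

    min_mapq is an illustrative cutoff, not a recommended value.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # SAM has 11 mandatory columns
            continue
        flag = int(fields[1])
        rname = fields[2]
        mapq = int(fields[4])
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        if rname == "*" or mapq < min_mapq:
            continue
        counts[rname] += 1
    return counts


# Toy records with hypothetical reference names; real input would be the
# lines of a SAM file produced by aligning reads to the viral genomes.
example_sam = [
    "@SQ\tSN:virus_A\tLN:1000",
    "r1\t0\tvirus_A\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tvirus_A\t500\t5\t4M\t*\t0\t0\tACGT\tIIII",   # low MAPQ, dropped
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",             # unmapped, dropped
    "r4\t256\tvirus_B\t20\t60\t4M\t*\t0\t0\tACGT\tIIII",  # secondary, dropped
    "r5\t0\tvirus_B\t20\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
hits = count_hits_per_reference(example_sam)
for name, n in hits.most_common():
    print(name, n)
```

Filtering on mapping quality and dropping secondary alignments is one way to keep low-complexity or multi-mapping reads from inflating a virus's count. A handful of depth=1 hits like the ones described above would still deserve a manual look before calling the virus present.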
One could take those RefSeq viral genomes as queries and search against either the assembled reads from large genome projects (1000G, Complete Genomics) or even the trace archive.
Of course, the source of the sample matters a lot in this case. Most DNA for human sequencing projects is taken from blood, but not all viruses are found in blood cells. Many more virus genomes will be found in human gut microbiome data than in data from any other source, with the possible exception of the oral microbiome.
Maybe. Depends when that was done and how much new data (to the SRA and to the viral RefSeq db) have been added since. Such analysis likely needs to be updated regularly. Perhaps it is now your turn to run the analysis.
I used to align long-read assemblies/BACs/fosmids to viral genomes. As I remember, I did not find any significant hits except EBV, the virus used for constructing cell lines. I once worked with a sample from an oral scrape. There was certainly viral contamination, but I simply ignored it as I was not doing a metagenomics project. I have not tried this on blood samples. When you work with ancient bones, the pattern of contamination will be vastly different. As Larry said, it all depends on the source of the DNA. I do not think there is anything in common, generally speaking. If you want to know the "common" viruses, you will have a better chance of getting the answer from metagenomics projects.