Illumina PE Reads Mapping To Reference Genome
11.8 years ago

I have a set of around 300 metagenomic samples of paired-end (PE) Illumina reads. Assembling these reads gives me fairly long contigs, and I can map around 80-90% of my reads back to these contigs.

But when I map the same reads to ~5000 complete microbial reference genomes, only a small fraction of the reads (2-8%) map. Even setting aside that I can map my reads back to my own contigs, I am surprised to see a metagenomic sample in which only 2-8% of the reads come from known microbial genomes.

I have used bowtie2 with the default --end-to-end mode as well as with --local.
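For anyone wanting to reproduce the comparison, here is a minimal sketch of how the two bowtie2 modes could be run and the mapping rate pulled out of the summary that bowtie2 prints to stderr. The index name, file names, thread count, and the helper names (`bowtie2_command`, `overall_alignment_rate`, `mapping_rate`) are hypothetical placeholders, not the exact commands used above.

```python
import re
import subprocess


def bowtie2_command(index, reads_1, reads_2, sam_out, local=False, threads=4):
    """Build a paired-end bowtie2 command as an argument list.

    All paths are placeholders; --end-to-end is bowtie2's default mode,
    --local allows soft-clipping of read ends.
    """
    mode = "--local" if local else "--end-to-end"
    return [
        "bowtie2", mode,
        "-p", str(threads),
        "-x", index,        # basename of the bowtie2 index
        "-1", reads_1,      # mate 1 FASTQ
        "-2", reads_2,      # mate 2 FASTQ
        "-S", sam_out,      # SAM output file
    ]


def overall_alignment_rate(summary_text):
    """Extract the 'overall alignment rate' percentage from bowtie2's summary."""
    match = re.search(r"([\d.]+)% overall alignment rate", summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


def mapping_rate(index, reads_1, reads_2, sam_out, local=False):
    """Run bowtie2 (must be installed and on PATH) and return the mapping rate in percent."""
    cmd = bowtie2_command(index, reads_1, reads_2, sam_out, local=local)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return overall_alignment_rate(result.stderr)  # the summary is written to stderr
```

Calling `mapping_rate` once against the contig index and once against the reference-genome index, in each mode, would give the two percentages being compared here.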

Can anyone guess what might be going on?

illumina bowtie2 reference metagenomics • 3.8k views

Did you BLAST your contigs to see what they match?
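As a rough sketch of how that check could be summarized, assuming the contigs were searched with BLAST+ using tabular output (`-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), something like the following keeps the best-scoring hit per contig and lists contigs with no hit at all. The function names and the contig and subject IDs in the usage note are made up for illustration.

```python
def best_hits(blast_lines):
    """Return {contig_id: (subject_id, percent_identity, bitscore)}.

    Expects lines in BLAST+ tabular format (-outfmt 6, default columns)
    and keeps the hit with the highest bitscore for each query contig.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, bitscore = float(fields[2]), float(fields[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, identity, bitscore)
    return best


def contigs_without_hits(contig_ids, best):
    """List the contigs that have no BLAST hit at all, in input order."""
    return [contig for contig in contig_ids if contig not in best]
```

For example, feeding it the lines of a hypothetical `contigs_vs_refs.tsv` together with the full list of contig IDs shows how many contigs have no close relative in the database, which would point toward uncharacterized taxa rather than a mapping problem.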

11.8 years ago
Lee Katz ★ 3.2k

I have two guesses, one computational and one biological. I have no idea if I'm actually right, but hopefully it can get you onto the right path.

  1. You are getting misassemblies. Is there a better assembler to use (e.g. Ray Meta)? Are your parameters too aggressive? You can alter the settings by increasing the overlap length, etc.
  2. Your metagenomic samples contain many new taxa that have not been characterized yet, so they have no representatives among your reference genomes.

+1. My guess is that you are right on both counts, Lee: the contigs contain misassemblies, and there's a lot of biological diversity out there that we haven't tapped yet. This is partly why I think people with metagenomic data should identify their reads first and then take the time to assemble with algorithms designed to deal with extreme diversity.
