Question

Why does the MEGAN LCA algorithm not assign a species to some 1-hit reads?

4.7 years ago

vsingh ▴ 100

I imported a BLAST file with taxon names in the headers (and also tried helping that along with a "synonyms" file). MEGAN seems able to detect the taxon for each read. However, I find plenty of reads that aligned to only one specific species, yet MEGAN places them on an LCA (lowest common ancestor) node.

For example, many reads are displayed as Capreolus capreolus, but several hundred reads with only 1 match have been assigned to the LCA infraorder Pecora. Many of these have high scores, some higher than the scores of reads that were assigned to a species. Why are reads within Pecora not assigned to the specific species when the score is 119 with 100% identity (e.g. Bos taurus)? Some reads assigned to Capreolus capreolus have lower scores and identities below 90%.

Can I change some parameters to achieve better separation, or are the defaults for "naive LCA" accurate, or at least conservative enough? Thanks in advance!

Metagenomics • 1.1k views


Does the 100% identity hit also have 100% coverage? Also check the bit score. One of the first steps in MEGAN is that it takes the highest bit score among a read's hits, let's say 300, and applies a top percentage such as 8%. All hits with a bit score above 300 - (300 * 0.08) = 276 are then used to determine the LCA. You can change those settings. As far as I know, MEGAN is not really "smart": even if you know for sure that you have a perfectly good hit, the algorithm can still place the read one taxonomic level higher.
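To make the arithmetic above concrete, here is a minimal Python sketch of a top-percent filter followed by a naive LCA. It is not MEGAN's code: the function names, the toy `LINEAGES` table (which skips the ranks between Pecora and family), and the inclusive cutoff are all assumptions made for illustration.

```python
# Toy root-to-leaf lineages (hypothetical, heavily abbreviated).
LINEAGES = {
    "Bos taurus": ["Pecora", "Bovidae", "Bos", "Bos taurus"],
    "Capreolus capreolus": ["Pecora", "Cervidae", "Capreolus",
                            "Capreolus capreolus"],
}


def top_percent_hits(hits, top_percent=8.0):
    """Keep hits whose bit score lies within top_percent of the best one.

    hits is a list of (taxon, bit_score) pairs. Treating the cutoff as
    inclusive is an assumption of this sketch.
    """
    if not hits:
        return []
    best = max(score for _, score in hits)
    cutoff = best - best * top_percent / 100.0
    return [(taxon, score) for taxon, score in hits if score >= cutoff]


def naive_lca(lineages):
    """Return the deepest taxon shared by all root-to-leaf lineages."""
    shared = None
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared = ranks[0]
        else:
            break
    return shared


def assign(hits, top_percent=8.0):
    """Filter hits by top percent, then take the LCA of what remains."""
    kept = top_percent_hits(hits, top_percent)
    if not kept:
        return None
    return naive_lca([LINEAGES[taxon] for taxon, _ in kept])


# Best score 300 and 8% give a cutoff of 276, so a second hit at 280
# survives the filter and pulls the read up to the shared ancestor.
print(assign([("Bos taurus", 300), ("Capreolus capreolus", 280)]))
# A second hit at 270 falls below the cutoff and is ignored.
print(assign([("Bos taurus", 300), ("Capreolus capreolus", 270)]))
```

In this sketch the filter only changes the result when more than one hit survives it, so it is worth checking in the BLAST file whether the reads in question really have a single hit.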

4.7 years ago by gb ★ 2.2k