I'm trying to use MEGAN5 to get order-level taxonomy assignments for CO1 reads. I have a set of OTUs in a FASTA file. I downloaded all the CO1 records from NCBI, built a BLAST database from them, and ran a local blastn against it using output format 7 (tabular with comment lines). I can import all of this into MEGAN successfully, but it reports zero assigned reads. I have played with the LCA parameters and am currently using min score=1, max expected=100, min support=1, but these seem to have little effect on the outcome. The BLAST output records look like this:
OTU_1 gi|414148152|gb|JX897403.1| 90.05 211 21 0 1 211 9 219 1e-76 286
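To make explicit which part of such a record is the identifier, here is a minimal Python sketch that splits one hit line into the 12 default BLAST tabular fields and pulls the GI number out of the subject ID. It assumes whitespace-separated columns in the default field order; the helper names `parse_blast_tab_line` and `extract_gi` are made up for illustration and are not part of BLAST or MEGAN.

```python
# Default field order for BLAST tabular output (-outfmt 6 / 7).
BLAST_TAB_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_blast_tab_line(line):
    """Split one tabular hit line into a dict keyed by the default field names."""
    values = line.split()
    if len(values) != len(BLAST_TAB_FIELDS):
        raise ValueError(f"expected {len(BLAST_TAB_FIELDS)} columns, got {len(values)}")
    return dict(zip(BLAST_TAB_FIELDS, values))


def extract_gi(sseqid):
    """Return the GI number from an ID like 'gi|123|gb|ACC.1|', or None if absent."""
    parts = sseqid.split("|")
    if len(parts) >= 2 and parts[0] == "gi" and parts[1].isdigit():
        return int(parts[1])
    return None


# Example record in the same shape as the hit line above.
hit = parse_blast_tab_line(
    "OTU_1 gi|414148152|gb|JX897403.1| 90.05 211 21 0 1 211 9 219 1e-76 286"
)
gi = extract_gi(hit["sseqid"])
print(hit["qseqid"], gi, hit["bitscore"])
```

With format 7, lines starting with `#` are comments and would need to be skipped before calling the parser. The GI number extracted here is only a key; something still has to translate it into an NCBI taxon ID before any read can be placed on the taxonomy.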
So I think the record identifiers are present. Below is the output from MEGAN. I read through the MEGAN manual and I can't really see anything else obvious to try. Can anyone give me any hints?
Executing: import blastFile='resultsall.txt' fastaFile='CO1ALL061015.fasta' meganFile='resultsall.rma' maxMatches=100 minScore=1.0 maxExpected=100.0 topPercent=10.0 minSupport=1 minComplexity=0.0 useMinimalCoverageHeuristic=false useSeed=false useCOG=false useKegg=false paired=false useIdentityFilter=false textStoragePolicy=Embed blastFormat=BlastTAB mapping='Taxonomy:BUILT_IN=true';
Importing data: 1 reads file(s), 1 blast file(s)
Input format: BlastTAB
TextStoragePolicy: Embed matches and reads in MEGAN file
Mapping all reads in memory
Processing CO1ALL061015.fasta: 33638
Processing resultsall.txt
Executing: show window=message;
Total reads: 656
Total no-hits: 0
Total matches: 332456
Matches discarded: 266207
Parsing required 12 seconds
Number of reads: 656
Low complexity: 0
With valid hits: 656
Number of taxa identified: 1
Data processor required: 1 secs
Total reads: 656
Assigned reads: 0
Unassigned reads: 656
Reads with no hits: 0
Reads low comp.: 0
Induce Taxonomy tree, keeping 2 of 1266115 nodes
Induced taxonomy tree has 2 nodes
Ensure that the taxonomy mapping is also loaded.
Thanks, Istvan, but I'm not sure what this means, and I can't figure it out from the manual. I downloaded the gi_taxid-March2015X.bin file (3 GB!) and it is in the same directory as the application. I am using a BLAST output format that should include adequate information for looking up the data. What else do I need to do? If this is documented somewhere, please just point me to it.
Jennifer