What Does Division Mean In NCBI Taxonomy Database?
10.9 years ago
Mingkun ▴ 40

I recently started analyzing metagenomic data, and the first problem I ran into is how to finish the mapping quickly.

It was suggested that I use BLASTN and BLASTP to map the HiSeq reads against the nt and nr databases, but this is very slow. After some searching, I found PAUDA as a replacement for BLASTP, but no equivalent for BLASTN. bwa and bowtie cannot keep all the hits (HSPs), which are very important for inferring the origin of a read, and their SAM/BAM output is not recognized by most downstream software (e.g., MEGAN). So I was thinking of shrinking nt instead. I downloaded gi_taxid_nucl.dmp, nodes.dmp and names.dmp, but I cannot understand the 5th column of nodes.dmp, which is the "Division".

Division includes:

0 | BCT | Bacteria
1 | INV | Invertebrates
2 | MAM | Mammals
3 | PHG | Phages
4 | PLN | Plants
5 | PRI | Primates
6 | ROD | Rodents
7 | SYN | Synthetic
8 | UNA | Unassigned | No species nodes should inherit this division assignment
9 | VRL | Viruses
10 | VRT | Vertebrates
11 | ENV | Environmental samples

When I looked through nodes.dmp, I found that many taxids have a different division from their parent nodes, for example:

2722 | 47936 | species | UP | 11 | 1 |

47936 | 1224 | no rank | | 11 | 0 |

1224 | 2 | phylum | | 0 | 1 |

2722 belongs to 47936, and both are assigned to division "11", while the parent of 47936 (1224) is in division "0". Why did 2722 and 47936 not inherit "0" from their parent node?
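To make the inheritance question concrete, here is a minimal Python sketch that walks from a taxid up through its parents and prints the division at each step. It assumes the column order given in the taxdump readme (tax_id, parent tax_id, rank, EMBL code, division id, inherited div flag), where the 6th column is 1 if the node inherits its division from the parent and 0 if the division was set explicitly on that node. The function names `parse_nodes` and `lineage` are made up for illustration, and the only data used are the three rows quoted above.

```python
def parse_nodes(lines):
    """Parse nodes.dmp-style rows into
    {taxid: (parent_taxid, rank, division_id, inherited_div_flag)}."""
    nodes = {}
    for line in lines:
        # Real nodes.dmp uses "\t|\t" as the separator; splitting on "|"
        # and stripping whitespace handles both that and the rows above.
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6 or not fields[0]:
            continue
        taxid, parent = int(fields[0]), int(fields[1])
        nodes[taxid] = (parent, fields[2], int(fields[4]), int(fields[5]))
    return nodes


def lineage(nodes, taxid):
    """Walk from taxid towards the root, stopping when the parent is
    unknown or when a node is its own parent (as the root is)."""
    path, seen = [], set()
    while taxid in nodes and taxid not in seen:
        seen.add(taxid)
        parent, rank, division, inherited = nodes[taxid]
        path.append((taxid, rank, division, inherited))
        taxid = parent
    return path


# The three rows quoted in this post.
SAMPLE = [
    "2722 | 47936 | species | UP | 11 | 1 |",
    "47936 | 1224 | no rank | | 11 | 0 |",
    "1224 | 2 | phylum | | 0 | 1 |",
]

if __name__ == "__main__":
    for taxid, rank, division, inherited in lineage(parse_nodes(SAMPLE), 2722):
        source = "inherited from parent" if inherited else "set on this node"
        print(f"{taxid}\t{rank}\tdivision={division}\t{source}")
```

Read this way, 47936 has flag 0, so its division 11 appears to be assigned directly to that node rather than taken from 1224, and 2722 (flag 1) then inherits 11 from 47936.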

Similarly, if I want to retrieve all virus sequences, I could use division==9 or taxid=10239, which give different results. Why, and which one should I use?
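The two selections can be compared directly. The sketch below (Python, helper names `subtree` and `by_division` are hypothetical) collects every taxid under a root node and every taxid with a given division id, then shows what each one misses. The `TOY` table is invented for illustration only: the 9xxxxx taxids are not real entries. It assumes, as one plausible cause of the mismatch, that some nodes below 10239 carry other divisions such as 3 (PHG) or 11 (ENV).

```python
from collections import defaultdict


def subtree(parent_of, root):
    """Return the root taxid plus all of its descendants."""
    children = defaultdict(list)
    for taxid, parent in parent_of.items():
        if taxid != parent:  # the real root node lists itself as parent
            children[parent].append(taxid)
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in found:
            continue
        found.add(node)
        stack.extend(children[node])
    return found


def by_division(division_of, division_id):
    """Return all taxids whose division id equals division_id."""
    return {t for t, d in division_of.items() if d == division_id}


# Invented toy table: taxid -> (parent taxid, division id).
# Only 10239 is a real taxid; the others are placeholders.
TOY = {
    10239: (1, 9),        # virus root, division VRL
    900001: (10239, 9),   # hypothetical virus node, division VRL
    900002: (10239, 3),   # hypothetical phage node, division PHG
    900003: (10239, 11),  # hypothetical environmental node, division ENV
}

parent_of = {t: p for t, (p, _) in TOY.items()}
division_of = {t: d for t, (_, d) in TOY.items()}

in_tree = subtree(parent_of, 10239)
in_division = by_division(division_of, 9)

if __name__ == "__main__":
    print("under 10239 but not division 9:", sorted(in_tree - in_division))
    print("division 9 but not under 10239:", sorted(in_division - in_tree))
```

On real data the same comparison would run on the parent and division columns parsed from nodes.dmp, which should show exactly which nodes account for the difference.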

PS: Does anyone know any better software to replace BLASTN?

Thanks.

taxonomy • 3.1k views

Sorry, I am unable to suggest a good alternative (I assume you are aware of LAST and BLAT). I just think that "environmental samples" (ENV) can be a subset of Bacteria (BCT) without breaking the integrity of the database. It could also be that some database entries were misassigned, in which case it would be nice to know how to deal with this issue.

10.9 years ago
Pavel Senin ★ 1.9k

I looked further into BLASTN-like tools (since I am currently running BLAST for a similar purpose), and it seems that it is possible to use LAST for taxonomic assignment. Thank you, Nick!
