How to create a custom database (GTDB)?
3.0 years ago

Hello, I was asked to create a custom database from GTDB. I just need to add some metagenome-assembled genomes (MAGs) to the GTDB database, but I don't know how to do that.

The file I want to add my MAGs to is "gtdbtk_data.tar.gz" from release 202 (https://data.gtdb.ecogenomic.org/releases/release202/). The MAGs I built will not have a taxonomy identifier, though, which I think is necessary for the database to build correctly.

  • Note: I classified my MAGs with GTDB-Tk. I know the taxonomy that GTDB-Tk assigned to them matters here, but I'm not sure how to use it. A classmate told me I need to write a Python script that adds the taxonomy classification to the FASTA header of each MAG's contigs.
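A minimal sketch of the kind of script the classmate described. It assumes the GTDB-Tk summary table (for example gtdbtk.bac120.summary.tsv) is tab-separated with `user_genome` and `classification` columns. The `taxonomy=` header tag and the function names are hypothetical and chosen only for illustration; the header format a given database builder actually expects has to be checked in that tool's documentation.

```python
import csv


def load_classifications(summary_tsv):
    """Map each genome name to its GTDB-Tk lineage string.

    Assumes a tab-separated table with 'user_genome' and
    'classification' columns.
    """
    with open(summary_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["user_genome"]: row["classification"] for row in reader}


def tag_fasta_headers(fasta_in, fasta_out, taxonomy):
    """Copy a FASTA file, appending the lineage to every header line.

    The 'taxonomy=' tag is a hypothetical format, not one required
    by any particular classifier.
    """
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # Keep the original contig ID first, then add the lineage.
                line = f"{line.rstrip()} taxonomy={taxonomy}\n"
            dst.write(line)
```

For one MAG this could be used as `tag_fasta_headers("MAG1.fasta", "MAG1.tagged.fasta", load_classifications("gtdbtk.bac120.summary.tsv")["MAG1"])`, where the file names are placeholders.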

Hello, have you found a way to create a custom database from GTDB? I will try to do it myself, but I am interested to hear whether you have found a method. Thank you.


Hello, I have not, but people told me that you can build a custom database reasonably easily using CLARK-S; its README file has the instructions. Basically, you build a file with two columns: the first column is the path to each genome, and the second column is the taxonomy of each genome (at the taxonomic rank you prefer).

https://www.reddit.com/r/bioinformatics/comments/rfqt7a/any_metagenomic_classifier_that_can_elaborate_a/
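As a rough sketch, the two-column file described above could be generated from the GTDB-Tk summary table like this. It assumes the table has `user_genome` and `classification` columns, that each MAG is stored as `<user_genome>.fasta` in one directory, and that a tab is an acceptable separator; the function names, the `.fasta` extension and the separator are all assumptions, so the exact format should be checked against the CLARK-S README.

```python
import csv
from pathlib import Path


def label_at_rank(classification, rank_prefix):
    """Return the taxon at one rank (e.g. 'g__') from a GTDB lineage string."""
    for taxon in classification.split(";"):
        taxon = taxon.strip()
        if taxon.startswith(rank_prefix):
            return taxon[len(rank_prefix):]
    return ""


def write_two_column_file(summary_tsv, genome_dir, out_path,
                          rank_prefix="g__", ext=".fasta"):
    """Write '<genome path><TAB><taxon>' lines; return how many were written."""
    written = 0
    with open(summary_tsv, newline="") as handle, open(out_path, "w") as out:
        for row in csv.DictReader(handle, delimiter="\t"):
            label = label_at_rank(row["classification"], rank_prefix)
            if not label:
                continue  # genome has no assignment at the chosen rank
            genome_path = Path(genome_dir) / (row["user_genome"] + ext)
            out.write(f"{genome_path}\t{label}\n")
            written += 1
    return written
```

Passing `rank_prefix="s__"` would label genomes at species level instead of genus; MAGs that GTDB-Tk could not place at the chosen rank are skipped rather than given an empty label.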

Personally, I finally decided not to add my own MAGs to the GTDB database. Instead, I used kraken2 for read classification with an index built from the GTDB database: https://github.com/rrwick/Metagenomics-Index-Correction which is better than the kraken2 RefSeq database for microbial taxonomy classification.
