How to add taxonomic information to fasta headers
2.5 years ago

Hello, I have 700 metagenome-assembled genomes (MAGs) that were taxonomically classified against the GTDB database using the GTDB-Tk software.

So I have taxonomic information assigned to each of these MAGs, but for downstream analysis I need the fasta headers to contain the taxonomy that GTDB-Tk assigned.

This is what the fasta headers of one of the MAGs look like:

    cat cluster1_bin.101.fa | grep '>' | head

    >k141_1192826
    >k141_94001
    >k141_1104537
    >k141_375209
    >k141_375646
    >k141_742386
    >k141_560036
    >k141_12021
    >k141_838926
    >k141_1209697

I want to know if there is a way to extract the full taxonomy from the following table and add it to the respective fasta headers of each MAG:

sample_table

So this is the desired output for the fasta headers of each MAG, using "cluster1_bin.101.fa" as an example:

    >k141_1192826 Phylum Class Order Family Genus Species
    >k141_94001 Phylum Class Order Family Genus Species
    >k141_1104537 Phylum Class Order Family Genus Species
    >k141_375209 Phylum Class Order Family Genus Species
    >k141_375646 Phylum Class Order Family Genus Species
    >k141_742386 Phylum Class Order Family Genus Species
    >k141_560036 Phylum Class Order Family Genus Species
    >k141_12021 Phylum Class Order Family Genus Species
    >k141_838926 Phylum Class Order Family Genus Species
    >k141_1209697 Phylum Class Order Family Genus Species

any way to do that using any programming language?

MAGs taxonomy fasta • 909 views

any way to do that using any programming language?

I think this can be done in practically any programming language of your choice. It is a simple addition to the fasta headers, which can be done with existing libraries (BioPerl, Biopython) or by using awk/sed to find the header lines that need the extra information. But you will most likely need to write that script yourself.
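To illustrate the "write your own script" route, here is a minimal Python sketch using only the standard library. It rests on assumptions, because the table was only posted as an image: it assumes a tab-separated file with `user_genome` and `classification` columns (the layout GTDB-Tk summary files normally use), a classification string of the form `d__...;p__...;...;s__...`, and a genome name that matches the fasta file name without its extension. The function names are hypothetical, so adjust them and the column names to your real data.

```python
import csv
from pathlib import Path


def format_taxonomy(classification):
    """Turn 'd__X;p__Y;...;s__Z' into 'Y ... Z' (phylum through species)."""
    ranks = [r.strip() for r in classification.split(";")]
    # Drop the domain rank and strip the 'x__' prefix from the others.
    names = [r.split("__", 1)[-1] for r in ranks if not r.startswith("d__")]
    # Ranks left empty by the classifier (e.g. 's__') are skipped.
    return " ".join(n for n in names if n)


def load_taxonomy(summary_tsv):
    """Map genome name -> formatted taxonomy.

    Assumes a tab-separated table with 'user_genome' and
    'classification' columns; rename these to match your table.
    """
    with open(summary_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["user_genome"]: format_taxonomy(row["classification"])
                for row in reader}


def annotate_headers(lines, taxonomy):
    """Yield fasta lines with the taxonomy appended to every header line."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield f"{line.rstrip()} {taxonomy}"
        else:
            yield line


def annotate_fasta(fasta_path, out_path, taxonomy_by_genome):
    """Write a copy of one MAG's fasta file with annotated headers."""
    # Assumption: 'cluster1_bin.101.fa' corresponds to genome 'cluster1_bin.101'.
    genome = Path(fasta_path).stem
    taxonomy = taxonomy_by_genome[genome]
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in annotate_headers(src, taxonomy):
            dst.write(line + "\n")
```

For all 700 MAGs, one would call `load_taxonomy` once and then loop over the `.fa` files, calling `annotate_fasta` for each. Writing to a new file rather than overwriting the input keeps the original assemblies intact if a genome name fails to match.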


Please do not post images of the data.


You'll need to post the table in text form for us to be able to help easily.
