Mapping RefSeq accession to assembly ID
8.0 years ago
lironyoffe • 0

Hi,

Is there a way to map a RefSeq accession to its assembly ID? I need to do this for all the bacteria on the RefSeq FTP site.

For example: NZ_KN150745.1 -> GCF_001640985.1

Thanks

refseq Assembly • 2.4k views
8.0 years ago
Sej Modha 5.3k

A Unix solution using NCBI's EDirect (command-line E-utilities) would be:

elink -db nuccore -id "NZ_KN150745.1" -target assembly | esummary | xtract -pattern DocumentSummary -element AssemblyAccession
8.0 years ago
5heikki 11k

Besides parsing the sequence files that you might have already downloaded, the fastest solution would be to download the *assembly_report.txt files and parse the accessions from there. The assembly ID is the prefix of the file name,

e.g.

awk 'BEGIN{FS="\t"}{if(!/^#/){print $7}}' GCF_000754995.1_PVA_assembly_report.txt
NZ_KN150745.1
NZ_KN150746.1
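
To get the full accession-to-assembly table for many reports at once, the same awk idea can be wrapped in a loop. This is only a rough sketch under a few assumptions: the function name map_refseq_to_assembly is made up for illustration, the RefSeq accession is taken from column 7 as in the command above, the assembly ID is taken from the first two underscore-separated parts of the file name (e.g. GCF_000754995.1 from GCF_000754995.1_PVA_assembly_report.txt), and rows whose column 7 is "na" are skipped.

```shell
# Hypothetical helper: print "RefSeq accession <TAB> assembly ID" for each report file.
# Assumes column 7 holds the RefSeq accession and that the file name
# starts with the assembly ID (GCF_xxxxxxxxx.v).
map_refseq_to_assembly() {
    for report in "$@"; do
        name=$(basename "$report")
        # Assembly ID = first two underscore-separated fields of the file name.
        asm=$(printf '%s\n' "$name" | cut -d_ -f1,2)
        # Skip comment lines and rows without a RefSeq accession ("na").
        awk -v asm="$asm" 'BEGIN{FS="\t"; OFS="\t"} !/^#/ && $7 != "na" {print $7, asm}' "$report"
    done
}
```

It could then be run as something like map_refseq_to_assembly *_assembly_report.txt > refseq_to_assembly.tsv, after which a simple grep on the accession gives the assembly ID.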