genome assembly records not present in assembly_summary.txt
10 days ago
sapuizait ▴ 10

Hi all

My lab has downloaded several E. coli assemblies from NCBI, gathered from various sources. Now that I am looking at the data, I am trying to retrieve the metainfo. For most of them I can get it from the assembly_summary file that can be downloaded from the NCBI FTP, but some do not appear even in the latest version of the file. Example: GCA_018564605.1 is not in the assembly_summary file, but if I look for it in the NCBI portal, it is there: https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_018564605.1/ Is it because the record is not curated? How can I retrieve these?

Thanks

ncbi bacteria assembly • 330 views

If you have the accession numbers, can you not use something like eutils?
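As a rough illustration of the eutils route, here is a minimal Python sketch that looks up an assembly accession in the Entrez `assembly` database with `esearch` and then pulls its document summary with `esummary`. The helper names (`esearch_url`, `esummary_url`, `fetch_assembly_summary`) are made up for this example, and the exact fields in the returned summary are not shown here, so inspect the JSON yourself before relying on any particular key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(accession):
    """Build an esearch URL that looks up an accession in the assembly database."""
    params = {"db": "assembly", "term": accession, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def esummary_url(uids):
    """Build an esummary URL for a list of Entrez UIDs returned by esearch."""
    params = {"db": "assembly", "id": ",".join(uids), "retmode": "json"}
    return f"{EUTILS}/esummary.fcgi?{urlencode(params)}"


def fetch_assembly_summary(accession):
    """Return the esummary 'result' dict for an accession (empty dict if no hit).

    Hypothetical helper: it performs two network requests and does no
    rate limiting or error handling.
    """
    with urlopen(esearch_url(accession)) as resp:
        uids = json.load(resp)["esearchresult"]["idlist"]
    if not uids:
        return {}
    with urlopen(esummary_url(uids)) as resp:
        return json.load(resp)["result"]


# Example call (needs network access):
# summary = fetch_assembly_summary("GCA_018564605.1")
```

For more than a handful of accessions, NCBI asks users to throttle requests, so a short pause between calls (or an API key) would be a sensible addition.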

10 days ago
GenoMax 142k

Can you clarify what metainfo you are referring to?

The accession that you refer to does exist in the GenBank assembly_summary file.

$ grep GCA_018564605 assembly_summary_genbank.txt
GCA_018564605.1 PRJNA514245     SAMEA7577678    DAEFSU000000000.1       na      562     562     Escherichia coli        strain=110504014        110504014       latest  Contig   Major   Full    2021/05/27      PDT001039592.1  National Center for Biotechnology Information   na      na      https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/018/564/605/GCA_018564605.1_PDT001039592.1  from large multi-isolate project        na      na      haploid bacteria        5111194 5111194 50.500000       0       66      66       NCBI    NCBI Prokaryotic Genome Annotation Pipeline (PGAP)      2021/05/17      4979    4757    93      30286803

Jesus, it's in the GenBank file and I was looking at the RefSeq one! I'm such a moron. Thanks for pointing it out, and sorry about that :(
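For anyone who hits the same mix-up, a small Python sketch like the one below checks an accession against both summary files at once and reports where it was found. The function names (`find_accession`, `locate`) are made up for this example, and it assumes the usual layout of these files: tab-separated rows, comment lines starting with `#`, and a header line whose first column is `assembly_accession`.

```python
def find_accession(accession, summary_lines):
    """Return a column -> value dict for the first row matching the accession, or None."""
    header = None
    for line in summary_lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Comment lines: keep the one that carries the column names.
            candidate = line.lstrip("#").strip().split("\t")
            if candidate[0] == "assembly_accession":
                header = candidate
            continue
        fields = line.split("\t")
        if fields[0] == accession:
            if header is None:
                # No header seen: fall back to positional column names.
                header = [f"col{i}" for i in range(len(fields))]
            return dict(zip(header, fields))
    return None


def locate(accession, sources):
    """Look the accession up in every source; sources maps a label to an iterable of lines."""
    return {label: find_accession(accession, lines) for label, lines in sources.items()}


# Example use with local copies of the two files (file names assumed):
# with open("assembly_summary_genbank.txt") as gb, open("assembly_summary_refseq.txt") as rs:
#     hits = locate("GCA_018564605.1", {"genbank": gb, "refseq": rs})
#     print({label: row is not None for label, row in hits.items()})
```

A GCA_ accession is a GenBank assembly, so it would only be expected in the GenBank file; a RefSeq copy, if one exists, carries a GCF_ accession.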
