I want to automate a process to identify newly submitted plant accessions in NCBI. I am scanning the NCBI FTP server, but I have not yet found any address to locate all SRA accessions.
Looks like today, Nov 11, 2022, there were 4638 datasets deposited at SRA ... whoa, I did not expect that. I am extraordinarily surprised, to be honest. That is a lot of data.
This is extremely cool, Istvan, and I want to thank you for being so helpful to us.
One question: is there a way that I can focus the search only on plants, animals, or bacteria?
Technically there is a field for TaxID in the output that the runinfo option in the command above produces, but it is sadly not populated for many entries (certainly not for new ones). I checked on that yesterday. You can add a TaxID number to the query in the first part of the command.
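To illustrate the idea of putting the TaxID into the query itself, here is a minimal Python sketch that only builds an E-utilities `esearch` URL for the `sra` database. It is not the command from this thread. The function names, the `TAXIDS` table, and the choice of the `[PDAT]` (publication date) field are my assumptions for illustration; the TaxIDs are the NCBI Taxonomy IDs for Viridiplantae (33090), Metazoa (33208), and Bacteria (2).

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Illustrative lookup table (an assumption, not from the thread):
# NCBI Taxonomy IDs for three broad groups.
TAXIDS = {"plants": 33090, "animals": 33208, "bacteria": 2}


def build_term(taxid, date):
    """Build an Entrez query restricted to a taxon (including all
    descendants, via ':exp') and to one publication date (YYYY/MM/DD)."""
    return f"txid{taxid}[Organism:exp] AND {date}[PDAT]"


def build_url(taxid, date, retmax=100):
    """Return the esearch URL for the SRA database; no request is sent."""
    params = {
        "db": "sra",
        "term": build_term(taxid, date),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL for plant records published on Nov 11, 2022.
    print(build_url(TAXIDS["plants"], "2022/11/11"))
```

Since the TaxID filter is applied on the search side, this should not depend on the TaxID column in the runinfo output being filled in, though it is worth checking the hit counts against a manual search on the SRA website.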
NCBI publishes a file containing SRA accession numbers. It is updated daily (the file is almost a gigabyte, so a largeish download). It appears to have accession numbers that start a ways back and are current up to a given date.
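If you go the file route, a rough sketch of pulling out only the recent run accessions could look like the following. It assumes the file is tab-delimited with a header row, and that it has columns named `Accession`, `Type`, and `Received` with ISO-style timestamps; those column names are assumptions, so check them against the real header first. The function name `new_runs` is made up for this example.

```python
import csv


def new_runs(lines, since, date_col="Received", type_col="Type",
             acc_col="Accession"):
    """Yield run accessions whose date column is on or after `since`.

    `lines` is any iterable of text lines (an open file works), and
    `since` is a 'YYYY-MM-DD' string. ISO dates sort correctly as plain
    strings, so no date parsing is needed.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row.get(type_col) != "RUN":
            continue  # skip experiments, samples, studies, submissions
        day = (row.get(date_col) or "")[:10]  # keep only YYYY-MM-DD
        if day >= since:
            yield row[acc_col]


if __name__ == "__main__":
    # Tiny in-memory sample standing in for the real (very large) file.
    sample = [
        "Accession\tType\tReceived\n",
        "SRR0000001\tRUN\t2022-11-11T04:10:00Z\n",
        "SRX0000001\tEXPERIMENT\t2022-11-11T04:10:00Z\n",
        "SRR0000002\tRUN\t2022-10-01T09:00:00Z\n",
    ]
    for accession in new_runs(sample, "2022-11-11"):
        print(accession)
```

Because the reader streams line by line, the whole file never has to fit in memory. Note that this only gives accessions, not organisms, so you would still need a separate lookup to restrict the list to plants.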
This is so cool! I was wondering how I can find the new plant species from there, for example. Do you have an idea? Thank you so much for your time and suggestions.
Thank you so much, GenoMax :)