Question

Fasta upload files from NCBI

0

2.6 years ago

biology.may20 ▴ 20

Dear colleagues,

Could you please advise where to correct the code?

I need to upload as many FASTA files as possible for some bacterial genomes (not one specific genome but a group, no specific IDs, just lots of them) and then convert them to JSON files. I have solved the second part with JSON, but I don't understand how to upload files from NCBI onto my PC. I have this code:

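import os
from Bio import SeqIO
from Bio import Entrez

Entrez.email = "my_email@gmail.com"  # NCBI asks for a contact address

filename = "new.fasta"  # but actually I need to upload separate FASTA files to one folder

if not os.path.isfile(filename):
    # efetch takes record IDs, not a search term, so run esearch first
    # (db="nucleotide" is used here because it supports the FASTA rettypes)
    with Entrez.esearch(db="nucleotide", term="Salmonella enterica", retmax=10) as search_handle:
        id_list = Entrez.read(search_handle)["IdList"]  # how will I upload the next files after these 10?

    with Entrez.efetch(db="nucleotide", id=id_list, rettype="fasta_cds_na", retmode="text") as net_handle:
        with open(filename, 'w') as out_handle:
            out_handle.write(net_handle.read())
    print("saved")
print("Parsing...")
# the file holds many records, so iterate with SeqIO.parse (SeqIO.read expects exactly one)
for record in SeqIO.parse(filename, "fasta"):
    print(record.id)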

# I also need to add to this code the part that converts to JSON:
import json
records = []  # one dict per FASTA record, so earlier records are not overwritten
with open("Salm_ser_Enteritidis.fasta", 'r') as new_fasta:
    for x in SeqIO.parse(new_fasta, 'fasta'):
        records.append({
            "dataset": x.id,
            "sequence": str(x.seq)
        })
with open('my_dict.json', 'w') as f:
    json.dump(records, f)

fasta upload entrez ncbi • 691 views

score 2 · Answer 1 · 2022-04-06

2

2.6 years ago

GenoMax 147k

"Upload" to NCBI means something different. It looks like you are trying to download genome files from NCBI. Unless you are required to do this via Biopython, there are other tools that can do this easily; see: How to download all Pseudomonas aeruginosa Genomes from NCBI Genomes database?
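
If Biopython is a requirement, one way to get past the first 10 hits is to page through the search results with `retstart` and `retmax` and write each record to its own file. This is only a minimal sketch under some assumptions: `batch_windows`, `split_fasta` and `download_batches` are hypothetical helper names, the `nucleotide` database and the plain `fasta` rettype are my choice rather than anything taken from the question, and `entrez` is expected to be the `Bio.Entrez` module (passed in as an argument so the helpers can be tried offline). The `limit` argument caps how many records are fetched in total.

```python
import os
import re


def batch_windows(total, batch_size, limit=None):
    """Yield (retstart, retmax) pairs covering `total` hits, capped at `limit`."""
    if limit is not None:
        total = min(total, limit)
    for start in range(0, total, batch_size):
        yield start, min(batch_size, total - start)


def split_fasta(text):
    """Split multi-record FASTA text into (record_id, record_text) pairs."""
    records = []
    current = []
    for line in text.splitlines():
        if line.startswith(">") and current:
            # a new header closes the record collected so far
            records.append(current)
            current = []
        if (line.startswith(">") or current) and line.strip():
            current.append(line)
    if current:
        records.append(current)
    return [
        ((rec[0][1:].split() or ["unnamed"])[0], "\n".join(rec) + "\n")
        for rec in records
    ]


def download_batches(entrez, term, out_dir, batch_size=10, limit=50, db="nucleotide"):
    """Search `db` for `term` and save up to `limit` records as separate FASTA files.

    `entrez` is assumed to be the Bio.Entrez module with Entrez.email already set.
    """
    os.makedirs(out_dir, exist_ok=True)
    # retmax=0 asks only for the hit count, not for any IDs
    with entrez.esearch(db=db, term=term, retmax=0) as handle:
        total = int(entrez.read(handle)["Count"])
    saved = []
    for retstart, retmax in batch_windows(total, batch_size, limit):
        with entrez.esearch(db=db, term=term, retstart=retstart, retmax=retmax) as handle:
            ids = entrez.read(handle)["IdList"]
        if not ids:
            break
        with entrez.efetch(db=db, id=",".join(ids), rettype="fasta", retmode="text") as handle:
            text = handle.read()
        for record_id, record_text in split_fasta(text):
            # keep file names safe: replace anything unusual with "_"
            name = re.sub(r"[^A-Za-z0-9._-]", "_", record_id) + ".fasta"
            path = os.path.join(out_dir, name)
            with open(path, "w") as out_handle:
                out_handle.write(record_text)
            saved.append(path)
    return saved


# Possible usage (needs network access; the query string is only an example):
#   from Bio import Entrez
#   Entrez.email = "my_email@gmail.com"
#   download_batches(Entrez, "Salmonella enterica[Organism]", "fasta_out", limit=20)
```

Raising `limit` step by step is a cautious way to see how large the files get before committing to a bigger download.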

0

Thanks, yes, it is definitely upload from NCBI. I saw ways to download via ncbi-genome-download earlier, for example, but I'm afraid to download without limits, as there are more than 400 000 genomes and the uploaded file might be too big.

2.6 years ago by biology.may20 ▴ 20
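
On the worry about size: a download can be capped by a base-pair budget instead of fetching everything. A minimal sketch, assuming you already have (ID, sequence length) pairs for the candidate records; I believe `Entrez.esummary` on the nucleotide database reports a length per record, but treat that as an assumption to check. `pick_within_budget` is a hypothetical helper, and the IDs and lengths in the example are made up.

```python
def pick_within_budget(records, max_bases):
    """Take records in order until the next one would exceed `max_bases`.

    `records` is an iterable of (record_id, length_in_bases) pairs.
    Returns the chosen IDs and the total number of bases they add up to.
    """
    chosen = []
    used = 0
    for record_id, length in records:
        if used + length > max_bases:
            break  # stop at the first record that does not fit
        chosen.append(record_id)
        used += length
    return chosen, used


# Made-up example: three genome-sized records and a 10 Mb budget
candidates = [("rec1", 5_000_000), ("rec2", 4_800_000), ("rec3", 4_900_000)]
chosen, used = pick_within_budget(candidates, 10_000_000)
print(chosen, used)  # ['rec1', 'rec2'] 9800000
```

Uncompressed FASTA takes roughly one byte per base plus headers and line breaks, so a budget of 10 million bases is on the order of 10 MB on disk.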