Extracting raw sequence from an .sra file and checking whether the extracted sequence is a complete genome, using Python
7.9 years ago
Sanchez95 • 0

I am new to bioinformatics and need to extract the genome sequence from an .sra file. I have tried converting the .sra file to FASTA and FASTQ format and extracting the sequences, but concatenating all the reads does not produce an assembled, complete genome. All I want is to extract a complete genome sequence from the .sra file.

genome sequencing sra fasta ncbi • 3.3k views

Those sra files contain just raw sequence data, so no assembled genome will be in there. You'll have to do the assembly yourself, and whether that's possible will depend on the data/experiment/organism/technology.

Please give more detail: which sequencing technology, which organism, and for what purpose.
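To get a first idea of whether an assembly is feasible, it helps to estimate how deeply the reads cover the genome. The following is a minimal sketch, not a definitive method: it assumes the standard four-lines-per-record FASTQ layout, and the function names `fastq_stats` and `estimated_coverage` as well as the genome size you pass in are illustrative assumptions, not part of any toolkit.

```python
from itertools import islice


def fastq_stats(lines):
    """Return (read_count, total_bases) for an iterable of FASTQ lines.

    Assumption: each record is exactly four lines
    (header, sequence, '+' separator, qualities).
    """
    reads = 0
    bases = 0
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            break  # end of input (or a truncated final record)
        header = record[0].strip()
        seq = record[1].strip()
        if not header.startswith("@"):
            raise ValueError("unexpected FASTQ header: %r" % header)
        reads += 1
        bases += len(seq)
    return reads, bases


def estimated_coverage(total_bases, genome_size):
    """Average depth = sequenced bases / expected genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Tiny in-memory example; for a real file, pass an open file handle instead.
example = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",
    "@read2", "ACGT", "+", "IIII",
]
n_reads, n_bases = fastq_stats(example)
depth = estimated_coverage(n_bases, genome_size=6)  # hypothetical genome size
```

For a real run you would call `fastq_stats(open("reads.fastq"))` (file name hypothetical) and supply the expected genome size of your organism. Very low depth suggests that assembly will be fragmented or not possible at all.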


Regarding the technology, organism and purpose: I have used the NCBI SRA Toolkit ( https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc , using fastq-dump) to convert the .sra files to FASTA or FASTQ format, and I am still exploring how to assemble the genome from the raw reads obtained. For now I am specifically interested in Staphylococcus aureus and its subspecies. I am aware that these files are available on NCBI, but to cross-check my species classification method I am relying on research already done in the field. The data set I am referring to is http://www.nature.com/articles/ncomms10063#supplementary-information (Supplementary Data Set 1). I am comfortable with Python and have used the Biopython library. Please suggest a way I could assemble the genome using Python.


You would need to use a de novo assembly program such as Velvet, ABySS or SOAPdenovo.
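Once one of these assemblers has produced a contigs FASTA file, Python is handy for checking how close the result is to a complete genome. Below is a rough sketch under stated assumptions: the helper names `read_fasta_lengths` and `n50` are made up for illustration, and the input is assumed to be a plain multi-FASTA file of contigs. A single contig close to the expected genome size would suggest a near-complete assembly; short-read bacterial assemblies usually end up as many contigs instead.

```python
def read_fasta_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first '>'
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no contigs


# Tiny in-memory example; for a real assembly, pass an open file handle instead.
contigs = [">c1", "ACGTACGT", "AC", ">c2", "ACGTA", ">c3", "ACG", ">c4", "AC"]
contig_lengths = read_fasta_lengths(contigs)
summary = {
    "contigs": len(contig_lengths),
    "total_bases": sum(contig_lengths),
    "longest": max(contig_lengths),
    "n50": n50(contig_lengths),
}
```

Comparing `total_bases` and `contigs` against the expected genome size of your organism gives a quick, if crude, completeness check; it does not replace aligning the contigs to a reference genome.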
