New to the field...how to work with .SRA files?
8.2 years ago

Hi all. So far I have figured out how to convert a large .SRA file obtained from NCBI to FASTQ format. I have the SRA Toolkit installed on my Mac, so from what I've found online, I think I could just as easily convert this file to ABI SOLiD native, FASTA, SFF, SAM, or Illumina native format using similar commands in the Mac Terminal.

My question: what is the best program or approach for going from .SRA data to searching for specific SNPs by their rs numbers? Is Illumina's GenomeStudio a useful program? Are there other programs or approaches that would be better?


In case you run into trouble with the SRA Toolkit (which you eventually will), here is a way to avoid it altogether.

For most SRA accessions (except very recent ones, which will eventually be caught up), you can find the FASTQ files directly by searching EBI-ENA with the SRA accession number.
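
The ENA lookup can also be scripted. Here is a minimal Python sketch, assuming the ENA Portal API `filereport` endpoint and its `fastq_ftp` field; the helper names are my own, and you would pass in your real run accession.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def ena_fastq_report_url(accession):
    """Build the file-report URL for one run/experiment/study accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def parse_fastq_links(tsv_text):
    """Extract FASTQ locations from the TSV report.

    Paired-end runs list both files in one cell, separated by ';'.
    """
    lines = tsv_text.strip().splitlines()
    if len(lines) < 2:
        return []
    col = lines[0].split("\t").index("fastq_ftp")
    links = []
    for line in lines[1:]:
        fields = line.split("\t")
        if col < len(fields) and fields[col]:
            links.extend(fields[col].split(";"))
    return links


def fetch_fastq_links(accession):
    """Query ENA over the network and return the FASTQ locations."""
    with urlopen(ena_fastq_report_url(accession)) as response:
        return parse_fastq_links(response.read().decode("utf-8"))
```

Calling `fetch_fastq_links()` with your accession should return the FASTQ locations, which you can then download with any FTP or HTTP client. An empty list would suggest that ENA has not mirrored the run yet.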

8.2 years ago
igor 13k

Illumina GenomeStudio is for analyzing array data, which is not applicable to your case.

FASTQ format contains the raw sequences. You then need to align them to a reference genome, call variants to find positions that are different from the reference, then annotate the variants with dbSNP info to get rs identifiers.
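
To make those three steps concrete, here is a rough Python sketch that only assembles the command lines for one possible toolchain. Note that it uses bwa, samtools, and bcftools as a lighter stand-in, not the GATK workflow itself. The file names are placeholders, single-end reads are assumed, and the dbSNP VCF is assumed to be bgzipped, indexed, and built on the same reference with matching chromosome names.

```python
import shlex
from typing import List


def build_pipeline(reference: str, fastq: str, dbsnp_vcf: str,
                   prefix: str = "sample") -> List[List[str]]:
    """Return the align -> call -> annotate commands as argv lists."""
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    bcf = f"{prefix}.pileup.bcf"
    calls = f"{prefix}.calls.vcf.gz"
    annotated = f"{prefix}.annotated.vcf.gz"
    return [
        # 1. Align the reads to the reference genome.
        ["bwa", "index", reference],
        ["bwa", "mem", "-o", sam, reference, fastq],
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # 2. Call variants (positions that differ from the reference).
        ["bcftools", "mpileup", "-f", reference, "-Ou", "-o", bcf, bam],
        ["bcftools", "call", "-mv", "-Oz", "-o", calls, bcf],
        ["bcftools", "index", calls],
        # 3. Copy rs identifiers from dbSNP into the ID column.
        ["bcftools", "annotate", "-a", dbsnp_vcf, "-c", "ID",
         "-Oz", "-o", annotated, calls],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in build_pipeline("reference.fa", "reads.fastq", "dbsnp.vcf.gz"):
        print(shlex.join(cmd))
```

Printing the commands first lets you check them before running anything. Once they look right, each list can be passed to `subprocess.run(cmd, check=True)`.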

Check GATK Best Practices for the recommended workflow for this type of analysis: https://software.broadinstitute.org/gatk/best-practices/

That might be a lot of work, but it's worth reading just to be aware of what is involved. You can also try a graphical platform such as Galaxy, Illumina BaseSpace, or Seven Bridges.
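
Whichever route produces the annotated VCF, looking up specific rs numbers at the end is simple, because the identifiers sit in the third (ID) column. A minimal Python sketch, with function names of my own choosing:

```python
import gzip
from typing import Dict, Iterable, Iterator, Set


def open_vcf(path: str):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def find_rs_records(vcf_lines: Iterable[str], wanted: Set[str]) -> Iterator[Dict]:
    """Yield variant records whose ID column contains a wanted rs number."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete variant record
        # The ID column may hold several identifiers separated by ';'.
        hits = set(fields[2].split(";")) & wanted
        if hits:
            yield {
                "chrom": fields[0],
                "pos": int(fields[1]),
                "ids": sorted(hits),
                "ref": fields[3],
                "alt": fields[4],
            }
```

Usage would be something like `with open_vcf(path) as fh: hits = list(find_rs_records(fh, wanted_ids))`. Keep in mind that a variant-only VCF lists only positions that differ from the reference, so a missing rs number means either that the sample matches the reference there or that the site had too little coverage to call.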

8.2 years ago
charco ▴ 50

If you are really only interested in SNPs, there are so-called 'reference-free' methods that don't require aligning to a reference genome. They will be faster than a full alignment and SNP-calling workflow, but their accuracy may be lower.

Reading:
http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-S4-S10
http://nar.oxfordjournals.org/content/early/2014/11/16/nar.gku1187.full.pdf
