Fastq Format Issue
12.4 years ago
Arpssss ▴ 40

I am running some experiments with Bowtie and now want to test reads of 150 bp. I downloaded a dataset from here and converted it to FASTQ format. The resulting FASTQ records look like this:

@ERR103405.1 M10_151:1:2:12250:1321 length=302
ATTTACTGCCTTGTGTCTCCAGTGCGCTGAAAATACCTTTATCTTGAAATAAGTTAACTAACTCTTGGATACCTTTAATTAATGCTGGGTTACCACCAGAAATTGTAACGTGGTTAAATAAATCGCCACCAATACGTTTTAATTCATCATAGAACAGCTGGATGTGATTATCGCTGTAGCTGGTGTGATTCTGCATTTACTTGGGATGGTAGTGCTAAAGGCGATATAAAACTCATGACCGCTGAAGAAATTTATGATGAATTAAAACGTATTGGTGGCGATTTATTTAACCACGTTACAAT
+ERR103405.1 M10_151:1:2:12250:1321 length=302
CCCFFFFFHHHHHHHIHJJJJJIIJJIJJIJJJIIGJJJJIIGIJJHIGIIJJIIIJIIJJIJEIJIJFIIIFJGHHGHHFFFFFFFEDCCACCDA?ABDDDDDDCDC@?<ABBBDDDDEDDDC<?B?@BDDDDDB>CC@C:>AADDCACDB@CFFFDDHHBFHEHIIIIIGJIHHEGHIIHE1C?D?GGGIIIIGIFI>BHHIJ@3CHBDGGICHGEHIIGHE>BEDEDE;ACCDDCCA?B=BBCDCCCC@@>>C@CDC>@DCDCDDD<<@?AC(2??BDBDBCDCDDCC::?881<?C>:

NCBI describes the library as "DNA for paried end (150bp) sequencing on an illumina MiSeq", but the record above looks like a single 302 bp read. Can anybody explain why the header says "length=302" when the page says these are 150 bp reads?

bowtie genome fastq • 3.8k views
12.4 years ago
Neilfws 49k

If you are converting from SRA to FASTQ with the SRA Toolkit, you need to split the joined reads. This option splits each spot into its forward and reverse reads and writes both to one file:

fastq-dump --split-spot myfile.sra

And this one generates two separate files, one for the forward reads and one for the reverse reads:

fastq-dump --split-files myfile.sra

See my blog post for some more details.

Thanks a lot. This is what I really want.

12.4 years ago
Vikas Bansal ★ 2.4k

Hi. The page you linked clearly says that the reads are joined, so you can parse the file and split each record into two reads of 151 bp each. If you click on "Metadata" at the link you mentioned, it lists actual_read_length:151.
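
If you would rather split the already converted FASTQ yourself, the parsing step could look roughly like the Python sketch below. It is only an illustration under some assumptions: every record occupies exactly four lines, each joined record is exactly twice the read length, and the "/1" and "/2" name suffixes are just a common naming convention that I picked for the example. The function names are made up, and the second half is kept exactly as stored in the file.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a FASTQ file handle.

    Assumes strict 4-line records; stops at end of file or at an empty line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is not needed
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def split_joined_record(header, seq, qual, read_length=151):
    """Split one joined record into two reads of read_length bases each."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) != 2 * read_length:
        raise ValueError("record is not exactly two reads long")
    name = header.split()[0]  # drop the description, e.g. 'length=302'
    # '/1' and '/2' are an assumed naming convention for the two mates.
    first = (name + "/1", seq[:read_length], qual[:read_length])
    second = (name + "/2", seq[read_length:], qual[read_length:])
    return first, second


# Toy record (made-up data) with two joined 4 bp reads.
toy = "@toy.1 length=8\nACGTTTGA\n+toy.1 length=8\nIIIIHHHH\n"
for record in read_fastq(io.StringIO(toy)):
    for name, seq, qual in split_joined_record(*record, read_length=4):
        print(name, seq, qual, sep="\n")
```

For the real data you would open the FASTQ file instead of the toy string and keep the default read_length=151, so a 302 bp record becomes two 151 bp reads.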

Yes, I just saw it. Thanks. But I want an .sra file with 150 bp single-end reads, and I can't find one anywhere.

For the same dataset, or any file with single-end reads?

Any file, preferably Human/ChIP DNA.

But it should have 150 bp single-end reads.

Another point: here (http://www.ncbi.nlm.nih.gov/sra/SRX145461) it says "1 forward, 151 reverse". What does that mean?

Those are read1 and read2 of the paired-end run.
