Question

Meaning of FASTQ file names at this link

10.6 years ago

win ▴ 990

Hi all,

There are many FASTQ files at this link: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/NA12892/sequence_read/ but some of them are named like ERR001827_1.filt.fastq.gz and others like SRR001203.filt.fastq.gz.

It also seems that the ERR001827 files are also available as ERR001827_2, which means paired-end, but the SRR ones are not.

I am trying to create a single FASTQ file for each paired-end read set. Can someone please let me know whether I should use the ERR series only?

FASTQ • 2.3k views

Updated 3.4 years ago by Ram 45k • written 10.6 years ago by win ▴ 990

score 0 · Answer 1 · 2015-04-10

10.6 years ago

Devon Ryan 105k

That particular SRR file is from a single-end run, but some other SRR files are from paired-end runs. So if you're looking for the paired-end datasets, you'll need a subset of both groups of files rather than the ERR series alone (I presume they were sequenced in different places, since one accession prefix comes from SRA and the other from ENA).
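To pick out that subset, something along these lines might help. It is only a rough sketch: it assumes every file in the directory follows the `<run>[_1|_2].filt.fastq.gz` pattern seen in the question, and it treats a run as paired-end only when both the `_1` and `_2` files are present. The `classify_runs` helper and the regular expression are made up for illustration, not part of any tool.

```python
import re
from collections import defaultdict

# Assumed naming pattern (taken from the file names in the question):
#   <run accession>_1.filt.fastq.gz / <run accession>_2.filt.fastq.gz  -> mates
#   <run accession>.filt.fastq.gz                                      -> no mate suffix
NAME_RE = re.compile(r"^(?P<run>[EDS]RR\d+)(?:_(?P<mate>[12]))?\.filt\.fastq\.gz$")


def classify_runs(file_names):
    """Group file names by run accession and label each run 'paired' or 'single'."""
    mates = defaultdict(set)
    for name in file_names:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip names that do not fit the assumed pattern
        # mate is "1", "2", or None for a file without a mate suffix
        mates[match.group("run")].add(match.group("mate"))
    return {
        run: "paired" if {"1", "2"} <= found else "single"
        for run, found in mates.items()
    }


if __name__ == "__main__":
    listing = [
        "ERR001827_1.filt.fastq.gz",
        "ERR001827_2.filt.fastq.gz",
        "SRR001203.filt.fastq.gz",
    ]
    for run, layout in classify_runs(listing).items():
        print(run, layout)
```

You would feed it the directory listing from the FTP site and keep only the runs labelled `paired`, whatever their prefix.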
