Question

Identify sequencing machine based on fastq-file/reads

3

8.8 years ago

Jenez ▴ 540

Hello fellow bioinformaticians!

Once again I have a rather broad question: is it possible, in general, to identify which sequencing platform (and perhaps which machine) produced the raw reads in a sequencing file, using nothing but the file itself?

An educated guess can always be made from read length, Phred score scale, identifier lines, BAM vs. FASTQ, paired vs. single end, etc.
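As an illustration of the identifier-line part of such a guess, here is a minimal Python sketch. The function name `guess_platform` and the regular expressions are my own assumptions about what read names from each platform typically look like; they are not taken from any vendor specification, and files that have been renamed (for example by a public archive) will simply come back as "unknown".

```python
import re

# Heuristic patterns only: each one is an assumption about typical read names.
PATTERNS = [
    # Illumina (newer style): instrument:run:flowcell:lane:tile:x:y
    ("illumina", re.compile(r"^@[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+(\s|$)")),
    # Illumina (older style): instrument:lane:tile:x:y, often with #index/read
    ("illumina (older)", re.compile(r"^@[^:\s]+:\d+:\d+:\d+:\d+(#\S+)?(/[12])?(\s|$)")),
    # PacBio: movie/ZMW[/start_stop], movie names start with 'm' plus a digit
    ("pacbio", re.compile(r"^@m\d[^/\s]*/\d+(/\S+)?(\s|$)")),
    # Nanopore: read id is usually a UUID
    ("nanopore", re.compile(
        r"^@[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(\s|$)")),
    # Ion Torrent: 5-character run prefix followed by row:col
    ("ion torrent", re.compile(r"^@[A-Z0-9]{5}:\d+:\d+(\s|$)")),
]


def guess_platform(header):
    """Return a best-guess platform name for one FASTQ header line."""
    for name, pattern in PATTERNS:
        if pattern.search(header):
            return name
    return "unknown"
```

In practice you would run this over the first few thousand headers of a file and take the majority answer, then cross-check it against read length and quality encoding before trusting it.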

I'm still very much a novice in the field and have a poor feel for how sequencing technologies have evolved over the past decade, which has brought different Phred scales, ASCII ranges, and read lengths.
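For the Phred scale and ASCII range part, a rough check is to look at the lowest quality character in a sample of reads. The sketch below is only a heuristic under my own assumptions: the function name `guess_quality_encoding` is hypothetical, and the cut-offs (59 for the Solexa+64 minimum, 64 for the Phred+64 minimum) follow the commonly cited FASTQ conventions. A Phred+33 file that happens to contain only high scores would be misread as +64, so sample plenty of reads.

```python
def guess_quality_encoding(quality_lines):
    """Guess the quality offset from the lowest ASCII code observed.

    quality_lines: iterable of quality strings (the 4th line of each record).
    """
    lowest = min(
        (ord(char) for line in quality_lines for char in line),
        default=None,
    )
    if lowest is None:
        return "unknown"      # no quality characters were supplied
    if lowest < 59:
        return "phred+33"     # characters below ';' only occur with offset 33
    if lowest < 64:
        return "solexa+64"    # ';' to '?' fit the old Solexa scale
    # '@' or above: consistent with offset 64, but ambiguous on small samples
    return "phred+64"
```

An offset of 64 points to older Illumina or Solexa pipelines, while an offset of 33 is what most current files use, so this narrows down the age of the data more than the platform.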

So my question is: is there a sure-fire way to figure out what technology was used to create raw reads from, say, the past ~5 years? My main wish is to differentiate the main sequencing technologies (Illumina, Ion Torrent, PacBio, Nanopore, 454). All input is appreciated.

Thank you!

sequencing raw reads fastq sequencing machine • 12k views

updated 7.8 years ago by vaslanzadeh ▴ 20 • written 8.8 years ago by Jenez ▴ 540

0

Hello, I have a FASTQ file from 2012 and want to know what technology was used to create it. I read this post but could not find my answer. Here is one of the reads:

@ILLUMINA-08A740:1:FC64PL6AAXX:5:1:3350:1028 1:N:0:TGACCA
NTTGACTGTGCTACGCGAATCATGGAATTCTCGGGT

Thanks

7.8 years ago by vaslanzadeh ▴ 20

0

This is clearly marked as "Illumina", which would be the technology used. Are you interested in finding out what kind of Illumina sequencer it was sequenced on? Since the sample ran in lane 5, it is most likely a HiSeq.
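To make the lane and flowcell reasoning easier to repeat on other files, the header quoted in the question can be split into its fields. This is a minimal sketch assuming the colon-separated layout instrument:run:flowcell:lane:tile:x:y followed by read:filtered:control:index; the function name `parse_illumina_header` and the dictionary keys are hypothetical, and older Illumina headers with fewer fields will raise a `ValueError`.

```python
def parse_illumina_header(header):
    """Split a newer-style Illumina FASTQ header into named fields."""
    if header.startswith("@"):
        header = header[1:]
    name, _, comment = header.strip().partition(" ")

    # Raises ValueError if the name does not have exactly seven fields.
    instrument, run, flowcell, lane, tile, x_pos, y_pos = name.split(":")
    fields = {
        "instrument": instrument,
        "run": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x_pos),
        "y": int(y_pos),
    }

    if comment:
        # read number : filtered flag (Y/N) : control bits : index sequence
        read, filtered, control, index = comment.split(":", 3)
        fields.update(
            read=int(read), filtered=filtered, control=control, index=index
        )
    return fields


record = parse_illumina_header(
    "@ILLUMINA-08A740:1:FC64PL6AAXX:5:1:3350:1028 1:N:0:TGACCA"
)
print(record["instrument"], record["flowcell"], record["lane"])
```

The instrument name, flowcell ID and lane number are the fields most useful for narrowing down the sequencer model.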

7.8 years ago by GenoMax 151k

score 2 · Answer 1 · 2016-08-10

2

8.8 years ago

GenoMax 151k

Information for Illumina was compiled in a recent thread: Illumina Instrument Type from fastq?

PacBio sequence data information is here.

The FASTQ format entry on Wikipedia has a nice recap of how things have changed (and finally stabilized) over the last decade.
