Hello fellow bioinformaticians!
Once again I have a rather broad question to ask: is it possible, in general, to identify which sequencing platform (and perhaps which machine) produced the raw reads in a sequencing file, using nothing but the file itself?
There's always the educated guess that can be made based on read length, Phred score scale, identifier lines, BAM vs FASTQ, paired vs single end, etc.
I'm still very much a novice in the field, and I have a poor feel for how sequencing technologies have evolved over the past decade, with their different Phred scales, ASCII ranges, and read lengths.
So my question is: is there a sure-fire way to figure out which technology was used to create the raw reads, for files from, say, the past ~5 years? My main wish would be to tell apart the main sequencing technologies (Illumina, Ion Torrent, PacBio, Nanopore, 454). All input is appreciated.
Thank you!
Hello, I have a fastq file from 2012 and want to know what technology was used to create this file. I read this post and could not find my answer. Here is one of the reads:
@ILLUMINA-08A740:1:FC64PL6AAXX:5:1:3350:1028 1:N:0:TGACCA
NTTGACTGTGCTACGCGAATCATGGAATTCTCGGGT
Thanks
This is clearly marked as "Illumina", which would be the technology used. Are you interested in finding out which kind of Illumina sequencer it was run on? Since the sample ran in lane 5, the instrument must have a multi-lane flow cell, which rules out the single-lane MiSeq; it is most likely a HiSeq.
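If you want to pull the instrument name, flow cell ID and lane out of many reads at once, the identifier line in your read appears to follow the Illumina CASAVA 1.8+ layout (`@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`). Here is a minimal Python sketch of that split. The names `IlluminaHeader` and `parse_illumina_header` are made up for illustration, and the code assumes the header uses exactly this layout; older Illumina headers and headers from other platforms will raise an error rather than parse.

```python
from typing import NamedTuple, Optional


class IlluminaHeader(NamedTuple):
    """Fields of a CASAVA 1.8+ style identifier line (hypothetical helper type)."""
    instrument: str
    run_number: str
    flowcell_id: str
    lane: int
    tile: int
    x: int
    y: int
    read: Optional[int] = None
    is_filtered: Optional[str] = None
    control_number: Optional[str] = None
    index: Optional[str] = None


def parse_illumina_header(line: str) -> IlluminaHeader:
    """Split one identifier line into its colon-separated fields.

    Assumption: the line looks like
    @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
    """
    if not line.startswith("@"):
        raise ValueError("FASTQ identifier lines start with '@'")
    tokens = line[1:].split()
    if not tokens:
        raise ValueError("empty identifier line")

    # First token: where the read came from (machine, flow cell, position).
    left = tokens[0].split(":")
    if len(left) != 7:
        raise ValueError("expected 7 colon-separated fields before the space")
    instrument, run_number, flowcell_id, lane, tile, x, y = left

    # Second token (optional): read number, filter flag, control bits, index.
    read = is_filtered = control_number = index = None
    if len(tokens) > 1:
        right = tokens[1].split(":")
        if len(right) == 4:
            read = int(right[0])
            is_filtered, control_number, index = right[1], right[2], right[3]

    return IlluminaHeader(instrument, run_number, flowcell_id,
                          int(lane), int(tile), int(x), int(y),
                          read, is_filtered, control_number, index)


if __name__ == "__main__":
    header = parse_illumina_header(
        "@ILLUMINA-08A740:1:FC64PL6AAXX:5:1:3350:1028 1:N:0:TGACCA"
    )
    print(header.instrument, header.flowcell_id, header.lane)
```

The instrument name and flow cell ID are the fields most worth collecting, since they are what you would compare against known naming patterns to narrow down the sequencer model. The header alone does not state the model.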