8.3 years ago
14134125465346445
Has anyone experienced repeated read ids from Illumina MiniSeq fastq.gz files?
I have seen a few cases where the fastq.gz files produced from the bcls of a MiniSeq run contain a few spurious reads printed twice consecutively.
Has anyone here seen this as well?
$ zgrep -A3 -n '@MN00325:3:000H223KC:1:11106:5362:8398\ 1:N:0:GTCCGC' SAMPLE01_S17_L001_R1_001.fastq.gz
SAMPLE01_S17_L001_R1_001.fastq.gz:1162953:@MN00325:3:000H223KC:1:11106:5362:8398 1:N:0:GTCCGC
SAMPLE01_S17_L001_R1_001.fastq.gz:1162954-AAAAAAATAAATAATTTTATTAAAAAGTGGGTAAAGCATATGAATGGATATTTTTTAAAAGAAGATATTTATGTAC
SAMPLE01_S17_L001_R1_001.fastq.gz:1162955-+
SAMPLE01_S17_L001_R1_001.fastq.gz:1162956-AFFFFFFFFFFFFFF//=F//FFFFFF66AFAFFFFFFFF//FF6//F/F=FA//AFFFF/FF/FFF///FAA/F/
SAMPLE01_S17_L001_R1_001.fastq.gz:1162957:@MN00325:3:000H223KC:1:11106:5362:8398 1:N:0:GTCCGC
SAMPLE01_S17_L001_R1_001.fastq.gz:1162958-AAAAAAATAAATAATTTTATTAAAAAGTGGGTAAAGCATATGAATGGATATTTTTTAAAAGAAGATATTTATGTAC
SAMPLE01_S17_L001_R1_001.fastq.gz:1162959-+
SAMPLE01_S17_L001_R1_001.fastq.gz:1162960-AFFFFFFFFFFFFFF//=F//FFFFFF66AF/FFFFFFFF//FF66/F6F=FF//AFFFF/FF/FFF///F/A/F/
If this is real and reproducible, it sounds like a bug in the on-board data processing software on the MiniSeq. A bug like that would likely have been caught a long time ago, unless the software on the MiniSeq you are using has not been updated.
It may also be worth emailing Illumina tech support in case whoever produced the data has no satisfactory answer.
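Before contacting anyone, it may help to know how many records are affected in each file. The following is only a sketch: the function name `find_consecutive_dup_ids` is made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines). It prints the line number and header of every record whose header is identical to that of the record immediately before it, using the same line numbering as `zgrep -n`.

```shell
# Hypothetical helper: list headers that repeat on two consecutive records.
# Assumes every record is exactly 4 lines (header, sequence, '+', qualities).
find_consecutive_dup_ids() {
    gzip -dc "$1" | awk '
        NR % 4 == 1 {                      # header line of each record
            if ($0 == prev) print NR ": " $0
            prev = $0
        }'
}
```

For example, `find_consecutive_dup_ids SAMPLE01_S17_L001_R1_001.fastq.gz | wc -l` would give the number of repeated records in that file.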
Can you post an example?
I added an example in the question now.
The quality scores are slightly different, although both records come from the same position on the flow cell. This is... odd. Was BQSR done and then the files were merged or something?
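To check whether every repeated record shows the same pattern (same sequence, different qualities), something like the sketch below could be used. The function name `compare_dup_records` is hypothetical, and it again assumes 4-line FASTQ records. For each record whose header matches the previous record, it prints the header followed by whether the sequence and the quality string match.

```shell
# Hypothetical helper: for consecutive records with the same header,
# report whether the sequence and quality lines are identical.
# Assumes every record is exactly 4 lines.
compare_dup_records() {
    gzip -dc "$1" | awk '
        NR % 4 == 1 { id = $0 }            # header
        NR % 4 == 2 { seq = $0 }           # sequence
        NR % 4 == 0 {                      # quality line closes the record
            if (id == pid) {
                s = (seq == pseq) ? "same_seq" : "diff_seq"
                q = ($0 == pqual) ? "same_qual" : "diff_qual"
                print id "\t" s "\t" q
            }
            pid = id; pseq = seq; pqual = $0
        }'
}
```

If all repeats come out as `same_seq` and `diff_qual`, that would suggest the duplicates are not a simple copy of the same record.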
It's strange. Contact the people who provided the data and ask whether they have done any processing.
I have never seen anything like that using bcl2fastq2.