index sequence in fastq header
3 months ago
hpapoli ▴ 150

Hello,

I've been inspecting my FASTQ files and have a question about the index sequence.

Given a fastq sequence header as follows:

@A00181:639:HNTFMDSX5:2:1101:1018:1000 1:N:0:ACACTAAG+TTATGGAT

I understand that ACACTAAG+TTATGGAT is an index sequence that differentiates samples on a flow cell. My first question is whether my understanding is right?

If so, wouldn't I expect all reads in a given sample to have exactly the same sequence for their index? This is mostly the case, except that I also see different indices here and there in the same fastq file such as ACACAAAG+ATATGGAT. Why is that the case?

Thanks so much for your help!

fastq • 435 views
3 months ago
GenoMax 148k

My first question is whether my understanding is right?

Yes. It is actually a pair of indexes (the + separates the two). This is a dual-indexed sample. One can also have single-indexed samples, in which case there will be only one sequence in the header.
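To check this on your own files, here is a minimal Python sketch that pulls the index field out of a header line. It assumes the header layout shown in the question (a space, then colon-separated fields with the index last); `parse_index` is a made-up helper name, not part of any library.

```python
def parse_index(header):
    """Return the index sequence(s) from a FASTQ header line as a tuple.

    Assumes the layout from the question:
    @<read id> <read>:<filtered>:<control>:<index>
    """
    # Everything after the first space is the comment part of the header.
    comment = header.strip().split(" ", 1)[1]
    # The index is the last colon-separated field of that comment.
    index_field = comment.split(":")[-1]
    # Dual indexes are joined by "+"; a single index has no "+".
    return tuple(index_field.split("+"))


header = "@A00181:639:HNTFMDSX5:2:1101:1018:1000 1:N:0:ACACTAAG+TTATGGAT"
print(parse_index(header))  # ('ACACTAAG', 'TTATGGAT')
```

A single-indexed header would give a one-element tuple instead.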

wouldn't I expect all reads in a given sample to have exactly the same sequence for their index?

For a particular sample labeled with that index pair, yes.

I also see different indices here and there in the same fastq file such as ACACAAAG+ATATGGAT.

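If the observed indexes differ from the expected ones by one or two bases, they are considered identical for the purpose of demultiplexing; this tolerance allows for sequencing errors in the index reads. The number of differing positions between two equal-length sequences is their Hamming distance: https://en.wikipedia.org/wiki/Hamming_distance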
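As a quick illustration, the sketch below computes the Hamming distance for the two index pairs from the question. The `hamming` function is written here for illustration only, and comparing each index of the pair separately is an assumption about how you would want to count mismatches, not a description of any particular demultiplexer.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


expected = ("ACACTAAG", "TTATGGAT")  # index pair seen in most reads
observed = ("ACACAAAG", "ATATGGAT")  # the odd pair from the question

# Compare first index with first index, second with second.
distances = [hamming(e, o) for e, o in zip(expected, observed)]
print(distances)  # [1, 1]
```

Each index differs from the expected one at a single position, which is consistent with a sequencing error in the index read rather than a different sample.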

The only other way this is possible is if the file you have contains multiple samples and may need to be demultiplexed (assuming this is not single-cell data). Tools such as demuxbyname.sh from the BBMap suite and deML can be used to demultiplex the data.

This is great! Thanks very much! Each fastq file contains 1 sample and indeed, the difference is in 1 base. Thanks for explaining this very clearly!
