Question

Mate pairs and paired reads confusion

0

3.0 years ago

Joel Wallenius ▴ 220

Hello!

I find myself confused about insert size and pairing of reads. How are read pairs paired? As in, how does the aligning software know that they belong together? How does the sequencing machine? And, will the aligner know the exact distance between reads in a pair, so as to build a scaffold?

Or can only mate pairs do the latter? Are mate pairs still a thing? How does the aligner know the distance between two mates?

I realize these are quite basic questions, and apologize in advance.

Sincerely, Joel

sequencing • 3.6k views

ADD COMMENT • link 3.0 years ago by Joel Wallenius ▴ 220

0

Difference Between "Mate Pair" And "Pair-End"

http://seqanswers.com/forums/showthread.php?t=15626

ADD REPLY • link 3.0 years ago by Sej Modha 5.3k

score 2 · Answer 1 · 2022-03-17

2

3.0 years ago

Istvan Albert 102k

Mated pairs are a type of paired-end read where the distance between the reads and their orientation are different.

Paired-end reads came to describe the Illumina sequencing protocol where the reads point towards one another, the read lengths are about 150 bp, and the distance between the outer ends is a few hundred base pairs:

==150==>     <==150==

|-------  400 ------|

Mated pair libraries used to mean some sort of circularization method during library preparation, where, after sequencing, the reads point in the same direction, and the distance is a few thousand base pairs.

 ==150==>                           ==150==>

|-------            2000              ------|

Note how, once both reads are aligned, the aligner can immediately tell the distance and orientation of the read pairs and thus identify the protocol.
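As a rough illustration of that inference, the sketch below classifies a pair from the mapped coordinates and strands of its two reads. The function name `classify_pair` and its arguments are hypothetical, and it assumes ungapped alignments of a fixed read length; real aligners expose the same information through the SAM fields POS, FLAG and TLEN.

```python
def classify_pair(pos1, is_reverse1, pos2, is_reverse2, read_len=150):
    """Return (orientation, outer_distance) for two reads on the same reference.

    pos1/pos2 are leftmost mapped coordinates; is_reverse1/is_reverse2 are True
    when the read aligned to the reverse strand. Assumes ungapped alignments
    that each cover exactly read_len bases (a simplification).
    """
    # Put the read that starts further left first.
    left, right = sorted([(pos1, is_reverse1), (pos2, is_reverse2)])
    # Distance from the left end of the left read to the right end of the right read.
    outer_distance = right[0] + read_len - left[0]
    if left[1] == right[1]:
        orientation = "same direction"
    elif not left[1] and right[1]:
        orientation = "inward (FR)"
    else:
        orientation = "outward (RF)"
    return orientation, outer_distance


# The two layouts drawn above, with made-up coordinates.
paired_end = classify_pair(1000, False, 1250, True)
mated_pair = classify_pair(1000, False, 2850, False)
```

With these made-up coordinates the first pair comes out as inward-facing with an outer distance of 400, and the second as same-direction with an outer distance of 2000, matching the two diagrams.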

Mated pairs are typically used for assembly, as they allow ordering more distant pieces of DNA even when the intermediate sequences are missing.

ADD COMMENT • link 3.0 years ago by Istvan Albert 102k

0

Thank you Istvan for your reply! You say "Mated pair libraries used to mean...", are you implying this is no longer the case? Are they still being used?

ADD REPLY • link 3.0 years ago by Joel Wallenius ▴ 220

0

I have not seen data produced with this technology for many years now, hence I am not quite sure if it is still in use and whether the terminology is still the same.

I suspect that long-read technology like PacBio has turned mated pairs into a somewhat obsolete technology.

ADD REPLY • link 3.0 years ago by Istvan Albert 102k

0

Thank you very much Istvan. Have a lovely week!

ADD REPLY • link 3.0 years ago by Joel Wallenius ▴ 220

score 0 · Answer 2 · 2022-03-17

0

3.0 years ago

i.sudbery 21k

See @IstvanAlbert's answer for the difference between paired-end and mate-pair.

For your other question:

The sequencing machine knows pairs belong together because they reside at identical locations on the flowcell. Basically, the two ends of a fragment of DNA have different primers on them. A run of the machine is done using the read1 primer first, with the fluorescence at each coordinate on the flowcell recorded at each base cycle, and then the newly synthesized strand is stripped off. The process is then repeated using the read2 primer. As read1 and read2 are reads from the same physical piece of DNA, they will be in the same location on the flowcell.
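The shared location is visible in the read names themselves. In the Illumina FASTQ header layout used since CASAVA 1.8, the name carries the flowcell, lane, tile and x/y position of the cluster, followed after a space by the read number. The sketch below pulls those fields out; `cluster_coordinates` is a hypothetical helper, and the two headers are made up for illustration.

```python
def cluster_coordinates(header):
    """Extract (flowcell, lane, tile, x, y) from an Illumina-style FASTQ header.

    Assumes the layout
    @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
    """
    # Drop the leading '@' and the part after the space (read number etc.).
    name = header.lstrip("@").split()[0]
    fields = name.split(":")
    flowcell, lane, tile, x, y = fields[2:7]
    return flowcell, int(lane), int(tile), int(x), int(y)


# Made-up headers for the two reads of one cluster.
read1_header = "@INSTR:1:FLOWCELL:2:1101:1500:2000 1:N:0:ACGT"
read2_header = "@INSTR:1:FLOWCELL:2:1101:1500:2000 2:N:0:ACGT"
same_cluster = cluster_coordinates(read1_header) == cluster_coordinates(read2_header)
```

Here `same_cluster` is true because only the read number after the space differs between the two headers.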

The aligner knows that two reads belong together because of the order in the fastq file. The first read in the read1 fastq is the pair of the first read in the read2 fastq, and the 600th read in the read1 fastq is the pair of the 600th read in the read2 fastq. This is why it is important not to change the order of reads in fastq files without taking account of pairing.
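A minimal sketch of that order-based pairing, assuming plain four-line FASTQ records: the two files are walked in lockstep and each pair of names is checked as a sanity test. The helper names `read_fastq` and `paired_records` are hypothetical, and the in-memory records at the bottom are made up for the demonstration.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or a trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        # Keep only the name: drop '@' and anything after the first space.
        yield header.rstrip("\n")[1:].split()[0], seq, qual


def strip_mate_suffix(name):
    """Remove an old-style /1 or /2 mate suffix, if present."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def paired_records(handle1, handle2):
    """Yield (read1, read2) tuples, pairing records purely by their order."""
    for rec1, rec2 in zip_longest(read_fastq(handle1), read_fastq(handle2)):
        if rec1 is None or rec2 is None:
            raise ValueError("FASTQ files contain different numbers of reads")
        if strip_mate_suffix(rec1[0]) != strip_mate_suffix(rec2[0]):
            raise ValueError(f"out-of-sync pair: {rec1[0]} vs {rec2[0]}")
        yield rec1, rec2


# Made-up records standing in for the read1 and read2 files.
r1 = io.StringIO("@a 1:N:0\nACGT\n+\nIIII\n@b 1:N:0\nTTTT\n+\nIIII\n")
r2 = io.StringIO("@a 2:N:0\nGGCC\n+\nIIII\n@b 2:N:0\nAAAA\n+\nIIII\n")
pairs = list(paired_records(r1, r2))
```

If one file had been sorted, filtered or truncated independently of the other, the name check or the length check would raise instead of silently pairing unrelated reads.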

ADD COMMENT • link 3.0 years ago by i.sudbery 21k

0

Thank you for your helpful reply! I understand now, that was a great explanation. Thank you so much :-] Do you happen to also know how the analogous process works for mate pairs?

ADD REPLY • link 3.0 years ago by Joel Wallenius ▴ 220

1

I believe that if you reverse complement the second read (before aligning) the mated-pairs will be in the same orientation as a "regular" paired-end would.

Thus workflows that need that orientation would work with it.
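A small sketch of that flip, under the assumption that reads are handled as plain sequence and quality strings. The names `reverse_complement` and `flip_read` are hypothetical. The quality string has to be reversed along with the sequence so that each score stays attached to its base. Which read needs flipping depends on the library protocol, so this follows the orientation described above rather than any particular kit.

```python
# Translation table for the four bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_read(seq, qual):
    """Reverse-complement a read and reverse its qualities to match."""
    return reverse_complement(seq), qual[::-1]


# Made-up second read of a pair.
flipped_seq, flipped_qual = flip_read("AACG", "IIH#")
```

In this example the sequence `AACG` becomes `CGTT` and the quality string `IIH#` becomes `#HII`.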

ADD REPLY • link 3.0 years ago by Istvan Albert 102k

0

I'm afraid I don't. I've not handled mate-pair reads before, and my impression is that they have mostly been replaced by long read sequencing, but I might be wrong.