Hello,
I recently received a fairly large dataset consisting of three paired-end libraries (insert sizes 350, 550, and 700) and two mate-pair libraries (Nextera, insert sizes 3000 and 5000). I started with Trimmomatic, which produced nicely trimmed paired-end libraries. The same procedure failed badly on the mate-pair libraries because of the way mate-pair sequencing works. So I turned to the fairly recent and very promising NxTrim, which does exactly what I need (cutting Nextera adapters and sorting reads), but it outputs all mp and pe reads in one file. I therefore assume that the mate pairing is encoded in the read headers, but I am wondering how, and how to sort the reads into separate R1 and R2 files. I still need to trim the sequencing adapters from the sorted mate-pair libraries, and Trimmomatic expects paired reads in separate input files.
So, does anybody know how exactly paired reads are recognised?
I found that these merged FASTQ files are called interleaved FASTQ (from the web page of another trimmer, AdapterRemoval). Searching for that term, I found a bash script that probably solves my problem but does not really answer the question of how the pairing is encoded. Is it just the order of the sequences? The bash script simply sorts odd-numbered reads into the R1 file and even-numbered reads into the R2 file...
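For reference, the odd/even splitting can be sketched in Python with a sanity check on the read names added. This is only a sketch under several assumptions: each record is exactly four lines (no wrapped sequences), mates are adjacent in the file, and the headers follow one of the two common Illumina conventions (a trailing `/1` or `/2`, or a space followed by `1:...` or `2:...`). Whether NxTrim's output uses one of these conventions should be checked against the actual headers. The function names `read_records`, `read_name`, `deinterleave` and `split_file` are made up for illustration.

```python
import gzip
import itertools


def read_records(lines):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped records)."""
    it = iter(lines)
    while True:
        rec = list(itertools.islice(it, 4))
        if not rec:
            return
        if len(rec) != 4:
            raise ValueError("truncated FASTQ record")
        yield rec


def read_name(header):
    """Read name without the mate label.

    Drops everything after the first whitespace (Casava 1.8+ style comment)
    and a trailing /1 or /2 (older Illumina style).
    """
    name = header.strip().split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def deinterleave(lines):
    """Split interleaved FASTQ lines into (R1 lines, R2 lines) by order."""
    r1, r2 = [], []
    records = read_records(lines)
    for first in records:
        second = next(records, None)  # the mate is assumed to follow directly
        if second is None:
            raise ValueError("odd number of records, last read has no mate")
        if read_name(first[0]) != read_name(second[0]):
            raise ValueError(
                "adjacent reads have different names: %s / %s"
                % (first[0].strip(), second[0].strip())
            )
        r1.extend(first)
        r2.extend(second)
    return r1, r2


def split_file(in_path, r1_path, r2_path):
    """Hypothetical helper: de-interleave a (possibly gzipped) FASTQ file."""
    opener = gzip.open if in_path.endswith(".gz") else open
    with opener(in_path, "rt") as fh:
        r1, r2 = deinterleave(fh)  # holds everything in memory, fine for a test
    with open(r1_path, "w") as out1, open(r2_path, "w") as out2:
        out1.writelines(r1)
        out2.writelines(r2)
```

If the name check raises an error on real data, the file is either not strictly interleaved or uses a header convention this sketch does not cover, and splitting by order alone would silently mispair the reads.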
Maybe you can give us some read headers from the mp and pe files so we can see how the difference is encoded!