I am learning how to analyse NGS data. I have data for 192 samples, obtained through a targeted sequencing library prep.
For each sample I have received 2 files, so 384 files in total. For example:
- sample1_TTGCCTT_L008_R1_001.fastq.gz
- sample1_TTGCCTT_L008_R2_001.fastq.gz
Presumably there are 2 files per sample because paired-end sequencing was performed. I've been reading around on the various steps of NGS analysis but I can't seem to find an answer to the following questions:
How should paired-end reads be handled? Do you have to merge them, and if so, when? Before alignment, presumably? Also, do I have to uncompress the fastq.gz files before I do anything? I am very new to NGS, so apologies if these are really basic questions. Thanks.
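On the file-handling side, here is a minimal Python sketch of how the two files per sample can be matched up and read without uncompressing them first (Python's standard `gzip` module reads `.fastq.gz` directly). It assumes the naming pattern shown above, where the two mates differ only in the `_R1_` / `_R2_` tag, and it assumes plain 4-line FASTQ records. The function names `pair_fastq_files` and `read_fastq` are made up for illustration and are not from any library.

```python
import gzip
import re
from collections import defaultdict


def pair_fastq_files(filenames):
    """Group R1/R2 files that share a name once the read tag is removed.

    Assumption: mates differ only in an `_R1_` / `_R2_` tag in the filename.
    """
    groups = defaultdict(dict)
    for name in filenames:
        match = re.search(r"_(R[12])_", name)
        if match is None:
            continue  # does not follow the assumed naming pattern
        # Key is the filename with the R1/R2 tag cut out.
        key = name[:match.start()] + name[match.end() - 1:]
        groups[key][match.group(1)] = name
    # Keep only complete pairs.
    return {
        key: (mates["R1"], mates["R2"])
        for key, mates in groups.items()
        if "R1" in mates and "R2" in mates
    }


def read_fastq(path):
    """Yield (read_id, sequence, quality) from a gzip-compressed FASTQ file.

    Assumption: each record is exactly 4 lines (no wrapped sequences).
    """
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip()
            handle.readline()  # '+' separator line, ignored
            quality = handle.readline().rstrip()
            # Drop the leading '@' and anything after the first space.
            yield header[1:].split()[0], sequence, quality
```

With the two example files above, `pair_fastq_files` returns one entry whose value is the `(R1, R2)` tuple for sample1. The two files are kept separate here; the sketch only works out which file goes with which.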
If the reads are longer than the fragment, the sequencer reads through the end of the fragment and into the adapter, so the 3' end of the read contains adapter sequence rather than sample sequence. This is why many pipelines include an adapter trimming step.
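To make the read-through idea concrete, here is a toy Python sketch of what an adapter trimming step does. It only handles exact matches, so treat it as an illustration rather than a replacement for a real trimmer (dedicated tools such as cutadapt, Trimmomatic or fastp also tolerate sequencing errors). The function name `trim_adapter`, the `min_overlap` parameter and the example adapter string are assumptions for the example; use the adapter sequence from your own library prep kit.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove adapter sequence from the 3' end of a read (exact matches only).

    Hypothetical helper for illustration; real trimmers allow mismatches.
    """
    # Case 1: the whole adapter is inside the read. Everything from the
    # adapter onwards is not part of the original fragment.
    index = read.find(adapter)
    if index != -1:
        return read[:index]

    # Case 2: the read ends part-way through the adapter, so only a prefix
    # of the adapter is present at the 3' end. Try the longest overlap first.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]

    # No adapter found: the fragment was at least as long as the read.
    return read


# Assumed adapter sequence, for the example only.
ADAPTER = "AGATCGGAAGAGC"
fragment = "ACGTACGTAC"

# A read that ran off the end of a short fragment and into the adapter.
read_through = fragment + ADAPTER[:5]
print(trim_adapter(read_through, ADAPTER))  # prints ACGTACGTAC
```

The `min_overlap` cut-off is a trade-off: a very short match at the end of a read can occur by chance, so requiring a few matching bases avoids trimming genuine sequence.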