Hi, I'm new to alignment tools and to handling DNA data.
I understand the process of aligning reads to a reference genome and filtering them based on quality (I was interpreting results from a BAM file), but I'm very confused about paired-end reads. I'm assuming the reads in the FASTQ file work like this: say we have a 150 bp sequence, and it is split into 2 reads, Read 1 and Read 2, and we also have the reverse complement of the same Read 1 and Read 2. Is that true?
I have gone through some materials but I'm still confused about this. Could you please clarify?
Thanks for your time.
Just to clarify (as per the OP's question): R2 is not the reverse complement of R1. R1 and R2 are read from opposite ends of the same DNA fragment, so they usually map to positions in the genome that are roughly 200-500 bp apart (depending on the insert size). It is true that R1 and R2 are sequenced from opposite strands, but you usually don't need to worry about that, since aligners and other tools that handle paired FASTQ files take care of it for you.
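If it helps to see it concretely, here is a toy Python sketch of how a read pair relates to its fragment. It is only an illustration under simplifying assumptions: the fragment sequence is made up, the fragment and reads are far shorter than real ones, and `simulate_pair` is a hypothetical helper, not part of any real tool. It assumes the common layout where R1 is read from one end of the fragment and R2 from the other end on the opposite strand.

```python
# Toy illustration only: real reads come from a sequencer, not from code.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def simulate_pair(fragment: str, read_len: int) -> tuple:
    """Hypothetical helper: return (R1, R2) for one fragment.

    R1 is the first read_len bases of the fragment as written.
    R2 is the last read_len bases, read from the opposite strand,
    i.e. the reverse complement of the fragment's far end.
    """
    r1 = fragment[:read_len]
    r2 = reverse_complement(fragment[-read_len:])
    return r1, r2


fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"  # made-up 30 bp fragment
r1, r2 = simulate_pair(fragment, read_len=8)

print("R1:", r1)
print("R2:", r2)
# R2 is the reverse complement of the fragment's other end, not of R1.
print("R2 is revcomp of R1:", r2 == reverse_complement(r1))
print("R2 is revcomp of fragment end:", reverse_complement(r2) == fragment[-8:])
```

The bases in the middle of the fragment are not covered by either read here, which is why the two mates end up some distance apart once aligned.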