Hi
I recently received WGS data for 3 samples of the same genome, for SNP analysis. I see that my FASTQ files contain Ns. How should I handle this? If I do not trim them and go straight to mapping against my reference genome, will the aligner ignore the Ns while mapping?
Total number of bases: 45191688940
Number of N bases: 884912
Also, the total number of bases differs across the 3 samples. How is this possible when the sequencing was done on different samples of the same genome?
File     Total bases
s1_R1    45191688940
s1_R2    45191688940
s2_R1    43709052900
s2_R2    43709052900
s3_R1    53171402300
s3_R2    53171402300
So what do you mean by total number of bases?
Each FASTQ entry (read) has a length (for example, 100 bp). Multiply that by the number of FASTQ entries and that's how many bases of sequence you have. The number of reads varies from one sample and run to the next, so the totals differ too. Only after you align the data can you talk about sequence coverage across the genome.
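If it helps to see that count spelled out, here is a minimal Python sketch of it. It assumes the standard 4-line FASTQ layout (header, sequence, `+`, qualities) with no wrapped sequence lines; the function names `fastq_base_counts` and `open_fastq` are made up for this illustration and do not come from any library. It sums the actual read lengths rather than multiplying, so it also works when reads are not all the same length, and it counts the Ns along the way.

```python
import gzip
import io


def fastq_base_counts(handle):
    """Return (reads, total_bases, n_bases) for a 4-line-per-record FASTQ stream."""
    reads = total = n_bases = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the 2nd line of every 4-line record
            seq = line.strip()
            reads += 1
            total += len(seq)
            n_bases += seq.upper().count("N")
    return reads, total, n_bases


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Tiny in-memory example: 2 reads of 5 bp each, 3 of the 10 bases are N.
example = io.StringIO("@r1\nACGTN\n+\nIIII#\n@r2\nNNACG\n+\n##III\n")
reads, total, n_bases = fastq_base_counts(example)
print(reads, total, n_bases, n_bases / total)  # 2 10 3 0.3
```

For a real file you would call something like `fastq_base_counts(open_fastq("s1_R1.fastq.gz"))`, where the file name is only a guess at how the samples are named. With the numbers posted above, 884912 Ns out of 45191688940 bases is roughly 0.002% of the data.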
Seriously, though - There are many resources explaining the sequencing and alignment process. I recommend that you seek out and read some of them so that you understand this before proceeding.