Hi, I have a doubt about the number of sequences in a paired end merging using PANDASEQ and the sequencing counting. Let me explain, I merged a forward and reverse set of sequences using pandaseq, before this I checked the number using the command
EDIT: I forgot to mention I converted the fastq to fasta with fastq_to_fasta script.
grep -c ">" sample_R1.fasta
It gives me around 200.000 sequences, when I execute PANDASEQ the last report shows this.
0x7fffc5dabb50:2 STAT TIME Tue Mar 12 02:53:23 2019
0x7fffc5dabb50:2 STAT ELAPSED 47
0x7fffc5dabb50:2 STAT READS 63600
0x7fffc5dabb50:2 STAT NOALGN 4286
0x7fffc5dabb50:2 STAT LOWQ 156
0x7fffc5dabb50:2 STAT BADR 0
0x7fffc5dabb50:2 STAT SLOW 8547
0x7fffc5dabb50:2 STAT OK 59158
0x7fffc5dabb50:2 STAT OVERLAPS 0 0 0 0 0 648 3666 3069 224 201 57 1076 101 57 231 47 170 4537 2147 4845 393 56 193 1327 334 57 1460 121 278 859 3108 26842 1541 75 16 7 7 3 10 1 5 3 7 2 0 0 0 7 98 833 68 3 4 1 5 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 2 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 6 1 0 2 22 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 2 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 4 27 2 3 0 1 0 0 0 1 0 2 0 0 0 0 0 0 2 0 0 0 0 7 16 0 0 0 39 4 0 0 1 0 1 0 2 0 0 0 0 0 11 153 0 1 1 0 28
For that reason I recounted the merged pairs and I obtain again 200.000 sequences. I also checked the merged paired end fastq with fastQC and it showed me 200.000 sequences, and the size of the sequences and their Q shows the behaivour of merged sequences.
So, my questions is. Why pandaseq shows that last report showing around 63000 READS and other softwares shows the merged sequences are properly consistent with what I expected?
I will be really grateful with any suggestion,
Oscar :)
Hi, sorry for my mistake. I forgot to mention that I converted the fastq to fasta and counted. Them with that command. In the other hand fasQC also red the fastq and counted around 200.000 sequences.
I tried to use your suggestion replacing my.fastq with my file names, but it doesn´t work.
edit: I could run the fastq count, there was just an ' missing at the end of the awk command