Hi, I'm using fastx to reverse-complement a FASTQ file that I downloaded from SRA and split with sra-toolkit. My problem is that after running fastx_reverse_complement, the over-represented sequences identified by FastQC change. I would expect the number of over-represented reads to be the same in both files, and the sequences in the second file to be the reverse complements of those in the first. The command I used was:
fastx_reverse_complement -Q 33 -i FILE2.fastq -o FILE2_rev_com.fastq
The first two over-represented sequences that FastQC detects in the original file (sequence, count, percentage):
AGGCTAGTTTGTTAGTGGCGTGTCCGTCCGCAGCTGGCAAGCGAATGTAA 143240 1.7818072306991304
GGCTAGTTTGTTAGTGGCGTGTCCGTCCGCAGCTGGCAAGCGAATGTAAA 76434 0.9507864693609142
The first two over-represented sequences that FastQC detects after running fastx_reverse_complement on the original file:
CTCGGTACTACATGCTTAGTCAGTCTTTACATTCGCTTGCCAGCTGCGGA 136155 1.6936746962848372
CCTCGGTACTACATGCTTAGTCAGTCTTTACATTCGCTTGCCAGCTGCGG 70491 0.8768596306842531
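To make my expectation concrete, here is a minimal Python sketch (not part of fastx or FastQC) that reverse-complements the first sequence from the original file and checks whether it shares a stretch with the first sequence reported after the conversion. The helper names `reverse_complement` and `longest_suffix_prefix_overlap` are my own, and the sketch assumes the sequences contain only A, C, G, T and N:

```python
# Sanity check: compare the reverse complement of an over-represented
# sequence from the original file with one reported after conversion.
# Helper names are hypothetical; input is assumed to be plain ACGTN.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# First over-represented sequence in the original file.
original = "AGGCTAGTTTGTTAGTGGCGTGTCCGTCCGCAGCTGGCAAGCGAATGTAA"
# First over-represented sequence after fastx_reverse_complement.
after_rc = "CTCGGTACTACATGCTTAGTCAGTCTTTACATTCGCTTGCCAGCTGCGGA"

expected = reverse_complement(original)
print("reverse complement of original:", expected)
print("identical to reported sequence:", expected == after_rc)
print("shared bases (end of reported / start of expected):",
      longest_suffix_prefix_overlap(after_rc, expected))
```

If the two sequences are not identical but the last line reports a non-zero overlap, they would seem to come from the same reads, just not from the same 50-base window.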
Any idea what is happening here or what I'm doing wrong?