Hi Fellows,
I used the command below to remove duplicate reads:
fastx_collapser -Q33 -v -i CombineIonXpresRNA.fastq -o Collapsed.fastq
But the output file was FASTA instead of FASTQ. The output file name says fastq, but its content is actually in FASTA format.
I would really appreciate your help!
Thanks,
Naresh
Hi Deedee,
FASTQ format has a quality line and a header starting with the @ sign, as shown below. I want both of these, together with the sequence, in the output file. The output file must still represent all of the reads, as in this summary:
Input: 25122053 sequences (representing 25122053 reads)
Output: 17946833 sequences (representing 25122053 reads)
Fastq file format:
@GWG70:09324:09027
GAGGGAACCTGATCTCCGGTCGTCAGCTCCGTCCGATTCTGCTTCTGACTAGGAAGCCAATGGCTCTGGTTAAGAAGCTCAGGAA
+
59;>4;4;7>8877885;6:;888;;:::57552577478797265776333/4+..45+..35442634.2/25466:=885;;
@GWG70:09324:09031
ACTTCCTTCCACACCTTACCTAATCTAATTCCGAATCTGGGATTTGGATCTCAGAAAGATGAAGGTGGTTGATAAAATTCAAATCTGTGACAGGATCGAAGCCAA
+
9661715493886604/341//+.656387<6<<59878829;:2;6;:::::::;3;<:::5<5::2:3691../)/-022010---36854145554*-/145
@GWG70:09324:09043
TCAGGTGATCATAGTCTAGTCCATCTGTGTTGTGTTTATGCTTGTTCTCCCGTTTCTTTAAATTCTATGTTCTTTAAATTTCTATGTGAAACTAGTGTTTCTATTTCCTTATTCACACTAC
Fasta format:
It would be awesome if you could update the tool for me. Thanks a lot in advance.
Naresh
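
A possible workaround while the tool writes FASTA: the request above (collapse identical sequences but keep the @ header and quality line) can be sketched in a few lines of Python. This is only an illustration, not part of fastx_collapser. The function names `read_fastq` and `collapse_fastq`, the `count=N` suffix added to the header, and the choice to keep the header and quality string of the first read seen for each sequence are all assumptions made for the example. It also assumes strict four-line FASTQ records and holds every unique sequence in memory, which may be heavy for about 18 million unique reads.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a strict 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the reader
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def collapse_fastq(in_handle, out_handle):
    """Collapse identical sequences and write FASTQ; return the number of unique sequences.

    Assumption: the header and quality of the FIRST read seen for each
    sequence are kept, and the duplicate count is appended to the header
    as a hypothetical 'count=N' tag.
    """
    seen = {}  # sequence -> [header, quality, count]; dicts keep insertion order
    for header, seq, qual in read_fastq(in_handle):
        if seq in seen:
            seen[seq][2] += 1
        else:
            seen[seq] = [header, qual, 1]
    for seq, (header, qual, count) in seen.items():
        out_handle.write(f"{header} count={count}\n{seq}\n+\n{qual}\n")
    return len(seen)


# Tiny in-memory demonstration with made-up reads (not real data).
demo_in = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n@r3\nTTTT\n+\n####\n")
demo_out = io.StringIO()
unique = collapse_fastq(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real files, the same function could be called with `open("CombineIonXpresRNA.fastq")` as input and an output file opened for writing. Summing the `count=` values in the output should give back the original number of reads.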