5.7 years ago
priya120195
Running the PRINSEQ tool to filter paired-end reads produces separate singletons, good and bad files. I got the correct results when I worked on files with headers like
@HWI-ST1025:8:1101:1826:1992
TATGCTGAAGAAGACTCCTGTCAACTCGCTGAATGTTTCATTTGTAGCACGTAACTTGTGCTATCTGATGAAGCAC
+
JGIJJJG?GGIJIGIJFGHIJJJJIIJIJIJHHIGGHBHHHGDGGICHHGHFFFDEDEEECDCDDCACDCD:A<AC
Now, running the same tool on reads with a different annotation format (shown below) gives only the singletons file and the bad file.
@SRX1797356.1.1 1 length=101
CCCGTTCGTCGTCGACGAGCATGGCACGGCGCGGTATCAGCTTCAACTCAAACTAACTTACTTCCAGAAAGGAGATCGCACCGTATGAAACCTGTCTCTTA
+SRX1797356.1.1 1 length=101
CCCFFFFFFHHFHJJJIIJJJGIIIIJJJGGIEF9??CDEEDCDDDDD<CCCDCD@CDDDDDDDD>::?CCD?BBCC?BB@DD<?BCCCACBDBDCCDDDC
@SRX1797356.2.1 2 length=100
GTGTACTACTCCGGCGACGCCATCACCATGATCGACGATAACCCCGACCTTGCCTGGGTGTTCCCGGAGGAGGGCAGTGTGCTGTCGGTGGACTGCATGG
+SRX1797356.2.1 2 length=100
???AD?DDDDD:DE)<EE@?DD@8BDD@<BCEIDII@D;5A;@@D@@???A>A>AA?79;<<A:>7&05;8>>9><93>:3>>>>:>?&2&2(48>A>A8
@SRX1797356.3.1 3 length=97
AATCAGGGCATACAGCGGGCGGCGGCTGTCACCGATGCTGCGCAGGATTGAGGAACTGAAATCATAAAGCATGATAAATGGCATGCCGAGAAAATAG
Is there any script to change the header annotation to match the reads in the former file, so that I get the good, singleton and bad files separately?
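A minimal sketch of such a script is given here in Python. It rests on assumptions that are not confirmed in this thread: that the headers follow the pattern `@<accession>.<spot>.<mate> <spot> length=<n>` seen in the snippet above, and that PRINSEQ can pair reads whose IDs end in `/1` and `/2`. The function names `rewrite_header` and `rewrite_fastq` are made up for illustration.

```python
import re

# Assumed SRA-style header: "@<accession>.<spot>.<mate> <spot> length=<n>"
SRA_HEADER = re.compile(r"^@(\S+)\.([12])(?:\s.*)?$")


def rewrite_header(line):
    """Turn '@SRX1797356.1.1 1 length=101' into '@SRX1797356.1/1'.

    Headers that do not match the assumed pattern are returned unchanged.
    """
    line = line.rstrip("\n")
    match = SRA_HEADER.match(line)
    if match is None:
        return line
    read_id, mate = match.groups()
    return f"@{read_id}/{mate}"


def rewrite_fastq(lines):
    """Rewrite the header and separator line of each 4-line FASTQ record.

    Lines are handled by position, not by their first character, because
    quality strings may themselves start with '@' or '+'.
    """
    out = []
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        position = index % 4
        if position == 0:
            out.append(rewrite_header(line))
        elif position == 2:
            out.append("+")  # drop the repeated ID after '+'
        else:
            out.append(line)
    return out
```

To use it on a file, read the lines of each of the two FASTQ files, pass them through `rewrite_fastq`, and write the result joined with newlines. This assumes plain four-line records with no wrapped sequences.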
Please give more details about 1) the fastq files (were they downloaded from SRA? with which command? do you have R1 and R2 reads in separate files, or are they interleaved in one file?) and 2) how you ran prinseq. Without this information, there is no way to provide help. I will just point out that the second fastq file you showed is probably an interleaved fastq file. Is the snippet you showed the input or the output of prinseq?
P. S.: you probably mean header notation, not annotation.
1) They were downloaded from SRA with the prefetch command of the SRA Toolkit. Yes, I have R1 and R2 reads in separate files.
2) Command
I have already processed plenty of fastq files. I know that the issue is only with the header line, but I can't work out how to annotate the header or what to write in it.
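One way to confirm that the header line really is the problem is to compare the read IDs of the two files record by record. The sketch below is only an illustration: `base_id` and `first_mismatch` are hypothetical helper names, and it assumes that mates are marked either by a trailing `/1` or `/2`, or by a final `.1` or `.2` after the spot number as in the SRA-style snippet above.

```python
import re
from itertools import zip_longest

# Assumed SRA-style ID with a mate suffix: "<accession>.<spot>.<mate>"
SRA_MATE_ID = re.compile(r"^(.+\.\d+)\.[12]$")


def base_id(header):
    """Return the read ID without '@', description, or mate suffix."""
    fields = header.strip().lstrip("@").split()
    if not fields:
        return ""
    name = fields[0]
    if name.endswith(("/1", "/2")):
        return name[:-2]
    match = SRA_MATE_ID.match(name)
    return match.group(1) if match else name


def first_mismatch(r1_headers, r2_headers):
    """Return the index of the first record whose IDs differ, else None.

    A missing mate (files of unequal length) also counts as a mismatch.
    """
    pairs = zip_longest(r1_headers, r2_headers)
    for index, (head1, head2) in enumerate(pairs):
        if head1 is None or head2 is None:
            return index
        if base_id(head1) != base_id(head2):
            return index
    return None
```

Here `r1_headers` and `r2_headers` are the header lines (every fourth line, starting with the first) of the R1 and R2 files. A result of `None` means the two files are in the same order and pair up under the assumed naming scheme.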
I don't have a good answer for your question. Did you use
fastq-dump -F
to retrieve the original fastq headers? Other than that, I would use some other software to process the reads, like Trimmomatic or BBDuk.
Yes, I have used fastq-dump.
But you did not use the
-F
option to recreate original Illumina fastq headers as suggested by @h.mon?
No, I didn't use it. As it was paired-end data, I directly used the fastq-dump --split-files command to get the 2 reads.
-F
option recovers Illumina read headers in the format that you are familiar with. Unfortunately, the submitters in this case appear not to have provided the necessary data. You are using prinseq to merge the reads?