8.8 years ago
firestar
I have a FASTQ file. I know that the quality lines at, say, positions 1142354 and 1145663 have an incorrect length or a bad format. How do I remove those two lines along with their associated sequence/quality lines (i.e., if a line is removed, the other 3 lines of that record must be removed as well) and save the result to a new file?
I currently use the following commands to find the positions with incorrect lengths:
awk '{if(NR%4==2) print NR"\t"$0"\t"length($0)}' input.fastq > input-readLength
awk '{if(NR%4==0) print NR"\t"$0"\t"length($0)}' input.fastq > input-qualityLength
awk 'NR==FNR{a[$3]++;next}!a[$3]' input-readLength input-qualityLength
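One possible way to drop the whole record around each known bad line is sketched below. It assumes strict 4-line FASTQ records (no wrapped sequence or quality lines), so the record containing line N has the 0-based index int((N-1)/4). The function name `drop_fastq_records` and the output name `filtered.fastq` are hypothetical, chosen only for illustration.

```shell
# drop_fastq_records is a hypothetical helper name, not an existing tool.
# Arg 1: comma-separated line numbers known to be bad.
# Arg 2: FASTQ file, assumed to hold strict 4-line records.
drop_fastq_records() {
    awk -v bad="$1" '
        BEGIN {
            # Map each bad line number to the 0-based index of its record.
            n = split(bad, lines, ",")
            for (i = 1; i <= n; i++) drop[int((lines[i] - 1) / 4)] = 1
        }
        # Print a line only if its record is not marked for removal.
        !(int((NR - 1) / 4) in drop)
    ' "$2"
}
```

With the positions from the question, this could be called as `drop_fastq_records "1142354,1145663" input.fastq > filtered.fastq`. Any line of a record (header, sequence, separator, or quality) can be given; all four lines of that record are skipped.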