I need to remove all the reads with a 3' terminal A from a fastq file and get a new fastq without them. How could I achieve this?
See if this does the trick:
cat your.fastq | paste - - - - | awk -F '\t' '{if ($2 !~/A$/){ print $0}}'| tr "\t" "\n" > filtered.fastq
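If a script is easier to adapt than a one-liner, the same filter can be sketched in Python. This is only a minimal sketch under the same assumption as the command above: every record is exactly four lines (no wrapped sequences). The names `filter_reads` and `filter_fastq` are made up for illustration, and treating a lowercase `a` as an A is my own assumption, not something the question specifies.

```python
from itertools import islice


def filter_reads(lines, base="A"):
    """Yield 4-line FASTQ records whose sequence does NOT end with `base`.

    Assumes strict 4-line records (header, sequence, '+', quality).
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            break  # clean end of input
        if len(record) < 4:
            raise ValueError("truncated FASTQ record at end of input")
        # Strip the newline, then compare case-insensitively (an assumption).
        seq = record[1].rstrip("\r\n").upper()
        if not seq.endswith(base):
            yield record


def filter_fastq(in_path, out_path, base="A"):
    """Write the reads kept by filter_reads to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        for record in filter_reads(fin, base):
            fout.writelines(record)
            kept += 1
    return kept
```

Calling `filter_fastq("your.fastq", "filtered.fastq")` would then write the reads that do not end in A and return the number of reads kept.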
Just to be clear, you are only talking about discarding reads where the last base is A. Can you modify the answers from your question yesterday to start thinking about how you can do this: Command to count reads in fastq file with last bases. Curious as to why you want to do this.
Why do you want that? I am really curious!
By that, do you mean you don't want reads like these:
AAAGTACGATCACTACTACATC
AAGTACGATTAACTACTACATC
AGTGTACGGGGATCACTACTAC
But these will be okay?
I want reads like: AATTTATATGGGAGCCAC, but not: GATTAGGGCCGCGGGATA
I need to analyze small RNA structures, so I need to do this with reads that do not end with A.