8.6 years ago
padma18krishna
How can I extract only the nucleotide sequences from a FASTQ file containing thousands of reads using Linux?
For example:
paste - - - - <file.fq | cut -f2 > only_seq
Don't know why anyone would want to do that, but it's what you asked.
awk '{if(NR%4==2) print $0}' file.fq > seq.txt
No clue why you'd want to do that, but there you go.
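If you want to check both one-liners on something small first, here is a rough sketch. It assumes a plain four-line-per-record FASTQ (no wrapped sequence lines); the file names demo.fq, demo_seq.txt and demo_seq_paste.txt and the two reads are made up for illustration.

```shell
# Create a tiny two-read FASTQ file (hypothetical example data).
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' > demo.fq

# Assumption: every record spans exactly four lines, so the
# sequence is always the second line of each group of four.
awk 'NR % 4 == 2' demo.fq > demo_seq.txt

# Same idea with paste: join each group of four lines into one
# tab-separated row, then keep the second column.
paste - - - - < demo.fq | cut -f2 > demo_seq_paste.txt

# Show the extracted sequences and confirm both methods agree.
cat demo_seq.txt
cmp demo_seq.txt demo_seq_paste.txt && echo "outputs match"
```

If the FASTQ has sequences wrapped over several lines, neither command will work as is, and a dedicated FASTQ parser is the safer route.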
Adding to the answer below, this Linux tutorial would help you with future tasks.
Since a couple of answers below have asked "why anyone would want to do this" I will offer one explanation. It may not be the one applicable in this case.
I remember doing this for someone who wanted to get counts and sequences of all unique combinations present in the dataset. Just counts.
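For that kind of use case, something along these lines would probably do. This is only a sketch, again assuming four-line FASTQ records; demo_counts.fq and seq_counts.txt are hypothetical names and the reads are invented.

```shell
# Tiny example file with three reads, two of them identical (made-up data).
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nACGT\n+\nIIII\n' > demo_counts.fq

# Pull out the sequence lines, group identical sequences together,
# count each group, then list the most frequent sequences first.
awk 'NR % 4 == 2' demo_counts.fq | sort | uniq -c | sort -k1,1nr > seq_counts.txt

# Each output line is "<count> <sequence>" (uniq -c pads the count with spaces).
cat seq_counts.txt
```

The first sort is needed because uniq only collapses adjacent duplicate lines.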
The thing is, it looks like someone offloading an assignment.