Extracting only the sequence from fastq files
8.6 years ago

How can I extract only the nucleotide sequences from a FASTQ file containing thousands of reads, using Linux?

Fastq Linux • 12k views

In addition to the answers below, this Linux tutorial should help you with future tasks.

Since a couple of the answers below ask why anyone would want to do this, I will offer one explanation, though it may not be the one that applies in this case.
I remember doing this for someone who wanted the counts and sequences of all unique combinations present in the dataset, nothing more.
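
A minimal sketch of that use case, assuming a well-formed FASTQ file with exactly four lines per record. The file names `example.fq` and `seq_counts.txt` and the reads themselves are made up for illustration:

```shell
# Create a tiny hypothetical FASTQ file (three reads, two distinct sequences).
printf '%s\n' \
  '@read1' 'ACGT' '+' 'IIII' \
  '@read2' 'TTGA' '+' 'IIII' \
  '@read3' 'ACGT' '+' 'IIII' > example.fq

# Keep line 2 of every 4-line record, then count each distinct sequence
# and list the most frequent first.
awk 'NR % 4 == 2' example.fq | sort | uniq -c | sort -k1,1nr > seq_counts.txt
cat seq_counts.txt
```

`uniq -c` only collapses adjacent duplicates, which is why the sequences are sorted before counting.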

The thing is, it looks like someone offloading an assignment.

8.6 years ago
5heikki 11k

For example:

paste - - - - <file.fq | cut -f2 > only_seq

I don't know why anyone would want to do that, but it's what you asked for.
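
To see why the one-liner works, it may help to split it in two: `paste - - - -` reads standard input four lines at a time and joins them with tabs, so each record becomes one row, and `cut -f2` then picks the second column. A small sketch with a made-up file `demo.fq`, assuming every record is exactly four lines and no header contains a tab:

```shell
# Two hypothetical reads, four lines each.
printf '%s\n' '@r1' 'ACGT' '+' 'IIII' '@r2' 'GGCC' '+' 'FFFF' > demo.fq

# Step 1: fold each 4-line record into one tab-separated row
# (columns: header, sequence, separator, quality).
paste - - - - < demo.fq > demo.tsv

# Step 2: keep only the second column, the sequence.
cut -f2 demo.tsv > only_seq
cat only_seq
```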

Agreed. To tack onto this: think about the problem, and remember that "just because you can, doesn't mean you should..."

8.6 years ago

awk '{if(NR%4==2) print $0}' file.fq > seq.txt

No clue why you'd want to do that, but there you go.

or just awk '(NR%4==2)' file.fq > seq.txt :-P
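
The short form works because an awk pattern with no action block defaults to printing the matching line. A quick sketch comparing the two forms on a made-up file `sample.fq`, plus a rough sanity check of the four-lines-per-record assumption that both rely on:

```shell
# Two hypothetical reads, four lines each.
printf '%s\n' '@r1' 'ACGT' '+' 'IIII' '@r2' 'GGCC' '+' 'FFFF' > sample.fq

# Explicit form: an action block containing an if test.
awk '{ if (NR % 4 == 2) print $0 }' sample.fq > seq_long.txt
# Short form: a bare pattern; the default action prints the line.
awk 'NR % 4 == 2' sample.fq > seq_short.txt

# cmp exits with status 0 when the two files are byte-identical.
cmp seq_long.txt seq_short.txt && echo "identical"

# A line count that is not a multiple of 4 suggests truncated or
# multi-line records, in which case NR % 4 would pick the wrong lines.
lines=$(wc -l < sample.fq)
[ $((lines % 4)) -eq 0 ] && echo "line count is a multiple of 4"
```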
