How to extract reads that contain a specific sequence from paired-end FASTQ files
2.7 years ago
17318598206 ▴ 20

Hi, I want to extract reads that contain a sequence of interest from paired-end FASTQ files. At least 10 bases should match, with a maximum of 2 mismatches allowed. How can I extract these reads?

The sequences of interest are:

GTTTAATTGAGTTGTCATATGTTAATAACGGTAT
CAAATTAACTCAACAGTATACAATTATTGCCATA

Thank you!

reads extract fastq • 875 views

For paired-end data, try cutadapt. Your sequences of interest are long (34 nt each). If only 10 bases need to match, the remaining 24 bases could all be mismatches, which is far more than the 2 mismatches you allow, so the requirements are confusing. Are the two sequences mate-specific (one for R1 and the other for R2), or should both be searched in both forward and reverse reads?
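
To make the ambiguity concrete, here is a small sketch of the arithmetic behind two possible readings of the requirement. The variable names are illustrative, and the two readings are my interpretation, not something stated in the question.

```python
# Sequence copied from the question; variable names are illustrative.
target = "GTTTAATTGAGTTGTCATATGTTAATAACGGTAT"
min_matched = 10      # "at least 10 bases can be matched"
max_mismatches = 2    # "maximum of 2 mismatches allowed"

# Reading A: the full sequence is aligned and only 10 bases must match.
# In the worst case every other base is a mismatch.
worst_case_mismatches = len(target) - min_matched

# Reading B: any 10-base window of the sequence may occur in the read
# with at most 2 mismatches. This is the number of such windows per strand.
n_windows = len(target) - min_matched + 1

print(len(target), worst_case_mismatches, n_windows)  # 34 24 25
```

Under reading A the two thresholds contradict each other (24 is greater than 2), so reading B is probably the only one that can be implemented as stated.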

2.7 years ago

You are looking for seqkit grep.
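
If you would rather script it, below is a minimal pure-Python sketch, independent of seqkit. It assumes the requirement means "some 10-base window of the target, on either strand, occurs in the read with at most 2 substitutions" and that a pair is kept when either mate has a hit. All function names here (`read_fastq`, `windows`, `has_hit`, `filter_pairs`) are made up for this example, and indels are not handled.

```python
from itertools import islice

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def read_fastq(handle):
    """Yield FASTQ records as 4-item lists: header, sequence, '+', quality."""
    while True:
        rec = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(rec) < 4:
            return
        yield rec

def windows(target, k=10):
    """All k-base windows of the target and of its reverse complement."""
    out = set()
    for s in (target, revcomp(target)):
        for i in range(len(s) - k + 1):
            out.add(s[i:i + k])
    return out

def has_hit(read, kmers, k=10, max_mm=2):
    """True if any k-base window of the read is within max_mm substitutions of a target window."""
    read = read.upper()
    for i in range(len(read) - k + 1):
        sub = read[i:i + k]
        for km in kmers:
            mm = 0
            for a, b in zip(sub, km):
                if a != b:
                    mm += 1
                    if mm > max_mm:
                        break
            else:  # inner loop finished without exceeding max_mm
                return True
    return False

def filter_pairs(r1, r2, out1, out2, targets, k=10, max_mm=2):
    """Write pairs where either mate has a hit; r1/r2/out1/out2 are open text handles."""
    kmers = set()
    for t in targets:
        kmers |= windows(t.upper(), k)
    kept = 0
    for rec1, rec2 in zip(read_fastq(r1), read_fastq(r2)):
        if has_hit(rec1[1], kmers, k, max_mm) or has_hit(rec2[1], kmers, k, max_mm):
            out1.write("\n".join(rec1) + "\n")
            out2.write("\n".join(rec2) + "\n")
            kept += 1
    return kept

# Example call (file names are placeholders):
# with open("R1.fastq") as r1, open("R2.fastq") as r2, \
#      open("hits_R1.fastq", "w") as o1, open("hits_R2.fastq", "w") as o2:
#     n = filter_pairs(r1, r2, o1, o2, ["GTTTAATTGAGTTGTCATATGTTAATAACGGTAT"])
```

This is a brute-force scan, so it will be slow on large files, and it relies on R1 and R2 listing mates in the same order. Also note that a 10-base window with 2 mismatches is a very loose criterion, so expect chance hits; a longer window may be worth trying.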
