5.6 years ago
hsu
Hi, I want to extract the reads that contain my sequence of interest from a FASTQ file. The sequence can be at the start of a read or in the middle of it. How can I extract these reads?
Thank you!
If you are not only looking for exact matches, you can try seqkit grep:
seqkit grep --by-seq --max-mismatch 1 --pattern "ATCGAAG" test.fq
Use bbduk.sh from the BBMap suite in filter mode (see the user guide). Add literal=sequence_you_are_looking.
What are these sequences, and why not extract them with the grep function?
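For the exact-match case, a grep-style approach can be sketched with awk. This is only a minimal sketch: it assumes an uncompressed FASTQ with exactly four lines per record, and the function name `extract_reads` and the files `demo.fq` and `matched.fq` are made up for illustration. Matching only the sequence line matters because a plain grep over the whole file can in principle also hit quality lines, which may contain the letters A, C, G and T.

```shell
# Hypothetical helper: print every 4-line FASTQ record whose sequence line
# contains the pattern anywhere (start, middle, or end). Exact matches only.
extract_reads() {
    pattern="$1"   # sequence of interest, e.g. ATCGAAG
    fastq="$2"     # uncompressed FASTQ, 4 lines per record (assumption)
    awk -v pat="$pattern" '
        NR % 4 == 1 { header = $0 }   # @read name
        NR % 4 == 2 { seq = $0 }      # bases
        NR % 4 == 3 { plus = $0 }     # separator line
        NR % 4 == 0 {                 # quality line completes the record
            if (index(seq, pat) > 0)  # literal substring test, not a regex
                print header "\n" seq "\n" plus "\n" $0
        }
    ' "$fastq"
}

# Small demo file with made-up reads: r1 has the pattern at the start,
# r3 has it at the end, r2 does not contain it.
printf '@r1\nATCGAAGTTT\n+\nIIIIIIIIII\n@r2\nGGGGCCCCAA\n+\nIIIIIIIIII\n@r3\nTTTATCGAAG\n+\nIIIIIIIIII\n' > demo.fq

extract_reads ATCGAAG demo.fq > matched.fq
```

Note that this does not search the reverse complement and does not allow mismatches; for either of those, a dedicated tool such as the ones suggested in this thread is the better choice.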
Are you looking for an exact match of your specific sequence? What have you tried?