How can I extract the sequencing reads containing a specific linker/tag?
4.5 years ago
naeem40thju ▴ 10

I have two FASTQ files (read.R1.fastq and read.R2.fastq) generated from paired-end sequencing. One file (read.R1.fastq) contains an 18-nucleotide linker/tag. How can I extract the reads containing the linker (allowing 3 or 4 mutations inside it) from read.R1.fastq, together with their corresponding reads from read.R2.fastq, and save the extracted reads into two separate files? Is it also possible to prepare a single file after extraction that contains the full-length read sequences and their information (ID, quality scores, etc.)?

Thanks in advance.

sequencing next-gen • 1.4k views

How can I extract the reads containing the linker (allowing 3/4 mutations inside it) from read.R1.fastq and its corresponding reads from read.R2.fastq and save the extracted reads into two separate files

Use cutadapt for this. You can either set a maximum error rate (-e) or, if the positions of variation are known, write a regex that matches them.
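To make the two options concrete, here is a minimal Python sketch. It is not cutadapt's own algorithm (cutadapt also tolerates insertions and deletions); it only counts substitutions. The linker sequence, the example read, and the function name `find_linker` are made up for illustration.

```python
import re

def find_linker(read, linker, max_mismatches):
    """Return the first offset where `linker` matches `read` with at most
    `max_mismatches` substitutions, or -1 if there is no such offset."""
    n = len(linker)
    for start in range(len(read) - n + 1):
        window = read[start:start + n]
        mismatches = sum(a != b for a, b in zip(window, linker))
        if mismatches <= max_mismatches:
            return start
    return -1

# Hypothetical 18-nt linker, and a read carrying it with 2 substitutions.
LINKER = "ACGTTGCAAGCTTGACCA"
read = "TTT" + "ACGATGCAAGCTTGTCCA" + "GGGG"

# Option 1: tolerate a fixed number of mismatches anywhere in the linker.
hit = find_linker(read, LINKER, max_mismatches=4)

# Option 2: a regex in which the known variable positions
# (here positions 4 and 15 of the linker, 1-based) accept any base.
pattern = re.compile("ACG[ACGT]TGCAAGCTTG[ACGT]CCA")
regex_hit = pattern.search(read)
```

The mismatch count is the more forgiving choice when the variable positions are unknown; the regex is stricter but cheaper when they are fixed.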

Is it possible to prepare a single file after extraction which will contain the full-length sequence of reads and their information (such as ID, quality score, etc.)?

If you want to merge the R1 and R2 reads while retaining quality scores etc., try PANDAseq. If you want to interleave them, try BBMap.
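For reference, interleaving simply means writing each R1 record followed by its R2 mate into one file. A minimal Python sketch of that idea follows; it is not how BBMap implements it, and it assumes plain-text, unwrapped FASTQ (exactly 4 lines per record) with both files in the same read order. The function names are hypothetical.

```python
from itertools import islice, zip_longest

def fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines (without trailing newlines)."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record

def interleave(r1_path, r2_path, out_path):
    """Write R1 and R2 records alternately; return the number of pairs written."""
    pairs = 0
    with open(r1_path) as r1, open(r2_path) as r2, open(out_path, "w") as out:
        for rec1, rec2 in zip_longest(fastq_records(r1), fastq_records(r2)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 contain different numbers of reads")
            out.write("\n".join(rec1) + "\n")
            out.write("\n".join(rec2) + "\n")
            pairs += 1
    return pairs
```

Every read keeps its ID, full sequence, and quality string, so the single output file carries the same information as the two inputs.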

4.5 years ago
Ido Tamir 5.2k

You can use cutadapt for this: search for the linker while allowing the appropriate number of mutations, set --action=none so the reads are left untrimmed, and save the reads with and without the linker into separate files.
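The logic of that approach (test R1 for the linker, leave both mates untouched, route the pair to a "with" or "without" file) can be sketched in plain Python. This is an illustration, not a replacement for cutadapt: it counts substitutions only, assumes 4-line FASTQ records with R1 and R2 in the same order, and the output file names and function names are hypothetical.

```python
from itertools import islice

def read_fastq(handle):
    """Yield FASTQ records as lists of 4 lines (without trailing newlines)."""
    while True:
        rec = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(rec) < 4:
            return
        yield rec

def has_linker(seq, linker, max_mismatches):
    """True if `linker` occurs anywhere in `seq` with at most `max_mismatches` substitutions."""
    n = len(linker)
    return any(
        sum(a != b for a, b in zip(seq[i:i + n], linker)) <= max_mismatches
        for i in range(len(seq) - n + 1)
    )

def split_pairs(r1_path, r2_path, linker, max_mismatches, prefix):
    """Route each read pair by whether R1 carries the linker; reads are written unmodified."""
    counts = {"with_linker": 0, "without_linker": 0}
    with open(r1_path) as r1, open(r2_path) as r2, \
         open(f"{prefix}.with_linker.R1.fastq", "w") as hit1, \
         open(f"{prefix}.with_linker.R2.fastq", "w") as hit2, \
         open(f"{prefix}.without_linker.R1.fastq", "w") as miss1, \
         open(f"{prefix}.without_linker.R2.fastq", "w") as miss2:
        for rec1, rec2 in zip(read_fastq(r1), read_fastq(r2)):
            if has_linker(rec1[1], linker, max_mismatches):
                key, out1, out2 = "with_linker", hit1, hit2
            else:
                key, out1, out2 = "without_linker", miss1, miss2
            out1.write("\n".join(rec1) + "\n")
            out2.write("\n".join(rec2) + "\n")
            counts[key] += 1
    return counts
```

Because the decision is made per pair, the R1 and R2 output files stay synchronized, which downstream paired-end tools generally require.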

4.5 years ago
GenoMax 147k

Actually, three programs from BBTools can do all of these operations. Use bbduk.sh with literal=real_linker_seq hdist=4 (up to 4 errors) to look for the linker tag; you can either keep the matching reads or filter them out. Use bbmerge.sh to merge the reads (if they are actually designed to merge), and reformat.sh to interleave the reads (if that is desired). And much more.


Thank you very much!


Please upvote useful answers and accept one as the best, rather than thanking in a comment.
