Merging paired FASTQ read files with a small overlap region
3.9 years ago
dbready2 • 0

I have some data from a lab member who sequenced a CRISPR library plasmid pool that they made. The forward and reverse reads overlap by 6 bp, and I was wondering how I could merge these FASTQ files given that the overlap size is known. When I use BBMerge or other merging tools, few reads (less than 5%) are merged, presumably because the overlap region is so short.

sequencing alignment • 1.1k views

Yes, that is indeed a short overlap. With so few overlapping bases, a merging tool has little evidence to decide whether a candidate overlap is genuine. Wouldn't it be simpler to trim one of the reads back by a few bases?
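One way to read the trimming suggestion is to drop the overlapping bases from the end of the forward read and then append the reverse-complemented reverse read. The following is only a minimal sketch of that idea, not what BBMerge does. It assumes the overlap is exactly 6 bp in every pair (as stated in the question), that both reads cover their full expected length, and the helper names `reverse_complement` and `join_fixed_overlap` are made up for illustration.

```python
# Sketch only: assumes a fixed, known overlap (6 bp here) in every read pair.
# Function names are hypothetical, not part of any existing tool.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def join_fixed_overlap(r1_seq, r1_qual, r2_seq, r2_qual, overlap=6):
    """Trim `overlap` bases off the 3' end of R1, then append R2 (reverse-complemented).

    The quality string of R2 is reversed so it stays aligned with the
    reverse-complemented bases. No check is made that the overlap actually
    matches; that is the assumption being relied on.
    """
    keep = len(r1_seq) - overlap
    merged_seq = r1_seq[:keep] + reverse_complement(r2_seq)
    merged_qual = r1_qual[:keep] + r2_qual[::-1]
    return merged_seq, merged_qual


# Toy example: an 18 bp fragment read as 12 bp from each end (6 bp overlap).
fragment = "ACGTACGTAAGGCCTTGA"
r1 = fragment[:12]
r2 = reverse_complement(fragment[6:])
merged, qual = join_fixed_overlap(r1, "I" * 12, r2, "F" * 12, overlap=6)
print(merged == fragment)
```

Because nothing is verified, any pair in which the true overlap differs from 6 bp (for example after adapter or quality trimming) would be joined incorrectly, so it is worth checking the overlap first.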

Giving this a try now.

Let me try to understand: is the overlap always 6 bp?

Yes, the amplicon library they prepared is from a CRISPR library pool in which the only thing that varies is the 20 bp gRNA sequence. I inspected a couple of read pairs in the FASTQ files to confirm.
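Rather than inspecting a couple of pairs by eye, the check could be run over every pair. The sketch below is a rough illustration: it assumes plain four-line FASTQ records, reads in the same order in both files, and an exact-match comparison of the last 6 bases of R1 against the first 6 bases of reverse-complemented R2. The function names are hypothetical.

```python
import io

# Sketch only: assumes 4-line FASTQ records and identically ordered R1/R2 files.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header, seq, qual


def fraction_with_overlap(r1_handle, r2_handle, overlap=6):
    """Fraction of pairs whose R1 3' end exactly matches the start of revcomp(R2)."""
    total = 0
    matching = 0
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        total += 1
        if rec1[1].upper()[-overlap:] == reverse_complement(rec2[1])[:overlap]:
            matching += 1
    return matching / total if total else 0.0


# Toy in-memory data: the first pair overlaps by 6 bp, the second does not.
r1_text = "@p1\nACGTACGTAAGG\n+\nIIIIIIIIIIII\n@p2\nACGTACGTAAGG\n+\nIIIIIIIIIIII\n"
r2_text = "@p1\nTCAAGGCCTTAC\n+\nIIIIIIIIIIII\n@p2\nTCAAGGCCTTTT\n+\nIIIIIIIIIIII\n"
print(fraction_with_overlap(io.StringIO(r1_text), io.StringIO(r2_text)))
```

With real data the two handles would come from `open()` on the R1 and R2 files (or `gzip.open(..., "rt")` for compressed files). A fraction well below 1 would suggest that the overlap is not constant or that sequencing errors fall in the overlap.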

Could you post 5 sequences or so?

If these are amplicons and you are sure they should overlap, try the following option with bbmerge.sh. You will need enough sequence data for this to work. Set it to 10 and see if that works.

extend=0             Extend reads to the right this much before merging.
                     Requires sufficient (>5x) kmer coverage.
