Precise NGS Alignment
12.4 years ago
Leszek 4.2k

I need to align NGS data precisely. The problem is that with global alignment I lose a lot of reads (20% fail to align), while with local alignment only 5% of reads fail to align. I don't care much about the 3'-end, but the 5'-end has to be aligned with 1 bp precision. Iterative hard-clipping of the 3'-ends helps, but it would be ideal to do this in a single run. Do you know of any program that can align the 5'-end precisely and allow soft-clipping of the 3'-end at the same time?

next-gen alignment • 3.6k views

I see that BWA soft-clips the 3' end based on quality (-q option). MOSAIK v2.0 and Bowtie 2 also support soft clipping.


Bowtie 2 does either global or local alignment. In local mode both ends may be clipped, and I need the entire 5'-end. As for BWA, it does indeed soft-clip the 3'-end based on quality, but only down to 35 bp ("-q INT quality threshold for read trimming down to 35bp [0]"), and I have noticed that some reads only align after trimming to 25, 21 or even 16 bp. I have never used MOSAIK, but I will give it a try :)
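
One way to keep local-mode output while enforcing the 5'-end requirement is to filter alignments after the fact. Below is a minimal sketch, not a feature of any of the aligners above; the helper names `five_prime_clip` and `keeps_five_prime` are made up for illustration. It assumes a standard SAM CIGAR string and FLAG, and relies on the SAM convention that a reverse-strand read (FLAG bit 0x10) is stored reverse-complemented, so its 5'-end sits at the right-hand end of the CIGAR.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def five_prime_clip(cigar, is_reverse):
    """Return how many bases are clipped (S or H) at the read's 5'-end."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if is_reverse:
        # Reverse-strand reads are stored reverse-complemented in SAM,
        # so the original 5'-end is the last operation in the CIGAR.
        ops.reverse()
    clipped = 0
    for length, op in ops:
        if op not in "SH":
            break
        clipped += length
    return clipped


def keeps_five_prime(cigar, flag):
    """True if the alignment has no clipping at the read's 5'-end."""
    return five_prime_clip(cigar, bool(flag & 0x10)) == 0
```

Note that this only rejects clipping at the 5'-end. An unclipped alignment can still have a mismatch at the first base, so a stricter check would also need the MD tag or the reference sequence.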


BTW, did you try pre-trimming the reads to a specific length and then aligning them? I generally pre-trim the reads after looking at the FastQC report.


I did, but the point is that some reads align uniquely at 41 bp, others at 31 bp, and others at 21 bp. So I would need to do iterative 3'-trimming of the unaligned reads. Novoalign does it for me automatically :)
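
For readers who want to do the iterative 3'-trimming by hand, the loop described above can be sketched roughly as follows. This is only an illustration of the idea, not how Novoalign works internally. The function name `align_with_iterative_trimming` and the `align` callback are hypothetical: `align` stands in for whatever aligner call you use and is assumed to return `None` when a read does not align (or does not align uniquely). The default lengths are the ones mentioned in the comment.

```python
def align_with_iterative_trimming(seq, align, lengths=(41, 31, 21)):
    """Try the full read first, then progressively shorter 5' prefixes.

    `align` is a caller-supplied function (hypothetical here) that takes a
    sequence and returns a hit object, or None if the read does not align.
    Returns (hit, n_trimmed), where n_trimmed is the number of 3' bases
    removed, or (None, None) if no length produced an alignment.
    """
    # Full length first, then each requested length shorter than the read,
    # longest to shortest, without duplicates.
    candidates = sorted({len(seq), *(n for n in lengths if 0 < n < len(seq))},
                        reverse=True)
    for length in candidates:
        # Only the 3'-end is cut, so the 5'-end of the read is always kept.
        hit = align(seq[:length])
        if hit is not None:
            return hit, len(seq) - length
    return None, None
```

The returned `n_trimmed` can be written back as a trailing soft clip (for forward-strand hits) so the SAM record still carries the full read. Doing this one read at a time is slow; in practice you would batch the unaligned reads per round and rerun the aligner on each batch.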

12.4 years ago

Novoalign has a miRNA mode that does something similar to what you are asking, though I haven't used it in quite some time.


+1. Novoalign is slower than Bowtie 2 or BWA, but it aligns many more reads, and the alignments look much better to me.

12.4 years ago
JC 13k

I don't know an optimal solution for this. I once had a similar case where reads needed trimming to different lengths. I used BLAT to align to the reference and then some Perl scripts to convert the PSLX output to valid SAM, inserting soft-clipped ends and indel operations into the CIGAR where required. The mapping worked very well, but it was extremely slow. Do you want to take a look? See the files megablat.pl and pslx2sam.pl under bin/ at http://github.com/caballero/RNAseq-Pi/
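
To illustrate the kind of conversion described above, here is a minimal sketch of building a soft-clipped CIGAR from the block columns of a PSL record. It is not the code from pslx2sam.pl, and the function name `psl_blocks_to_cigar` is made up for this example. It assumes the inputs are the PSL fields qSize, blockSizes, qStarts and tStarts already parsed into integers, and that for minus-strand hits qStarts are given in reverse-complemented query coordinates, which matches the orientation SAM uses. Every target gap is written as a deletion (D); for spliced RNA-seq data long gaps would more properly be introns (N).

```python
def psl_blocks_to_cigar(q_size, block_sizes, q_starts, t_starts):
    """Build a SAM CIGAR string with soft-clipped ends from PSL block columns."""
    cigar = []

    # Unaligned query bases before the first block become a leading soft clip.
    if q_starts[0] > 0:
        cigar.append(f"{q_starts[0]}S")

    for i, size in enumerate(block_sizes):
        if i > 0:
            prev_q_end = q_starts[i - 1] + block_sizes[i - 1]
            prev_t_end = t_starts[i - 1] + block_sizes[i - 1]
            q_gap = q_starts[i] - prev_q_end  # extra query bases: insertion
            t_gap = t_starts[i] - prev_t_end  # skipped target bases: deletion
            if q_gap > 0:
                cigar.append(f"{q_gap}I")
            if t_gap > 0:
                cigar.append(f"{t_gap}D")
        cigar.append(f"{size}M")

    # Unaligned query bases after the last block become a trailing soft clip.
    tail = q_size - (q_starts[-1] + block_sizes[-1])
    if tail > 0:
        cigar.append(f"{tail}S")

    return "".join(cigar)
```

For example, a 50 bp read aligned in two blocks of 20 and 25 bp with a 3 bp gap in the target gives `psl_blocks_to_cigar(50, [20, 25], [0, 20], [100, 123])`, i.e. `20M3D25M5S`, leaving the last 5 bases soft-clipped.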


Thanks JC, I was thinking about BLAT. I think it could handle the current data (4 x 6 million reads), but we expect a lot more in the coming months, so I would prefer to find an efficient solution.


Yes, I agree. Now I only use BLAT in extreme cases when Bowtie/BWA cannot be used.


GMAP is faster and probably still as sensitive as BLAT.
