Question

Aligning multiple short reads with multiple long reference reads

0

7.3 years ago

MAPK ★ 2.1k

Hi, I was wondering if there is a way to align short reads to multiple long reads and see long stretches of aligned regions from the same genome. I have millions of short reads, and I want to align them to thousands of long sequences from the same genome and see the aligned regions. Thanks

alignment • 1.8k views

ADD COMMENT • link 7.3 years ago by MAPK ★ 2.1k

0

That is basically what every NGS aligner does. So is there a question here?

ADD REPLY • link 7.3 years ago by GenoMax 148k

0

Hehe, I just got confused. So basically I can use BWA?

ADD REPLY • link 7.3 years ago by MAPK ★ 2.1k

1

Yes, you can use BWA, but if the error rate of the long reads is high (are they long reads or contigs / scaffolds?), BWA will be very slow and many reads may remain unaligned.

Or maybe you want to align short reads AND long reads to the same reference genome?

Also, there are tools that use Illumina reads to error-correct long reads.

ADD REPLY • link 7.3 years ago by h.mon 35k

0

If you're talking about reads around the 30-50 bp mark, I'd use bowtie and report uniquely-mapped reads only (--best -m 1; note that these are bowtie 1 options and are not available in bowtie2). If your reads are >70 bp in length, use BWA-MEM.

ADD REPLY • link 7.3 years ago by Kevin Blighe 88k
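
To illustrate what "uniquely-mapped reads only" means in practice, here is a minimal Python sketch that post-filters SAM output instead of relying on aligner flags. It assumes single-end reads and that the aligner was asked to report every alignment it finds, so that a multi-mapping read shows up as several records. The function name `unique_mappers` and the toy records are made up for this example.

```python
from collections import Counter

def unique_mappers(sam_lines):
    """Return the mapped SAM records whose read name occurs exactly once."""
    records = []
    for line in sam_lines:
        if line.startswith("@"):           # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:           # FLAG bit 0x4: read is unmapped
            continue
        records.append(fields)
    hits = Counter(f[0] for f in records)  # number of alignments per read name
    return [f for f in records if hits[f[0]] == 1]

# Toy input (hypothetical): r1 maps once, r2 maps twice, r3 is unmapped.
sam = [
    "@SQ\tSN:LTR1\tLN:300",
    "r1\t0\tLTR1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tLTR1\t50\t255\t4M\t*\t0\t0\tTTGA\tIIII",
    "r2\t16\tLTR2\t7\t255\t4M\t*\t0\t0\tTCAA\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII",
]
print([f[0] for f in unique_mappers(sam)])  # → ['r1']
```

Filtering inside the aligner is usually faster; this only shows the logic of the filter.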

0

I did use bowtie, but the problem is that it pulls out lots of sequences that are not exact matches (it allows too many mismatches). My sequences are sRNA reads and I want to align them to retrotransposon (LTR) regions. I have about a thousand LTR sequences, each a few hundred bases long. My short reads should match the LTR regions pretty well if any reads come from those regions, but I am expecting very few matches from my experimental data. In any case, bowtie pulls out too many reads, even from not-so-perfectly aligned regions.

ADD REPLY • link 7.3 years ago by MAPK ★ 2.1k
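
One way to keep only exact matches, whatever aligner produced them, is to filter the SAM output afterwards. The sketch below is a minimal Python example under two assumptions: the aligner writes the optional `NM` (edit distance) tag, and "exact" means the whole read aligns without gaps, clipping, or mismatches. The function name `is_exact_match` and the example records are hypothetical.

```python
import re

def is_exact_match(sam_line):
    """True if the record is mapped, ungapped over its full length, and has NM:i:0."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar, seq = int(fields[1]), fields[5], fields[9]
    if flag & 0x4:                        # FLAG bit 0x4: read is unmapped
        return False
    m = re.fullmatch(r"(\d+)M", cigar)    # a single M operation: no indels, no clipping
    if m is None or int(m.group(1)) != len(seq):
        return False
    # CIGAR 'M' covers both matches and mismatches, so also require edit distance 0.
    return "NM:i:0" in fields[11:]

# Hypothetical 21-nt sRNA records against an LTR reference.
core = "\tLTR1\t10\t255\t{cigar}\t*\t0\t0\t" + "A" * 21 + "\t" + "I" * 21 + "\t{nm}"
perfect = "r1\t0" + core.format(cigar="21M", nm="NM:i:0")
one_mismatch = "r2\t0" + core.format(cigar="21M", nm="NM:i:1")
clipped = "r3\t0" + core.format(cigar="3S18M", nm="NM:i:0")

print(is_exact_match(perfect), is_exact_match(one_mismatch), is_exact_match(clipped))
# → True False False
```

If the aligner can be told to allow zero mismatches directly, that is simpler; the post-filter is mainly useful as a check on what was actually reported.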

0

If you have the reference genome, I would suggest aligning the reads to the genome first and then re-mapping to the LTRs only the reads that didn't map to the genome. That should reduce the noise you're dealing with.

ADD REPLY • link 7.3 years ago by Asaf 10k
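
The two-step approach suggested above needs the reads that failed to map in the first pass. A minimal Python sketch of that extraction step follows, assuming single-end reads and a SAM file from the genome alignment in which unmapped reads are kept with FLAG bit 0x4 set. The function name `unmapped_to_fastq` and the toy records are hypothetical.

```python
def unmapped_to_fastq(sam_lines):
    """Yield one FASTQ record (as a string) for every unmapped read in a SAM stream."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:          # FLAG bit 0x4: read is unmapped
            name, seq, qual = fields[0], fields[9], fields[10]
            yield f"@{name}\n{seq}\n+\n{qual}\n"

# Toy first-pass output (hypothetical): r1 mapped to the genome, r3 did not.
genome_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII",
]
fastq = "".join(unmapped_to_fastq(genome_sam))
print(fastq, end="")
```

The resulting FASTQ text can be written to a file and used as input for the second alignment against the LTR sequences.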