Assembly Aligned Paired-End Reads

11.4 years ago

mathieu.bahin ▴ 90

Hi all,

I have a set of mapped paired-end reads, and I would like to assemble the ones that overlap while respecting the pairing information.

This means assembling two pairs only when their first mates overlap and their second mates overlap too. The reads are already mapped to a genome, so there is nothing more to do with the sequences, only with the positions.

The goal is to get the extended (merged) positions together with the number of pairs that went into each one.

Pairs example:

chr5:1456-1498,+     chr5:1654-1702,+
chr5:958-1012,+      chr5:1318-1388,+
chr5:1423-1478,+     chr5:1612-1667,+

I would like to get:

2     chr5:1423-1498,+     chr5:1612-1702,+
1     chr5:958-1012,+      chr5:1318-1388,+

I can't find any software that works on the positions; all I can find is FLASH, PEAR, etc., which work on the FASTQ files.

Cheers

paired-end reads rna-seq assembly • 2.9k views

Why not just use cufflinks and then get counts using featureCounts from the resulting GFF file?

ADD REPLY • link 11.4 years ago by Devon Ryan 105k

Thank you for your answer. I am not sure I totally understand it. I think cufflinks would assemble reads independently of the pairing information, which I don't want: I want to compare each pair against every other pair. Is there an option in cufflinks to assemble only when both mates overlap?

ADD REPLY • link 11.4 years ago by mathieu.bahin ▴ 90

Hmm, true, I guess that wouldn't work for you. You might have to code something with GenomicRanges.

ADD REPLY • link 11.4 years ago by Devon Ryan 105k

OK, thanks, maybe I'll try that. I also have another lead: 'pairToPair' from bedtools.

ADD REPLY • link 11.4 years ago by mathieu.bahin ▴ 90
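
For anyone landing here later, the position-only merge described in the question can be sketched in plain Python. This is a minimal sketch, not a tested tool, and it rests on assumptions the thread does not state: coordinates are treated as closed intervals, two mates overlap only if they share chromosome and strand, and merging is transitive (if pair A merges with B and B with C, all three end up in one cluster even if A and C do not overlap directly). The names `parse_pair`, `overlaps`, `merge_pairs` and `fmt` are made up for illustration. The all-against-all comparison is quadratic, so a real data set would need sorting or an interval index first.

```python
import re
from collections import defaultdict

# One mate looks like "chr5:1456-1498,+" (format taken from the question).
INTERVAL = re.compile(r"([^\s:]+):(\d+)-(\d+),([+-])")

EXAMPLE = [
    "chr5:1456-1498,+     chr5:1654-1702,+",
    "chr5:958-1012,+      chr5:1318-1388,+",
    "chr5:1423-1478,+     chr5:1612-1667,+",
]


def parse_pair(line):
    """Return ((chrom, start, end, strand), (chrom, start, end, strand))."""
    mates = [(c, int(s), int(e), strand)
             for c, s, e, strand in INTERVAL.findall(line)]
    if len(mates) != 2:
        raise ValueError(f"expected two mates, got {len(mates)}: {line!r}")
    return mates[0], mates[1]


def overlaps(a, b):
    """Assumption: closed intervals, same chromosome and same strand."""
    return a[0] == b[0] and a[3] == b[3] and a[1] <= b[2] and b[1] <= a[2]


def merge_pairs(pairs):
    """Cluster pairs whose first mates overlap AND whose second mates overlap.

    Returns (count, merged_mate1, merged_mate2) tuples, largest count first.
    """
    parent = list(range(len(pairs)))  # union-find forest over pair indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Quadratic all-against-all comparison; fine for a sketch only.
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if (overlaps(pairs[i][0], pairs[j][0])
                    and overlaps(pairs[i][1], pairs[j][1])):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, pair in enumerate(pairs):
        clusters[find(i)].append(pair)

    merged = []
    for members in clusters.values():
        spans = []
        for k in (0, 1):  # 0 = first mate, 1 = second mate
            chrom, _, _, strand = members[0][k]
            start = min(m[k][1] for m in members)
            end = max(m[k][2] for m in members)
            spans.append((chrom, start, end, strand))
        merged.append((len(members), spans[0], spans[1]))
    return sorted(merged, key=lambda r: (-r[0], r[1], r[2]))


def fmt(iv):
    """Format an interval back into the notation used in the question."""
    return f"{iv[0]}:{iv[1]}-{iv[2]},{iv[3]}"


if __name__ == "__main__":
    for count, m1, m2 in merge_pairs([parse_pair(l) for l in EXAMPLE]):
        print(count, fmt(m1), fmt(m2), sep="\t")
```

On the three pairs from the question this yields the two rows asked for: a count of 2 for chr5:1423-1498 / chr5:1612-1702 and a count of 1 for chr5:958-1012 / chr5:1318-1388. If transitive merging is not what you want, the clustering step is the part to change.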
