PS: This message has also been cross-posted on SEQanswers. I just want to reach out to more bioinfo folks, so I thought of posting it here too.
Problem:
So we have a dataset from a library with a variable biological insert size, as we are sequencing the 5' and 3' ends of transcripts. As a result, the distance between the mates (<--- --->) depends on the length of the transcript. To map the reads, I am first using Mosaik, which I believe does a better job with variable-insert mate-pair data.
After mapping we still see 40% orphaned reads, where one read maps and the other doesn't. Is there a way to do a local re-alignment for these orphaned reads and attempt to map the mate within a given radius of the mapped mate?
Anything already out there?
Thanks! -Abhi
What is your alignment rate if you turn off pairing?
@Jeremy : The alignment rate for read 1 and read 2 independently is > 80%. It is the pairing that is causing problems.
@All : Is there any way I can be notified by email when an update is posted to a question I am interested in? I currently get an email, but only a day later, which doesn't help.
Your realignment should be your unpaired alignment. I would suggest loading the subset of read names whose mates are unmapped (samtools view -bf 0x0004 reads.bam) and then using those to examine where the mates align naturally.
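
To make that suggestion concrete, here is a minimal sketch in plain Python, not an existing tool. It assumes the Mosaik output has been converted to SAM text with the standard FLAG bits set (0x1 paired, 0x4 read unmapped, 0x8 mate unmapped, 0x40/0x80 first/second in pair), and that read 1 and read 2 have also been aligned separately without pairing. The function names (parse_sam, find_orphans, load_hits, rescue_mates) and the radius parameter are hypothetical, made up for illustration. Note that -f 0x4 selects the unmapped reads themselves; the sketch instead picks the mapped read of each orphaned pair (0x8 set, 0x4 unset) so that its position can be compared with where the mate lands in the unpaired alignment.

```python
# SAM FLAG bits (from the SAM specification)
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
READ1 = 0x40


def parse_sam(lines):
    """Yield (name, flag, chrom, pos) for each alignment line, skipping headers."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[1]), fields[2], int(fields[3])


def find_orphans(paired_sam):
    """Map read name -> (chrom, pos, missing_mate) for mapped reads whose mate is unmapped.

    missing_mate is 1 or 2: the number of the mate that failed to map.
    """
    orphans = {}
    for name, flag, chrom, pos in parse_sam(paired_sam):
        is_orphan = (flag & PAIRED) and not (flag & UNMAPPED) and (flag & MATE_UNMAPPED)
        if is_orphan:
            missing = 2 if flag & READ1 else 1
            orphans[name] = (chrom, pos, missing)
    return orphans


def load_hits(unpaired_sam):
    """Map read name -> (chrom, pos) for reads that mapped in an unpaired run."""
    hits = {}
    for name, flag, chrom, pos in parse_sam(unpaired_sam):
        if not flag & UNMAPPED:
            hits[name] = (chrom, pos)
    return hits


def rescue_mates(orphans, read1_hits, read2_hits, radius):
    """Return orphaned mates whose unpaired hit is on the same reference
    sequence and within `radius` bases of the mapped mate."""
    rescued = {}
    for name, (chrom, pos, missing) in orphans.items():
        hit = (read1_hits if missing == 1 else read2_hits).get(name)
        if hit is None:
            continue  # the mate did not map on its own either
        mate_chrom, mate_pos = hit
        if mate_chrom == chrom and abs(mate_pos - pos) <= radius:
            rescued[name] = (mate_chrom, mate_pos)
    return rescued
```

Since the insert size here depends on transcript length, the radius would have to be chosen generously (for example, around the longest transcript expected). Mates whose unpaired hit falls outside the radius or on another reference sequence are left as orphans and could be inspected by hand.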