I have SAM/BAM files containing RNA-seq reads aligned to a genome. I need to find reads that are aligned to more than one chromosome. Is there a way (using samtools or other software) to achieve this?
EDIT: Even better would be a way to view graphically (as in IGV) the reads aligned to more than one chromosome, showing several chromosomes at a time so the alignments can be compared.
First thoughts on that: I think you would be best off writing a Python script to do it. Python has modules that make parsing BAM/SAM files easy, and I expect you can get the coordinates out easily too. Then you can work out which reads map to multiple chromosomes. It would make a fun programming task.
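As a starting point, here is a minimal sketch of what such a script could look like. It is an illustration under a few assumptions, not a finished tool: it reads plain SAM text (for example the output of samtools view on a BAM file) rather than using a BAM-parsing module, it treats the two mates of a pair as separate reads, and it counts both separate alignment records and chromosomes named in an SA tag (which some aligners write for split reads). The function name multi_chromosome_reads and the toy records are made up for the example.

```python
from collections import defaultdict


def multi_chromosome_reads(sam_lines):
    """Return {(read name, mate number): sorted chromosomes} for reads
    that have alignments on more than one chromosome."""
    chroms = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        # 0x4 = read unmapped; "*" = no reference name.
        if flag & 0x4 or rname == "*":
            continue
        # 0x40 = first in pair, 0x80 = second in pair, 0 = single-end.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        key = (name, mate)
        chroms[key].add(rname)
        # Optional SA tag: "SA:Z:rname,pos,strand,CIGAR,mapQ,NM;..."
        for tag in fields[11:]:
            if tag.startswith("SA:Z:"):
                for entry in tag[5:].split(";"):
                    if entry:
                        chroms[key].add(entry.split(",")[0])
    return {key: sorted(found) for key, found in chroms.items() if len(found) > 1}


# Toy records, invented for illustration only.
example = [
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r1\t2048\tchr2\t500\t60\t20S30M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t200\t60\t50M\t*\t0\t0\t*\t*",
]
for (name, mate), found in multi_chromosome_reads(example).items():
    print(name, mate, ",".join(found))
```

To run it on real data, the lines could come from a SAM file opened with open(), or from standard input when piping from samtools view.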
I need to find reads that are aligned to more than one chromosome
If by this you mean chimeric reads, then the easiest, though time-consuming, approach would be to re-run the mapping (perhaps with TopHat-Fusion).
Fusion related options:
If, instead, you mean pairs whose mates map to different scaffolds, then it's just a matter of setting the right bitwise flag on samtools view. That would be:
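A rough Python sketch of this kind of filter, offered as an illustration rather than the exact samtools invocation. It assumes plain SAM text as input, uses the flag bits to keep only primary records of pairs in which both mates are mapped, and then compares the reference name (column 3, RNAME) with the mate's reference name (column 7, RNEXT, where "=" means the same reference). The function name and the toy records are hypothetical.

```python
def mates_on_different_chromosomes(sam_lines):
    """Yield (read name, chromosome, mate chromosome) for paired reads
    whose mate is aligned to a different reference sequence."""
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        rname, rnext = fields[2], fields[6]
        paired = flag & 0x1                      # read is part of a pair
        unmapped = flag & 0x4 or flag & 0x8      # read or its mate unmapped
        not_primary = flag & 0x100 or flag & 0x800  # secondary/supplementary
        if not paired or unmapped or not_primary:
            continue
        # RNEXT is "=" for the same reference and "*" when unavailable.
        if rnext not in ("=", "*") and rnext != rname:
            yield name, rname, rnext


# Toy records, invented for illustration only.
example = [
    "p1\t65\tchr1\t100\t60\t50M\tchr2\t300\t0\t*\t*",
    "p1\t129\tchr2\t300\t60\t50M\tchr1\t100\t0\t*\t*",
    "p2\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
]
for name, chrom, mate_chrom in mates_on_different_chromosomes(example):
    print(name, chrom, mate_chrom)
```

Each split pair is reported twice, once per mate; collecting the names in a set would give one entry per pair.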
Thanks for the tip! I have little experience with Python, but I will certainly look into this.