Hi!
I need to get the intron coordinates (chr:start-end and the strand) of the spliced reads which are in a SAM file.
I have no experience with this kind of format, so I plan to convert the SAM files to BED12 files with this pipeline:
samtools view -bS -o file.bam file.sam && bamToBed -bed12 -i file.bam > file.bed12
And then use Galaxy or my own Python script (which I haven't written yet) to extract the introns from the spliced reads.
The problem with this method is the size of the files: it will take some time to convert to BED12 and then extract the introns.
Do you know a more direct way to extract the intron coordinates from the SAM files?
The intron coordinates aren't really there; if anything, they are implicit in (many of) the alignments. Are you sure you want to roll your own instead of running tophat or mgene or similar?
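To make "implicit" concrete: in a SAM record a spliced alignment shows up as an N operation in the CIGAR string (column 6), and the intron position follows from POS (column 4) plus the reference-consuming operations before that N. Below is a minimal Python sketch of that idea, not a tested pipeline. The function names are made up for illustration, and it assumes the transcript strand is available in the optional XS:A tag that spliced aligners such as TopHat write; if the tag is missing, the strand is reported as ".".

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_sam_line(line):
    """Return (chrom, start, end, strand) for each N gap in one SAM record.

    Coordinates are 1-based and inclusive, like POS in SAM.
    """
    if line.startswith("@"):          # header line
        return []
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:              # not a complete alignment record
        return []
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":    # unmapped read or no CIGAR
        return []

    # Assumption: the transcript strand is stored in the XS:A tag.
    strand = "."
    for tag in fields[11:]:
        if tag.startswith("XS:A:"):
            strand = tag[5:]

    introns = []
    ref = pos                         # current 1-based reference position
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":                 # skipped region = intron
            introns.append((chrom, ref, ref + length - 1, strand))
        if op in "MDN=X":             # operations that consume the reference
            ref += length
    return introns


def collect_introns(lines):
    """Return the sorted, de-duplicated introns found in an iterable of SAM lines."""
    seen = set()
    for line in lines:
        seen.update(introns_from_sam_line(line))
    return sorted(seen)
```

For example, a read at chr1 position 100 with CIGAR 10M50N10M yields the intron chr1:110-159. Passing an open SAM file (or the text output of samtools view) to collect_introns reads it line by line, so no intermediate BED12 file is needed.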
Yes, I know they are implicit in BED12... But I'm not worried about that, because I wrote a script to extract intron coordinates from the PSL alignment format, so I'll try to adapt it to BED12 input, or do as brentp proposes...
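For the BED12 route, the introns are the gaps between consecutive blocks, which can be computed from chromStart, blockSizes and blockStarts (columns 2, 11 and 12). The following is a minimal Python sketch under that reading of the BED12 format; the function name is hypothetical. Note that the strand column is copied as-is, and it is worth checking whether bamToBed fills it with the read's alignment orientation rather than the transcript strand.

```python
def introns_from_bed12_line(line):
    """Return (chrom, start, end, strand) for each gap between BED12 blocks.

    Coordinates are 0-based, half-open, like the BED format itself.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:              # not a BED12 record
        return []
    chrom, chrom_start, strand = fields[0], int(fields[1]), fields[5]
    # blockSizes and blockStarts are comma-separated, often with a trailing comma.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]

    introns = []
    for i in range(len(starts) - 1):
        intron_start = chrom_start + starts[i] + sizes[i]   # end of block i
        intron_end = chrom_start + starts[i + 1]            # start of block i+1
        if intron_end > intron_start:
            introns.append((chrom, intron_start, intron_end, strand))
    return introns
```

A record starting at 99 with blocks of size 10,10 at offsets 0,60 gives the intron (109, 159) in BED coordinates, which corresponds to chr:110-159 in 1-based notation.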