Question

Counting reads supporting a specific splice junction from bam file

0

Entering edit mode

9.5 years ago

Nicolas Rosewick 11k

Hi,

How can I count the number of reads supporting a specific splice junction (e.g. chr1:15000-20000 where 15000 is the splice donor and 20000 is the splice acceptore position) in a RNA-Seq bam file aligned with STAR ?

Thanks

splice junction bam • 5.1k views

ADD COMMENT • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by Nicolas Rosewick 11k

0

Entering edit mode

Doesn't STAR's .junc file provide that count?

ADD REPLY • link 9.5 years ago by Devon Ryan 104k

1

Entering edit mode

Thanks Devon I forget about the SJ.out.tab file. However if someone do not have this file and only have the bam file. Any idea?

ADD REPLY • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by Nicolas Rosewick 11k

0

Entering edit mode

Ah, I'd forgotten the exact name, thanks.

If you don't have that then you'd have to follow Irsan's suggestion. Of course, this would likely mean needing to code something since I don't personally know anything that would do it. In theory, that should be possible with a bit of pysam to get the splice junctions. The simplest method would then be to write a script to output only splice junctions and then to pipe that to sort and then uniq -c followed by awk to reformat. The performance might not be optimal, but for an ad hoc solution I figure that'd work.

ADD REPLY • link 9.5 years ago by Devon Ryan 104k

0

Entering edit mode

In the *.Log.final.out the number of introns/splices are recorded but I am not sure if this this is the number of spliced reads. I have checked the manual but couldn't find an answer. A read can be spliced multiple times of course.

ADD REPLY • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by Irsan ★ 7.8k

Ram · Answer 1 · 2015-04-24

2

Entering edit mode

9.5 years ago

Irsan ★ 7.8k

Star uses N's in CIGAR strings for reads that were spliced (over introns). You can use that to count total number of spliced reads.

If you want to do this for 1 specific regions I think its easiest to make a sashimi plot with IGV.

ADD COMMENT • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by Irsan ★ 7.8k

0

Entering edit mode

What is the possible way, to count the spliced reads (over introns). I want to count it for all the introns. Is there any method to extract that information from IGV? I can see in sashmi plot, it shows the spliced read count for each intron.

ADD REPLY • link 6.0 years ago by Gabby ▴ 20