How To Get Junctions File Using Gsnap
1
2
Entering edit mode
12.5 years ago
Varun Gupta ★ 1.3k

Hi Everyone

So I am working on rna seq data. I am using gsnap for my analysis(though I have also used tophat). With tophat , a junctions.bed file is produced which actually tells us about the splicing patterns going on.

Using gsnap I am able to produce sam/bam files. Is there a way I can convert these sam/bam files into something similar to junctions.bed files.

My main goal is to see what is the difference between junctions produced from tophat and gsnap. Tophat produces junctions.bed file but gsnap doesnt. So is there a way I can get a junctions file from sam/bam output of gsnap.

Hope to hear from you guys soon

Regards
Varun

gsnap RNA-seq • 3.8k views
ADD COMMENT
0
Entering edit mode

Are you comfortable writing perl scripts?

ADD REPLY
0
Entering edit mode

yes i am comfortable

ADD REPLY
0
Entering edit mode

great, then it is relatively straightforward to obtain the junctions given a SAM/BAM file. All you need is the start position of a read and its CIGAR. Suppose a read starts at 10000 and cigar is 15M20N75M, then the read is present from 10000-10014 and 10035-10109. So, the read's junction coordinates are 10015-10034. It should be a bit more involved to take into account the effect of Insertions and deletions. But this is the basic principle. Let me know if you run into problems.

ADD REPLY
0
Entering edit mode

Hey Arun

What if the CIGAR string is something like this 1S14M20N75M

ADD REPLY
0
Entering edit mode
12.0 years ago
Malcolm.Cook ★ 1.5k

If you're comfortable with R/BioConductor

and have a current installation

then

you can the the approach given in my recent update to Obtaining Junctions.Bed File From Aligners Such As Gsnap

ADD COMMENT

Login before adding your answer.

Traffic: 1554 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6