Calculate Percent Spliced In (PSI)
1
4
Entering edit mode
10.6 years ago
Vikas Bansal ★ 2.4k

Dear all,

I was wondering if there is any tool to calculate PSI from the BAM files. Basically what I want is-

I have gtf file with all the exons and I want to calculate for each exon (lets take exon filled with grey for example) how many reads mapped to junction of that exon (a and b in figure) and how many reads skip that exon (c in figure). So if we have, let's say, 10k exons in the gtf, the output should contain 10k rows with a,b and c as columns.

I used MATS and bam2ssj but did not get what I was expecting. I have gtf file with approx. 300k exons but MATS gives the output only for 3k exons. bam2ssj is calculating the values for introns rather than exons.

Any suggestions?

Thanks in advance.

RNA-Seq Splicing • 20k views
ADD COMMENT
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Thanks for the reply. I read about MISO but it says

Two general kinds of analyses are possible:

  1. Estimate expression level of exons ("exon-centric" analysis) or,
  2. Estimate expression level of whole transcripts ("isoform-centric" analysis)

I think it include the reads mapped on exons but I am interested only in junction reads.

ADD REPLY
0
Entering edit mode

in the documentation there is also a paragraph about psi estimation.

the definition given in their paper is: 'percentage spliced in' (PSI or Ψ)11 denotes the fraction of mRNAs that represent the inclusion isoform. Reads aligning to the alternative exon or to its junctions with adjacent constitutive exons provide support for the inclusion isoform, whereas reads aligning to the junction between the adjacent constitutive exons support the exclusion isoform; the relative read density of these two sets forms the standard estimate of Ψ.

the paper is here

ADD REPLY
0
Entering edit mode

Hi Vikas,

we have just written a small protocol to do this:

http://onlinelibrary.wiley.com/doi/10.1002/0471142905.hg1116s87/abstract

ADD REPLY
6
Entering edit mode
10.6 years ago

I just tried to write something for this: https://github.com/lindenb/jvarkit/wiki/Biostar103303

I don't have any data to test though.

ADD COMMENT
0
Entering edit mode

Thanks a lot for the reply. Its amazing that you included it in jvarkit so quickly. I am trying to install jvarkit but it is not working. Below are my commands -

ADD REPLY
1
Entering edit mode

use jdk7 please, not 8 (java 8 is too "new" for the libraries included in htsjdk)

ADD REPLY
0
Entering edit mode

Thanks. Now build is fine. But when I do ant biostar103303 or ant Biostar103303 or ant Biostar103303.java, it says

Buildfile: /project/specht/tools/jvarkit/htsjdk/build.xml

BUILD FAILED
Target "biostar103303" does not exist in the project "htsjdk".

Total time: 0 seconds
ADD REPLY
0
Entering edit mode

you forgot to cd ... Please run ant biostar103303 in varkit not in varkit/htsjdk

ADD REPLY
0
Entering edit mode

Finally, I got the output file. But the columns - "exon.count_prev_and_next", "exon.count_prev_and_curr" "exon.count_curr_and_next" are all 0. I think it is not able to calculate junction reads.

ADD REPLY
0
Entering edit mode

do you have a bam (a subset) and a link to gtf so I can test my program ?

ADD REPLY
0
Entering edit mode

thanks, I'll play with it next week.

ADD REPLY
0
Entering edit mode

I've updated my sources.It seems to work now. Re-download everything to use git to update the sources. Please use https://github.com/lindenb/jvarkit/issues?state=open to report others bug.

ADD REPLY
0
Entering edit mode

Thanks. I think I found a bug and I will post it on github. Just one thing- Did you publish jvarkit? I found it quite fast. Although I never use java, I found it easy to use.

ADD REPLY
0
Entering edit mode

Will you explain what the columns represent in the results.tsv file?

ADD REPLY

Login before adding your answer.

Traffic: 1808 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6