How to calculate "Tissue specificity score" using JSD
7.6 years ago
Ashley ▴ 90

Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse cellular processes, but determining the function of individual lncRNAs remains a challenge. Recent advances in RNA sequencing (RNA-Seq) and computational methods allow for an unprecedented analysis of such transcripts. Our catalogue unifies previously existing annotation sources with transcripts we assembled from RNA-Seq data across 24 human tissues and cell types.

We want to test whether lncRNA expression is strikingly tissue specific compared with coding genes. I'm using Jensen-Shannon (JS) divergence to evaluate tissue specificity. I recently read the paper "Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses". However, I don't know how to calculate the tissue specificity score from Python code such as the following:

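from scipy.stats import entropy
from numpy.linalg import norm
import numpy as np

def JSD(P, Q):
    # Jensen-Shannon divergence between two non-negative vectors (natural log).
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Normalise each vector so that it sums to 1.
    _P = P / norm(P, ord=1)
    _Q = Q / norm(Q, ord=1)
    # _M is the mixture (average) of the two distributions.
    _M = 0.5 * (_P + _Q)
    return 0.5 * (entropy(_P, _M) + entropy(_Q, _M))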
RNA-Seq next-gen sequencing python • 4.9k views
Please reformat the question to make it readable. It looks like the non-code is formatted as code and vice versa.

How to best go about it will depend a bit on the exact input you have. Here's what we're doing for JSD calculation in deepTools. Our requirements are a bit odd, since we have a very spiky distribution, which is why all of the interpolation stuff is done.
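
For the tissue-specificity case in the question (this is not the deepTools code), here is a minimal sketch of how the score could be computed for one gene. It assumes the definition as I understand it from the lincRNA paper quoted in the question: the expression vector is log2(x + 1) transformed and normalised to sum to 1, compared against an idealised profile that is 1 in a single tissue and 0 elsewhere, and the specificity for that tissue is 1 minus the square root of the JS divergence; the gene's score is the maximum over tissues. The function names `js_divergence` and `jsd_specificity` are made up for illustration, and the sketch uses only the standard library so it runs without SciPy.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) of two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the sum.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    # Clamp at 0 to guard against tiny negative values from rounding.
    return max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m))

def jsd_specificity(expression):
    """Return (score, tissue_index) for one gene's expression across tissues."""
    # Assumption: log2(x + 1) transform, then normalise to a distribution.
    logged = [math.log2(x + 1.0) for x in expression]
    total = sum(logged)
    if total == 0:
        return 0.0, None  # gene is not expressed in any tissue
    profile = [x / total for x in logged]
    best_score, best_tissue = -1.0, None
    for t in range(len(profile)):
        # Idealised profile: expressed only in tissue t.
        ideal = [1.0 if i == t else 0.0 for i in range(len(profile))]
        score = 1.0 - math.sqrt(js_divergence(profile, ideal))
        if score > best_score:
            best_score, best_tissue = score, t
    return best_score, best_tissue
```

With base-2 logarithms the divergence lies between 0 and 1, so the score is 1 for a gene expressed in exactly one tissue and falls as expression spreads across tissues. Calling `jsd_specificity(row)` on each row of a genes-by-tissues table gives one score per gene.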

5.5 years ago

In case you're still interested in that, I developed a Python package that computes several tissue-specificity metrics, including JSD.

http://github.com/apcamargo/tspex

I'm finishing a web version of it that will be online soon (I hope).

Thank you very much.

You're welcome! In the next few days I plan to release a new version with a command-line interface and a full tutorial explaining how to use the package inside Python.

7.6 years ago
BioinfGuru ★ 2.1k

I recently developed a tissue-specificity pipeline based on the Tau algorithm, as recommended by a benchmark paper published this year. I had RNA-seq expression data from 23 tissues and found plenty of tissue-specific protein-coding genes but NO tissue-specific lncRNAs.

I highly recommend studying the benchmark paper as your starting point.

EDIT: Apologies, I should qualify: by "tissue specific" I mean "expressed in only 1 tissue" (out of 23). Many may show selectivity for a few (>1) tissues, but I found no lincRNAs that are specific for a single tissue. (Thank you EagleEye)
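
To make the strict "only 1 tissue" criterion concrete, here is a minimal sketch of Tau for a single gene, assuming the usual definition (Yanai et al., 2005): each value is divided by the gene's maximum across tissues, and Tau is the sum of (1 - normalised value) divided by (number of tissues - 1). This is not my pipeline's code; the function name `tau` and the choice to return NaN for unexpressed genes are assumptions for illustration, and any log transform or filtering is left out.

```python
import math

def tau(expression):
    """Tau tissue-specificity index for one gene's expression across tissues."""
    n = len(expression)
    peak = max(expression)
    if n < 2 or peak == 0:
        # Tau is undefined without expression; NaN is an assumption made here.
        return math.nan
    # Normalise by the maximum, then average the "distance from the peak".
    return sum(1.0 - x / peak for x in expression) / (n - 1)
```

Tau is exactly 1 only when a gene is expressed in a single tissue and 0 when expression is uniform, so requiring Tau = 1 is much stricter than the cut-offs below 1 that allow selectivity for a few tissues.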

Hi kennethcondon2007,

I have a small suggestion. Long noncoding RNAs have been studied extensively in recent years, and a large number of studies and approaches have shown that lncRNAs act in a tissue-specific manner.

I kindly request that you reconsider your pipeline and evaluate it with different datasets.

Some benchmarking article references:

http://genome.cshlp.org/content/24/4/616.long?view=long&pmid=24429298

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3185964/?report=classic

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0570-4

Thank you EagleEye - I have edited my answer above. I'll certainly be reading those papers over the weekend.

7.6 years ago
Ashley ▴ 90

This is confusing to me. I'm not strong in math or programming, so I would prefer to use existing software for this. Thank you very much!

Shannon Entropy is one of the metrics benchmarked in the paper I cite above.
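
As a rough illustration of that metric, here is a minimal sketch that treats one gene's expression across tissues as a probability distribution and computes its Shannon entropy in bits. The function name `shannon_entropy` and the NaN return for unexpressed genes are assumptions for illustration; the benchmark may apply additional transformations that are not shown.

```python
import math

def shannon_entropy(expression):
    """Shannon entropy (bits) of one gene's expression profile across tissues."""
    total = sum(expression)
    if total == 0:
        # Undefined for a gene with no expression; NaN is an assumption here.
        return math.nan
    probs = [x / total for x in expression]
    # Zero-probability tissues contribute nothing to the entropy.
    return sum(-p * math.log2(p) for p in probs if p > 0)
```

Low entropy means specific expression: the value is 0 for a gene expressed in one tissue and log2(number of tissues) for perfectly uniform expression, so the scale runs in the opposite direction to Tau or a JSD-based score.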

Thank you very much, I'll try it later! Good Luck!
