Dear All,
I'm interested in checking the conservation of lncRNAs and protein-coding genes from GENCODE. In my own data I also have some newly assembled lncRNAs that are not found in GENCODE. I would like to make a plot like the one below:
The image above is Figure 1e from the paper "Recurrently deregulated lncRNAs in hepatocellular carcinoma".
As in that figure, I want to compare the conservation of known lncRNAs, protein-coding genes and the newly found lncRNAs.
How can I calculate the phastCons score for all the genes?
Is this also by you: https://support.bioconductor.org/p/129140/ ?
There is a whole literature on that:
http://compgen.cshl.edu/phast/phastCons-HOWTO.html
In the paper you mentioned, they downloaded the scores directly. If your data is specific and customized, you would have to calculate the scores yourself with the tool, which you can download from http://compgen.cshl.edu/phast/downloads.php
For the newly found lncRNAs, yes, I will do it myself using phast. But how do I get the phastCons score for GENCODE protein-coding genes and known lncRNAs?
https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons100way
At the end of that page there are links to download the phastCons scores. I think that's what they mean in the paper.
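In case it helps, here is a minimal Python sketch of how such a download could be read. It assumes the per-chromosome files are in fixedStep wiggle format (one "fixedStep chrom=... start=... step=..." header followed by one score per line), which is worth checking against the file you actually get. The function names and the toy data are made up for illustration, and the "span" parameter is ignored.

```python
import io


def parse_fixed_step(handle):
    """Yield (chrom, pos, score) tuples, with 1-based pos, from fixedStep wiggle text."""
    chrom, pos, step = None, None, 1
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("track", "#")):
            continue
        if line.startswith("fixedStep"):
            # Header such as: fixedStep chrom=chr1 start=101 step=1
            fields = dict(item.split("=", 1) for item in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
            continue
        if chrom is None:
            raise ValueError("score line found before any fixedStep header")
        yield chrom, pos, float(line)
        pos += step


def mean_score(records, chrom, start, end):
    """Mean score over the 1-based inclusive interval; bases without a score are skipped."""
    values = [s for c, p, s in records if c == chrom and start <= p <= end]
    return sum(values) / len(values) if values else None


# Toy input (hypothetical values), two blocks with a gap in between.
toy = """fixedStep chrom=chr1 start=101 step=1
0.1
0.2
0.9
fixedStep chrom=chr1 start=201 step=1
0.5
0.7
"""
records = list(parse_fixed_step(io.StringIO(toy)))
toy_mean = mean_score(records, "chr1", 102, 201)
```

For a real chromosome you would open the gzipped file with gzip.open(path, "rt") and keep the scores in an array rather than a list of tuples, since a list like this will not scale to hundreds of millions of bases.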
I found an R package, GenomicScores, to extract the phastCons scores. In the GenomicScores tutorial I see that scores can be extracted only for a given chromosome location. But I would like to know whether there is any way to get the phastCons score for all protein-coding genes and known lncRNAs from phastCons100way.UCSC.hg38.
Yes, from the above example I'm able to get the phastCons score for a specific chromosome location. But I'm interested in extracting the scores for all protein-coding genes and known lncRNAs.
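One way to go from single locations to all genes is to take the gene coordinates from the GENCODE GTF and average the per-base scores over each gene, grouped by gene type. Below is a rough Python sketch of that idea on toy data. It assumes the GTF attribute column carries gene_id and gene_type, and that known lncRNAs are labelled "lncRNA" (older GENCODE releases use labels such as "lincRNA" or "antisense", so adjust wanted_types). The function names, the toy GTF lines and the scores dictionary are all hypothetical. It averages over the whole gene span for simplicity; averaging over exons only may be closer to what you want for the plot.

```python
import re
from collections import defaultdict

# Matches attribute pairs such as: gene_id "ENSG..."; gene_type "lncRNA";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def genes_from_gtf(lines, wanted_types=("protein_coding", "lncRNA")):
    """Yield (gene_id, gene_type, chrom, start, end) for 'gene' rows of the wanted types."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        gene_type = attrs.get("gene_type")
        if gene_type in wanted_types:
            # GTF coordinates are 1-based and inclusive.
            yield attrs.get("gene_id"), gene_type, cols[0], int(cols[3]), int(cols[4])


def mean_gene_score(scores, chrom, start, end):
    """Mean of the per-base scores in [start, end]; bases without a score are skipped."""
    values = [scores[(chrom, p)] for p in range(start, end + 1) if (chrom, p) in scores]
    return sum(values) / len(values) if values else None


# Toy annotation (hypothetical genes); the transcript and miRNA rows are ignored.
toy_gtf = [
    'chr1\tHAVANA\tgene\t1\t4\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding"; gene_name "A";\n',
    'chr1\tHAVANA\ttranscript\t1\t4\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding";\n',
    'chr1\tHAVANA\tgene\t11\t12\t.\t-\t.\tgene_id "G2"; gene_type "lncRNA"; gene_name "B";\n',
    'chr1\tHAVANA\tgene\t21\t22\t.\t+\t.\tgene_id "G3"; gene_type "miRNA"; gene_name "C";\n',
]
# Toy per-base scores keyed by (chrom, 1-based position).
toy_scores = {("chr1", 1): 0.8, ("chr1", 2): 0.9, ("chr1", 3): 1.0, ("chr1", 4): 0.9,
              ("chr1", 11): 0.1, ("chr1", 12): 0.3}

per_gene = {}
per_type = defaultdict(list)
for gene_id, gene_type, chrom, start, end in genes_from_gtf(toy_gtf):
    score = mean_gene_score(toy_scores, chrom, start, end)
    if score is not None:
        per_gene[gene_id] = score
        per_type[gene_type].append(score)
```

The per_type lists are what you would feed into a cumulative distribution or box plot, with your newly assembled lncRNAs added as a third group. If you would rather stay in R, I believe gscores() in GenomicScores accepts a GRanges object with many ranges, so passing it the gene coordinates from the GTF should give one averaged score per gene, but please check that against the package documentation.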