Question

Top 20% expressed orthologous genes in two species

9.0 years ago

Saad Khan ▴ 440

Hi,

I am looking at publicly available RNA-seq data and trying to see how, in two species (mouse and human), the epigenomic data varies with gene expression for highly expressed genes and genes with low expression. I know of papers that set a cutoff for a gene to be considered expressed (http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000598): expression greater than 0.1 RPKM. Here is how I am taking the top 20% and the bottom 20% of expressed orthologs. I apply an expression cutoff (FPKM/RPKM > 0.1) and remove any gene pair in which either species falls below that cutoff (even if the gene from one species is below the cutoff and the other is not, I don't consider that pair). Then I sort the resulting gene pair list by expression value in descending order and take the top 20% as highly expressed ortholog pairs and the bottom 20% as lowly expressed pairs.
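For concreteness, the procedure above could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: the `OrthologPair` record, the function name `split_by_expression`, and the toy FPKM values are hypothetical and made up for illustration. The description does not say which species' value the pairs are sorted on, so ranking by the mean of the two values is an assumption here.

```python
from typing import List, NamedTuple, Tuple


class OrthologPair(NamedTuple):
    """Hypothetical record for one human/mouse ortholog pair."""
    human_gene: str
    mouse_gene: str
    human_fpkm: float
    mouse_fpkm: float


def split_by_expression(
    pairs: List[OrthologPair], cutoff: float = 0.1, fraction: float = 0.2
) -> Tuple[List[OrthologPair], List[OrthologPair]]:
    """Return (top, bottom) fractions of ortholog pairs ranked by expression."""
    # Keep a pair only if BOTH orthologs are above the expression cutoff.
    expressed = [
        p for p in pairs if p.human_fpkm > cutoff and p.mouse_fpkm > cutoff
    ]
    # ASSUMPTION: rank pairs by the mean of the two species' values.
    ranked = sorted(
        expressed,
        key=lambda p: (p.human_fpkm + p.mouse_fpkm) / 2,
        reverse=True,
    )
    n = int(len(ranked) * fraction)
    top = ranked[:n]
    # Guard against n == 0, because ranked[-0:] would return the whole list.
    bottom = ranked[-n:] if n > 0 else []
    return top, bottom


# Toy data (made-up values, not real measurements).
toy_pairs = [
    OrthologPair("G1", "g1", 100.0, 80.0),
    OrthologPair("G2", "g2", 50.0, 60.0),
    OrthologPair("G3", "g3", 20.0, 10.0),
    OrthologPair("G4", "g4", 5.0, 7.0),
    OrthologPair("G5", "g5", 2.0, 1.0),
    OrthologPair("G6", "g6", 0.05, 30.0),  # human below cutoff: pair dropped
    OrthologPair("G7", "g7", 0.5, 0.3),
    OrthologPair("G8", "g8", 3.0, 0.02),   # mouse below cutoff: pair dropped
    OrthologPair("G9", "g9", 0.2, 0.2),
    OrthologPair("G10", "g10", 12.0, 9.0),
    OrthologPair("G11", "g11", 1.0, 0.8),
    OrthologPair("G12", "g12", 40.0, 35.0),
]

top, bottom = split_by_expression(toy_pairs)
print("top:", [p.human_gene for p in top])
print("bottom:", [p.human_gene for p in bottom])
```

With the toy data, two pairs fail the cutoff, leaving ten expressed pairs, so each 20% group holds two pairs. Swapping the sort key for a single species' value or a per-species rank would change which pairs land in each group.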

I know it's a very naive approach and has its flaws, so I would appreciate it if someone could suggest a better, more robust statistical approach.

Thanks

PS: crossposted from (Comparing features of high expression genes vs low expression genes in two species) since that went unanswered.

RNA-seq RPKM FPKM • 2.0k views

updated 8.8 years ago by Biostar 20 • written 9.0 years ago by Saad Khan ▴ 440