Human transcriptome reference for RNASeq analysis
9.0 years ago

What reference do you use when mapping human RNA-seq reads?

(I have recently been quite frustrated by inconsistencies between different databases, and even within the same database. For example, when I download data chromosome by chromosome using http://useast.ensembl.org/biomart/martview, I get different results than when I download the genes for the whole genome in bulk.)

The important features for me right now are:

  1. completeness; I want ALL genes, including Y-chromosomal genes (e.g. XKRY is often missing)
  2. sequence and corresponding chromosome must be listed
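
One way to check both requirements against a candidate annotation is to scan its GTF file for the genes you care about and record which chromosome each one sits on. The sketch below is only an illustration: it assumes an Ensembl-style GTF that has `gene` rows and a `gene_name` attribute (UCSC-style files may lack both), and the function names `genes_by_chromosome` and `missing_genes` are made up for this example.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: gene_name "XKRY";
_ATTR = re.compile(r'(\S+) "([^"]*)";')


def genes_by_chromosome(gtf_lines):
    """Map each gene name to the set of chromosomes it is annotated on."""
    found = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF row has 9 tab-separated columns; only look at "gene" rows.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = dict(_ATTR.findall(fields[8]))
        # Fall back to gene_id when no gene_name is present (assumption).
        name = attrs.get("gene_name") or attrs.get("gene_id")
        if name:
            found[name].add(fields[0])
    return found


def missing_genes(gtf_lines, required):
    """Return the required gene names that the annotation does not contain."""
    present = genes_by_chromosome(gtf_lines)
    return sorted(g for g in required if g not in present)
```

With a real file you would pass an open file handle, e.g. `missing_genes(open("annotation.gtf"), ["XKRY"])`, where `annotation.gtf` is a placeholder path. A gene that maps to more than one chromosome in the returned dictionary is also worth a closer look.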

I also looked at the RefSeq genes here: ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/ where it looks like I would have to concatenate those 8 files before looking into them, although I am not sure that is the intended use.

I will be happy if you could share which reference you are using in your RNASeq experiments.

human transcriptome RNA-seq • 7.8k views

Also, those 8 RefSeq files together only have 5598 entries.
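
A count like this can be checked without concatenating the files by hand. The sketch below assumes the downloads are FASTA files (optionally gzipped), so that every record starts with a `>` header line; if the files are in GenBank flat-file format instead, the counting rule would need to change. The function names are hypothetical.

```python
import gzip


def count_fasta_records(lines):
    """Count FASTA records, i.e. lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def count_records_in_files(paths):
    """Sum the FASTA record counts over several plain or gzipped files."""
    total = 0
    for path in paths:
        # Pick the opener from the file extension (assumption: .gz means gzip).
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt") as handle:
            total += count_fasta_records(handle)
    return total
```

Calling `count_records_in_files` on the list of downloaded paths gives the combined number of entries across all files.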

9.0 years ago

You should align your reads against the whole human genome (hg19/GRCh37 or the newer hg38/GRCh38 assembly), which you can download from the Ensembl website. You can then use an annotation file (a GTF file, also available from Ensembl) to count the number of reads per feature, for example with featureCounts.
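
The counting step described above could be scripted roughly as follows. This is a sketch, not a recommended pipeline: it assumes featureCounts (from the Subread package) is installed and on the PATH, that the reads are already aligned to BAM files, and that counting exons grouped by `gene_id` is what you want. The helper name `build_featurecounts_cmd` and all file names are placeholders.

```python
import subprocess


def build_featurecounts_cmd(gtf, bams, out, threads=1):
    """Build a featureCounts command line as a list of arguments."""
    return [
        "featureCounts",
        "-T", str(threads),   # number of threads
        "-t", "exon",         # feature type to count reads over
        "-g", "gene_id",      # attribute used to group features into genes
        "-a", gtf,            # annotation file (GTF)
        "-o", out,            # output table of counts
        *bams,                # one or more aligned BAM files
    ]


if __name__ == "__main__":
    cmd = build_featurecounts_cmd("annotation.gtf", ["sample.bam"], "counts.txt")
    print(" ".join(cmd))
    # Uncomment to actually run it (requires featureCounts and the input files):
    # subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when file names contain spaces.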

9.0 years ago
Constantine ▴ 290

Here are the complete genomes, fully annotated, from Illumina:

https://support.illumina.com/sequencing/sequencing_software/igenome.html

It gives you the option to choose between different source databases.
