4.5 years ago
yueli7
Hello,
I am using STAR to build a human genome index.
I have hg38.fa file which downloaded from: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/
There are four GTF files: hg38.refGene.gtf.gz, hg38.ncbiRefSeq.gtf.gz, hg38.knownGene.gtf.gz and hg38.ensGene.gtf.gz
in: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/
Which one should I use to build the human index?
Thanks in advance for any help!
Best,
Yue
There is no strict rule about this. I personally use GENCODE: https://www.gencodegenes.org/human/
The choice is often arbitrary. People tend to use whatever they stumbled over first, whatever they find most appealing in terms of file formatting and gene names, or whatever a colleague advised them to use, which in turn is probably based on what that colleague stumbled over first or found most appealing.
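Whichever annotation you settle on, the indexing step itself looks the same. Here is a minimal sketch in Python that only assembles the `STAR --runMode genomeGenerate` command line. The helper name `star_index_command`, the output directory `star_index_hg38`, the read length of 100 and the thread count are assumptions for illustration; the file names are taken from the question, decompressed, since as far as I know STAR expects plain-text FASTA and GTF for index generation.

```python
import shlex


def star_index_command(fasta, gtf, genome_dir, read_length=100, threads=8):
    """Assemble the argument list for a STAR genomeGenerate run.

    Hypothetical helper: it only builds the command, it does not run STAR.
    """
    if fasta.endswith(".gz") or gtf.endswith(".gz"):
        # Assumption: inputs must be decompressed first (e.g. with gunzip).
        raise ValueError("decompress the FASTA/GTF before indexing")
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--sjdbGTFfile", gtf,
        # The STAR manual suggests read length minus 1 for this option.
        "--sjdbOverhang", str(read_length - 1),
    ]


cmd = star_index_command("hg38.fa", "hg38.ncbiRefSeq.gtf", "star_index_hg38")
print(shlex.join(cmd))
```

One thing to check before running it: the chromosome names in the GTF have to match those in the FASTA (for example `chr1` versus `1`), otherwise the annotation will not be used properly.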
GENCODE is more comprehensive than RefSeq. It contains more transcripts, especially non-coding ones, so if you are interested in lncRNAs, GENCODE might be the better choice. RefSeq is more conservative and contains fewer transcripts.
For a more in-depth comparison, see e.g. https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-16-S8-S2
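If you want to see the difference in transcript content for yourself, you could count the distinct `transcript_id` values in each GTF. Below is a rough sketch, assuming the standard nine-column GTF layout with a `transcript_id "..."` attribute; the function name `count_transcripts` and the three example lines are made up for illustration and are not taken from any real annotation.

```python
import re

# Matches a non-empty transcript_id attribute in the GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def count_transcripts(gtf_lines):
    """Count distinct transcript IDs in an iterable of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed lines
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return len(ids)


# Made-up example records: two exons of T1 and one exon of T2.
example = [
    "# toy annotation\n",
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tdemo\texon\t100\t250\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
]
n_transcripts = count_transcripts(example)
print(n_transcripts)
```

For a real file, you can pass an open file handle instead of the list, e.g. one from `gzip.open(path, "rt")` for the `.gtf.gz` downloads.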
Hello, ATpoint,
Thank you so much for your quick response and detailed explanation!
Thank you and Best!
Yue