Hi there,
The GENCODE database (https://www.gencodegenes.org/human/) provides two types of genome FASTA files for human:
- Genome sequence (GRCh38.p12) - ALL: Nucleotide sequence of the GRCh38.p12 genome assembly version on all regions, including reference chromosomes, scaffolds, assembly patches and haplotypes
- Genome sequence, primary assembly (GRCh38) - PRI: Nucleotide sequence of the GRCh38 primary genome assembly (chromosomes and scaffolds)
There are also five types of GTF annotation files:
- Comprehensive gene annotation - CHR: It contains the comprehensive gene annotation on the reference chromosomes only
- Comprehensive gene annotation - ALL: It contains the comprehensive gene annotation on the reference chromosomes, scaffolds, assembly patches and alternate loci (haplotypes)
- Comprehensive gene annotation - PRI: It contains the comprehensive gene annotation on the primary assembly (chromosomes and scaffolds) sequence regions
- Basic gene annotation - CHR: It contains the basic gene annotation on the reference chromosomes only
- Basic gene annotation - ALL: It contains the basic gene annotation on the reference chromosomes, scaffolds, assembly patches and alternate loci (haplotypes)
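Whichever pair is chosen, the region sets described above suggest that the GTF should not refer to sequences that are absent from the genome FASTA. A minimal sketch of that check is below, assuming standard FASTA headers (`>name ...`) and tab-separated GTF lines with the sequence name in the first column. The function names are hypothetical helpers, and the toy sequence names are made up for illustration rather than taken from the GENCODE files.

```python
import gzip
from typing import Iterable, Set


def open_text(path: str):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def fasta_seqnames(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names: the first word after '>' on each header line."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_seqnames(lines: Iterable[str]) -> Set[str]:
    """Collect the first column of every non-comment, non-blank GTF line."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_genome(fasta_lines: Iterable[str], gtf_lines: Iterable[str]) -> Set[str]:
    """Return sequence names annotated in the GTF but absent from the FASTA."""
    return gtf_seqnames(gtf_lines) - fasta_seqnames(fasta_lines)


# Toy data (made up): a genome with one chromosome and one scaffold,
# and an annotation that also refers to an extra region.
toy_fasta = [">chr1 1\n", "ACGT\n", ">SCAFFOLD_A\n", "ACGT\n"]
toy_gtf = [
    "##description: toy annotation\n",
    'chr1\tHAVANA\tgene\t1\t4\t.\t+\t.\tgene_id "G1";\n',
    'SCAFFOLD_A\tHAVANA\tgene\t1\t4\t.\t+\t.\tgene_id "G2";\n',
    'ALT_EXAMPLE\tHAVANA\tgene\t1\t4\t.\t+\t.\tgene_id "G3";\n',
]
missing = missing_from_genome(toy_fasta, toy_gtf)
print(sorted(missing))
```

With real downloads, the same functions could be given file handles from `open_text(...)` for the chosen FASTA and GTF. An empty result would mean every annotated region has a matching sequence in the genome file.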
The aim of my study is to identify differentially expressed human protein-coding genes and long non-coding RNA genes.
Which genome file and which GTF file should I use for this purpose?
Many thanks,
Tom
I want to analyze tRNA fragments from small RNA sequencing samples. Which type of annotation file should I use? GENCODE also includes tRNA.gtf. Should I use it?
Why is this added as an answer to a 5-year-old question? I'm moving it to a comment. Please open a new question, and in the future, add answers only when you're actually answering the top-level question.