Are there risks in using a GRCh38 GTF with an hg19 FASTA for alignment?
3.4 years ago
ddzhangzz ▴ 90

We used STAR to align RNA-seq reads against the hg19 genome, but I noticed the programmer used an hg19 FASTA file together with a GENCODE v30 (GRCh38) GTF for annotation and counting. Are there any risks in this setup?

RNASeq • 916 views

It's not a "risk": it will likely screw up the experiment if you use totally different builds for two different steps.


It is a different coordinate system, so I would upgrade "likely" to "almost certainly" screw up results.
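
One cheap sanity check for a coordinate mismatch is to look for GTF features that end past the end of their chromosome in the FASTA. Below is a minimal sketch, assuming plain-text (uncompressed) FASTA and GTF input; the function names `contig_lengths` and `out_of_bounds_features` are made up for illustration. An empty result does not prove the builds match, but any hit is a strong sign that they do not.

```python
import io

def contig_lengths(fasta_handle):
    """Return {sequence_name: length} computed from a FASTA file handle."""
    lengths = {}
    name = None
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            # The name is the first whitespace-delimited token after '>'
            tokens = line[1:].split()
            name = tokens[0] if tokens else None
            if name is not None:
                lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

def out_of_bounds_features(gtf_handle, lengths):
    """Return (seqname, feature, start, end) for GTF records ending past the contig end."""
    bad = []
    for line in gtf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed GTF record
        seqname, feature = fields[0], fields[2]
        start, end = int(fields[3]), int(fields[4])
        # GTF coordinates are 1-based and inclusive, so end may equal the length
        if seqname in lengths and end > lengths[seqname]:
            bad.append((seqname, feature, start, end))
    return bad

# Toy data (not real genome content): chr1 is 12 bases long
toy_fasta = io.StringIO(">chr1 toy contig\nACGTACGT\nACGT\n")
toy_gtf = io.StringIO(
    'chr1\ttoy\tgene\t1\t10\t.\t+\t.\tgene_id "G1";\n'
    'chr1\ttoy\tgene\t11\t20\t.\t+\t.\tgene_id "G2";\n'
)
lengths = contig_lengths(toy_fasta)
bad = out_of_bounds_features(toy_gtf, lengths)
print(lengths, bad)
```

For a real genome, reading the whole FASTA in Python is slow; the lengths could instead be taken from a `.fai` index produced by `samtools faidx`.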

3.4 years ago

If your GTF and reference genome don't match, your read counts per gene will be off. If the chromosome names don't match between the GTF and the genome (for example `chr1` vs `1`), no genes at all will be counted.
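
The chromosome-name problem is easy to check before aligning. Here is a minimal sketch, assuming uncompressed FASTA and GTF input; the helper names (`fasta_names`, `gtf_names`, `compare_names`) are hypothetical, not part of STAR or any other tool.

```python
import io

def fasta_names(handle):
    """Return the set of sequence names found in FASTA header lines."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            # The name is the first whitespace-delimited token after '>'
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names

def gtf_names(handle):
    """Return the set of sequence names used in column 1 of a GTF."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names

def compare_names(fasta_handle, gtf_handle):
    """Report which sequence names are shared and which appear in only one file."""
    fa = fasta_names(fasta_handle)
    gtf = gtf_names(gtf_handle)
    return {"shared": fa & gtf, "gtf_only": gtf - fa, "fasta_only": fa - gtf}

# Toy data (not real genome content): Ensembl-style names in the FASTA,
# UCSC/GENCODE-style names in the GTF
toy_fasta = io.StringIO(">1 dna:chromosome\nACGT\n>2 dna:chromosome\nACGT\n")
toy_gtf = io.StringIO(
    "##description: toy annotation\n"
    'chr1\tHAVANA\tgene\t1\t4\t.\t+\t.\tgene_id "G1";\n'
)
result = compare_names(toy_fasta, toy_gtf)
print(result)
```

An empty `shared` set means nothing will be counted. Note that matching names are not enough on their own: hg19 and GRCh38 both use `chr1`, `chr2`, and so on, so this check can pass even when the coordinates belong to different builds.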

For instance, for some reason 10x Genomics builds its references from Ensembl genomes and GENCODE GTFs, but they have to do a few lines of finagling to make them work together.

https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#mm10_#{files.refdata_mm10.version}

It's far easier and far safer to do things right from the start. Get your genome and GTF from the same place, for the same build.
