How to create a GTF using the GRCh38 GTF and ERCC92.gtf
8.4 years ago
adam • 0

Hi,

I am trying to incorporate ERCC spike-ins into my RNA-seq experiments. Everything I have read says to just append the ERCC GTF (ERCC92.gtf) to my other GTF (GRC38.83.gtf). I tried that, but featureCounts now gives an error saying that my new GTF is not usable.

When I compare the ERCC92.gtf with the GRC38 format, they are totally different, so it is not surprising that simply appending doesn't work.

Has anyone been able to get these two types of GTFs to work together?

Thanks!

RNA-Seq • 3.5k views
8.4 years ago
neelablore ▴ 10

Contact:projectsatbangalore.com/bioinformatics.html


Hi adam, are you able to post the first couple of lines of your GRC38.83.gtf file? Are you working on a Mac? There could be an issue with the line-ending characters.

I have been able to merge the two quite easily using cat on the command line.
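In case the line-ending issue is the culprit, here is a rough Python sketch of the same merge that also normalises line endings and checks that every feature line has the nine tab-separated GTF columns. The function name `merge_gtfs` and the output name `combined.gtf` are just placeholders I made up, and this is only one way to do it, not a replacement for checking the files by eye.

```python
def merge_gtfs(paths, out_path):
    """Concatenate GTF files, writing "\n" line endings throughout.

    Returns the number of feature (non-comment) lines written.
    Raises ValueError if a feature line does not have 9 tab-separated fields.
    """
    n_features = 0
    # newline="\n" stops Python from translating line endings on output.
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        for path in paths:
            # Default text mode uses universal newlines, so "\r\n" and a
            # bare "\r" are both read as "\n".
            with open(path, "r", encoding="utf-8") as src:
                for line in src:
                    line = line.rstrip("\n")
                    if not line.strip():
                        continue  # drop blank lines
                    if not line.startswith("#"):
                        fields = line.split("\t")
                        if len(fields) != 9:
                            raise ValueError(
                                f"{path}: expected 9 tab-separated fields, "
                                f"got {len(fields)}: {line[:60]!r}"
                            )
                        n_features += 1
                    out.write(line + "\n")
    return n_features
```

Usage would be something like `merge_gtfs(["GRCh38.83.gtf", "ERCC92.gtf"], "combined.gtf")`. If it raises a `ValueError`, the offending line is printed, which should at least show whether the problem is in the columns rather than the line endings.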


Thank you for your suggestion. I have pasted the first lines of output from OS X terminal below. There are "#" symbols before the "!" of the first few lines, but they were doing weird things to the formatting here, so I removed them in this post. I also added new lines after each entry here, to help with the formatting.

Also, before receiving your answer, I looked more closely at the files, and what I thought were differences between the two turn out not to be differences after all. I now believe them to be compatible. However, if I run STAR alignment using indices created from the GRCh38.83.gtf file alone, alignment works great. If I instead run it with indices created from the concatenated GRC and ERCC files, STAR looks like it is running, but the Aligned.out.sam file reaches 3.6 KB very quickly and then stays that size for at least an hour. I killed STAR at that point, so I am not sure what would happen if I let it continue to run.

!genome-build GRCh38.p5

!genome-version GRCh38

!genome-date 2013-12

!genome-build-accession NCBI:GCA_000001405.20

!genebuild-last-updated 2015-10

1 havana gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2";

1 havana transcript 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; tag "basic"; transcript_support_level "1";
