Removing genes found on two (or more) chromosomes from gtf file
7.0 years ago
arta ▴ 670

Hi,

I am analyzing RNA-seq data for differential exon usage with DEXSeq. However, I hit an error while preparing the annotation file for the main analysis. Here is the error:

raise ValueError, "Same name found on two chromosomes: %s, %s" % ( str(l[i]), str(l[i+1]) )

I realized that some genes are annotated on multiple chromosomes in the GTF file (GRCh38/hg38), and I need to remove them.

Any ideas or help?

Thanks...

gtf awk bash dexseq python • 1.7k views

It seems like something to do with the exon coordinates - https://support.bioconductor.org/p/44963/


Well, I know this might be another issue, but my first concern is removing the genes that are found on more than one chromosome.
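
For reference, the filtering step itself could be sketched roughly as below. This is only a minimal sketch, not part of DEXSeq: the function names (`multi_chromosome_genes`, `drop_genes`, `remove_multichrom_genes`) are made up for illustration, and it assumes a standard tab-separated GTF where column 1 is the chromosome and column 9 carries a `gene_id "..."` attribute. It reads the file twice: once to collect the chromosomes each gene_id appears on, and once to write out every line whose gene_id was seen on a single chromosome only.

```python
import re
from collections import defaultdict

# Assumption: the gene is identified by the gene_id attribute in column 9,
# written as gene_id "SOME_ID"; as in Ensembl/GENCODE-style GTF files.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def parse_chrom_and_gene(line):
    """Return (chromosome, gene_id) for a feature line, or None for
    comment lines and lines without a gene_id attribute."""
    if line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return None
    match = GENE_ID_RE.search(fields[8])
    if match is None:
        return None
    return fields[0], match.group(1)


def multi_chromosome_genes(lines):
    """Collect the gene_ids that occur on more than one chromosome."""
    chroms_by_gene = defaultdict(set)
    for line in lines:
        parsed = parse_chrom_and_gene(line)
        if parsed is not None:
            chrom, gene_id = parsed
            chroms_by_gene[gene_id].add(chrom)
    return {g for g, chroms in chroms_by_gene.items() if len(chroms) > 1}


def drop_genes(lines, unwanted):
    """Yield every line except feature lines whose gene_id is in `unwanted`.
    Comment/header lines are passed through unchanged."""
    for line in lines:
        parsed = parse_chrom_and_gene(line)
        if parsed is not None and parsed[1] in unwanted:
            continue
        yield line


def remove_multichrom_genes(in_path, out_path):
    """Two passes over the file, so the whole GTF is never held in memory.
    Returns the set of gene_ids that were removed."""
    with open(in_path) as handle:
        unwanted = multi_chromosome_genes(handle)
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_genes(src, unwanted))
    return unwanted
```

Calling `remove_multichrom_genes("in.gtf", "filtered.gtf")` (both file names are placeholders) writes the filtered file and returns the removed gene_ids, so it is easy to check which genes were dropped before running dexseq_prepare_annotation.py on the result.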


Where did you download the GTF from? Did you run the dexseq_prepare_annotation.py Python script that comes with DEXSeq?
