Hello,
I would like to extract the exons common to multiple transcripts of the same gene from gencode.v19.annotation.gtf. Is there an easy way to do that?
Thanks in advance for your help
Jean
Perhaps make two BED files of genes and exons, and then map exons to genes:
$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_19/gencode.v19.annotation.gff3.gz \
| gunzip --stdout - \
| awk '$3=="gene"' - \
| convert2bed -i gff - \
> genes.bed
$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_19/gencode.v19.annotation.gff3.gz \
| gunzip --stdout - \
| awk '$3=="exon"' - \
| convert2bed -i gff - \
> exons.bed
$ bedmap --echo --echo-map genes.bed exons.bed > genes_with_mapped_exons.bed
Once you have the list of exons that map to each gene, you could use a Perl or Python script to process each gene line, store each exon's coordinates as a key in a hash table, and identify common exons as keys that occur more than once (or, for exons shared by every transcript, as many times as the gene has transcripts). Print those exons or exon IDs to standard output along with the gene ID, for instance.
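Here is a minimal Python sketch of that hash-table idea. Note that it is a variation and not the exact pipeline above: it reads exon records straight from GTF lines instead of the bedmap output, and it assumes GENCODE-style attributes (gene_id "..."; transcript_id "...";) in the ninth column. The function name common_exons and the sample records are made up for illustration. An exon is reported as common only if its coordinates appear in every transcript of the gene, so for a single-transcript gene every exon counts as common.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def common_exons(gtf_lines):
    """Return {gene_id: sorted list of (chrom, start, end, strand)} for the
    exons whose coordinates occur in every transcript of the gene."""
    # gene_id -> exon coordinates -> set of transcript IDs containing the exon
    exon_to_transcripts = defaultdict(lambda: defaultdict(set))
    # gene_id -> set of all transcript IDs seen for that gene
    gene_transcripts = defaultdict(set)

    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep exon records only
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs["gene_id"]
        transcript_id = attrs["transcript_id"]
        key = (fields[0], int(fields[3]), int(fields[4]), fields[6])
        exon_to_transcripts[gene_id][key].add(transcript_id)
        gene_transcripts[gene_id].add(transcript_id)

    result = {}
    for gene_id, exons in exon_to_transcripts.items():
        n_transcripts = len(gene_transcripts[gene_id])
        result[gene_id] = sorted(
            key for key, transcripts in exons.items()
            if len(transcripts) == n_transcripts
        )
    return result


# Hypothetical records: gene G1 has transcripts T1 and T2, sharing one exon.
sample = [
    'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tHAVANA\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    'chr1\tHAVANA\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
]

for gene_id, exons in common_exons(sample).items():
    for chrom, start, end, strand in exons:
        print(gene_id, chrom, start, end, strand, sep="\t")
```

To run it on the real annotation, you could pass an open file handle instead of the sample list, e.g. open the uncompressed gencode.v19.annotation.gtf and call common_exons on it. If you would rather have exons shared by at least two transcripts, change the comparison to len(transcripts) >= 2.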
Hi,
I'm the developer of pyGeno. pyGeno is designed to simplify this kind of operation and lets you go further if needed. Here's an example that works for the gene TPST2. Of course it would work for any other gene, and you could also loop through all the genes of the genome if you wanted to:
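from pyGeno.Genome import *

ref = Genome(name = "GRCh37.75")
gene = ref.get(Gene, name = "TPST2")[0]
# you can also make a query using the Ensembl Id
# gene = ref.get(Gene, id = "ENSG00000128294")[0]

transcripts = list(gene.get(Transcript))
exons = {}   # exon id -> exon object
counts = {}  # exon id -> number of transcripts that contain the exon
for trans in transcripts :
    for exon in trans.exons :
        if exon.id not in exons :
            exons[exon.id] = exon
            counts[exon.id] = 0
        counts[exon.id] += 1

print("Exons in common")
for exon_id, e in exons.items() :
    # keep only the exons present in every transcript of the gene
    if counts[exon_id] == len(transcripts) :
        print(e)
        print("whole sequence: %s" % e.sequence)
        print("coding sequence: %s" % e.CDS)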
pyGeno uses Ensembl annotations.
Hope that helps.
Cheers
Hi,
I tried importing the pyGeno module. I am getting the following error:
Please help me. Thank you.