Extracting all the genes without introns for a species
1
0
Entering edit mode
2.1 years ago
GR ▴ 400

Hi All,

I was wondering if there is a quick way to extract all the genes without introns for a species from the gff file?

Thanks, RT

introns • 871 views

Take a look at the AGAT toolkit. It should have something in it that will do this. Documentation is available at: https://agat.readthedocs.io/en/latest/?badge=latest


What do you mean by a gene without introns? The mRNA (spliced transcript)? What do you want to achieve when there are isoforms? Extract each isoform independently, or create a chimera by merging all possible isoforms into one single feature?

2.1 years ago

You're looking for transcripts with exactly one exon, i.e. count(exon)==1. So it's something like:

wget -O - -q "https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_42/gencode.v42.annotation.gff3.gz" |\
gunzip -c |\
awk '($3=="exon")' |\
cut -f 9 |\
tr ";" "\n" |\
grep '^transcript_id=' |\
cut -f2 -d '=' |\
sort |\
uniq -u
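
Note that the pipeline above prints transcript IDs, not gene IDs, so a gene with one single-exon isoform and one spliced isoform would still show up. If "gene without introns" is taken to mean that every annotated transcript of the gene has exactly one exon (an assumption, since the question leaves the isoform case open), a rough sketch could look like the following. The function name `intronless_genes` is made up for illustration, and the code assumes GENCODE-style `gene_id=` and `transcript_id=` attributes on each exon line.

```shell
# Hypothetical helper: reads GFF3 on stdin and prints the IDs of genes
# whose transcripts ALL have exactly one exon.
# Assumes each exon line carries gene_id= and transcript_id= in column 9.
intronless_genes() {
  awk -F '\t' '
    $3 == "exon" {
      gene = ""; tx = ""
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++) {
        if (attrs[i] ~ /^gene_id=/)       gene = substr(attrs[i], 9)
        if (attrs[i] ~ /^transcript_id=/) tx   = substr(attrs[i], 15)
      }
      # Count exons per transcript and remember the parent gene
      if (gene != "" && tx != "") { exons[tx]++; gene_of[tx] = gene }
    }
    END {
      for (tx in exons) {
        seen[gene_of[tx]] = 1
        # Any multi-exon transcript marks the whole gene as spliced
        if (exons[tx] > 1) spliced[gene_of[tx]] = 1
      }
      for (g in seen) if (!(g in spliced)) print g
    }
  ' | sort
}

# Example usage on a local copy of the annotation:
# gunzip -c gencode.v42.annotation.gff3.gz | intronless_genes
```

Annotations from other sources may name these attributes differently (for example `Parent=` only), in which case the attribute parsing would need adjusting.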