Hi,
I would like to create a GFF3 file containing only the features within specific coordinates of a chromosome-level GFF3 file. I know how to extract gene and CDS records separately, but I don't know how to trim the file by coordinates. Can someone please assist me with this?
My GFF3 file looks like this, and I want to retrieve the region 10700000:16500000:
##gff-version 3
##sequence-region 1 1 43270923
#!genome-build IRGSP IRGSP-1.0
#!genome-version IRGSP-1.0
#!genome-date 2015-10
#!genome-build-accession GCA_001433935.1
#!genebuild-last-updated 2019-06
1 IRGSP-1.0 chromosome 1 43270923 . . . ID=chromosome:1;Alias=Chr1,AP014957.1,NC_029256.1
###
1 RAP2018-11-26 gene 2983 10815 . + . ID=gene:Os01g0100100;biotype=protein_coding;description=RabGAP/TBC domain containing protein;gene_id=Os01g0100100;logic_name=rapdb_genes
1 RAP2018-11-26 mRNA 2983 10815 . + . ID=transcript:Os01t0100100-01;Parent=gene:Os01g0100100;biotype=protein_coding;transcript_id=Os01t0100100-01
1 RAP2018-11-26 exon 2983 3268 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100100-01-E1;rank=1
1 RAP2018-11-26 five_prime_UTR 2983 3268 . + . Parent=transcript:Os01t0100100-01
1 RAP2018-11-26 five_prime_UTR 3354 3448 . + . Parent=transcript:Os01t0100100-01
1 RAP2018-11-26 exon 3354 3616 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E2;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0100100-01-E2;rank=2
1 RAP2018-11-26 CDS 3449 3616 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 4357 4455 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100100-01-E3;rank=3
1 RAP2018-11-26 CDS 4357 4455 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 5457 5560 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E4;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100100-01-E4;rank=4
1 RAP2018-11-26 CDS 5457 5560 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 7136 7944 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E5;constitutive=1;ensembl_end_phase=1;ensembl_phase=2;exon_id=Os01t0100100-01-E5;rank=5
1 RAP2018-11-26 CDS 7136 7944 . + 1 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 8028 8150 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E6;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0100100-01-E6;rank=6
1 RAP2018-11-26 CDS 8028 8150 . + 2 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 8232 8320 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E7;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100100-01-E7;rank=7
1 RAP2018-11-26 CDS 8232 8320 . + 2 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 8408 8608 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100100-01-E8;rank=8
1 RAP2018-11-26 CDS 8408 8608 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 9210 9615 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E9;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0100100-01-E9;rank=9
1 RAP2018-11-26 CDS 9210 9615 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 10102 10187 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E10;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100100-01-E10;rank=10
1 RAP2018-11-26 CDS 10102 10187 . + 2 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 CDS 10274 10297 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
1 RAP2018-11-26 exon 10274 10430 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E11;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0100100-01-E11;rank=11
1 RAP2018-11-26 three_prime_UTR 10298 10430 . + . Parent=transcript:Os01t0100100-01
1 RAP2018-11-26 exon 10504 10815 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01-E12;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100100-01-E12;rank=12
1 RAP2018-11-26 three_prime_UTR 10504 10815 . + . Parent=transcript:Os01t0100100-01
###
Thank you!
Try this: https://gffutils.readthedocs.io/en/latest/gtf_extract.html
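If you would rather avoid an extra dependency, a coordinate filter can also be sketched in plain Python. This is only a rough sketch under a few assumptions: the file is tab-delimited (as the GFF3 spec requires, even though the paste above shows spaces), the region is on seqid "1", and you only want features that lie entirely inside the region. The function names `in_region` and `extract_region` are made up for illustration and are not part of gffutils or any other library. Coordinates are kept as they are in the source file, not shifted to start at 1.

```python
def in_region(line, seqid, start, end):
    """Return True if a GFF3 feature line lies entirely within seqid:start-end."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        # Not a well-formed feature line (GFF3 has exactly 9 tab-separated columns).
        return False
    try:
        feat_start, feat_end = int(fields[3]), int(fields[4])
    except ValueError:
        return False
    # Assumption: keep only features fully contained in the region.
    # To keep any overlapping feature instead, use:
    #   feat_end >= start and feat_start <= end
    return fields[0] == seqid and feat_start >= start and feat_end <= end


def extract_region(lines, seqid, start, end):
    """Yield directive lines plus the feature lines inside the region."""
    for line in lines:
        if line.startswith("#"):
            # Keep directives such as ##gff-version, drop bare ### separators.
            if line.strip() != "###":
                yield line
        elif in_region(line, seqid, start, end):
            yield line


# Hypothetical usage (file names are placeholders):
# with open("input.gff3") as src, open("region.gff3", "w") as dst:
#     dst.writelines(extract_region(src, "1", 10700000, 16500000))
```

Note that a strict containment test drops a gene that straddles a region boundary together with any of its children that fall outside, so child features can end up without their Parent; switch to the overlap test in the comment if that matters for your use case.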
Cool! Thank you!