4.6 years ago
Cecelia
I have a gff file, each line looks like this:
2 scaffold1 maker mRNA 1443962 1446567 . + . ID=ASPAM00000002080;Parent=ASPAG00000001349;Dbxref=MetaCyc:PWY-7980,InterPro:IPR024034,InterPro:IPR005725,Gene3D:G3DSA:1.10.1140.10,Gene3D:G3DSA:2.40.50.100,Gene3D:G3DSA:2.40.30.20,Gene3D:G3DSA:3.40.50.300,TIGRFAM:TIGR01042,KEGG:00190+7.1.2.2,KEGG:00195+7.1.2.2,ProSitePatterns:PS00152,Pfam:PF16886,Pfam:PF00006,Hamap:MF_00309,CDD:cd01134,CDD:cd18119;Name=vhaa;Ontology_term=GO:0005524,GO:0046034,GO:1902600,GO:0033180,GO:0046961;_AED=0.10;_QI=0|0|0|1|1|1|8|0|613;_eAED=0.10;makerName=maker-scaffold3-augustus-gene-14.11-mRNA-1;product=V-type proton ATPase catalytic subunit A;uniprot_id=Q2TJ56
I only want to extract the ID and the GO terms, and split them into one row per GO term: the first column is the ID, the second column is the GO accession. So far I have only managed to extract the ID and GO terms in this form:
ASPAM00000002080 GO:0005524,GO:0046034,GO:1902600,GO:0033180,GO:0046961
The ideal output should look like:
ASPAM00000002080 GO:0005524
ASPAM00000002080 GO:0046034
ASPAM00000002080 GO:1902600
ASPAM00000002080 GO:0033180
ASPAM00000002080 GO:0046961
Is there an easy way to do this?
Thx in advance, C
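One possible approach, sketched in Python: pull the `ID=` and `Ontology_term=` attributes out of each line with regular expressions, then emit one row per comma-separated GO accession. This is only a sketch. The function names `id_go_pairs` and `split_file` are made up for illustration, and it assumes every feature of interest carries both attributes on the same line, as in the example above. It starts from the GFF file itself, so the intermediate ID plus GO list step is not needed.

```python
import re

# Match "ID=" / "Ontology_term=" only at the start of the attribute string or
# right after a separator, so keys such as "uniprot_id=" are not picked up.
ID_RE = re.compile(r"(?:^|[;\s])ID=([^;\s]+)")
GO_RE = re.compile(r"(?:^|[;\s])Ontology_term=([^;\s]+)")


def id_go_pairs(line):
    """Return a list of (ID, GO accession) tuples for one GFF line."""
    if line.startswith("#"):  # skip header and comment lines
        return []
    id_match = ID_RE.search(line)
    go_match = GO_RE.search(line)
    if not id_match or not go_match:  # feature without an ID or without GO terms
        return []
    feature_id = id_match.group(1)
    return [(feature_id, term)
            for term in go_match.group(1).split(",")
            if term.startswith("GO:")]


def split_file(path):
    """Print one tab-separated 'ID<TAB>GO accession' row per GO term in the file."""
    with open(path) as handle:
        for line in handle:
            for feature_id, go_term in id_go_pairs(line):
                print(f"{feature_id}\t{go_term}")


# Small demo on a shortened version of the line from the question.
example = ("2 scaffold1 maker mRNA 1443962 1446567 . + . "
           "ID=ASPAM00000002080;Parent=ASPAG00000001349;Name=vhaa;"
           "Ontology_term=GO:0005524,GO:0046034;_AED=0.10")
for feature_id, go_term in id_go_pairs(example):
    print(f"{feature_id}\t{go_term}")
```

Calling `split_file("annotation.gff")` (a placeholder file name) would print the table for a whole file. Searching the whole line rather than splitting it into columns is a deliberate choice here: the `product=` attribute contains spaces, so splitting on whitespace would break it apart.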
Dear Cecelia,
I am trying to do what you were able to do, that is, extract the ID and GO terms from the GFF3 file.
I would be delighted if you could share how you did this.
Thanks
Best
AG