Is it possible to find synteny between two subgenomes using a subset of the annotation file?
5.3 years ago
Hann ▴ 110

Hello,

I am interested in regions that experienced a selective sweep in both subgenomes. I have identified a set of regions that are under selection in both subgenomes.

Example of identified regions from subgenome A:

Dexi_CM05836_chr04A maker   gene    10734723    10737376    .   +   .   ID=Dexi4A01G0012730;Name=Dexi4A01G0012730;Alias=maker-Dexi_CM05836_chr04A-augustus-gene-10.211;Note=Similar to At2g04740: BTB/POZ domain-containing protein At2g04740 (Arabidopsis thaliana OX%3D3702);
Dexi_CM05836_chr04A maker   gene    10742717    10746201    .   +   .   ID=Dexi4A01G0012750;Name=Dexi4A01G0012750;Alias=augustus_masked-Dexi_CM05836_chr04A-processed-gene-10.51;Note=Similar to NHX4: Sodium/hydrogen exchanger 4 (Arabidopsis thaliana OX%3D3702);
Dexi_CM05836_chr04A maker   gene    10738093    10740497    .   -   .   ID=Dexi4A01G0012740;Name=Dexi4A01G0012740;Alias=maker-Dexi_CM05836_chr04A-augustus-gene-10.233;Note=Similar to GDI1: Rho GDP-dissociation inhibitor 1 (Arabidopsis thaliana OX%3D3702);
Dexi_CM05836_chr04A maker   gene    10746947    10747608    .   +   .   ID=Dexi4A01G0012760;Name=Dexi4A01G0012760;Alias=maker-Dexi_CM05836_chr04A-snap-gene-10.146;Note=Protein of unknown function;
Dexi_CM05836_chr04A maker   gene    10748967    10749276    .   +   .   ID=Dexi4A01G0012770;Name=Dexi4A01G0012770;Alias=maker-Dexi_CM05836_chr04A-snap-gene-10.147;Note=Protein of unknown function;
Dexi_CM05836_chr04A maker   gene    10759176    10763102    .   +   .   ID=Dexi4A01G0012780;Name=Dexi4A01G0012780;Alias=maker-Dexi_CM05836_chr04A-augustus-gene-10.214;Note=Similar to CTPA3: Carboxyl-terminal-processing peptidase 3%2C chloroplastic (Arabidopsis thaliana OX%3D3702);
Dexi_CM05836_chr04A maker   gene    11758524    11761888    .   -   .   ID=Dexi4A01G0013320;Name=Dexi4A01G0013320;Alias=augustus_masked-Dexi_CM05836_chr04A-processed-gene-11.248;Note=Similar to DTX16: Protein DETOXIFICATION 16 (Arabidopsis thaliana OX%3D3702);

Example of identified regions from subgenome B:

Dexi_CM05836_chr04B maker   gene    11596306    11599349    .   +   .   ID=Dexi4B01G0012770;Name=Dexi4B01G0012770;Alias=maker-Dexi_CM05836_chr04B-augustus-gene-11.235;Note=Similar to At5g06830: CDK5RAP3-like protein (Arabidopsis thaliana OX%3D3702);
Dexi_CM05836_chr04B maker   gene    11600308    11604609    .   -   .   ID=Dexi4B01G0012780;Name=Dexi4B01G0012780;Alias=maker-Dexi_CM05836_chr04B-augustus-gene-11.260;Note=Similar to BGLU24: Beta-glucosidase 24 (Oryza sativa subsp. japonica OX%3D39947);
Dexi_CM05836_chr04B maker   gene    11618445    11619569    .   +   .   ID=Dexi4B01G0012790;Name=Dexi4B01G0012790;Alias=maker-Dexi_CM05836_chr04B-snap-gene-11.121;Note=Similar to LHC Ib-21: Chlorophyll a-b binding protein 1B-21%2C chloroplastic (Hordeum vulgare OX%3D4513);
Dexi_CM05836_chr04B maker   gene    11646461    11647555    .   -   .   ID=Dexi4B01G0012810;Name=Dexi4B01G0012810;Alias=maker-Dexi_CM05836_chr04B-snap-gene-11.149;Note=Protein of unknown function;
Dexi_CM05836_chr04B maker   gene    11640928    11642414    .   +   .   ID=Dexi4B01G0012800;Name=Dexi4B01G0012800;Alias=maker-Dexi_CM05836_chr04B-snap-gene-11.122;Note=Protein of unknown function;
Dexi_CM05836_chr04B maker   gene    11650401    11650789    .   -   .   ID=Dexi4B01G0012820;Name=Dexi4B01G0012820;Alias=snap_masked-Dexi_CM05836_chr04B-processed-gene-11.215;Note=Similar to selO: Protein adenylyltransferase SelO (Nitrosospira multiformis (strain ATCC 25196 / NCIMB 11849 / C 71) OX%3D323848);
Dexi_CM05836_chr04B maker   gene    11618445    11619569    .   +   .   ID=Dexi4B01G0012790;Name=Dexi4B01G0012790;Alias=maker-Dexi_CM05836_chr04B-snap-gene-11.121;Note=Similar to LHC Ib-21: Chlorophyll a-b binding protein 1B-21%2C chloroplastic (Hordeum vulgare OX%3D4513);

Is it possible to find synteny between the two subgenomes using these two subsets of the annotation file? Is there a specific tool that can do this?

It occurred to me to use the command line to look for common genomic regions (identical start and end coordinates) between the two subset files, but I ended up with no common regions, as the coordinates are unique to each file. However, I don't think this is the correct way to do it; I think a synteny analysis would be the ideal approach.
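To make the coordinate comparison concrete, here is a minimal Python sketch of what I tried, assuming the subsets are GFF3-style lines like the excerpts above (the columns appear space-separated here, so the parser splits on any whitespace). The function names `parse_gff_genes` and `shared_coordinates` are hypothetical helpers written for this post, not part of any library. It only demonstrates why the start/end intersection is empty: chr04A and chr04B have independent coordinate systems, so identical positions are not expected even for related genes. It is not a synteny method.

```python
from urllib.parse import unquote


def parse_gff_genes(lines):
    """Parse GFF3-style lines and return one dict per 'gene' feature."""
    genes = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split into the 9 GFF columns; the last column keeps its internal spaces.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "gene":
            continue
        attrs = {}
        for field in cols[8].split(";"):
            key, sep, value = field.strip().partition("=")
            if sep:
                # GFF3 percent-encodes reserved characters (e.g. %3D for '=').
                attrs[key] = unquote(value)
        genes.append({
            "seqid": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "id": attrs.get("ID"),
            "note": attrs.get("Note", ""),
        })
    return genes


def shared_coordinates(genes_a, genes_b):
    """Return (start, end) pairs that occur in both gene lists."""
    coords_a = {(g["start"], g["end"]) for g in genes_a}
    coords_b = {(g["start"], g["end"]) for g in genes_b}
    return coords_a & coords_b


# Two genes from each excerpt above, with the attribute column shortened.
subset_a = [
    "Dexi_CM05836_chr04A maker   gene    10742717    10746201    .   +   .   ID=Dexi4A01G0012750;Name=Dexi4A01G0012750;Note=Similar to NHX4: Sodium/hydrogen exchanger 4 (Arabidopsis thaliana OX%3D3702);",
    "Dexi_CM05836_chr04A maker   gene    10746947    10747608    .   +   .   ID=Dexi4A01G0012760;Name=Dexi4A01G0012760;Note=Protein of unknown function;",
]
subset_b = [
    "Dexi_CM05836_chr04B maker   gene    11596306    11599349    .   +   .   ID=Dexi4B01G0012770;Name=Dexi4B01G0012770;Note=Similar to At5g06830: CDK5RAP3-like protein (Arabidopsis thaliana OX%3D3702);",
    "Dexi_CM05836_chr04B maker   gene    11646461    11647555    .   -   .   ID=Dexi4B01G0012810;Name=Dexi4B01G0012810;Note=Protein of unknown function;",
]

genes_a = parse_gff_genes(subset_a)
genes_b = parse_gff_genes(subset_b)

# Exact coordinate matches between the two subgenomes (none for these excerpts).
print(shared_coordinates(genes_a, genes_b))

# Gene IDs per subgenome, e.g. for pulling the matching sequences later.
print([g["id"] for g in genes_a])
print([g["id"] for g in genes_b])
```

The parsed gene IDs could then be used to subset the protein or CDS sequences, since any comparison between the subgenomes presumably has to rest on sequence similarity and gene order rather than on raw positions.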

gene genomics • 768 views