Pangenome: Gene presence and absence analysis
5.2 years ago
ysas ▴ 10

Hello. I would like some advice on how to determine gene presence and absence for pangenome analysis.

I have built a pangenome from several yeast strains of interest by merging the CDSs from each strain and removing redundancy with CD-HIT. I then mapped the Illumina reads from each strain against the pangenome individually, generated sorted BAM files, and calculated mean coverage with Qualimap. Building on this information (e.g., mean coverage), I want to determine the fraction of each CDS that is covered by reads and call a CDS present if at least 95% of it is covered. Do you have any suggestions on how to run this step, or a better approach? After looking over previous discussions, I tried samtools depth and obtained the read depth at each base position. However, I am still wondering how to transform these data into a per-CDS coverage ratio and use that for pangenome analysis. Thank you for your support.
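One possible way to turn the per-base output of samtools depth into a per-CDS coverage ratio is sketched below. It is only an illustration under some assumptions: each CDS is its own reference sequence in the pangenome FASTA, the depth file has the usual three tab-separated columns (sequence name, position, depth) for a single BAM, and the CDS lengths come from the .fai index written by samtools faidx. The function names, the `min_depth` cutoff, and the file names in the comments are hypothetical; only the 95% threshold comes from the question.

```python
from collections import defaultdict


def read_fai_lengths(fai_lines):
    """Parse FASTA index (.fai) lines into {sequence_name: length}."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def cds_breadth(depth_lines, lengths, min_depth=1):
    """Fraction of each CDS covered by at least `min_depth` reads.

    `depth_lines` are lines of `samtools depth` output for one BAM.
    Positions missing from the depth output count as uncovered, so the
    result is the same with or without the `-a` option.
    """
    covered = defaultdict(int)
    for line in depth_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, depth = fields[0], int(fields[2])
        if depth >= min_depth:
            covered[name] += 1
    return {
        name: covered[name] / length
        for name, length in lengths.items()
        if length > 0
    }


def call_presence(breadth, threshold=0.95):
    """Mark a CDS as present when its covered fraction reaches the threshold."""
    return {name: frac >= threshold for name, frac in breadth.items()}


# Hypothetical usage for one strain (file names are placeholders):
#   with open("pangenome.fa.fai") as fai, open("strainA.depth.txt") as dep:
#       lengths = read_fai_lengths(fai)
#       presence = call_presence(cds_breadth(dep, lengths))
```

Running this once per strain gives one presence/absence dictionary per strain, and these can be combined into a strains-by-CDS matrix. Raising `min_depth` above 1 is a judgment call that may help guard against spurious single-read hits.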

sequencing pangenome • 1.1k views
21 months ago

Very late (only 3.5 years), but I have used odgi pav successfully for this. There is a nice tutorial here:

https://odgi.readthedocs.io/en/latest/rst/tutorials/presence_absence_variants.html
