13.3 years ago
Omid
What is the easiest way to get the CpG island (5'-CpG-3') percentage for all exons of a specific gene?
I put a script here that uses the UCSC MySQL server to fetch the exons for your gene and then calculates each exon's coverage from the cpgIslandExt table.
Usage:
$ python exon-cpg.py [genome] [gene]
e.g.
$ python exon-cpg.py hg19 gata3
gives:
exon-start exon-end coverage cpg:pct,...
8096666 8096854 1.0 CpG::60.1
8097249 8097859 1.0 CpG::60.1
8100267 8100804 0.715083798883 CpG::71.4
8105955 8106101 0
8111435 8111561 0
8115701 8117164 0
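where coverage is the fraction of the exon that overlaps a CpG island, and the final column is a comma-delimited list of cpg-name:pct-gc for the overlapping islands.
You can verify this is correct by looking at GATA3 in the UCSC Genome Browser.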
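The coverage column can be illustrated with a small sketch. This is not the linked script, only an assumed reconstruction of the interval arithmetic: the function name `exon_cpg_coverage` is hypothetical, intervals are assumed to be half-open (start, end) pairs as in UCSC tables, and the coordinates in the example are toy values rather than real annotation.

```python
def exon_cpg_coverage(exon_start, exon_end, islands):
    """Fraction of the exon [exon_start, exon_end) covered by CpG islands.

    islands: iterable of (start, end) pairs, half-open like the exon.
    """
    length = exon_end - exon_start
    if length <= 0:
        return 0.0
    # Clip each island to the exon and drop the ones that do not overlap.
    clipped = sorted(
        (max(s, exon_start), min(e, exon_end))
        for s, e in islands
        if s < exon_end and e > exon_start
    )
    covered = 0
    cursor = exon_start  # right edge of what has been counted so far
    for s, e in clipped:
        s = max(s, cursor)  # skip bases already counted (overlapping islands)
        if e > s:
            covered += e - s
            cursor = e
    return covered / length


# Toy coordinates, not real annotation.
print(exon_cpg_coverage(100, 200, [(150, 180), (190, 250)]))  # 0.4
```

An exon that lies entirely inside an island gets 1.0, and an exon with no overlapping island gets 0, which matches the shape of the output shown above.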
Python code:
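# exon_seq1..3 are placeholders for the exon sequences (strings) of your gene
exons = [exon_seq1, exon_seq2, exon_seq3]
for exon in exons:
    exon = exon.lower()
    # each "cg" dinucleotide spans 2 bases, hence the factor of 2
    CpG_number = exon.count("cg")
    CpG_proc = CpG_number * 2 * 100.0 / len(exon)
    print(CpG_proc)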
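As a worked example of the calculation above, here is the same formula wrapped in a function and run on toy sequences. The name `cpg_percent` and the sequences are made up for illustration. Note that this measures the share of bases that sit in CpG dinucleotides within the sequence itself, which is not the same thing as overlap with annotated CpG islands.

```python
def cpg_percent(seq):
    """Percentage of bases in seq that belong to a CpG dinucleotide."""
    seq = seq.lower()
    if not seq:
        return 0.0  # avoid dividing by zero on an empty sequence
    # Each "cg" match covers 2 bases, hence the factor of 2.
    return seq.count("cg") * 2 * 100.0 / len(seq)


# Toy sequences, not real exons.
for name, seq in [("toy1", "ACGTCG"), ("toy2", "AATT")]:
    print(name, round(cpg_percent(seq), 1))
# toy1 66.7
# toy2 0.0
```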
Could you rephrase the question?