What is the base composition (percentage of As, Ts, Cs, and Gs) of the human genome?
An Internet search shows that the mean GC content of the human genome is around 41%. Is the human genome A-rich or T-rich?
Here are the results for one strand of human DNA, summed over the main chromosomes. The counts for the other strand of each chromosome are simply the complement of these (A swapped with T, C swapped with G).
> base_content
# A tibble: 5 x 3
  letter     value   freq
  <chr>      <int>  <dbl>
1 A      867153993 0.281
2 C      599043897 0.194
3 G      601515125 0.195
4 N      150630720 0.0488
5 T      869942666 0.282
Beautiful R code! Thanks. Based on your analysis of the human genome (haploid, if I understand correctly), there seems to be no obvious GC skew or AT skew in the human genome. Very interesting.
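The "no obvious skew" reading can be checked directly from the counts in the table. Below is a minimal sketch in Python (not the R used in the answer), assuming the usual definitions AT skew = (A - T) / (A + T) and GC skew = (G - C) / (G + C), and assuming that N positions should be left out of the denominator when computing GC content. The variable names are illustrative only.

```python
# Counts copied from the base_content table above (one strand, main chromosomes).
counts = {
    "A": 867153993,
    "C": 599043897,
    "G": 601515125,
    "N": 150630720,
    "T": 869942666,
}

# Assumption: N (unsequenced or ambiguous positions) is excluded from the total.
called = sum(n for base, n in counts.items() if base != "N")

# Fraction of called bases that are G or C.
gc_content = (counts["G"] + counts["C"]) / called

# Strand skews: values near zero mean the two complementary bases
# are about equally frequent on this strand.
at_skew = (counts["A"] - counts["T"]) / (counts["A"] + counts["T"])
gc_skew = (counts["G"] - counts["C"]) / (counts["G"] + counts["C"])

print(f"GC content (N excluded): {gc_content:.4f}")
print(f"AT skew: {at_skew:+.5f}")
print(f"GC skew: {gc_skew:+.5f}")
```

With these counts, GC content comes out at about 40.9%, in line with the 41% figure from the question, and both skews are within roughly 0.2% of zero. These are genome-wide sums, so they say nothing about local skew in individual regions.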
Overall, the human genome is more AT-rich than GC-rich. However, CpG islands are common in promoters.