Question

What Is The GC-Content Across Different Human Chromosomes?

8

12.9 years ago

Dan12345 ▴ 160

Does anyone know what the GC content of the different human chromosomes is?

chromosome gc • 39k views

ADD COMMENT • link updated 8.5 years ago by sacha ★ 2.4k • written 12.9 years ago by Dan12345 ▴ 160

2

Funny that there are three different GC% answers for Chr1 ...

ADD REPLY • link 12.9 years ago by Martin A Hansen 3.0k

2

Depends on the genome build and version that you use - it's perfectly 'legit', as they say in Cockney London slang.

The truth of the matter is that we do not have an honest representation of the true GC content because the reference genome builds exclude / mask telomeric and centromeric regions, where GC content is high.

Thus, all values represented in this thread are based on the genome builds and are not reflective of the actual GC content, which would be larger and which would differ from individual to individual.

ADD REPLY • link 6.7 years ago by Kevin Blighe 88k

score 15 · Answer 1 · 2012-01-12

**EDIT**

OK, so I felt bad about not actually answering your question, so here you go (generated by the method outlined below):

#Sequence   GC content
chr1          0.43
chr2          0.40
chr3          0.40
chr4          0.38
chr5          0.40
chr6          0.40
chr7          0.41
chr8          0.40
chr9          0.43
chr10         0.42
chr11         0.42
chr12         0.41
chr13         0.40
chr14         0.43
chr15         0.44
chr16         0.45
chr17         0.46
chr18         0.40
chr19         0.48
chr20         0.44
chr21         0.43
chr22         0.49
chrX          0.40
chrY          0.46
chrM          0.44

**EDIT ENDS**

The GC content of human chromosomal DNA is very heterogeneous, rendering chromosome-wide statistics relatively meaningless. It has been shown that the human genome is a mosaic of GC-rich and GC-poor regions, of around 300 kb in length, called isochores.

You can plot these regions of varying GC content using the EMBOSS program isochore. For example, for chromosome 1:

wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr1.fa.gz
gunzip chr1.fa.gz
isochore -sequence chr1.fa -outfile chr1.isochore -graph png

Gives the following result:

[Figure: Isochores of Chr 1]
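
For readers without EMBOSS at hand, the idea of regional GC variation can be illustrated with a simple fixed-window profile. This is only a minimal sketch: it does not reproduce the algorithm used by the EMBOSS isochore program, and the function name `windowed_gc` and the default window of 300,000 bases are assumptions chosen to echo the isochore size mentioned above.

```python
from typing import Iterator, Optional, Tuple


def windowed_gc(seq: str, window: int = 300_000) -> Iterator[Tuple[int, Optional[float]]]:
    """Yield (window_start, gc_fraction) for consecutive non-overlapping windows.

    gc_fraction is None when a window contains no A/C/G/T bases (e.g. all N).
    """
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        at = chunk.count("A") + chunk.count("T")
        # Ambiguous bases are ignored in both numerator and denominator.
        yield start, (gc / (gc + at) if gc + at else None)


# Toy sequence with a tiny window so the profile is easy to inspect.
toy = "GGGGCCCC" + "ATATATAT" + "NNNNNNNN" + "ACGT"
profile = list(windowed_gc(toy, window=8))
for start, frac in profile:
    print(start, frac)
```

Plotting the second column against the first for a real chromosome would give a rough picture of GC-rich and GC-poor stretches, though window size and the handling of N runs both affect how it looks.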

You could also get the sequences of the individual chromosomes and work out their overall GC content, also using Emboss, this time geecee:

   geecee -sequence chr1.fa

Gives us an answer of 43% for Chromosome 1.
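
If you would rather not install EMBOSS, a per-sequence GC fraction can also be computed directly from a FASTA file. The following is a minimal sketch, not what geecee does internally: the helper names `gc_per_sequence` and `gc_from_file` are hypothetical, and the choice to count only A, C, G and T (skipping N) is an assumption that may differ from other tools.

```python
import gzip
from typing import Dict, Iterable


def gc_per_sequence(lines: Iterable[str]) -> Dict[str, float]:
    """Return {record_name: GC fraction} from FASTA-formatted lines.

    Only A, C, G and T (either case) are counted; N and other
    ambiguity codes are left out of numerator and denominator.
    """
    counts: Dict[str, list] = {}  # name -> [gc_count, acgt_count]
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            name = line[1:].split()[0]
            counts[name] = [0, 0]
        elif name is not None:
            s = line.upper()
            gc = s.count("G") + s.count("C")
            counts[name][0] += gc
            counts[name][1] += gc + s.count("A") + s.count("T")
    # Records with no A/C/G/T bases at all are skipped.
    return {n: gc / total for n, (gc, total) in counts.items() if total}


def gc_from_file(path: str) -> Dict[str, float]:
    """Open a plain or gzip-compressed FASTA file and compute GC per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return gc_per_sequence(handle)


# Small in-memory example (not real chromosome data).
example = [">seqA demo", "ACGTNN", "ggcc", ">seqB", "ATAT"]
result = gc_per_sequence(example)
print(result)
```

For the file downloaded above, `gc_from_file("chr1.fa")` would be the corresponding call. Pure Python is slow on whole chromosomes, so treat this as an illustration rather than a replacement for the compiled tools.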

score 12 · Answer 2 · 2012-01-12

GRCh37/hg19/b37:

1   0.417439
2   0.402438
3   0.396943
4   0.382479
5   0.395163
6   0.396109
7   0.407513
8   0.401757
9   0.413168
10  0.415849
11  0.415657
12  0.40812
13  0.385265
14  0.408872
15  0.42201
16  0.447894
17  0.455405
18  0.39785
19  0.483603
20  0.441257
21  0.408325
22  0.479881
X   0.394963
Y   0.391288
MT  0.443626

Done by:

seqtk comp hs37m.fa.gz | awk '/^[0-9MXY]/{x=$4+$5;y=x+$3+$6;print $1"\t"x/y}'

ChrY has lots of ambiguous bases and that is why my result differs most on chrY in comparison to the EMBOSS result. EMBOSS is wrong, IMHO.
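
The awk expression above divides G+C by A+C+G+T, so ambiguous bases never enter the denominator. The sketch below shows, on a toy sequence, how much the result moves when N is counted in the length instead. The function name `gc_fractions` is hypothetical, and which denominator any particular tool uses is not established in this thread.

```python
import math
from typing import Tuple


def gc_fractions(seq: str) -> Tuple[float, float]:
    """Return (GC over A+C+G+T, GC over total length including N)."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    acgt = gc + s.count("A") + s.count("T")
    over_acgt = gc / acgt if acgt else math.nan
    over_total = gc / len(s) if s else math.nan
    return over_acgt, over_total


# Toy sequence: 8 unambiguous bases (half of them G/C) followed by 8 N.
toy = "GCGCATAT" + "N" * 8
over_acgt, over_total = gc_fractions(toy)
print(over_acgt, over_total)
```

A chromosome with a large share of N, as described for chrY, would show the largest gap between the two conventions.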

score 1 · Answer 3 · 2012-01-12

1

12.9 years ago

Martin A Hansen 3.0k

I was about to tell you, but then someone crashed the server. Here is how (using Biopieces):

read_fasta -i /home/DATA/downloads/Homo_sapiens/human_hg19.fasta.gz | analyze_gc | write_tab -ck SEQ_NAME,GC% -x
#SEQ_NAME       GC%
gi|89161184|ref|AC_000044.1| Homo sapiens chromosome 1, alternate assembly Celera, whole genome shotgun sequence        40.77

ADD COMMENT • link 12.9 years ago by Martin A Hansen 3.0k

0

Entering edit mode

Well, AC_000044 is the Celera assembly, not hg19. In addition, Perl is notoriously inefficient for looping through each base.

ADD REPLY • link 12.9 years ago by lh3 33k

score 1 · Answer 4 · 2016-06-01

1

8.5 years ago

sacha ★ 2.4k

Using bedtools nuc on hg19:

1 chr1 0.377295
2 chr2 0.394172
3 chr3 0.390478
4 chr4 0.375491
5 chr5 0.388130
6 chr6 0.387498
7 chr7 0.397821
8 chrX 0.384356
9 chr8 0.392218
10 chr9 0.351521
11 chr10 0.402901
12 chr11 0.403720
13 chr12 0.397843
14 chr13 0.319767
15 chr14 0.336276
16 chr15 0.336248
17 chr16 0.391037
18 chr17 0.436335
19 chr18 0.380423
20 chr20 0.416613
21 chrY 0.172677
22 chr19 0.456450
23 chr22 0.326388
24 chr21 0.297838
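
One way to compare these figures with the GRCh37 table in Answer 2 is to ask what fraction of each chromosome would have to be unambiguous sequence for the two to agree. The sketch below does that for three chromosomes, using values copied from the two tables. It rests on an assumption that is not confirmed anywhere in this thread: that the bedtools figures are G+C divided by total length including N, while the Answer 2 figures are G+C divided by A+C+G+T.

```python
# Values copied from the two tables in this thread.
gc_over_acgt = {"chr1": 0.417439, "chr21": 0.408325, "chrY": 0.391288}  # Answer 2 (seqtk + awk)
gc_bedtools = {"chr1": 0.377295, "chr21": 0.297838, "chrY": 0.172677}  # this answer (bedtools nuc)

# ASSUMPTION: bedtools value = GC / total length, Answer 2 value = GC / ACGT.
# Under that assumption the ratio equals the non-N fraction of the chromosome.
implied_non_n = {c: gc_bedtools[c] / gc_over_acgt[c] for c in gc_over_acgt}

for chrom, frac in implied_non_n.items():
    print(f"{chrom}\t{frac:.3f}")
```

If the implied fractions line up with the known proportion of N in each hg19 chromosome, the two answers are reporting the same underlying counts with different denominators; if not, something else explains the gap.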

ADD COMMENT • link 8.5 years ago by sacha ★ 2.4k

2

Ehm your results are remarkably different from what was obtained earlier here in this topic. Also not sure if this topic was worth reviving after 4.4 years.

ADD REPLY • link 8.5 years ago by WouterDeCoster 47k

1

Someone else revived it just now... after 6 years! They up-voted lh3's answer. I then also added my own comment at the very top.

ADD REPLY • link 6.7 years ago by Kevin Blighe 88k