the distance between genes
8.1 years ago
elisheva ▴ 120

I have this data: the human genome is 3 billion kb long, and every human has 25,000 genes. The average length of a gene is 50 kb, and the length of the section that is translated into protein is 2 kb. 1. What is the average distance between genes? 2. What percentage of the human genome is DNA that is translated into protein?

*I really tried quite hard but I didn't get the correct answer, please help me.

genome gene • 5.6k views

Sounds like a homework question.

How did you attempt to determine the answer (i.e. what formula/calculation did you use)?


I've tried to calculate the total distance between genes, which is: 3 million - (50 x 25,000). Then I divided that by 25,000, which leads to an illogical result. For the second question: (2 x 25,000)/3 million.


Based on your calculations, what are your answers? And why do you think that your calculations are incorrect and your results illogical?

One thing that may improve your understanding would be to use scientific notation, rather than representing the lengths in kb (e.g., the human genome is 3 x 10^9 bp, average gene length is 5 x 10^4 bp, etc.). The other would be to include the units in all of your calculations.
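For anyone who wants to check the units, here is a minimal Python sketch that converts the values stated in the question from kb to bp and prints them in scientific notation. The variable names are only for illustration, and the genome size is taken as 3 billion bp (3 million kb), which is an assumption about what the question intended.

```python
# Values from the question, converted to base pairs (bp).
# Assumption: the genome is 3 billion bp (= 3 million kb), not 3 billion kb.
BP_PER_KB = 1_000

genome_bp = 3e9                  # 3 x 10^9 bp
gene_len_bp = 50 * BP_PER_KB     # 50 kb -> 5 x 10^4 bp
coding_len_bp = 2 * BP_PER_KB    # 2 kb  -> 2 x 10^3 bp
n_genes = 25_000                 # 2.5 x 10^4 genes

# In Python, "3e9" means 3 x 10^9, so "10e9" would be 10^10, not 10^9.
print(f"genome:        {genome_bp:.1e} bp")
print(f"gene length:   {gene_len_bp:.1e} bp")
print(f"coding length: {coding_len_bp:.1e} bp")
print(f"genes:         {n_genes:.1e}")
```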

8.1 years ago

You have 25k genes of average length 50,000 bases. Therefore, assuming all genes are on one strand and do not overlap, they cover 1,250,000,000 bases (2.5e4 x 5e4 = 1.25e9).

The genome is 3,000,000,000 or 3e9 bases long (not kb). Therefore, the part of the single-stranded genome not covered by genes is 1.75e9 (3e9 - 1.25e9).

If 25000 genes are spaced equally apart on one strand all along the genome (imagine the genome to be one big chromosome), then there are 24999 gaps between genes. The average distance between genes is the space covered by one of these gaps.

Divide the space not covered by genes by the number of non-gene spaces that you have: (1.75e9 bases)/(24999 gaps) = 70003 bases per gap.
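The single-strand arithmetic above can be checked with a short Python sketch. The function name `mean_gap_bp` is made up for illustration, and it bakes in the same simplifying assumptions: equal-length genes, no overlaps, one long chromosome.

```python
def mean_gap_bp(genome_bp, n_genes, gene_len_bp):
    """Average gap between evenly spaced, non-overlapping genes on one strand."""
    covered_bp = n_genes * gene_len_bp      # bases covered by genes
    uncovered_bp = genome_bp - covered_bp   # bases left between genes
    n_gaps = n_genes - 1                    # n genes in a row leave n - 1 gaps
    return uncovered_bp / n_gaps

gap = mean_gap_bp(genome_bp=3e9, n_genes=25_000, gene_len_bp=50_000)
print(round(gap))  # 70003
```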

The human genome is made up of DNA, which is typically double-stranded, and genes are directional based on what strand they are on. So you might also think about how this exercise changes how many gaps you have and how big those gaps are, if you instead assume genes can be put on two strands.

If 25000 genes are spaced equally apart on two strands, you have 12500 genes on one strand and 12500 genes on the other. In that case, on one strand, you have 12499 gaps in between genes. 12500 genes cover 6.25e8 bases. So the space (on one strand) not covered by genes is greater: 3e9 - 6.25e8 = 2.38e9 bases.

Dividing this larger space by fewer gaps gives you an average of 190015 bases, or about 1.9e5 bases, of distance between genes on each strand of double-stranded DNA.
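The two-strand version works out the same way, just with half the genes per strand. This is a rough sketch under the same unrealistic assumptions (genes split evenly between strands, evenly spaced, no overlaps); the function and variable names are illustrative only.

```python
def mean_gap_per_strand_bp(genome_bp, n_genes, gene_len_bp, n_strands=2):
    """Average gap on one strand when genes are split evenly across strands."""
    genes_per_strand = n_genes // n_strands            # 12,500 for the example
    covered_bp = genes_per_strand * gene_len_bp        # 6.25e8 bases
    uncovered_bp = genome_bp - covered_bp              # 2.375e9 bases
    n_gaps = genes_per_strand - 1                      # 12,499 gaps
    return uncovered_bp / n_gaps

gap_two_strands = mean_gap_per_strand_bp(3e9, 25_000, 50_000)
print(round(gap_two_strands))        # 190015
print(f"{gap_two_strands:.1e}")      # 1.9e+05
```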

Note that these are very relaxed and unrealistic assumptions. Genes (defined roughly as functional regions of the genome, say, although there are other definitions) do not get distributed evenly throughout the genome. On the contrary, they tend to cluster in specific locations that have been highly conserved over millions of years. Further, you have chromosomal regions where you shouldn't expect genes, like the tips or telomeres, or other regions where there is highly-repeated DNA. So depending on where you look, you might expect genes to be close together, far apart, or nowhere to be found -- when you calculate an average, do you include regions that wouldn't otherwise contain genes?

Or you may even have genes defined as simply inheritable parts of the genome, which can include non-coding regions that don't make proteins but which are still functional. Do those get considered or not? This can be controversial for some people, as was the case with some ENCODE papers a few years back. Assumptions and definitions and units are important, which is why when you find a number via a Google search, it pays to find out how people got to that number.

In any case, science is a numbers game, and scientific notation helps keep the numbers accurate. You should know how to use scientific notation.


Thank you for your explanation and clarification about the units. But I've found that the average distance between the genes is 25e3 bases.


If that's the answer you get, that's what you get, but I don't immediately see how you get that answer. Feel free to share your assumptions and your work.


No, I mean that when I did my calculations I got the same result that you explained. But when I did some Google searching I saw (I don't remember where) that the distance between genes in the DNA is 25e3 bases. (Maybe they got the wrong answer.)

8.1 years ago
elisheva ▴ 120

My answers are: 1. 70 kb. 2. 1.6%. I don't see how using scientific notation will improve my understanding. As long as I keep the correct ratio, the answers will be the same (it doesn't matter whether the values are in kb or in bases).
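A small Python sketch of the second calculation, done once in bp and once in kb to show that the ratio is the same as long as the units are consistent. It assumes a 3 billion bp (3 million kb) genome and ignores strandedness; the function name is only for illustration.

```python
import math

def percent_coding(genome_len, n_genes, coding_len_per_gene):
    """Percent of the genome translated into protein (same length unit throughout)."""
    return 100 * n_genes * coding_len_per_gene / genome_len

pct_bp = percent_coding(genome_len=3e9, n_genes=25_000, coding_len_per_gene=2_000)  # in bp
pct_kb = percent_coding(genome_len=3e6, n_genes=25_000, coding_len_per_gene=2)      # in kb

print(f"{pct_bp:.2f}%")                  # 1.67%
print(math.isclose(pct_bp, pct_kb))      # True
```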


Your answers are correct if you ignore strandedness (see @Alex Reynold's answer for an alternative correct calculation). Since you thought the answers were illogical, I'd assumed you'd botched the units. My mistake.


But when I did some Google searching I found that the average distance between genes in the DNA is about 25 bases.


Citation?

