Hi, I'm very confused about the IGH gene. On NCBI it appears to be a single IGH@ gene or locus. However,
the region seems to cover many other IGH genes (IGHD1, G1, etc.). This is how it looks in the browser.
I'm interested because of gene fusions where the 3' gene is written as IGH. This is ambiguous, because the region seems to cover many variants. My questions are the following.
- Is IGH a single gene, or a locus containing many small variants? Or is it a gene that gets post-processed by joining different segments? 1b. Do these IGH "variants" contain exon/intron boundaries?
- On the CCDS website I cannot locate any IGH genes at all. What I want is the coordinates and sequences for each variant.
- Is there a place where I can get the sequence and coordinates for each segment? Ideally I would find them on CCDS.
Thanks in advance.
Thanks. I'm still a bit confused -- please let me know if I'm understanding this correctly.
So each "gene segment" is a gene? If so, why not just call it a gene? Why a segment? I keep seeing this word and it is one source of my confusion.
I went to the IMGT URL you sent. I'm interested in humans: https://www.imgt.org/IMGTrepertoire/index.php?section=LocusGenes&repertoire=GeneOrder&species=human&group=IGH Based on this table, it looks like there are at least 214-217 IGH genes. Am I reading this correctly?
Also, can you help me decipher this? What does it mean when a gene is labeled IGHV3-66 vs IGHV3-65?
I couldn't find the sequences on IMGT, but luckily it looks like Ensembl BioMart has them.
You're welcome!
1. I think they call them segments because they are joined together (by V(D)J recombination of the DNA) to form a whole antibody. You're right: each one is a gene. At the same time, the V, D, J, and constant region genes form the segments of an entire antibody heavy chain.
2. Only functional genes are used to make antibodies. If you look in the "Fct" column on IMGT, you will see that genes are annotated as F, P, or ORF. F means functional; these are the genes that are used to make antibodies. P stands for pseudogene; these are not used. ORF stands for open reading frame; however, the genes marked ORF in IMGT are treated as non-functional, despite having an open reading frame. So for counting genes, you will most likely be interested in those that have a unique "IMGT gene name" and are labeled F in the "Fct" column. By my count, this means 57 IGHV, 23 IGHD, and 6 IGHJ genes for humans. Then there are also the heavy chain (HC) constant region genes (IgA1, IgA2, IgD, IgE, IgG1, IgG2, IgG3, IgG4, and IgM), which can be found under IGHC on IMGT.
3. IGHV3-66 and IGHV3-65 are two VH genes in the IGHV3 family (subgroup), which means that they have similar sequences. The number after the hyphen reflects the gene's position in the locus, so these two genes are neighbors. You might also have noticed that there are different alleles in the "IMGT allele name" column (e.g. IGHV1-2*01, IGHV1-2*02, etc.). These are sequence variants that have been observed for the same gene; some differ only at the nucleotide level, while others also change the amino acid sequence.
You can find all sequences by clicking on the link in the "Accession numbers" column.
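If it helps, here is a rough Python sketch of the counting logic described above: keep only rows whose "Fct" is F, collapse alleles to unique gene names, and tally per group. The rows below are hypothetical placeholders in the shape of an IMGT gene table (gene name, allele name, Fct), not an actual IMGT export, and the helper names are my own. The name parser only handles simple names like IGHV3-66; other naming patterns would need extra cases.

```python
import re
from collections import Counter

# Hypothetical rows in the shape (IMGT gene name, IMGT allele name, Fct).
# The values are placeholders for illustration, not an export from IMGT.
ROWS = [
    ("IGHV1-2", "IGHV1-2*01", "F"),
    ("IGHV1-2", "IGHV1-2*02", "F"),
    ("IGHV3-65", "IGHV3-65*01", "P"),
    ("IGHV3-66", "IGHV3-66*01", "F"),
    ("IGHD1-1", "IGHD1-1*01", "F"),
    ("IGHJ4", "IGHJ4*01", "F"),
    ("IGHJ4", "IGHJ4*02", "F"),
]


def split_allele(allele_name):
    """Split an allele name such as 'IGHV1-2*02' into (gene, allele number)."""
    gene, _, allele = allele_name.partition("*")
    return gene, allele


def parse_v_gene(gene_name):
    """Return (subgroup, position) for simple names like 'IGHV3-66', else None."""
    match = re.fullmatch(r"IGHV(\d+)-(\d+)", gene_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def count_functional(rows):
    """Count unique gene names labeled F, grouped by IGHV / IGHD / IGHJ."""
    functional = {gene for gene, _allele, fct in rows if fct == "F"}
    return Counter(gene[:4] for gene in functional)


if __name__ == "__main__":
    print(split_allele("IGHV1-2*02"))
    print(parse_v_gene("IGHV3-66"))
    print(count_functional(ROWS))
```

Counting unique gene names rather than rows matters because one gene can have several alleles listed; counting rows would inflate the totals.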
This is great, thank you so much - this is very comprehensive.
Can you elaborate on this? What does "constant" mean, compared with the other IGH genes? Thanks again. A
1. In the IMGT link I gave above, you'll want to scroll down to #7 (Gene tables), and then click on human under IGHV, IGHD, IGHJ, or IGHC. Then you should see everything that I mentioned.
2. The variable region of an antibody (which is made up of V, D, and J genes) can be very different between different antibodies. For example, an antibody that binds the virus that causes Covid-19 could have a totally different variable region from an antibody that binds a bacterium. For humans, 57 x 23 x 6 = 7866 possible HC combinations. However, diversity in the variable region is actually much greater due to somatic hypermutation, as well as insertions and deletions at the junctions between segments. On the other hand, human antibodies have only 9 different constant region genes, which I listed in my previous comment. These sequences don't change between antibodies: IgM is the same regardless of the variable region sequence and what the antibody binds to. Rather than binding to antigen, the constant region determines what function the antibody carries out. IgA, IgD, IgE, IgG, and IgM are termed the five antibody isotypes. Here is a link giving a little background on each isotype:
Antibody Isotypes
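The combinatorial arithmetic above can be checked with a few lines of Python. This is just a sketch: the function name is made up, and the gene counts are the ones from my own tally of F-labeled genes, so they may shift as IMGT is updated.

```python
from math import prod


def vdj_combinations(n_v, n_d, n_j):
    """Number of heavy chain V-D-J combinations, ignoring junctional changes and mutation."""
    return prod((n_v, n_d, n_j))


if __name__ == "__main__":
    # Counts of functional human IGHV, IGHD, and IGHJ genes from the tally above.
    print(vdj_combinations(57, 23, 6))  # 7866
```

This only counts which V, D, and J genes are paired; it is a lower bound on diversity, since the junctional changes and somatic hypermutation mentioned above are not included.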
This is really awesome, and I learned a lot from your comments. Thank you for your time.