Question

T-Cell Receptor Sequencing Clonotype Information

2


9.9 years ago

bioinformatics.cancer ▴ 260

Hi,

I have RNA sequencing data of T-cell receptors from several patients before and after immunotherapy. The sequencing was done by a vendor, and they provided us with tables of data for individual patients. These tables contain rows with clonotypes (as per my understanding, defined as CDR3 sequences with a unique combination of V, D, and J segments), along with counts for each clonotype. The CDR3 AA sequences are given in each row along with the V, D, and J names. The D segments are missing (I believe that should be fine). But I found that some of the AA sequences are duplicated, and so are some of the V-J segment combinations. This is confusing to me because each row should contain a unique clonotype. I am new to this type of data and might not be interpreting it correctly. Can someone please help me understand this data set?

Thank you,

- Pankaj

TCR-Sequencing-Clonotype • 18k views

updated 2.9 years ago by Ram 45k • written 9.9 years ago by bioinformatics.cancer ▴ 260

score 3 · Answer 1 · 2016-03-19

3


9.4 years ago

mikhail.shugay 3.5k

My 5 cents:

[Figure: vdj (image not reproduced)]

- The number of V-J pairs is limited; most TCR diversity comes from randomly added N-nucleotides.
- Several nucleotide variants can code for the same amino acid sequence; this is called "convergent recombination". It is common for clonotypes with few or no N-nucleotides. Such clonotypes are close to germline, frequently public (shared across many individuals) and specific to commonly encountered pathogens (CMV, EBV, etc.).
- "Clonotype" should not be confused with "clone". The former typically refers to a single chain (TCR beta, IGH, ...), while the latter refers to the antigen receptor heterodimer (TCR alpha-beta, IG heavy-light chain).
- As TCR alpha is recombined after TCR beta and has less diversity, distinct TCR beta clonotypes are likely to correspond to distinct clones. Thus the number of unique TCR beta nucleotide sequences is a good measure of TCR repertoire diversity.
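
To see how this plays out in a clonotype table, here is a minimal Python sketch. It assumes the vendor table also carries a CDR3 nucleotide column, which may not be true for your files; the column names, V/J names, sequences and counts below are all made up for illustration. It groups rows by (V, CDR3 AA, J) and reports which amino acid clonotypes are backed by more than one nucleotide sequence.

```python
from collections import defaultdict

# Hypothetical rows standing in for one patient's clonotype table.
# Column names, segment names, sequences and counts are illustrative only.
ROWS = [
    {"cdr3_nt": "TGTGCCAGCAGTTTAGGGGAGACCCAGTACTTC", "cdr3_aa": "CASSLGETQYF",
     "v": "TRBV7-9", "j": "TRBJ2-5", "count": 120},
    {"cdr3_nt": "TGTGCCAGCAGCCTGGGAGAGACCCAGTACTTC", "cdr3_aa": "CASSLGETQYF",
     "v": "TRBV7-9", "j": "TRBJ2-5", "count": 15},
    {"cdr3_nt": "TGTGCCAGCAGTTTAGGGGATACCCAGTACTTC", "cdr3_aa": "CASSLGDTQYF",
     "v": "TRBV7-9", "j": "TRBJ2-5", "count": 8},
    {"cdr3_nt": "TGTGCCAGCAGCCAAGATACGCAGTATTTT", "cdr3_aa": "CASSQDTQYF",
     "v": "TRBV5-1", "j": "TRBJ2-3", "count": 3},
]


def group_by_aa_clonotype(rows):
    """Map each (V, CDR3 AA, J) key to the set of nucleotide CDR3s behind it."""
    groups = defaultdict(set)
    for row in rows:
        groups[(row["v"], row["cdr3_aa"], row["j"])].add(row["cdr3_nt"])
    return dict(groups)


def summarize(rows):
    """Count clonotypes at the nucleotide and amino acid level."""
    groups = group_by_aa_clonotype(rows)
    nt_clonotypes = {(r["v"], r["cdr3_nt"], r["j"]) for r in rows}
    # Amino acid clonotypes encoded by more than one nucleotide variant.
    convergent = {key: nts for key, nts in groups.items() if len(nts) > 1}
    return {
        "nt_clonotypes": len(nt_clonotypes),
        "aa_clonotypes": len(groups),
        "convergent": convergent,
    }


summary = summarize(ROWS)
print("nucleotide-level clonotypes:", summary["nt_clonotypes"])
print("amino-acid-level clonotypes:", summary["aa_clonotypes"])
for (v, aa, j), nts in summary["convergent"].items():
    print(f"{v} {aa} {j}: {len(nts)} nucleotide variants")
```

If the table has no nucleotide column, the same grouping on (V, CDR3 AA, J) alone will still show which rows look like duplicates, but it cannot tell you whether they differ at the nucleotide level.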


0


This was very helpful. I had a couple of follow-ups: 1. Is there a reference for the figure and the second bullet point? I would like to use them as references in a paper we are working on. 2. How variable is the N region in terms of number of nucleotides?

9.1 years ago by bioinformatics.cancer ▴ 260

0


Hello!

1a I've just referenced this page: https://www.bsse.ethz.ch/lsi/research/systems-immunology.html . I bet this sort of picture can be found in various textbooks, e.g. Janeway's Immunobiology.
1b http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2984183/ - for convergent recombination. In http://www.jimmunol.org/content/196/12/5005.short we have a figure directly showing that for the core (naive) repertoire the number of nt sequences per aa sequence is > 1.
2 Again, you can check out the "insert size" figure from http://www.jimmunol.org/content/196/12/5005.short , which was generated for healthy donors of various ages. For a more detailed insert size distribution for the V-D-J recombination machinery, please refer to http://www.phys.ens.fr/~awalczak/PUBLI/mmwc12.pdf

9.1 years ago by mikhail.shugay 3.5k

score 2 · Answer 2 · 2015-09-17

The rows do contain distinct clonotypes; it's just that different clonotypes are not always different at the protein level. You're thinking in terms of amino acids, but the clonotypes are defined according to nucleic acids, which is where the confusion is coming from.
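
A small Python sketch of the nucleic acid versus protein point, using the standard genetic code. The two CDR3 nucleotide sequences are invented for illustration (they are not from any real data set); they differ at three codons yet translate to the same amino acid sequence, so they would appear as two rows with an identical CDR3 AA entry.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate an in-frame nucleotide sequence into amino acids."""
    if len(nt) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


# Two made-up CDR3 nucleotide sequences that differ only at synonymous codons.
cdr3_nt_a = "TGTGCCAGCAGTTTAGGGGAGACCCAGTACTTC"
cdr3_nt_b = "TGTGCCAGCAGCCTGGGAGAGACCCAGTACTTC"

aa_a = translate(cdr3_nt_a)
aa_b = translate(cdr3_nt_b)
print(aa_a, aa_b, "same protein:", aa_a == aa_b, "same DNA:", cdr3_nt_a == cdr3_nt_b)
```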

Edit: If you haven't done so, read this paper.