Question

Shannon vs Simpson for TCR diversity estimation

0

6.1 years ago

CY ▴ 750

My aim is to evaluate the TCR status of cancer patients. Here is what I found after doing some research:

Based on the Shannon formula (α=1), the Shannon index weights each TCR subclone in proportion to its relative frequency. The Simpson index (α=2), however, puts more weight on dominant subclones.
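To make the difference concrete, here is a minimal sketch, assuming that α refers to the order of the Hill number (the effective number of clonotypes), which is a common way to put Shannon and Simpson on the same scale. The function name `hill_number` and the clone counts are made up for illustration and do not come from any TCR-Seq tool.

```python
import math

def hill_number(counts, alpha):
    """Effective number of clonotypes of order alpha (Hill number).

    alpha = 1 is the exponential of the Shannon entropy,
    alpha = 2 is the inverse Simpson concentration.
    """
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if alpha == 1:
        # The general formula is undefined at alpha = 1; use its limit.
        return math.exp(-sum(p * math.log(p) for p in freqs))
    return sum(p ** alpha for p in freqs) ** (1.0 / (1.0 - alpha))

# Hypothetical repertoire: one dominant clone plus a tail of rare clones.
skewed = [900, 50, 30, 10, 5, 5]
# Hypothetical repertoire with perfectly even clone sizes.
even = [10, 10, 10, 10]

for name, counts in [("skewed", skewed), ("even", even)]:
    print(name, hill_number(counts, 1), hill_number(counts, 2))
```

For the even repertoire both orders give the same value (the number of clonotypes). For the skewed one the α=2 value is lower than the α=1 value, because rare clones contribute less as α grows.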

I initially leaned toward Shannon, as I couldn't think of any reason not to treat each subclone proportionally. However, I now think the Simpson index may have its own merits: 1. Very rare TCR sequences may be false positives caused by technical errors during TCR-Seq sequencing (different TCR subclones may differ by only one nucleotide). 2. Dominant subclones may deserve more attention, since they are most likely the ones activated by neoantigens.

In addition to the points above, TCR subclones that differ by only one nucleotide may target the same neoantigen. Therefore, they should not be treated separately (I guess this is the concept of functional diversity).
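One way to explore this idea is to collapse sequences that differ by a single nucleotide before computing any diversity index. The sketch below is a naive single-linkage grouping on Hamming distance, assuming equal-length nucleotide sequences. It is not the error-correction algorithm of any particular tool, and the function names, sequences and counts are hypothetical.

```python
from itertools import combinations

def differ_by_at_most_one(a, b):
    """True if two equal-length sequences differ at no more than one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) <= 1

def merge_near_identical(clone_counts):
    """Merge clonotypes linked by single-nucleotide differences.

    clone_counts maps sequence -> read (or UMI) count. Each group is
    reported under its most abundant member, with counts summed.
    """
    seqs = list(clone_counts)
    parent = {s: s for s in seqs}

    def find(s):
        # Follow parent links to the group representative.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Pairwise comparison is quadratic; fine for a sketch, slow for large repertoires.
    for a, b in combinations(seqs, 2):
        if differ_by_at_most_one(a, b):
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Keep the more abundant sequence as the representative.
                if clone_counts[root_a] < clone_counts[root_b]:
                    root_a, root_b = root_b, root_a
                parent[root_b] = root_a

    merged = {}
    for s in seqs:
        root = find(s)
        merged[root] = merged.get(root, 0) + clone_counts[s]
    return merged

# Hypothetical clonotypes: two rare variants one nucleotide away from a dominant clone.
example = {"TGTGCCAGC": 500, "TGTGCCAGT": 3, "TGTGCTAGC": 2, "TGTAAAGGG": 40}
print(merge_near_identical(example))
```

Note that this merges true biological variants together with sequencing errors, so whether it is appropriate depends on which of the two you believe the rare neighbours to be.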

I'm not sure whether my reasoning makes sense. Can someone share some comments? Thanks in advance.

TCR shannon simpson • 3.3k views

ADD COMMENT • link updated 13 months ago by ATpoint 88k • written 6.1 years ago by CY ▴ 750

0

What are you trying to achieve when you say evaluate TCR status? If you're worried about technical errors, try technical replication. See what the error rates look like.

ADD REPLY • link 6.0 years ago by karl.stamm 4.1k

0

I have performed an NGS analysis of faecal sludge. The values obtained are Chao1 = 257, Shannon = 4.703018739, Simpson = 0.982884914. Can you please interpret the given values?

ADD REPLY • link 13 months ago by salinidevisalu • 0

0

This does not belong here -- it's bad practice to ask questions in existing threads. Please open a new question, add context and some effort beyond a one-liner.

ADD REPLY • link 13 months ago by ATpoint 88k

score 1 · Answer 1 · 2023-01-26

Hi, first of all I would recommend using UMIs when preparing the cDNA library. That will make it easier both to normalize the data and to correct errors introduced during PCR amplification and sequencing.

Also, using the proper software should help with dealing with errors. MiXCR has a way to correct PCR errors (even if you don't have UMIs, although I would strongly recommend molecular barcoding if it's possible).

As for the diversity indices, both Shannon and Simpson should give you more or less similar results in most cases. Btw, MiXCR can also evaluate all the main diversity indices:

https://docs.milaboratories.com/mixcr/reference/mixcr-postanalysis/#diversity-measures

From my experience, the best setting is to combine UMI-based normalization with the normalized Shannon-Wiener index (the Shannon-Wiener index divided by the log of the number of clonotypes). I actually have an article published on that: https://pubmed.ncbi.nlm.nih.gov/29080364/
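As a rough illustration of that normalization, here is a minimal sketch that divides the Shannon-Wiener entropy by the log of the number of observed clonotypes. It assumes natural logarithms and counts that are already UMI-based. The function name, the example counts and the choice to return 0.0 for a single clonotype are my own assumptions, not taken from MiXCR or the linked article.

```python
import math

def normalized_shannon(counts):
    """Shannon-Wiener entropy divided by ln(number of observed clonotypes).

    Ranges from 0 (one clone dominates completely) to 1 (all clones equal).
    """
    observed = [c for c in counts if c > 0]
    n_clonotypes = len(observed)
    if n_clonotypes < 2:
        # ln(1) = 0 makes the ratio undefined; returning 0.0 is an assumed convention.
        return 0.0
    total = sum(observed)
    entropy = -sum((c / total) * math.log(c / total) for c in observed)
    return entropy / math.log(n_clonotypes)

# Hypothetical UMI counts per clonotype.
print(normalized_shannon([900, 50, 30, 10, 5, 5]))  # skewed repertoire, well below 1
print(normalized_shannon([10, 10, 10, 10]))         # perfectly even repertoire
```

Because the result is scaled by the maximum possible entropy for the observed number of clonotypes, samples with different clonotype counts become easier to compare, although the value still depends on sequencing depth.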