Question

filter out T cell clones

1

2.0 years ago

chi.delta ▴ 40

Dear all,

I just mapped some bulk Seq reads to reference VDJ genes of T cell receptors in order to extract the T cell clonotypes (using mixcr). The resulting clonotypes come in one txt file per sample that looks like this:

count   freq    cdr3nt  cdr3aa  v   d   j   VEnd    DStart  DEnd    JStart
76  0.05846153846153846 TGTGCCTTATCGGGGTACACCGATAAACTCATCTTT    CALSGYTDKLIF    TRDV3   .   TRDJ1   7   -1  -1  6
59  0.045384615384615384    TGTGCTGTGCGGCCTGCCGGGACTGCAAGGCAACTGACCTTT  CAVRPAGTARQLTF  TRAV20  .   TRAJ22  16  -1  -1  22
.....

However, in some cases I would like to filter out some TRD clones, for example the first one here, which contains a TRDV3 and a TRDJ gene. Can one do this easily, directly on the txt file, in R? Or is there another way of doing this instead of reading the table as a data frame, filtering, and then exporting it as txt again? I eventually import these txt files into vdjtools for further processing.

Any help or idea would be much appreciated

Thanks a lot!

vdjtools TCR mixcr • 1.2k views

ADD COMMENT • link updated 17 months ago by Ram 44k • written 2.0 years ago by chi.delta ▴ 40

score 0 · Answer 1 · 2022-11-19

0

2.0 years ago

Jesse ▴ 850

What mixcr command created that output? From their docs it sounds like outputs are generally just TSV files, so it'd be easy enough to do a read.table or what have you in R and go from there. But they also have a feature to convert things to AIRR format which could be handy too.

In any case, filtering the table will involve some kind of read+filter+write, whether with R or whatever else. Is something like an awk one-liner all you need?

awk '$5 !~ /TRD/ && $7 !~ /TRD/' < file > file2
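A slightly more defensive variant of the one-liner above, as a sketch only. It assumes the file really is tab-separated with a header row, with the V gene in column 5 and the J gene in column 7, as in the table from the question. The function name `filter_trd` and the file names are placeholders, not anything from mixcr or vdjtools.

```shell
# Sketch: drop clonotypes whose V or J gene name contains "TRD".
# Assumes tab-separated input, header on line 1, V in column 5, J in column 7.
filter_trd() {
    awk -F'\t' 'NR == 1 || ($5 !~ /TRD/ && $7 !~ /TRD/)' "$1"
}

# Demo input: the header plus the two clonotypes shown in the question.
{
    printf 'count\tfreq\tcdr3nt\tcdr3aa\tv\td\tj\tVEnd\tDStart\tDEnd\tJStart\n'
    printf '76\t0.05846153846153846\tTGTGCCTTATCGGGGTACACCGATAAACTCATCTTT\tCALSGYTDKLIF\tTRDV3\t.\tTRDJ1\t7\t-1\t-1\t6\n'
    printf '59\t0.045384615384615384\tTGTGCTGTGCGGCCTGCCGGGACTGCAAGGCAACTGACCTTT\tCAVRPAGTARQLTF\tTRAV20\t.\tTRAJ22\t16\t-1\t-1\t22\n'
} > demo_sample.txt

# Apply to each per-sample file (list your own files here instead of the demo).
for f in demo_sample.txt; do
    filter_trd "$f" > "${f%.txt}.noTRD.txt"
done
```

Splitting on tabs only avoids surprises if a field is ever empty or contains a space, and `NR == 1` keeps the header explicitly rather than relying on it not matching. Note that the `freq` column is passed through unchanged, so the remaining frequencies are not renormalized after rows are dropped.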

ADD COMMENT • link 2.0 years ago by Jesse ▴ 850

0

this just worked perfectly, thanks a lot!

ADD REPLY • link 2.0 years ago by chi.delta ▴ 40

score 0 · Answer 2 · 2023-01-25

0

22 months ago

mizraelson ▴ 60

Hi, just to add: it's important to note that TRA and TRD clones often share the same segments (V and J). In our experience, only the C gene can reliably distinguish between the two.

Also, it's worth noting that MiXCR series 4 supports most of the features of vdjtools. You can read about the available post-analysis options in our new documentation portal: https://docs.milaboratories.com/mixcr/reference/mixcr-postanalysis/
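If the exported table includes a C-gene column, the same kind of awk filter can key on that column instead of the V and J names. A minimal sketch, with loud assumptions: the table is tab-separated, the C-gene column header is `c` (a made-up name; the table in the question has no such column, so check what your export actually calls it), and delta constant-gene names start with `TRDC`. The function name, file names, and toy rows are invented purely to exercise the code.

```shell
# Sketch: drop rows whose C gene starts with "TRDC"; the column is found by header name.
# Usage: filter_by_c FILE COLUMN_NAME   (both names here are illustrative assumptions)
filter_by_c() {
    awk -F'\t' -v col="$2" '
        NR == 1 {
            # Locate the requested column in the header row.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            print
            next
        }
        $idx !~ /^TRDC/
    ' "$1"
}

# Toy input (invented rows): the last one has a TRA-style V name but a delta C gene.
{
    printf 'count\tv\tj\tc\n'
    printf '10\tTRAV20\tTRAJ22\tTRAC\n'
    printf '5\tTRDV3\tTRDJ1\tTRDC\n'
    printf '3\tTRAV29/DV5\tTRDJ1\tTRDC\n'
} > toy_with_c.txt

filter_by_c toy_with_c.txt c > toy_with_c.noTRDC.txt
```

Looking the column up by name rather than by position keeps the filter working if the export's column order changes.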

ADD COMMENT • link 22 months ago by mizraelson ▴ 60

0

Oh, that's good to know about the segments! (chi.delta, watch out, then, if you're trying to recognize TRD from V+J gene names like I mentioned.) Though, aren't alpha/beta/gamma/delta TCR chains assembled from totally different loci? I'm confused how a beta chain could end up using a V gene from TRD for example. This is probably where my ignorance of TCRs vs IGs is showing though.

ADD REPLY • link 22 months ago by Jesse ▴ 850