filter out T cell clones
2.0 years ago
chi.delta ▴ 40

Dear all,

I just mapped some bulk Seq reads to reference VDJ genes of T cell receptors in order to extract the T cell clonotypes (using mixcr). The resulting clonotypes come in one txt file per sample that looks like this:

count   freq    cdr3nt  cdr3aa  v   d   j   VEnd    DStart  DEnd    JStart
76  0.05846153846153846 TGTGCCTTATCGGGGTACACCGATAAACTCATCTTT    CALSGYTDKLIF    TRDV3   .   TRDJ1   7   -1  -1  6
59  0.045384615384615384    TGTGCTGTGCGGCCTGCCGGGACTGCAAGGCAACTGACCTTT  CAVRPAGTARQLTF  TRAV20  .   TRAJ22  16  -1  -1  22
.....

However, in some cases I would like to filter out some TRD clones, for example the first one here, which contains a TRDV3 and a TRDJ gene. Can one do this easily, directly on the txt file, in R? Or is there another way of doing this, instead of reading the table as a data frame, filtering, and then exporting it as txt again? I eventually import these txt files into vdjtools for further processing.

Any help or ideas would be much appreciated.

Thanks a lot!

vdjtools TCR mixcr • 1.2k views
2.0 years ago
Jesse ▴ 850

What mixcr command created that output? From their docs it sounds like outputs are generally just TSV files, so it'd be easy enough to do a read.table or what have you in R and go from there. But they also have a feature to convert things to AIRR format which could be handy too.

In any case, filtering the table will involve some kind of read+filter+write, whether with R or whatever else. Is something like an awk one-liner all you need?

awk '$5 !~ /TRD/ && $7 !~ /TRD/' < file > file2
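
A slightly more defensive variant of the same idea, as a minimal sketch: it assumes the file is tab-separated with a single header line, as in the example in the question, and it only matches gene names that start with TRD. The function name filter_trd is made up for illustration and is not part of mixcr or vdjtools.

```shell
# Drop clonotypes whose V (column 5) or J (column 7) gene name starts with "TRD".
# Assumes a tab-separated file with one header line, as in the example above.
# filter_trd is a hypothetical helper name, not a mixcr or vdjtools command.
filter_trd() {
    # NR == 1 always passes the header line through unchanged.
    awk -F'\t' 'NR == 1 || ($5 !~ /^TRD/ && $7 !~ /^TRD/)' "$1"
}
```

It would be used like `filter_trd file > file2`. The count and freq columns are copied through unchanged, so freq is not renormalized after rows are dropped.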

This just worked perfectly, thanks a lot!

22 months ago
mizraelson ▴ 60

Hi, just to add: it's important to note that TRA and TRD clones often share the same V and J segments. In our experience, only the C gene can reliably distinguish between the two.
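
If the export contains a C gene column, the awk filter from the answer above could be keyed on that column instead. This is only a rough sketch under assumptions: the table shown in the question has no C gene column, so the default column name used here (allCHitsWithScore) is a guess that should be checked against the header of your own file, and filter_by_c_gene is a made-up helper name.

```shell
# Drop rows whose C gene column starts with "TRDC".
# Usage: filter_by_c_gene FILE [COLUMN_NAME]
# Assumptions: tab-separated file with one header line; the default column
# name below is a guess, so check the header of your own export.
filter_by_c_gene() {
    awk -F'\t' -v col="${2:-allCHitsWithScore}" '
        NR == 1 {
            # Find the index of the requested column in the header line.
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
            print
            next
        }
        # Keep every row whose C gene field does not begin with TRDC.
        $c !~ /^TRDC/
    ' "$1"
}
```

Rows with an empty C gene field are kept by this sketch, so clones without an assigned C gene would still need to be handled separately.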

Also, it's worth noting that MiXCR series 4 supports most of the features of vdjtools. You can read about the available post-analysis options in our new documentation portal: https://docs.milaboratories.com/mixcr/reference/mixcr-postanalysis/


Oh, that's good to know about the segments! (chi.delta, watch out, then, if you're trying to recognize TRD from V+J gene names like I mentioned.) Though, aren't alpha/beta/gamma/delta TCR chains assembled from totally different loci? I'm confused how a beta chain could end up using a V gene from TRD for example. This is probably where my ignorance of TCRs vs IGs is showing though.
