8.8 years ago
Prasad
Hello,
I am new to antibody data. I used the MiGEC tool for CDR3 prediction. Now I need to get the CDR1 and CDR2 sequences (amino acid). Prediction of CDR1, CDR2, and CDR3 can be done with the IMGT/V-QUEST tool, but the problem is its submission size limit: I have nearly 4M reads (from B cell antibodies). Are there any tools, or a way, to extract these CDR regions from the reads (I have both nucleotide and amino acid sequences)? Can IMGT/V-QUEST be downloaded?
Thanks
Can you code? I created a position table by pulling the IMGT germline gene database in FASTA format from their FTP site (ungapped), then pulled the CDRs into three separate FASTA files and aligned each one to its corresponding full sequence. Using BLAST tabular output, I got the start and end position of each CDR per germline and built a position table keyed by ID: Germid\tFR1start\tFR1end\tCDR1start\tCDR1end, etc. Then I can BLAST my sequences against the database, get the germline ID, and from that I know the positions at which to cut the sequence to get the CDRs. IMGT germline sequences do not contain CDR3, but you can use the J-region sequences to get the end position of CDR3. Rule of thumb: FR4 is always 10 or 11 AA, depending on the domain. This more or less emulates IgBLAST, which you can also use.
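For what it's worth, here is a minimal Python sketch of the position-table idea, not the exact scripts I used. It assumes BLAST was run with the default tabular columns (`-outfmt 6`), that the region FASTA entries were named `<germline_id>|<region>` (a naming convention made up for this example), and that the read-to-germline alignment is ungapped and on the plus strand. The function names `parse_blast_tab`, `build_position_table`, and `cut_regions` are hypothetical helpers, not part of any tool.

```python
import csv
from io import StringIO

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
              "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_blast_tab(text):
    """Parse BLAST tabular text into a list of dicts with integer coordinates."""
    rows = []
    for rec in csv.reader(StringIO(text), delimiter="\t"):
        if not rec or rec[0].startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(BLAST_COLS, rec))
        for key in ("qstart", "qend", "sstart", "send"):
            row[key] = int(row[key])
        rows.append(row)
    return rows


def build_position_table(region_hits):
    """Build {germline_id: {region: (start, end)}} from region-vs-germline hits.

    Assumes each query id looks like '<germline_id>|<region>' (hypothetical
    convention) and keeps only hits of a region against its own germline.
    """
    table = {}
    for hit in region_hits:
        germ_id, region = hit["qseqid"].rsplit("|", 1)
        if germ_id != hit["sseqid"]:
            continue
        table.setdefault(germ_id, {})[region] = (hit["sstart"], hit["send"])
    return table


def cut_regions(read_seq, read_hit, table):
    """Cut regions out of a read using its best hit against a germline.

    Germline coordinates (1-based, inclusive) are shifted onto the read by a
    constant offset, which is only valid for an ungapped plus-strand alignment.
    """
    offset = read_hit["qstart"] - read_hit["sstart"]
    regions = {}
    for region, (g_start, g_end) in table.get(read_hit["sseqid"], {}).items():
        # Skip regions that are not fully covered by the alignment.
        if g_start < read_hit["sstart"] or g_end > read_hit["send"]:
            continue
        r_start, r_end = g_start + offset, g_end + offset
        regions[region] = read_seq[r_start - 1:r_end]
    return regions
```

You would call `build_position_table` once on the region-vs-germline BLAST output, then `cut_regions` on each read with its top germline hit. Reads with indels in the V region need the alignment itself to map coordinates, which is one reason to just use IgBLAST if this gets fiddly.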