IgBlast incorrect output for Light Chain
4.5 years ago
fusion.slope ▴ 250

Hello

I am trying to reconstruct the heavy chain and the light chain from two DNA sequences. For both chains I run the same igblastn command:

database="my_path/to/db/"   
optional_file="my_path/to/optional_file/"  


igblastn -germline_db_V $database/human_igh_v -germline_db_D $database/human_igh_d -germline_db_J $database/human_igh_j -auxiliary_data $optional_file/human_gl.aux -domain_system imgt -ig_seqtype Ig -organism human -outfmt '7 std qseq sseq btop' -query example.fa -out example.fmt7

When I compare the heavy-chain results with those from the IgBLAST web tool at https://www.ncbi.nlm.nih.gov/igblast/igblast.cgi, they are identical. For the light chain, however, they differ. What could be the reason? The database was created following the instructions here:
https://changeo.readthedocs.io/en/version-0.3.11---igblast-junction-fix/examples/igblast.html.

Command-line output for an example light chain (the heavy-chain output is correct, so I will skip it):

# IGBLASTN 2.5.1+
# Query: RL0575_B2_no210_RL0575_B2_positive_LC
# Database: /site/ne/home/i0439277/statistical_analysis/sequences_basic/Primer_Cocktail/blastn/kleinstein-immcantation-4425cb7a6101/scripts/database//human_igh_v /site/ne/home/i0439277/statistical_analysis/sequences_basic/Primer_Cocktail/blastn/kleinstein-immcantation-4425cb7a6101/scripts/database//human_igh_d /site/ne/home/i0439277/statistical_analysis/sequences_basic/Primer_Cocktail/blastn/kleinstein-immcantation-4425cb7a6101/scripts/database//human_igh_j
# Domain classification requested: imgt

# V-(D)-J rearrangement summary for query sequence (Top V gene match, Top J gene match, Chain type, stop codon, V-J frame, Productive, Strand).  Multiple equivalent top matches having the same score and percent identity, if present, are separated by a comma.
IGHV3-47*01,IGHV3-47*02 N/A     VL      No      N/A     N/A     +

# V-(D)-J junction details based on top germline gene matches (V end, V-J junction, J start).  Note that possible overlapping nucleotides at VDJ junction (i.e, nucleotides that could be assigned to either rearranging gene) are indicated in parentheses (i.e., (TACT)) but are not included under the V, D, or J gene itself
ATTGT   N/A     N/A

# Alignment summary between query and top germline V gene hit (from, to, length, matches, mismatches, gaps, percent identity)
FR3-IMGT        240     276     37      27      10      0       73
Total   N/A     N/A     37      27      10      0       73

# Hit table (the first field indicates the chain type of the hit)
# Fields: query id, subject id, % identity, alignment length, mismatches, gap opens, gaps, q. start, q. end, s. start, s. end, evalue, bit score, query seq, subject seq, BTOP
# 3 hits found
V       RL0575_B2_no210_RL0575_B2_positive_LC      IGHV3-47*01     72.973  37      10      0       0       240     276     249     285     0.098   28.3    CAGCCTCCAGTCTGAGGATGAGGCTGACTATTATTGT    CAGCCTGATAGCTGAGGACATGGCTGTGTATTATTGT   6CGCAATGATG7TCGAAT5ATCG9
V       RL0575_B2_no210_RL0575_B2_positive_LC      IGHV3-47*02     72.973  37      10      0       0       240     276     249     285     0.098   28.3    CAGCCTCCAGTCTGAGGATGAGGCTGACTATTATTGT    CAGCCTGATAGCTGAGGACATGGCTGTGTATTATTGT   6CGCAATGATG7TCGAAT5ATCG9
V       RL0575_B2_no210_RL0575_B2_positive_LC      IGHV3-30-2*01   80.000  25      5       0       0       196     220     111     135     0.85    25.2    TTCTCAGGCTCCAGTTCTGGGGCTG        TTCCCAGGCTCCAGGGAAGGGGCTG       3TC10TGTGCATA7

Total queries = 1
Total identifiable CDR3 = 0
Total unique clonotypes = 0

# BLAST processed 1 queries

Web tool output for the same light chain (the heavy-chain output is correct, so I will skip it):

Database: imgt.Homo_sapiens.V.f.orf.p; imgt.Homo_sapiens.D.f.orf;

imgt.Homo_sapiens.J.f.orf
           600 sequences; 158,627 total letters



Query= RL0575_B2_no210_RL0575_B2_positive_LC

Length=334
                                                                                                      Score     E
Sequences producing significant alignments:                                                          (Bits)  Value

IGLV4-69*01 germline gene                                                                             391     7e-111
IGLV4-69*02 germline gene                                                                             388     6e-110
IGLV4-60*03 germline gene                                                                             310     2e-86 
IGLJ1*01 germline gene                                                                                66.1    5e-15 
IGLJ6*01 germline gene                                                                                35.3    8e-06 
IGLJ2*01 germline gene                                                                                29.5    5e-04 


Domain classification requested: imgt


V-(D)-J rearrangement summary for query sequence (multiple equivalent top matches, if present, are separated by a comma):
Top V gene match    Top J gene match    Chain type  stop codon  V-J frame   Productive  Strand
IGLV4-69*01 IGLJ1*01    VL  No  In-frame    Yes +


V-(D)-J junction details based on top germline gene matches:
V region end    V-J junction*   J region start
CATTC   C   TGTCT
*: Overlapping nucleotides may exist at V-D-J junction (i.e, nucleotides that could be assigned 
to either rearranging gene).  Such nucleotides are indicated inside a parenthesis (i.e., (TACAT))
 but are not included under the V, D or J gene itself.


Sub-region sequence details:
    Nucleotide sequence Translation Start   End
CDR3    CAGACCTGGGGCACTGGCATTCCTGTC     QTWGTGIPV       277     303 



Alignment summary between query and top germline V gene hit:
     from    to      length      matches     mismatches      gaps    identity(%) 
 FR1-IMGT    5   75      71      67      4   0   94.4 
 CDR1-IMGT   76      96      21      20      1   0   95.2 
 FR2-IMGT    97      147     51      43      8   0   84.3 
 CDR2-IMGT   148     168     21      20      1   0   95.2 
 FR3-IMGT    169     276     108     100     8   0   92.6 
 CDR3-IMGT (germline)    277     298     22      22      0   0   100 
 Total           294     272     22      0   92.5

Any recommendation would be really appreciated.

IgBlast LightChain HeavyChain • 2.2k views
3.4 years ago
Mayla ▴ 10

I have the same problem! Did you manage to solve it? I tried APIBlast and ElasticBlast, but neither supports igblastn.


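The germline databases were not configured appropriately. The command points igblastn only at the heavy-chain databases (human_igh_v, human_igh_d, human_igh_j), so a light-chain query can only be aligned against IGHV genes, which explains the weak IGHV3-47 hit. The web tool searches a V database that also contains the light-chain genes, hence the IGLV4-69 match. Try configuring the databases as per the instructions so that the light-chain germline genes are included.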
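
As a rough sketch of one way to do that: the command in the question only searches the IGH germline databases, so an option is to point igblastn at databases that also include the IGK and IGL genes. The `human_ig` prefix below is a hypothetical name for such combined databases (you would build them with the same procedure you used for the IGH ones), and the paths are the placeholders from the question. The script only prints the command so it can be inspected before running.

```shell
#!/usr/bin/env bash
# Sketch only: directory and database names are assumptions,
# not the original poster's actual setup.
database="${database:-my_path/to/db}"
optional_file="${optional_file:-my_path/to/optional_file}"

# Build the igblastn argument list for a given database prefix.
# "human_ig" is a hypothetical prefix for databases that combine the
# IGH, IGK and IGL germline genes; "human_igh" is the heavy-chain-only
# prefix used in the question.
igblast_cmd() {
    local prefix="$1" query="$2" out="$3"
    local cmd=(
        igblastn
        -germline_db_V "$database/${prefix}_v"
        -germline_db_D "$database/${prefix}_d"
        -germline_db_J "$database/${prefix}_j"
        -auxiliary_data "$optional_file/human_gl.aux"
        -domain_system imgt -ig_seqtype Ig -organism human
        -outfmt '7 std qseq sseq btop'
        -query "$query" -out "$out"
    )
    # %q quotes each argument so the printed line can be pasted into a shell.
    printf '%q ' "${cmd[@]}"
    printf '\n'
}

# Dry run: print the command for the combined databases.
igblast_cmd human_ig example.fa example.fmt7
```

To run it rather than print it, something like `eval "$(igblast_cmd human_ig example.fa example.fmt7)"` should work. If the top V hit then comes back as an IGLV or IGKV gene instead of IGHV, the database selection was the issue.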
