9.4 years ago
bioinfo
Hi, I ran Vmatch and USEARCH on the same NGS dataset (Illumina 101 bp paired-end reads) against a local database of protein sequences and noticed a large difference in the number of hits they generate.
I used the following commands:
VMATCH
vmatch -showdesc 0 -dnavsprot 11 -l 20 -identity 90 -h 2 -q input.fasta dbs/database > vmatch.out
I then kept only the best match. The database index was built with mkvtree from Vmatch.
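A minimal sketch of how a best-match filter like this could look in Python. It is illustrative only, not the exact script used: it assumes the Vmatch output has already been parsed into (read_id, gene, score) tuples, and that layout, the function name and the example values are all hypothetical.

```python
def best_hit_per_read(hits):
    """Keep the highest-scoring hit for each read.

    hits: iterable of (read_id, gene, score) tuples (hypothetical layout;
    parsing of the real Vmatch output is not shown here).
    On ties, the first hit seen is kept.
    """
    best = {}
    for read_id, gene, score in hits:
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (gene, score)
    return best


# Invented example records, only to show the call.
example_hits = [
    ("read1", "ISCR2", 60),
    ("read1", "ISCR1", 55),
    ("read2", "intI9", 48),
]
best = best_hit_per_read(example_hits)
print(best)
```

Whether ties should go to the first hit or be discarded is a design choice, and it can change the per-gene totals when a read matches several closely related genes equally well.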
USEARCH
usearch -usearch_global input.fasta -db db.udb -id 0.90 -threads 16 -blast6out usearch.out -maxaccepts 1
As you can see, in Vmatch I allowed only 2 mismatches (-h 2), whereas in USEARCH I accepted hits down to 90% identity (-id 0.90). Why, then, is there such a huge difference in the results?
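For anyone who wants to reproduce the raw counts, here is a rough Python sketch of tallying hits per target from the USEARCH blast6out file. It assumes the usual tab-separated BLAST tabular layout with the target label in the second column, and that the label is the gene name. The function name and the demo rows are made up for illustration.

```python
import io
from collections import Counter


def count_hits_per_target(handle):
    """Count lines per target label (column 2) in a blast6out-style file."""
    counts = Counter()
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        counts[fields[1]] += 1
    return counts


# Invented demo rows in 12-column tabular form (not real results).
demo = io.StringIO(
    "read1\tISCR2\t96.9\t33\t1\t0\t1\t99\t10\t42\t1e-10\t70.0\n"
    "read2\tISCR2\t93.9\t33\t2\t0\t1\t99\t50\t82\t1e-09\t66.0\n"
    "read3\tintI9\t90.9\t33\t3\t0\t1\t99\t5\t37\t1e-08\t62.0\n"
)
demo_counts = count_hits_per_target(demo)
print(demo_counts.most_common())
```

With a real file, the same function could be called as `count_hits_per_target(open("usearch.out"))`.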
OUTPUTS (columns: gene name [TAB] normalised count [TAB] raw hit count), all hits:
VMATCH OUTPUT
ISCR1 7.99383167412687 4054
ISCR14 1.74291497975711 861
ISCR2 50.7803527891896 23294
ISCR3 0.0991902834008098 49
ISCR4 2.81212121212114 1392
ISCR5 0.222003929273085 113
ISCR6 2.48478701825562 1225
ISCR7 0.0078740157480315 3
ISCR8 0.0854368932038835 44
OXA-10 0.00778210116731518 2
SHV 0.00699300699300699 2
aac(3)-IVa 0.00387596899224806 1
aph(3')-Ia 0.0260012252632179 7
catB2/catB3/catB10 0.109523809523809 23
catB9 0.0909090909090909 19
dfrA1/dfrA15 0.203821656050955 32
dfrA12/dfrA13/dfrA21/dfrA22/dfrA33 0.00606060606060606 1
dfrA16 0.00636942675159236 1
dfrA3 0.296296296296297 48
dfrA6/dfrA31 0.00636942675159236 1
dfrC 0.00628930817610063 1
intI1 0.00891090857708068 3
intI10 1.41562500000001 453
intI2 0.624615384615386 203
intI3 0.401734104046244 139
intI6 0.137704918032787 42
intI7 0.161716171617162 49
intI8 0.0379746835443038 12
intI9 0.912757686598001 330
sul2 0.00738007380073801 2
tet(36) 0.0015625 1
tet(39) 0.00253164556962025 1
tet(C) 0.0277777777777778 11
tet(M) 0.00312989045383412 2
tetB(P) 0.380368098159508 248
vanRN 0.00434782608695652 1
vat(C) 0.00471698113207547 1
vat(F) 0.298642533936652 66
vat(G) 0.0833333333333333 18
USEARCH OUTPUT
ISCR1 1.24366471734895 638
ISCR2 30.2932796346512 15002
ISCR3 0.0587044534412955 29
ISCR4 0.0565656565656565 28
ISCR6 0.00202839756592292 1
ISCR7 0.0026246719160105 1
ISCR8 0.0737864077669903 38
aph(3')-Ia 0.029520295202952 8
dfrA1/dfrA15 0.152866242038217 24
dfrA3 0.0123456790123457 2
dfrA6/dfrA31 0.00636942675159236 1
intI1 0.00593471810089021 2
intI10 0.209375 67
intI2 0.116923076923077 38
intI3 0.0173410404624277 6
intI6 0.00327868852459016 1
intI8 0.00949367088607595 3
intI9 0.602232901241181 218
tet(39) 0.00253164556962025 1
tet(C) 0.0227272727272727 9
tet(M) 0.00312989045383412 2
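To make the gap easier to see, the two tables can be put side by side. Below is a small Python sketch using a handful of the raw counts copied from the outputs above. The function name is hypothetical, and genes missing from one tool are treated as a count of 0.

```python
# Raw hit counts for a few genes, copied from the two outputs above.
vmatch_raw = {"ISCR1": 4054, "ISCR2": 23294, "ISCR4": 1392,
              "ISCR5": 113, "intI9": 330, "tet(M)": 2}
usearch_raw = {"ISCR1": 638, "ISCR2": 15002, "ISCR4": 28,
               "intI9": 218, "tet(M)": 2}


def compare_counts(a, b):
    """Return (gene, count_a, count_b, a/b ratio) rows for the union of genes.

    The ratio is None when the gene has no hits in b.
    """
    rows = []
    for gene in sorted(set(a) | set(b)):
        count_a = a.get(gene, 0)
        count_b = b.get(gene, 0)
        ratio = count_a / count_b if count_b else None
        rows.append((gene, count_a, count_b, ratio))
    return rows


comparison = compare_counts(vmatch_raw, usearch_raw)
for gene, v, u, ratio in comparison:
    shown = "n/a" if ratio is None else f"{ratio:.1f}x"
    print(f"{gene}\t{v}\t{u}\t{shown}")
```

In this subset the ratio is far from constant across genes: ISCR4 drops from 1392 to 28 hits while tet(M) is identical in both, so the difference is not a uniform loss of sensitivity.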