Hello,
I used blastx to search my query file against a protein database.
I have more than 140,000 query sequences, so I only want to see the queries that produced an alignment. However, the output lists every query, followed by either its alignments or "No hits found".
That makes picking the aligned queries out of the blastx result file a tremendous amount of work. If I could extract only the aligned query sequences and their alignment information (E-value, score and aligned sequence), it would simplify my job a lot.
This is the blastx output file:
Query= comp936_c0_seq10 len=156 path=[335:0-24 360:25-155]
Length=156
***** No hits found *****
Lambda K H a alpha
0.318 0.134 0.401 0.792 4.96
Gapped
Lambda K H a alpha sigma
0.267 0.0410 0.140 1.90 42.6 43.6
Effective search space used: 21583458
Query= comp1863_c0_seq1 len=2184 path=[0:0-1278 1279:1279-1279
1280:1280-1303 1304:1304-2183]
Length=2184
Score E
Sequences producing significant alignments: (Bits) Value
FBpp0075807 FBgn0000404 symbol:CycA family:Transcription Cofact... 287 8e-89
> FBpp0075807 FBgn0000404 symbol:CycA family:Transcription Cofactors
species:Drosophila melanogaster
Length=491
Score = 287 bits (735), Expect = 8e-89, Method: Compositional matrix adjust.
Identities = 184/468 (39%), Positives = 261/468 (56%), Gaps = 50/468 (11%)
Frame = -3
Query 1810 MATINIHPDQENRV-PELRqkqannamaaqqKRTGLGLIDHN----KANKAVPKGKQ--P 1652
MA+ IH D N+ P ++ G G + N +AN AV G P
Sbjct 1 MASFQIHQDMSNKENPGIKIPAGVKNTKQPLAVIG-GKAEKNALAPRANFAVLNGNNNVP 59
Query 1651 LKESNLSNAR-VENIHVKEN------RKNVVVPVAQFEAFTVYED--DEQRARIDQKL-R 1502
+ R V N++V EN + NVV V QF+ F+VYED D Q A + L
Sbjct 60 RPAGKVQVFRDVRNLNVDENVEYGAKKSNVVPVVEQFKTFSVYEDNNDTQVAPSGKSLAS 119
Query 1501 LISKSN--VYKGTAEDRFITKTELAEIERkkqlqklKELAEIPAVIEPKCENDPCTPMSI 1328
L+ K N V G + KEL + P D +PMS+
Sbjct 120 LVDKENHDVKFGAGQ---------------------KELVDYDLDSTPMSVTDVQSPMSV 158
Query 1327 EK-LNDENAENDSSQLAEEVIRKNSNVKDL--------FFEMEEYRDDIYAYLREHELRH 1175
++ + +D S E + VK+L F E+ +Y+ DI Y RE E +H
Sbjct 159 DRSILGVIQSSDISVGTETGVSPTGRVKELPPRNDRQRFLEVVQYQMDILEYFRESEKKH 218
Query 1174 RPKPGYIVKQPDVTENMRAVLVDWLVEVTEEYKMQTETLYLAVNFIDRFLSYMSVVRAKL 995
RPKP Y+ +Q D++ NMR++L+DWLVEV+EEYK+ TETLYL+V ++DRFLS M+VVR+KL
Sbjct 219 RPKPLYMRRQKDISHNMRSILIDWLVEVSEEYKLDTETLYLSVFYLDRFLSQMAVVRSKL 278
Query 994 QLVGTAAMFIASKYEEIFPPDVSEFVYITDDTYDKHQVIRMEHLILRVLGFDLSVPTPLT 815
QLVGTAAM+IA+KYEEI+PP+V EFV++TDD+Y K QV+RME +IL++L FDL PT
Sbjct 279 QLVGTAAMYIAAKYEEIYPPEVGEFVFLTDDSYTKAQVLRMEQVILKILSFDLCTPTAYV 338
Query 814 FINATCISAGLTEKTMYLAMYLSEIALLEVEPYLQFLPSVIASSAIALARHTLGEEAWND 635
FIN + + EK Y+ +Y+SE++L+E E YLQ+LPS+++S+++ALARH LG E W
Sbjct 339 FINTYAVLCDMPEKLKYMTLYISELSLMEGETYLQYLPSLMSSASVALARHILGMEMWTP 398
Query 634 SLYKHTGYTLKQLQLCICFLYDMFVKAPNHPQHAIQDKYRSRKYMQVS 491
L + T Y L+ L+ + L A A+++KY Y +V+
Sbjct 399 RLEEITTYKLEDLKTVVLHLCHTHKTAKELNTQAMREKYNRDTYKKVA 446
Lambda K H a alpha
0.318 0.134 0.401 0.792 4.96
Gapped
Lambda K H a alpha sigma
0.267 0.0410 0.140 1.90 42.6 43.6
Effective search space used: 472565421
Query= comp1199_c0_seq1 len=1877 path=[19533:0-169 21522:170-173
19704:174-982 21495:983-986 20513:987-1876]
Length=1877
***** No hits found *****
Lambda K H a alpha
0.318 0.134 0.401 0.792 4.96
Gapped
Lambda K H a alpha sigma
0.267 0.0410 0.140 1.90 42.6 43.6
Effective search space used: 397904649
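One way to approach this, as a minimal sketch rather than a tested tool: split the report into one block per query and drop every block that contains the "No hits found" line. The sketch assumes the report looks like the sample above, i.e. each record starts with a line beginning "Query= " and a record without alignments contains "***** No hits found *****". The function names (split_query_blocks, blocks_with_hits, first_hsp_summary, filter_report) are made up for illustration.

```python
import re
from typing import Iterable, Iterator, List, Optional, Tuple

NO_HITS_MARKER = "***** No hits found *****"
# Matches e.g. "Score = 287 bits (735), Expect = 8e-89, Method: ..."
HSP_RE = re.compile(r"Score\s*=\s*([\d.]+) bits.*?Expect(?:\(\d+\))?\s*=\s*([\d.eE+-]+)")


def split_query_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield one list of lines per record; a record starts at 'Query= '.

    Anything before the first 'Query= ' line (the program banner) is skipped.
    Alignment rows start with 'Query ' (no '='), so they do not split a block.
    """
    block: List[str] = []
    for line in lines:
        if line.startswith("Query= "):
            if block:
                yield block
            block = [line]
        elif block:
            block.append(line)
    if block:
        yield block


def blocks_with_hits(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield only the records that do not contain the no-hits marker."""
    for block in split_query_blocks(lines):
        if not any(NO_HITS_MARKER in line for line in block):
            yield block


def first_hsp_summary(block: List[str]) -> Optional[Tuple[str, str, str]]:
    """Return (query_id, bit_score, e_value) for the first alignment, as text."""
    parts = block[0][len("Query= "):].split()
    query_id = parts[0] if parts else ""
    for line in block:
        match = HSP_RE.search(line)
        if match:
            return query_id, match.group(1), match.group(2)
    return None


def filter_report(in_path: str, out_path: str) -> None:
    """Copy only the records with hits from in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for block in blocks_with_hits(src):
            dst.writelines(block)
```

Calling filter_report("blastx.out", "hits_only.out") (both file names are placeholders) would write a copy of the report containing only the queries with alignments. The score and E-value are kept as strings so that nothing is lost in conversion. If re-running the search is an option, the tabular output format of blastx (-outfmt 6) only lists queries that have hits, which avoids the filtering step altogether.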
Hello Mr. RamRS,
I tried your script. Before any change, it showed this result:
Then my friend changed the script a little bit (I believe he changed line 21,
if(scalar(@ARGV) != 2)
), and then it gave this:
I could not find where the problem is.
Check the usage line: it needs the -i flag before the input file name :)
Run the script (the version before your friend changed it) like so:
You'd have to change the argparse code if you wanna use input files without the flag.
EDIT: The change your friend made just bypasses the line of code trying to warn you of an imminent failure - it does nothing to address the cause whatsoever :)
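To illustrate the point about the flag, here is a rough Python analogue of argument handling that accepts the input file either with -i or as a bare argument. The script discussed here is Perl (judging by @ARGV), so this is not its actual code; build_parser, resolve_input and the option names are assumptions made for the example.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command line: '-i FILE' or just 'FILE'."""
    parser = argparse.ArgumentParser(description="filter a BLAST report (illustrative)")
    parser.add_argument("-i", "--input", dest="flagged", help="input file, flag form")
    parser.add_argument("positional", nargs="?", help="input file, bare form")
    return parser


def resolve_input(argv: List[str]) -> str:
    """Return the input path, or exit with a usage error if none was given."""
    parser = build_parser()
    args = parser.parse_args(argv)
    path = args.flagged or args.positional
    if path is None:
        # Fail with a clear message instead of silently skipping the check.
        parser.error("an input file is required: use -i FILE or pass FILE directly")
    return path
```

With this kind of handling, both resolve_input(["-i", "blastx.out"]) and resolve_input(["blastx.out"]) return the same path, and a missing file name still produces a usage error rather than a later failure.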
I tried that command line several times too, before making any change to the script, and got the same result:
I just fixed it - it should work fine now. Sorry for the inconvenience.
Yes, sir, it runs perfectly now.
No, no, there has not been any inconvenience at all. Your suggestions and scripts have been a great help. Thank you for your time and patience.
You're very welcome! Glad I could be of help, and thank you for finding the bug in my code.
That's weird. I guess the script is a bit buggy. I'll work on it and let you know once it is tweaked. It should not take me more than a couple of hours.