NCBI Blast Results parser [Use only score info]
9.5 years ago
clear.choi ▴ 30

I have an output file from the NCBI BLAST program, produced by this command:

blastn -db .DB/ABC_Exon23.fasta.db -query ./test.fasta -out ./test.out -num_descriptions 100 -num_alignments 0 -word_size 10

I only want the scores and the list of top-scoring alleles.

The BLAST output looks like this:

BLASTN 2.2.29+

Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb
Miller (2000), "A greedy algorithm for aligning DNA sequences", J
Comput Biol 2000; 7(1-2):203-14.

Database: /shared/MiSeq/ABC_Exon23.fasta
           7,546 sequences; 4,346,452 total letters

Query= BarcodeS14--S14_Cluster0_Phase0_NumReads236

Length=1006
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

  C_08_94                                                               510   7e-145
  C_08_90                                                               510   7e-145
  C_08_75                                                               510   7e-145
  C_08_74                                                               510   7e-145
  C_08_73                                                               510   7e-145
  C_08_69                                                               510   7e-145
  C_08_68                                                               510   7e-145
  C_08_67                                                               510   7e-145
  C_08_55N                                                              510   7e-145
  C_08_53                                                               510   7e-145
  C_08_52N                                                              510   7e-145
  C_08_45                                                               510   7e-145
  C_08_37                                                               510   7e-145
  C_08_28                                                               510   7e-145
  C_08_25                                                               510   7e-145
  C_08_18                                                               510   7e-145
  C_08_17                                                               510   7e-145
  C_08_05                                                               510   7e-145
  C_08_02_08                                                            510   7e-145

<allele lists>

Lambda      K        H
    1.33    0.621     1.12

Gapped
Lambda      K        H
    1.28    0.460    0.850

Effective search space used: 4101954802

  Database: /shared/MiSeq/ABC_Exon23.fasta
    Posted date:  Jul 22, 2014  5:00 PM
  Number of letters in database: 4,346,452
  Number of sequences in database:  7,546

I only want the score information, in this form:

BarcodeS14--S14_Cluster0_Phase0_NumReads236       C_08_94       510
BarcodeS14--S14_Cluster0_Phase0_NumReads236       C_08_90       510
BarcodeS14--S14_Cluster0_Phase0_NumReads236       C_08_75       510
..... <lists>

Almost all parsers need the alignment information to parse the data, but I only need the scores and want to keep the run time down.

Does anyone have a good idea?

Thank you!

NCBI blast • 1.8k views