Question

NCBI Blast Results parser [Use only score info]

9.5 years ago

clear.choi ▴ 30

I have an output file produced by the NCBI BLAST program, generated with this command:

blastn -db .DB/ABC_Exon23.fasta.db -query ./test.fasta -out ./test.out -num_descriptions 100 -num_alignments 0 -word_size 10

I want to extract only the scores and the list of top-scoring alleles.

The BLAST output looks like this:

BLASTN 2.2.29+

Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb
Miller (2000), "A greedy algorithm for aligning DNA sequences", J
Comput Biol 2000; 7(1-2):203-14.

Database: /shared/MiSeq/ABC_Exon23.fasta
           7,546 sequences; 4,346,452 total letters

Query= BarcodeS14--S14_Cluster0_Phase0_NumReads236

Length=1006
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

  C_08_94                                                               510   7e-145
  C_08_90                                                               510   7e-145
  C_08_75                                                               510   7e-145
  C_08_74                                                               510   7e-145
  C_08_73                                                               510   7e-145
  C_08_69                                                               510   7e-145
  C_08_68                                                               510   7e-145
  C_08_67                                                               510   7e-145
  C_08_55N                                                              510   7e-145
  C_08_53                                                               510   7e-145
  C_08_52N                                                              510   7e-145
  C_08_45                                                               510   7e-145
  C_08_37                                                               510   7e-145
  C_08_28                                                               510   7e-145
  C_08_25                                                               510   7e-145
  C_08_18                                                               510   7e-145
  C_08_17                                                               510   7e-145
  C_08_05                                                               510   7e-145
  C_08_02_08                                                            510   7e-145

<allele lists>

Lambda      K        H
    1.33    0.621     1.12

Gapped
Lambda      K        H
    1.28    0.460    0.850

Effective search space used: 4101954802

  Database: /shared/MiSeq/ABC_Exon23.fasta
    Posted date:  Jul 22, 2014  5:00 PM
  Number of letters in database: 4,346,452
  Number of sequences in database:  7,546

I want to get only the score information, in the form below.

BarcodeS14--S14_Cluster0_Phase0_NumReads236       C_08_94       510
BarcodeS14--S14_Cluster0_Phase0_NumReads236       C_08_90       510
BarcodeS14--S14_Cluster0_Phase0_NumReads236       C_08_75       510
..... <lists>

Almost all parsers need the alignment information in order to parse the data, but I need only the scores and want to reduce the running time.

Does anyone have a good idea?

Thank you!
