get a bed file from a blast result
1
0
Entering edit mode
9.7 years ago
mgadrianam ▴ 30

Hi everyone

I want to obtain a bed file from a blast result to as annotation to use in a bedtool and use multicov in order to obtain the count mapping of multiples bam files. I really appeciate you help

Cordially

Adriana

blast alignment rna-seq • 5.3k views
ADD COMMENT
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Share some of your blast output and I'm sure people here will be able to help.

ADD REPLY
0
Entering edit mode

Thanks you very much for your answers.

This is one of the blast output that I need to convert in bed file. I am grateful for your help

Sincerely
Adriana

hsa-miR-155-5p    23    hsa-miR-155-5p    21    100.00    23    0    0    1    23    25573983    25574005    6e-06    46.1
hsa-miR-3118      23    hsa-miR-3118    21    100.00    23    0    0    1    23    13644806    13644784    6e-06    46.1
hsa-miR-3197      23    hsa-miR-3197    21    100.00    23    0    0    1    23    41167564    41167586    6e-06    46.1
hsa-miR-3687      24    hsa-miR-3687    21    100.00    24    0    0    1    24    8208878    8208901    2e-06    48.1
hsa-miR-3687      24    hsa-miR-3687    21    100.00    24    0    0    1    24    8253083    8253106    2e-06    48.1
hsa-miR-3687      24    hsa-miR-3687    21    100.00    24    0    0    1    24    8391917    8391940    2e-06    48.1
hsa-miR-3687      24    hsa-miR-3687    21    100.00    24    0    0    1    24    8987404    8987427    2e-06    48.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    14178626    14178650    4e-07    50.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    14914490    14914466    4e-07    50.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    15720006    15720030    4e-07    50.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    18791088    18791064    4e-07    50.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    23841611    23841635    4e-07    50.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    33051135    33051111    4e-07    50.1
hsa-miR-548aa    25    hsa-miR-548aa    21    100.00    25    0    0    1    25    37123404    37123428    4e-07    50.1
ADD REPLY
1
Entering edit mode
9.7 years ago
mxs ▴ 530

Hi,

So you got the alignment. A classic tabular format with fields:

Query_id Subject_id, %_identity, aln_length, mismatches, gap_open, q.start, q.end, s.start, s.end, e-value, bit_score

Now a typical bed file usually contains the following columns:

chr chr_start chr_end label_name score strand thick_start thick_end item_rgb

I am going to assume that you only need the first four, right? I am also going to assume that everything is being mapped on chr1 since I don't have this information. so in order create such a bed file all you need to do is run the following command under unix :

perl -lne '/^(.*?)\t.*?\t(\d+)\t(\d+)\t([^\t]*)\t([^\t]*)$/; print "chr1\t$2\t$3\t$1"' blast.in > bed.out

cheers
mxs

UPDATE:

oh sorry + and - strands (do you need that information too ??)

perl -lne '/^(.*?)\t.*?\t(\d+)\t(\d+)\t([^\t]*)\t([^\t]*)$/; print "chr1\t". ($2>$3?$3:$2) . "\t".($2>$3?$2:$3)."\t$1"' blast.in > bed.out
ADD COMMENT
0
Entering edit mode

Hi,

Thanks a lot!

I run the line that you sent me

perl -lne '/^(.*?)\t.*?\t(\d+)\t(\d+)\t([^\t]*)\t([^\t]*)$/; print "chr21\t$2\t$3\t$1"'  blast.in > bed.out

and they give out a bed file:

chr21    25573983    25574005    hsa-miR-155-5p
chr21    13841710    13841688    hsa-miR-574-5p
chr21    35720732    35720754    hsa-miR-802
chr21    15693223    15693245    hsa-miR-548g-5p
chr21    18686157    18686135    hsa-miR-548g-5p
chr21    21601503    21601481    hsa-miR-548g-5p
chr21    24836101    24836123    hsa-miR-548g-5p
chr21    26470188    26470210    hsa-miR-548g-5p
chr21    35552367    35552389    hsa-miR-548g-5p
chr21    35552428    35552406    hsa-miR-548g-5p
chr21    45760015    45760037    hsa-miR-548g-5p
chr21    17891745    17891723    hsa-miR-548h-3p
chr21    19650252    19650274    hsa-miR-548h-3p
chr21    24946891    24946913    hsa-miR-548h-3p

Then I use that bed file to use a multicov in bedtools in order to know the count of alignments from multiple position-sorted and indexed BAM files that overlap, and give me a message like this

[mgadrianam@deneb mapping]$ bedtools multicov -bams
SRR1054203.segemehl.sorted.bam SRR1054204.segemehl.sam.bam.sorted.bam
SRR1054205.segemehl.sam.bam.sorted.bam
SRR1054206.segemehl.sam.bam.sorted.bam
SRR1054207.segemehl.sam.bam.sorted.bam -bed
../genome/chromosome.21_hsa_mature.bed.out
chr21   25573983        25574005        hsa-miR-155-5p  0       0
0       0       0
Error: malformed BED entry at line 2. Start was greater than end. Exiting.

Could you please help me to fix the output, thanks you very much,

Sincerely,
Adriana

ADD REPLY

Login before adding your answer.

Traffic: 1610 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6