Bwa Aln Output In .Bam
3
1
Entering edit mode
12.7 years ago
Dawnoflife ▴ 10

How can I use the bwa aln function and get the output in .bam format.

I know bwasw gives the output in .sam so I just used SAMTOOLS to do the conversion but don't really know how to do it in this case as the output is in .sai.

What I have been trying to do is this:

  1. Create the indexes for all the .FNA files.

  2. Align them using aln command.

  3. Take this output and convert it to .sam.

  4. The sam files are very large to store. Hence convert them to .bam.

  5. Data abstraction, like counting the ranges of values can't be performed on the binary (.bam) file, hence I converted that to a .txt file using SAMTOOLS.

  6. Perform the desired data abstraction on the txt and get the results.

The tricky part is the placement of the files as I have to do this for about 1500 files. My code looks like this:

.././bwa index NC_008765.fna

.././bwa aln NC_008765.fna ../QUERY/updated_635_25bp.fasta > out.sai

../bwa samse NC_008765.fna out.sai ../QUERY/updated_635_25bp.fasta > aln.sam


../samtools-0.1.18/samtools-0.1.18/samtools view -bS -F 4 aln.sam > output.bam


../samtools-0.1.18/samtools-0.1.18/samtools view output2.bam > seemee.txt

awk '{ print $1 }' seemee.txt > comns.txt

sed 's/.\?.$//;s/$/00/;s/^00$/0/' comns.txt | sort | uniq -c | sort -nr > final.txt

I need to do this for all the 1000+ input files and 6 different query files. Does the general steps make sense? Can be it optimized? the code has been running on for 2 hours now and its only about 1/10th of its way through. Thanks for your help!

bwa samtools bam • 21k views
ADD COMMENT
0
Entering edit mode

Dear dawnoflife, you do not need to convert to BAM to, in the end, convert it to a text file. SAM is already a text file, so do what you want to do with your seemee.txt file directly with your aln.sam file.

ADD REPLY
0
Entering edit mode

SAM files are too large. they take up 36 mb as compared to < 1 mb for text files. unless I can perform the -F 4 operation when receiving the sam file itself.

ADD REPLY
0
Entering edit mode

So just do not use "-bS -F 4" but "-S -F 4" , you're output will still be in SAM format (test), and then use it like your seemee.txt file

ADD REPLY
0
Entering edit mode

you could then convert to BAM when you're done for persistent storage.

ADD REPLY
0
Entering edit mode

Is there any way I can pipe the sam file to bam directly and do the data analysis in b/w. I don't want to store the sam file.

ADD REPLY
5
Entering edit mode
12.7 years ago
Vikas Bansal ★ 2.4k

Use ->

bwa samse <in.db.fasta> <in.sai> <in.fq> > <out.sam> for single end reads.

bwa sampe <in.db.fasta> <in1.sai> <in2.sai> <in1.fq> <in2.fq> > <out.sam> for paired end

You can use some other parameters which are described here.

ADD COMMENT
0
Entering edit mode

So we don't use the indexes created with bwa index -a

ADD REPLY
0
Entering edit mode

Yes, the first step is to create index. and then you have 2 options -> aln and bwasw, depending on the type of reads. Since you used aln, it will give you binary file and then you have to use samse or sampe to get sam file.

ADD REPLY
0
Entering edit mode

Thanks for the help!

ADD REPLY
0
Entering edit mode

Actually, if I am trying to convert this bam file back to txt, it is giving me problems. The code below works when I convert the bam files obtained from

 bwasw .

for file in *.bam;  do /samtools-0.1.18/samtools-0.1.18/samtools view $file > ../$file.txt; done

The error I get is:

 EOF marker is absent. The input is probably truncated .

Will really appreciate the help!

ADD REPLY
0
Entering edit mode

Edited the question, I am having troubles converting the bam to .txt. I need to get data from the columns and I was planning on using

awk for it. Thanks for your help!

ADD REPLY
0
Entering edit mode

Have a look at Tony's answer. When you are using samtools view, you have to tell it that your file is in sam format (because by default it take input as bam) by adding -S argument.

ADD REPLY
0
Entering edit mode

I did use the -bS flag when using SamTools. Also how do I put code in the comments? It doesn't seem to work.

ADD REPLY
0
Entering edit mode

Not -bS , just -S because -b would output in BAM !!! Please read the samtools documentation calmly :)

ADD REPLY
3
Entering edit mode
12.7 years ago
toni ★ 2.2k

'bwa aln' does not output in BAM format but in SAM which already is what you called a 'TXT' version of the BAM (binary format).

After that, [?]samtools[?] expects an input in BAM format so you have to specify the '-S' option to let it know that your input is in SAM format.

Synthesis :

bwa sampe <in.db.fasta> <in1.sai> <in2.sai> <in1.fq> <in2.fq> > <out.sam>

samtools view -S out.sam

You should be OK now.

ADD COMMENT
0
Entering edit mode

Is <in.db.fasta> the indexes created with

bwa index ? When i was just using by .fna file for this it did not accept the input. 

ADD REPLY
0
Entering edit mode

NO. As it is suggested by the name, it expects a FASTA file. This is the fasta sequence of your reference genome and this is already an input required by 'bwa index'.

ADD REPLY
0
Entering edit mode

<in.db.fasta> is your reference genome (eg hg18, hg19 in fasta format). The first step is to create index. Use the same file in bwa samse. It should be fine if all other index files are in same directory.

ADD REPLY
0
Entering edit mode

Can I just use .fna since its a FASTA file as well. I'll try it again but the .fna didn't work for me.

ADD REPLY
0
Entering edit mode

We cannot help you if you do not clarify what is not working.

ADD REPLY
0
Entering edit mode

Please see my edited question!

ADD REPLY
1
Entering edit mode
12.7 years ago
Swbarnes2 ★ 1.6k

Are you doing something other than this to make out.bam?

bwa index ref.fa; 
bwa aln ref.fa read1.fq > r1.sai; 
bwa aln ref.fa read2.fq > r2.sai; 
bwa sampe ref.fa r1.sai r2.sai read1.fa read2.fq | samtools view -bSho out.bam -;

samtools view -S will do nothing. It will take a .sam file as input and output another .sam file.

(If you have a large index, like human, there's a slightly different index command)

ADD COMMENT
0
Entering edit mode

yes true, I wrote this without thinking but a simple 'cat' will then be enough

ADD REPLY
0
Entering edit mode

Kindly see the edited question!

ADD REPLY
0
Entering edit mode

Have you tried something like:

samtools view out.bam | awk '{print $1}' | sed 's/.\?.$//;s/$/00/;s/^00$/0/' - | sort | uniq -c | sort -nr > final.txt

And wouldn't it be easier to make an altered fastq and realign rather than altering the .bams?

ADD REPLY
0
Entering edit mode

Have you tried something like: samtools view out.bam | awk '{print $1}' | sed 's/.\?.$//;s/$/00/;s/^00$/0/' | sort | uniq -c | sort -nr > final.txt And wouldn't it be easier to make an altered fastq and realign rather than altering the .bams?

ADD REPLY

Login before adding your answer.

Traffic: 1964 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6