Multiple sam to bam followed by raw read count
7.6 years ago
Bioinfonext ▴ 470

I have multiple SAM files like this:

SAM file location:

/data/SNU_work/Analysis/mapped/mapped

samtools location:

/home/yog/software/samtools-1.3.1/samtools

218_9W_Pa2.sam

218_9W_Pa1.sam

216_7W_Co1.sam

216_7W_Ca2.sam

216_7W_Ca1.sam

I am converting them one by one using the command below:

/home/yog/software/samtools-1.3.1/samtools view -b 218_9W_Pa2.sam > 218_9W_Pa2.bam

After that, I extract the mapped reads from the BAM file and sort the result using the commands below:

/home/yog/software/samtools-1.3.1/samtools view -b -F4 218_9W_Pa2.bam > 218_9W_Pa2.mapped.bam

/home/yog/software/samtools-1.3.1/samtools sort 218_9W_Pa2.mapped.bam -o 218_9W_Pa2_mapped_sort.bam

After sorting, I also want to index each BAM file and then get the raw read counts:

For indexing: /home/yog/software/samtools-1.3.1/samtools index 218_9W_Pa2_mapped_sort.bam

For read count: /home/yog/software/samtools-1.3.1/samtools idxstats 218_9W_Pa2_mapped_sort.bam > readcount_for_each_bam

Could you please suggest how I can run all of these steps for all SAM files in a single command or script?

Thanks

RNA-Seq • 5.5k views
7.6 years ago

You can add all your flags into a single command.

In addition, the latest samtools does not even need the -S flag as it detects the input type automatically.

samtools view -F 4 -b data.sam > data.bam

To run this command on all SAM files, you could automate it with:

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -F 4 -b {} > {}.bam'

Annoyingly, this produces a double extension (.sam.bam), so you would also need to apply a batch rename (you can find many examples of that on StackOverflow).

A more elegant solution (and the recommended practice) would be to use GNU Parallel:

ls *.sam | parallel 'samtools view -F 4 -b {} > {.}.bam'

I removed the -S flag from the command, as you suggested.


I used this script and was able to extract the mapped reads from the SAM files into BAM format.

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -F 4 -b {} > {}.bam'

Now, can you please suggest how to sort and index all of these BAM files?


Replace the text within the single quotes with the command you wish to execute.

In general, when someone helps you with advice, you need to make a concerted effort to understand what it consists of and how it works; that way it is easy to generalize and apply it to a different situation.
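As an illustration of that substitution, the quoted part of the xargs command from the answer could be swapped for a sort followed by an index. This is only a sketch: the wrapper name sort_and_index_with_xargs and the .sorted.bam suffix are made up for the example, and it assumes samtools is on the PATH and that the function is called from the directory holding the BAM files.

```shell
# Hypothetical wrapper; call it from the directory that holds the BAM files.
sort_and_index_with_xargs() {
    # Same xargs pattern as in the answer; only the text inside the single
    # quotes has changed. Each input file name replaces every {} in the command.
    ls *.bam | xargs -I {} sh -c 'samtools sort {} -o {}.sorted.bam && samtools index {}.sorted.bam'
}
```

As with the original one-liner, the output names keep the input extension (for example x.bam.sorted.bam), and file names containing spaces or quotes would not survive the {} substitution.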


Conceptually, a for loop is quite easy:

for f in *.bam
do
    samtools index "$f"
done
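Building on the same loop idea, the whole chain from the question (keep mapped reads, sort, index, idxstats) could be run once per SAM file. The following is a sketch rather than a tested pipeline: the names SAMTOOLS and process_all_sam and the _idxstats.txt suffix are assumptions for illustration, while the samtools subcommands and flags are the ones already used in the question.

```shell
# Illustrative names: SAMTOOLS, process_all_sam and the output suffixes are assumptions.
# Override SAMTOOLS to point at a specific binary,
# e.g. /home/yog/software/samtools-1.3.1/samtools
SAMTOOLS="${SAMTOOLS:-samtools}"

process_all_sam() {
    local dir="${1:-.}" sam base
    for sam in "$dir"/*.sam; do
        [ -e "$sam" ] || continue          # the glob matched nothing
        base="${sam%.sam}"                 # 218_9W_Pa2.sam -> 218_9W_Pa2
        # Keep mapped reads only (-F 4) as BAM, then sort, index and count.
        "$SAMTOOLS" view -b -F 4 "$sam" > "${base}.mapped.bam" &&
        "$SAMTOOLS" sort "${base}.mapped.bam" -o "${base}_mapped_sort.bam" &&
        "$SAMTOOLS" index "${base}_mapped_sort.bam" &&
        "$SAMTOOLS" idxstats "${base}_mapped_sort.bam" > "${base}_idxstats.txt" ||
        echo "failed on $sam" >&2          # report and move on to the next file
    done
}
```

Calling process_all_sam /data/SNU_work/Analysis/mapped/mapped would then leave one sorted, indexed BAM and one idxstats table per sample, instead of redirecting every sample into a single readcount_for_each_bam file.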

To avoid the .sam.bam double extension, you can strip the suffix with basename:

for j in *.sam ; do basename "$j" .sam ; done | sed ':a;N;$!ba;s/\n/ /g'

Here sed just replaces the newlines with spaces, so the names come out on one line.
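The suffix can also be stripped inside the loop with the shell's own suffix removal, which skips the sed step entirely. A minimal sketch, assuming samtools is on the PATH; the function names bam_name_for and convert_all_sam are invented for this example:

```shell
# Hypothetical helper: map a SAM file name to a BAM name without the double extension.
bam_name_for() {
    # ${1%.sam} removes a trailing ".sam" from the first argument (POSIX shell)
    printf '%s\n' "${1%.sam}.bam"
}

# Hypothetical wrapper: convert every SAM file in the current directory,
# keeping mapped reads only, as in the answer's samtools view command.
convert_all_sam() {
    local j
    for j in *.sam; do
        [ -e "$j" ] || continue            # the glob matched nothing
        samtools view -F 4 -b "$j" > "$(bam_name_for "$j")"
    done
}
```

For example, bam_name_for 218_9W_Pa2.sam prints 218_9W_Pa2.bam.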

7.6 years ago
badribio ▴ 290

Look here, this may help: Using Samtools On Many Files Recursively In One Go


I would not recommend this post as it references an old version of samtools and the commands listed might not work anymore.


Thank you for the correction! I should have mentioned the version difference.
