Question

merge multiple bam files for unique samples

0

10.2 years ago

ifudontmind_plzz ▴ 200

I want to merge multiple bam files for each unique sample. I have a list with the names of the bam files corresponding to each sample ID.

Name    ID
SC_R2   AD_1
SC_R3   AD_1
SC_C1   AD_2
SC_C2   AD_2
SC_C3   AD_2

alignment R • 4.3k views

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 10.2 years ago by ifudontmind_plzz ▴ 200

1

So what is the question? Please frame your question properly and describe what you have tried so far, so that you get good answers.

ADD REPLY • link 10.2 years ago by Sukhi Singh 11k

0

Sorry for not being clear.

I downloaded sequencing results (.bam files) for 27 samples from EGA. There are around 400 bam files in total, which have to be merged. I have a list with the names of the bam files in one column and the corresponding sample IDs in the other column. I would like to merge these bam files so that I end up with 27 bam files (one bam file per sample).

For example, in the table there are 5 bam files which correspond to 2 samples:

AD_1 > SC_R2 + SC_R3

AD_2 > SC_C1 + SC_C2 + SC_C3

Hope I conveyed the message.

ADD REPLY • link updated 2.6 years ago by Ram 45k • written 10.2 years ago by ifudontmind_plzz ▴ 200

1

10.2 years ago

Devon Ryan 105k

samtools merge AD_1.bam SC_R2.bam SC_R3.bam
samtools merge AD_2.bam SC_C1.bam SC_C2.bam SC_C3.bam

You could also use wild cards, most likely. Of course, it's likely that there's a bunch of information that you've not mentioned about why the two simple commands that I wrote won't suffice...
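As a rough illustration of the wildcard idea, the dry run below uses empty placeholder files (not real BAMs) in a scratch directory to show what the shell would expand a pattern to before samtools sees it. It assumes that all files of a sample share a prefix that no other sample uses, which may well not hold across the full set of roughly 400 files; `workdir` and `merge_cmd` are names made up for this sketch.

```shell
# Scratch directory with empty placeholder files, only to demonstrate glob expansion
workdir=$(mktemp -d)
touch "$workdir"/SC_C1.bam "$workdir"/SC_C2.bam "$workdir"/SC_C3.bam
touch "$workdir"/SC_R2.bam "$workdir"/SC_R3.bam

# echo prints the command instead of running it; the shell expands SC_C*.bam
merge_cmd=$(cd "$workdir" && echo samtools merge AD_2.bam SC_C*.bam)
echo "$merge_cmd"
```

Checking the echoed command before dropping the `echo` makes it easy to spot a pattern that matches files belonging to another sample.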

ADD COMMENT • link 10.2 years ago by Devon Ryan 105k

Ram · Accepted Answer · 2015-06-12

3

10.2 years ago

A. Domingues ★ 2.7k

Untested, and taking advantage of what Devon already said:

info="description.tsv" # tab-separated file with the correspondence filename/sample_ID

# get the unique sample IDs (NR > 1 skips the "Name ID" header; drop it if the file has no header)
sample_names=$(awk -F'\t' 'NR > 1 {print $2}' "$info" | sort -u)

for sample in $sample_names
do
   # exact match on column 2, so that AD_1 does not also pick up AD_10; remove the ".bam" if the suffix is already present
   bams=$(awk -F'\t' -v s="$sample" 'NR > 1 && $2 == s {print $1".bam"}' "$info" | tr "\n" " ")
   echo "alignments for $sample are $bams"
   echo samtools merge "$sample.bam" $bams # the first argument is the output file
done

# remove the echo in front of "samtools merge" if testing is successful.

A bit over-elaborate but should do the trick.
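The same mapping could also be turned into merge commands in a single awk pass. This is only a sketch: it assumes a tab-separated file with one header line, the bam name (without suffix) in column 1 and the sample ID in column 2, as in the table from the question. The function name `build_merge_commands` is invented here and is not part of samtools.

```shell
# Print one "samtools merge" command per sample ID without running anything.
# $1: tab-separated mapping file (header line, bam name in column 1, sample ID in column 2)
build_merge_commands() {
    awk -F'\t' '
        NR == 1 { next }                          # skip the header line
        !($2 in files) { order[++n] = $2 }        # remember sample IDs in first-seen order
        { files[$2] = files[$2] " " $1 ".bam" }   # append this bam file to its sample
        END {
            for (i = 1; i <= n; i++)
                print "samtools merge " order[i] ".bam" files[order[i]]
        }
    ' "$1"
}
```

Running `build_merge_commands description.tsv` prints the commands for inspection; once they look right, the output can be piped to `bash` to execute them.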

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 10.2 years ago by A. Domingues ★ 2.7k

0

Is this a bash script?

ADD REPLY • link 10.2 years ago by ifudontmind_plzz ▴ 200

2

Yes, it is.

ADD REPLY • link 10.2 years ago by Devon Ryan 105k

1

What Devon said :)

ADD REPLY • link 10.2 years ago by A. Domingues ★ 2.7k