Merge multiple BAM files for unique samples
9.5 years ago

I want to merge multiple BAM files for each unique sample. I have a list with the names of the BAM files and the sample ID each one corresponds to.

Name    ID
SC_R2   AD_1
SC_R3   AD_1
SC_C1   AD_2
SC_C2   AD_2
SC_C3   AD_2
alignment R • 3.9k views

So what is the question? Please frame your question properly and show what you have tried so far, so that you get good answers.


Sorry for not being clear.

I downloaded sequencing results (.bam files) for 27 samples from EGA. There are around 400 BAM files in total, which have to be merged. I have a list with the names of the BAM files in one column and the corresponding sample ID in the other column. I would like to merge these BAM files so that I end up with 27 BAM files (one per sample).

For example, in the table above there are 5 BAM files, which correspond to 2 samples:

AD_1 > SC_R2 + SC_R3

AD_2 > SC_C1 + SC_C2 + SC_C3

Hope I conveyed the message.

9.5 years ago
A. Domingues ★ 2.7k

Untested, and taking advantage of what Devon already said:

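info="description.tsv" # tab-separated file mapping BAM file name (column 1) to sample ID (column 2)

# get the unique sample IDs; tail skips the "Name ID" header line (drop it if the file has no header)
sample_names=$(tail -n +2 "$info" | cut -f2 | sort -u)

for sample in $sample_names
do
   # exact match on the ID column, so that AD_1 does not also pick up AD_10, AD_11, ...
   bams=$(awk -F'\t' -v s="$sample" '$2 == s {print $1".bam"}' "$info" | tr "\n" " ") # print $1 alone if the names already carry the .bam suffix
   echo "alignments for $sample are $bams"
   echo samtools merge "$sample.bam" $bams # first argument is the merged output file
done

# remove the echo from the last line of the loop once testing is successful.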

A bit over-elaborate but should do the trick.
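
The same grouping can also be done in a single pass with awk. The following is only a sketch under a few assumptions: the table is whitespace-separated with one header line, the file names in column 1 lack the `.bam` suffix, and `build_merge_commands` is a made-up helper name. It only prints the commands (a dry run), so nothing is merged until the output is reviewed.

```shell
#!/usr/bin/env bash
# Print one "samtools merge" command per sample ID (dry run; nothing is executed).
# Assumes a whitespace-separated table with a header line: Name  ID
build_merge_commands() {
    local info="$1"
    awk 'NR > 1 && NF >= 2 {
             if (!($2 in files)) order[++n] = $2   # remember IDs in first-seen order
             files[$2] = files[$2] " " $1 ".bam"   # append this BAM to its sample
         }
         END {
             for (i = 1; i <= n; i++)
                 print "samtools merge " order[i] ".bam" files[order[i]]
         }' "$info"
}

# Example input: the table from the question, written to a temporary file.
info_file=$(mktemp)
printf 'Name\tID\nSC_R2\tAD_1\nSC_R3\tAD_1\nSC_C1\tAD_2\nSC_C2\tAD_2\nSC_C3\tAD_2\n' > "$info_file"

build_merge_commands "$info_file"
```

For the example table this prints the two commands from Devon's answer. Once the printed commands look right, piping the output into `bash` would run them.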


Is this a bash script?


Yes, it is.


What Devon said :)

9.5 years ago
samtools merge AD_1.bam SC_R2.bam SC_R3.bam
samtools merge AD_2.bam SC_C1.bam SC_C2.bam SC_C3.bam

You could also use wildcards, most likely. Of course, it's likely that there's a bunch of information that you've not mentioned about why the two simple commands that I wrote won't suffice...
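
To illustrate the wildcard idea, here is a dry-run sketch. It assumes that all files of a sample share a name pattern (`SC_R*` for AD_1, `SC_C*` for AD_2), which holds for the five example files but may not hold across all 400. The empty placeholder files and the temporary directory exist only for the demonstration; `echo` shows what the shell would pass to samtools without running it.

```shell
#!/usr/bin/env bash
# Dry run: show how shell wildcards expand into samtools merge commands.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch SC_R2.bam SC_R3.bam SC_C1.bam SC_C2.bam SC_C3.bam   # empty placeholder files

# echo prints the expanded command instead of running samtools
ad1_cmd=$(echo samtools merge AD_1.bam SC_R*.bam)
ad2_cmd=$(echo samtools merge AD_2.bam SC_C*.bam)
printf '%s\n' "$ad1_cmd" "$ad2_cmd"
```

Checking the expansion with `echo` first is worthwhile, because a pattern that is too broad would silently pull files from another sample into the merge.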
