Subtracting one BAM [control] from another BAM [sample].
2
0
Entering edit mode
7.2 years ago
anu014 ▴ 190

Hello Biostars!

Today I am trying to find a general approach that can be used on BAMs to normalise among each other. More specifically, I am trying to subtract the read alignment of control from sample alignment. Is there any tool that can do that?

Please help me out.. Thank you :)

next-gen Assembly ChIP-Seq • 4.7k views
ADD COMMENT
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Yeah. But, won't BamUtil will give either all unique or common alignments? I want subtraction such as if in sample read alignment count is 12 n in control it's 2, resultant BAM should contain 10 read for that specific position.

ADD REPLY
0
Entering edit mode
7.2 years ago
Sej Modha 5.3k

bamCompare from deepTools can help.

ADD COMMENT
0
Entering edit mode

That's a nice tool. But I want result in BAM format. Is that possible in bamCompare?

ADD REPLY
1
Entering edit mode

Two possible output formats for bamCompare are bedgraph and bigwig.

ADD REPLY
0
Entering edit mode
7.2 years ago

merge you control and case using picard:

java -jar  picard.jar MergeSamFiles I=S1.bam I=S2.bam O=merged.bam

using samjdk http://lindenb.github.io/jvarkit/SamJdk.html and the following script:

 private final List<SAMRecord> buffer= new ArrayList<>();

private int compareRead(final SAMRecord R1,final SAMRecord R2) {
    int i=R1.getContig().compareTo(R2.getContig());
    if(i!=0) return i;
    return R1.getStart() - R2.getStart();
    }

@Override
 public Object apply(final SAMRecord record) {
 if(record.getReadUnmappedFlag()) return false;
 if(buffer.isEmpty() || compareRead(buffer.get(0),record)==0)
    {
    buffer.add(record);
    return null;
    }
 else
    {
    List<SAMRecord> copy= new ArrayList<>();
    if(!buffer.isEmpty())
        {
        long nControls = buffer.stream().filter(R->R.getReadGroup().getSample().equals("control")).count();

        buffer.removeIf(R->R.getReadGroup().getSample().equals("control")); //remove all control
        while(nControls>0 && buffer.size() > nControls)
            {
            buffer.remove(0);
            nControls--;
            }
         copy.addAll(buffer);
         buffer.clear();
        }
      buffer.add(record);
      return copy;
     }
}

usage:

 java -jar dist/samjdk.jar --body -f snippet.txt merged.bam

note:

1) the read are the sam if they're mapped on same chrom/start

2) the last mapped read will be lost.

EDIT: fixed/updated the script

ADD COMMENT
0
Entering edit mode

Hi Pierre, what does it do if control reads are higher than the sample/case reads for a position ?

ADD REPLY
1
Entering edit mode

what does it do if control reads are higher than the sample/case reads for a position ?

nothing is printed

ADD REPLY
0
Entering edit mode

Hi Pierre, Okay, using MergeSamFiles I'll be merging the BAMs . But what's the concept of 'samjdk' ? As commented yesterday, I want subtraction such as if in sample read alignment count is 12 n in control it's 2, resultant BAM should contain 10 read for that specific position. My problem sounds more like normalisation of Sample with Control..

ADD REPLY
0
Entering edit mode

I want subtraction such as if in sample read alignment count is 12 n in control it's 2, resultant BAM should contain 10 read for that specific position.

and that's what is doing this small code.

ADD REPLY
0
Entering edit mode

Cool. I tried like this:

java -jar picard-tools-2.5.0/picard.jar MergeSamFiles I=LshKO_H3K4me1_sorted.bam I=Input_WT_sorted.bam O=merged.bam
java -jar /home/softwares/jvarkit/dist/samjdk.jar --body -f snippet.txt merged.bam
Note : I saved the code given by you as "snippet.txt"

But I got error in step-2 :

ADD REPLY
0
Entering edit mode
[DEBUG][SamJdk] Compiling :
         1  import java.util.*;
         2  import java.util.stream.*;
         3  import java.util.function.*;
         4  import htsjdk.samtools.*;
         5  import htsjdk.samtools.util.*;
         6  import javax.annotation.Generated;
         7  @Generated(value="SamJdk",date="2017-10-17T14:34:06+0530")
         8  public class SamJdkCustom1427454017 extends com.github.lindenb.jvarkit.tools.samjs.SamJdk.AbstractFilter {
         9    public SamJdkCustom1427454017(final SAMFileHeader header) {
        10    super(header);
        11    }
        12     //user's code starts here
        13  private final List<SAMRecord> buffer= new ArrayList<>();
        14  
        15  private int compareRead(final SAMRecord R1,final SAMRecord R2) {
        16      int i=R1.getContig().compareTo(R2.getContig());
        17      if(i!=0) return i;
        18      return R1.getStart() - R2.getStart();
        19      }
        20  
        21  @Override
        22   public Object apply(final SAMRecord record) {
        23   if(record.getReadUnmappedFlag()) return false;
        24   if(buffer.isEmpty() || compareRead(buffer.get(0),record)==0)
        25      {
        26      buffer.add(record);
        27      return null;
        28      }
        29   else
        30      {
        31      List<SAMRecord> copy= new ArrayList<>();
        32      if(!buffer.isEmpty())
        33          {
        34          long nControls = buffer.stream().filter(R->R.getReadGroup().getSample().equals("control")).count();
        35  
        36          buffer.removeIf(R->R.getReadGroup().getSample().equals("control")); //remove all control
        37          while(nControls>0 && buffer.size() > nControls)
        38              {
        39              buffer.remove(0);
        40              nControls--;
        41              }
        42           copy.addAll(buffer);
        43           buffer.clear();
        44          }
        45        buffer.add(record);
        46        return copy;
        47       }
        48  }
        49  
        50     // user's code ends here
        51  }
[SEVERE][SamJdk]null
java.lang.NullPointerException
    at SamJdkCustom1427454017.lambda$apply$0(SamJdkCustom1427454017.java:34)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438)
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396)
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526)
    at SamJdkCustom1427454017.apply(SamJdkCustom1427454017.java:34)
    at com.github.lindenb.jvarkit.tools.samjs.SamJdk.doWork(SamJdk.java:399)
    at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:1156)
    at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:1314)
    at com.github.lindenb.jvarkit.tools.samjs.SamJdk.main(SamJdk.java:483)
@HD VN:1.5  GO:none SO:coordinate
@SQ SN:chr1 LN:195471971
.
.
@SQ SN:chrUn_JH584304   LN:114452
@PG ID:bowtie2  PN:bowtie2  VN:2.2.6    CL:"/usr/bin/bowtie2-align-s --wrapper basic-0 --local --phred33 -p 30 -x /reference_genomes/mm10_ucsc/mm10 -S /raw/bowtie/LshKO_H3K4me1.sam -U /raw/truncated_files/H3K4me1/LshKO-H3K4-me1_L001_R1_001_truncated.fastq,/raw/truncated_files/H3K4me1/LshKO-H3K4-me1_L002_R1_001_truncated.fastq,/raw/truncated_files/H3K4me1/LshKO-H3K4-me1_L003_R1_001_truncated.fastq,/raw/truncated_files/H3K4me1/LshKO-H3K4-me1_L004_R1_001_truncated.fastq"
@PG ID:bowtie2.1    PN:bowtie2  VN:2.2.6    CL:"/usr/bin/bowtie2-align-s --wrapper basic-0 --local --phred33 -p 30 -x /reference_genomes/mm10_ucsc/mm10 -S /raw/bowtie/Input_WT.sam -U /raw/truncated_files/Input_WT/WT-Input_L001_R1_001_truncated.fastq,/raw/truncated_files/Input_WT/WT-Input_L002_R1_001_truncated.fastq,/raw/truncated_files/Input_WT/WT-Input_L003_R1_001_truncated.fastq,/raw/truncated_files/Input_WT/WT-Input_L004_R1_001_truncated.fastq"
[INFO][Launcher]samjdk Exited with failure (-1)
ADD REPLY
1
Entering edit mode

your two bams are missing a RG tag with a sample name to tell which 'rea'd is control, which read is 'case'.

See picard AddOrReplaceReadGroups or --rg-id <text> in http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml

ADD REPLY
1
Entering edit mode

add one sample must be named "control" to fit with 'snippet.txt'

ADD REPLY
0
Entering edit mode

Hi Pierre, I finally could run all the commands successfully. I tried to see the coverage for all the files (Input, LshKO_without_input & normalised bams). So I used 'bedtools genomecov' & got a position like this:

chrUn_JH584304  114445  114446  7  #norm.bdg
chrUn_JH584304  114445  114446  23  #input.bdg
chrUn_JH584304  114445  114446  13 #LshKO_H3K4me1_without_input.bdg

According to my perception it should be '0' in norm.bdg. Am I missing something?

This was my command to generate Bedgraph files:

bedtools genomecov -ibam /BAMs/Input_WT_sorted.bam -bga > Input.bdg
ADD REPLY
1
Entering edit mode

your looking at the coverage, not a the 'same' reads starting at a givent position.

ADD REPLY
0
Entering edit mode

Okay.. then how to to compare all LshKO_without_input, Input & normalised files?

ADD REPLY

Login before adding your answer.

Traffic: 2684 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6