How to obtain insert sequence from paired-end mapping result (SAM/BAM)
1
0
Entering edit mode
8.2 years ago
gundalav ▴ 380

I have paired-end reads R1 and R2.

|-----R1-----|----------------------------|-----R2-----|
       A                     B                  C

Total insert here is A + B + C. How can I get this sequence from the BAM/SAM output?

The reads are mapped using classical Bowtie2 parameter.

    bowtie2 -x myindex \
   -1 R1.fastq \
   -2 R2.fastq \
   --no-discordant \
   --no-mixed -p 4 \
    --very-sensitive  \
   --no-mixed \
    | samtools view -bS - > output bam
RNA-Seq bowtie2 alignment samtools • 2.9k views
ADD COMMENT
1
Entering edit mode

You can't really get that solely from a SAM/BAM file, since the sequence appropriate for B for a particular fragment is unknown. What is your actual biological goal?

ADD REPLY
0
Entering edit mode

If we know the mapping location of A and C, we can get A+B+C there right? With Picard we can get the A+B+C length using pathtopicard\CollectInsertSizeMetrics.jar I wonder if there is any existing tool to actually extract the fragment from the genome. At the end of the day want to translate A+B+C into protein and derive MHClass-I core from there.

ADD REPLY
0
Entering edit mode

It's unclear if you want the sequence of the fragment or the sequence of the reference genome covered by the fragment. For the latter, sure, that's easy enough with a small bit of scripting (I'd use pysam in python, but to each his/her own). For the former, see my earlier comment.

ADD REPLY
1
Entering edit mode
8.2 years ago
Fabio Marroni ★ 3.0k

Ok, if your insert strt at ChrXXX position Astart and end on the same ChrXXX at position Cend.

Then you create a bedfile and you call it yourbed.bed

ChrXXX Astart Cend

and then use

bedtools getfasta -fi yourfasta.fa -bed yourbed.bed

Done!

However, I re-emphasize what Devon just said: you do not know what the real sequence of B is. You are assuming that your sample has identical sequence to the reference.

ADD COMMENT

Login before adding your answer.

Traffic: 1877 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6