Allow ambiguous sequence in bwa index or bwa mem to capture barcode information
6.7 years ago
yesitsjess • 0

Hi all!

How can I allow for ambiguous matching to a reference sequence? I have a 4-base barcode preceding my sequences which I need to preserve.

This is my pipeline:

bwa index amp.fa

samtools faidx amp.fa

bwa mem amp.fa file_R1.fastq file_R2.fastq > file.sam

samtools view -bS file.sam > file.bam

samtools sort file.bam > file.sorted.bam

samtools index file.sorted.bam

I then read the sorted BAM file using R with scanBam from Rsamtools and work with it there. Mostly just because I'm a lot more comfortable working with R.

The "amp.fa" file looks like this:

>amp

NNNNATGCATGCATGCATGCATGCATGCATGC

I'd hoped that the Ns would mean any reads aligning to "ATGCATGCATGCATGCATGCATGCATGC" would have the 4 preceding bases align to "NNNN", so I'd be able to see what they are.

Can anyone suggest an alternative way to do this? Or a tweak to allow the capture of any sequence preceding position 1 of the known sequence?

Many thanks in advance

sequencing alignment • 1.8k views
6.7 years ago

Move the barcode to the read name.
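
One way this could look in practice is a small preprocessing step run before bwa mem. The sketch below is only an illustration under several assumptions that are not stated in the thread: the barcode is the first 4 bases of R1 only, the FASTQ files are uncompressed with four lines per record, and read names follow the "name comment" layout rather than ending in /1 and /2. The function names (read_fastq, tag_name, move_barcode_to_name) are made up for this example. The barcode is trimmed from R1 and appended to the read name of both mates, since bwa mem expects paired reads to share a name.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line FASTQ stream."""
    while True:
        lines = [handle.readline().rstrip("\n") for _ in range(4)]
        if not lines[0]:  # end of file
            return
        yield tuple(lines)


def tag_name(header, barcode):
    """Append the barcode to the read name, leaving any header comment in place."""
    name, sep, comment = header.partition(" ")
    return f"{name}_{barcode}{sep}{comment}"


def move_barcode_to_name(r1_in, r2_in, r1_out, r2_out, barcode_len=4):
    """Trim the leading barcode from R1 and record it in the names of both mates.

    Assumption: the barcode sits at the start of R1 only, so R2 is left untrimmed.
    """
    for (h1, s1, p1, q1), (h2, s2, p2, q2) in zip(read_fastq(r1_in), read_fastq(r2_in)):
        barcode = s1[:barcode_len]
        # Trim sequence and quality together so they stay the same length.
        r1_out.write(f"{tag_name(h1, barcode)}\n{s1[barcode_len:]}\n{p1}\n{q1[barcode_len:]}\n")
        r2_out.write(f"{tag_name(h2, barcode)}\n{s2}\n{p2}\n{q2}\n")


# Tiny in-memory demonstration with made-up reads.
r1 = io.StringIO("@read1 1:N:0\nACGTATGCATGC\n+\nIIIIFFFFFFFF\n")
r2 = io.StringIO("@read1 2:N:0\nGCATGCAT\n+\nFFFFFFFF\n")
out1, out2 = io.StringIO(), io.StringIO()
move_barcode_to_name(r1, r2, out1, out2)
print(out1.getvalue(), end="")
print(out2.getvalue(), end="")
```

With real data, the four handles would be opened files (for example file_R1.fastq and file_R2.fastq for input, plus two new output files), and the outputs would be passed to bwa mem in place of the originals. The reference would then presumably no longer need the NNNN prefix. The barcode travels through the alignment as part of the read name, so it should be recoverable from the qname field returned by scanBam.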


Thanks - sorry, brain clearly not working
