Question

Extracting FASTA sequences from aligned reads


7.9 years ago

amoltej ▴ 100

Hello everyone, I am working on a data set where I am mapping HiSeq reads to my reference sequence. I have three different samples, and the BAM files from all three show nucleotide differences. I would like to get these sequences out of the files so that I can align them and study them further. I tried to extract a FASTA file using bedtools, converting BED to FASTA, but I guess this only gives back the parts of the reference sequence that are covered by the mapping. I also tried extracting reads and then using Trinity to cluster them, but for some reason Trinity is avoiding the regions where there are sequence changes. Could you please suggest a way to extract sequences from the aligned reads without losing the sequence changes present in the reads? Thanks in advance.

BAM files to extract FASTA sequences from: https://drive.google.com/open?id=0B42oFSm-XOAUc05rOFZrQmhBNkk

RNA-Seq BAM fasta • 2.6k views

updated 7.9 years ago by h.mon 35k • written 7.9 years ago by amoltej ▴ 100


I tried extracting reads and then using trinity to cluster them

Once you extract the reads (samtools view on the region, then samtools fasta?) you will have them in FASTA format and can run a multiple sequence alignment with a program of your choice. Depending on how many SNPs/indels there are, you will likely need to edit the alignment by hand to reproduce the one seen in the NGS data.
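To make the conversion step concrete, here is a minimal Python sketch, not a replacement for samtools. It assumes the reads for the region have already been written out as SAM text (for example with samtools view), and the function name sam_to_fasta is made up for illustration. Each read's own bases are written out, so SNPs and indels carried by the reads are preserved. Sequences are kept as stored in the SAM record, i.e. in reference-strand orientation, which is usually convenient for a multiple sequence alignment.

```python
def sam_to_fasta(sam_lines):
    """Turn SAM alignment lines into FASTA text, keeping each read's own bases.

    Hypothetical helper for illustration; expects tab-separated SAM records.
    """
    records = []
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        # Skip unmapped reads (flag 0x4) and records with no stored sequence.
        if flag & 0x4 or seq == "*":
            continue
        # Keep mates distinguishable: 0x40 = first in pair, 0x80 = second.
        if flag & 0x40:
            name += "/1"
        elif flag & 0x80:
            name += "/2"
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + ("\n" if records else "")


if __name__ == "__main__":
    # Tiny made-up example; real input would come from a SAM file.
    example = [
        "@HD\tVN:1.6",
        "read1\t0\tref\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(sam_to_fasta(example), end="")
```

The resulting FASTA can then be passed to whichever multiple sequence alignment program you prefer.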

ADD REPLY • link 7.9 years ago by GenoMax 152k

score 0 · Answer 1 · 2017-09-04


7.9 years ago

h.mon 35k

Open all three BAM files in a genome browser. IGV, for one, can open several BAM files simultaneously; once you zoom in far enough, you will be able to see the sequences of the reads (and thus the SNPs) mapping to the particular location you are interested in.
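If you want a rough, scriptable version of what the browser shows, the sketch below counts the bases observed at each reference position from SAM text. It is only an illustration under stated assumptions: the input is SAM lines (for example from samtools view), the names base_counts and variable_sites are hypothetical, and no quality filtering is done. Running it once per sample and comparing the results would point to the positions where the samples differ.

```python
import re
from collections import Counter, defaultdict

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def base_counts(sam_lines):
    """Return {1-based reference position: Counter(base -> count)}."""
    counts = defaultdict(Counter)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header
        f = line.rstrip("\n").split("\t")
        flag, pos, cigar, seq = int(f[1]), int(f[3]), f[5], f[9]
        if flag & 0x4 or cigar == "*" or seq == "*":
            continue  # unmapped, or nothing to project onto the reference
        ref_pos, read_pos = pos, 0
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op in "M=X":  # aligned bases: consume read and reference
                for i in range(length):
                    counts[ref_pos + i][seq[read_pos + i]] += 1
                ref_pos += length
                read_pos += length
            elif op in "IS":  # insertion / soft clip: consume read only
                read_pos += length
            elif op == "D":  # deletion: record a gap at each reference position
                for i in range(length):
                    counts[ref_pos + i]["-"] += 1
                ref_pos += length
            elif op == "N":  # skipped region (e.g. intron): reference only
                ref_pos += length
            # H and P consume neither read nor reference
    return counts


def variable_sites(counts):
    """Positions where more than one distinct base (or gap) was observed."""
    return sorted(p for p, c in counts.items() if len(c) > 1)
```

This does not replace looking at the reads in IGV, but it gives a list of candidate positions to zoom in on.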
