Is it possible to retrieve the original reads from a BAM file?
5.2 years ago
elcortegano ▴ 200

For testing purposes, I'm interested in working with a small number of reads that align well to a given (also small) genomic region, so that computation time and memory requirements are as low as possible.

This could be done (I guess) if from the alignment BAM file one could associate the aligned reads with specific reads in the original FASTQ file.

I don't know if this is even possible. So far, I've found nothing.

next-gen bam fastq • 5.1k views

Yes, reads can be extracted using e.g. samtools fastq. For paired-end data it is recommended to randomize the BAM before doing that, as many alignment tools expect the FASTQ reads in random order. This can be done with samtools collate or samtools sort -n, followed by samtools fastq. There are many other threads on this topic; please use the search function.
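
A minimal sketch of the collate-then-convert route, assuming a paired-end BAM and samtools 1.x on the PATH. The function name bam_to_fastq_pairs and all file names are placeholders for illustration, not something taken from this thread.

```shell
# Sketch: make mates adjacent, then write paired FASTQ files.
# Usage: bam_to_fastq_pairs INPUT.bam OUTPUT_PREFIX   (hypothetical helper)
bam_to_fastq_pairs() {
    # collate -u -O: write uncompressed BAM to stdout with mates grouped together
    samtools collate -u -O "$1" |
        # -1/-2: read 1 / read 2 output; -0 and -s discard other and singleton reads;
        # -n: keep read names as they are (no /1 and /2 suffixes)
        samtools fastq -1 "$2_R1.fq.gz" -2 "$2_R2.fq.gz" \
                       -0 /dev/null -s /dev/null -n -
}

# Example call (assumed file names):
#   bam_to_fastq_pairs input.bam sample
```

Note that reads that were hard-clipped in the BAM cannot be restored to full length this way, since the clipped bases are not stored in the record.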


Two questions:

  1. What will samtools fastq do with hard-clipped reads/secondary alignments? Output shortened reads? I assume some kind of filtering should be applied before.

  2. Do you have any reference for the random fastq order requirement? I have not heard of this before. That would be good to know for certain applications...

Edit: formatting


Yes. You can filter your BAM file to separate aligned reads and then convert them back to fastq.

You can first filter your BAM with samtools view, passing a region, to keep only the reads aligned to the region you need.

Then use reformat.sh from BBMap suite to retrieve reads:

reformat.sh in=your.bam out1=R1.fq.gz out2=R2.fq.gz mappedonly=t pairedonly=t primaryonly=t

If you have single end data then just use out=read.fq.gz instead.
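
The region-filtering step could look roughly like the sketch below. It assumes a coordinate-sorted BAM and samtools on the PATH; the function name extract_region, the file names, and the region string are all made up for illustration.

```shell
# Sketch: keep only alignments overlapping one region.
# Usage: extract_region INPUT.bam REGION OUTPUT.bam   (hypothetical helper)
extract_region() {
    # Region queries need an index; this (re)builds it next to the input BAM
    samtools index "$1"
    # -b: write BAM; the region uses the usual chrom:start-end syntax
    samtools view -b -o "$3" "$1" "$2"
}

# Example call (assumed names and coordinates):
#   extract_region your.bam chr1:100000-200000 region.bam
```

The resulting region.bam can then be passed to the conversion command in place of your.bam. Keep in mind that a mate aligned outside the region is not included in the subset, so some pairs may be incomplete.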


So you would like to convert bam to fastq? Have you googled for that?


Not with those words, my mistake (I'm not a native speaker). Now I've found a way with bedtools, thank you.


"I'm not native speaker"

Don't worry. Most here aren't. Neither WouterDeCoster nor me are.


Were you able to retrieve your original reads from the BAM files? Can you share the process and commands? I really need help with this: I mixed up my data and need to retrieve the raw reads from BAM files.


There are multiple suggestions on how to do this in this thread. I suggest you pick one and try it.

5.2 years ago
elcortegano ▴ 200

OK, I found a way to do it. It is described at the following link: https://seqome.com/convert-bam-file-fastq/.

It describes how bedtools bamtofastq can handle the task. It only requires a sorted BAM file as input, e.g.:

bedtools bamtofastq -i input.bam -fq output.fq -fq2

That should be a name-sorted BAM if the data are paired-end.
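
For paired-end data, the name-sort followed by bedtools bamtofastq might be sketched as follows, assuming samtools and bedtools are on the PATH. The function name and the output file names are hypothetical.

```shell
# Sketch: name-sort so mates are adjacent, then split into two FASTQ files.
# Usage: bam_to_fastq_bedtools INPUT.bam OUTPUT_PREFIX   (hypothetical helper)
bam_to_fastq_bedtools() {
    # sort -n: order records by read name instead of by coordinate
    samtools sort -n -o "$2.namesorted.bam" "$1"
    # -fq receives the first mate of each pair, -fq2 the second mate
    bedtools bamtofastq -i "$2.namesorted.bam" -fq "$2_R1.fq" -fq2 "$2_R2.fq"
}

# Example call (assumed file names):
#   bam_to_fastq_bedtools input.bam sample
```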

5.2 years ago
ctseto ▴ 310

You can pipe them out using samtools and the SAM flags 64 and 128 for read 1 and read 2, and fine-tune what gets extracted with more elaborate combinations of the SAM flags.
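
As a rough illustration of the flag-based approach, the sketch below tests the two mate bits with shell arithmetic and uses them in samtools view. The helper names is_read1, is_read2, and split_mates are invented here, and excluding secondary and supplementary alignments with -F 0x900 is one possible choice of extra filter, not something prescribed in this answer.

```shell
# Bit 64 (0x40) marks the first read in a pair, bit 128 (0x80) the second.
is_read1() { [ $(( $1 & 64 )) -ne 0 ]; }
is_read2() { [ $(( $1 & 128 )) -ne 0 ]; }

# Sketch: write read 1 and read 2 alignments to separate BAM files.
# Usage: split_mates INPUT.bam OUTPUT_PREFIX   (hypothetical helper)
split_mates() {
    # -f keeps records with the bit set; -F 0x900 drops secondary (256)
    # and supplementary (2048) alignments so each read appears only once
    samtools view -b -f 64  -F 0x900 -o "$2.read1.bam" "$1"
    samtools view -b -f 128 -F 0x900 -o "$2.read2.bam" "$1"
}

# Example: a record with flag 99 (1 + 2 + 32 + 64) is a first-in-pair read
#   is_read1 99 && echo "read 1"
```

Each resulting BAM can then be converted with samtools fastq or a similar tool.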
