Hello,
I have multiple .bam files from aligning reads via HISAT2. I would like to break up my bam file into single chromosome files utilizing R and then reading them into R. Question: I saw a couple ways people did this and, being new to bioinformatics/R, am having trouble. Could anyone give me some advice on how to do this? Thank you in advance!
A solution in R, using GenomicAlignment/RSamtools. Requires an indexed bam file and returns a GRanges object containing all reads on that respective chromosome. This version also assumes single-end data. If you have paired-end reads, replace readGAlignments() by readGAlignmentPairs():
# Assumes that the bam is indexed!
require(GenomicAlignments)
ExtractChr <- function(BAM, CHR){
return(granges(
readGAlignments(file = BAM,
param = ScanBamParam(which=
GRanges(seqnames=CHR,
ranges=IRanges(start=1, end=536870912))))))}# Example:
chr1.granges <- ExtractChr(BAM="/path/to/file.bam", CHR="chr1")
Not a solution in R but if you can do this externally then: How Can I Split Bam Into Chromosome (In A Loop) Using Samtools? and How To Split A Bam File By Chromosome
Thank you for your nice answer. The example for extracting chromosome 1 was very helpful as well. I was able to replicate this with no issues.
D.
If the answer below has been helpful you should upvote/accept if (green check mark).