Plotting Genomic Coordinates
16 months ago
joe_genome ▴ 50

Hello,

I'm trying to use the GenomicAlignments package in R to plot the coordinates of subsampled, aligned BAM files. These have already been sorted and marked for duplicates.

I want to compare several alignment files from the same sample, and I would like to draw a scatter plot to see whether the reads, or their genomic coordinates, overlap.

Not sure if this is the way to go.

Still new to bioinformatics/programming :)

16 months ago
seidel 11k

It's not completely clear to me what you actually want, but if you have imported your BAM files into GAlignments objects (from the GenomicAlignments package), you could convert them to GRanges objects and simply make a scatter plot:

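library(GenomicAlignments)  # GAlignments class and its conversion to GRanges
library(GenomicRanges)

# convert the GAlignments objects (sample1, sample2) to GRanges
sample1_gr <- GRanges(sample1)
sample2_gr <- GRanges(sample2)

# scatter plot of start positions
# (plot() needs both vectors to have the same length, with reads in the same order)
plot(start(sample1_gr), start(sample2_gr))

# draw the diagonal
abline(0, 1)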

Anything on the diagonal has the same start position in both samples. I highly doubt this is what you want.
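
One caveat with a plot like this: plot() needs x and y of equal length, and point i in one sample should be the same read as point i in the other. Here is a minimal sketch of one way to pair the reads first, assuming the GRanges carry unique read names (for example from readGAlignments(..., use.names=TRUE)). The toy ranges and read names below are made up for illustration, and everything is assumed to sit on one chromosome.

```r
library(GenomicRanges)

# Toy stand-ins for the two samples; names play the role of read names (QNAME).
sample1_gr <- GRanges("chr1", IRanges(start = c(100, 200, 300), width = 50),
                      strand = "+")
names(sample1_gr) <- c("readA", "readB", "readC")

sample2_gr <- GRanges("chr1", IRanges(start = c(205, 100, 400), width = 50),
                      strand = "+")
names(sample2_gr) <- c("readB", "readA", "readD")

# Keep only reads present in both samples, in the same order.
# Assumption: read names are unique within each sample.
shared <- intersect(names(sample1_gr), names(sample2_gr))
x <- start(sample1_gr[shared])
y <- start(sample2_gr[shared])

# Points on the diagonal start at the same position in both samples.
plot(x, y, xlab = "start in sample 1", ylab = "start in sample 2")
abline(0, 1)
```

With paired-end or multi-mapping reads the names are not unique, so this pairing would need an extra key (mate number, for instance).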

# How many overlaps exist between samples?
sum(countOverlaps(sample1_gr, sample2_gr, ignore.strand=TRUE))

What question are you asking of your scatter plot? Maybe how many reads are identical between samples?

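# resize each read to a single nucleotide at its start
# (resize() is strand-aware: for minus-strand reads the "start" is the 5' end)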
s1 <- resize(sample1_gr, 1, fix="start")
s2 <- resize(sample2_gr, 1, fix="start")
sum(countOverlaps(s1, s2))
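
Note that sum(countOverlaps(...)) counts overlapping pairs, so a read that hits several reads in the other sample is counted several times. If what you want is the number of reads in one sample that have at least one match in the other, overlapsAny() is one option. A small sketch with made-up single-nucleotide ranges (the s1 and s2 here are toy data, not your reads):

```r
library(GenomicRanges)

# Toy 1-bp start positions; sample 1 and sample 2 each contain a duplicate at 100.
s1 <- GRanges("chr1", IRanges(start = c(100, 100, 200), width = 1), strand = "+")
s2 <- GRanges("chr1", IRanges(start = c(100, 100, 300), width = 1), strand = "+")

# Total number of overlapping (s1, s2) pairs: each duplicate hits both duplicates.
n_pairs <- sum(countOverlaps(s1, s2))   # 4

# Number of s1 reads with at least one hit in s2.
n_reads <- sum(overlapsAny(s1, s2))     # 2
```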

Thanks for the response @seidel :) I actually want to know whether the reads fall on the diagonal. I then want to remove the reads that have a matching start and end, to see whether the remaining reads were shifted by a few bases during alignment. This points me in the right direction.
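
For anyone landing here later, the "remove reads with a matching start and end" step could be sketched roughly as follows. This is only one possible approach, using overlapsAny() with type="equal"; the toy ranges and the variable names (is_same, unmatched1, shifted) are made up for illustration. The comparison is strand-aware by default.

```r
library(GenomicRanges)

# Toy stand-ins for the two samples.
sample1_gr <- GRanges("chr1", IRanges(start = c(100, 200, 300), width = 50),
                      strand = "+")
sample2_gr <- GRanges("chr1", IRanges(start = c(100, 203, 500), width = 50),
                      strand = "+")

# TRUE for sample 1 reads that have an exact start/end match in sample 2.
is_same <- overlapsAny(sample1_gr, sample2_gr, type = "equal")

# Drop the identical reads and keep the rest.
unmatched1 <- sample1_gr[!is_same]

# Of the remaining reads, those that still overlap something in sample 2
# are candidates for a small positional shift.
shifted <- unmatched1[overlapsAny(unmatched1, sample2_gr)]
```

Here the read at 200 is kept in shifted because it overlaps the sample 2 read at 203, while the read at 300 has no counterpart at all.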
