I would like to ask how to establish a grange object with double seqnames columns as well as start and end columns for strore interaction data?
I would like to ask how to establish a grange object with double seqnames columns as well as start and end columns for strore interaction data?
Assuming you are using R then you could use GenomicAlignments.
Example:
## import your reads
gr <- readGAlignmentPairs('your_hic_reads.bam')
## look at them
head(gr)
GAlignmentPairs object with 99420 pairs, strandMode=1, and 0 metadata columns:
seqnames strand : ranges -- ranges
<Rle> <Rle> : <IRanges> -- <IRanges>
[1] chr2 * : 5035-5184 -- 32376-32457
[2] <NA> * : 5089-5122 -- 2700963-2701112
## here you see that seqnames can either be a single value or <NA>. If it's a single value then the two reads are on the same chromosome, if it's NA then they are on different chromosomes
## let's subset and look at the different chrom
granges(gr[2],on.discordant.seqnames='split')
[1] chr1 55089-5122 +
[2] chr2 2700963-2701112 -
At this point you have would need to split the reads manually. The GRanges object cannot support two seqnames side-by-side (i.e. in "columns"), you could however convert each pairs of reads into a GRanges
[see below] object and then merge all reads into a giant GRangesList
. For the seqnames()
call to return the two different chromosomes, you need to convert the alignment object to GRanges with on.discordant.seqnames='split'
. Then we build a final data.frame to your specifications [note we no longer have GRanges]. Here is an example but it's not IMO the best way to do it - so I won't include suggest it.
## pull example discordant read
b <- gr[2]
## duplicate the read and then split them the two reads (R1 and R2)
bb <- granges(c(b,b),on.discordant.seqnames='split')
## remember which read is which
c('read1','read2') -> bb$read
## merge into giant data.frame
b.df <- cbind(data.frame(subset(bb,read == 'read1')),data.frame(subset(bb,read == 'read2')))
## check it
head(b.df)
seqnames start end width strand read seqnames start end width strand
1 chr1 5089 5122 34 + read1 chr2 2700963 2701112 150 -
2 chr1 5089 5122 34 + read1 chr2 2700963 2701112 150 -
read
1 read2
2 read2
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Please be more specific by giving example data and more context. What files do you have and how is the output supposed to look like?