Question

Identify Overlapping And Non Overlapping Regions For Paired-End Data

0

11.2 years ago

ancient_learner ▴ 680

File 1:
gene1            gene2
chr1    25    30    chr1    34    37
chr1    15    20    chr1    25    28
chr1    80    90    chr1    10    13

File 2:
gene1            gene2
chr1    25    30    chr1    36    39
chr1    15    20    chr1    18    20
chr1    80    90    chr1    19    22

Common gene1, unique gene2 (comparing file 1 with file 2):
chr1    15    20    chr1    25    28
chr1    80    90    chr1    10    13

Common gene1, unique gene2 (comparing file 2 with file 1):
chr1    15    20    chr1    18    20
chr1    80    90    chr1    19    22

Common gene1, common gene2:
chr1    25    30    chr1    34    37    chr1    25    30    chr1    36    39

I was able to get the pairs that are common in both gene1 and gene2 with bedtools pairToPair, but I have a problem with the pairs where gene1 is common and gene2 is unique.

bedtools perl awk • 3.3k views

ADD COMMENT • link 11.2 years ago by ancient_learner ▴ 680

0

It's really unclear what you're trying to do. What does any of this have to do with paired-end data? Why are your "genes" 5-10bp? What is the input and what is the goal?

ADD REPLY • link 11.2 years ago by Devon Ryan 105k

0

It's an example, not the real data, so that shouldn't be a problem. I know genes cannot be 10 bp long. In the two files, gene1 and gene2 (for which dummy positions were given) are interacting partners. The gene1 positions are the same in both files; only the gene2 positions vary.

ADD REPLY • link 11.2 years ago by ancient_learner ▴ 680

1

Please learn how to actually ask a coherent question in the future.

Reading between the lines, it seems that the first and second sets of 3 coordinate lines you posted are from two different files, for which you want to look at various types of intersections. However, the coordinates specified by the first 3 columns of each file are the same between the two (but are they repeated?), so they should presumably be ignored other than in output. It seems that coordinates intersect if they overlap by at least 1bp.

If that's correct, this would seem to be a trivial perl/python/whatever program to write. Just parse things line by line for each file and print output dependent upon the comparison. If that's not sufficient for your needs, then you'll need to provide more information. We don't read minds here.
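A minimal sketch of that line-by-line comparison in Python, under the interpretation above. It assumes coordinates are inclusive on both ends, that gene1 entries are matched between files by identical chromosome, start and end, and that gene2 entries count as "common" when they overlap by at least 1bp. The function names (`parse_pairs`, `overlaps`, `compare`) are made up for illustration, and the data is the example from the question.

```python
FILE1 = """gene1 gene2
chr1 25 30 chr1 34 37
chr1 15 20 chr1 25 28
chr1 80 90 chr1 10 13
"""

FILE2 = """gene1 gene2
chr1 25 30 chr1 36 39
chr1 15 20 chr1 18 20
chr1 80 90 chr1 19 22
"""


def parse_pairs(text):
    """Parse lines of the form: chrom1 start1 end1 chrom2 start2 end2."""
    pairs = []
    for line in text.strip().splitlines():
        f = line.split()
        # Skip the header and anything that is not a 6-column coordinate line.
        if len(f) != 6 or not (f[1].isdigit() and f[4].isdigit()):
            continue
        gene1 = (f[0], int(f[1]), int(f[2]))
        gene2 = (f[3], int(f[4]), int(f[5]))
        pairs.append((gene1, gene2))
    return pairs


def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least 1bp.

    Assumption: start and end are both inclusive.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def compare(pairs_a, pairs_b):
    """Split pairs_a into gene2-common and gene2-unique relative to pairs_b.

    Only pairs whose gene1 also occurs in pairs_b (exact match) are reported.
    """
    partners_by_gene1 = {}
    for gene1, gene2 in pairs_b:
        partners_by_gene1.setdefault(gene1, []).append(gene2)

    common, unique = [], []
    for gene1, gene2 in pairs_a:
        partners = partners_by_gene1.get(gene1)
        if partners is None:
            continue  # gene1 is not shared between the files
        hits = [p for p in partners if overlaps(gene2, p)]
        if hits:
            common.extend(((gene1, gene2), (gene1, p)) for p in hits)
        else:
            unique.append((gene1, gene2))
    return common, unique


if __name__ == "__main__":
    a, b = parse_pairs(FILE1), parse_pairs(FILE2)
    for label, (x, y) in (("file1 vs file2", (a, b)), ("file2 vs file1", (b, a))):
        common, unique = compare(x, y)
        print(label, "common gene1, unique gene2:", unique)
    print("common gene1, common gene2:", compare(a, b)[0])
```

Keeping a list of gene2 partners per gene1 means repeated gene1 coordinates are handled too: a pair is reported as unique only if its gene2 overlaps none of the partners recorded for the same gene1 in the other file.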

ADD REPLY • link 11.2 years ago by Devon Ryan 105k

5

"Please learn how to actually ask a coherent question in the future." ... "We don't read minds here."

I'm not sure what purpose these words serve. Why not respond with kinder, encouraging and respectful words--even if an OP's question may be inherently problematic? (It clearly goes without saying that my lack of understanding an OP's question doesn't strictly imply that the problem lies with the OP's question.)

I'm certainly guilty of uttering many obtuse statements--and will, most likely, continue to do so. Perhaps, however, I've just been lucky to have said them to knowledgeable individuals who have constructively and courteously replied with words which encouraged me to carefully refactor these statements.

ADD REPLY • link 11.2 years ago by Kenosis ★ 1.3k

0

Yeah, I could have been much nicer in my reply. Having said that, a good bit of insolence can also help people along, since it decreases needless back-and-forth (though I used more insolence than I should have in this case).

ADD REPLY • link 11.2 years ago by Devon Ryan 105k

0

Take a look at "Quick Programming Challenge: Calculate Common And Unique Regions From A List Of Chromosome Segments" on computing common and unique interval ranges. In particular, the answer there using the IRanges R package seems to do exactly what you want, if I understand correctly.

ADD REPLY • link 11.2 years ago by SES 8.6k