Distance metrics of loci of 2 types in 2 GFFs (bedtools?)
9.4 years ago
Anand Rao ▴ 640

Hi folks!

I have two GFF3 files: one with annotated genes, the other with predicted transposons.

I want to find the numerical distribution of the distances between transposons and genes.

I already found the distances among transposons alone and among genes alone, using the 'spacing' sub-command of bedtools 2.24.0. However, 'spacing' cannot compare two GFF files.

For comparing the spacing of features in two separate GFFs, bedtools 2.24.0 has the reldist sub-command, but it reports only the relative distance distribution, not the absolute distance in base pairs, so it's not so useful to me...

So I am not sure whether there is an off-the-shelf option in bedtools that can help me answer this question. Would any of you have a simple solution to my problem? It does not have to use bedtools... Thank you!

Thank you dariober.

On the documentation page http://bedtools.readthedocs.org/en/latest/content/tools/closest.html there are several options, and I wonder if I should be using:

bedtools closest -a A.gff3 -b B.gff3 -s -d # for closest distance of A to B, on same strand

and

bedtools closest -a A.gff3 -b B.gff3 -S -d # for closest distance of A to B, on opposite strand

Does that seem right to you?

I ask because I wonder if I should instead use other options or additional options, but for my question, I suspect -s -d and -S -d should be appropriate. Do you agree? Thanks again!

I think you have to figure this out yourself depending on the question you are asking. All these options are indeed quite confusing (not because the tool is badly written but because the biology is complex!).
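
One way to see how much strand matters in your data before committing to -s or -S is to run bedtools closest -d once without strand options and split the reported distances by strand relationship. Below is a minimal Python sketch of that idea. It assumes both inputs are 9-column GFF3, so each output line has 19 tab-separated columns (A's strand in column 7, B's strand in column 16, distance last). The name split_by_strand and the column defaults are my own, not part of bedtools. Note that this classifies the overall nearest feature, which is not the same as the -s / -S runs, where only same-strand or only opposite-strand features are searched.

```python
def split_by_strand(lines, a_strand_col=6, b_strand_col=15):
    """Group closest -d distances by strand relationship of the A/B pair.

    Column indices are 0-based and ASSUME 9-column GFF3 for both inputs
    (9 A fields + 9 B fields + 1 distance column).
    """
    groups = {"same": [], "opposite": [], "unstranded": []}
    stranded = {"+", "-"}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 19:
            continue  # skip blank or unexpected lines
        distance = int(fields[-1])
        if distance < 0:
            continue  # assumed: negative distance means no neighbour was found
        a_strand, b_strand = fields[a_strand_col], fields[b_strand_col]
        if a_strand not in stranded or b_strand not in stranded:
            groups["unstranded"].append(distance)
        elif a_strand == b_strand:
            groups["same"].append(distance)
        else:
            groups["opposite"].append(distance)
    return groups
```

If the "same" and "opposite" groups look alike, strand probably does not matter much for the question, and a single unstranded run may be enough.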

9.4 years ago

Maybe closestBed with the -d option is what you want?

Tool:    bedtools closest (aka closestBed)
Version: v2.23.0
Summary: For each feature in A, finds the closest
         feature (upstream or downstream) in B.
...
    -d    In addition to the closest feature in B, 
          report its distance to A as an extra column.
          - The reported distance for overlapping features will be 0.
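
To turn that extra column into a distribution, something like the following might do. It is a rough Python sketch, assuming the output of bedtools closest -d was saved as tab-separated text with the distance as the last column. The helper names and the 1 kb bin size are illustrative choices of mine. It also assumes that a negative distance (-1) marks a feature with no neighbour on the same sequence, and drops those rows.

```python
import statistics
from collections import Counter


def read_distances(lines):
    """Collect the last tab-separated column (the -d distance) from each line."""
    distances = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        distance = int(line.split("\t")[-1])
        if distance < 0:
            continue  # assumed: -1 means no neighbour was found
        distances.append(distance)
    return distances


def summarize(distances, bin_size=1000):
    """Return simple summary statistics and a fixed-width histogram."""
    if not distances:
        return {"n": 0}
    # Key each bin by its lower edge, e.g. 0, 1000, 2000 for 1 kb bins.
    histogram = Counter((d // bin_size) * bin_size for d in distances)
    return {
        "n": len(distances),
        "min": min(distances),
        "median": statistics.median(distances),
        "max": max(distances),
        "histogram": dict(sorted(histogram.items())),
    }
```

With the output saved to a file (say closest.tsv, a hypothetical name), opening it and calling summarize(read_distances(handle)) gives the counts per bin, which can then be plotted or inspected directly.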