Calculate the relative distance distribution between two feature files.
bedtools reldist [OPTIONS] -a <BED/GFF/VCF> -b <BED/GFF/VCF>
This tool is part of the bedtools suite.
Traditional methods of comparing two sets of genomic intervals often rely on the number or proportion of overlapping intervals. However, these methods can overlook spatial correlations between the two sets, where intervals are consistently close to each other but rarely intersect. For instance, enhancers and transcription start sites are typically in close proximity but seldom overlap, so an overlap count alone makes them look much like two sets of random intervals.
To address this, Favorov et al. introduced a relative distance metric that captures the distribution of relative distances between each interval in one set and its two nearest intervals in the other set. If the two sets are not spatially correlated, the relative distances are expected to be uniformly distributed between 0 and 0.5. However, if the intervals are closer than would be expected by chance, the distribution of observed relative distances shifts towards lower values. reldist is an implementation of this idea.
     ~~~~~~~~~~~~~~~r=20~~~~~~~~~~~~~~~~
A: ====                                ======
B:          =====
     ~~~d1=3~~~|~~~~~~~~~~d2=17~~~~~~~~~|
In the above case, the relative distance is calculated as $\frac{\min(d_1,d_2)}{r}=\frac{3}{20}=0.15$.
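The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula, not the bedtools source: the function name `relative_distance` and its arguments are hypothetical, and it assumes, as in the diagram, that the gap between the two flanking intervals is r = d1 + d2. Under that assumption min(d1, d2) can never exceed r / 2, which is why relative distances fall between 0 and 0.5.

```python
def relative_distance(d1: float, d2: float) -> float:
    """Relative distance from the distances to the two flanking intervals.

    Assumption (matches the diagram): the distance between the two flanking
    intervals is r = d1 + d2.
    """
    r = d1 + d2
    if r <= 0:
        raise ValueError("flanking intervals must be a positive distance apart")
    # The nearer neighbour sets the numerator, so the result is at most 0.5.
    return min(d1, d2) / r


print(relative_distance(3, 17))   # 3 / 20 = 0.15
print(relative_distance(10, 10))  # exactly halfway: 0.5
```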
By default, bedtools reldist reports the distribution of relative distances between two sets of intervals. The output reports the frequency of each relative distance (ranging from 0.0 to 0.5). If the two sets of intervals are randomly distributed with respect to one another, each relative distance “bin” will be roughly equally represented (i.e., a uniform distribution). For example, consider the relative distance distribution for exons and AluY elements:
$ bedtools reldist \
    -a data/refseq.chr1.exons.bed.gz \
    -b data/aluY.chr1.bed.gz | head -n 5
0.00  164  43408  0.004
0.01  551  43408  0.013
0.02  598  43408  0.014
0.03  637  43408  0.015
0.04  793  43408  0.018
In contrast, consider the relative distance distribution observed between exons and conserved elements:
$ bedtools reldist \
    -a data/refseq.chr1.exons.bed.gz \
    -b data/gerp.chr1.bed.gz | head -n 5
reldist  count  total  fraction
0.00     20629  43422  0.475
0.01     2629   43422  0.061
0.02     1427   43422  0.033
0.03     985    43422  0.023
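In the outputs above, the fraction column is consistent with count divided by total (for example, 20629 / 43422 ≈ 0.475). The following Python sketch shows one way such a table could be built from a list of relative distances. It is not the bedtools implementation: the function name `reldist_table` is hypothetical, and binning by truncation to two decimal places is an assumption made for illustration.

```python
from collections import Counter


def reldist_table(rel_dists):
    """Summarise relative distances as (reldist, count, total, fraction) rows."""
    total = len(rel_dists)
    # Assumption: each value is binned by truncating it to two decimals.
    # The small epsilon guards against float error (0.29 * 100 is 28.999...).
    counts = Counter(int(d * 100 + 1e-9) for d in rel_dists)
    return [
        (b / 100, n, total, round(n / total, 3))
        for b, n in sorted(counts.items())
    ]


# Toy input, not real data: five made-up relative distances.
for row in reldist_table([0.0, 0.004, 0.15, 0.152, 0.5]):
    print("%.2f\t%d\t%d\t%.3f" % row)
```

With a uniform input every bin would receive a similar count, whereas a strong spatial association concentrates counts in the lowest bins, as in the exon versus conserved element example.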
Moreover, if one compares a set of intervals against itself, every interval overlaps an interval in the other set (itself). As such, the relative distances will all be 0.0:
$ bedtools reldist \
    -a data/refseq.chr1.exons.bed.gz \
    -b data/refseq.chr1.exons.bed.gz
reldist  count  total  fraction
0.00     43424  43424  1.000
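The self-comparison result can be reproduced with a small Python sketch of the metric. This is a simplified illustration rather than the bedtools algorithm: it assumes each interval is reduced to a single coordinate (for example its midpoint), and the names `relative_distances`, `query`, and `reference` are hypothetical.

```python
from bisect import bisect_left


def relative_distances(query, reference):
    """Relative distance of each query point to its two flanking reference points.

    Assumption: every interval is represented by one coordinate. Query points
    that lie outside the reference range have no flanking pair and are skipped.
    """
    ref = sorted(set(reference))
    out = []
    for q in query:
        i = bisect_left(ref, q)
        if i < len(ref) and ref[i] == q:
            out.append(0.0)  # coincides with a reference point
            continue
        if i == 0 or i == len(ref):
            continue  # no reference point on one side
        left, right = ref[i - 1], ref[i]
        out.append(min(q - left, right - q) / (right - left))
    return out


exons = [100, 250, 900, 1400]  # made-up coordinates
print(relative_distances(exons, exons))        # [0.0, 0.0, 0.0, 0.0]
print(relative_distances([103], [100, 120]))   # [0.15]
```

Comparing a set with itself yields only zeros, and a point 3 units from one neighbour and 17 from the other gives 3 / 20 = 0.15, matching the worked example earlier in this section.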