Examines a "window" around each feature in A and reports all features in B that overlap the window. For each overlap, the entire entries from A and B are reported.
bedtools window [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
This tool is part of the bedtools suite and is also known as windowBed.
Similar to bedtools intersect, window searches for overlapping features in A and B. However, window adds a specified number (1000, by default) of base pairs upstream and downstream of each feature in A. In effect, this allows features in B that are near features in A to be detected.
A:                =========
                           |~~~4kb~~~|
                           |~~~~~~~~~9kb~~~~~~~~~|
B:                    ===             =====       =======
intersect             ===
window -w 5000        ===             =====
window -w 10000       ===             =====       =======
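-sw: Define -l and -r based on strand. For example, if -sw is used, -l 500 for a negative-stranded feature will add 500 bp downstream, that is, toward higher genomic coordinates. (Not enabled by default.)
-v: Only report those entries in A that have no overlaps in B. Similar to grep -v.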
Note: By default, the -l and -r options ignore strand. If you want to define upstream and downstream based on strand, use the -sw option with the -l and -r options.
By default, bedtools window adds 1000 bp upstream and downstream of each A feature and searches for features in B that overlap this "window". If an overlap is found in B, both the original A feature and the original B feature are reported.
$ cat A.bed
chr1 100 200

$ cat B.bed
chr1 500 1000
chr1 1300 2000

$ bedtools window -a A.bed -b B.bed
chr1 100 200 chr1 500 1000
Instead of using the default window size of 1000 bp, one can define a custom, symmetric window around each feature in A using the -w option. The window size is specified in base pairs. For example, a window of 5 kb should be defined as -w 5000.
For example (note that in contrast to the default behavior, the second B entry is reported):
$ cat A.bed
chr1 100 200

$ cat B.bed
chr1 500 1000
chr1 1300 2000

$ bedtools window -a A.bed -b B.bed -w 5000
chr1 100 200 chr1 500 1000
chr1 100 200 chr1 1300 2000
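The symmetric window logic can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the bedtools implementation: the function name `window_hits` and the tuple layout are hypothetical, intervals are assumed to be 0-based and half-open as in BED, and the window start is assumed to be clamped at 0. Clipping at the chromosome end is ignored.

```python
def window_hits(a, b_features, w=1000):
    """Return (a, b) pairs where b overlaps the window of w bp around a.

    Hypothetical helper for illustration; features are (chrom, start, end)
    tuples with 0-based, half-open coordinates.
    """
    chrom, start, end = a
    win_start = max(0, start - w)  # assumption: never extend below position 0
    win_end = end + w
    hits = []
    for b in b_features:
        b_chrom, b_start, b_end = b
        # Two half-open intervals overlap if each starts before the other ends.
        if b_chrom == chrom and b_start < win_end and b_end > win_start:
            hits.append((a, b))
    return hits


A = ("chr1", 100, 200)
B = [("chr1", 500, 1000), ("chr1", 1300, 2000)]

print(window_hits(A, B))          # default 1000 bp window: 0..1200
print(window_hits(A, B, w=5000))  # 5000 bp window: 0..5200
```

With the default window, only the first B feature falls inside 0..1200; with `w=5000` both do, matching the two examples above.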
One can also define asymmetric windows where a differing number of bases are added upstream and downstream of each feature using the -l (upstream) and -r (downstream) options.
For example (note the difference between -l 200 and -l 300):
$ cat A.bed
chr1 1000 2000

$ cat B.bed
chr1 500 800
chr1 10000 20000

$ bedtools window -a A.bed -b B.bed -l 200 -r 20000
chr1 1000 2000 chr1 10000 20000

$ bedtools window -a A.bed -b B.bed -l 300 -r 20000
chr1 1000 2000 chr1 500 800
chr1 1000 2000 chr1 10000 20000
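To see why -l 200 misses the first B feature while -l 300 catches it, the window boundaries can be worked out by hand. The sketch below does that arithmetic in Python; the helper names `asymmetric_window` and `overlaps` are hypothetical, and it assumes BED's 0-based, half-open coordinates, under which an interval ending at 800 does not overlap a window starting at 800.

```python
def asymmetric_window(start, end, left, right):
    """Extend [start, end) by `left` bp to the left and `right` bp to the right.

    Hypothetical helper; the start is clamped at 0 as an assumption.
    """
    return max(0, start - left), end + right


def overlaps(win, b_start, b_end):
    """True if half-open interval [b_start, b_end) overlaps the window."""
    win_start, win_end = win
    return b_start < win_end and b_end > win_start


a_start, a_end = 1000, 2000
b_features = [(500, 800), (10000, 20000)]

for left in (200, 300):
    win = asymmetric_window(a_start, a_end, left, 20000)
    hits = [b for b in b_features if overlaps(win, *b)]
    print(f"-l {left}: window={win} hits={hits}")
```

With `left=200` the window is 800..22000, which only touches the end of the 500..800 feature; with `left=300` the window starts at 700 and the two intervals share 100 bp.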
Especially when dealing with gene annotations or RNA-seq experiments, you may want to define asymmetric windows based on "strand". For example, you may want to screen for overlaps that occur within 5000 bp upstream of a gene (e.g. a promoter region) while screening only 1000 bp downstream of the gene. By enabling the -sw (stranded windows) option, the windows are added upstream or downstream according to strand. For example, imagine one specifies -l 5000, -r 1000 as well as the -sw option. In this case, forward stranded (+) features will screen 5000 bp to the left (that is, lower genomic coordinates) and 1000 bp to the right (that is, higher genomic coordinates). By contrast, reverse stranded (-) features will screen 5000 bp to the right (that is, higher genomic coordinates) and 1000 bp to the left (that is, lower genomic coordinates).
For example (note that the forward and reverse stranded A features each report a different B entry):
$ cat A.bed
chr1 10000 20000 A.forward 1 +
chr1 10000 20000 A.reverse 1 -

$ cat B.bed
chr1 1000 8000 B1
chr1 24000 32000 B2

$ bedtools window -a A.bed -b B.bed -l 5000 -r 1000 -sw
chr1 10000 20000 A.forward 1 + chr1 1000 8000 B1
chr1 10000 20000 A.reverse 1 - chr1 24000 32000 B2
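The strand-aware behavior amounts to swapping the left and right extensions for reverse stranded features. A minimal Python sketch of that rule follows; `stranded_window` is a hypothetical name, the start is assumed to be clamped at 0, and this illustrates the description above rather than the actual bedtools code.

```python
def stranded_window(start, end, strand, left, right):
    """Return the genomic window for a feature when -l/-r follow strand.

    Hypothetical helper: for "-" features the upstream (left) extension is
    applied at higher coordinates, so the two values are swapped.
    """
    if strand == "-":
        left, right = right, left
    return max(0, start - left), end + right


# -l 5000 -r 1000 applied to the two A features from the example
print(stranded_window(10000, 20000, "+", 5000, 1000))  # (5000, 21000)
print(stranded_window(10000, 20000, "-", 5000, 1000))  # (9000, 25000)
```

The forward window 5000..21000 reaches B1 (1000..8000) but not B2, while the reverse window 9000..25000 reaches B2 (24000..32000) but not B1, which is consistent with the output shown.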