Annotates one BED/VCF/GFF file with the coverage and number of overlaps observed from multiple other BED/VCF/GFF files.
annotateBed [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn
This tool is part of the bedtools suite.
annotateBed (also known as bedtools annotate) annotates one BED/VCF/GFF file with the coverage and number of overlaps observed from multiple other BED/VCF/GFF files. In this way, a single command shows to what degree one feature coincides with multiple other feature types.
The following examples use a query file (variants.bed, specified as -i variants.bed) and three database files (genes.bed, conserve.bed, and known_var.bed):
$ cat variants.bed
chr1  100   200    nasty  1  -
chr2  500   1000   ugly   2  +
chr3  1000  5000   big    3  -

$ cat genes.bed
chr1  150   200    geneA  1  +
chr1  175   250    geneB  2  +
chr3  0     10000  geneC  3  -

$ cat conserve.bed
chr1  0     10000  cons1  1  +
chr2  700   10000  cons2  2  -
chr3  4000  10000  cons3  3  +

$ cat known_var.bed
chr1  0     120    known1  -
chr1  150   160    known2  -
chr2  0     10000  known3  +
By default, the fraction of each feature covered by each annotation file is reported after the complete feature in the file to be annotated.
$ annotateBed -i variants.bed -files genes.bed conserve.bed known_var.bed
chr1  100   200   nasty  1  -  0.500000  1.000000  0.300000
chr2  500   1000  ugly   2  +  0.000000  0.600000  1.000000
chr3  1000  5000  big    3  -  1.000000  0.250000  0.000000
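To make the reported fractions concrete, the following Python sketch recomputes them by hand. It is an illustration under stated assumptions, not bedtools code: the function name fraction_covered is hypothetical, intervals are treated as 0-based and half-open (the BED convention), and overlapping database features are merged so that no base is counted twice.

```python
# Illustrative sketch only: not bedtools source code.
# Assumption: BED-style coordinates (0-based start, exclusive end).

def fraction_covered(query, features):
    """Return the fraction of `query` covered by the union of `features`.

    `query` and each feature are (chrom, start, end) tuples.
    """
    chrom, q_start, q_end = query
    # Keep only overlapping features and clip them to the query interval.
    clipped = sorted(
        (max(start, q_start), min(end, q_end))
        for f_chrom, start, end in features
        if f_chrom == chrom and start < q_end and end > q_start
    )
    covered = 0
    cursor = q_start  # rightmost position already counted
    for start, end in clipped:
        if end > cursor:
            covered += end - max(start, cursor)
            cursor = end
    return covered / (q_end - q_start)


genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]

nasty = ("chr1", 100, 200)
print(f"{fraction_covered(nasty, genes):.6f}")      # 0.500000
print(f"{fraction_covered(nasty, known_var):.6f}")  # 0.300000
```

For the feature "nasty" (chr1:100-200), geneA and geneB together cover positions 150-200, which is 50 of 100 bases, and known1 plus known2 cover 20 + 10 bases, matching the first and third annotation columns of the output.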
By using the -names option, you can add a header line that labels each annotation column, one name per file given to -files.
$ bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed -names Variants Conservation Known-variants
# Variants Conservation Known-variants
chr1  100   200   nasty  1  -  0.500000  1.000000  0.300000
chr2  500   1000  ugly   2  +  0.000000  0.600000  1.000000
chr3  1000  5000  big    3  -  1.000000  0.250000  0.000000
By turning on the -counts switch, the annotate tool returns the number of overlapping features in each database BED file instead of the fraction covered.
$ annotateBed -counts -i variants.bed -files genes.bed conserve.bed known_var.bed
chr1  100   200   nasty  1  -  2  1  2
chr2  500   1000  ugly   2  +  0  1  1
chr3  1000  5000  big    3  -  1  1  0
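The counts can be checked the same way. The sketch below is a hand-rolled illustration rather than bedtools code; count_overlaps is a hypothetical helper, and it assumes that two half-open intervals on the same chromosome overlap when each starts before the other ends.

```python
# Illustrative sketch only: not bedtools source code.
# Assumption: BED-style coordinates (0-based start, exclusive end).

def count_overlaps(query, features):
    """Count the features that share at least one base with `query`."""
    chrom, q_start, q_end = query
    return sum(
        1
        for f_chrom, start, end in features
        if f_chrom == chrom and start < q_end and end > q_start
    )


variants = [("chr1", 100, 200), ("chr2", 500, 1000), ("chr3", 1000, 5000)]
genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
conserve = [("chr1", 0, 10000), ("chr2", 700, 10000), ("chr3", 4000, 10000)]
known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]

# One row per query feature, one column per database file.
counts = [
    [count_overlaps(v, db) for db in (genes, conserve, known_var)]
    for v in variants
]
for row in counts:
    print(*row)
# 2 1 2
# 0 1 1
# 1 1 0
```

Unlike the fraction, the count does not merge overlapping database features: geneA and geneB overlap each other, yet both are counted for "nasty".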
By turning on the -both switch, the annotate tool returns both the count and the fraction covered for each database BED file.
$ annotateBed -both -i variants.bed -files genes.bed conserve.bed known_var.bed
#chr  start  end   name   score  +/-  cnt1  pct1      cnt2  pct2      cnt3  pct3
chr1  100    200   nasty  1      -    2     0.500000  1     1.000000  2     0.300000
chr2  500    1000  ugly   2      +    0     0.000000  1     0.600000  1     1.000000
chr3  1000   5000  big    3      -    1     1.000000  1     0.250000  0     0.000000
By turning on the -s switch, the annotate tool only considers features on the same strand in each database BED file.
$ annotateBed -s -i variants.bed -files genes.bed conserve.bed known_var.bed
chr1  100   200   nasty  1  -  0.000000  0.000000  0.000000
chr2  500   1000  ugly   2  +  0.000000  0.000000  0.000000
chr3  1000  5000  big    3  -  1.000000  0.000000  0.000000
By turning on the -S switch, the annotate tool only considers features on the opposite strand in each database BED file.
$ annotateBed -S -i variants.bed -files genes.bed conserve.bed known_var.bed
chr1  100   200   nasty  1  -  0.500000  1.000000  0.300000
chr2  500   1000  ugly   2  +  0.000000  0.600000  1.000000
chr3  1000  5000  big    3  -  0.000000  0.250000  0.000000
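The two strand switches can be sketched as a filter applied before coverage is computed. The Python below is an illustration, not bedtools code, and rests on assumptions inferred only from the example output: fraction_covered and its mode parameter are hypothetical, and a record with no strand column (known_var.bed has only five columns, so nothing sits in the strand position) is modeled as having an empty strand that never equals "+" or "-".

```python
# Illustrative sketch only: not bedtools source code.
# Assumption: a database record without a strand column carries the strand "".

def fraction_covered(query, features, mode=None):
    """Fraction of `query` covered by `features`.

    mode=None ignores strand, "same" mimics -s, "opposite" mimics -S.
    Records are (chrom, start, end, strand), 0-based and half-open.
    """
    chrom, q_start, q_end, q_strand = query

    def strand_ok(f_strand):
        if mode == "same":
            return f_strand == q_strand
        if mode == "opposite":
            return f_strand != q_strand
        return True

    # Keep overlapping features that pass the strand filter, clipped to the query.
    clipped = sorted(
        (max(start, q_start), min(end, q_end))
        for f_chrom, start, end, f_strand in features
        if f_chrom == chrom and start < q_end and end > q_start and strand_ok(f_strand)
    )
    covered, cursor = 0, q_start
    for start, end in clipped:
        if end > cursor:
            covered += end - max(start, cursor)
            cursor = end
    return covered / (q_end - q_start)


variants = [("chr1", 100, 200, "-"), ("chr2", 500, 1000, "+"), ("chr3", 1000, 5000, "-")]
genes = [("chr1", 150, 200, "+"), ("chr1", 175, 250, "+"), ("chr3", 0, 10000, "-")]
conserve = [("chr1", 0, 10000, "+"), ("chr2", 700, 10000, "-"), ("chr3", 4000, 10000, "+")]
known_var = [("chr1", 0, 120, ""), ("chr1", 150, 160, ""), ("chr2", 0, 10000, "")]  # no strand column


def annotate(mode):
    """One row per query feature, one column per database file."""
    return [
        [fraction_covered(v, db, mode) for db in (genes, conserve, known_var)]
        for v in variants
    ]


for mode in ("same", "opposite"):
    print(mode)
    for row in annotate(mode):
        print("  " + " ".join(f"{x:.6f}" for x in row))
```

Under these assumptions the sketch yields the same fractions as the -s and -S outputs above, including the known_var.bed column, which is all zeros with -s and unchanged from the default with -S. That agreement is only consistent with the examples; it does not establish how bedtools treats unstranded records internally.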