Annotates alignments in a BAM file based on their overlaps with regions defined in BED/GFF/VCF files
bedtools tag [OPTIONS] -i <BAM> -files FILE1 .. FILEn -labels LAB1 .. LABn
This tool is part of the bedtools suite and is also known as tagBed.
In the following example, we add a tag (YB by default; this can be overridden with the -tag option) to alignments in the BAM file test.bam if they overlap regions in the BED files test1.bed or test2.bed (as specified by the -files option). The tag value identifies the source of the regions that the alignment overlaps. For example, if an alignment overlaps a region defined in test1.bed, and we label this file s1 (via the -labels option), the alignment receives the tag YB:Z:s1.
$ bedtools tag -i test.bam \
    -files test1.bed test2.bed \
    -labels s1 s2 \
    > tagged.bam
$ samtools view tagged.bam | head
# this alignment doesn't overlap with any regions in test1.bed or test2.bed
example1.41109452 16 chr1 16223 255 30M * 0 0 GACAGTCTCAGTTGCACACACGAGCCAGCA GHGIGF>IGIIIHGFEDFFFFHFFFFFCC@ NH:i:1 HI:i:1 AS:i:29 nM:i:0 NM:i:0 MD:Z:30 jM:B:c,-1 jI:B:i,-1
# this alignment overlaps with regions in test1.bed
example2.40005100 0 chr1 139013 255 30M * 0 0 GAGTAAGTTTTGGGCCCGGAGATGATGTCC BBCDDDDEHHHHHJJJJJJIJIJIJJIJJJ NH:i:1 HI:i:1 AS:i:29 nM:i:0 NM:i:0 MD:Z:30 jM:B:c,-1 jI:B:i,-1 YB:Z:s1
# this alignment overlaps with regions in test2.bed
example3.17421922 0 chr1 804895 255 30M * 0 0 AGAAAACACCGGGGAAGTCCAGCCTGCACG CCCFFFFFHHHHHJJJGIJJJJJJJJJJJJ NH:i:1 HI:i:1 AS:i:29 nM:i:0 NM:i:0 MD:Z:30 jM:B:c,-1 jI:B:i,-1 YB:Z:s2
# this alignment overlaps with regions in both test1.bed and test2.bed
example4.1423869 16 chr1 267979 255 30M * 0 0 TTTCTCCTCAGTTTCTCTGTGCAGCACCAG GJIJIJJJJJJJIGHFFBFGHHFFDDF@C@ NH:i:1 HI:i:1 AS:i:29 nM:i:0 NM:i:0 MD:Z:30 jM:B:c,-1 jI:B:i,-1 YB:Z:s1;s2
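Once alignments carry the tag, a quick tally of how many fall under each label can help check the result. The following is a minimal sketch, not part of bedtools: the helper name count_tag_values is hypothetical, and it assumes tab-separated SAM text on standard input (as produced by samtools view) with the tag stored as a Z-type string, as in the YB:Z:s1 output shown here.

```shell
# Hypothetical helper (not part of bedtools): count alignments per tag value.
# Reads tab-separated SAM text from stdin; the optional first argument is the
# tag name (default: YB). Alignments lacking the tag are counted as "untagged".
count_tag_values() {
    tag="${1:-YB}"
    awk -F'\t' -v prefix="${tag}:Z:" '
        /^@/ { next }                        # skip SAM header lines
        {
            value = "untagged"
            for (i = 12; i <= NF; i++) {     # optional fields start at column 12
                if (index($i, prefix) == 1) {
                    value = substr($i, length(prefix) + 1)
                }
            }
            count[value]++
        }
        END { for (v in count) print v "\t" count[v] }
    ' | LC_ALL=C sort
}

# Usage, assuming tagged.bam from the example and samtools on PATH:
#   samtools view tagged.bam | count_tag_values
```

Combined values such as s1;s2 are counted as their own category, so alignments overlapping both files stay distinguishable from those overlapping only one.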
In the above example, we assigned two labels (s1 and s2) to the regions defined in test1.bed and test2.bed, respectively. To use the regions' names (the fourth column of a BED file) as the tag value instead, use the -names option:
$ bedtools tag -i test.bam \
    -files s1.bed s2.bed \
    -names | samtools view | head
example2.40005100 0 chr1 139013 255 30M * 0 0 GAGTAAGTTTTGGGCCCGGAGATGATGTCC BBCDDDDEHHHHHJJJJJJIJIJIJJIJJJ NH:i:1 HI:i:1 AS:i:29 nM:i:0 NM:i:0 MD:Z:30 jM:B:c,-1 jI:B:i,-1 YB:Z:EH38E2776521
Similarly, to use the regions' scores (the fifth column of a BED file) as the tag value, use the -scores option instead.