Mask a fasta file based on feature coordinates
bedtools maskfasta [OPTIONS] -fi <fasta> -fo <fasta> -bed <bed/gff/vcf>
This tool is part of the bedtools suite and is also known as maskFastaFromBed.
bedtools maskfasta masks sequences in a FASTA file based on intervals defined in a feature file (BED, GFF, or VCF). The headers in the input FASTA file must exactly match the chromosome column in the feature file. This is useful for creating your own masked genome file based on custom annotations, or for masking all but your target regions when aligning sequence data from a targeted capture experiment.
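Because the header and chromosome names must match exactly, it can help to compare them before masking. The following is a minimal Python sketch of such a check, not part of bedtools. The helper names (`fasta_headers`, `bed_chromosomes`, `unmatched_chromosomes`) are hypothetical, and the sketch assumes the whole header text after `>` is what gets compared.

```python
def fasta_headers(fasta_text):
    """Return the header names (text after '>') in order of appearance."""
    return [line[1:].strip() for line in fasta_text.splitlines()
            if line.startswith(">")]


def bed_chromosomes(bed_text):
    """Return the set of names found in the first column of a feature file."""
    chroms = set()
    for line in bed_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        chroms.add(line.split()[0])
    return chroms


def unmatched_chromosomes(fasta_text, bed_text):
    """Feature-file chromosome names that have no identically named FASTA header.

    Assumption: the full header text after '>' is compared, so a header such
    as '>chr1 some description' would not match a chromosome named 'chr1'.
    """
    return sorted(bed_chromosomes(bed_text) - set(fasta_headers(fasta_text)))


fasta = ">chr1\nACGTACGT\n>chr2\nGGGGCCCC\n"
bed = "chr1\t2\t4\n2\t0\t3\n"

# '2' and 'chr2' are different names, so '2' is reported as unmatched.
print(unmatched_chromosomes(fasta, bed))
```

Any name this check reports points to a likely naming mismatch, such as `2` versus `chr2`.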
By default, bedtools maskfasta replaces each base covered by an interval in the BED file with N. The newly masked FASTA file is written to the output FASTA file.
$ cat test.fa
>chr1
AAAAAAAACCCCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG

$ cat test.bed
chr1 5 10

$ bedtools maskfasta -fi test.fa -bed test.bed -fo test.fa.out

$ cat test.fa.out
>chr1
AAAAANNNNNCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG
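To make the coordinate convention concrete, here is a small Python sketch of hard masking. It illustrates the idea and is not bedtools' own implementation. It assumes BED intervals are 0-based and half-open, so `chr1 5 10` covers the five bases at indexes 5 through 9. The function name `hard_mask` and the short test sequence are made up for this example.

```python
def hard_mask(seq, intervals):
    """Replace every base covered by a (start, end) interval with 'N'.

    Intervals follow the BED convention: 0-based start, exclusive end.
    Coordinates that run past the sequence are clipped to its length.
    """
    chars = list(seq)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = "N"
    return "".join(chars)


# Interval (5, 10) masks indexes 5, 6, 7, 8 and 9: five bases in total.
print(hard_mask("AAAAAAAACCCC", [(5, 10)]))  # AAAAANNNNNCC
```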
Using the -soft option, one can “soft-mask” the FASTA file instead: masked bases are converted to lowercase rather than replaced with N.
$ cat test.fa
>chr1
AAAAAAAACCCCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG

$ cat test.bed
chr1 5 10

$ bedtools maskfasta -fi test.fa -bed test.bed -fo test.fa.out -soft

$ cat test.fa.out
>chr1
AAAAAaaaccCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG
Using the -mc option, one can choose the character that replaces each masked base, instead of the default N.
$ cat test.fa
>chr1
AAAAAAAACCCCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG

$ cat test.bed
chr1 5 10

$ bedtools maskfasta -fi test.fa -bed test.bed -fo test.fa.out -mc X

$ cat test.fa.out
>chr1
AAAAAXXXXXCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG
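When the command is driven from a script, the three modes shown above differ only in the trailing options. The Python sketch below assembles the argument list using only the flags documented here (`-fi`, `-bed`, `-fo`, `-soft`, `-mc`). The helper `build_maskfasta_cmd` is hypothetical. Rejecting `-soft` combined with `-mc` is a design choice of this sketch to keep the intent unambiguous; it is not a documented bedtools rule.

```python
import shlex


def build_maskfasta_cmd(fasta, bed, out, soft=False, mask_char=None):
    """Build the argument list for a bedtools maskfasta call.

    soft=True adds -soft (lowercase masking); mask_char adds -mc <char>.
    With neither set, bedtools uses its default hard masking with N.
    """
    if soft and mask_char is not None:
        # Sketch-level choice: refuse an ambiguous request.
        raise ValueError("choose either soft masking or a mask character")
    cmd = ["bedtools", "maskfasta", "-fi", fasta, "-bed", bed, "-fo", out]
    if soft:
        cmd.append("-soft")
    if mask_char is not None:
        if len(mask_char) != 1:
            raise ValueError("mask_char must be a single character")
        cmd += ["-mc", mask_char]
    return cmd


cmd = build_maskfasta_cmd("test.fa", "test.bed", "test.fa.out", mask_char="X")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where bedtools is installed.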