Adds or replaces read group tags in a file.
samtools addreplacerg [-r rg line | -R rg ID] [-m mode] [-l level] [-o out.bam] <input.bam>
Read groups (RGs) record information about the sequencing run, sample, library, and platform. Downstream analyses such as variant calling rely on this information, for example to account for batch effects or systematic errors tied to specific sequencing runs. Some read aligners can add read groups during alignment; STAR, for instance, accepts the --outSAMattrRGline option. If read groups were not added during alignment, or you want to replace them, tools such as samtools addreplacerg and AddOrReplaceReadGroups from Picard can do this afterwards.
To add a new read group to the header and apply it to the reads, use the -r option. The following command adds a read group with ID ‘fish’, library ‘1334’, and sample ‘alpha’ to the header, tags the reads from ‘input.bam’ with it, and writes the result to ‘output.bam’:
$ samtools addreplacerg -r 'ID:fish' -r 'LB:1334' -r 'SM:alpha' -o output.bam input.bam
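Because repeated -r values are joined with tabs, the three fragments in the command should end up as a single tab-separated @RG header line. The snippet below does not call samtools; it only rebuilds that line with printf to show the expected shape, on the assumption that the @RG prefix is supplied by the tool.

```shell
#!/bin/sh
# Rebuild the header line that the three -r fragments are expected to form.
# Assumption: samtools supplies the leading "@RG" record type itself.
rg_line=$(printf '@RG\t%s\t%s\t%s' 'ID:fish' 'LB:1334' 'SM:alpha')

# Print the line; the fields are separated by literal tab characters.
printf '%s\n' "$rg_line"
```

After a real run, the header of the output file can be inspected with samtools view -H output.bam to confirm the line was written.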
If you want to assign reads to a read group that already exists in the header, use the -R option with that group's ID. Note, however, that the addreplacerg command can only affect one @RG per call.
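A minimal sketch of the -R form follows. It assumes the header of input.bam already contains an @RG line whose ID is 'fish'; the file names are placeholders carried over from the earlier example. The script runs samtools only when both the tool and the input file are present, and otherwise just prints the command it would run.

```shell
#!/bin/sh
# Placeholder names; 'fish' must already exist as an @RG ID in the header.
in_bam=input.bam
out_bam=output.bam
rg_id=fish

# -R refers to an existing @RG line by ID rather than defining a new one with -r.
set -- samtools addreplacerg -R "$rg_id" -o "$out_bam" "$in_bam"
preview="$*"

if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    "$@"
else
    echo "would run: $preview"
fi
```

Since only one @RG can be handled per call, assigning several read groups would take one such invocation for each group.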
The -r option specifies a read group line to append to the header and applies it to the reads selected by the -m option. If -r is repeated, tabs are inserted automatically between the invocations. The -m option selects the mode: with orphan_only, existing RG tags are not overwritten; with overwrite_all, existing RG tags are overwritten.
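The two modes can be compared side by side. In this sketch, which reuses the placeholder input name from the earlier example and invents the output names for illustration, the same read group is applied once with each mode. When samtools and the input file are available, the @RG lines of each result are printed with samtools view -H; otherwise the commands are only echoed.

```shell
#!/bin/sh
in_bam=input.bam   # placeholder input, as in the earlier example

# Apply the same read group with each mode; only the -m value differs.
for mode in orphan_only overwrite_all; do
    out_bam="output.$mode.bam"   # hypothetical output name
    set -- samtools addreplacerg -r 'ID:fish' -r 'SM:alpha' \
        -m "$mode" -o "$out_bam" "$in_bam"
    if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
        # Run the command, then show the read group lines in the new header.
        "$@" && samtools view -H "$out_bam" | grep '^@RG'
    else
        echo "would run: $*"
    fi
done
```

With orphan_only, reads that already carry an RG tag keep it; with overwrite_all, every read ends up tagged with 'fish'.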