SAM/BAM Manipulation

bamtools: Function: Converts a BAM file to a number of other formats.

Usage: bamtools convert -format json -in myData.bam -out myData.json
bamtools: Function: Creates an index for a BAM file.

Usage: bamtools index -in <BAM FILE>
java -jar picard.jar: Function: Collects hybrid-selection (HS) metrics for a SAM or BAM file. This tool takes a SAM/BAM file input and collects metrics that are specific for sequence datasets generated through hybrid-selection. Hybrid-selection (HS) is the most commonly used technique to capture exon-specific sequences for targeted sequencing experiments such as exome sequencing; for more information, please see the corresponding GATK Dictionary entry.

Usage: java -jar picard.jar CollectHsMetrics I=input.bam O=hs_metrics.txt R=reference_sequence.fasta BAIT_INTERVALS=bait.interval_list TARGET_INTERVALS=target.interval_list
java -jar picard.jar: Function: Charts the distribution of quality scores.

Usage: java -jar picard.jar QualityScoreDistribution I=input.bam O=qual_score_dist.txt CHART=qual_score_dist.pdf
junction_saturation.py: Function: Checks whether the current sequencing depth is sufficient for alternative splicing analysis. For a well-annotated organism, the set of genes expressed in a particular tissue is almost fixed, so the set of splice junctions is also fixed and can be predetermined from the reference gene model. All annotated splice junctions should be rediscovered in a saturated RNA-seq dataset; otherwise, downstream alternative splicing analysis is problematic because low-abundance splice junctions are missing. This module checks for saturation by resampling 5%, 10%, 15%, ..., 95% of the total alignments from a BAM or SAM file, detecting splice junctions in each subset, and comparing them to the reference gene model.

Usage: junction_saturation.py -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam -r hg19.refseq.bed12 -o output
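
The resampling procedure described above can be sketched in Python. This is an illustration only, not the actual junction_saturation.py implementation: the function name saturation_curve, the representation of each alignment as a junction key (or None for an unspliced alignment), and the fixed random seed are all assumptions made for the example.

```python
import random

def saturation_curve(alignments, known_junctions, percents=range(5, 100, 5), seed=0):
    """Return [(percent, known_junctions_detected)] for growing subsamples.

    alignments: one entry per alignment; a hashable junction key such as
        (chrom, intron_start, intron_end), or None for an unspliced alignment.
        (Hypothetical representation chosen for this sketch.)
    known_junctions: set of junction keys taken from the reference gene model.
    """
    rng = random.Random(seed)
    shuffled = list(alignments)
    rng.shuffle(shuffled)  # shuffle once, then take growing prefixes as subsamples
    curve = []
    for pct in percents:
        n = len(shuffled) * pct // 100
        detected = {j for j in shuffled[:n] if j is not None}
        curve.append((pct, len(detected & known_junctions)))
    return curve
```

If the number of known junctions detected is still rising at the largest subsample, the library has probably not reached saturation; a curve that flattens early suggests the depth is adequate.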
java -jar picard.jar: Function: Merges multiple VCF or BCF files into one VCF file. Input files must be sorted by their contigs and, within contigs, by start position. The input files must have the same sample and contig lists. An index file is created and a sequence dictionary is required by default.

Usage: java -jar picard.jar MergeVcfs
java -jar picard.jar: Function: Converts a BED file to a Picard Interval List. This tool provides easy conversion from BED to the Picard interval_list format, which is required by many Picard processing tools. Note that the coordinate system of BED files is such that the first base or position in a sequence is numbered "0", while in interval_list files it is numbered "1". BED files contain sequence data displayed in a flexible format that includes nine optional fields, in addition to three required fields within the annotation tracks. The required fields of a BED file include:
chrom - The name of the chromosome (e.g. chr20) or scaffold (e.g. scaffold10671).
chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered "0".
chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.

Usage: java -jar picard.jar BedToIntervalList I=input.bed O=list.interval_list SD=reference_sequence.dict
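
The coordinate difference noted above (BED is 0-based with an excluded end, interval_list is 1-based with an included end) comes down to adding 1 to the start and leaving the end unchanged. A minimal Python sketch of that arithmetic follows. It is not Picard's code: the function name bed_to_interval_line is hypothetical, the defaults of "+" for a missing strand and "." for a missing name are assumptions, and the SAM-style header that a real interval_list file carries is omitted.

```python
def bed_to_interval_line(bed_line):
    """Convert one tab-separated BED record to an interval_list-style record.

    Output columns: sequence, start (1-based), end (inclusive), strand, name.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, chrom_start, chrom_end = fields[0], int(fields[1]), int(fields[2])
    # Optional BED columns: name is column 4, strand is column 6.
    name = fields[3] if len(fields) > 3 else "."          # assumed default
    strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "+"  # assumed default
    # 0-based half-open [start, end) -> 1-based closed [start + 1, end]
    return "\t".join([chrom, str(chrom_start + 1), str(chrom_end), strand, name])
```

With this sketch, the first 100 bases of chr20 (chromStart=0, chromEnd=100) become start 1 and end 100.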
java -jar picard.jar: Function: Asserts that the last block of the provided gzip file (e.g., a BAM) is well-formed; exits with return code 100 otherwise.

Usage: java -jar picard.jar CheckTerminatorBlock
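
To illustrate what a well-formed last block looks like, the Python sketch below tests whether a file ends with the 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. It is a simplification and not Picard's implementation: the function name has_bgzf_eof is hypothetical, and the sketch only recognizes the empty terminator block, whereas a real checker may also accept other intact final blocks.

```python
import os

# Empty BGZF block used as the end-of-file marker (28 bytes).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600"   # gzip header with the extra-field flag set
    "4243" "0200" "1b00"                  # 'BC' subfield, length 2, block size - 1 = 27
    "0300"                                # deflate stream for empty input
    "00000000" "00000000"                 # CRC32 and uncompressed size, both zero
)

def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the empty BGZF EOF block."""
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(size - len(BGZF_EOF))
        return handle.read() == BGZF_EOF
```

A truncated transfer typically loses this trailing block, which is why checking it is a cheap first test for an incomplete BAM file.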
java -jar picard.jar: Function: Merge alignment data from a SAM or BAM with data in an unmapped BAM file. This tool produces a new SAM or BAM file that includes all aligned and unaligned reads and also carries forward additional read attributes from the unmapped BAM (attributes that are otherwise lost in the process of alignment). The purpose of this tool is to use information from the unmapped BAM to fix up aligner output. The resulting file will be valid for use by other Picard tools. For simple BAM file merges, use MergeSamFiles. Note that MergeBamAlignment expects to find a sequence dictionary in the same directory as REFERENCE_SEQUENCE and expects it to have the same base name as the reference FASTA except with the extension ".dict". If the output sort order is not coordinate, then reads that are clipped due to adapters or overlapping will not contain the NM, MD, or UQ tags.

Usage: java -jar picard.jar MergeBamAlignment ALIGNED=aligned.bam UNMAPPED=unmapped.bam O=merge_alignments.bam R=reference_sequence.fasta
java -jar picard.jar: Function: Reverts the original base qualities and adds the mate cigar tag to read-group BAMs.

Usage: java -jar picard.jar RevertOriginalBaseQualitiesAndAddMateCigar
java -jar picard.jar: Function: Takes a SAM or BAM file and separates all the reads into one SAM or BAM file per library name. Reads that do not have a read group specified or whose read group does not have a library name are written to a file called 'unknown.' The format (SAM or BAM) of the output files matches that of the input file.

Usage: java -jar picard.jar SplitSamByLibrary
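
The grouping rule described above (read group to library name, with a fallback to 'unknown') can be sketched in Python on plain SAM text. This is an illustration under stated assumptions, not Picard's implementation: the function name split_sam_by_library is hypothetical, the input is assumed to be uncompressed SAM lines with the header first, and the sketch returns the groups in memory instead of writing one output file per library.

```python
from collections import defaultdict

def split_sam_by_library(sam_lines):
    """Group SAM alignment lines by library name.

    Returns {library_name: [alignment lines]}. Reads with no RG tag, or whose
    read group has no LB field, are collected under 'unknown'.
    """
    rg_to_library = {}
    by_library = defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: remember the library (LB) of each read group (ID).
            if fields[0] == "@RG":
                tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
                if "ID" in tags and "LB" in tags:
                    rg_to_library[tags["ID"]] = tags["LB"]
            continue
        # Optional tags start at column 12; the read group is stored as RG:Z:<id>.
        read_group = next((f[5:] for f in fields[11:] if f.startswith("RG:Z:")), None)
        by_library[rg_to_library.get(read_group, "unknown")].append(line)
    return dict(by_library)
```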
java -jar picard.jar: Function: DEPRECATED: Use CollectHsMetrics instead. Calculates a set of hybrid-selection-specific metrics from an aligned SAM or BAM file. If a reference sequence is provided, AT/GC dropout metrics will be calculated, and the PER_TARGET_COVERAGE option can be used to output GC and mean coverage information for every target.

Usage: java -jar picard.jar CalculateHsMetrics
java -jar picard.jar: Function: Collects Illumina Basecalling metrics for a sequencing run.

Usage: java -jar picard.jar CollectIlluminaBasecallingMetrics BASECALLS_DIR=/BaseCalls/ LANE=001 READ_STRUCTURE=25T8B25T INPUT=barcode_list.txt
samtools reheader: Function: Copies the header from a source dataset into a target dataset using the samtools reheader command.

Usage: samtools reheader [-iP] in.header.sam in.bam
java -jar picard.jar: Function: Creates BFQ files from a BAM file for use by the maq aligner. BFQ is a binary version of the FASTQ file format.

Usage: java -jar picard.jar BamToBfq I=input.bam ANALYSIS_DIR=analysis_dir OUTPUT_FILE_PREFIX=output_file_1 PAIRED_RUN=false