Category

Genome Variant Analysis


Usage

java -jar GenomeAnalysisTK.jar -R reference.fasta -T HaplotypeCaller -I sample1.bam --emitRefConfidence GVCF [--dbsnp dbSNP.vcf] [-L targets.interval_list] -o output.raw.snps.indels.g.vcf
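
The bracketed arguments in the usage line are optional. As a minimal sketch, the command can be assembled from variables so that `--dbsnp` and `-L` are added only when a value is supplied. The function name `build_hc_cmd` and the output file name are hypothetical, and the sketch assumes paths without spaces; it only prints the command rather than running it.

```shell
# Hypothetical helper: print a HaplotypeCaller GVCF command line.
# Arguments: reference, BAM, output, [dbSNP VCF], [interval list]
build_hc_cmd() {
  ref=$1; bam=$2; out=$3; dbsnp=${4:-}; intervals=${5:-}
  cmd="java -jar GenomeAnalysisTK.jar -R $ref -T HaplotypeCaller -I $bam --emitRefConfidence GVCF"
  # Append the optional arguments only when they were provided.
  if [ -n "$dbsnp" ]; then cmd="$cmd --dbsnp $dbsnp"; fi
  if [ -n "$intervals" ]; then cmd="$cmd -L $intervals"; fi
  printf '%s -o %s\n' "$cmd" "$out"
}

# Example: no dbSNP file, restricted to an interval list.
build_hc_cmd reference.fasta sample1.bam sample1.g.vcf "" targets.interval_list
```

To run the command rather than print it, the printed line can be reviewed and then pasted into the shell.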


Manual

Argument name(s) | Default value | Summary
Optional Inputs
--alleles | none | Set of alleles to use in genotyping
--dbsnp, -D | none | dbSNP file
Optional Outputs
--activeRegionOut, -ARO | NA | Output the active region to this IGV formatted file
--activityProfileOut, -APO | NA | Output the raw activity profile results in IGV format
--graphOutput, -graph | NA | Write debug assembly graph information to this file
--out, -o | stdout | File to which variants should be written
Optional Parameters
--contamination_fraction_to_filter, -contamination | 0.0 | Fraction of contamination to aggressively remove
--genotyping_mode, -gt_mode | DISCOVERY | Specifies how to determine the alternate alleles to use for genotyping
--group, -G | [StandardAnnotation, StandardHCAnnotation] | One or more classes/groups of annotations to apply to variant calls
--heterozygosity, -hets | 0.001 | Heterozygosity value used to compute prior likelihoods for any locus
--heterozygosity_stdev, -heterozygosityStandardDeviation | 0.01 | Standard deviation of heterozygosity for SNP and indel calling
--indel_heterozygosity, -indelHeterozygosity | 1.25E-4 | Heterozygosity for indel calling
--maxReadsInRegionPerSample | 10000 | Maximum reads in an active region
--min_base_quality_score, -mbq | 10 | Minimum base quality required to consider a base for calling
--minReadsPerAlignmentStart, -minReadsPerAlignStart | 10 | Minimum number of reads sharing the same alignment start for each genomic location in an active region
--sample_name, -sn | NA | Name of single sample to use from a multi-sample BAM
--sample_ploidy, -ploidy | 2 | Ploidy per sample. For pooled data, set to (Number of samples in each pool * Sample Ploidy).
--standard_min_confidence_threshold_for_calling, -stand_call_conf | 10.0 | The minimum phred-scaled confidence threshold at which variants should be called
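
Two of the parameters above involve a small calculation. The sketch below shows the pooled-data rule for `--sample_ploidy` (samples per pool times per-sample ploidy) and converts a phred-scaled value such as the `--standard_min_confidence_threshold_for_calling` default into a probability, assuming the usual Phred convention Q = -10 * log10(p). The function names `pooled_ploidy` and `phred_to_prob` are hypothetical helpers and are not part of the tool.

```shell
# Hypothetical helper: value to pass to --sample_ploidy for pooled data.
# Arguments: number of samples in each pool, ploidy of each sample
pooled_ploidy() {
  echo $(( $1 * $2 ))
}

# Hypothetical helper: convert a phred-scaled value Q to a probability,
# assuming the standard convention p = 10^(-Q/10).
phred_to_prob() {
  awk -v q="$1" 'BEGIN { printf "%.2e\n", 10 ^ (-q / 10) }'
}

pooled_ploidy 10 2     # pool of 10 diploid samples -> 20
phred_to_prob 10.0     # default calling threshold  -> 1.00e-01
```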
Optional Flags
--annotateNDA, -nda | false | Annotate number of alleles observed
--useNewAFCalculator, -newQual | false | Use new AF model instead of the so-called exact model
Advanced Inputs
--activeRegionIn, -AR | NA | Use this interval list file as the active regions to process
--comp | [] | Comparison VCF file
Advanced Outputs
--bamOutput, -bamout | NA | File to which assembled haplotypes should be written
Advanced Parameters
--activeProbabilityThreshold, -ActProbThresh | 0.002 | Threshold for the probability of a profile state being active
--activeRegionExtension | NA | The active region extension; if not provided defaults to Walker annotated default
--activeRegionMaxSize | NA | The active region maximum size; if not provided defaults to Walker annotated default
--annotation, -A | [] | One or more specific annotations to apply to variant calls
--bamWriterType | CALLED_HAPLOTYPES | Which haplotypes should be written to the BAM
--bandPassSigma | NA | The sigma of the band pass filter Gaussian kernel; if not provided defaults to Walker annotated default
--contamination_fraction_per_sample_file, -contaminationFile | NA | Contamination per sample
--emitRefConfidence, -ERC | false | Mode for emitting reference confidence scores
--excludeAnnotation, -XA | [] | One or more specific annotations to exclude
--gcpHMM | 10 | Flat gap continuation penalty for use in the Pair HMM
--GVCFGQBands, -GQB | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 70, 80, 90, 99] | Exclusive upper bounds for reference confidence GQ bands (must be in [1, 100] and specified in increasing order)
--indelSizeToEliminateInRefModel, -ERCIS | 10 | The size of an indel to check for in the reference model
--input_prior, -inputPrior | [] | Input prior for calls
--kmerSize | [10, 25] | Kmer size to use in the read threading assembler
--max_alternate_alleles, -maxAltAlleles | 6 | Maximum number of alternate alleles to genotype
--max_genotype_count, -maxGT | 1024 | Maximum number of genotypes to consider at any site
--max_num_PL_values, -maxNumPLValues | 100 | Maximum number of PL values to output
--maxNumHaplotypesInPopulation | 128 | Maximum number of haplotypes to consider for your population
--maxReadsInMemoryPerSample | 30000 | Maximum reads per sample given to traversal map() function
--maxTotalReadsInMemory | 10000000 | Maximum total reads given to traversal map() function
--minDanglingBranchLength | 4 | Minimum length of a dangling branch to attempt recovery
--minPruning | 2 | Minimum support to not prune paths in the graph
--numPruningSamples | 1 | Number of samples that must pass the minPruning threshold
--output_mode, -out_mode | EMIT_VARIANTS_ONLY | Which type of calls we should output
--pcr_indel_model, -pcrModel | CONSERVATIVE | The PCR indel model to use
--phredScaledGlobalReadMismappingRate, -globalMAPQ | 45 | The global assumed mismapping rate for reads
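
The `--GVCFGQBands` values are described above as exclusive upper bounds given in increasing order. The sketch below illustrates one reading of that description: a GQ value falls into the first band whose upper bound is strictly greater than it. The function `gq_band` is a hypothetical helper, not part of the tool, and the treatment of values at or above the last bound is an assumption, since the description does not cover that case.

```shell
# Hypothetical helper: report which band a GQ value falls into.
# Arguments: GQ value, then the band upper bounds in increasing order
gq_band() {
  gq=$1; shift
  lower=0
  for upper in "$@"; do
    # Upper bounds are exclusive, so the band is [lower, upper).
    if [ "$gq" -lt "$upper" ]; then
      echo "[$lower, $upper)"
      return 0
    fi
    lower=$upper
  done
  # Assumption: values at or above the last bound form an open-ended band.
  echo "[$lower, +)"
}

# Tail of the default band list shown above.
gq_band 65 58 59 60 70 80 90 99   # -> [60, 70)
```

With the full default list, every GQ below 60 gets its own single-value band, while higher values are grouped into wider bands.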
Advanced Flags
--allowNonUniqueKmersInRef | false | Allow graphs that have non-unique kmers in the reference
--allSitePLs | false | Annotate all sites with PLs
--consensus | false | 1000G consensus mode
--debug | false | Print out very verbose debug information about each triggering active region
--disableOptimizations | false | Don't skip calculations in ActiveRegions with no variants
--doNotRunPhysicalPhasing | false | Disable physical phasing
--dontIncreaseKmerSizesForCycles | false | Disable iterating over kmer sizes when graph cycles are detected
--dontTrimActiveRegions | false | If specified, we will not trim down the active region from the full region (active + extension) to just the active interval for genotyping
--dontUseSoftClippedBases | false | Do not analyze soft clipped bases in the reads
--emitDroppedReads, -edr | false | Emit reads that are dropped for filtering, trimming, realignment failure
--forceActive | false | If provided, all bases will be tagged as active
--useAllelesTrigger, -allelesTrigger | false | Use additional trigger on variants found in an external alleles file
--useFilteredReadsForAnnotations | false | Use the contamination-filtered read maps for the purposes of annotating variants

