Quantify cell subpopulation proportions in bulk tissue expression profiles using signature genes that are either predefined or automatically extracted from single-cell transcriptomes or sorted cell populations.
docker run <bind_mounts> cibersortxfractions [Options]
CIBERSORTx is an analytical tool developed by Newman et al. to impute gene expression profiles and estimate the abundances of member cell types in a mixed cell population from gene expression data. It takes gene expression data representing a bulk admixture of different cell types, along with a signature matrix file that enumerates the genes defining the expression profile of each cell type of interest. For the latter, users can either use existing, curated signature matrices for reference cell types or create custom signature gene files by providing reference gene expression profiles of pure cell populations. Moreover, given the increasing use of single-cell transcriptome sequencing, CIBERSORTx also provides the option to derive signature matrices from single-cell RNA sequencing data. The fractions module of CIBERSORTx enumerates the proportions of distinct cell subpopulations in bulk tissue expression profiles. Unlike its predecessor, CIBERSORTx supports deconvolution of bulk RNA-Seq data using signature genes derived from either single-cell transcriptomes or sorted cell populations.
Docker or Singularity is required to run this tool. To obtain a copy of the tool, run:
docker pull cibersortx/fractions
You also need a token, which you must provide every time you run the CIBERSORTx executables. You can obtain the token from the CIBERSORTx website.
CIBERSORTx, optional for creating a custom signature matrix only]. Formatting requirements:
CIBERSORTx will assume that data are in log space, and will anti-log all expression values by $2^x$.
CIBERSORTx will add a unique identifier to each redundant gene symbol; however, we recommend that users remove redundancy prior to file upload.
CIBERSORTx performs feature selection and therefore typically does not use all genes in the signature matrix. It is generally OK if some genes are missing from the user's mixture file. If less than 50% of signature matrix genes overlap, CIBERSORTx will issue a warning.
CIBERSORTx provides two options to address platform-specific variation (e.g., between scRNA-seq and RNA-seq). Enabling these options requires a minimum of three mixture samples, and more than ten mixtures are recommended.
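The formatting requirements above can be checked before running the tool. The following is a minimal sketch, assuming tab-delimited files with a single header row and gene symbols in the first column. The function names are hypothetical helpers and are not part of CIBERSORTx.

```shell
#!/usr/bin/env bash
# Pre-flight checks for CIBERSORTx input files.
# Assumed layout (not confirmed by the tool documentation above):
# tab-delimited, one header row, gene symbols in column 1.

# Print gene symbols that occur more than once in column 1.
duplicated_genes() {
  awk -F'\t' 'NR > 1 { seen[$1]++ }
              END { for (g in seen) if (seen[g] > 1) print g }' "$1" | sort
}

# Print the percentage of signature-matrix genes that also occur in the mixture.
# Usage: signature_overlap_pct <signature_file> <mixture_file>
signature_overlap_pct() {
  local signature="$1" mixture="$2"
  awk -F'\t' '
    NR == FNR { if (FNR > 1) mix[$1] = 1; next }    # first file: mixture genes
    FNR > 1   { total++; if ($1 in mix) shared++ }  # second file: signature genes
    END       { if (total == 0) print 0; else printf "%.1f\n", 100 * shared / total }
  ' "$mixture" "$signature"
}

# Print the number of mixture samples (columns after the gene column).
mixture_sample_count() {
  awk -F'\t' 'NR == 1 { print NF - 1; exit }' "$1"
}
```

With placeholder file names, `duplicated_genes mixture.txt` lists the symbols to deduplicate, `signature_overlap_pct signature_matrix.txt mixture.txt` prints the overlap percentage (a value below 50 corresponds to the case in which CIBERSORTx issues a warning), and `mixture_sample_count mixture.txt` shows whether the minimum of three mixture samples is met.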
CIBERSORTx. CIBERSORTx will automatically normalize the input data such that the sum of all normalized reads is the same for each transcriptome. If a gene length-normalized expression matrix is provided (e.g., RPKM), then the signature matrix will be in TPM (transcripts per million). If a count matrix is provided, the signature matrix will be in CPM (counts per million). Regardless of the input, the signature matrix and mixture files should be represented in the same normalization space. CIBERSORTx will assume that data are in log space, and will anti-log all expression values by $2^x$.
--single_cell TRUE, the phenotype classes file will be built by CIBERSORTx, and is not required as input. [required, if --single_cell FALSE].
--single_cell TRUE: 300].
--single_cell TRUE: 500].
--single_cell TRUE: 0.01].
Avoid special symbols in gene names; otherwise, you may see error messages like:
In fread(X_file, header = F, sep = "\t") :
  File '/src/outdir//temp.Fractions.coreSVR.X.tsv' has size 0. Returning a NULL data.table.
Warning message:
In fread(Y_file, header = F, sep = "\t") :
  File '/src/outdir//temp.Fractions.coreSVR.Y.tsv' has size 0. Returning a NULL data.table.
Error: $ operator is invalid for atomic vectors
In addition: Warning message:
In mclapply(1:svn_itor, res, mc.cores = svn_itor) :
  all scheduled cores encountered errors in user code
Execution halted
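Because the errors shown above only appear at run time, it can help to scan gene names beforehand. The following is a minimal sketch. The documentation does not define which symbols count as special, so the safe set used here (letters, digits, `.`, `_` and `-`) is an assumption, as are the tab-delimited layout with one header row and the hypothetical function name.

```shell
#!/usr/bin/env bash
# List gene names (column 1, header row skipped) that contain characters
# outside an ASSUMED safe set: letters, digits, '.', '_' and '-'.
# Output format: "<line number>: <gene name>".
genes_with_special_symbols() {
  awk -F'\t' 'NR > 1 && $1 ~ /[^A-Za-z0-9._-]/ { print NR ": " $1 }' "$1"
}
```

Running `genes_with_special_symbols mixture.txt` (a placeholder file name) prints nothing when every gene name stays within the assumed safe set. Adjust the character class if your gene identifiers legitimately use other characters.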
CIBERSORTx for building the custom signature matrix. The value for {1} will be the file prefix of the --refsample file, and the value for {2} will be the maximum condition number as specified by --k.max.
This example builds a signature matrix from single cell RNA sequencing data from NSCLC PBMCs and enumerates the proportions of the different cell types in an RNA-seq dataset profiled from whole blood using S-mode batch correction.
docker run -v absolute/path/to/input/dir:/src/data -v absolute/path/to/output/dir:/src/outdir cibersortx/fractions \
  --username email_address_registered_on_CIBERSORTx_website \
  --token token_obtained_from_CIBERSORTx_website \
  --single_cell TRUE \
  --refsample Fig2ab-NSCLC_PBMCs_scRNAseq_refsample.txt \
  --mixture Fig2b-WholeBlood_RNAseq.txt \
  --fraction 0 \
  --rmbatchSmode TRUE
This example builds a signature matrix from single cell RNA sequencing data from HNSCC tumors (Puram et al., Cell, 2017) and enumerates the proportions of the different cell types in bulk HNSCC tumors reconstituted from single cell RNA-Seq data.
docker run -v absolute/path/to/input/dir:/src/data -v absolute/path/to/output/dir:/src/outdir cibersortx/fractions \
  --username email_address_registered_on_CIBERSORTx_website \
  --token token_obtained_from_CIBERSORTx_website \
  --single_cell TRUE \
  --refsample scRNA-Seq_reference_HNSCC_Puram_et_al_Fig2cd.txt \
  --mixture mixture_HNSCC_Puram_et_al_Fig2cd.txt
This example builds a signature matrix from single cell RNA sequencing data from melanoma (Tirosh et al., Science, 2016) and enumerates the proportions of the different cell types in bulk melanoma tumors reconstituted from single cell RNA-Seq data.
docker run -v absolute/path/to/input/dir:/src/data -v absolute/path/to/output/dir:/src/outdir cibersortx/fractions \
  --username email_address_registered_on_CIBERSORTx_website \
  --token token_obtained_from_CIBERSORTx_website \
  --single_cell TRUE \
  --refsample scRNA-Seq_reference_melanoma_Tirosh_SuppFig_3b-d.txt \
  --mixture mixture_melanoma_Tirosh_SuppFig_3b-d.txt
This example builds a signature matrix from sorted cell populations profiled on microarrays and enumerates cell proportions in bulk samples profiled on microarrays.
docker run -v absolute/path/to/input/dir:/src/data -v absolute/path/to/output/dir:/src/outdir cibersortx/fractions \
  --username email_address_registered_on_CIBERSORTx_website \
  --token token_obtained_from_CIBERSORTx_website \
  --refsample reference_purified_GSE11103.txt \
  --phenoclasses phenoclasses_GSE11103.txt \
  --mixture mixture_GSE11103.txt \
  --QN TRUE