Model evaluation
Extract latent representations
scVI.get_latent_representation
— Functionget_latent_representation(m::scVAE, countmatrix::Matrix;
cellindices=nothing, give_mean::Bool=true
)
Computes the latent representation of an scVAE
model on input count data by applying the scVAE
encoder.
Returns the mean (default) or a sample of the latent representation (can be controlled by give_mean
keyword argument).
Arguments
m::scVAE
:scVAE
model from which the encoder is applied to get the latent representationcountmatrix::Matrix
: matrix of counts (e.g.,countmatrix
field of anAnnData
object), which is to be embedded with thescVAE
model encoder. Is assumed to be in a (cell x gene) format.
Keyword arguments
cellindices=nothing
: optional; indices of cells (=rows) on which to subset thecountmatrix
before embedding itgive_mean::Bool=true
: optional; iftrue
, returns the mean of the latent representation, else returns a sample.
Returns
z
: latent representation of thecountmatrix
data, either the mean or a sample (controlled bygive_mean
keyword argument)
scVI.register_latent_representation!
— Functionregister_latent_representation!(adata::AnnData, m::scVAE; name_latent::String="scVI_latent")
Calculates the latent representation obtained from encoding the countmatrix
of the AnnData
object with a trained scVAE
model by applying the function get_latent_representation(m, adata.X)
. Stored the latent representation in the obsm
field of the input AnnData
object as name_latent
.
Arguments
adata::AnnData
:AnnData
object to which to add the latent representationm::scVAE
: trainedscVAE
model to use for encoding the data
Keyword arguments
name_latent::String="scVI_latent"
: name of the field inadata.obsm
where the latent representation is stored
Returns
- the modified
AnnData
object.
scVI.get_loadings
— Functionget_loadings(dec::scLinearDecoder)
Extracts the loadings of a scLinearDecoder
, specifically corresponding to the weight matrix of the linear scLinearDecoder.factor_regressor
layer. If batch normalisation is applied, the weight matrix is re-scaled according to the accumulated statistics in the batch norm layer (for details, see ?Flux.BatchNorm
).
Arguments
dec
:scLinearDecoder
object
Returns
- the matrix of loadings
Dimension reduction and plotting
scVI.register_umap_on_latent!
— Functionregister_umap_on_latent!(adata::AnnData, m::scVAE; name_latent::String="scVI_latent", name_umap::String="scVI_latent_umap")
Calculates a UMAP (Uniform Manifold Projection and Embedding, McInnes et al. 2018) embedding of the latent representation obtained from encoding the countmatrix
of the AnnData
object with a trained scVAE
model. If a latent representation is already stored in adata.name_latent
, this is used for calculating the UMAP, if not, a latent representation is calculated and registered by calling register_latent_representation!(adata, m)
.
The UMAP is calculated using the Julia package UMAP.jl with default parameters. It is then stored in the name_umap
field of the input AnnData
object.
Arguments
adata::AnnData
:AnnData
object to which to add the UMAP representationm::scVAE
: trainedscVAE
model to use for encoding the data
Keyword arguments
name_latent::String="scVI_latent"
: name of the field inadata.obsm
where the latent representation is storedname_umap::String="scVI_latent_umap"
: name of the field inadata.obsm
where the UMAP representation is stored
Returns
- the modified
AnnData
object.
scVI.plot_umap_on_latent
— Functionfunction plot_umap_on_latent(
m::scVAE, adata::AnnData;
name_latent::String="scVI_latent",
name_latent_umap::String="scVI_latent_umap",
save_plot::Bool=false,
seed::Int=987,
filename::String="UMAP_on_latent.pdf"
)
Plots a UMAP embedding of the latent representation obtained from encoding the countmatrix of the AnnData
object with the scVAE
model. If no UMAP representation is stored in adata.scVI_latent_umap
, it is calculated and registered by calling register_umap_on_latent(adata, m)
.
By default, the cells are color-coded according to the celltypes
field of the AnnData
object.
For plotting, the VegaLite.jl package is used.
Arguments
m::scVAE
: trainedscVAE
model to use for embedding the data with the model encoderadata:AnnData
: data to embed with the model;adata.X
is encoded withm
Keyword arguments
name_latent::String="scVI_latent"
: name of the field inadata.obsm
where the latent representation is storedname_latent_umap::String="scVI_latent_umap"
: name of the field inadata.obsm
where the UMAP representation is storedsave_plot::Bool=true
: whether or not to save the plotfilename::String="UMAP_on_latent.pdf
: filename under which to save the plot. Has no effect ifsave_plot==false
.seed::Int=987
: which random seed to use for calculating UMAP (to ensure reproducibility)
Returns
- the plot
scVI.plot_pca_on_latent
— Functionplot_pca_on_latent(
m::scVAE, adata::AnnData;
save_plot::Bool=false,
filename::String="PCA_on_latent.pdf"
)
Plots a PCA embedding of the latent representation obtained from encoding the countmatrix of the AnnData
object with the scVAE
model. If no latent representation is stored in adata.scVI_latent
, it is calculated and registered by calling register_latent_representation(adata, m)
.
PCA is calculated using the singular value decomposition implementation in LinearAlgebra.jl
, see ?LinearAlgebra.svd
. For details on the PCA implementation, see the source code in the prcomps
function in src/Evaluate.jl
.
By default, the cells are color-coded according to the celltypes
field of the AnnData
object.
For plotting, the VegaLite.jl package is used.
Arguments
m::scVAE
: trainedscVAE
model to use for embedding the data with the model encoderadata:AnnData
: data to embed with the model;adata.X
is encoded withm
Keyword arguments
save_plot::Bool=true
: whether or not to save the plotfilename::String="UMAP_on_latent.pdf
: filename under which to save the plot. Has no effect ifsave_plot==false
.
Returns
- the plot
scVI.plot_latent_representation
— Functionfunction plot_latent_representation(
m::scVAE, adata::AnnData;
name_latent::String="scVI_latent",
plot_title::String="scVI latent representation",
save_plot::Bool=false,
seed::Int=987,
filename::String="UMAP_on_latent.pdf"
)
Plots the latent representation obtained from encoding the countmatrix of the AnnData
object with the scVAE
model. If the dimension of the latent space according to m.n_latent
is > 2, it calculates a UMAP embedding first. In this case, if no UMAP representation is stored in adata.scVI_latent_umap
, it is calculated and registered by calling register_umap_on_latent(adata, m)
.
By default, the cells are color-coded according to the celltypes
column in adata.obs
, if present.
For plotting, the VegaLite.jl package is used.
Arguments
m::scVAE
: trainedscVAE
model to use for embedding the data with the model encoderadata:AnnData
: data to embed with the model;adata.X
is encoded withm
Keyword arguments
name_latent::String="scVI_latent"
: name of the field inadata.obsm
where the latent representation is storedplot_title::String="scVI latent representation"
: title of the plotsave_plot::Bool=true
: whether or not to save the plotfilename::String="UMAP_on_latent.pdf
: filename under which to save the plot. Has no effect ifsave_plot==false
.seed::Int=987
: which random seed to use for calculating UMAP (to ensure reproducibility)
Returns
- the plot
Sampling from the trained model
scVI.sample_from_prior
— Functionsample_from_prior(m::scVAE, adata::AnnData, n_samples::Int; sample_library_size::Bool=false)
Samples from the prior N(0,1) distribution of the latent representation of a trained scVAE
model. Calculates the library size based on the countmatrix
of the input AnnData
object and either samples from it or uses the mean. Subsequently draws n_samples
from the generative distribution defined by the decoder based on the samples from the prior and the library size.
Returns the samples from the model.
Arguments
m::scVAE
: trainedscVAE
model from which to sampleadata::AnnData
:AnnData
object based on which to calculate the library sizen_samples::Int
: number of samples to draw
Keyword arguments
sample_library_size::Bool=false
: whether or not to sample from the library size. Iffalse
, the mean of the observed library size is used.
Returns
- matrix of prior samples from the model
scVI.sample_from_posterior
— Functionsample_from_posterior(m::scVAE, adata::AnnData)
Samples from the posterior distribution of the latent representation of a trained scVAE
model. Calculates the latent posterior mean and variance and the library size based on the countmatrix
of the input AnnData
object and samples from the posterior. Subsequently samples from the generative distribution defined by the decoder based on the samples of the latent representation and the library size.
Returns the samples from the model.
Arguments
m::scVAE
: trainedscVAE
model from which to sampleadata::AnnData
:AnnData
object based on which to calculate the latent posterior
Returns
- matrix of posterior samples from the model
Both prior and posterior sampling are based on the following more low-level function, which is not exported but can be called as scVI.decodersample
:
scVI.decodersample
— Functiondecodersample(m::scVAE, z::AbstractMatrix{S}, library::AbstractMatrix{S}) where S <: Real
Samples from the generative distribution defined by the decoder of the scVAE
model based on values of the latent variable z
. Depending on whether z
is sampled from the prior or posterior, the function can be used to realise both prior and posterior sampling, see sample_from_posterior()
and sample_from_prior
for details.
The distribution ((zero-inflated) negative binomial or Poisson) is parametrised by mu
, theta
and zi
(logits of dropout parameter). The implementation is adapted from the corresponding scvi tools
function
Arguments
m::scVAE
:scVAE
model from which the decoder is used for samplingz::AbstractMatrix
: values of the latent representation to use as input for the decoderlibrary::AbstractMatrix
: library size values that are used for scaling in the decoder (either corresponding to the observed or the model-encoded library size)
Returns
- matrix of samples from the generative distribution defined by the decoder of the
scVAE
model