The scVAE model
Encoder
The implementation is based on the Python implementation of the scvi-tools encoder.
scVI.scEncoder — Type

```julia
mutable struct scEncoder
```

Julia implementation of the encoder of a single-cell VAE model, corresponding to the scvi-tools encoder. Collects all information on the encoder parameters and stores the basic encoder as well as the mean and variance encoders. Can be constructed using keywords.
Keyword arguments
- `encoder`: `Flux.Chain` of fully connected layers realising the first part of the encoder (before the split into mean and variance encoders). For details, see the source code of `FC_layers` in `src/Utils`.
- `mean_encoder`: `Flux.Dense` fully connected layer realising the latent mean encoder
- `n_input`: input dimension = number of genes/features
- `n_hidden`: number of hidden units to use in each hidden layer
- `n_output`: output dimension of the encoder = dimension of latent space
- `n_layers`: number of hidden layers in encoder and decoder
- `var_activation`: whether or not to use an activation function for the variance layer in the encoder
- `var_encoder`: `Flux.Dense` fully connected layer realising the latent variance encoder
- `var_eps`: numerical stability constant added to the variance in the reparameterisation of the latent representation
- `z_transformation`: whether to apply a `softmax` transformation to the latent z if assuming a lognormal instead of a normal distribution
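As a quick orientation, the sketch below constructs an encoder (using the `scEncoder` constructor documented next) and inspects the stored components; the gene and latent dimensions are made-up values for illustration.

```julia
using scVI

# Hypothetical dimensions: 1200 genes, 10-dimensional latent space.
enc = scEncoder(1200, 10)

enc.encoder        # Flux.Chain of the shared fully connected layers
enc.mean_encoder   # Flux.Dense mapping hidden units to the latent mean
enc.var_encoder    # Flux.Dense mapping hidden units to the latent variance
enc.n_input        # 1200
enc.n_output       # 10
```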
scVI.scEncoder — Method

```julia
scEncoder(
    n_input::Int,
    n_output::Int;
    activation_fn::Function=relu, # to use in FC_layers
    bias::Bool=true,
    n_hidden::Union{Int,Vector{Int}}=128,
    n_layers::Int=1,
    distribution::Symbol=:normal,
    dropout_rate::Float32=0.1f0,
    use_activation::Bool=true,
    use_batch_norm::Bool=true,
    use_layer_norm::Bool=false,
    var_activation=nothing,
    var_eps::Float32=Float32(1e-4)
)
```

Constructor for an scVAE encoder. Initialises an `scEncoder` object according to the input parameters. Julia implementation of the scvi-tools encoder.
Arguments:
- `n_input`: input dimension = number of genes/features
- `n_output`: output dimension of the encoder = latent space dimension
Keyword arguments:
- `activation_fn`: function to use as activation in all encoder neural network layers
- `bias`: whether or not to use bias parameters in the encoder neural network layers
- `n_hidden`: number of hidden units to use in each hidden layer (if an `Int` is passed, this number is used in all hidden layers; alternatively, a vector of `Int`s can be passed, in which case the kth element corresponds to the number of units in the kth layer)
- `n_layers`: number of hidden layers in the encoder
- `distribution`: whether to use a `:normal` or lognormal (`:ln`) distribution for the latent z
- `dropout_rate`: dropout to use in all encoder layers. Setting the rate to 0.0 corresponds to no dropout.
- `use_activation`: whether or not to use an activation function in the encoder neural network layers; if `false`, overrides the choice in `activation_fn`
- `use_batch_norm`: whether or not to apply batch normalization in the encoder layers
- `use_layer_norm`: whether or not to apply layer normalization in the encoder layers
- `var_activation`: whether or not to use an activation function for the variance layer in the encoder
- `var_eps`: numerical stability constant added to the variance in the reparameterisation of the latent representation
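A minimal usage sketch based on the constructor signature above; the dimensions and keyword choices are illustrative assumptions, not values prescribed by the package.

```julia
using scVI

# Hypothetical setup: 1200 genes, 10 latent dimensions, two hidden
# layers of decreasing width, and a lognormal latent distribution
# (the latent z is then passed through softmax).
enc = scEncoder(1200, 10;
    n_hidden=[128, 64],
    n_layers=2,
    distribution=:ln,
    dropout_rate=0.1f0
)
```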
Decoder
The implementation is based on the Python implementation of the scvi-tools decoder.
scVI.scDecoder — Type

```julia
mutable struct scDecoder <: AbstractDecoder
```

Julia implementation of the decoder for a single-cell VAE model, corresponding to the scvi-tools decoder. Collects all information on the decoder parameters and stores the decoder parts. Can be constructed using keywords.
Keyword arguments
- `n_input`: input dimension = dimension of latent space
- `n_hidden`: number of hidden units to use in each hidden layer (if an `Int` is passed, this number is used in all hidden layers; alternatively, a vector of `Int`s can be passed, in which case the kth element corresponds to the number of units in the kth layer)
- `n_output`: output dimension of the decoder = number of genes/features
- `n_layers`: number of hidden layers in the decoder
- `px_decoder`: `Flux.Chain` of fully connected layers realising the first part of the decoder (before the split into mean, dispersion and dropout decoders). For details, see the source code of `FC_layers` in `src/Utils`.
- `px_dropout_decoder`: if the generative distribution is zero-inflated negative binomial (`gene_likelihood = :zinb` in the `scVAE` model construction): `Flux.Dense` layer, else `nothing`
- `px_r_decoder`: decoder for the dispersion parameter. If the generative distribution is not some (zero-inflated) negative binomial, it is `nothing`. Else, it is a parameter vector or a `Flux.Dense` layer, depending on whether the dispersion is estimated per gene (`dispersion = :gene`) or per gene and cell (`dispersion = :gene_cell`)
- `px_scale_decoder`: decoder for the mean of the reconstruction; `Flux.Chain` of a `Dense` layer followed by `softmax` activation
- `use_batch_norm`: whether or not to apply batch normalization in the decoder layers
- `use_layer_norm`: whether or not to apply layer normalization in the decoder layers
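To illustrate how the stored parts depend on the distribution choices, the sketch below builds two decoders with the constructor documented next; the dimensions are again made up.

```julia
using scVI

# Zero-inflated negative binomial with per-gene dispersion:
dec_zinb = scDecoder(10, 1200; gene_likelihood=:zinb, dispersion=:gene)
dec_zinb.px_dropout_decoder  # Flux.Dense (zero-inflation dropout decoder)
dec_zinb.px_r_decoder        # parameter vector (one dispersion per gene)

# Poisson likelihood: no dispersion and no dropout decoder needed.
dec_pois = scDecoder(10, 1200; gene_likelihood=:poisson)
dec_pois.px_dropout_decoder  # nothing
dec_pois.px_r_decoder        # nothing
```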
scVI.scDecoder — Method

```julia
scDecoder(n_input, n_output;
    activation_fn::Function=relu,
    bias::Bool=true,
    dispersion::Symbol=:gene,
    dropout_rate::Float32=0.0f0,
    gene_likelihood::Symbol=:zinb,
    n_hidden::Union{Int,Vector{Int}}=128,
    n_layers::Int=1,
    use_activation::Bool=true,
    use_batch_norm::Bool=true,
    use_layer_norm::Bool=false
)
```

Constructor for an scVAE decoder. Initialises an `scDecoder` object according to the input parameters. Julia implementation of the scvi-tools decoder.
Arguments:
- `n_input`: input dimension of the decoder = latent space dimension
- `n_output`: output dimension = number of genes/features in the data
Keyword arguments:
- `activation_fn`: function to use as activation in all decoder neural network layers
- `bias`: whether or not to use bias parameters in the decoder neural network layers
- `dispersion`: whether to estimate the dispersion parameter for the (zero-inflated) negative binomial generative distribution per gene (`:gene`) or per gene and cell (`:gene_cell`)
- `dropout_rate`: dropout to use in all decoder layers. Setting the rate to 0.0 corresponds to no dropout.
- `n_hidden`: number of hidden units to use in each hidden layer (if an `Int` is passed, this number is used in all hidden layers; alternatively, a vector of `Int`s can be passed, in which case the kth element corresponds to the number of units in the kth layer)
- `n_layers`: number of hidden layers in the decoder
- `use_activation`: whether or not to use an activation function in the decoder neural network layers; if `false`, overrides the choice in `activation_fn`
- `use_batch_norm`: whether or not to apply batch normalization in the decoder layers
- `use_layer_norm`: whether or not to apply layer normalization in the decoder layers
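A short constructor sketch with customised keywords, again with illustrative dimensions: negative binomial likelihood and dispersion estimated per gene and cell.

```julia
using scVI

# Hypothetical decoder mapping a 10-dimensional latent space back to
# 1200 genes; :gene_cell makes px_r_decoder a Flux.Dense layer instead
# of a per-gene parameter vector.
dec = scDecoder(10, 1200;
    gene_likelihood=:nb,
    dispersion=:gene_cell,
    n_hidden=128,
    n_layers=1
)
```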
VAE model
The implementation is a basic version of the scvi-tools VAE object.
scVI.scVAE — Type

```julia
mutable struct scVAE
```

Julia implementation of the single-cell Variational Autoencoder model, corresponding to the scvi-tools VAE object. Collects all information on the model parameters such as distribution choices, and stores the model's encoder and decoder. Can be constructed using keywords.
Keyword arguments
- `n_input::Int`: input dimension = number of genes/features
- `n_batch::Int=0`: number of batches in the data
- `n_hidden::Int=128`: number of hidden units to use in each hidden layer
- `n_latent::Int=10`: dimension of latent space
- `n_layers::Int=1`: number of hidden layers in encoder and decoder
- `dispersion::Symbol=:gene`: can be either `:gene` or `:gene_cell`. The Python scvi-tools options `gene-batch` and `gene-label` are planned, but not supported yet.
- `is_trained::Bool=false`: indicates whether the model has been trained
- `dropout_rate`: dropout to use in the encoder and decoder layers. Setting the rate to 0.0 corresponds to no dropout.
- `gene_likelihood::Symbol=:zinb`: which generative distribution to parameterise in the decoder. Can be one of `:nb` (negative binomial), `:zinb` (zero-inflated negative binomial), or `:poisson` (Poisson).
- `latent_distribution::Symbol=:normal`: whether to use a `:normal` or lognormal (`:ln`) distribution for the latent z
- `library_log_means::Union{Nothing, Vector{Float32}}`: log-transformed means of the library size; has to be provided when not using the observed library size, but encoding it
- `library_log_vars::Union{Nothing, Vector{Float32}}`: log-transformed variances of the library size; has to be provided when not using the observed library size, but encoding it
- `log_variational`: whether or not to log-transform the input data in the encoder (for numerical stability)
- `loss_registry::Dict=Dict()`: dictionary in which to record the values of the different loss components (reconstruction error, KL divergence(s)) during training
- `use_observed_lib_size::Bool=true`: whether or not to use the observed library size (if `false`, the library size is calculated by a dedicated encoder)
- `z_encoder::scEncoder`: encoder struct of the VAE model for the latent representation; see `scEncoder`
- `l_encoder::Union{Nothing, scEncoder}`: encoder struct of the VAE model for the library size (if `use_observed_lib_size == false`); see `scEncoder`
- `decoder::AbstractDecoder`: decoder struct of the VAE model; see `scDecoder`
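As a small illustration of the stored components, a sketch using the `scVAE` constructor documented below (the gene count is a made-up value):

```julia
using scVI

model = scVAE(1200)    # 1200 genes, defaults otherwise
model.z_encoder        # scEncoder for the latent representation
model.l_encoder        # nothing, since use_observed_lib_size=true by default
model.decoder          # scDecoder
model.is_trained       # false until the model has been trained
```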
scVI.scVAE — Method

```julia
scVAE(n_input::Int;
    activation_fn::Function=relu, # to be used in all FC_layers instances
    bias::Symbol=:both, # whether to use bias in all linear layers of all FC instances
    dispersion::Symbol=:gene,
    dropout_rate::Float32=0.1f0,
    gene_likelihood::Symbol=:zinb,
    latent_distribution::Symbol=:normal,
    library_log_means=nothing,
    library_log_vars=nothing,
    log_variational::Bool=true,
    n_batch::Int=1,
    n_hidden::Union{Int,Vector{Int}}=128,
    n_latent::Int=10,
    n_layers::Int=1,
    use_activation::Symbol=:both,
    use_batch_norm::Symbol=:both,
    use_layer_norm::Symbol=:none,
    use_observed_lib_size::Bool=true,
    var_activation=nothing,
    var_eps::Float32=Float32(1e-4),
    seed::Int=1234
)
```

Constructor for the scVAE model struct. Initialises an `scVAE` model with the parameters specified in the input arguments. Basic Julia implementation of the scvi-tools VAE object.
Arguments:
- `n_input`: input dimension = number of genes/features
Keyword arguments
- `activation_fn`: function to use as activation in all neural network layers of encoder and decoder
- `bias`: whether or not to use bias parameters in the neural network layers of encoder and decoder
- `dispersion`: can be either `:gene` or `:gene_cell`. The Python scvi-tools options `gene-batch` and `gene-label` are planned, but not supported yet.
- `dropout_rate`: dropout to use in the encoder and decoder layers. Setting the rate to 0.0 corresponds to no dropout.
- `gene_likelihood`: which generative distribution to parameterise in the decoder. Can be one of `:nb` (negative binomial), `:zinb` (zero-inflated negative binomial), or `:poisson` (Poisson).
- `library_log_means`: log-transformed means of the library size; has to be provided when not using the observed library size, but encoding it
- `library_log_vars`: log-transformed variances of the library size; has to be provided when not using the observed library size, but encoding it
- `log_variational`: whether or not to log-transform the input data in the encoder (for numerical stability)
- `n_batch`: number of batches in the data
- `n_hidden`: number of hidden units to use in each hidden layer (if an `Int` is passed, this number is used in all hidden layers; alternatively, a vector of `Int`s can be passed, in which case the kth element corresponds to the number of units in the kth layer)
- `n_latent`: dimension of latent space
- `n_layers`: number of hidden layers in encoder and decoder
- `use_activation`: whether to use an activation function in the neural network layers of encoder and decoder; can be one of `:encoder`, `:decoder`, `:both`, `:none` (where no activation is used, this overrides the choice in `activation_fn`)
- `use_batch_norm`: whether to apply batch normalization in the encoder/decoder layers; can be one of `:encoder`, `:decoder`, `:both`, `:none`
- `use_layer_norm`: whether to apply layer normalization in the encoder/decoder layers; can be one of `:encoder`, `:decoder`, `:both`, `:none`
- `use_observed_lib_size`: whether or not to use the observed library size (if `false`, the library size is calculated by a dedicated encoder)
- `var_activation`: whether or not to use an activation function for the variance layer in the encoder
- `var_eps`: numerical stability constant added to the variance in the reparameterisation of the latent representation
- `seed`: random seed to use for the initialisation of model parameters, to ensure reproducibility
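Putting it together, an end-to-end construction sketch based on the signature above; the gene count, distribution choices, and library-size statistics are illustrative assumptions.

```julia
using scVI

# Hypothetical model for 1200 genes: negative binomial likelihood,
# per-gene dispersion, 10 latent dimensions, fixed seed for
# reproducible parameter initialisation.
model = scVAE(1200;
    gene_likelihood=:nb,
    dispersion=:gene,
    n_latent=10,
    n_hidden=128,
    n_layers=1,
    seed=42
)

# When encoding the library size instead of using the observed one,
# the log-library means and variances must be supplied (made-up values):
lib_means, lib_vars = Float32[7.5], Float32[0.4]
model_enc_lib = scVAE(1200;
    use_observed_lib_size=false,
    library_log_means=lib_means,
    library_log_vars=lib_vars
)
```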