This Vignette demonstrates using the equivariant estimation
procedures in the R
package tensr
. We will
simulate some data then explore the performance differences between the
uniformly minimum risk equivariant estimator (UMREE), the maximum
likelihood estimator (MLE), multiway Takemura’s estimator, and a moment
based estimator. Then we will demonstrate extracting posterior
summaries. Finally, we will demonstrate the equivariance of the Bayes
rule.
First, we generate a data array X of dimension 10 × 10 × 10. Each mode of X will have an AR-1(0.7) covariance
structure. That is cov(vec(X)) = Σ3 ⊗ Σ2 ⊗ Σ1,
where vec(⋅) is the same think as
c
in R
and just “vectorizes” X, and “⊗” is the Kronecker product. Hence Σi, j = 0.7|i − j|.
set.seed(360)
p <- c(10, 10, 10) #Dimension of tensor.
n <- length(p) #Order of the tensor.
cov_list <- list()
normalized_cov_list <- list() #Same as cov_list, but scaled down to determinant 1.
sig2_total <- 1
rho <- 0.7
for (mode_index in 1:n) {
cov_list[[mode_index]] <-
rho ^ abs(outer(1:p[mode_index], 1:p[mode_index], "-"))
scale_cov <- det(cov_list[[mode_index]]) ^ (1 / p[mode_index])
sig2_total <- sig2_total * scale_cov
normalized_cov_list[[mode_index]] <- cov_list[[mode_index]] / scale_cov
}
##Generate data with above covariance structure.
X <- atrans(array(rnorm(prod(p)), dim = p), lapply(cov_list, mhalf))
To get the UMREE, we first apply the function equi_mcmc
.
This will give us posterior draws of the inverse of the lower-triangular
Cholesky square root of each component covariance matrix. This is with
respect to using the right Haar measure over a product group of
lower-triangular matrices as our prior. Using the output from
equi_mcmc
, we apply the function
get_equi_bayes
to obtain the Bayes rule under multiway
Stein’s loss.
mcmc_out <- equi_mcmc(X, 1000)
bayes_rule <- get_equi_bayes(mcmc_out$Phi_inv, mcmc_out$sigma)
cov_umree <- bayes_rule$Sig_hat
#Estimate of the "standard deviation" form for the total variation parameter.
sig_umree <- bayes_rule$b
We can improve on the UMREE by calculating UMREEs for different rotations of the data array, then “average” these UMREEs. The more averagings the better in terms of statistical risk. This estimator is called multiway Takemura’s estimator.
## Number of averagings = 1
## Number of averagings = 2
## Number of averagings = 3
We use holq
to obtain the MLE for comparison. The output
of holq
is fed into mle_from_holq
to obtain
the MLEs of each component covariance matrix.
holq_x <- holq(X, print_diff = FALSE)
mle_x <- mle_from_holq(holq_x)
cov_mle <- mle_x$cov_mle
#The "standard deviation" form for the total variation parameter.
sig_mle <- mle_x$sig_mle
We also obtain a “moment-based” estimator for comparison. These are
just the sample covariance matrices of each mode, scaled down to have
unit determinant. The estimate of the total variation parameter is the
MLE conditional on the component covariance matrices. This is not
implemented in tensr
because it performs poorly.
get_moment <- function(X) {
p <- dim(X)
n <- length(p)
cov_moment <- list()
cov_chol_inv <- list() ## used for getting scale est
for (mode_index in 1:n) {
cov_temp <- mat(X, mode_index) %*% t(mat(X, mode_index))
cov_moment[[mode_index]] <- cov_temp/(det(cov_temp)^(1/p[mode_index]))
cov_chol_inv[[mode_index]] <- t(backsolve(chol(cov_moment[[mode_index]]),
diag(p[mode_index])))
}
sig_moment <- sqrt(sum(atrans(X, cov_chol_inv)^2)/prod(p))
return(list(cov_moment = cov_moment, sig_moment = sig_moment))
}
moment_x <- get_moment(X)
cov_moment <- moment_x$cov_moment
sig_moment <- moment_x$sig_moment
We can now compare the multiway Stein’s losses for each of the
estimators using multi_stein_loss_cov
. The risk performance
should be, in decreasing order of performance, multiway Takemura’s
estimator, UMREE, MLE, moment. Of course, for any individual
realization, the loss ordering may be different.
umree_loss <- multi_stein_loss_cov(cov_umree, normalized_cov_list, sig_umree, sqrt(sig2_total))
tak_loss <- multi_stein_loss_cov(cov_takemura, normalized_cov_list, sig_takemura,
sqrt(sig2_total))
mle_loss <- multi_stein_loss_cov(cov_mle, normalized_cov_list, sig_mle, sqrt(sig2_total))
moment_loss <- multi_stein_loss_cov(cov_moment, normalized_cov_list, sig_moment,
sqrt(sig2_total))
cat(
"Multiway Takemura's Loss:", round(tak_loss), "\n",
" UMREE Loss:", round(umree_loss), "\n",
" MLE Loss:", round(mle_loss), "\n",
" Moment Loss:", round(moment_loss), "\n"
)
## Multiway Takemura's Loss: 206
## UMREE Loss: 209
## MLE Loss: 235
## Moment Loss: 3631
You can also get the samples from the posterior if you want to
perform a fully Bayesian analysis. Just insert the output of
equi_bayes
into convert_cov
to get these
posterior samples.
cov_post[[2]]
contains the posterior samples of the
total variation parameter. cov_post[[1]][[i]][,,j]
contains
the jth posterior sample of
the ith component covariance
matrix. Here are some trace plots of the posterior samples.
par(cex.axis = 1, cex.lab = 1, cex.axis = 1, mar = c(2.4,2.6,2,0.2), mgp = c(1.5,0.5,0))
## trace plot of total variation parameter
plot(cov_post[[2]], type = "l", ylab = expression(sigma^2),
xlab = "Iteration", main = "Trace Plot")
abline(h = sig2_total, lty = 2, col = 2)
legend("topright", "True Parameter", lty = 2, col = 2, bty = "n")
## some more trace plots
random_trace <- function(cov_post, true_cov, p) {
n <- length(p)
mode_look <- sample(1:n, 1)
x_look <- sample(1:p[mode_look], size = 1)
y_look <- sample(1:p[mode_look], size = 1)
plot(cov_post[[1]][[mode_look]][x_look, y_look, ], type = "l",
xlab = "Iteration",
ylab = bquote(Sigma[.(mode_look)]["["][.(x_look)][","][.(y_look)]["]"]),
main = "Traceplot")
abline(h = true_cov[[mode_look]][x_look, y_look], lty = 2, col = 2)
legend("topright", "True Parameter", lty = 2, col = 2, bty = "n")
}
for(index in 1:4){
random_trace(cov_post, normalized_cov_list, p)
}
Here are the 95% posterior credible intervals of the covariance parameters, sorted in lexicographical ordering of the row and column indices.
par(cex.axis = 1, cex.lab = 1, cex.axis = 1, mar = c(2.4,2.6,2,0.2), mgp = c(1.5,0.5,0))
quantile_list <- list()
for (mode_index in 1:n) {
quantile_list[[mode_index]] <- apply(cov_post[[1]][[mode_index]], c(1, 2), quantile,
probs = c(0.025, 0.5, 0.975))
}
par(cex.main = 0.7)
for (mode_index in 1:n) {
upper_val <- c(quantile_list[[mode_index]][3, , ][lower.tri(diag(p[mode_index]),
diag = TRUE)])
lower_val <- c(quantile_list[[mode_index]][1, , ][lower.tri(diag(p[mode_index]),
diag = TRUE)])
median_val <- c(quantile_list[[mode_index]][2, , ][lower.tri(diag(p[mode_index]),
diag = TRUE)])
num_cred <- p[mode_index] * (p[mode_index] + 1)/2
plot(c(), xlim = c(1, num_cred), ylim = c(min(lower_val), max(upper_val)),
xaxt = "n",
xlab = "Parameter", ylab = "Value",
main = substitute(z~~Sigma[y],
list(z = "Medians and 95% Credible Intervals for",
y = mode_index)))
arrows(1:num_cred, lower_val, 1:num_cred, upper_val, length = 0)
points(1:num_cred, median_val, pch = 16, col = 2, cex = 0.4)
points(1:num_cred, median_val, pch = 1, col = 1, cex = 0.4)
abline(h = 0, lty = 2, col = 2)
v_at <- cumsum(p[mode_index]:2) + 0.5
abline(v = v_at, lty = 2)
}
We finish with a small example of how the UMREE is equivariant. First, we generate a group element of a product group of lower triangular matrices.
A_list <- list()
for (mode_index in 1:n) {
A_temp <- matrix(0, nrow = p[mode_index], ncol = p[mode_index])
A_temp[lower.tri(A_temp)] <- rnorm(p[mode_index] * (p[mode_index] - 1)/2)
diag(A_temp) <- rgamma(p[mode_index], shape = 1, rate = 1)
A_temp <- A_temp/prod(diag(A_temp))^(1/p[mode_index])
A_list[[mode_index]] <- A_temp
}
a_scale <- 10
Now we transform X.
Now we get the UMREEs of the original and the transformed data.
mcmc_out <- equi_mcmc(X, 10000)
bayes_rule <- get_equi_bayes(mcmc_out$Phi_inv, mcmc_out$sigma)
cov_x <- bayes_rule$Sig_hat
sig_x <- bayes_rule$b
mcmc_out <- equi_mcmc(X_transformed, 10000)
bayes_rule <- get_equi_bayes(mcmc_out$Phi_inv, mcmc_out$sigma)
cov_x_t <- bayes_rule$Sig_hat
sig_x_t <- bayes_rule$b
The UMREE of transformed data should equal the transformed UMREE of original data.
#These two quantities should be close.
cat(" Total Variation of transformed data:", sig_x_t, "\n",
"Transformation of total variation of original data:", sig_x * a_scale,
"\n")
## Total Variation of transformed data: 3.906237
## Transformation of total variation of original data: 3.905766
for (mode_index in 1:n) {
par(ask = TRUE)
x_val <- cov_x_t[[mode_index]][lower.tri(diag(p[mode_index]))]
y_val <- (A_list[[mode_index]] %*% cov_x[[mode_index]] %*%
t(A_list[[mode_index]]))[lower.tri(diag(p[mode_index]))]
plot(x_val, y_val, xlab = "Transform of UMREE",
ylab = "UMREE of Transformed Data",
main = "Covariance Estimates")
mtext("(Should Lie on Line if Equivariant)")
abline(c(0, 1))
par(ask = FALSE)
}
Gerard, D., & Hoff, P. (2015). Equivariant minimax dominators of the MLE in the array normal model. Journal of Multivariate Analysis, 137, 32-49. [Link to JMVA] [Link to arXiv]