Title: | Infinite Mixtures of Infinite Factor Analysers and Related Models |
---|---|
Description: | Provides flexible Bayesian estimation of Infinite Mixtures of Infinite Factor Analysers and related models, for nonparametrically clustering high-dimensional data, introduced by Murphy et al. (2020) <doi:10.1214/19-BA1179>. The IMIFA model conducts Bayesian nonparametric model-based clustering with factor analytic covariance structures without recourse to model selection criteria to choose the number of clusters or cluster-specific latent factors, mostly via efficient Gibbs updates. Model-specific diagnostic tools are also provided, as well as many options for plotting results, conducting posterior inference on parameters of interest, posterior predictive checking, and quantifying uncertainty. |
Authors: | Keefe Murphy [aut, cre] |
Maintainer: | Keefe Murphy <[email protected]> |
License: | GPL (>= 3) |
Version: | 2.2.0 |
Built: | 2025-02-22 03:55:19 UTC |
Source: | https://github.com/keefe-murphy/imifa |
A package for Bayesian nonparametric clustering of high-dimensional data sets, providing functions for fitting, diagnostic tools and plotting for Infinite Mixtures of Infinite Factor Analysers and the full suite of related models introduced by Murphy et al. (2020) <doi:10.1214/19-BA1179>. Allows model based clustering with factor analytic covariance structures without recourse to model selection criteria to choose the number of clusters or cluster-specific latent factors. Model-specific diagnostic tools are also provided, as well as many options for plotting results, conducting posterior inference on parameters of interest, posterior predictive checking, and quantifying uncertainty.
Package
IMIFA
2.2.0
2023-12-12 (this version), 2017-02-02 (original release)
GPL (>= 3)
The three most important functions in the IMIFA package are: mcmc_IMIFA
, for fitting the model, get_IMIFA_results
, for extracting results from objects of the "IMIFA"
class generated by mcmc_IMIFA
, and the dedicated plotting function plot.Results_IMIFA
, for plotting results pertaining to parameters of inferential interest from objects of class "Results_IMIFA"
generated by get_IMIFA_results
.
Other functions also exist, e.g. for simulating data from a multivariate mixture of factor analysers, many functions for soliciting good priors, and many functions related to plotting.
mcmc_IMIFA
:
This function estimates models in the IMIFA family under the Bayesian paradigm. Most importantly, one must specify the method
in the form of an acronym (e.g. "MIFA"
for Mixtures of Infinite Factor Analysers) and ranges of values for range.G, the number of clusters, and range.Q, the number(s) of (cluster-specific) latent factors as required by said method.
get_IMIFA_results
:
Raw simulation objects generated by mcmc_IMIFA() are passed to this function in order to extract results of interest and conduct further post-processing if necessary.
plot.Results_IMIFA
:
Results obtained from get_IMIFA_Results are passed to this function with the type of plot desired specified by plot.meth
(e.g. "trace"
) and the parameter of interest specified by param
(e.g. "loadings"
).
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Further details and examples are given in the associated vignette document:vignette("IMIFA", package = "IMIFA")
Keefe Murphy [aut, cre], Cinzia Viroli [ctb], Isobel Claire Gormley [ctb]
Maintainer: Keefe Murphy - <[email protected]>
Useful links:
Supplies a list of arguments for use in mcmc_IMIFA
pertaining to the use of the Bayesian Nonparametric Pitman-Yor / Dirichlet process priors with the infinite mixture models "IMFA"
and "IMIFA"
. Certain arguments related to the Dirichlet concentration parameter for the overfitted mixtures "OMFA"
and "OMIFA"
can be supplied in this manner also.
bnpControl(learn.alpha = TRUE, alpha.hyper = c(2L, 4L), discount = 0, learn.d = TRUE, d.hyper = c(1L, 1L), ind.slice = TRUE, rho = 0.75, trunc.G = NULL, kappa = 0.5, IM.lab.sw = TRUE, thresh = FALSE, exchange = FALSE, zeta = NULL, tune.zeta = list(...), ...)
bnpControl(learn.alpha = TRUE, alpha.hyper = c(2L, 4L), discount = 0, learn.d = TRUE, d.hyper = c(1L, 1L), ind.slice = TRUE, rho = 0.75, trunc.G = NULL, kappa = 0.5, IM.lab.sw = TRUE, thresh = FALSE, exchange = FALSE, zeta = NULL, tune.zeta = list(...), ...)
learn.alpha |
|
alpha.hyper |
|
discount |
The discount parameter used when generalising the Dirichlet process to the Pitman-Yor process. Defaults to 0, but typically must lie in the interval [0, 1). If greater than zero, The special case of |
learn.d |
Logical indicating whether the |
d.hyper |
Hyperparameters for the Beta(a,b) prior on the |
ind.slice |
Logical indicating whether the independent slice-efficient sampler is to be employed (defaults, typically, to |
rho |
Parameter controlling the rate of geometric decay for the independent slice-efficient sampler, s.t. |
trunc.G |
The maximum number of allowable and storable clusters under the |
kappa |
The spike-and-slab prior distribution on the |
IM.lab.sw |
Logical indicating whether the two forced label switching moves are to be implemented (defaults to |
thresh |
Logical indicating whether the threshold of Fall and Barat (2014) should be incorporated into the slice sampler. See the reference for details. This is an experimental feature (defaults to |
exchange |
Logical indicating whether the exchangeable slice sampler of Fall and Barat (2014) should be used instead. See the reference for details. This argument can work with or without |
zeta |
|
tune.zeta |
A list with the following named arguments, used for tuning
At least one If diminishing adaptation is invoked, the posterior mean |
... |
Catches unused arguments. |
The crucial concentration parameter alpha
is documented within the main mcmc_IMIFA
function, and is relevant to all of the "IMIFA"
, "IMFA"
, "OMIFA"
, and "OMFA"
methods.
All arguments here are relevant to the "IMFA"
and "IMIFA"
methods, but the following are also related to the "OMFA"
and "OMIFA"
methods, and may behave differently in those instances: learn.alpha
, alpha.hyper
, zeta
, and tune.zeta
.
A named list in which the names are the names of the arguments related to the BNP prior(s) and the values are the values supplied to the arguments.
Certain supplied arguments will be subject to further checks within mcmc_IMIFA
. G_priorDensity
and G_moments
can help with soliciting sensible DP/PYP priors.
Under the "IMFA"
and "IMIFA"
methods, a Pitman-Yor process prior is specified by default. A Dirichlet process prior can be easily invoked when the discount
is fixed at 0
and learn.d=FALSE
. The normalized stable process can also be specified as a prior distribution, as a special case of the Pitman-Yor process, when alpha
remains fixed at 0
and learn.alpha=FALSE
(provided the discount
is fixed at a strictly positive value or learn.d=TRUE
). The special case of the Pitman-Yor process with negative discount
is also allowed as an experimental feature for which caution is advised, though learn.d
and learn.alpha
are forced to FALSE
and TRUE
, respectively, in this instance.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Kalli, M., Griffin, J. E. and Walker, S. G. (2011) Slice sampling mixture models, Statistics and Computing, 21(1): 93-105.
Fall, M. D. and Barat, E. (2014) Gibbs sampling methods for Pitman-Yor mixture models, hal-00740770v2.
mcmc_IMIFA
, G_priorDensity
, G_moments
, mixfaControl
, mgpControl
, storeControl
bnpctrl <- bnpControl(learn.d=FALSE, ind.slice=FALSE, alpha.hyper=c(3, 3)) # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, BNP=bnpctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, learn.d=FALSE, # ind.slice=FALSE, alpha.hyper=c(3, 3))
bnpctrl <- bnpControl(learn.d=FALSE, ind.slice=FALSE, alpha.hyper=c(3, 3)) # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, BNP=bnpctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, learn.d=FALSE, # ind.slice=FALSE, alpha.hyper=c(3, 3))
Data on the chemical composition of coffee samples collected from around the world, comprising 43 samples from 29 countries. Each sample is either of the Arabica or Robusta variety. Twelve of the thirteen chemical constituents reported in the study are given. The omitted variable is total chlorogenic acid; it is generally the sum of the chlorogenic, neochlorogenic and isochlorogenic acid values.
data(coffee)
data(coffee)
A data frame with 43 observations and 14 columns. The first two columns contain Variety (either Arabica or Robusta) and Country, respectively, while the remaining 12 columns contain the chemical properties.
Streuli, H. (1973). Der heutige Stand der Kaffee-Chemie, In Proceedings of the 6th International Colloquium on Coffee Chemistry, June 4-June 9, 1973, Bogata, Colombia, pp. 61-72. Paris, France: Association Scientifique International du Café (ASIC).
data(coffee, package="IMIFA") pairs(coffee[,-(1:2)], col=coffee$Variety)
data(coffee, package="IMIFA") pairs(coffee[,-(1:2)], col=coffee$Variety)
Calculate the a priori expected number of clusters (G_expected
) or the variance of the number of clusters (G_variance
) under a PYP or DP prior for a sample of size N
at given values of the concentration parameter alpha
and optionally also the Pitman-Yor discount
parameter. Useful for soliciting sensible priors (or fixed values) for alpha
or discount
under the "IMFA"
and "IMIFA"
methods for mcmc_IMIFA
. Additionally, for a given sample size N
and given expected number of clusters EG
, G_calibrate
elicits a value for the concentration parameter alpha
or the discount
parameter.
G_expected(N, alpha, discount = 0, MPFR = TRUE) G_variance(N, alpha, discount = 0, MPFR = TRUE) G_calibrate(N, EG, alpha = NULL, discount = 0, MPFR = TRUE, ...)
G_expected(N, alpha, discount = 0, MPFR = TRUE) G_variance(N, alpha, discount = 0, MPFR = TRUE) G_calibrate(N, EG, alpha = NULL, discount = 0, MPFR = TRUE, ...)
N |
The sample size. |
alpha |
The concentration parameter. Must be specified (though not for |
discount |
The discount parameter for the Pitman-Yor process. Must be less than 1, but typically lies in the interval [0, 1). Defaults to 0 (i.e. the Dirichlet process). When |
MPFR |
Logical indicating whether the high-precision libraries |
EG |
The prior expected number of clusters. Must exceed |
... |
Additional arguments passed to |
All arguments are vectorised. Users can also consult G_priorDensity
in order to solicit sensible priors.
For G_calibrate
, only one of alpha
or discount
can be supplied, and the function elicits a value for the opposing parameter which achieves the desired expected number of clusters EG
for the given sample size N
. By default, a value for alpha
subject to discount=0
(i.e. the Dirichlet process) is elicited. Note that alpha
may not be a positive integer multiple of discount
as it should be if discount
is negative. See Examples below.
The expected number of clusters under the specified prior conditions (G_expected
), or the variance of the number of clusters (G_variance
), or the concentration parameter alpha
or discount
parameter achieving a particular expected number of clusters (G_calibrate
).
G_variance
requires use of the Rmpfr
and gmp
libraries for non-zero discount
values. G_expected
requires these libraries only for the alpha=0
case. These libraries are strongly recommended (but they are not required) for G_calbirate
when discount
is non-zero, but they are required when alpha=0
is supplied. Despite the high precision arithmetic used, the functions can still be unstable for large N
and/or extreme values of alpha
and/or discount
. See the argument MPFR
.
Keefe Murphy - <[email protected]>
De Blasi, P., Favaro, S., Lijoi, A., Mena, R. H., Prunster, I., and Ruggiero, M. (2015) Are Gibbs-type priors the most natural generalization of the Dirichlet process?, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(2): 212-229.
Yamato, H. and Shibuya, M. (2000) Moments of some statistics of Pitman sampling formula, Bulletin of Informatics and Cybernetics, 32(1): 1-10.
G_priorDensity
, Rmpfr
, uniroot
# Certain examples require the use of the Rmpfr library suppressMessages(require("Rmpfr")) G_expected(N=50, alpha=19.23356, MPFR=FALSE) G_variance(N=50, alpha=19.23356, MPFR=FALSE) G_expected(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045), MPFR=FALSE) G_variance(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045), MPFR=c(FALSE, TRUE, TRUE)) # Examine the growth rate of the DP DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i, MPFR=FALSE)) matplot(DP, type="l", xlab="N", ylab="G") # Examine the growth rate of the PYP PY <- sapply(c(0.25, 0.5, 0.75), function(i) G_expected(1:200, alpha=1, discount=i)) matplot(PY, type="l", xlab="N", ylab="G") # Other special cases of the PYP are also facilitated G_expected(N=50, alpha=c(27.1401, 0), discount=c(-27.1401/100, 0.8054448)) G_variance(N=50, alpha=c(27.1401, 0), discount=c(-27.1401/100, 0.8054448)) # Elicit values for alpha under a DP prior G_calibrate(N=50, EG=25) # Elicit values for alpha under a PYP prior # G_calibrate(N=50, EG=25, discount=c(-27.1401/100, 0.25, 0.7300045)) # Elicit values for discount under a PYP prior # G_calibrate(N=50, EG=25, alpha=c(12.21619, 1, 0), maxiter=2000)
# Certain examples require the use of the Rmpfr library suppressMessages(require("Rmpfr")) G_expected(N=50, alpha=19.23356, MPFR=FALSE) G_variance(N=50, alpha=19.23356, MPFR=FALSE) G_expected(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045), MPFR=FALSE) G_variance(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045), MPFR=c(FALSE, TRUE, TRUE)) # Examine the growth rate of the DP DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i, MPFR=FALSE)) matplot(DP, type="l", xlab="N", ylab="G") # Examine the growth rate of the PYP PY <- sapply(c(0.25, 0.5, 0.75), function(i) G_expected(1:200, alpha=1, discount=i)) matplot(PY, type="l", xlab="N", ylab="G") # Other special cases of the PYP are also facilitated G_expected(N=50, alpha=c(27.1401, 0), discount=c(-27.1401/100, 0.8054448)) G_variance(N=50, alpha=c(27.1401, 0), discount=c(-27.1401/100, 0.8054448)) # Elicit values for alpha under a DP prior G_calibrate(N=50, EG=25) # Elicit values for alpha under a PYP prior # G_calibrate(N=50, EG=25, discount=c(-27.1401/100, 0.25, 0.7300045)) # Elicit values for discount under a PYP prior # G_calibrate(N=50, EG=25, alpha=c(12.21619, 1, 0), maxiter=2000)
Plots the prior distribution of the number of clusters under a Pitman-Yor / Dirichlet process prior, for a sample of size N
at given values of the concentration parameter alpha
and optionally also the discount
parameter. Useful for soliciting sensible priors (or fixed values) for alpha
or discount
under the "IMFA"
and "IMIFA"
methods for mcmc_IMIFA
.
G_priorDensity(N, alpha, discount = 0, show.plot = TRUE, type = "h")
G_priorDensity(N, alpha, discount = 0, show.plot = TRUE, type = "h")
N |
The sample size. |
alpha |
The concentration parameter. Must be specified and must be strictly greater than |
discount |
The discount parameter for the Pitman-Yor process. Must be less than 1, but typically lies in the interval [0, 1). Defaults to 0 (i.e. the Dirichlet process). When |
show.plot |
Logical indicating whether the plot should be displayed (default = |
type |
The type of plot to be drawn, as per |
All arguments are vectorised. Users can also consult G_expected
, G_variance
, and G_calibrate
in order to solicit sensible priors.
A plot of the prior distribution if show.plot
is TRUE
. Density values are returned invisibly. Note that the density values may not strictly sum to one in certain cases, as values small enough to be represented as zero may well be returned.
The actual density values are returned invisibly. Therefore, they can be visualised as desired by the user even if show.plot
is FALSE
.
Requires use of the Rmpfr
and gmp
libraries; may encounter difficulty and slowness for large N
, especially with non-zero discount
values. Despite the high precision arithmetic used, the functions can be unstable for small values of discount
.
Keefe Murphy - <[email protected]>
De Blasi, P., Favaro, S., Lijoi, A., Mena, R. H., Prunster, I., and Ruggiero, M. (2015) Are Gibbs-type priors the most natural generalization of the Dirichlet process?, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(2): 212-229.
# Plot Dirichlet process priors for different values of alpha (DP <- G_priorDensity(N=50, alpha=c(3, 10, 25))) # Verify that these alpha/discount values produce Pitman-Yor process priors with the same mean alpha <- c(19.23356, 6.47006, 1) discount <- c(0, 0.47002, 0.7300045) G_expected(N=50, alpha=alpha, discount=discount) # Now plot them to examine tail behaviour as discount increases # Non-zero discount requires loading the "Rmpfr" library suppressMessages(require("Rmpfr")) (PY <- G_priorDensity(N=50, alpha=alpha, discount=discount, type="l")) # Other special cases of the PYP are also facilitated G_priorDensity(N=50, alpha=c(alpha, 27.1401, 0), discount=c(discount, -27.1401/100, 0.8054448), type="b")
# Plot Dirichlet process priors for different values of alpha (DP <- G_priorDensity(N=50, alpha=c(3, 10, 25))) # Verify that these alpha/discount values produce Pitman-Yor process priors with the same mean alpha <- c(19.23356, 6.47006, 1) discount <- c(0, 0.47002, 0.7300045) G_expected(N=50, alpha=alpha, discount=discount) # Now plot them to examine tail behaviour as discount increases # Non-zero discount requires loading the "Rmpfr" library suppressMessages(require("Rmpfr")) (PY <- G_priorDensity(N=50, alpha=alpha, discount=discount, type="l")) # Other special cases of the PYP are also facilitated G_priorDensity(N=50, alpha=c(alpha, 27.1401, 0), discount=c(discount, -27.1401/100, 0.8054448), type="b")
This function post-processes simulations generated by mcmc_IMIFA
for any of the IMIFA family of models. This includes accounting for label switching, and accounting for rotational invariance via Procrustean methods. It can be re-ran at little computational cost in order to extract different models explored by the sampler used for sims
, without having to re-run the model itself. New results objects using different numbers of clusters and different numbers of factors (if visited by the model in question), or using different model selection criteria (if necessary) can be generated with ease. Posterior predictive checking of the appropriateness of the fitted model is also facilitated.
get_IMIFA_results(sims = NULL, burnin = 0L, thinning = 1L, G = NULL, Q = NULL, criterion = c("bicm", "aicm", "dic", "bic.mcmc", "aic.mcmc"), adapt = FALSE, G.meth = c("mode", "median"), Q.meth = c("mode", "median"), conf.level = 0.95, error.metrics = TRUE, vari.rot = FALSE, z.avgsim = FALSE, zlabels = NULL, nonempty = TRUE, ...) ## S3 method for class 'Results_IMIFA' print(x, ...) ## S3 method for class 'Results_IMIFA' summary(object, MAP = TRUE, ...)
get_IMIFA_results(sims = NULL, burnin = 0L, thinning = 1L, G = NULL, Q = NULL, criterion = c("bicm", "aicm", "dic", "bic.mcmc", "aic.mcmc"), adapt = FALSE, G.meth = c("mode", "median"), Q.meth = c("mode", "median"), conf.level = 0.95, error.metrics = TRUE, vari.rot = FALSE, z.avgsim = FALSE, zlabels = NULL, nonempty = TRUE, ...) ## S3 method for class 'Results_IMIFA' print(x, ...) ## S3 method for class 'Results_IMIFA' summary(object, MAP = TRUE, ...)
sims |
An object of class |
burnin |
Optional additional number of iterations to discard. Defaults to 0, corresponding to no additional burnin. See |
thinning |
Optional interval for extra thinning to be applied. Defaults to 1, corresponding to no additional thinning. See |
G |
If this argument is not specified, results will be returned with the optimal number of clusters. If different numbers of clusters were explored in Similarly, this allows retrieval of samples corresponding to a solution, if visited, with |
Q |
If this argument is not specified, results will be returned with the optimal number of factors. If different numbers of factors were explored in Similarly, this allows retrieval of samples corresponding to a solution, if visited, with If adaptation didn't take place during model-fitting, |
criterion |
The criterion to use for model selection, where model selection is only required if more than one model was run under the Note that the first three options here might exhibit bias in favour of zero-factor models for the finite factor |
adapt |
A logical indicating if adaptation should be applied to the stored loadings and scores matrices to truncate the cluster-specific number(s) of non-redundant factors. This argument is only relevant if |
G.meth |
If the object in |
Q.meth |
If the object in |
conf.level |
The confidence level to be used throughout for credible intervals for all parameters of inferential interest, and error metrics if |
error.metrics |
A logical activating or deactivating posterior predictive checking: i.e. controlling whether metrics quantifying a) the posterior predictive reconstruction error (PPRE) between bin counts of the data and bin counts of replicate draws from the posterior distribution & and b) the error between the empirical and estimated covariance matrices should be computed. These are computed for every valid retained iteration (see The Frobenius norm is used in the computation of the PPRE, by default, but the |
vari.rot |
Logical indicating whether the loadings matrix/matrices template(s) should be |
z.avgsim |
Logical (defaults to Note that the MAP clustering is computed conditional on the estimate of the number of clusters (whether that be the modal estimate or the estimate according to Please be warned that this feature requires loading the |
zlabels |
For any method that performs clustering, the true labels can be supplied if they are known in order to compute clustering performance metrics. This also has the effect of ordering the MAP labels (and thus the ordering of cluster-specific parameters) to most closely correspond to the true labels if supplied. |
nonempty |
For |
x , object , MAP , ...
|
Arguments required for the Users can also pass the When Finally, the |
The function also performs post-hoc corrections for label switching, as well as post-hoc Procrustes rotation of loadings matrices and scores, in order to ensure sensible posterior parameter estimates, computes error metrics, constructs credible intervals, and generally transforms the raw sims
object into an object of class "Results_IMIFA"
in order to prepare the results for plotting via plot.Results_IMIFA
.
For the infinite factor methods, iterations where the maximum number of factors was greater than or equal to the maximum of the estimated cluster-specific factors are retained for posterior summaries of the scores, in order to preserve the estimated dimension of the scores matrices. Similarly, these are also the valid iterations used for the computation of the averages and credible intervals for the error metrics. For the finite factor models, all retained iterations are used in both instances (i.e. both for the scores and the error metrics).
In all cases, only iterations with G
non-empty components are retained.
An object of class "Results_IMIFA"
to be passed to plot.Results_IMIFA
for visualising results. Dedicated print
and summary
functions also exist for objects of this class. The object, say x
, is a list of lists, the most important components of which are:
Clust |
Everything pertaining to clustering performance can be found here for all but the More detail is given if known |
Error |
Everything pertaining the model fit assessment can be found here, incl. the distribution of the PPRE values and associated bin counts for the replicate draws, as well as average error metrics (e.g. MSE, RMSE), and credible intervals quantifying the associated uncertainty, between the empirical and estimated covariance matrix/matrices, both of which are also included. |
GQ.results |
Everything pertaining to model choice can be found here, incl. posterior summaries for the estimated number of clusters and estimated number of factors, if applicable to the method employed. Model selection criterion values are also accessible here. |
Means |
Posterior summaries for the means, after conditioning on |
Loadings |
Posterior summaries for the factor loadings matrix/matrices, after conditioning on The number of iterations retained for posterior summaries of the loadings may vary for different clusters for the infinite factor methods, corresponding to iterations where the cluster-specific number of factors was greater than or equal to the modal estimate of the cluster-specific number of factors. |
Scores |
Posterior summaries for the latent factor scores, after conditioning on the maximum of the estimated number of cluster-specific factors. Summaries are given for the single matrix of factor scores. See For the infinite factor methods, iterations where the maximum number of factors was greater than or equal to the maximum of the estimated cluster-specific factors are retained for posterior summaries of the scores, in order to preserve the estimated dimension of the scores matrices. |
Uniquenesses |
Posterior summaries for the uniquenesses, after conditioning on |
The objects Means
, Loadings
, Scores
and Uniquenesses
(if stored when calling mcmc_IMIFA
!) also contain, as well as the posterior summaries, the entire chain of valid samples of each, as well as, for convenience, the last valid samples of each (after conditioning on the modal G
and Q
values, and accounting for label switching, and rotational invariance via Procrustes rotation).
For the "IMIFA"
, "IMFA"
, "OMIFA"
, and "OMFA"
methods, the retained mixing proportions are renormalised after conditioning on the modal G
. This is especially necessary for the computation of the error.metrics
, just note that the values on which posterior inference are conducted will ever so slightly differ from the actually sampled values.
Due to the way the offline label-switching correction is performed, different runs of this function may give very slightly different results in terms of the cluster labellings (and by extension the parameters, which are permuted in the same way), but only if the chain was run for an extremely small number of iterations, well below the number required for convergence, and samples of the cluster labels match poorly across iterations (particularly if the number of clusters suggested by those sampled labels is high).
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
plot.Results_IMIFA
, mcmc_IMIFA
, Zsimilarity
, scores_MAP
, sim_IMIFA_model
, Procrustes
, varimax
, norm
, mgpControl
, storeControl
# data(coffee) # data(olive) # Run a MFA model on the coffee data over a range of clusters and factors. # simMFAcoffee <- mcmc_IMIFA(coffee, method="MFA", range.G=2:3, range.Q=0:3, n.iters=1000) # Accept all defaults to extract the optimal model. # resMFAcoffee <- get_IMIFA_results(simMFAcoffee) # Instead let's get results for a 3-cluster model, allowing Q be chosen by aic.mcmc. # resMFAcoffee2 <- get_IMIFA_results(simMFAcoffee, G=3, criterion="aic.mcmc") # Run an IMIFA model on the olive data, accepting all defaults. # simIMIFAolive <- mcmc_IMIFA(olive, method="IMIFA", n.iters=10000) # Extract optimum results # Estimate G & Q by the median of their posterior distributions # Construct 90% credible intervals and try to return the similarity matrix. # resIMIFAolive <- get_IMIFA_results(simIMIFAolive, G.meth="median", Q.meth="median", # conf.level=0.9, z.avgsim=TRUE) # summary(resIMIFAolive) # Simulate new data from the above model # newdata <- sim_IMIFA_model(resIMIFAolive) # Fit an IFA model without adaptation and examine results with and without post-hoc adaptation # Use the "SC" criterion for determining the number of active factors # simIFAnoadapt <- mcmc_IMIFA(olive, method="IFA", n.iters=10000, adapt=FALSE) # resIFAnoadapt <- get_IMIFA_results(simIFAnoadapt) # resIFApostadapt <- get_IMIFA_results(simIFAnoadapt, adapt=TRUE, active.crit="SC") # Compare to an IFA model with adaptive Gibbs sampling # simIFAadapt <- mcmc_IMIFA(coffee, method="IFA", n.iters=10000, active.crit="BD") # resIFAadapt <- get_IMIFA_results(simIFAadapt) # plot(resIFAnoadapt, "GQ") # plot(resIFApostadapt, "GQ") # plot(resIFAadapt, "GQ")
# data(coffee) # data(olive) # Run a MFA model on the coffee data over a range of clusters and factors. # simMFAcoffee <- mcmc_IMIFA(coffee, method="MFA", range.G=2:3, range.Q=0:3, n.iters=1000) # Accept all defaults to extract the optimal model. # resMFAcoffee <- get_IMIFA_results(simMFAcoffee) # Instead let's get results for a 3-cluster model, allowing Q be chosen by aic.mcmc. # resMFAcoffee2 <- get_IMIFA_results(simMFAcoffee, G=3, criterion="aic.mcmc") # Run an IMIFA model on the olive data, accepting all defaults. # simIMIFAolive <- mcmc_IMIFA(olive, method="IMIFA", n.iters=10000) # Extract optimum results # Estimate G & Q by the median of their posterior distributions # Construct 90% credible intervals and try to return the similarity matrix. # resIMIFAolive <- get_IMIFA_results(simIMIFAolive, G.meth="median", Q.meth="median", # conf.level=0.9, z.avgsim=TRUE) # summary(resIMIFAolive) # Simulate new data from the above model # newdata <- sim_IMIFA_model(resIMIFAolive) # Fit an IFA model without adaptation and examine results with and without post-hoc adaptation # Use the "SC" criterion for determining the number of active factors # simIFAnoadapt <- mcmc_IMIFA(olive, method="IFA", n.iters=10000, adapt=FALSE) # resIFAnoadapt <- get_IMIFA_results(simIFAnoadapt) # resIFApostadapt <- get_IMIFA_results(simIFAnoadapt, adapt=TRUE, active.crit="SC") # Compare to an IFA model with adaptive Gibbs sampling # simIFAadapt <- mcmc_IMIFA(coffee, method="IFA", n.iters=10000, active.crit="BD") # resIFAadapt <- get_IMIFA_results(simIFAadapt) # plot(resIFAnoadapt, "GQ") # plot(resIFApostadapt, "GQ") # plot(resIFAadapt, "GQ")
Samples cluster labels for N observations from G clusters efficiently using log-probabilities and the so-called Gumbel-Max trick, without requiring that the log-probabilities be normalised; thus redundant computation can be avoided.
gumbel_max(probs, slice = FALSE)
gumbel_max(probs, slice = FALSE)
probs |
An N x G matrix of unnormalised probabilities on the log scale, where N is he number of observations that require labels to be sampled and G is the number of active clusters s.t. sampled labels can take values in |
slice |
A logical indicating whether or not the indicator correction for slice sampling has been applied to |
Computation takes place on the log scale for stability/underflow reasons (to ensure negligible probabilities won't cause computational difficulties); in any case, many functions for calculating multivariate normal densities already output on the log scale.
A vector of N sampled cluster labels, with the largest label no greater than G.
Though the function is available for standalone use, note that no checks take place, in order to speed up repeated calls to the function inside mcmc_IMIFA
.
If the normalising constant is required for another reason, e.g. to compute the log-likelihood, it can be calculated by summing the output obtained by calling rowLogSumExps
on probs
.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Yellott, J. I. Jr. (1977) The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution, Journal of Mathematical Psychology, 15(2): 109-144.
# Create weights for 3 components G <- 3 weights <- seq_len(G) # Call gumbel_max() repeatedly to obtain samples of the labels, zs iters <- 10000 zs <- replicate(iters, gumbel_max(probs=log(weights))) # Compare answer to the normalised weights tabulate(zs, nbins=G)/iters (normalised <- as.numeric(weights/sum(weights))) # Simulate a matrix of Dirichlet weights & the associated vector of N labels N <- 400 G <- 8 sizes <- seq(from=85, to=15, by=-10) weights <- matrix(rDirichlet(N * G, alpha=1, nn=sizes), byrow=TRUE, nrow=N, ncol=G) (zs <- gumbel_max(probs=log(weights)))
# Create weights for 3 components G <- 3 weights <- seq_len(G) # Call gumbel_max() repeatedly to obtain samples of the labels, zs iters <- 10000 zs <- replicate(iters, gumbel_max(probs=log(weights))) # Compare answer to the normalised weights tabulate(zs, nbins=G)/iters (normalised <- as.numeric(weights/sum(weights))) # Simulate a matrix of Dirichlet weights & the associated vector of N labels N <- 400 G <- 8 sizes <- seq(from=85, to=15, by=-10) weights <- matrix(rDirichlet(N * G, alpha=1, nn=sizes), byrow=TRUE, nrow=N, ncol=G) (zs <- gumbel_max(probs=log(weights)))
Using only base graphics, this function appends a colour key legend for heatmaps produced by, for instance, plot_cols
or image
.
heat_legend(data, cols = NULL, breaks = NULL, cex.lab = 1, ...)
heat_legend(data, cols = NULL, breaks = NULL, cex.lab = 1, ...)
data |
Either the data with which the heatmap was created or a vector containing its minimum and maximum values. Missing values are ignored. |
cols |
The colour palette used when the heatmap was created. By default, the same |
breaks |
Optional argument giving the break-points for the axis labels. |
cex.lab |
Magnification of axis annotation, indicating the amount by which plotting text and symbols should be scaled relative to the default of 1. |
... |
Catches unused arguments. |
Modifies an existing plot by adding a colour key legend.
image
, plot_cols
, mat2cols
, is.cols
# Generate a matrix and plot it with a legend data <- matrix(rnorm(50), nrow=10, ncol=5) cols <- heat.colors(12)[12:1] par(mar=c(5.1, 4.1, 4.1, 3.1)) plot_cols(mat2cols(data, col=cols)) heat_legend(data, cols); box(lwd=2)
# Generate a matrix and plot it with a legend data <- matrix(rnorm(50), nrow=10, ncol=5) cols <- heat.colors(12)[12:1] par(mar=c(5.1, 4.1, 4.1, 3.1)) plot_cols(mat2cols(data, col=cols)) heat_legend(data, cols); box(lwd=2)
Show the NEWS
file of the IMIFA
package.
IMIFA_news()
IMIFA_news()
The IMIFA
NEWS
file, provided the session is interactive.
IMIFA_news()
IMIFA_news()
Checks if the supplied vector contains valid colours.
is.cols(cols)
is.cols(cols)
cols |
A vector of colours, usually as a character string. |
A logical vector of length length(cols)
which is TRUE
for entries which are valid colours and FALSE
otherwise.
all(is.cols(1:5)) all(is.cols(heat.colors(30))) any(!is.cols(c("red", "green", "aquamarine")))
all(is.cols(1:5)) all(is.cols(heat.colors(30))) any(!is.cols(c("red", "green", "aquamarine")))
Tests whether all eigenvalues of a symmetric matrix are positive (or strictly non-negative) to check for positive-definiteness and positive-semidefiniteness, respectively. If the supplied matrix doesn't satisfy the test, the nearest matrix which does can optionally be returned.
is.posi_def(x, tol = NULL, semi = FALSE, make = FALSE)
is.posi_def(x, tol = NULL, semi = FALSE, make = FALSE)
x |
A matrix, assumed to be real and symmetric. |
tol |
Tolerance for singular values and for absolute eigenvalues - only those with values larger than tol are considered non-zero. (default: tol = |
semi |
Logical switch to test for positive-semidefiniteness when |
make |
Logical switch to return the nearest matrix which satisfies the test - if the test has been passed, this is of course just |
If isTRUE(make)
, a list with two components:
check |
A logical value indicating whether the matrix satisfies the test. |
X.new |
The nearest matrix which satisfies the test (which may just be the input matrix itself.) |
Otherwise, only the logical value indicating whether the matrix satisfies the test is returned.
x <- cov(matrix(rnorm(100), nrow=10, ncol=10)) is.posi_def(x) #FALSE is.posi_def(x, semi=TRUE) #TRUE Xnew <- is.posi_def(x, semi=FALSE, make=TRUE)$X.new identical(x, Xnew) #FALSE identical(x, is.posi_def(x, semi=TRUE, make=TRUE)$X.new) #TRUE
x <- cov(matrix(rnorm(100), nrow=10, ncol=10)) is.posi_def(x) #FALSE is.posi_def(x, semi=TRUE) #TRUE Xnew <- is.posi_def(x, semi=FALSE, make=TRUE)$X.new identical(x, Xnew) #FALSE identical(x, is.posi_def(x, semi=TRUE, make=TRUE)$X.new) #TRUE
Returns the maximum number of latent factors in a factor analysis model for data of dimension P
which actually achieves dimension reduction in terms of the number of covariance parameters. This Ledermann bound is given by the largest integer smaller than or equal to the solution of
.
Ledermann(P, isotropic = FALSE, int = TRUE)
Ledermann(P, isotropic = FALSE, int = TRUE)
P |
Integer number of variables in data set. This argument is vectorised. |
isotropic |
Logical indicating whether uniquenesses are constrained to be isotropic, in which case the bound is simply |
int |
Logical indicating if the result should be returned as an integer by applying the |
The Ledermann bound when istropic
is FALSE
is given by .
The Ledermann bound, a non-negative integer obtained using floor
, or a vector of length(P)
such bounds.
It has also been argued that the number of factors should not exceed floor((P - 1)/2)
, which is a necessarily stricter condition.
Anderson, T. W. and Rubin, H. (1956) Statistical inference in factor analysis. In Neyman, J. (Ed.), Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 3.5: Contributions to Econometrics, Industrial Research, and Psychometry, University of California Press, Berkeley, CA, U.S.A., pp. 111-150.
Ledermann(c(25, 50, 100)) floor((c(25, 50, 100) - 1) / 2) # stricter bounds data(olive) P <- ncol(olive[,-(1:2)]) Ledermann(P) Ledermann(P, int=FALSE) floor((P - 1)/2) # stricter bound
Ledermann(c(25, 50, 100)) floor((c(25, 50, 100) - 1) / 2) # stricter bounds data(olive) P <- ncol(olive[,-(1:2)]) Ledermann(P) Ledermann(P, int=FALSE) floor((P - 1)/2) # stricter bound
Functions to draw pseudo-random numbers from, or calculate the expectation of, left-truncated gamma distributions (see Details below).
rltrgamma(n, shape, rate = 1, trunc = 1) exp_ltrgamma(shape, rate = 1, trunc = 1, inverse = FALSE)
rltrgamma(n, shape, rate = 1, trunc = 1) exp_ltrgamma(shape, rate = 1, trunc = 1, inverse = FALSE)
n |
Number of observations to generate. |
shape |
Shape parameter for the desired gamma distribution. Must be strictly positive |
rate |
Rate parameter for the desired gamma distribution. Must be strictly positive. |
trunc |
The point of left truncation (corresponding to |
inverse |
A logical indicating whether to calculate the expectation for a right-truncated inverse gamma distribution instead of a left-truncated gamma distribution. Defaults to |
The left-truncated gamma distribution has PDF:
for , and
, where
and
are the
shape
and rate
parameters, respectively, is the cutoff point at which
trunc
ation occurs, and is the upper incomplete gamma function.
For rltrgamma
, a vector of length n
giving draws from the left-truncated gamma distribution with the specified shape
and rate
parameters, and truncation point trunc
.
For exp_ltrgamma
, the expected value of a left-truncated (inverse) gamma distribution.
rltrgamma
is invoked internally for the "IFA"
, "MIFA"
, "OMIFA"
, and "IMIFA"
models to draw column shrinkage parameters for all but the first loadings column under the MGP prior when truncated=TRUE
(which is not the default) is supplied to mgpControl
, at the expense of slightly longer run times. exp_ltrgamma
is used within MGP_check
to check the validity of the MGP hyperparameters when truncated=TRUE
(which is again, not the default). Both functions always assume trunc=1
for these internal usages.
Note also that no arguments are recycled, i.e. all arguments must be of length 1
.
Keefe Murphy - <[email protected]>
Dagpunar, J. S. (1978) Sampling of variates from a truncated gamma distribution, Statistical Computation and Simulation, 8(1): 59-64.
# Generate left-truncated Ga(3.1, 2.1, 1) variates rltrgamma(n=10, shape=3.1, rate=2.1) # Calculate the expectation of a Ga(3.1, 2.1, 1) distribution exp_ltrgamma(shape=3.1, rate=2.1) # Calculate the expectation of an inverse gamma distribution right-truncated at 2 exp_ltrgamma(shape=3.1, rate=2.1, trunc=2, inverse=TRUE)
# Generate left-truncated Ga(3.1, 2.1, 1) variates rltrgamma(n=10, shape=3.1, rate=2.1) # Calculate the expectation of a Ga(3.1, 2.1, 1) distribution exp_ltrgamma(shape=3.1, rate=2.1) # Calculate the expectation of an inverse gamma distribution right-truncated at 2 exp_ltrgamma(shape=3.1, rate=2.1, trunc=2, inverse=TRUE)
Converts a matrix to a hex colour code representation for plotting using plot_cols
. Used internally by plot.Results_IMIFA
for plotting posterior mean loadings heatmaps.
mat2cols(mat, cols = NULL, compare = FALSE, byrank = FALSE, breaks = NULL, na.col = "#808080FF", transparency = 1, ...)
mat2cols(mat, cols = NULL, compare = FALSE, byrank = FALSE, breaks = NULL, na.col = "#808080FF", transparency = 1, ...)
mat |
Either a matrix or, when |
cols |
The colour palette to be used. The default palette uses |
compare |
Logical switch used when desiring comparable colour representations (usually for comparable heat maps) across multiple matrices. Ensures plots will be calibrated to a common colour scale so that, for instance, the colour on the heat map of an entry valued at 0.7 in Matrix A corresponds exactly to the colour of a similar value in Matrix B. When |
byrank |
Logical indicating whether to convert the matrix itself or the sample ranks of the values therein. Defaults to |
breaks |
Number of gradations in colour to use. Defaults to |
na.col |
Colour to be used to represent missing data. Will be checked for validity by |
transparency |
A factor in [0, 1] modifying the opacity for overplotted lines. Defaults to 1 (i.e. no transparency). Only relevant when |
... |
Catches unused arguments. |
A matrix of hex colour code representations, or a list of such matrices when compare
is TRUE
.
plot_cols
, heat_legend
, is.cols
, cut
# Generate a colour matrix using mat2cols() mat <- matrix(rnorm(100), nrow=10, ncol=10) mat[2,3] <- NA cols <- heat.colors(12)[12:1] (matcol <- mat2cols(mat, cols=cols)) # Use plot_cols() to visualise the colours matrix par(mar=c(5.1, 4.1, 4.1, 3.1)) plot_cols(matcol) # Add a legend using heat_legend() heat_legend(mat, cols=cols); box(lwd=2) # Try comparing heat maps of multiple matrices mat1 <- cbind(matrix(rnorm(100, sd=c(4,2)), nr=50, nc=2, byrow=TRUE), 0.1) mat2 <- cbind(matrix(rnorm(150, sd=c(7,5,3)), nr=50, nc=3, byrow=TRUE), 0.1) mat3 <- cbind(matrix(rnorm(50, sd=1), nr=50, nc=1, byrow=TRUE), 0.1) mats <- list(mat1, mat2, mat3) colmats <- mat2cols(mats, cols=cols, compare=TRUE) par(mfrow=c(2, 3), mar=c(1, 2, 1, 2)) # Use common palettes (top row) plot_cols(colmats[[1]]); heat_legend(range(mats), cols=cols); box(lwd=2) plot_cols(colmats[[2]]); heat_legend(range(mats), cols=cols); box(lwd=2) plot_cols(colmats[[3]]); heat_legend(range(mats), cols=cols); box(lwd=2) # Use uncommon palettes (bottom row) plot_cols(mat2cols(mat1, cols=cols)); heat_legend(range(mat1), cols=cols); box(lwd=2) plot_cols(mat2cols(mat2, cols=cols)); heat_legend(range(mat2), cols=cols); box(lwd=2) plot_cols(mat2cols(mat3, cols=cols)); heat_legend(range(mat3), cols=cols); box(lwd=2)
# Generate a colour matrix using mat2cols() mat <- matrix(rnorm(100), nrow=10, ncol=10) mat[2,3] <- NA cols <- heat.colors(12)[12:1] (matcol <- mat2cols(mat, cols=cols)) # Use plot_cols() to visualise the colours matrix par(mar=c(5.1, 4.1, 4.1, 3.1)) plot_cols(matcol) # Add a legend using heat_legend() heat_legend(mat, cols=cols); box(lwd=2) # Try comparing heat maps of multiple matrices mat1 <- cbind(matrix(rnorm(100, sd=c(4,2)), nr=50, nc=2, byrow=TRUE), 0.1) mat2 <- cbind(matrix(rnorm(150, sd=c(7,5,3)), nr=50, nc=3, byrow=TRUE), 0.1) mat3 <- cbind(matrix(rnorm(50, sd=1), nr=50, nc=1, byrow=TRUE), 0.1) mats <- list(mat1, mat2, mat3) colmats <- mat2cols(mats, cols=cols, compare=TRUE) par(mfrow=c(2, 3), mar=c(1, 2, 1, 2)) # Use common palettes (top row) plot_cols(colmats[[1]]); heat_legend(range(mats), cols=cols); box(lwd=2) plot_cols(colmats[[2]]); heat_legend(range(mats), cols=cols); box(lwd=2) plot_cols(colmats[[3]]); heat_legend(range(mats), cols=cols); box(lwd=2) # Use uncommon palettes (bottom row) plot_cols(mat2cols(mat1, cols=cols)); heat_legend(range(mat1), cols=cols); box(lwd=2) plot_cols(mat2cols(mat2, cols=cols)); heat_legend(range(mat2), cols=cols); box(lwd=2) plot_cols(mat2cols(mat3, cols=cols)); heat_legend(range(mat3), cols=cols); box(lwd=2)
Carries out Gibbs sampling for all models from the IMIFA family, facilitating model-based clustering with dimensionally reduced factor-analytic covariance structures, with automatic estimation of the number of clusters and cluster-specific factors as appropriate to the method employed. Factor analysis with one group (FA/IFA), finite mixtures (MFA/MIFA), overfitted mixtures (OMFA/OMIFA), infinite factor models which employ the multiplicative gamma process (MGP) shrinkage prior (IFA/MIFA/OMIFA/IMIFA), and infinite mixtures which employ Pitman-Yor / Dirichlet Process Mixture Models (IMFA/IMIFA) are all provided.
mcmc_IMIFA(dat, method = c("IMIFA", "IMFA", "OMIFA", "OMFA", "MIFA", "MFA", "IFA", "FA", "classify"), range.G = NULL, range.Q = NULL, MGP = mgpControl(...), BNP = bnpControl(...), mixFA = mixfaControl(...), alpha = NULL, storage = storeControl(...), ...) ## S3 method for class 'IMIFA' print(x, ...) ## S3 method for class 'IMIFA' summary(object, ...)
mcmc_IMIFA(dat, method = c("IMIFA", "IMFA", "OMIFA", "OMFA", "MIFA", "MFA", "IFA", "FA", "classify"), range.G = NULL, range.Q = NULL, MGP = mgpControl(...), BNP = bnpControl(...), mixFA = mixfaControl(...), alpha = NULL, storage = storeControl(...), ...) ## S3 method for class 'IMIFA' print(x, ...) ## S3 method for class 'IMIFA' summary(object, ...)
dat |
A matrix or data frame such that rows correspond to observations ( |
method |
An acronym for the type of model to fit where:
In principle, of course, one could overfit the |
range.G |
Depending on the method employed, either the range of values for the number of clusters, or the conservatively high starting value for the number of clusters. Defaults to (and must be!) For the If |
range.Q |
Depending on the method employed, either the range of values for the number of latent factors or, for methods ending in IFA, the conservatively high starting value for the number of cluster-specific factors, in which case the default starting value is For methods ending in IFA, different clusters can be modelled using different numbers of latent factors (incl. zero); for methods not ending in IFA it is possible to fit zero-factor models, corresponding to simple diagonal covariance structures. For instance, fitting the If See |
MGP |
A list of arguments pertaining to the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS). For use with the infinite factor models |
BNP |
A list of arguments pertaining to the Bayesian Nonparametric Pitman-Yor / Dirichlet process priors, for use with the infinite mixture models |
mixFA |
A list of arguments pertaining to all other aspects of model fitting, e.g. MCMC settings, cluster initialisation, and hyperparameters common to every |
alpha |
Depending on the method employed, either the hyperparameter of the Dirichlet prior for the cluster mixing proportions, or the Pitman-Yor / Dirichlet process concentration parameter. Defaults to
See |
storage |
A vector of named logical indicators governing storage of parameters of interest for all models in the IMIFA family. Defaults are set by a call to |
... |
An alternative means of passing control parameters directly via the named arguments of |
x , object
|
Object of class |
Creates a raw object of class "IMIFA"
from which the optimal/modal model can be extracted by get_IMIFA_results
. Dedicated print
and summary
functions exist for objects of class "IMIFA"
.
A list of lists of lists of class "IMIFA"
to be passed to get_IMIFA_results
. If the returned object is x
, candidate models are accessible via subsetting, where x
is of the following form:
x[[1:length(range.G)]][[1:length(range.Q)]]
.
However, these objects of class "IMIFA" should rarely if ever be manipulated by hand - use of the get_IMIFA_results
function is strongly advised.
Further control over the specification of advanced function arguments can be obtained with recourse to the following functions:
mgpControl
Supply arguments (with defaults) pertaining to the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS). For use with the infinite factor models "IFA"
, "MIFA"
, "OMIFA"
, and "IMIFA"
only.
bnpControl
Supply arguments (with defaults) pertaining to the Bayesian Nonparametric Pitman-Yor / Dirichlet process priors, for use with the infinite mixture models "IMFA"
and "IMIFA"
. Certain arguments related to the Dirichlet concentration parameter for the overfitted mixtures "OMFA"
and "OMIFA"
can be supplied in this manner also.
mixfaControl
Supply arguments (with defaults) pertaining to all other aspects of model fitting (e.g. MCMC settings, cluster initialisation, and hyperparameters common to every method
in the IMIFA
family.
storeControl
Supply logical indicators governing storage of parameters of interest for all models in the IMIFA family. It may be useful not to store certain parameters if memory is an issue (e.g. for large data sets or for a large number of MCMC iterations after burnin and thinning).
Note however that the named arguments of these functions can also be supplied directly. Parameter starting values are obtained by simulation from the relevant prior distribution specified in these control functions, though initial means and mixing proportions are computed empirically.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Bhattacharya, A. and Dunson, D. B. (2011) Sparse Bayesian infinite factor models, Biometrika, 98(2): 291-306.
Kalli, M., Griffin, J. E. and Walker, S. G. (2011) Slice sampling mixture models, Statistics and Computing, 21(1): 93-105.
Rousseau, J. and Mengersen, K. (2011) Asymptotic Behaviour of the posterior distribution in overfitted mixture models, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(5): 689-710.
McNicholas, P. D. and Murphy, T. B. (2008) Parsimonious Gaussian mixture models, Statistics and Computing, 18(3): 285-296.
get_IMIFA_results
, mixfaControl
, mgpControl
, bnpControl
, storeControl
, Ledermann
# data(olive) # data(coffee) # Fit an IMIFA model to the olive data. Accept all defaults. # simIMIFA <- mcmc_IMIFA(olive, method="IMIFA") # summary(simIMIFA) # Fit an IMIFA model assuming a Pitman-Yor prior. # Control the balance between the DP and PY priors using the kappa parameter. # simPY <- mcmc_IMIFA(olive, method="IMIFA", kappa=0.75) # summary(simPY) # Fit a MFA model to the scaled olive data, with isotropic uniquenesses (i.e. MPPCA). # Allow diagonal covariance as a special case where range.Q = 0. # Don't store the scores. Accept all other defaults. # simMFA <- mcmc_IMIFA(olive, method="MFA", n.iters=10000, range.G=3:6, range.Q=0:3, # score.switch=FALSE, centering=FALSE, uni.type="isotropic") # Fit a MIFA model to the centered & scaled coffee data, w/ cluster labels initialised by K-Means. # Note that range.Q doesn't need to be specified. Allow IFA as a special case where range.G=1. # simMIFA <- mcmc_IMIFA(coffee, method="MIFA", n.iters=10000, range.G=1:3, z.init="kmeans") # Fit an IFA model to the centered and pareto scaled olive data. # Note that range.G doesn't need to be specified. We can optionally supply a range.Q starting value. # Enforce additional shrinkage using alpha.d1, alpha.d2, prop, and eps (via mgpControl()). # simIFA <- mcmc_IMIFA(olive, method="IFA", n.iters=10000, range.Q=4, scaling="pareto", # alpha.d1=2.5, alpha.d2=4, prop=0.6, eps=0.12) # Fit an OMIFA model to the centered & scaled coffee data. # Supply a sufficiently small alpha value. Try varying other hyperparameters. # Accept the default value for the starting number of factors, # but supply a value for the starting number of clusters. # Try constraining uniquenesses to be common across both variables and clusters. # simOMIFA <- mcmc_IMIFA(coffee, method="OMIFA", range.G=10, psi.alpha=3, # phi.hyper=c(2, 1), alpha=0.8, uni.type="single")
# data(olive) # data(coffee) # Fit an IMIFA model to the olive data. Accept all defaults. # simIMIFA <- mcmc_IMIFA(olive, method="IMIFA") # summary(simIMIFA) # Fit an IMIFA model assuming a Pitman-Yor prior. # Control the balance between the DP and PY priors using the kappa parameter. # simPY <- mcmc_IMIFA(olive, method="IMIFA", kappa=0.75) # summary(simPY) # Fit a MFA model to the scaled olive data, with isotropic uniquenesses (i.e. MPPCA). # Allow diagonal covariance as a special case where range.Q = 0. # Don't store the scores. Accept all other defaults. # simMFA <- mcmc_IMIFA(olive, method="MFA", n.iters=10000, range.G=3:6, range.Q=0:3, # score.switch=FALSE, centering=FALSE, uni.type="isotropic") # Fit a MIFA model to the centered & scaled coffee data, w/ cluster labels initialised by K-Means. # Note that range.Q doesn't need to be specified. Allow IFA as a special case where range.G=1. # simMIFA <- mcmc_IMIFA(coffee, method="MIFA", n.iters=10000, range.G=1:3, z.init="kmeans") # Fit an IFA model to the centered and pareto scaled olive data. # Note that range.G doesn't need to be specified. We can optionally supply a range.Q starting value. # Enforce additional shrinkage using alpha.d1, alpha.d2, prop, and eps (via mgpControl()). # simIFA <- mcmc_IMIFA(olive, method="IFA", n.iters=10000, range.Q=4, scaling="pareto", # alpha.d1=2.5, alpha.d2=4, prop=0.6, eps=0.12) # Fit an OMIFA model to the centered & scaled coffee data. # Supply a sufficiently small alpha value. Try varying other hyperparameters. # Accept the default value for the starting number of factors, # but supply a value for the starting number of clusters. # Try constraining uniquenesses to be common across both variables and clusters. # simOMIFA <- mcmc_IMIFA(coffee, method="OMIFA", range.G=10, psi.alpha=3, # phi.hyper=c(2, 1), alpha=0.8, uni.type="single")
Checks the hyperparameters for the multiplicative gamma process (MGP) shrinkage prior in order to ensure that the property of cumulative shrinkage (in expectation) holds, i.e. checks whether growing mass is assigned to small neighbourhoods of zero as the column index increases.
MGP_check(ad1, ad2, Q = 3L, phi.shape = NULL, phi.rate = NULL, sigma.shape = NULL, sigma.rate = NULL, bd1 = 1, bd2 = 1, truncated = FALSE, inverse = TRUE)
MGP_check(ad1, ad2, Q = 3L, phi.shape = NULL, phi.rate = NULL, sigma.shape = NULL, sigma.rate = NULL, bd1 = 1, bd2 = 1, truncated = FALSE, inverse = TRUE)
ad1 , ad2
|
Shape hyperparameters for |
Q |
Number of latent factors. Defaults to |
phi.shape , phi.rate
|
The shape and rate hyperparameters for the gamma prior on the local shrinkage parameters. Not necessary for checking if the cumulative shrinkage property holds, but worth supplying both if the actual a priori expected shrinkage factors are of interest. The default value(s) depends on the value of |
sigma.shape , sigma.rate
|
The shape and rate hyperparameters for the gamma prior on the cluster shrinkage parameters. Not necessary for checking if the cumulative shrinkage property holds, but worth supplying both if the actual a priori expected shrinkage factors are of interest. The default value(s) depends on the value of |
bd1 , bd2
|
Rate hyperparameters for |
truncated |
A logical value indicating whether the version of the MGP prior based on left-truncated gamma distributions is invoked (see |
inverse |
Logical indicator for whether the cumulative shrinkage property is assessed against the induced Inverse Gamma prior, the default, or in terms of the Gamma prior (which is incorrect). This is always |
This is called inside mcmc_IMIFA
for the "IFA"
, "MIFA"
, "OMIFA"
and "IMIFA"
methods. This function is vectorised with respect to the arguments ad1
, ad2
, phi.shape
, phi.rate
, sigma.shape
, sigma.rate
, bd1
and bd2
.
A list of length 2 containing the following objects:
expectation |
The vector (or list of vectors) of actual expected a priori shrinkage factors. |
valid |
A logical (or vector of logicals) indicating whether the cumulative shrinkage property holds (in expectation). |
It is recommended that ad2
be moderately large relative to ad1
, even if valid
can sometimes be TRUE
when this is not the case (e.g. when truncated=TRUE
). Similarly, satisfying this condition is no guarantee that valid
will be TRUE
, unless truncated=TRUE
. Therefore, a warning is returned if ad1 <= ad2
, regardless of the value taken by valid
, when truncated=FALSE
(the default).
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Durante, D. (2017). A note on the multiplicative gamma process, Statistics & Probability Letters, 122: 198-204.
Bhattacharya, A. and Dunson, D. B. (2011). Sparse Bayesian infinite factor models, Biometrika, 98(2): 291-306.
Zhang, X., Dunson, D. B., and Carin, L. (2011) Tree-structured infinite sparse factor model. In Getoor, L. and Scheffer, T. (Eds.), Proceedings of the 28th International Conference on Machine Learning (ICML 2011), June 28-July 2, 2011, Bellevue, WA, USA, pp. 785-792. Madison, WI, USA: Omnipress.
# Check if expected shrinkage under the MGP increases with the column index (WRONG approach!). MGP_check(ad1=1.5, ad2=1.8, Q=10, phi.shape=3, inverse=FALSE)$valid #TRUE # Check if the induced IG prior on the MGP column shrinkage parameters # is stochastically increasing, thereby inducing cumulative shrinkage (CORRECT approach!). MGP_check(ad1=1.5, ad2=1.8, Q=10, phi.shape=3, inverse=TRUE)$valid #FALSE # Check again with a parameterisation that IS valid and examine the expected shrinkage values (shrink <- MGP_check(ad1=1.5, ad2=2.8, Q=10, phi.shape=2, phi.rate=0.5, inverse=TRUE)) # Check previously invalid parameterisation again using truncated version of the MGP prior MGP_check(ad1=1.5, ad2=1.8, Q=10, phi.shape=3, truncated=TRUE)$valid #TRUE
# Check if expected shrinkage under the MGP increases with the column index (WRONG approach!). MGP_check(ad1=1.5, ad2=1.8, Q=10, phi.shape=3, inverse=FALSE)$valid #TRUE # Check if the induced IG prior on the MGP column shrinkage parameters # is stochastically increasing, thereby inducing cumulative shrinkage (CORRECT approach!). MGP_check(ad1=1.5, ad2=1.8, Q=10, phi.shape=3, inverse=TRUE)$valid #FALSE # Check again with a parameterisation that IS valid and examine the expected shrinkage values (shrink <- MGP_check(ad1=1.5, ad2=2.8, Q=10, phi.shape=2, phi.rate=0.5, inverse=TRUE)) # Check previously invalid parameterisation again using truncated version of the MGP prior MGP_check(ad1=1.5, ad2=1.8, Q=10, phi.shape=3, truncated=TRUE)$valid #TRUE
Supplies a list of arguments for use in mcmc_IMIFA
pertaining to the use of the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS) for use with the infinite factor models "IFA"
, "MIFA"
, "OMIFA"
, and "IMIFA"
.
mgpControl(alpha.d1 = 2.1, alpha.d2 = 3.1, phi.hyper = c(3, 2), sigma.hyper = c(3, 2), active.crit = c("BD", "SC"), prop = switch(active.crit, BD=0.7, SC=0.99), eps = 0.1, adapt = TRUE, forceQg = FALSE, cluster.shrink = TRUE, truncated = FALSE, b0 = 0.1, b1 = 5e-05, beta.d1 = 1, beta.d2 = 1, start.AGS = 2L, stop.AGS = Inf, delta0g = FALSE, ...)
mgpControl(alpha.d1 = 2.1, alpha.d2 = 3.1, phi.hyper = c(3, 2), sigma.hyper = c(3, 2), active.crit = c("BD", "SC"), prop = switch(active.crit, BD=0.7, SC=0.99), eps = 0.1, adapt = TRUE, forceQg = FALSE, cluster.shrink = TRUE, truncated = FALSE, b0 = 0.1, b1 = 5e-05, beta.d1 = 1, beta.d2 = 1, start.AGS = 2L, stop.AGS = Inf, delta0g = FALSE, ...)
alpha.d1 |
Shape hyperparameter of the column shrinkage on the first column of the loadings according to the MGP shrinkage prior. Passed to |
alpha.d2 |
Shape hyperparameter of the column shrinkage on the subsequent columns of the loadings according to the MGP shrinkage prior. Passed to |
phi.hyper |
A vector of length 2 giving the shape and rate hyperparameters for the gamma prior on the local shrinkage parameters. Passed to |
sigma.hyper |
A vector of length 2 giving the shape and rate hyperparameters for the gamma prior on the cluster shrinkage parameters. Passed to |
active.crit |
A character string indicating which criterion to use to determine the number of active factors during adaptive Gibbs sampling (i.e. only relevant when |
prop |
Only relevant when
|
eps |
Only relevant when |
adapt |
A logical value indicating whether adaptation of the number of cluster-specific factors is to take place when the MGP prior is employed. Defaults to |
forceQg |
A logical indicating whether the upper limit on the number of cluster-specific factors |
cluster.shrink |
A logical value indicating whether to place the prior specified by |
truncated |
A logical value indicating whether the version of the MGP prior based on left-truncated gamma distributions is invoked (see Zhang et al. reference below and additional relevant documentation in |
b0 , b1
|
Intercept & slope parameters for the exponentially decaying adaptation probability:
Defaults to |
beta.d1 |
Rate hyperparameter of the column shrinkage on the first column of the loadings according to the MGP shrinkage prior. Passed to |
beta.d2 |
Rate hyperparameter of the column shrinkage on the subsequent columns of the loadings according to the MGP shrinkage prior. Passed to |
start.AGS |
The iteration at which adaptation under the AGS is to begin. Defaults to |
stop.AGS |
The iteration at which adaptation under the AGS is to stop completely. Defaults to |
delta0g |
Logical indicating whether the |
... |
Catches unused arguments. |
A named list in which the names are the names of the arguments related to the MGP and AGS and the values are the values supplied to the arguments.
Certain supplied arguments will be subject to further checks by MGP_check
to ensure the cumulative shrinkage property of the MGP prior holds according to the given parameterisation.
The adaptive Gibbs sampler (AGS) monitors the prop
of loadings elements within the neighbourhood eps
of 0 and discards columns or simulates new columns on this basis. However, if at any stage the number of group-specific latent factors reaches zero, the decision to add columns is instead based on a simple binary trial with probability 1-prop
, as there are no loadings entries to monitor.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Durante, D. (2017). A note on the multiplicative gamma process, Statistics & Probability Letters, 122: 198-204.
Bhattacharya, A. and Dunson, D. B. (2011) Sparse Bayesian infinite factor models, Biometrika, 98(2): 291-306.
Schiavon, L. and Canale, A. (2020) On the truncation criteria in infinite factor models, Stat, 9:e298.
Zhang, X., Dunson, D. B., and Carin, L. (2011) Tree-structured infinite sparse factor model. In Getoor, L. and Scheffer, T. (Eds.), Proceedings of the 28th International Conference on Machine Learning (ICML 2011), June 28-July 2, 2011, Bellevue, WA, USA, pp. 785-792. Madison, WI, USA: Omnipress.
mcmc_IMIFA
, Ledermann
, MGP_check
, ltrgamma
, mixfaControl
, bnpControl
, storeControl
, get_IMIFA_results
mgpctrl <- mgpControl(phi.hyper=c(2.5, 1), eps=1e-02, truncated=TRUE) # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, MGP=mgpctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, # phi.hyper=c(2.5, 1), eps=1e-02, truncated=TRUE) # Use delta0g for a MIFA model with supplied cluster labels # sim2 <- mcmc_IMIFA(olive, n.iters=5000, method="MIFA", range.G=3, # z.list=olive$area, delta0g=TRUE, alpha.d1=4:2, alpha.d2=5:3 # sigma.hyper=matrix(c(4:6, rep(2, 3)), nrow=2, byrow=TRUE))
mgpctrl <- mgpControl(phi.hyper=c(2.5, 1), eps=1e-02, truncated=TRUE) # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, MGP=mgpctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, # phi.hyper=c(2.5, 1), eps=1e-02, truncated=TRUE) # Use delta0g for a MIFA model with supplied cluster labels # sim2 <- mcmc_IMIFA(olive, n.iters=5000, method="MIFA", range.G=3, # z.list=olive$area, delta0g=TRUE, alpha.d1=4:2, alpha.d2=5:3 # sigma.hyper=matrix(c(4:6, rep(2, 3)), nrow=2, byrow=TRUE))
Supplies a list of arguments for use in mcmc_IMIFA
pertaining to ALL methods in the IMIFA
family: e.g. MCMC settings, cluster initialisation, generic hyperparameters for factor-analytic mixtures, etc.
mixfaControl(n.iters = 25000L, burnin = n.iters/5L, thinning = 2L, centering = TRUE, scaling = c("unit", "pareto", "none"), uni.type = c("unconstrained", "isotropic", "constrained", "single"), psi.alpha = 2.5, psi.beta = NULL, mu.zero = NULL, sigma.mu = 1L, prec.mu = 0.01, sigma.l = 1L, z.init = c("hc", "kmeans", "list", "mclust", "priors"), z.list = NULL, equal.pro = FALSE, uni.prior = c("unconstrained", "isotropic"), mu0g = FALSE, psi0g = FALSE, drop0sd = TRUE, verbose = interactive(), ...)
mixfaControl(n.iters = 25000L, burnin = n.iters/5L, thinning = 2L, centering = TRUE, scaling = c("unit", "pareto", "none"), uni.type = c("unconstrained", "isotropic", "constrained", "single"), psi.alpha = 2.5, psi.beta = NULL, mu.zero = NULL, sigma.mu = 1L, prec.mu = 0.01, sigma.l = 1L, z.init = c("hc", "kmeans", "list", "mclust", "priors"), z.list = NULL, equal.pro = FALSE, uni.prior = c("unconstrained", "isotropic"), mu0g = FALSE, psi0g = FALSE, drop0sd = TRUE, verbose = interactive(), ...)
n.iters |
The number of iterations to run the sampler for. Defaults to 25000. |
burnin |
The number of burn-in iterations for the sampler. Defaults to |
thinning |
The thinning interval used in the simulation. Defaults to 2. No thinning corresponds to 1. Note that chains can also be thinned later, using |
centering |
A logical value indicating whether mean centering should be applied to the data, defaulting to |
scaling |
The scaling to be applied - one of |
uni.type |
This argument specifies the type of constraint, if any, to be placed on the uniquenesses/idiosyncratic variances, i.e. whether a general diagonal matrix or isotropic diagonal matrix is to be assumed, and in turn whether these matrices are constrained to be equal across clusters. The default Constraints may be particularly useful when
The first letter U here corresponds to constraints on loadings (not yet implemented), the second letter corresponds to uniquenesses constrained/unconstrained across clusters, and the third letter corresponds to the isotropic constraint on the uniquenesses. Of course, only the third letter is of relevance for the single-cluster |
psi.alpha |
The shape of the inverse gamma prior on the uniquenesses. Defaults to 2.5. Must be greater than 1 if |
psi.beta |
The scale of the inverse gamma prior on the uniquenesses. Can be either a single scalar parameter, a vector of variable specific scales, or (if Note that optional arguments to |
mu.zero |
The mean of the prior distribution for the mean parameter. Either a scalar, a vector of appropriate dimension, or (if |
sigma.mu |
The covariance of the prior distribution for the cluster mean parameters. Always assumed to be a diagonal matrix, and set to the identity matrix by default. Can also be a scalar by which the identity is multiplied, a vector of appropriate dimension; if supplied as a matrix, only the diagonal elements will be extracted. Specifying |
prec.mu |
A scalar controlling the degree of flatness of the prior for the cluster means by scaling |
sigma.l |
A scalar controlling the diagonal covariance of the prior distribution for the loadings. Defaults to |
z.init |
The method used to initialise the cluster labels. Defaults to model-based agglomerative hierarchical clustering via In any case, unless |
z.list |
A user supplied list of cluster labels. Only relevant if |
equal.pro |
Logical variable indicating whether or not the mixing mixing proportions are to be equal across clusters in the model (default = |
uni.prior |
A switch indicating whether uniquenesses scale hyperparameters are to be |
mu0g |
Logical indicating whether the |
psi0g |
Logical indicating whether the |
drop0sd |
Logical indicating whether to drop variables with no standard deviation (defaults to |
verbose |
Logical indicating whether to print output (e.g. run times) and a progress bar to the screen while the sampler runs. By default is |
... |
Also catches unused arguments. A number of optional arguments can be also supplied here:
|
A named list in which the names are the names of the arguments and the values are the values of the arguments.
Users should be careful to note that data are mean-centered (centering=TRUE
) and unit-scaled (scaling="unit"
) by default when supplying other parameters among the list above, especially those related in any way to psi.hyper
, or to the other control functions mgpControl
and bnpControl
.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
McNicholas, P. D. and Murphy, T. B. (2008) Parsimonious Gaussian mixture models, Statistics and Computing, 18(3): 285-296.
mcmc_IMIFA
, psi_hyper
, hc
, kmeans
, Mclust
, mgpControl
, bnpControl
, storeControl
mfctrl <- mixfaControl(n.iters=200, prec.mu=1E-03, sigma.mu=NULL, beta0=1, uni.type="constrained") # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", mixFA=mfctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=200, prec.mu=1E-03, # sigma.mu=NULL, beta0=1, uni.type="constrained") # Use mu0g and psi0g for MIFA models with supplied cluster labels # oliveScaled <- as.data.frame(scale(olive[,-(1:2)])) # sim2 <- mcmc_IMIFA(olive, "MIFA", n.iters=200, range.G=c(3, 9), # z.list=list(olive$area, olive$region), mu0g=TRUE, psi0g=TRUE, # mu.zero=list(do.call(cbind, tapply(oliveScaled, olive$area, colMeans)), # do.call(cbind, tapply(oliveScaled, olive$region, colMeans))))
mfctrl <- mixfaControl(n.iters=200, prec.mu=1E-03, sigma.mu=NULL, beta0=1, uni.type="constrained") # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", mixFA=mfctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=200, prec.mu=1E-03, # sigma.mu=NULL, beta0=1, uni.type="constrained") # Use mu0g and psi0g for MIFA models with supplied cluster labels # oliveScaled <- as.data.frame(scale(olive[,-(1:2)])) # sim2 <- mcmc_IMIFA(olive, "MIFA", n.iters=200, range.G=c(3, 9), # z.list=list(olive$area, olive$region), mu0g=TRUE, psi0g=TRUE, # mu.zero=list(do.call(cbind, tapply(oliveScaled, olive$area, colMeans)), # do.call(cbind, tapply(oliveScaled, olive$region, colMeans))))
Data on the percentage composition of eight fatty acids found by lipid fraction of 572 Italian olive oils. The data come from three areas; within each area there are a number of constituent regions, of which there are 9 in total.
data(olive)
data(olive)
A data frame with 572 observations and 10 columns. The first columns gives the area (one of Southern Italy, Sardinia, and Northern Italy), the second gives the region, and the remaining 8 columns give the variables. Southern Italy comprises the North Apulia, Calabria, South Apulia, and Sicily regions, Sardinia is divided into Inland Sardinia and Coastal Sardinia and Northern Italy comprises the Umbria, East Liguria, and West Liguria regions.
Forina, M., Armanino, C., Lanteri, S. and Tiscornia, E. (1983). Classification of olive oils from their fatty acid composition, In Martens, H. and H. Russrum Jr. (Eds.), Food Research and Data Analysis: Proceedings from the IUFoST Symposium, September 20-23, 1982, Oslo, Norway, pp. 189-214. London, UK: Applied Science Publishers.
Forina, M. and Tiscornia, E. (1982). Pattern recognition methods in the prediction of Italian olive oil origin by their fatty acid content, Annali di Chimica, 72: 143-155.
data(olive, package="IMIFA") pairs(olive[,-(1:2)], col=olive$area) region <- as.numeric(olive$region) pairs(olive[,-(1:2)], col=ifelse(region < 5, 1, ifelse(region < 7, 2, ifelse(region == 9, 4, 3))))
data(olive, package="IMIFA") pairs(olive[,-(1:2)], col=olive$area) region <- as.numeric(olive$region) pairs(olive[,-(1:2)], col=ifelse(region < 5, 1, ifelse(region < 7, 2, ifelse(region == 9, 4, 3))))
Pareto scaling of a numeric matrix, with or without centering. Observations are scaled by the square-root of their column-wise standard deviations.
pareto_scale(x, centering = TRUE)
pareto_scale(x, centering = TRUE)
x |
A numeric matrix-like object. |
centering |
A logical vector indicating whether centering is to be applied (default= |
The Pareto scaled version of the matrix x
.
Keefe Murphy - <[email protected]>
van den Berg, R. A., Hoefsloot, H. C. J, Westerhuis, J. A., Smilde, A. K., and van der Werf, M.J. (2006) Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genomics, 7(142).
dat <- pareto_scale(olive[,-(1:2)])
dat <- pareto_scale(olive[,-(1:2)])
Estimates the dimension of the 'free' parameters in fully finite factor analytic mixture models, otherwise known as Parsimonious Gaussian Mixture Models (PGMM), typically necessary for the penalty term of various model selection criteria.
PGMM_dfree(Q, P, G = 1L, method = c("UUU", "UUC", "UCU", "UCC", "CUU", "CUC", "CCU", "CCC", "CCUU", "UCUU", "CUCU", "UUCU"), equal.pro = FALSE)
PGMM_dfree(Q, P, G = 1L, method = c("UUU", "UUC", "UCU", "UCC", "CUU", "CUC", "CCU", "CCC", "CCUU", "UCUU", "CUCU", "UUCU"), equal.pro = FALSE)
Q |
The number of latent factors (which can be 0, corresponding to a model with diagonal covariance). This argument is vectorised. |
P |
The number of variables. Must be a single strictly positive integer. |
G |
The number of clusters. This defaults to 1. Must be a single strictly positive integer. |
method |
By default, calculation assumes the |
equal.pro |
Logical variable indicating whether or not the mixing mixing proportions are equal across clusters in the model (default = |
A vector of length length(Q)
giving the total number of parameters, including means and mixing proportions, and not only covariance parameters. Set equal.pro
to FALSE
and subtract G * P
from the result to determine the number of covariance parameters only.
This function is used to calculate the penalty terms for the aic.mcmc
and bic.mcmc
model selection criteria implemented in get_IMIFA_results
for finite factor models (though mcmc_IMIFA
currently only implements the UUU
, UUC
, UCU
, and UCC
covariance structures). The function is vectorised with respect to the argument Q
.
Though the function is available for standalone use, note that no checks take place, in order to speed up repeated calls to the function inside mcmc_IMIFA
.
Keefe Murphy - <[email protected]>
McNicholas, P. D. and Murphy, T. B. (2008) Parsimonious Gaussian mixture models, Statistics and Computing, 18(3): 285-296.
McNicholas, P. D. and Murphy, T. B. (2010) Model-Based clustering of microarray expression data via latent Gaussian mixture models, Bioinformatics, 26(21): 2705-2712.
(UUU <- PGMM_dfree(Q=0:5, P=50, G=3, method="UUU")) (CCC <- PGMM_dfree(Q=0:5, P=50, G=3, method="CCC", equal.pro=TRUE))
(UUU <- PGMM_dfree(Q=0:5, P=50, G=3, method="UUU")) (CCC <- PGMM_dfree(Q=0:5, P=50, G=3, method="CCC", equal.pro=TRUE))
Plots a matrix of colours as a heat map type image or as points. Intended for joint use with mat2cols
.
plot_cols(cmat, na.col = "#808080FF", ptype = c("image", "points"), border.col = "#808080FF", dlabels = NULL, rlabels = FALSE, clabels = FALSE, pch = 15, cex = 3, label.cex = 0.6, ...)
plot_cols(cmat, na.col = "#808080FF", ptype = c("image", "points"), border.col = "#808080FF", dlabels = NULL, rlabels = FALSE, clabels = FALSE, pch = 15, cex = 3, label.cex = 0.6, ...)
cmat |
A matrix of valid colours, with missing values coded as |
na.col |
Colour used for missing |
ptype |
Switch controlling output as either a heat map |
border.col |
Colour of border drawn around the plot. |
dlabels , rlabels , clabels
|
Vector of labels for the diagonals, rows, and columns, respectively. |
pch |
Point type used when |
cex |
Point cex used when |
label.cex |
Govens cex parameter used for labels. |
... |
Further graphical parameters. |
Either an "image"
or "points"
type plot of the supplied colours.
mat2cols
, image
, heat_legend
, is.cols
# Generate a colour matrix using mat2cols() mat <- matrix(rnorm(100), nrow=10, ncol=10) mat[2,3] <- NA cols <- heat.colors(12)[12:1] (matcol <- mat2cols(mat, cols=cols)) # Use plot_cols() to visualise the colours matrix par(mar=c(5.1, 4.1, 4.1, 3.1)) plot_cols(matcol) # Add a legend using heat_legend() heat_legend(mat, cols=cols); box(lwd=2) # Replace colour of exact zero entries: # Often important to call mat2cols() first (to include 0 in the cuts), # then replace relevant entries with NA for plot_cols(), i.e. mat[2,3] <- 0 matcol2 <- mat2cols(mat, cols=cols) plot_cols(replace(matcol2, mat == 0, NA), na.col="blue") heat_legend(mat, cols=cols); box(lwd=2)
# Generate a colour matrix using mat2cols() mat <- matrix(rnorm(100), nrow=10, ncol=10) mat[2,3] <- NA cols <- heat.colors(12)[12:1] (matcol <- mat2cols(mat, cols=cols)) # Use plot_cols() to visualise the colours matrix par(mar=c(5.1, 4.1, 4.1, 3.1)) plot_cols(matcol) # Add a legend using heat_legend() heat_legend(mat, cols=cols); box(lwd=2) # Replace colour of exact zero entries: # Often important to call mat2cols() first (to include 0 in the cuts), # then replace relevant entries with NA for plot_cols(), i.e. mat[2,3] <- 0 matcol2 <- mat2cols(mat, cols=cols) plot_cols(replace(matcol2, mat == 0, NA), na.col="blue") heat_legend(mat, cols=cols); box(lwd=2)
Plotting output and parameters of inferential interest for IMIFA and related models
## S3 method for class 'Results_IMIFA' plot(x, plot.meth = c("all", "correlation", "density", "errors", "GQ", "means", "parallel.coords", "trace", "zlabels"), param = c("means", "scores", "loadings", "uniquenesses", "pis", "alpha", "discount"), g = NULL, mat = TRUE, zlabels = NULL, heat.map = TRUE, show.last = FALSE, palette = NULL, ind = NULL, fac = NULL, by.fac = FALSE, type = c("h", "n", "p", "l"), intervals = TRUE, common = TRUE, partial = FALSE, titles = TRUE, transparency = 0.75, ...)
## S3 method for class 'Results_IMIFA' plot(x, plot.meth = c("all", "correlation", "density", "errors", "GQ", "means", "parallel.coords", "trace", "zlabels"), param = c("means", "scores", "loadings", "uniquenesses", "pis", "alpha", "discount"), g = NULL, mat = TRUE, zlabels = NULL, heat.map = TRUE, show.last = FALSE, palette = NULL, ind = NULL, fac = NULL, by.fac = FALSE, type = c("h", "n", "p", "l"), intervals = TRUE, common = TRUE, partial = FALSE, titles = TRUE, transparency = 0.75, ...)
x |
An object of class |
plot.meth |
The type of plot to be produced for the Special types of plots which don't require a
The argument |
param |
The parameter of interest for any of the following |
g |
Optional argument that allows specification of exactly which cluster the plot of interest is to be produced for. If not supplied, the user will be prompted to cycle through plots for all clusters. Also functions as an index for which plot to return when |
mat |
Logical indicating whether a |
zlabels |
The true labels can be supplied if they are known. If this is not supplied, the function uses the labels that were supplied, if any, to |
heat.map |
A logical which controls plotting posterior mean loadings or posterior mean scores as a heatmap, or else as something akin to |
show.last |
A logical indicator which defaults to |
palette |
An optional colour palette to be supplied if overwriting the default palette set inside the function by |
ind |
Either a single number indicating which variable to plot when |
fac |
Optional argument that provides an alternative way to specify |
by.fac |
Optionally allows (mat)plotting of scores and loadings by factor - i.e. observation(s) (scores) or variable(s) (loadings) for a given factor, respectively, controlled by |
type |
The manner in which the plot is to be drawn, as per the |
intervals |
Logical indicating whether credible intervals around the posterior mean(s) are to be plotted when |
common |
Logical indicating whether plots with Note that this affects the |
partial |
Logical indicating whether plots of type |
titles |
Logical indicating whether default plot titles are to be used ( |
transparency |
A factor in [0, 1] modifying the opacity for overplotted lines. Defaults to 0.75, unless semi-transparency is not supported. Only relevant when |
... |
Other arguments typically passed to |
The desired plot with appropriate output and summary statistics printed to the console screen.
Supplying the argument zlabels
does not have the same effect of reordering the sampled parameters as it does if supplied directly to get_IMIFA_results
.
When mat
is TRUE
and by.fac
is FALSE
(both defaults), the convention for dealing with overplotting for trace
and density
plots when param
is either scores
or loadings
is to plot the last factor first, such that the first factor appears 'on top'.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
mcmc_IMIFA
, get_IMIFA_results
, mat2cols
, plot_cols
# See the vignette associated with the package for more graphical examples: # vignette("IMIFA", package = "IMIFA") # data(olive) # simIMIFA <- mcmc_IMIFA(olive, method="IMIFA") # resIMIFA <- get_IMIFA_results(simIMIFA, z.avgsim=TRUE) # Examine the posterior distribution(s) of the number(s) of clusters (G) &/or latent factors (Q) # For the IM(I)FA and OM(I)FA methods, this also plots the trace of the active/non-empty clusters # plot(resIMIFA, plot.meth="GQ") # plot(resIMIFA, plot.meth="GQ", g=2) # Plot clustering uncertainty (and, if available, the similarity matrix) # plot(resIMIFA, plot.meth="zlabels", zlabels=olive$area) # Visualise the posterior predictive reconstruction error # plot(resIMIFA, plot.meth="errors", g=1) # Compare histograms of the data vs. replicate draw from the posterior for the 1st variable # plot(resIMIFA, plot.meth="errors", g=2, ind=1) # Visualise empirical vs. estimated covariance error metrics # plot(resIMIFA, plot.meth="errors", g=3) # Look at the trace, density, posterior mean, and correlation of various parameters of interest # plot(resIMIFA, plot.meth="all", param="means", g=1) # plot(resIMIFA, plot.meth="all", param="means", g=1, ind=2) # plot(resIMIFA, plot.meth="trace", param="scores") # plot(resIMIFA, plot.meth="trace", param="scores", by.fac=TRUE) # plot(resIMIFA, plot.meth="mean", param="loadings", g=1) # plot(resIMIFA, plot.meth="mean", param="loadings", g=1, heat.map=FALSE) # plot(resIMIFA, plot.meth="parallel.coords", param="uniquenesses") # plot(resIMIFA, plot.meth="density", param="pis", intervals=FALSE, partial=TRUE) # plot(resIMIFA, plot.meth="all", param="alpha") # plot(resIMIFA, plot.meth="all", param="discount")
# See the vignette associated with the package for more graphical examples: # vignette("IMIFA", package = "IMIFA") # data(olive) # simIMIFA <- mcmc_IMIFA(olive, method="IMIFA") # resIMIFA <- get_IMIFA_results(simIMIFA, z.avgsim=TRUE) # Examine the posterior distribution(s) of the number(s) of clusters (G) &/or latent factors (Q) # For the IM(I)FA and OM(I)FA methods, this also plots the trace of the active/non-empty clusters # plot(resIMIFA, plot.meth="GQ") # plot(resIMIFA, plot.meth="GQ", g=2) # Plot clustering uncertainty (and, if available, the similarity matrix) # plot(resIMIFA, plot.meth="zlabels", zlabels=olive$area) # Visualise the posterior predictive reconstruction error # plot(resIMIFA, plot.meth="errors", g=1) # Compare histograms of the data vs. replicate draw from the posterior for the 1st variable # plot(resIMIFA, plot.meth="errors", g=2, ind=1) # Visualise empirical vs. estimated covariance error metrics # plot(resIMIFA, plot.meth="errors", g=3) # Look at the trace, density, posterior mean, and correlation of various parameters of interest # plot(resIMIFA, plot.meth="all", param="means", g=1) # plot(resIMIFA, plot.meth="all", param="means", g=1, ind=2) # plot(resIMIFA, plot.meth="trace", param="scores") # plot(resIMIFA, plot.meth="trace", param="scores", by.fac=TRUE) # plot(resIMIFA, plot.meth="mean", param="loadings", g=1) # plot(resIMIFA, plot.meth="mean", param="loadings", g=1, heat.map=FALSE) # plot(resIMIFA, plot.meth="parallel.coords", param="uniquenesses") # plot(resIMIFA, plot.meth="density", param="pis", intervals=FALSE, partial=TRUE) # plot(resIMIFA, plot.meth="all", param="alpha") # plot(resIMIFA, plot.meth="all", param="discount")
For a (N * G
) matrix of posterior cluster membership probabilities, this function creates a (G * G
) posterior confusion matrix, whose hk-th entry gives the average probability that observations with maximum posterior allocation h will be assigned to cluster k.
post_conf_mat(z, scale = TRUE)
post_conf_mat(z, scale = TRUE)
z |
A ( Otherwise, a list of such matrices can be supplied, where each matrix in the list has the same dimensions. |
scale |
A logical indicator whether the PCM should be rescaled by its row sums. When |
A (G * G
) posterior confusion matrix, whose hk-th entry gives the average probability that observations with maximum posterior allocation h will be assigned to cluster k. When scale=TRUE
, the benchmark matrix for comparison is the identity matrix of order G
, corresponding to a situation with no uncertainty in the clustering.
Keefe Murphy - <[email protected]>
Ranciati, S., Vinciotti, V. and Wit, E., (2017) Identifying overlapping terrorist cells from the Noordin Top actor-event network, Annals of Applied Statistics, 14(3): 1516-1534.
get_IMIFA_results
# data(olive) # sim <- mcmc_IMIFA(olive, n.iters=1000) # res <- get_IMIFA_results(sim) # (PCM <- post_conf_mat(res$Clust$post.prob)) # par(mar=c(5.1, 4.1, 4.1, 3.1)) # PCM <- replace(PCM, PCM == 0, NA) # plot_cols(mat2cols(PCM, col=heat.colors(30, rev=TRUE), na.col=par()$bg)); box(lwd=2) # heat_legend(PCM, cols=heat.colors(30, rev=TRUE)) # par(mar=c(5.1, 4.1, 4.1, 2.1))
# data(olive) # sim <- mcmc_IMIFA(olive, n.iters=1000) # res <- get_IMIFA_results(sim) # (PCM <- post_conf_mat(res$Clust$post.prob)) # par(mar=c(5.1, 4.1, 4.1, 3.1)) # PCM <- replace(PCM, PCM == 0, NA) # plot_cols(mat2cols(PCM, col=heat.colors(30, rev=TRUE), na.col=par()$bg)); box(lwd=2) # heat_legend(PCM, cols=heat.colors(30, rev=TRUE)) # par(mar=c(5.1, 4.1, 4.1, 2.1))
This function performs a Procrustes transformation on a matrix X
to minimize the squared distance between X
and another comparable matrix Xstar
.
Procrustes(X, Xstar, translate = FALSE, dilate = FALSE, sumsq = FALSE)
Procrustes(X, Xstar, translate = FALSE, dilate = FALSE, sumsq = FALSE)
X |
The matrix to be transformed. |
Xstar |
The target matrix. |
translate |
Logical value indicating whether |
dilate |
Logical value indicating whether |
sumsq |
Logical value indicating whether the sum of squared differences between |
R
, tt
, and d
are chosen so that:
X.new
is given by:
A list containing:
X.new |
The matrix that is the Procrustes transformed version of |
R |
The rotation matrix. |
t |
The translation vector (if |
d |
The scaling factor (is |
ss |
The sum of squared differences (if |
X
is padded out with columns containing 0
if it has fewer columns than Xstar
.
Borg, I. and Groenen, P. J. F. (1997) Modern Multidimensional Scaling: Theory and Applications. Springer Series in Statistics. New York, NY, USA: Springer-Verlag, pp. 340-342.
# Match two matrices, allowing translation and dilation mat1 <- diag(rnorm(10)) mat2 <- 0.05 * matrix(rnorm(100), 10, 10) + mat1 proc <- Procrustes(X=mat1, Xstar=mat2, translate=TRUE, dilate=TRUE, sumsq=TRUE) # Extract the transformed matrix, rotation matrix, translation vector and scaling factor mat_new <- proc$X.new mat_rot <- proc$R mat_t <- proc$t mat_d <- proc$d # Compare the sum of squared differences to a Procrustean transformation with rotation only mat_ss <- proc$ss mat_ss2 <- Procrustes(X=mat1, Xstar=mat2, sumsq=TRUE)$ss
# Match two matrices, allowing translation and dilation mat1 <- diag(rnorm(10)) mat2 <- 0.05 * matrix(rnorm(100), 10, 10) + mat1 proc <- Procrustes(X=mat1, Xstar=mat2, translate=TRUE, dilate=TRUE, sumsq=TRUE) # Extract the transformed matrix, rotation matrix, translation vector and scaling factor mat_new <- proc$X.new mat_rot <- proc$R mat_t <- proc$t mat_d <- proc$d # Compare the sum of squared differences to a Procrustean transformation with rotation only mat_ss <- proc$ss mat_ss2 <- Procrustes(X=mat1, Xstar=mat2, sumsq=TRUE)$ss
Takes an inverse-Gamma shape hyperparameter, and an inverse covariance matrix (or estimate thereof), and finds data-driven scale hyperparameters in such a way that Heywood problems are avoided for factor analysis or probabilistic principal components analysis (and mixtures thereof).
psi_hyper(shape, dat, type = c("unconstrained", "isotropic"), beta0 = 3, ...)
psi_hyper(shape, dat, type = c("unconstrained", "isotropic"), beta0 = 3, ...)
shape |
A positive shape hyperparameter. |
dat |
The data matrix for which the inverse covariance matrix is to be estimated. If data are to be centered &/or scaled within |
type |
A switch indicating whether a single scale ( |
beta0 |
See Note below. Must be a strictly positive numeric scalar. Defaults to |
... |
Catches unused arguments. Advanced users can also supply the sample covariance matrix of |
Constraining uniquenesses to be isotropic provides the link between factor analysis and the probabilistic PCA model. When used in conjunction with mcmc_IMIFA
with "isotropic" or "single" uniquenesses, type
must be isotropic
, but for "unconstrained" or "constrained" uniquenesses, it's possible to specify either a single scale (type="isotropic"
) or variable-specific scales (type="unconstrained"
).
Used internally by mcmc_IMIFA
when its argument psi_beta
is not supplied.
Either a single scale hyperparameter or ncol(dat)
variable-specific scale hyperparameters.
When N > P
, where N
is the number of observations and P
is the number of variables, the inverse of the sample covariance matrix is used by default.
When N <= P
, the inverse either does not exist or the estimate thereof is highly unstable. Thus, an estimate of the form is used instead.
For unstandardised data, the estimate is instead constructed using a standardised version of the data, and the resulting inverse correlation matrix estimate is scaled appropriately by the diagonal entries of the sample covariance matrix of the original data.
This estimate can also be used in N > P
cases by explicitly supplying beta0
. It will also be used if inverting the sample covariance matrix fails in N > P
cases.
The optional argument beta0
can be supplied to mcmc_IMIFA
via the control function mixfaControl
.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Fruwirth-Schnatter, S. and Lopes, H. F. (2010). Parsimonious Bayesian factor analysis when the number of factors is unknown, Technical Report. The University of Chicago Booth School of Business.
Fruwirth-Schnatter, S. and Lopes, H. F. (2018). Sparse Bayesian factor analysis when the number of factors is unknown, to appear. <arXiv:1804.04231>.
Tipping, M. E. and Bishop, C. M. (1999). Probabilistic principal component analysis, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3): 611-622.
data(olive) olive2 <- olive[,-(1:2)] shape <- 2.5 (scale1 <- psi_hyper(shape=shape, dat=olive2)) # Try again with scaled data olive_S <- scale(olive2, center=TRUE, scale=TRUE) # Use the inverse of the sample covariance matrix (scale2 <- psi_hyper(shape=shape, dat=olive_S)) # Use the estimated inverse covariance matrix (scale3 <- psi_hyper(shape=shape, dat=olive_S, beta0=3)) # In the normalised example, the mean uniquenesses (given by scale/(shape - 1)), # can be interpreted as the prior proportion of the variance that is idiosyncratic (prop1 <- scale1/(shape - 1)) (prop2 <- scale2/(shape - 1))
data(olive) olive2 <- olive[,-(1:2)] shape <- 2.5 (scale1 <- psi_hyper(shape=shape, dat=olive2)) # Try again with scaled data olive_S <- scale(olive2, center=TRUE, scale=TRUE) # Use the inverse of the sample covariance matrix (scale2 <- psi_hyper(shape=shape, dat=olive_S)) # Use the estimated inverse covariance matrix (scale3 <- psi_hyper(shape=shape, dat=olive_S, beta0=3)) # In the normalised example, the mean uniquenesses (given by scale/(shape - 1)), # can be interpreted as the prior proportion of the variance that is idiosyncratic (prop1 <- scale1/(shape - 1)) (prop2 <- scale2/(shape - 1))
Generates samples from the Dirichlet distribution with parameter alpha
efficiently by simulating Gamma(alpha
, 1) random variables and normalising them.
rDirichlet(G, alpha, nn = 0L)
rDirichlet(G, alpha, nn = 0L)
G |
The number of clusters for which weights need to be sampled. |
alpha |
The Dirichlet hyperparameter, either of length 1 or |
nn |
A vector giving the number of observations in each of G clusters so that Dirichlet posteriors rather than priors can be sampled from. This defaults to 0, i.e. simulation from the prior. Must be non-negative. Be warned that this will be recycled if necessary. |
A Dirichlet vector of G
weights which sum to 1.
Though the function is available for standalone use, note that few checks take place, in order to speed up repeated calls to the function inside mcmc_IMIFA
. In particular, alpha
and nn
may be invisibly recycled.
While small values of alpha
have the effect of increasingly concentrating the mass onto fewer components, note that this function may return NaN
for excessively small values of alpha
, when nn=0
; see the details of rgamma
for small shape
values.
Devroye, L. (1986) Non-Uniform Random Variate Generation, New York, NY, USA: Springer-Verlag, p. 594.
(prior <- rDirichlet(G=5, alpha=1)) (posterior <- rDirichlet(G=5, alpha=1, nn=c(20, 41, 32, 8, 12))) (asymmetric <- rDirichlet(G=5, alpha=c(3,4,5,1,2), nn=c(20, 41, 32, 8, 12)))
(prior <- rDirichlet(G=5, alpha=1)) (posterior <- rDirichlet(G=5, alpha=1, nn=c(20, 41, 32, 8, 12))) (asymmetric <- rDirichlet(G=5, alpha=c(3,4,5,1,2), nn=c(20, 41, 32, 8, 12)))
Takes posterior summaries of the overall factor scores matrix and returns lists of sub-matrices corresponding to the G
-cluster MAP partition.
scores_MAP(res, dropQ = FALSE)
scores_MAP(res, dropQ = FALSE)
res |
An object of class |
dropQ |
A logical indicating whether columns of the factor scores matrix should be dropped such that the number of columns in each sub-matrix corresponds to the cluster-specific number of factors (if the number of factors is indeed cluster-specific). When Note that this argument is irrelevant (i.e. always |
Under the models in the IMIFA family, there exists only one factor scores matrix. For the finite factor methods, this has dimensions N * Q
.
For the infinite factor methods ("IFA"
, "MIFA"
, "OMIFA"
, and "IMIFA"
), the factor scores matrix has dimensions N * Qmax
, where Qmax
is the largest of the cluster-specific numbers of latent factors . Entries of this matrix thus may have been padded out with zero entries, as appropriate, prior to the Procrustes rotation-based correction applied within
get_IMIFA_results
(thus now these entries will be near-zero).
In partitioning rows of the factor scores matrix into the same clusters the corresponding observations themselves belong to according to the MAP clustering, the number of columns may vary according to the cluster-specific numbers of latent factors (depending on the value of dropQ
and the method employed).
For models which achieve clustering, a list of lists (say x
) decomposing the posterior mean scores (x$post.eta
), the associated variance estimates (x$var.eta
) and credible intervals (x$ci.eta
), and the last valid sample of the scores (x$last.eta
) into lists of length G
, corresponding to the MAP clustering, with varying or common numbers of cluster-specific factors (depending on the value of dropQ
and the method employed).
For models with only one component, or the "FA"
and "IFA"
methods, scores cannot be decomposed, and posterior summaries of the scores will be returned unchanged.
Keefe Murphy - <[email protected]>
data(coffee) sim <- mcmc_IMIFA(coffee, n.iters=1000) res <- get_IMIFA_results(sim) # Examine the single posterior mean scores matrix res$Scores$post.eta # Decompose into G matrices, common numbers of columns eta <- scores_MAP(res) eta$post.eta # Allow the number of columns be cluster-specific scores_MAP(res, dropQ=TRUE)$post.eta
data(coffee) sim <- mcmc_IMIFA(coffee, n.iters=1000) res <- get_IMIFA_results(sim) # Examine the single posterior mean scores matrix res$Scores$post.eta # Decompose into G matrices, common numbers of columns eta <- scores_MAP(res) eta$post.eta # Allow the number of columns be cluster-specific scores_MAP(res, dropQ=TRUE)$post.eta
This function takes shape and rate parameters of a Gamma distribution and modifies them to achieve the same expected value and variance when the left extent of the support of the distribution is shifted up or down.
shift_GA(shape, rate, shift = 0, param = c("rate", "scale"))
shift_GA(shape, rate, shift = 0, param = c("rate", "scale"))
shape , rate
|
Shape and rate parameters a and b, respectively, of a Gamma(a, b) distribution. Both must be strictly positive. |
shift |
Modifier, such that the Gamma distribution has support on ( |
param |
Switch controlling whether the supplied |
A named vector of length 2, containing the modified shape and rate parameters, respectively.
This function is invoked within mcmc_IMIFA
when discount
is fixed at a non-zero value and learn.alpha=TRUE
.
Keefe Murphy - <[email protected]>
# Shift a Ga(shape=4, rate=2) distribution to the left by 1; # achieving the same expected value of 2 and variance of 1. shift_GA(4, 2, -1)
# Shift a Ga(shape=4, rate=2) distribution to the left by 1; # achieving the same expected value of 2 and variance of 1. shift_GA(4, 2, -1)
Plots an image of a grayscale grid representation of a digit.
show_digit(dat, col = NULL, ...)
show_digit(dat, col = NULL, ...)
dat |
A |
col |
The colour scale to be used. Defaults to |
... |
Additional arguments to be passed to |
The desired image representation of the digit.
Keefe Murphy - <[email protected]>
USPSdigits
, show_IMIFA_digit
, mat2cols
, plot_cols
data(USPSdigits) # Plot the first digit show_digit(USPSdigits$train[1,-1]) # Visualise the overall mean show_digit(colMeans(USPSdigits$train[,-1]))
data(USPSdigits) # Plot the first digit show_digit(USPSdigits$train[1,-1]) # Visualise the overall mean show_digit(colMeans(USPSdigits$train[,-1]))
Plots the posterior mean of a given cluster from an "IMIFA"
-related model fit to a digit data set in the form of a square grayscale grid.
show_IMIFA_digit(res, G = 1, what = c("mean", "last"), dat = NULL, ind = NULL, ...)
show_IMIFA_digit(res, G = 1, what = c("mean", "last"), dat = NULL, ind = NULL, ...)
res |
An object of class |
G |
The index of the cluster for which the posterior mean digit is to be represented. |
what |
A switch controlling whether the |
dat |
The full grayscale grid data set (prior to centering and scaling). Necessary when |
ind |
The index of columns of |
... |
Additional arguments to be passed, via |
This function is a wrapper to show_digit
which supplies the posterior mean digit of a given cluster from a "IMIFA"
model.
The desired image representation of the posterior mean digit (or the last valid sample) from the desired cluster.
Note that both centering and scaling of the original data prior to modelling is accounted for in reconstructing the means, but dat
, if necessary, must be the raw data prior to pre-processing.
Keefe Murphy - <[email protected]>
USPSdigits
, show_digit
, get_IMIFA_results
, mcmc_IMIFA
, mat2cols
, plot_cols
# Load the USPS data and discard peripheral digits data(USPSdigits) ylab <- USPSdigits$train[,1] train <- USPSdigits$train[,-1] ind <- apply(train, 2, sd) > 0.7 dat <- train[,ind] # Fit an IMIFA model (warning: quite slow!) # sim <- mcmc_IMIFA(dat, n.iters=1000, prec.mu=1e-03, z.init="kmeans", # centering=FALSE, scaling="none") # res <- get_IMIFA_results(sim, zlabels=ylab) # Examine the posterior mean image of the first two clusters # show_IMIFA_digit(res, dat=train, ind=ind) # show_IMIFA_digit(res, dat=train, ind=ind, G=2)
# Load the USPS data and discard peripheral digits data(USPSdigits) ylab <- USPSdigits$train[,1] train <- USPSdigits$train[,-1] ind <- apply(train, 2, sd) > 0.7 dat <- train[,ind] # Fit an IMIFA model (warning: quite slow!) # sim <- mcmc_IMIFA(dat, n.iters=1000, prec.mu=1e-03, z.init="kmeans", # centering=FALSE, scaling="none") # res <- get_IMIFA_results(sim, zlabels=ylab) # Examine the posterior mean image of the first two clusters # show_IMIFA_digit(res, dat=train, ind=ind) # show_IMIFA_digit(res, dat=train, ind=ind, G=2)
Functions to simulate data of any size and dimension from a (infinite) mixture of (infinite) factor analysers parameterisation or fitted object.
sim_IMIFA_data(N = 300L, G = 3L, P = 50L, Q = rep(floor(log(P)), G), pis = rep(1/G, G), mu = NULL, psi = NULL, loadings = NULL, scores = NULL, nn = NULL, loc.diff = 2, non.zero = P, forceQg = TRUE, method = c("conditional", "marginal")) sim_IMIFA_model(res, method = c("conditional", "marginal"))
sim_IMIFA_data(N = 300L, G = 3L, P = 50L, Q = rep(floor(log(P)), G), pis = rep(1/G, G), mu = NULL, psi = NULL, loadings = NULL, scores = NULL, nn = NULL, loc.diff = 2, non.zero = P, forceQg = TRUE, method = c("conditional", "marginal")) sim_IMIFA_model(res, method = c("conditional", "marginal"))
N , G , P
|
Desired overall number of observations, number of clusters, and number of variables in the simulated data set. All must be a single integer. |
Q |
Desired number of cluster-specific latent factors in the simulated data set. Can be specified either as a single integer if all clusters are to have the same number of factors, or a vector of length |
pis |
Mixing proportions of the clusters in the data set if |
mu |
True values of the mean parameters, either as a single value, a vector of length |
psi |
True values of uniqueness parameters, either as a single value, a vector of length |
loadings |
True values of the loadings matrix/matrices. Must be supplied in the form of a list of numeric matrices when |
scores |
True values of the latent factor scores, as a |
nn |
An alternative way to specify the size of each cluster, by giving the exact number of observations in each cluster explicitly. Must sum to |
loc.diff |
A parameter to control the closeness of the clusters in terms of the difference in their location vectors. Only relevant if More specifically,
|
non.zero |
Controls the number of non-zero entries in each loadings column (per cluster) only when Must be given as a list of length |
forceQg |
A logical indicating whether the upper limit on the number of cluster-specific factors |
method |
A switch indicating whether the mixture to be simulated from is the conditional distribution of the data given the latent variables (default), or simply the marginal distribution of the data. |
res |
An object of class |
sim_IMIFA_model
is a simple wrapper to sim_IMIFA_data
which uses the estimated parameters of a fitted IMIFA related model, as generated by get_IMIFA_results
. The necessary parameters must have been originally stored via storeControl
in the creation of res
.
Invisibly returns a data.frame
with N
observations (rows) of P
variables (columns). The true values of the parameters which generated these data are also stored as attributes.
N
, G
, P
& Q
will NOT be inferred from the supplied parameters pis
, mu
, psi
, loadings
, scores
& nn
- rather, the parameters' length/dimensions must adhere to the supplied values of N
, G
, P
& Q
.
Missing values are not allowed in any of pis
, mu
, psi
, loadings
, scores
& nn
.
Keefe Murphy - <[email protected]>
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
mcmc_IMIFA
for fitting an IMIFA related model to the simulated data set.
get_IMIFA_results
for generating input for sim_IMIFA_model
.
Ledermann
for details on the upper-bound for Q
. Note that this function accounts for isotropic uniquenesses, if psi
is supplied in that manner, in computing this bound.
# Simulate 100 observations from 3 balanced clusters with cluster-specific numbers of latent factors # Specify isotropic uniquenesses within each cluster # Supply cluster means directly sim_data <- sim_IMIFA_data(N=100, G=3, P=20, Q=c(2, 2, 5), psi=1:3, mu=matrix(rnorm(60, -2 + 1:3, 1), nrow=20, ncol=3, byrow=TRUE)) names(attributes(sim_data)) labels <- attr(sim_data, "Labels") # Visualise the data in two-dimensions plot(cmdscale(dist(sim_data), k=2), col=labels) # Examine the overlap with a pairs plot of 5 randomly chosen variables pairs(sim_data[,sample(1:20, 5)], col=labels) # Fit a MIFA model to this data # tmp <- mcmc_IMIFA(sim_data, method="MIFA", range.G=3, n.iters=5000) # Simulate from this model # res <- get_IMIFA_results(tmp, zlabels=labels) # sim_mod <- sim_IMIFA_model(res)
# Simulate 100 observations from 3 balanced clusters with cluster-specific numbers of latent factors # Specify isotropic uniquenesses within each cluster # Supply cluster means directly sim_data <- sim_IMIFA_data(N=100, G=3, P=20, Q=c(2, 2, 5), psi=1:3, mu=matrix(rnorm(60, -2 + 1:3, 1), nrow=20, ncol=3, byrow=TRUE)) names(attributes(sim_data)) labels <- attr(sim_data, "Labels") # Visualise the data in two-dimensions plot(cmdscale(dist(sim_data), k=2), col=labels) # Examine the overlap with a pairs plot of 5 randomly chosen variables pairs(sim_data[,sample(1:20, 5)], col=labels) # Fit a MIFA model to this data # tmp <- mcmc_IMIFA(sim_data, method="MIFA", range.G=3, n.iters=5000) # Simulate from this model # res <- get_IMIFA_results(tmp, zlabels=labels) # sim_mod <- sim_IMIFA_model(res)
Supplies a list of values for logical switches indicating whether parameters of interest (means, scores, loadings, uniquenesses, and mixing proportions) should be stored when running models from the IMIFA family via mcmc_IMIFA
. It may be useful not to store certain parameters if memory is an issue.
storeControl(mu.switch = TRUE, score.switch = TRUE, load.switch = TRUE, psi.switch = TRUE, pi.switch = TRUE, update.mu = mu.switch, ...)
storeControl(mu.switch = TRUE, score.switch = TRUE, load.switch = TRUE, psi.switch = TRUE, pi.switch = TRUE, update.mu = mu.switch, ...)
mu.switch |
Logical indicating whether the means are to be stored (defaults to |
score.switch |
Logical indicating whether the factor scores are to be stored. As the array containing each sampled scores matrix tends to be amongst the largest objects to be stored, this defaults to Unlike other parameters, the scores need not be stored for posterior predictive checking (see Note below). |
load.switch |
Logical indicating whether the factor loadings are to be stored (defaults to |
psi.switch |
Logical indicating whether the uniquenesses are to be stored (defaults to |
pi.switch |
Logical indicating whether the mixing proportions are to be stored (defaults to |
update.mu |
Logical indicating whether the means should be updated. Only relevant for the |
... |
Catches unused arguments. |
storeControl
is provided for assigning values for IMIFA models within mcmc_IMIFA
. It may be useful not to store certain parameters if memory is an issue (e.g. for large data sets or for a large number of MCMC iterations after burnin and thinning).
A named vector in which the names are the names of the storage switches and the values are logicals indicating whether that parameter is to be stored. The list also contains as an attribute a logical for each switch indicating whether it was actually supplied (TRUE
) or the default was accepted (FALSE
).
Posterior inference and plotting won't be possible for parameters not stored.
Non-storage of parameters will almost surely prohibit the computation of posterior predictive checking error metrics within get_IMIFA_results
also. In particular, if such error metrics are desired, psi.switch
must be TRUE
for all but the "FA"
and "IFA"
models, mu.switch
must be TRUE
except in situations where update.mu=FALSE
is allowed, load.switch
must be TRUE
for all but the entirely zero-factor models, and pi.switch
must be TRUE
for models with clustering structure and unequal mixing proportions for all but the PPRE metric. score.switch=TRUE
is not required for any posterior predictive checking.
If post-hoc adaptation is invoked in get_IMIFA_results
on a model fitted without adaptation, store.switch=TRUE
is not required unless active.crit="SC"
(for "IFA"
models only); load.switch=TRUE
is required for both active.crit="BD"
and active.crit="SC"
.
Finally, if loadings are not stored but scores are, caution is advised when examining posterior scores as Procrustes rotation will not occur within get_IMIFA_results
.
Keefe Murphy - <[email protected]>
mcmc_IMIFA
, get_IMIFA_results
, mixfaControl
, mgpControl
, bnpControl
stctrl <- storeControl(score.switch=FALSE) # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, storage=stctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, score.switch=FALSE)
stctrl <- storeControl(score.switch=FALSE) # data(olive) # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, storage=stctrl) # Alternatively specify these arguments directly # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, score.switch=FALSE)
Training and test sets for the United States Postal Service (USPS) handwritten digits data, with 8-bit 16x16 grayscale grid representations of image scans of the digits "0" through "9".
data(USPSdigits)
data(USPSdigits)
A list of length 2 with the following elements, each one a data.frame
:
train
The training set of 7,291 digits.
test
The test set of 2,007 digits.
Each data.frame
contains the known digit labels in its first column.
The remaining 256 columns give the concatenation of the 16x16 grid.
Pixels are scaled such that [-1,1] corresponds to [white,black].
Hastie, T., Tibshirani, R., and Friedman, J. (2001). The Elements of Statistical Learning (2nd edition). Springer Series in Statistics. New York, NY, USA: Springer.
# Load the data and record the labels data(USPSdigits, package="IMIFA") ylab <- USPSdigits$train[,1] train <- USPSdigits$train[,-1] # Examine the effect of discarding peripheral pixels SDs <- apply(train, 2, sd) ind <- SDs > 0.7 dat <- train[,ind] hist(SDs, breaks=200, xlim=c(0, 1)) rect(0.7, 0, 1, 12, col=2, density=25) show_digit(ind) # retained pixels are shown in black
# Load the data and record the labels data(USPSdigits, package="IMIFA") ylab <- USPSdigits$train[,1] train <- USPSdigits$train[,-1] # Examine the effect of discarding peripheral pixels SDs <- apply(train, 2, sd) ind <- SDs > 0.7 dat <- train[,ind] hist(SDs, breaks=200, xlim=c(0, 1)) rect(0.7, 0, 1, 12, col=2, density=25) show_digit(ind) # retained pixels are shown in black
This function takes a Monte Carlo sample of cluster labels, computes an average similarity matrix and returns the clustering with minimum mean squared error to this average. The mcclust
package must be loaded.
Zsimilarity(zs)
Zsimilarity(zs)
zs |
A matrix containing samples of clustering labels where the columns correspond to the number of observations (N) and the rows correspond to the number of iterations (M). |
This function takes a Monte Carlo sample of cluster labels, converts them to adjacency matrices, and computes a similarity matrix as an average of the adjacency matrices. The dimension of the similarity matrix is invariant to label switching and the number of clusters in each sample, desirable features when summarising partitions of Bayesian nonparametric models such as IMIFA. As a summary of the posterior clustering, the clustering with minimum mean squared error to this 'average' clustering is reported.
A heatmap of z.sim
may provide a useful visualisation, if appropriately ordered. The user is also invited to perform hierarchical clustering using hclust
after first converting this similarity matrix to a distance matrix - "complete" linkage is recommended. Alternatively, hc
could be used.
A list containing three elements:
z.avg |
The 'average' clustering, with minimum squared distance to |
z.sim |
The N x N similarity matrix, in a sparse format (see |
MSE.z |
A vector of length M recording the MSEs between each clustering and the 'average' clustering. |
The mcclust
package must be loaded.
This is liable to take quite some time to run, especially if the number of observations &/or number of iterations is large. Depending on how distinct the clusters are, z.sim
may be stored better in a non-sparse format. This function can optionally be called inside get_IMIFA_results
.
Keefe Murphy - <[email protected]>
Carmona, C., Nieto-barajas, L. and Canale, A. (2018) Model based approach for household clustering with mixed scale variables. Advances in Data Analysis and Classification, 13(2): 559-583.
get_IMIFA_results
, simple_triplet_matrix
, hclust
, hc
, comp.psm
, cltoSim
# Run a IMIFA model and extract the sampled cluster labels # data(olive) # sim <- mcmc_IMIFA(olive, method="IMIFA", n.iters=5000) # zs <- sim[[1]][[1]]$z.store # Get the similarity matrix and visualise it # zsimil <- Zsimilarity(zs) # z.sim <- as.matrix(zsimil$z.sim) # z.col <- mat2cols(z.sim, cols=heat.colors(30, rev=TRUE)) # z.col[z.sim == 0] <- NA # plot_cols(z.col, na.col=par()$bg); box(lwd=2) # Extract the clustering with minimum squared distance to this # 'average' and evaluate its performance against the true labels # table(zsimil$z.avg, olive$area) # Perform hierarchical clustering on the distance matrix # Hcl <- hclust(as.dist(1 - z.sim), method="complete") # plot(Hcl) # table(cutree(Hcl, k=3), olive$area)
# Run a IMIFA model and extract the sampled cluster labels # data(olive) # sim <- mcmc_IMIFA(olive, method="IMIFA", n.iters=5000) # zs <- sim[[1]][[1]]$z.store # Get the similarity matrix and visualise it # zsimil <- Zsimilarity(zs) # z.sim <- as.matrix(zsimil$z.sim) # z.col <- mat2cols(z.sim, cols=heat.colors(30, rev=TRUE)) # z.col[z.sim == 0] <- NA # plot_cols(z.col, na.col=par()$bg); box(lwd=2) # Extract the clustering with minimum squared distance to this # 'average' and evaluate its performance against the true labels # table(zsimil$z.avg, olive$area) # Perform hierarchical clustering on the distance matrix # Hcl <- hclust(as.dist(1 - z.sim), method="complete") # plot(Hcl) # table(cutree(Hcl, k=3), olive$area)