Title: | Co-Clustering of Ordinal Data via Latent Continuous Random Variables |
---|---|
Description: | It implements functions for simulation and estimation of the ordinal latent block model (OLBM), as described in Corneli, Bouveyron and Latouche (2019). |
Authors: | Marco Corneli, Charles Bouveyron and Pierre Latouche |
Maintainer: | Marco Corneli <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0 |
Built: | 2024-11-08 03:11:21 UTC |
Source: | https://github.com/cran/ordinalLBM |
It estimates the OLBM model parameters as well as the most likely posterior cluster assignments by maximum likelihood.
olbm(Y, Q, L, init = "kmeans", eps = 1e-04, it_max = 500, verbose = TRUE)
Y | An M x P ordinal matrix, containing ordinal entries from 1 to K. Missing data are coded as zeros. |
Q | The number of row clusters. |
L | The number of column clusters. |
init | A string specifying the initialisation type. It can be "kmeans" (the default) or "random" for a single random initialisation. |
eps | When the difference between two consecutive values of the log-likelihood is smaller than eps, the M-EM algorithms will stop. |
it_max | The maximum number of iterations that the M-EM algorithms will perform (even if the tolerance eps has not been reached). |
verbose | A boolean specifying whether extended information should be displayed or not (TRUE by default). |
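Because Y must hold scores from 1 to K with zeros for missing entries, a ratings matrix that stores missing values as NA needs recoding before it is passed to olbm. The following is a minimal base R sketch; the toy matrix `ratings` is hypothetical and only illustrates the coding convention.

```r
# Hypothetical toy ratings matrix: scores from 1 to 5, NA where no score was given.
ratings <- matrix(c( 5, NA,  3,
                     1,  2, NA,
                    NA,  4,  4,
                     2,  5,  1),
                  nrow = 4, byrow = TRUE)

# Recode missing entries as zeros, the convention described for Y.
Y <- ratings
Y[is.na(Y)] <- 0

# Sanity check: every non-zero entry is an integer score between 1 and K.
K <- max(Y)
observed <- Y[Y != 0]
stopifnot(all(observed %in% seq_len(K)))
```

The resulting Y could then be supplied as the first argument of olbm together with the chosen Q and L.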
It returns an S3 object of class "olbm" containing:
estR | the estimated row cluster memberships. |
estC | the estimated column cluster memberships. |
likeli | the final value of the log-likelihood. |
icl | the value of the ICL criterion. |
Pi | the Q x L estimated connectivity matrix. |
mu | a Q x L matrix containing the estimated means of the latent Gaussian distributions. |
sd | a Q x L matrix containing the estimated standard deviations of the latent Gaussian distributions. |
eta | a Q x L x K array whose entry (q,l,k) is the estimated probability that a user in the q-th row cluster assigns the score k to a product in the l-th column cluster. |
rho | the estimated row cluster proportions. |
delta | the estimated column cluster proportions. |
initR | the initial row cluster assignments provided to the C-EM algorithm. |
initC | the initial column cluster assignments provided to the C-EM algorithm. |
Y | the input ordinal matrix Y. |
thresholds | the values (1.5, 2.5, ... , K-0.5) of the thresholds, defined inside the function olbm. |
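To see how mu, sd, thresholds and eta may relate, the sketch below computes the score probabilities of a single block as the Gaussian mass between consecutive thresholds. It is an illustration of the latent Gaussian idea, not code taken from the package: the parameter values `mu_ql` and `sd_ql` are made up, and padding the thresholds with -Inf and +Inf is an assumption.

```r
K <- 5

# Interior thresholds (1.5, 2.5, ..., K - 0.5), as listed for the 'thresholds' item.
thresholds <- seq(1.5, K - 0.5, by = 1)

# Assumption: the lowest and highest scores correspond to unbounded intervals.
cuts <- c(-Inf, thresholds, Inf)

# Hypothetical latent Gaussian parameters for one block (q, l).
mu_ql <- 3.4
sd_ql <- 1.4

# Probability of each score k: Gaussian mass between consecutive cut points.
eta_ql <- diff(pnorm(cuts, mean = mu_ql, sd = sd_ql))
names(eta_ql) <- seq_len(K)
round(eta_ql, 3)
```

Under these assumptions the K probabilities sum to one, which is a quick consistency check on any slice `eta[q, l, ]` of a fitted object.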
Corneli M., Bouveyron C. and Latouche P. (2019) Co-Clustering of ordinal data via latent continuous random variables and a classification EM algorithm. (https://hal.archives-ouvertes.fr/hal-01978174)
data(olbm_dat)
res <- olbm(olbm_dat$Y, Q=3, L=2)
It is a list containing i) an ordinal toy data matrix simulated according to OLBM and ii) the row/column cluster assignments. To see how the data are simulated, you can type "?simu.olbm" in the R console and look at "Examples".
data(olbm_dat)
A list containing three items:
Y: an ordinal data matrix simulated according to OLBM.
Rclus: the actual row cluster assignments.
Cclus: the actual column cluster assignments.
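Since the actual assignments are available for this toy data set, estimated and actual partitions can be compared with a contingency table. The sketch below uses two hypothetical label vectors, `truth` and `est`, in place of real output; in practice `est` would be something like the estR item returned by olbm. Cluster labels are arbitrary, so the code relabels before counting agreements, using a simple majority rule rather than an optimal matching.

```r
# Hypothetical label vectors standing in for actual and estimated row clusters.
truth <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
est   <- c(2, 2, 2, 3, 3, 1, 1, 1, 1)

# Contingency table: rows are estimated labels, columns are actual labels.
tab <- table(estimated = est, actual = truth)
print(tab)

# Map each estimated cluster to the actual cluster it overlaps most.
# This majority rule is a simplification and need not be one-to-one.
relabel <- as.numeric(colnames(tab))[apply(tab, 1, which.max)]
names(relabel) <- rownames(tab)

# Share of rows whose relabelled estimate agrees with the actual cluster.
mapped <- relabel[as.character(est)]
agreement <- mean(mapped == truth)
agreement
```

A table with a single dominant cell per row and per column indicates that the two partitions agree up to a permutation of the labels.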
It plots the re-organized incidence matrix and/or the estimated Gaussian densities.
## S3 method for class 'olbm'
plot(x, type = "hist", ...)
x | The "olbm" object output of the function olbm. |
type | A string specifying the type of plot to be produced. The currently supported values are "hist" and "incidence". |
... | Additional parameters to pass to sub-functions. |
data(olbm_dat)
res <- olbm(olbm_dat$Y, Q=3, L=2)
plot(res, "hist")
plot(res, "incidence")
It simulates an ordinal data matrix according to OLBM.
simu.olbm(M, P, Pi, rho, delta, mu, sd, thresh)
M | The number of rows of the ordinal matrix Y. |
P | The number of columns of the ordinal matrix Y. |
Pi | A Q x L connectivity matrix to manage missing data (coded as zeros in Y). |
rho | A vector of length Q, containing multinomial probabilities for row cluster assignments. |
delta | A vector of length L, containing multinomial probabilities for column cluster assignments. |
mu | A Q x L matrix containing the means of the latent Gaussian distributions. |
sd | A Q x L matrix containing the standard deviations of the latent Gaussian distributions. |
thresh | A vector of length K+1 containing the sorted thresholds used to simulate the ordinal entries in Y, where K is the number of ordinal modalities. The first entry in thresh must be -Inf, the last entry +Inf. |
It returns a list containing:
Y | An M x P matrix. The observed ordinal entries are integers between 1 and K. Missing data are coded as zeros. |
Rclus | A vector of length M containing the row cluster memberships. |
Cclus | A vector of length P containing the column cluster memberships. |
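The role of thresh can be illustrated for a single block: a latent Gaussian value is drawn, and the ordinal score is the index of the threshold interval it falls into. The base R sketch below is not the package's implementation. The block parameters are made up, and treating the connectivity value `pi_ql` as the probability that an entry is observed (rather than missing) is an assumption.

```r
set.seed(1)

# Sorted thresholds with -Inf and +Inf at the ends, so K = 5 ordinal scores.
thresh <- c(-Inf, 2.37, 2.67, 3.18, 4.33, Inf)
K <- length(thresh) - 1

# Hypothetical parameters for one block (q, l).
mu_ql <- 3.4
sd_ql <- 1.4
pi_ql <- 0.7   # assumption: probability that an entry is observed
n <- 1000      # number of entries drawn for this block

# Latent continuous values, then the index of the interval each one falls into.
z <- rnorm(n, mean = mu_ql, sd = sd_ql)
y <- findInterval(z, thresh)   # integers in 1..K

# Entries that are not observed are coded as zeros.
observed <- rbinom(n, size = 1, prob = pi_ql)
y[observed == 0] <- 0

table(y)
```

Because the first threshold is -Inf and the last is +Inf, every latent value falls into exactly one of the K intervals, which is why thresh needs K+1 entries.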
Corneli M., Bouveyron C. and Latouche P. (2019) Co-Clustering of ordinal data via latent continuous random variables and a classification EM algorithm. (https://hal.archives-ouvertes.fr/hal-01978174)
M <- 150
P <- 100
Q <- 3
L <- 2

## connectivity matrix
Pi <- matrix(.7, nrow = Q, ncol = L)
Pi[1,1] <- Pi[2,2] <- Pi[3,2] <- .5

## cluster memberships proportions
rho <- c(1/3, 1/3, 1/3)
delta <- c(1/2, 1/2)

## thresholds
thresh <- c(-Inf, 2.37, 2.67, 3.18, 4.33, Inf) # K = 5

## Gaussian parameters
mu <- matrix(c(0, 3.4, 2.6, 0, 2.6, 3.4), nrow = Q, ncol = L)
sd <- matrix(c(1.2, 1.4, 1.0, 1.2, 1.4, 1.0), nrow = Q, ncol = L)

## Data simulation
dat <- simu.olbm(M, P, Pi, rho, delta, mu, sd, thresh)