Package 'ordinalLBM'

Title: Co-Clustering of Ordinal Data via Latent Continuous Random Variables
Description: It implements functions for simulation and estimation of the ordinal latent block model (OLBM), as described in Corneli, Bouveyron and Latouche (2019).
Authors: Marco Corneli, Charles Bouveyron and Pierre Latouche
Maintainer: Marco Corneli <[email protected]>
License: GPL (>= 2)
Version: 1.0
Built: 2024-11-08 03:11:21 UTC
Source: https://github.com/cran/ordinalLBM

Fitting OLBM to the data

Description

It estimates the OLBM parameters, as well as the most likely posterior cluster assignments, by maximum likelihood.

Usage

olbm(Y, Q, L, init = "kmeans", eps = 1e-04, it_max = 500,
  verbose = TRUE)

Arguments

Y

An M x P ordinal matrix, containing ordinal entries from 1 to K. Missing data are coded as zeros.

Q

The number of row clusters.

L

The number of column clusters.

init

A string specifying the initialisation type. It can be "kmeans" (the default) or "random" for a single random initialisation.

eps

When the difference between two consecutive values of the log-likelihood is smaller than eps, the M-EM algorithm stops.

it_max

The maximum number of iterations that the M-EM algorithm will perform, even if the tolerance eps has not been reached.

verbose

A boolean specifying whether extended information should be displayed or not (TRUE by default).
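
The interplay between eps and it_max can be pictured with a generic iterative loop. The sketch below is not the package's internal code: update_loglik is a hypothetical stand-in for one iteration of the estimation algorithm, chosen only so that the log-likelihood increases and converges.

```r
# Hypothetical stand-in for one iteration of the algorithm (not part of ordinalLBM):
# returns an increasing log-likelihood that converges to -100
update_loglik <- function(it) -100 * (1 + 0.5^it)

eps    <- 1e-04   # tolerance, same default as in olbm()
it_max <- 500     # iteration cap, same default as in olbm()

loglik_old <- -Inf
for (it in seq_len(it_max)) {
  loglik_new <- update_loglik(it)
  # stop as soon as two consecutive values differ by less than eps
  if (abs(loglik_new - loglik_old) < eps) break
  loglik_old <- loglik_new
}
n_iter <- it   # number of iterations actually performed
```

With this toy sequence the loop stops well before it_max; with a slowly converging sequence, it_max would end the loop instead.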

Value

It returns an S3 object of class "olbm" containing:

estR

the estimated row cluster memberships.

estC

the estimated column cluster memberships.

likeli

the final value of the log-likelihood.

icl

the value of the ICL criterion.

Pi

the Q x L estimated connectivity matrix.

mu

a Q x L matrix containing the estimated means of the latent Gaussian distributions.

sd

a Q x L matrix containing the estimated standard deviations of the latent Gaussian distributions.

eta

a Q x L x K array whose entry (q,l,k) is the estimated probability that a user in the q-th row cluster assigns the score k to a product in the l-th column cluster.

rho

the estimated row cluster proportions.

delta

the estimated column cluster proportions.

initR

the initial row cluster assignments provided to the C-EM algorithm.

initC

the initial column cluster assignments provided to the C-EM algorithm.

Y

the input ordinal matrix Y.

thresholds

the values (1.5, 2.5, ... , K-0.5) of the thresholds, defined inside the function olbm.
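
Under the latent Gaussian formulation, an entry of eta can be read as the Gaussian probability mass lying between two consecutive thresholds. The following base R sketch illustrates that relation. It assumes the thresholds are extended with -Inf and +Inf (as required by simu.olbm), uses illustrative parameter values rather than package output, and is not taken from the package source.

```r
# Illustrative parameters (assumptions, not estimates returned by olbm)
Q <- 2; L <- 2; K <- 5
mu <- matrix(c(1.0, 3.0, 4.5, 2.0), nrow = Q, ncol = L)
sd <- matrix(c(1.0, 0.8, 1.2, 1.0), nrow = Q, ncol = L)

# Thresholds (1.5, 2.5, ..., K - 0.5), padded with -Inf and +Inf: K + 1 cut points
thresh <- c(-Inf, seq(1.5, K - 0.5, by = 1), Inf)

eta <- array(NA_real_, dim = c(Q, L, K))
for (k in seq_len(K)) {
  # probability that the latent Gaussian of block (q, l) falls between
  # thresh[k] and thresh[k + 1], computed for all blocks at once
  eta[, , k] <- pnorm(thresh[k + 1], mean = mu, sd = sd) -
    pnorm(thresh[k], mean = mu, sd = sd)
}

# For every block (q, l) the K probabilities sum to one
block_totals <- apply(eta, c(1, 2), sum)
```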

References

Corneli M., Bouveyron C. and Latouche P. (2019) Co-Clustering of ordinal data via latent continuous random variables and a classification EM algorithm. (https://hal.archives-ouvertes.fr/hal-01978174)

Examples

data(olbm_dat)
res <- olbm(olbm_dat$Y, Q=3, L=2)
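
Since olbm expects missing entries to be coded as zeros, data that store them as NA need recoding first. A minimal base R sketch, where Y_raw and K are hypothetical and not part of the package:

```r
# Hypothetical raw ratings with NA for missing entries (not from the package)
Y_raw <- matrix(c(1, NA, 3,
                  5,  2, NA), nrow = 2, byrow = TRUE)
K <- 5   # assumed number of ordinal levels

# Recode missing entries as zeros, the coding described for the Y argument
Y <- Y_raw
Y[is.na(Y)] <- 0
storage.mode(Y) <- "integer"

# Every entry should now be an integer between 0 and K
valid <- all(Y %in% 0:K)
```

The resulting Y could then be passed to olbm in place of olbm_dat$Y.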

OLBM simulated data

Description

It is a list containing i) an ordinal toy data matrix simulated according to OLBM and ii) the row/column cluster assignments. To see how the data are simulated, type "?simu.olbm" in the R console and look at "Examples".

Usage

data(olbm_dat)

Format

A list containing three items.

Y

an ordinal data matrix simulated according to OLBM.

Rclus

the actual row cluster assignments.

Cclust

the actual column cluster assignments.
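
Because the true assignments are stored alongside the data, an estimated partition can be compared with them through a contingency table. Cluster labels are only defined up to a permutation, so the table, rather than element-wise equality, is the natural check. The sketch below uses short hypothetical vectors standing in for the true row assignments and for the estR component returned by olbm.

```r
# Hypothetical stand-ins for the true row assignments and for res$estR
true_rows <- c(1, 1, 2, 2, 3, 3, 3)
est_rows  <- c(2, 2, 3, 3, 1, 1, 1)   # same partition, labels permuted

# Cross-tabulate the two partitions
tab <- table(truth = true_rows, estimate = est_rows)

# Perfect recovery up to relabelling: exactly one non-zero cell
# in every row and every column of the table
perfect <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
```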


Plot OLBM

Description

It plots the re-organized incidence matrix and/or the estimated Gaussian densities.

Usage

## S3 method for class 'olbm'
plot(x, type = "hist", ...)

Arguments

x

The "olbm" object returned by the function olbm.

type

A string specifying the type of plot to be produced. The currently supported values are "hist" and "incidence".

...

Additional parameters to pass to sub-functions.

Examples

data(olbm_dat)
res <- olbm(olbm_dat$Y, Q=3, L=2)   
plot(res, "hist")
plot(res, "incidence")
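
The idea behind a re-organized incidence matrix can be sketched in base R: rows and columns are permuted so that members of the same cluster become adjacent, which makes the block structure visible. This is an illustration on toy data with hypothetical labels, not a description of how the plot method is implemented.

```r
# Toy ordinal matrix and hypothetical cluster labels (not package output)
Y <- matrix(c(1, 5, 1, 5,
              4, 2, 4, 2,
              1, 5, 1, 5), nrow = 3, byrow = TRUE)
estR <- c(1, 2, 1)       # row cluster of each row
estC <- c(1, 2, 1, 2)    # column cluster of each column

# Permute rows and columns so that each cluster is contiguous
Y_sorted <- Y[order(estR), order(estC)]
```

After sorting, the first two rows read 1 1 5 5 and the last row reads 4 4 2 2, so each block holds a single value. A call such as image(Y_sorted) could then be used to display it.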

Simulate OLBM data

Description

It simulates an ordinal data matrix according to OLBM.

Usage

simu.olbm(M, P, Pi, rho, delta, mu, sd, thresh)

Arguments

M

The number of rows of the ordinal matrix Y.

P

The number of columns of the ordinal matrix Y.

Pi

A Q x L connectivity matrix to manage missing data (coded as zeros in Y).

rho

A vector of length Q, containing multinomial probabilities for row cluster assignments.

delta

A vector of length L, containing multinomial probabilities for column cluster assignments.

mu

A Q x L matrix containing the means of the latent Gaussian distributions.

sd

A Q x L matrix containing the standard deviations of the latent Gaussian distributions.

thresh

A vector of length K+1 containing the sorted thresholds used to simulate the ordinal entries in Y, where K is the number of ordinal modalities. The first entry in thresh must be -Inf, the last entry +Inf.
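
The constraints stated for these arguments can be verified before simulating. The helper below, check_olbm_args, is hypothetical and not part of ordinalLBM. It also assumes that the entries of Pi are probabilities in [0, 1] and that the standard deviations are strictly positive.

```r
# Hypothetical helper (not part of ordinalLBM): checks argument consistency
# and returns K, the number of ordinal modalities implied by thresh
check_olbm_args <- function(Pi, rho, delta, mu, sd, thresh) {
  Q <- length(rho)
  L <- length(delta)
  stopifnot(
    isTRUE(all.equal(sum(rho), 1)),      # row proportions sum to one
    isTRUE(all.equal(sum(delta), 1)),    # column proportions sum to one
    all(dim(Pi) == c(Q, L)),
    all(dim(mu) == c(Q, L)),
    all(dim(sd) == c(Q, L)),
    all(Pi >= 0 & Pi <= 1),              # assumption: Pi holds probabilities
    all(sd > 0),
    !is.unsorted(thresh),                # thresholds must be sorted
    thresh[1] == -Inf,
    thresh[length(thresh)] == Inf
  )
  length(thresh) - 1
}

Q <- 3; L <- 2
Pi     <- matrix(.7, nrow = Q, ncol = L)
rho    <- c(1/3, 1/3, 1/3)
delta  <- c(1/2, 1/2)
mu     <- matrix(c(0, 3.4, 2.6, 0, 2.6, 3.4), nrow = Q, ncol = L)
sd     <- matrix(c(1.2, 1.4, 1.0, 1.2, 1.4, 1.0), nrow = Q, ncol = L)
thresh <- c(-Inf, 2.37, 2.67, 3.18, 4.33, Inf)

K <- check_olbm_args(Pi, rho, delta, mu, sd, thresh)   # 5
```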

Value

It returns a list containing:

Y

An M x P matrix. The observed ordinal entries are integers between 1 and K. Missing data are coded as zeros.

Rclus

A vector of length M containing the row cluster memberships.

Cclus

A vector of length P containing the column cluster memberships.
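
To make the roles of the arguments and of the returned items concrete, here is a base R sketch of a generative process consistent with the description above. It is not the source of simu.olbm. In particular, it assumes that Pi[q, l] is the probability that an entry in block (q, l) is observed, the complement being the probability of a zero.

```r
set.seed(1)
M <- 6; P <- 4; Q <- 2; L <- 2
Pi     <- matrix(0.7, Q, L)                    # assumed: probability of being observed
rho    <- c(0.5, 0.5)
delta  <- c(0.5, 0.5)
mu     <- matrix(c(1, 3, 4, 2), Q, L)
sd     <- matrix(1, Q, L)
thresh <- c(-Inf, 1.5, 2.5, 3.5, 4.5, Inf)     # K = 5

# Cluster memberships drawn from the multinomial proportions
Rclus <- sample(Q, M, replace = TRUE, prob = rho)
Cclus <- sample(L, P, replace = TRUE, prob = delta)

# mu[Rclus, Cclus] expands block parameters to one value per cell (M x P)
Z <- matrix(rnorm(M * P, mean = mu[Rclus, Cclus], sd = sd[Rclus, Cclus]), M, P)

# Observation mask: 1 = observed, 0 = missing
A <- matrix(rbinom(M * P, size = 1, prob = Pi[Rclus, Cclus]), M, P)

# Cut the latent values at the thresholds (scores 1..K), then zero out missing cells
Y <- matrix(findInterval(Z, thresh), M, P) * A
```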

References

Corneli M., Bouveyron C. and Latouche P. (2019) Co-Clustering of ordinal data via latent continuous random variables and a classification EM algorithm. (https://hal.archives-ouvertes.fr/hal-01978174)

Examples

M <- 150                                    
P <- 100 
Q <- 3
L <- 2

## connectivity matrix
Pi <- matrix(.7, nrow = Q, ncol = L)
Pi[1,1] <- Pi[2,2] <- Pi[3,2] <- .5

## cluster memberships proportions
rho <- c(1/3, 1/3 ,1/3)
delta <- c(1/2, 1/2)

## thresholds
thresh <- c(-Inf, 2.37, 2.67, 3.18, 4.33, Inf)     # K = 5

## Gaussian parameters
mu <- matrix(c(0, 3.4, 2.6, 0, 2.6, 3.4), nrow = Q, ncol = L)   
sd <- matrix(c(1.2,1.4,1.0,1.2,1.4,1.0), nrow = Q, ncol = L)

## Data simulation
dat <- simu.olbm(M, P, Pi, rho, delta, mu, sd, thresh)
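
To see what the thresholds in this example do, the following base R sketch maps a few hypothetical latent values to ordinal scores with findInterval. It only illustrates the cutting step and does not call the package. How a value lying exactly on a threshold is handled inside simu.olbm is not specified here.

```r
thresh <- c(-Inf, 2.37, 2.67, 3.18, 4.33, Inf)   # K = 5, as in the example above
z <- c(0, 2.5, 3, 4, 5)                           # hypothetical latent values

# Index of the interval containing each latent value, i.e. its ordinal score
scores <- findInterval(z, thresh)                 # 1 2 3 4 5
```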