This chapter gives an overview of the Expectation Maximization (EM) technique, proposed by
[1]
(although, as the authors note, the technique had already been proposed informally in the literature), in the context of the R environment. The
first section gives an introduction to representative clustering and
mixture models. The algorithm details and a case study are presented
in the following sections.
The R package used is MCLUST v3.3.2, developed by
Chris Fraley and Adrian Raftery and available in the CRAN repository. The
MCLUST tool includes the following features: normal
mixture modeling (EM); EM initialization through a hierarchical
clustering approach; estimation of the number of clusters based on the
Bayesian Information Criterion (BIC); and displays, including uncertainty
plots and dimension projections.
The information sources for this document fall into two groups:
(i) manuals and guides for R and MCLUST, including the technical
report
[2]
that is the base reference of this work and gives an overview of
MCLUST with several examples, and (ii) theoretical papers, surveys and
books found in the literature.
Introduction
Clustering consists of identifying groups of entities that share
common characteristics and are cohesive and well separated from each
other. Interest in clustering has increased due to applications in many
distinct knowledge areas, such as grouping customers and products in
massive datasets, document analysis in Web usage data, gene expression
from microarrays, and image analysis, where clustering is used for
segmentation.
Clustering methods can be grouped into classes. One widely used
class is hierarchical clustering, which initially considers each
point to be a group of its own and, at each iteration, merges the two groups
chosen to optimize some criterion. Popular criteria, proposed by
[3], include the sum of within-group sums of squares and the shortest distance between groups (the single-link method).
Another typical class is based on iterative relocation, in which data
points are moved from one group to another at each iteration. It is also called
representative clustering because a model is created for each
cluster to summarize the characteristics of its elements. The
most popular method in this class is K-Means, proposed by
[4], which combines iterative relocation with the sum-of-squares criterion.
In statistics and optimization it is usual to maximize or minimize a
function over its variables in a specific space. Since these optimization
problems come in many different types, each with its own
characteristics, many techniques have been developed to solve them. These
techniques are very important in data mining and knowledge discovery,
as they serve as the basis for more complex and powerful methods.
One of these techniques is
Maximum Likelihood,
whose main goal is to fit a statistical model to a specific data
set by estimating its unknown parameters, so that the resulting model
describes the data. In other words, the method
adjusts the variables of a statistical model from a dataset or a known
distribution, so that the model can "describe" each data sample and estimate
new ones.
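In symbols, given a data set x = {x1, ..., xN} and a parameter vector θ, maximum likelihood chooses the θ that maximizes the likelihood (or, equivalently, the log-likelihood) of the data; this is the standard definition, written out here only for clarity:

\[ L(\theta) = \prod_{k=1}^{N} p(x_k \mid \theta), \qquad \log L(\theta) = \sum_{k=1}^{N} \log p(x_k \mid \theta) \]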
It was later realized that clustering can be based on probability models that
account for missing values. This provides insight into when the data
conform to a model and has led to the development of new
clustering methods such as Expectation Maximization (EM), which is based
on the principle of Maximum Likelihood of unobserved variables in finite
mixture models.
Technique to be discussed
The EM algorithm is an unsupervised clustering method, that is, it does not
require a training phase. It is based on mixture models and follows an
iterative, sub-optimal approach that tries to find the parameters of
the probability distribution that maximize the likelihood of the data
attributes.
In general lines, the algorithm's inputs are the data set (x), the
total number of clusters (M), the accepted convergence error (e) and the
maximum number of iterations. At each iteration, the
E-step (Expectation) is executed first, estimating the probability that each point
belongs to each cluster; it is followed by the M-step (Maximization), which
re-estimates the parameter vector of the probability distribution of each
class. The algorithm finishes when the distribution parameters
converge or the maximum number of iterations is reached.
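As a rough illustration of this loop, the sketch below alternates mclust's mstep and estep functions by hand on the iris measurements. It is a minimal sketch assuming the mclust v3 interface; the random initialization, the variable names (tol, maxIter, prevMean) and the convergence test on the class means are our own choices, not part of the package.

### em_loop_sketch.R ###
# usage: R --no-save < em_loop_sketch.R
library(mclust)                # load mclust library

data    <- iris[, -5]          # data set (x): numeric attributes only
M       <- 3                   # total number of clusters (M)
tol     <- 1e-6                # accepted error to converge (e)
maxIter <- 100                 # maximum number of iterations

z <- unmap(sample(1:M, nrow(data), replace = TRUE))  # random initial membership matrix
prevMean <- matrix(0, ncol(data), M)                 # previous component means
for (i in 1:maxIter) {
  ms <- mstep("EEE", data, z)                        # M-step: re-estimate the parameter vector
  es <- estep("EEE", data, ms$parameters)            # E-step: membership probabilities P(Cj | xk)
  z  <- es$z
  if (max(abs(ms$parameters$mean - prevMean)) < tol) break  # convergence test on the class means
  prevMean <- ms$parameters$mean
}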
Algorithm
Initialization
Each class j, of the M classes (or clusters), is described by a
parameter vector (θ), composed of the mean (μj) and the covariance
matrix (Σj),
which represent the Gaussian (Normal) probability distribution
used to characterize the observed and unobserved entities of
the data set x.
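Written out, the density assumed for each point xk is the standard finite Gaussian mixture (the mixing proportions αj are not named explicitly in the text above, but are part of the parameter vector):

\[ p(x_k \mid \theta) = \sum_{j=1}^{M} \alpha_j \, \mathcal{N}(x_k \mid \mu_j, \Sigma_j), \qquad \sum_{j=1}^{M} \alpha_j = 1 \]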
At the initial instant (t = 0) the implementation can generate
the initial values of the mean (μj) and of the covariance matrix (Σj)
randomly.
The EM algorithm aims to approximate the parameter vector (θ) of the
real distribution. Another alternative offered by MCLUST is to
initialize EM with the clusters obtained by a hierarchical clustering
technique.
E-Step
This step is responsible for estimating the probability that each element belongs to each cluster
(P(Cj | xk)). Each element is described by an attribute vector (xk).
The relevance degree of a point with respect to a cluster is given by the
likelihood of its attributes compared with the attributes
of the other elements of cluster
Cj.
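Under the Gaussian mixture above, the membership probability computed in the E-step is the usual responsibility obtained from Bayes' Theorem (a standard formula, stated here for completeness):

\[ P(C_j \mid x_k) = \frac{\alpha_j \, \mathcal{N}(x_k \mid \mu_j, \Sigma_j)}{\sum_{i=1}^{M} \alpha_i \, \mathcal{N}(x_k \mid \mu_i, \Sigma_i)} \]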
M-Step
This step is responsible for estimating the parameters of the probability
distribution of each class for the next iteration. First, the
mean (μj) of class j is computed as the mean of all points
weighted by the relevance degree of each point.
To compute the covariance matrix for the next iteration, Bayes' Theorem is applied, which states that
P(A | B) = P(B | A) P(A) / P(B), based on the conditional probabilities of class occurrence.
The probability of occurrence of each class is computed as the mean of the membership probabilities
(P(Cj | xk)), weighted by the relevance degree of each point of the class.
Together, these estimates form the parameter vector θ that characterizes
the probability distribution of each class and will be used in the next
iteration of the algorithm.
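The corresponding update equations are those of the standard Gaussian mixture M-step (written out here since the text only describes them verbally), where N is the number of points:

\[ \alpha_j^{(t+1)} = \frac{1}{N} \sum_{k=1}^{N} P(C_j \mid x_k), \qquad \mu_j^{(t+1)} = \frac{\sum_{k=1}^{N} P(C_j \mid x_k)\, x_k}{\sum_{k=1}^{N} P(C_j \mid x_k)}, \]
\[ \Sigma_j^{(t+1)} = \frac{\sum_{k=1}^{N} P(C_j \mid x_k)\,\bigl(x_k - \mu_j^{(t+1)}\bigr)\bigl(x_k - \mu_j^{(t+1)}\bigr)^{T}}{\sum_{k=1}^{N} P(C_j \mid x_k)} \]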
Convergence Test
After each iteration a convergence test is performed, which verifies
whether the difference between the parameter vector of one iteration and that of the
previous iteration is smaller than an acceptable error tolerance, given
as a parameter. Some implementations use the difference between the
averages of the class distributions as the convergence criterion.
if (||θ(t + 1) − θ(t)|| < ε)
    stop
else
    call E-step
end
At each step the algorithm estimates a new parameter vector with
maximum local likelihood, not necessarily the global one, which reduces
its complexity. However, depending on the dispersion and volume of
the data, the algorithm may stop because the maximum number of
iterations was reached.
Implementation
Packages
The expectation-maximization algorithm in R
[5], as proposed in
[6],
uses the package mclust. This package contains crucial methods for
the execution of the clustering algorithm, including functions for the
E-step and M-step calculations. The package manual explains all of its
functions, including simple examples. This manual can be found in
[2][6].
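If mclust is not yet available locally, it can be installed from CRAN and loaded in the usual way (shown here for completeness):

install.packages("mclust")   # download and install the package from CRAN (run once)
library(mclust)              # load the package into the current R session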
The mclust package also provides various models for EM and for
hierarchical clustering (HC), which are defined by their covariance
structures. These models are presented in Table 1 and are explained in
detail in
[7].
Table 1: Covariance matrix structures.

identifier | Model | HC | EM | Distribution | Volume   | Shape    | Orientation
E          |       | *  | *  | (univariate) | equal    |          |
V          |       | *  | *  | (univariate) | variable |          |
EII        | λI    | *  | *  | Spherical    | equal    | equal    | NA
VII        | λkI   | *  | *  | Spherical    | variable | equal    | NA
EEI        | λA    |    | *  | Diagonal     | equal    | equal    | coordinate axes
VEI        | λkA   |    | *  | Diagonal     | variable | equal    | coordinate axes
EVI        | λAk   |    | *  | Diagonal     | equal    | variable | coordinate axes
VVI        | λkAk  |    | *  | Diagonal     | variable | variable | coordinate axes
EEE        | λDADT | *  | *  | Ellipsoidal  | equal    | equal    | equal
EEV        |       |    | *  | Ellipsoidal  | equal    | equal    | variable
VEV        |       |    | *  | Ellipsoidal  | variable | equal    | variable
VVV        |       | *  | *  | Ellipsoidal  | variable | variable | variable
Executing the Algorithm
The function "em" can be used for the expectation-maximization
method, as it implements EM for parameterized Gaussian Mixture
Models (GMM), starting with the E-step. This function takes the following
parameters:
- modelName: the name of the model used;
- data: the collected data, which must be entirely numerical. If
the data is given as a matrix, the rows represent the
samples (observations) and the columns the variables;
- parameters: the model parameters, which can include the components
pro, mean, variance and Vinv, corresponding to the mixture
proportions of the components, the mean of each component,
the variance parameters and the estimated hypervolume of the data region,
respectively;
- other: less relevant parameters which won't be described here. More details can be found in the package manual.
After the execution, the function will return:
- modelName: the name of the model;
- z: a matrix whose element in position [i,k] gives the
conditional probability that the ith sample belongs to the kth mixture
component;
- parameters: same as the input;
- others: other outputs which won't be discussed here. More details can be found in the package manual.
A simple example
In order to demonstrate how to use R to execute the
Expectation-Maximization method, the following code presents a
simple example on a test dataset. This example can also be found in the
package manual.
> modelName = "EEE"
> data = iris[,-5]
> z = unmap(iris[,5])
> msEst <- mstep(modelName, data, z)
> names(msEst)
> modelName = msEst$modelName
> parameters = msEst$parameters
> em(modelName, data, parameters)
The call to mstep executes the M-step so that the parameters used in the em
function can be generated. Its inputs
are the model name ("EEE"), the data set (the iris data without the species column) and, finally, the
z matrix, which contains the conditional probability that each class
contains each data sample. This z matrix is generated by the unmap
function from the known species labels.
After the M-step, names(msEst) lists the attributes of
the object returned by this function. The final line starts the
clustering process, using part of the M-step result as
input.
The clustering method returns the parameters estimated in the
process and the conditional probability that each sample falls in each
class. These parameters include the mean and the variance, the latter
corresponding to the use of the mclustVariance structure.
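As a quick, illustrative way to inspect that output (the variable name emResult is ours, not from the manual), one could store the result of the last call and look at the first rows of the membership matrix and the estimated means:

> emResult <- em(modelName, data, parameters)   # run EM starting from the M-step estimates
> round(head(emResult$z), 3)                    # membership probabilities of the first samples
> emResult$parameters$mean                      # estimated mean of each mixture component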
This section presents some examples of the visualizations available in the
MCLUST package. First, a simple example of the overall
clustering process is shown, from the choice of the number of clusters
through initialization to the final partitioning. Then a
didactic example compares a uniform random mixture with two
Gaussian mixtures.
Basic Example
This is a simple example showing the features offered by the MCLUST
package, applied to the faithful dataset (included in the R project).
First, the cluster analysis estimates the number of clusters that best
represents this data set, along with the covariance structure of the spread
points. This is performed through the Bayesian
Information Criterion (BIC), varying the number of clusters from 1 to
9. The BIC is the value of the maximized log-likelihood penalized
for the number of parameters in the model. Then the
hierarchical clustering technique (HC), which doesn't require an
initialization phase, is executed. The output of HC, that is, the cluster to which
each element belongs, is used to initialize the Expectation-Maximization
technique (EM). The charts produced after the execution of EM clustering are
shown below:
### basic_example.R ###
# usage: R --no-save < basic_example.R
library(mclust) # load mclust library
x = faithful[,1] # get the first column of the faithful data set
y = faithful[,2] # get the second column of the faithful data set
plot(x,y) # plot the spread points before the clustering
model <- Mclust(faithful) # estimate the number of cluster (BIC), initialize (HC) and clusterize (EM)
data = faithful # get the data set
plot(model, faithful) # plot the clustering results
[Figure: Basic points.pdf — spread of the faithful data points]
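To check which configuration BIC selected in this run, the fitted object can also be inspected directly (illustrative commands, assuming the usual components of an Mclust object in this package version):

summary(model)      # overview of the fitted model (if available in this mclust version)
model$G             # optimal number of clusters selected by BIC
model$modelName     # covariance structure selected (see Table 1)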
Didactic Example
A didactic example is developed to show two distinct scenarios: (a)
one in which the model does not represent the data and (b) another in which the
data conform to the model. This example is intended to show how EM
behaves with noisy and clean data sets.
a) Uniform random mixtures
A noisy data set is generated through a uniform random function, and the
spread of the points is shown in a chart. Then the clustering analysis is
executed with the default parameters (i.e., varying the number of clusters from 1 to 9). The
clustering tool shows a warning indicating that the best model
occurs at the min or max of the considered range; in this case it is the min, meaning that all points are
grouped in a single cluster.
### random_example.R ###
# usage: R --no-save < random_example.R
library(mclust) # load mclust library
x1 = runif(20) # generate 20 uniform random numbers for the x axis (1st class)
y1 = runif(20) # generate 20 uniform random numbers for the y axis (1st class)
x2 = runif(20) # generate 20 uniform random numbers for the x axis (2nd class)
y2 = runif(20) # generate 20 uniform random numbers for the y axis (2nd class)
rx = range(x1,x2) # get the x axis range
ry = range(y1,y2) # get the y axis range
plot(x1, y1, xlim=rx, ylim=ry) # plot the first class points
points(x2, y2) # plot the second class points
mix = matrix(nrow=40, ncol=2) # create a 40x2 data matrix
mix[,1] = c(x1, x2) # insert the x coordinates of both classes into the first column
mix[,2] = c(y1, y2) # insert the y coordinates of both classes into the second column
mixclust = Mclust(mix) # initialize EM with hierarchical clustering, execute BIC and EM
# Warning messages:
# 1: In summary.mclustBIC(Bic, data, G = G, modelNames = modelNames) :
# best model occurs at the min or max # of components considered
# 2: In Mclust(mix) : optimal number of clusters occurs at min choice
b) Two Gaussian mixtures
This scenario is composed of two well-separated data sets generated
through a Gaussian (Normal) distribution function. The points are shown
in the first chart. The EM clustering is applied and the results are
also shown in the graphs below. As we can see, EM clustering obtains
two Gaussian models that conform to the data.
### gaussian_example.R ###
# usage: R --no-save < gaussian_example.R
library(mclust) # load mclust library
x1 = rnorm(n=20, mean=1, sd=1) # 20 normally distributed points for the x axis with mean=1 and sd=1 (1st class)
y1 = rnorm(n=20, mean=1, sd=1) # 20 normally distributed points for the y axis with mean=1 and sd=1 (1st class)
x2 = rnorm(n=20, mean=5, sd=1) # 20 normally distributed points for the x axis with mean=5 and sd=1 (2nd class)
y2 = rnorm(n=20, mean=5, sd=1) # 20 normally distributed points for the y axis with mean=5 and sd=1 (2nd class)
rx = range(x1,x2) # get the x axis range
ry = range(y1,y2) # get the y axis range
plot(x1, y1, xlim=rx, ylim=ry) # plot the first class points
points(x2, y2) # plot the second class points
mix = matrix(nrow=40, ncol=2) # create a 40x2 data matrix
mix[,1] = c(x1, x2) # insert the x coordinates of both classes into the first column
mix[,2] = c(y1, y2) # insert the y coordinates of both classes into the second column
mixclust = Mclust(mix) # initialize EM with hierarchical clustering, execute BIC and EM
plot(mixclust, data = mix) # plot the two distinct clusters found
[Figures: Gaussian density.pdf, Gaussian uncertainty.pdf — density and uncertainty plots of the two clusters found]
Case Study
Scenario
The scenario to be analyzed is composed of a sample data set
named "wreath", available in the MCLUST package. A more detailed clustering analysis is
performed on this scenario, which is composed of 14 point
groups, a number that exceeds the maximum number of clusters allowed by the
default MCLUST parameters. The clustering technique is executed twice:
(i) first with the default MCLUST parameters and (ii) then with customized
parameters.
Input data
The input of the case study is the wreath data set provided by the
MCLUST package. This data set consists of 14 point groups, shown in the
next figure, which can be modeled with spherical or ellipsoidal structures that
take into account the orientation of the data due to its rotation.
### case_input.R ###
# usage: R --no-save < case_input.R
library(mclust) # load mclust library (provides the wreath data set)
plot(wreath[,1],wreath[,2]) # plot the 14 point groups
Execution
The clustering is executed twice. The first run is based on the
default parameters given by the MCLUST tool, which vary the number of
clusters from 1 to 9, fewer than necessary to fit the case-study data set.
The estimation of the number of clusters is shown in a
graphic that varies the number of clusters and computes the Bayesian
Information Criterion (BIC) for each value. We can see that, using the
default parameters, the BIC curve never reaches a peak, while we would
expect a peak followed by a decrease indicating the best number of clusters.
### case_default.R ###
# usage: R --no-save < case_default.R
library(mclust)
wreathBIC <- mclustBIC(wreath)
plot(wreathBIC, legendArgs = list(x = "topleft"))
Then the number of clusters is customized, modified to vary from 1
to 20 as shown below. The BIC technique gives the best number of
clusters, in this case 14, and the covariance structure that fits
the properties of this data set, which is EEV, meaning that the
clusters have similar shapes and similar volumes but variable
orientation. After executing the BIC method again, we can see that 14
clusters, indicated by the peak in the graphic, is the number of clusters
that presents the maximum likelihood for this data.
### case_customized.R ###
# usage: R --no-save < case_customized.R
library(mclust)
wreathDefault <- mclustBIC(wreath)
wreathCustomize <- mclustBIC(wreath, G = 1:20, x = wreathDefault)
plot(wreathCustomize, G = 10:20, legendArgs = list(x = "bottomleft"))
summary(wreathCustomize, wreath)
Output
The output of this clustering is obtained using the
method mclust2Dplot, with the density visualization, and is depicted in the
next figure. The clusters found are characterized by 14 models with
ellipsoidal distributions of different orientations, in conformance
with the data. The summary method can be executed afterwards to
analyse other aspects of the clustering result.
### case_output.R ###
# usage: R --no-save < case_output.R
library(mclust)
data(wreath)
wreathBIC <- mclustBIC(wreath)
wreathBIC <- mclustBIC(wreath, G = 1:20, x = wreathBIC)
wreathModel <- summary(wreathBIC, data = wreath)
mclust2Dplot(data = wreath, what = "density", identify = TRUE, parameters = wreathModel$parameters, z = wreathModel$z)
Analysis
We can see that the mixture models created to represent the points
conform to the data set. In this case the groups have no
intersection between them, so all points were assigned to the right
group. Modeling the cluster orientation allows the method to find better
ellipsoids to represent those points.
We conclude that the EM clustering technique, despite its dependence
on the number of clusters and on the initialization phase, is an efficient
method that produces good results for several scenarios of data
dispersion. The use of BIC to estimate the number of clusters, and of
hierarchical clustering (HC), which doesn't depend on the number of
clusters, to initialize the clusters improves the quality of the
results.
References
1. Geoffrey J. McLachlan and Thriyambakam Krishnan. The EM Algorithm and Extensions. Wiley-Interscience, 1st edition, November 1996.
2. C. Fraley and A. E. Raftery. MCLUST Version 3 for R: Normal Mixture Modeling and Model-Based Clustering. Technical Report 504, University of Washington, Department of Statistics, September 2006.
3. J. H. Ward. Hierarchical Grouping to Optimize an Objective Function. Journal of the American Statistical Association, 58:234-244, 1963.
4. J. MacQueen. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, eds. L. M. Le Cam and J. Neyman, Berkeley, CA: University of California Press, pp. 281-297, 1967.
5. John M. Chambers. R Language Definition. http://cran.r-project.org/doc/manuals/R-lang.html.
6. Chris Fraley and Adrian E. Raftery. Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering. Journal of Classification, 24(2):155-181, 2007.
7. C. Fraley and A. E. Raftery. Model-Based Clustering, Discriminant Analysis and Density Estimation. Journal of the American Statistical Association, 97:611-631, 2002.
8. Leslie Burkholder. Monty Hall and Bayes's Theorem, 2000.
9. A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39(1):1-38, 1977.