StepMixR : R interface to Python package StepMix


For the Python package StepMix itself, please refer to its own repository.

StepMixR is an R interface to StepMix, a Python package following the scikit-learn API for model-based clustering and generalized mixture modeling (latent class/profile analysis) of continuous and categorical data. StepMix handles missing values through Full Information Maximum Likelihood (FIML) and provides multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory. Additional features include support for covariates and distal outcomes, various simulation utilities, and non-parametric bootstrapping, which allows inference in semi-supervised and unsupervised settings.

Reference

If you find StepMix useful, please consider citing our arXiv preprint:

@article{morin2023stepmix,
  title={StepMix: A Python Package for Pseudo-Likelihood Estimation of Generalized Mixture Models with External Variables},
  author={Morin, Sacha and Legault, Robin and Bakk, Zsuzsa and Gigu{\`e}re, Charles-{\'E}douard and de la Sablonni{\`e}re, Roxane and Lacourse, {\'E}ric},
  journal={arXiv preprint arXiv:2304.03853},
  year={2023}
}

Install

You can install StepMixR from CRAN from within R using the function install.packages:

install.packages("stepmixr")

To install the development version directly from GitHub, you need the devtools package. Once it is installed, use the following call:

devtools::install_github("Labo-Lacourse/stepmixr")

Tutorials

  1. A notebook available on Google Colab gives a detailed tutorial based on the iris dataset. This notebook is an R adaptation of a similar Python notebook, which can be found here. The tutorial covers:
    1. Continuous LCA models;
    2. Binary LCA models;
    3. Categorical LCA models;
    4. Mixed LCA models (continuous and categorical data);
    5. Missing values.
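
As a rough illustration of the data types listed above (not a substitute for the notebook), the sketch below derives binary, categorical, and incomplete versions of the iris measurements in base R. The measurement strings "binary", "categorical", and "continuous_nan" are taken from the Python StepMix documentation and are assumed to pass through StepMixR unchanged. The integer coding starting at 0 is likewise an assumption about what the Python backend expects. The model fits only run if stepmixr and its Python backend are installed.

```r
X <- iris[, 1:4]

# Binary indicators: 1 if a value is above its column median, 0 otherwise.
X_binary <- as.data.frame(lapply(X, function(col) as.integer(col > median(col))))

# Categorical indicators: three equal-width bins per column, coded 0, 1, 2.
X_categorical <- as.data.frame(lapply(X, function(col) {
  as.integer(cut(col, breaks = 3)) - 1L
}))

# Missing values: blank out 30 randomly chosen cells.
set.seed(42)
X_mat <- as.matrix(X)
X_mat[sample(length(X_mat), 30)] <- NA
X_missing <- as.data.frame(X_mat)

# Fit one model per data type (skipped when stepmixr is not available).
if (requireNamespace("stepmixr", quietly = TRUE)) {
  library(stepmixr)
  fit_binary <- fit(stepmix(n_components = 3, measurement = "binary"), X_binary)
  fit_categorical <- fit(stepmix(n_components = 3, measurement = "categorical"),
                         X_categorical)
  # The "_nan" suffix is assumed to select the FIML variant for incomplete data.
  fit_missing <- fit(stepmix(n_components = 3, measurement = "continuous_nan"),
                     X_missing)
}
```

Mixed models, which combine several of these measurement types in one fit, are left to the notebook.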

Quickstart

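Here is a quick example from the R documentation. It fits a three-class model to the four iris measurements and predicts a class for each observation.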

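  library(stepmixr)

  # Three latent classes, two-step estimation, continuous indicators
  model1 <- stepmix(n_components = 3, n_steps = 2, measurement = "continuous")
  X <- iris[, 1:4]             # the four numeric iris measurements
  fit1 <- fit(model1, X)       # estimate the model
  pr1 <- predict(fit1, X)      # predicted class for each row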
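
One way to check the result is to cross-tabulate the predicted classes against the known species. The sketch below assumes that predict returns one class label per row and that stepmix accepts a random_state argument, as the Python StepMix class does. compare_classes is a hypothetical helper name introduced here, not part of StepMixR.

```r
X <- iris[, 1:4]

# Hypothetical helper: contingency table of predicted classes vs. known labels.
compare_classes <- function(predicted, actual) {
  table(predicted = predicted, actual = actual)
}

# Runs only when stepmixr (and its Python backend) is installed.
if (requireNamespace("stepmixr", quietly = TRUE)) {
  library(stepmixr)
  # random_state is assumed to be forwarded to the Python estimator.
  model1 <- stepmix(n_components = 3, n_steps = 2, measurement = "continuous",
                    random_state = 123)
  fit1 <- fit(model1, X)
  pr1 <- predict(fit1, X)
  print(compare_classes(pr1, iris$Species))
}
```

Class labels are arbitrary, so a good fit shows up as one dominant cell per row and column rather than as a diagonal table.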