Cumulative Link Models with CmdStanR
Overview
clmstan fits cumulative link models (CLMs) for ordinal categorical data using CmdStanR. It supports 11 link functions including standard links (logit, probit, cloglog) and flexible parametric links (GEV, AEP, Symmetric Power).
Models are pre-compiled using the instantiate package for fast execution without runtime compilation.
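As a rough illustration of what a cumulative link model computes, the base R sketch below (not clmstan code) assumes the common parameterization P(Y ≤ k | x) = F(c_k − xβ), where F is the inverse link (here the logistic CDF) and c_k are ordered thresholds. This matches how the Quick Start data in this README are simulated; the threshold, coefficient, and covariate values are made up for illustration.

```r
# Category probabilities under a cumulative logit model (illustrative values).
thresholds <- c(-1, 0, 1)   # ordered cut points c_1 < c_2 < c_3 (4 categories)
beta <- 1.0                 # regression coefficient
x <- 0.5                    # a single covariate value

# Cumulative probabilities P(Y <= k | x) = F(c_k - x * beta), F = logistic CDF
cum_prob <- plogis(thresholds - x * beta)

# Category probabilities are successive differences of the cumulative ones
cat_prob <- diff(c(0, cum_prob, 1))
names(cat_prob) <- 1:4

print(round(cat_prob, 3))
```

Swapping `plogis()` for `pnorm()` or `pcauchy()` gives the probit and cauchit versions of the same calculation.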
Documentation
Full documentation is available at: https://t-momozaki.github.io/clmstan/
Installation
Prerequisites
This package requires:
- CmdStan - Stan’s command-line interface
- cmdstanr - R interface to CmdStan (not on CRAN)
Step 1: Install cmdstanr
# Install cmdstanr from r-universe (recommended)
install.packages("cmdstanr",
                 repos = c("https://stan-dev.r-universe.dev",
                           getOption("repos")))
Step 2: Install CmdStan
library(cmdstanr)
install_cmdstan()  # Only needed once
Step 3: Install clmstan
# From CRAN (when available)
install.packages("clmstan")
# From GitHub (development version)
# install.packages("devtools")
devtools::install_github("t-momozaki/clmstan")
Note: During package installation, Stan models are compiled automatically. This may take a few minutes on first install.
Quick Start
library(clmstan)
# Example data
set.seed(123)
n <- 100
x <- rnorm(n)
latent <- 1.0 * x + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 0, 1, Inf), labels = 1:4)
data <- data.frame(y = y, x = x)
# Fit a cumulative link model with logit link
fit <- clm_stan(y ~ x, data = data, link = "logit",
                chains = 4, iter = 2000, warmup = 1000)
# View results
fit$fit$summary(variables = c("beta", "c_transformed", "beta0"))
Supported Link Functions
Standard Links (5)
| Link | Distribution | Use Case |
|---|---|---|
| logit | Logistic | Default, proportional odds |
| probit | Normal | Symmetric, latent variable interpretation |
| cloglog | Gumbel (max) | Asymmetric, proportional hazards |
| loglog | Gumbel (min) | Asymmetric |
| cauchit | Cauchy | Heavy tails |
Flexible Links with Parameters (6)
| Link | Parameter | Description |
|---|---|---|
| tlink | $\nu$ | $t$-distribution, adjustable tail weight |
| aranda_ordaz | $\lambda$ | Generalized asymmetric link |
| sp | $r$, base | Symmetric Power, adjustable skewness |
| log_gamma | $\lambda$ | Continuous symmetric/asymmetric adjustment |
| gev | $\xi$ | Generalized Extreme Value |
| aep | $\theta_1$, $\theta_2$ | Asymmetric Exponential Power |
Threshold Structures
| Structure | Description |
|---|---|
| flexible | Free thresholds (default) |
| equidistant | Equal spacing between thresholds |
| symmetric | Symmetric around center |
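To make the equidistant structure concrete, here is a minimal base R sketch (not clmstan internals). It assumes the thresholds are built as c_k = c_1 + (k − 1)·d from a first threshold and a positive spacing, which is what the `c1` and `d` prior classes listed later in this README suggest; the numeric values are arbitrary.

```r
# Build equidistant thresholds from a first threshold and a positive spacing.
n_categories <- 5
c1 <- -1.5    # first threshold (arbitrary illustrative value)
d  <- 1.0     # spacing between adjacent thresholds (must be > 0)

k <- seq_len(n_categories - 1)      # one threshold per cut point
thresholds <- c1 + (k - 1) * d

print(thresholds)
print(diff(thresholds))  # every gap equals d
```

Under this assumed parameterization, the structure needs only two threshold parameters regardless of the number of categories, instead of one per cut point.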
# Equidistant thresholds
fit_equi <- clm_stan(y ~ x, data = data, threshold = "equidistant")
Prior Specification
Default Priors
clmstan uses weakly informative default priors:
| Parameter | Default Prior |
|---|---|
| Regression coefficients ($\beta$) | normal(0, 2.5) |
| Thresholds ($c_k$) | normal(0, 10) |
| Equidistant spacing ($d$) | gamma(2, 0.5) |
For link parameters estimated via Bayesian inference:
| Link | Parameter | Default Prior |
|---|---|---|
| tlink | $\nu$ | gamma(2, 0.1) |
| aranda_ordaz | $\lambda$ | gamma(0.5, 0.5) |
| sp | $r$ | gamma(0.5, 0.5) |
| log_gamma | $\lambda$ | normal(0, 1) |
| gev | $\xi$ | normal(0, 2) |
| aep | $\theta_1$, $\theta_2$ | gamma(2, 1) |
Custom Priors
Use the prior() function with distribution helpers:
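For example, a tighter prior on the regression coefficients could be set as in the sketch below. It follows the same `prior(<distribution>, class = ...)` pattern this README uses for link parameters, with the `b` class from the Prior Classes table. The choice of normal(0, 1) is only illustrative, the data are simulated as in the Quick Start, and the `requireNamespace()` guard is an addition so the snippet can be sourced on a machine without clmstan.

```r
# Simulated data, as in the Quick Start
set.seed(123)
n <- 100
x <- rnorm(n)
latent <- 1.0 * x + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 0, 1, Inf), labels = 1:4)
data <- data.frame(y = y, x = x)

# Fitting needs clmstan and CmdStan, so the call is skipped if clmstan is missing
if (requireNamespace("clmstan", quietly = TRUE)) {
  library(clmstan)
  # normal(0, 1) prior on the regression coefficients (class "b")
  fit_b <- clm_stan(y ~ x, data = data, link = "logit",
                    prior = prior(normal(0, 1), class = "b"))
}
```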
Prior for Link Parameters
When estimating link parameters, you can specify custom priors:
# Custom prior for t-link df parameter
fit <- clm_stan(y ~ x, data = data, link = "tlink",
                link_param = list(df = "estimate"),
                prior = prior(gamma(3, 0.2), class = "df"))
# Custom prior for GEV xi parameter
fit <- clm_stan(y ~ x, data = data, link = "gev",
                link_param = list(xi = "estimate"),
                prior = prior(normal(0, 0.5), class = "xi"))
Available Distribution Functions
| Function | Parameters | Example |
|---|---|---|
| normal(mu, sigma) | mu: mean, sigma: SD | normal(0, 2.5) |
| gamma(alpha, beta) | alpha: shape, beta: rate | gamma(2, 0.1) |
| student_t(df, mu, sigma) | df: df, mu: location, sigma: scale | student_t(3, 0, 2.5) |
| cauchy(mu, sigma) | mu: location, sigma: scale | cauchy(0, 2.5) |
| flat() | none | flat() |
Note: flat() creates an improper uniform prior. Use with caution as it may lead to improper posteriors if the data does not provide sufficient information. For thresholds with ordered constraints, Stan’s internal transformation provides implicit regularization.
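To see what a prior such as gamma(2, 0.1), the default for the t-link degrees of freedom, implies, its mean and quantiles can be checked in base R. This sketch assumes the shape/rate parameterization listed in the table above, which corresponds to the `rate` argument of R's `qgamma()`; it does not call clmstan.

```r
shape <- 2
rate  <- 0.1

prior_mean <- shape / rate          # mean of a gamma(shape, rate) distribution
prior_sd   <- sqrt(shape) / rate    # standard deviation
prior_q    <- qgamma(c(0.05, 0.5, 0.95), shape = shape, rate = rate)

print(prior_mean)         # 20
print(prior_sd)
print(round(prior_q, 1))  # 5%, 50%, and 95% prior quantiles
```

Repeating the check with other shape and rate values is a quick way to compare candidate priors before passing one to `prior()`.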
Prior Classes
| Class | Description | Compatible Distributions |
|---|---|---|
| b | Regression coefficients ($\beta$) | normal, student_t, cauchy, flat |
| Intercept | Thresholds ($c_k$, flexible) | normal, student_t, cauchy, flat |
| c1 | First threshold ($c_1$, equidistant) | normal, student_t, cauchy, flat |
| cpos | Positive thresholds (symmetric) | normal, student_t, cauchy, flat |
| d | Equidistant spacing ($d$) | gamma |
| df | t-link degrees of freedom ($\nu$) | gamma |
| lambda_ao | Aranda-Ordaz $\lambda$ | gamma |
| r | Symmetric Power $r$ | gamma |
| lambda_lg | Log-gamma $\lambda$ | normal, student_t, cauchy |
| xi | GEV $\xi$ | normal, student_t, cauchy |
| theta1, theta2 | AEP shape ($\theta_1$, $\theta_2$) | gamma |