Article

Regularized regression for categorical data

Journal

STATISTICAL MODELLING
Volume 16, Issue 3, Pages 161-200

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/1471082X16642560

Keywords

boosting; categorical data; fused lasso; group lasso; multinomial model; proportional odds model; regression trees

In the last two decades, regularization techniques, in particular penalty-based methods, have become very popular in statistical modelling. Driven by technological developments, most approaches have been designed for high-dimensional problems with metric variables, whereas categorical data has largely been neglected. In recent years, however, it has become clear that regularization is also very promising when modelling categorical data. A specific trait of categorical data is that many parameters are typically needed to model the underlying structure. This results in complex estimation problems that call for structured penalties which are tailored to the categorical nature of the data. This article gives a systematic overview of penalty-based methods for categorical data developed so far and highlights some issues where further research is needed. We deal with categorical predictors as well as models for categorical response variables. The primary interest of this article is to give insight into basic properties of and differences between methods that are important with respect to statistical modelling in practice, without going into technical details or extensive discussion of asymptotic properties.

Recommended

Article Computer Science, Theory & Methods

Tree-structured modelling of varying coefficients

Moritz Berger, Gerhard Tutz, Matthias Schmid

STATISTICS AND COMPUTING (2019)

Article Statistics & Probability

Flexible uncertainty in mixture models for ordinal responses

Gerhard Tutz, Micha Schneider

JOURNAL OF APPLIED STATISTICS (2019)

Article Mathematics, Interdisciplinary Applications

Tree-based modeling of time-varying coefficients in discrete time-to-event models

Marie-Therese Puth, Gerhard Tutz, Nils Heim, Eva Muenster, Matthias Schmid, Moritz Berger

LIFETIME DATA ANALYSIS (2020)

Article Statistics & Probability

Hierarchical Models for the Analysis of Likert Scales in Regression and Item Response Analysis

Gerhard Tutz

Summary: Appropriate modelling of Likert-type items should consider the scale level and the specific role of the neutral middle category, and should separate the neutral category to avoid biased estimates when modelling the effects of explanatory variables on the outcome. The proposed hierarchical model, which uses binary response models as building blocks, can be easily extended to include response style effects and non-linear smooth effects of explanatory variables.

INTERNATIONAL STATISTICAL REVIEW (2021)

Article Mathematics, Interdisciplinary Applications

On the structure of ordered latent trait models

Gerhard Tutz

JOURNAL OF MATHEMATICAL PSYCHOLOGY (2020)

Article Mathematics, Interdisciplinary Applications

Ordinal Trees and Random Forests: Score-Free Recursive Partitioning and Improved Ensembles

Gerhard Tutz

Summary: The study introduces an improved method for ordinal trees that avoids the artificial assignment of scores and adopts the construction principle of binary models, combining trees and parametric models for prediction. The potential performance issues of random forests are also discussed, with proposals for ensemble models to achieve better predictive performance.

JOURNAL OF CLASSIFICATION (2022)

Article Mathematics, Interdisciplinary Applications

Item Response Thresholds Models: A General Class of Models for Varying Types of Items

Gerhard Tutz

Summary: A comprehensive class of models is proposed for various types of responses, including continuous, binary, ordered categorical, and count type responses. These models are flexible and can accommodate a wide range of distributions.

PSYCHOMETRIKA (2022)

Article Statistics & Probability

Heterogeneity in general multinomial choice models

Ingrid Mauerer, Gerhard Tutz

Summary: This study proposes a flexible and general heterogeneous multinomial logit model to study differences in choice behavior. The model captures heterogeneity that classical models cannot capture, indicates the strength of heterogeneity, and allows for examining the explanatory variables causing heterogeneity.

STATISTICAL METHODS AND APPLICATIONS (2023)

Article Social Sciences, Mathematical Methods

Flexible Item Response Models for Count Data: The Count Thresholds Model

Gerhard Tutz

Summary: A new item response theory model for count data is proposed, which does not assume a fixed distribution for the responses and shows good performance in recovering parameters and response distributions, as well as flexibility in accommodating varying response distributions.

APPLIED PSYCHOLOGICAL MEASUREMENT (2022)

Article Education & Educational Research

Latent Trait Item Response Models for Continuous Responses

Gerhard Tutz, Pascal Jordan

Summary: This article presents a general framework for latent trait item response models for continuous responses. Unlike classical test theory models, which differentiate between true scores and error scores, this model links the responses directly to latent traits. It is demonstrated that classical test theory models can be derived as special cases but the model class is much broader. The framework provides appropriate modeling for restricted responses, such as positive responses or responses within a certain interval. The model also extends common response time models and explores the role of the total score, leading to a modified total score. Various applications are illustrated, including one that considers covariates that may modify the response.

JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS (2023)

Review Statistics & Probability

Ordinal regression: A review and a taxonomy of models

Gerhard Tutz

Summary: Ordinal models can be seen as composed of simpler binary models, which leads to a taxonomy of models. The structured overview covers existing models and shows how they can be extended to include further effects of explanatory variables. Particular attention is given to modelling additional heterogeneity and to investigating the exact meaning of heterogeneity terms.

WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS (2022)

Article Social Sciences, Mathematical Methods

Uncertainty in Latent Trait Models

Gerhard Tutz, Gunther Schauberger

APPLIED PSYCHOLOGICAL MEASUREMENT (2020)

Article Computer Science, Interdisciplinary Applications

BTLLasso: A Common Framework and Software Package for the Inclusion and Selection of Covariates in Bradley-Terry Models

Gunther Schauberger, Gerhard Tutz

JOURNAL OF STATISTICAL SOFTWARE (2019)
