| Title: | Quantile Regression for Logarithmic Relationships with Non-Positive Outcome Values |
|---|---|
| Description: | Provides the lzrq() function for estimating logarithmic regression slopes in quantile regression models, permitting the outcome variable to take on non-positive values. lzrq() runs the regression after replacing non-positive values with a sufficiently negative value. If the fitted values of a quantile regression on this transformed outcome are all greater than the negative value, results are returned. The resulting coefficients can be meaningfully interpreted as logarithmic intensive-margin relationships between the outcome variable and the independent variables, even with non-positive values in the outcome variable. If the condition does not hold for the specified quantile, lzrq() iteratively increases the value and checks again. After ten iterations where the condition does not hold, lzrq() returns an error and suppresses results. This is an automated adaptation of the algorithm described by Liu & Kaplan (2025) <https://drive.google.com/file/d/1F3dnhm8MrlO5aRrGt48rBWAEaBqdCBH-/view> and implemented in the companion Stata command lzqreg, described in Fitzgerald et al. (2026) <doi:10.31222/osf.io/juda7_v1>. |
| Authors: | David Valenta [aut], Jack Fitzgerald [aut, cre] |
| Maintainer: | Jack Fitzgerald <[email protected]> |
| License: | CC BY 4.0 |
| Version: | 0.1.0 |
| Built: | 2026-06-23 11:09:29 UTC |
| Source: | https://github.com/jack-fitzgerald/lzrq |
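lzrq fits a quantile regression model where the raw dependent variable $y$
is transformed with a calibrated extensive margin (CEM) transformation of the
form

$$\tilde{y} = \begin{cases} \log(y) & \text{if } y > 0 \\ \psi & \text{if } y \le 0, \end{cases}$$

where $\psi$ is a sufficiently negative lower bound constant.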
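A minimal base-R sketch of a transformation of this kind is shown below. It is an illustration rather than the package's internal code: the helper name `cem_transform` is hypothetical, and the sketch assumes that positive outcomes are logged while non-positive outcomes are replaced by a constant `psi`.

```r
# Hypothetical helper (not part of lzrq): log positive outcomes and
# replace non-positive outcomes with the constant psi.
cem_transform <- function(y, psi = -1e35) {
  stopifnot(is.numeric(y), is.numeric(psi), length(psi) == 1)
  out <- rep(NA_real_, length(y))   # missing outcomes stay missing
  pos <- !is.na(y) & y > 0
  nonpos <- !is.na(y) & y <= 0
  out[pos] <- log(y[pos])
  out[nonpos] <- psi
  out
}

y <- c(0, 1, exp(2), -3)
cem_transform(y, psi = -10)
```

Logging only the positive entries avoids the NaN warnings that `log()` would raise on negative values.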
lzrq is a wrapper for rq from the quantreg package. If all fitted values
from a quantile regression on the CEM-transformed outcome are greater than
the lower bound constant $\psi$, results are returned. The resulting coefficients can be
interpreted as logarithmic intensive-margin relationships between the outcome
and the independent variables, even when the outcome contains non-positive
values. If the condition does not hold, lzrq iteratively increases $\psi$
and re-checks. After ten failed iterations, lzrq returns an
error and suppresses results. This is an automated adaptation of the algorithm
described by Liu & Kaplan (2025).
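The check-and-retry logic described above can be sketched roughly as follows. This is not the package's source: `lzrq_sketch`, `max_iter`, and `update_psi` are hypothetical names, the rule for increasing the constant (dividing it by 10) is a placeholder assumption, and the sketch assumes the left-hand side of the formula is a single bare variable.

```r
library(quantreg)

# Sketch only: `lzrq_sketch` and `update_psi` are hypothetical names.
# Assumes the left-hand side of `formula` is a single bare variable.
lzrq_sketch <- function(formula, data, tau = 0.5, psi_init = -1e35,
                        max_iter = 10,
                        update_psi = function(psi) psi / 10, ...) {
  yname <- all.vars(formula)[1]
  y <- data[[yname]]
  pos <- !is.na(y) & y > 0
  psi <- psi_init

  for (i in seq_len(max_iter)) {
    # Transform the outcome: log if positive, psi if non-positive.
    z <- rep(psi, length(y))
    z[pos] <- log(y[pos])
    z[is.na(y)] <- NA
    d <- data
    d[[yname]] <- z

    fit <- rq(formula, tau = tau, data = d, ...)

    # Accept the fit only if every fitted value lies above psi.
    if (all(fitted(fit) > psi)) {
      fit$lb_constant <- psi
      # Simplification: ignores rows that rq drops for missing covariates.
      fit$n_nonpos <- sum(!is.na(y) & y <= 0)
      return(fit)
    }
    psi <- update_psi(psi)  # placeholder rule: move psi toward zero
  }
  stop("fitted values did not all exceed psi after ", max_iter, " attempts")
}

# Simulated data with a log-linear relationship and a few zero outcomes.
set.seed(1)
n <- 200
x <- runif(n)
y <- exp(1 + 2 * x + rnorm(n, sd = 0.1))
y[1:10] <- 0                      # 5% of outcomes are zero
toy <- data.frame(y = y, x = x)

fit <- lzrq_sketch(y ~ x, data = toy, tau = 0.5)
coef(fit)          # slope on x should be near 2
fit$lb_constant    # constant in force when the check passed
```

When too large a share of outcomes is non-positive at the requested quantile, the fitted line passes through the replaced values, the check keeps failing, and the sketch stops with an error once the attempts run out.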
All methods supported by rq are available, including
coef(), residuals(), fitted(), predict(),
summary(), AIC(), and logLik().
lzrq(formula, data, tau = 0.5, psi_init = -1e35, ...)
| Argument | Description |
|---|---|
| formula | A formula with the raw (untransformed) outcome on the left-hand side. |
| data | Optional data frame containing the variables in the model. |
| tau | Quantile level. Numeric scalar strictly between 0 and 1. Defaults to 0.5. |
| psi_init | Initial lower bound constant used in the bisection algorithm. Defaults to -1e35. |
| ... | Additional arguments passed to rq. |
An object of class c("lzrq", "rq"). This is the fitted rq
object with two additional fields:
| Field | Description |
|---|---|
| lb_constant | The lower bound constant $\psi$. |
| n_nonpos | Number of non-positive outcome values in the estimation sample. |
All rq fields are preserved, so the full suite of
rq postestimation methods works automatically.
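Why rq methods keep working can be illustrated with a toy S3 object. The generic `describe()` and the object below are hypothetical and exist only for this sketch. The point is that an object whose class vector is c("lzrq", "rq") falls through to the rq method whenever no lzrq-specific method is defined, and the extra fields are ordinary list elements.

```r
# Hypothetical generic with a method defined only for class "rq".
describe <- function(obj, ...) UseMethod("describe")
describe.rq <- function(obj, ...) "handled by the rq method"

# Toy stand-in for a fitted object; not a real lzrq() result.
toy_fit <- structure(
  list(coefficients = c(x = 2), lb_constant = -1e35, n_nonpos = 3L),
  class = c("lzrq", "rq")
)

inherits(toy_fit, "rq")   # TRUE
describe(toy_fit)         # "handled by the rq method"
toy_fit$lb_constant       # extra field, accessed like any list element
```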
David Valenta ([email protected]) and Jack Fitzgerald ([email protected])
Fitzgerald, J., Adema, J., Fiala, L., Kujansuu, E., & Valenta, D. (2026). Non-Robustness in Log-Like Specifications. MetaArXiv. doi:10.31222/osf.io/juda7_v1
Liu, X., & Kaplan, D. M. (2025). Quantile Regression with Log(0) Outcomes. https://drive.google.com/file/d/1F3dnhm8MrlO5aRrGt48rBWAEaBqdCBH-/view

    library(MASS)

    # The 'epil' dataset records epileptic seizure counts.
    # The outcome 'y' is zero for about 10% of observations.
    table(epil$y == 0)

    # Because the median of 'y', conditional on 'trt', is above zero for
    # both treatment groups, lzrq returns results at quantile 0.5.
    result <- lzrq(y ~ trt + age, data = epil, tau = 0.5)
    print(result)
    summary(result)

    # However, the tenth percentile of 'y' is zero for some treatment groups,
    # so lzrq suppresses results and returns an error at quantile 0.1.
    tryCatch(
      lzrq(y ~ trt + age, data = epil, tau = 0.1),
      error = function(e) message(e$message)
    )