6 Types of Dependent Variables that will Never Meet the GLM Normality Assumption
The General Linear Model (GLM) is quite robust to departures from its assumptions of normality and constant variance. That means that even if the assumptions aren’t met perfectly, the resulting p-values will still be reasonable estimates.
But you need to check the assumptions anyway, because some departures are so severe that the p-values become inaccurate. And in many cases there are remedial measures you can take to turn non-normal residuals into normal ones.
But sometimes you can’t.
Sometimes it’s because the dependent variable just isn’t appropriate for a GLM. The dependent variable, Y, doesn’t have to be normal for the residuals to be normal (since Y is affected by the X’s).
But Y does have to be continuous, unbounded, and measured on an interval or ratio scale.
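Here is a minimal sketch of that point, using simulated data rather than anything from a real study. The two-group setup, the group means of 0 and 10, the sample sizes, and the helper `excess_kurtosis` are all assumptions made up for illustration. Y on its own is clearly bimodal, yet the residuals around the group means look normal.

```python
import random
import statistics


def excess_kurtosis(values):
    """Excess kurtosis: roughly 0 for normal data, strongly negative for bimodal data."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(42)

# Hypothetical data: X is a two-level group indicator with very different group means.
group_means = {0: 0.0, 1: 10.0}
x = [0] * 2000 + [1] * 2000
y = [group_means[g] + rng.gauss(0.0, 1.0) for g in x]

# With a single categorical predictor, the fitted value is just the group mean.
fitted = {g: statistics.fmean([yi for xi, yi in zip(x, y) if xi == g])
          for g in group_means}
residuals = [yi - fitted[xi] for xi, yi in zip(x, y)]

kurt_y = excess_kurtosis(y)
kurt_resid = excess_kurtosis(residuals)
print(f"excess kurtosis of Y:         {kurt_y:.2f}")
print(f"excess kurtosis of residuals: {kurt_resid:.2f}")
```

Excess kurtosis is only one rough indicator of shape. In practice you would also look at a histogram or Q-Q plot of the residuals.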
If you go through the 11 Steps to Statistical Modeling, Step 3 is: Choose the variables for answering your research questions and determine their level of measurement. Part of the reason for doing this is to save yourself from running a GLM on a DV that just isn’t appropriate and will never meet assumptions. Some of these include DVs that are:
- Categorical
- Ordinal
- Discrete counts, which are bounded at 0, often the most common value
- Zero Inflated, where even if the rest of the distribution looks normal, there is a huge spike at 0
- Censored or truncated, including time to event variables
- A proportion, which is bounded at 0 and 1, or a percentage, which is bounded at 0 and 100
If you have one of these, Stop. Do not pass Go. Do not run a GLM.
Hopefully you noticed this at Step 3, not when you’re checking assumptions.
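To see why a count DV can't meet the constant variance assumption, here is a small simulated sketch. It assumes the counts follow a Poisson distribution, which is only one possible model for counts. The group means of 0.5 and 8, the sample sizes, and the helper `poisson_draw` are hypothetical choices for illustration. Because a Poisson variable's variance equals its mean, the residual variance changes with the predicted value, and in the low-mean group 0 is the most common value.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


rng = random.Random(7)

# Hypothetical two-group design: a low-count group and a high-count group.
true_means = {0: 0.5, 1: 8.0}
counts = {g: [poisson_draw(rng, lam) for _ in range(2000)]
          for g, lam in true_means.items()}

# A linear model with one group predictor fits each group's mean,
# so the residual variance in a group is the variance around that mean.
resid_var = {}
for g, ys in counts.items():
    fitted = statistics.fmean(ys)
    resid_var[g] = statistics.pvariance(ys, mu=fitted)

share_zero = counts[0].count(0) / len(counts[0])

print(f"residual variance, low group:  {resid_var[0]:.2f}")
print(f"residual variance, high group: {resid_var[1]:.2f}")
print(f"share of zeros in low group:   {share_zero:.2f}")
```

No transformation fixes this completely, because the spike at 0 stays a spike no matter how you rescale it.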
But luckily, there are other types of regression procedures available for all of these variables.
I’ll follow up with more details in my next post, and we’ll be talking more about it at our next Craft of Statistical Analysis Webinar on September . You can join us by registering here. It’s free.
