Statistical models such as general linear models (linear regression,
ANOVA, mixed models) and generalized linear models (logistic, Poisson,
proportional hazards regression, etc.) all have the same general form.
On the left side of the equation are one or more response variables, Y.
On the right side are one or more predictor variables, X, and their
coefficients, B. The variables on the right side, X, can take many
forms and are called by many names. There are subtle distinctions in the
meanings of these names, but they are often used interchangeably.
Even worse, statistical software packages use different names for
similar concepts, even among their own procedures. This quest for
accuracy often creates confusion. (It’s hard enough without
switching the words!)
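As a rough sketch of the general form described above, these models are commonly written in matrix notation as follows. The error term and the link function g are not named in the text; they are added here as conventional notation, and the exact symbols vary by textbook and software.

```latex
% General linear model: the response equals the predictors times
% their coefficients, plus an error term
\[ Y = XB + \varepsilon \]

% Generalized linear model: a link function g connects the mean of
% the response to the same linear combination XB
\[ g\bigl(\operatorname{E}[Y]\bigr) = XB \]
```

In both lines, X holds whatever appears on the right side of the equation, under any of the names listed below.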
Here are some common terms that all
refer to a variable in a model that is proposed to affect or predict
another variable.
Independent Variable:
It implies causality: the independent variable affects
the dependent variable. Used predominantly in ANOVA, but often in
regression as well. It can be either continuous or categorical.
Predictor Variable:
It does not imply causality. A predictor variable is simply
useful for predicting the value of the response variable. Used
predominantly in regression. Predictor variables can be
continuous or categorical.
Predictor:
Same as Predictor Variable.
Covariate:
A continuous predictor variable. Used in both ANCOVA
(analysis of covariance) and regression. Some people use this to
refer to all predictor variables in regression, but it really means
continuous predictors. Adding a covariate to ANOVA (analysis of
variance) turns it into ANCOVA (analysis of covariance).
Factor:
A categorical predictor variable. It may or may not indicate a
cause/effect relationship with the response variable (this depends on
the study design, not the analysis). Independent variables in ANOVA are
almost always called factors. In regression, they are often
referred to as indicator variables, categorical predictors, or dummy
variables. They are all the same thing in this context.
Grouping Variable:
Same as a factor. Used in SPSS in the independent samples t-test.
Fixed factor:
A categorical independent variable whose specific categories are
themselves of interest and are often chosen by the
experimenter. Examples include experimental treatments or
demographic categories, such as sex and race. If you’re not doing
a mixed model (and you should know if you are), all your factors are
fixed factors. For a more thorough explanation of fixed and
random factors, see Specifying
Fixed and Random Factors in Mixed or Multi-Level Models.
Random factor:
A categorical independent variable whose categories are treated as a
random sample from a larger population of possible categories, rather
than being of interest in themselves. Generally used in mixed
modeling. Examples include subjects or random blocks. For a
more thorough explanation of fixed and random factors, see Specifying Fixed and
Random Factors in Mixed or Multi-Level Models.
Dummy variable:
A categorical variable that has been dummy coded. Dummy coding
(also called indicator coding) is usually used in regression models,
but not ANOVA. A dummy variable can have only two values: 0 and
1. When a categorical variable has more than two values, it is
recoded into multiple dummy variables.
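The recoding described above can be sketched in a few lines of plain Python. This is only an illustration, not how any particular statistics package does it: the function name `dummy_code`, its `reference` argument, the `is_` column prefix, and the treatment data are all made up for the example. It also assumes the common regression convention of leaving one category out as the reference, so a variable with k categories produces k - 1 dummy variables.

```python
def dummy_code(values, reference=None):
    """Recode a categorical variable into 0/1 dummy variables.

    Assumption for this sketch: one category (the reference) gets no
    column of its own, so k categories yield k - 1 dummy variables.
    """
    categories = sorted(set(values))
    if reference is None:
        # Default to the first category in sorted order
        reference = categories[0]
    if reference not in categories:
        raise ValueError(f"reference {reference!r} is not a category")
    # One column per non-reference category: 1 if the observation
    # belongs to that category, 0 otherwise
    return {
        f"is_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != reference
    }


# Hypothetical three-category variable with "placebo" as the reference
treatment = ["placebo", "low", "high", "low", "placebo"]
dummies = dummy_code(treatment, reference="placebo")
for name, column in dummies.items():
    print(name, column)
# is_high [0, 0, 1, 0, 0]
# is_low [0, 1, 0, 1, 0]
```

An observation in the reference category has 0 on every dummy variable, which is why it does not need a column of its own.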