Posts Tagged ‘Covariates’

Free Webinar: Fixed and Random Factors in Mixed Models: What IS the difference?

Tuesday, May 26th, 2009

Just a quick reminder that if you haven’t registered for tomorrow’s webinar, Fixed and Random Factors in Mixed Models: What IS the difference?, be sure to sign up today.

In this webinar, we’ll talk about

* the difference between covariates and factors
* the difference between fixed and random factors
* what parameters are estimated for each one
* the assumptions and meanings of each
* how to decide whether a factor should be fixed or random

Date: Wednesday, May 27, 2009

Time: 1pm Eastern Time (12pm Central, 11am Mountain, 10am Pacific)

Where: Anywhere you have an internet connection

Length of Program: An Hour

Cost: Always free

Space is limited.

Find out more and register at:
http://www.analysisfactor.com/learning/teletraining7.html

>> How it Works <<


On the last Wednesday of every month, The Analysis Factor offers a free statistics training webinar.

On a web-based meeting system, I will do a 30-40 minute presentation on a very specific applied statistical topic. I will stop frequently for questions, and at the end you can ask questions relevant to your research. Each call covers an issue many researchers get stuck on when practicing statistics.

You need to register to get connection instructions and directions for downloading handouts. Spots are limited, so register early.

The webinar will be recorded, so if you miss it, you can still watch it.

Watch over and over if it helps. Or watch again as the topic becomes relevant to your research. But you need to register to get access to the recording.

Whether you can make the call live or not, register now at:

http://www.analysisfactor.com/learning/teletraining7.html

Dummy Coding in SPSS GLM–More on Fixed Factors, Covariates, and Reference Groups, Part 2

Tuesday, March 31st, 2009

Yesterday’s post outlined one issue in deciding whether to put a categorical predictor variable into Fixed Factors or Covariates in SPSS GLM.  That issue dealt with how SPSS automatically creates dummy variables out of any variable in Fixed Factors.

Another default to keep in mind is that SPSS will automatically create interactions between any and all variables in Fixed Factors.  If you put 5 variables in Fixed Factors, you’ll get all 2-way, 3-way, 4-way, and even a 5-way interaction among those 5 variables. (more…)
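To get a feel for how quickly those terms pile up, here is a minimal Python sketch (not SPSS syntax) that enumerates every interaction among five factors. The factor names A through E are hypothetical placeholders, and the sketch assumes a full factorial model like the default described above.

```python
from itertools import combinations
from math import comb

factors = ["A", "B", "C", "D", "E"]  # hypothetical factor names

# In a full factorial model, every subset of two or more factors
# contributes one interaction term.
interactions = [
    "*".join(combo)
    for order in range(2, len(factors) + 1)
    for combo in combinations(factors, order)
]

# Number of interaction terms at each order (2-way, 3-way, ...).
counts_by_order = {
    order: comb(len(factors), order)
    for order in range(2, len(factors) + 1)
}

print(counts_by_order)    # {2: 10, 3: 10, 4: 5, 5: 1}
print(len(interactions))  # 26
```

That is 26 interaction terms on top of the 5 main effects, which is usually far more than a study is designed to estimate.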

Dummy Coding in SPSS GLM–More on Fixed Factors, Covariates, and Reference Groups, Part 1

Monday, March 30th, 2009

If you have a categorical variable that you plan to use in a regression analysis in SPSS, there are a couple of ways to do it. You can use the SPSS Regression procedure, which I will talk about more in another post.  Or you can use SPSS GLM, which I discuss here and in a follow-up post.

The big question in SPSS GLM is what goes where.  As I’ve detailed in another post, any continuous independent variable goes into covariates.  And don’t use random factors at all unless you really know what you’re doing.

So the question is what to do with your categorical variables.  You have two choices, and each has advantages and disadvantages.

The easiest is to put categorical variables in Fixed Factors.  SPSS will dummy code those variables for you, which is quite convenient if your categorical variable has more than two categories.  However, there are some defaults you need to be aware of that may or may not make this a good choice.

SPSS always makes the reference group the one that comes last alphabetically.  So if the values you input are strings, it will be the one that comes last.  If those values are numbers, it will be the highest one.

In some studies it really doesn’t matter which is the reference group.  But in others, interpreting regression coefficients will be a whole lot easier if you choose a group that makes a good comparison, such as a control group or the most common group in the data.  If you want that to be the reference, make it come last alphabetically.  I’ve been known to do things like change my data so that the control group becomes something like ZControl.  (But create a new variable–never overwrite original data).
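The sort-order rule and the renaming trick can be sketched in a few lines of Python. This only mimics the default described above; it is not SPSS code, the group labels are hypothetical, and Python's default string ordering (which is case-sensitive) may not match SPSS's collation exactly.

```python
def reference_group(values):
    """Return the category that sorts last, mimicking the default described above."""
    return sorted(set(values))[-1]

groups = ["Control", "DrugA", "DrugB", "Control", "DrugB"]  # hypothetical data
print(reference_group(groups))          # DrugB

# Recode into a NEW variable so the original data stay untouched.
groups_recoded = ["ZControl" if g == "Control" else g for g in groups]
print(reference_group(groups_recoded))  # ZControl

# With numeric codes, the highest value sorts last.
print(reference_group([1, 2, 3, 2]))    # 3
```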

It really can get confusing, though, if the variable was already dummy coded–if it already had values of 0 and 1.  Because 1 comes last alphabetically, SPSS will make that group the reference group.  This can really lead to confusion when interpreting coefficients.  It’s not impossible if you’re paying attention, but you do have to pay attention.
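Here is a small Python illustration of why this matters, using made-up numbers. With a single two-group predictor, the regression coefficient equals the difference in group means (the other group minus the reference group), so switching the reference group flips the sign. The function name and data are assumptions for illustration only.

```python
from statistics import mean

treated = [0, 0, 0, 1, 1, 1]        # hypothetical 0/1 dummy variable
outcome = [10, 12, 11, 15, 17, 16]  # hypothetical response values

def group_effect(groups, y, reference):
    """Mean of the non-reference group minus mean of the reference group."""
    ref_mean = mean(v for g, v in zip(groups, y) if g == reference)
    other_mean = mean(v for g, v in zip(groups, y) if g != reference)
    return other_mean - ref_mean

# Reference = 0: the effect of being in group 1, as most people expect to read it.
effect_ref0 = group_effect(treated, outcome, reference=0)

# Reference = 1: the same comparison reported from the other side.
effect_ref1 = group_effect(treated, outcome, reference=1)

print(effect_ref0, effect_ref1)
```

The two numbers have the same size and opposite signs. Neither is wrong, but you need to know which one you are looking at.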

In tomorrow’s post I’ll discuss another default in SPSS that will affect your decision.

If you want more information on using and interpreting parameter estimates in regression using SPSS, get the recording from my free teleseminar: Interpreting Regression Coefficients: A Walk Through Output.

Editor’s Update 10/9/09: In just a few weeks, I’ll be offering a 3-hour workshop on the ins and outs of SPSS GLM.  We’ll cover the defaults, the menus and syntax, the meanings of all these terms, when you need each option, and what the results mean.  Get more info and register at: http://theanalysisinstitute.com/workshops/SPSS-GLM/index.html

3 Reasons Psychology Researchers should Learn Regression

Tuesday, February 17th, 2009

Back when I was doing psychology research, I knew ANOVA pretty well.  I’d taken a number of courses on it, and could run it backward and forward.  I kept hearing about ANCOVA, but in every ANOVA class, that was the last topic on the syllabus, and we always ran out of time.

The other thing that drove me crazy was those stats professors kept saying “ANOVA is just a special case of Regression.”  I could not for the life of me figure out why or how.

It was only when I switched over to statistics that I finally took a regression class and figured out what ANCOVA was all about. And only when I started consulting, and seeing hundreds of different ANOVA and regression models, that I finally made the connection.

But if you don’t have the driving curiosity about ANOVA and regression, why should you, as a researcher in Psychology, Education, or Agriculture, who is trained in ANOVA, want to learn regression?  There are 3 main reasons.

1. There are many, many continuous independent variables and covariates that need to be included in models.  Without the tools to analyze them as continuous, you are left forcing them into an ANOVA using an arbitrary technique like median splits.  At best, you’re losing power.  At worst, you’re not publishing your article because you’re missing real effects.

2. Having a solid understanding of the General Linear Model in its various forms equips you to really understand your variables and their relationships.  It allows you to try a model different ways–not for data fishing, but for discovering the true nature of the relationships.  Having the capacity to add an interaction term or a squared term  allows you to listen to your data and makes you a better researcher.

3. The multiple linear regression model is the basis for many other statistical techniques–logistic regression, multilevel and mixed models, Poisson regression, Survival Analysis, and so on.  Each of these is a step (or small leap) beyond multiple regression.  If you’re still struggling with what it means to center variables or interpret interactions, learning one of these other techniques becomes arduous, if not painful.
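To make the first point concrete, here is a rough Python simulation of what a median split does to a continuous predictor. Everything in it is an assumption for illustration: the data are simulated, and the slope of 0.5, the sample size, and the seed are arbitrary choices.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Simulated continuous predictor with a linear effect on the response.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]

# Median split: collapse the predictor into "low" (0) and "high" (1).
median = sorted(x)[n // 2]
x_split = [1 if xi > median else 0 for xi in x]

r_cont = pearson_r(x, y)
r_split = pearson_r(x_split, y)

print(f"variance explained, continuous:   {r_cont ** 2:.3f}")
print(f"variance explained, median split: {r_split ** 2:.3f}")
```

The dichotomized version explains less of the variance in the response, which is where the lost power comes from.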

Having guided thousands of researchers through their statistical analysis over the past 10 years, I am convinced that having a strong, intuitive understanding of the general linear model in its variety of forms is the key to being an effective and confident statistical analyst.  You are then free to learn and explore other methodologies as needed.

SPSS GLM: Choosing Fixed Factors and Covariates

Tuesday, December 30th, 2008

The beauty of the Univariate GLM procedure in SPSS is that it is so flexible.  You can use it to analyze regressions, ANOVAs, ANCOVAs with all sorts of interactions, dummy coding, etc.

The downside of this flexibility is that it is often confusing to know what to put where and what it all means.

So here’s a quick breakdown.

The dependent variable I hope is pretty straightforward.  Put in your continuous dependent variable.

Fixed Factors are categorical independent variables.  It does not matter if the variable is (more…)

Confusing Statistical Terms #1: The Many Names of Independent Variables

Monday, November 24th, 2008

Statistical models, such as general linear models (linear regression, ANOVA, mixed models) and generalized linear models (logistic, Poisson, proportional hazard regression, etc.) all have the same general form.  On the left side of the equation is one or more response variables, Y.  On the right hand side is one or more predictor variables, X, and their coefficients, B.  The variables on the right hand side can have many forms and are called by many names.

There are subtle distinctions in the meanings of these names, but they are often used interchangeably.  Even worse, statistical software packages use different names for similar concepts, even among their own procedures.  This quest for accuracy often renders confusion.  (It’s hard enough without switching the words!)

Here are some common terms that all refer to a variable in a model that is proposed to affect or predict another variable.

  • Independent Variable: It implies causality:  the independent variable affects the dependent variable.  Used predominantly in ANOVA, but often in regression as well.  It can be either continuous or categorical.
  • Predictor Variable:  It does not imply causality.  A predictor variable is simply useful for predicting the value of the response variable.  Used predominantly in regression.  Predictor variables can be continuous or categorical.
  • Predictor:  Same as Predictor Variable.
  • Covariate:  A continuous predictor variable.  Used in both ANCOVA (analysis of covariance) and regression.  Some people use this to refer to all predictor variables in regression, but it really means continuous predictors.  Adding a covariate to ANOVA (analysis of variance) turns it into ANCOVA (analysis of covariance).
  • Factor:  A categorical predictor variable.  It may or may not indicate a cause/effect relationship with the response variable (this depends on the study design, not the analysis).  Independent variables in ANOVA are almost always called factors.  In regression, they are often referred to as indicator variables, categorical predictors, or dummy variables.  They are all the same thing in this context.
  • Grouping Variable: Same as a factor.  Used in SPSS in the independent samples t-test.
  • Fixed factor:  A categorical independent variable whose specific categories are important in their own right, often chosen by the experimenter.  Examples include experimental treatments or demographic categories, such as sex and race.  If you’re not doing a mixed model (and you should know if you are), all your factors are fixed factors.  For a more thorough explanation of fixed and random factors, see Specifying Fixed and Random Factors in Mixed or Multi-Level Models.
  • Dummy variable:  A categorical variable that has been dummy coded.  Dummy coding (also called indicator coding) is usually used in regression models, but not ANOVA.  A dummy variable can have only two values: 0 and 1.  When a categorical variable has more than two values, it is recoded into multiple dummy variables.
  • Indicator variable: See dummy variable.
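
As a quick illustration of the dummy variable entry, here is a minimal Python sketch of dummy coding done by hand. The function name, the `is_` prefix, and the region data are all hypothetical. The sketch assumes the usual convention of one 0/1 variable for each category except a chosen reference category.

```python
def dummy_code(values, reference):
    """Build one 0/1 indicator per category, leaving out the reference category."""
    levels = [lvl for lvl in sorted(set(values)) if lvl != reference]
    return {
        f"is_{lvl}": [1 if v == lvl else 0 for v in values]
        for lvl in levels
    }

region = ["North", "South", "West", "South"]  # hypothetical 3-category variable
dummies = dummy_code(region, reference="West")

for name, column in dummies.items():
    print(name, column)
# is_North [1, 0, 0, 0]
# is_South [0, 1, 0, 1]
```

A case in the reference category (West here) is the one with 0 on every dummy variable, so each coefficient compares its category to the reference.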

