## Control Variable

skr
PLS User
Posts: 15
Joined: Thu Jan 17, 2019 6:04 am
Real name and title: S. Ray

### Control Variable

Hi,
Suppose age and gender are control variables in my model. Gender is coded 1 = female, 2 = male (nominal). Five age groups are coded 1 to 5 (ordinal). Should I simply include both as single-indicator constructs in the same way, regardless of their measurement level, or is some modification required? Also, what does a negative path coefficient for gender signify if the path is significant?
Thanks

jmbecker
SmartPLS Developer
Posts: 1079
Joined: Tue Mar 28, 2006 11:09 am
Real name and title: Dr. Jan-Michael Becker

### Re: Control Variable

Gender is easy to include: Treat it as a dummy variable (0/1) single-indicator construct (you can also leave it as 1/2, because it will be standardized). The effect is then interpreted as the average difference in your DV between females and males, weighted by the sample sizes of the two groups (if both are equal, it is simply the difference). Hence, with your coding, a negative coefficient means that on average your DV is lower for males than for females.
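To illustrate why the 1/2 coding can stay as it is, here is a minimal sketch using only the Python standard library. The data and the helper names (`standardize`, `standardized_slope`) are made up for illustration, and the single-predictor regression is a simplification, not what SmartPLS computes for a full path model.

```python
from statistics import mean, pstdev

def standardize(values):
    """z-standardize a list of numbers (population standard deviation)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def standardized_slope(x, y):
    """Slope of y on x after standardizing both (equals Pearson's r)."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)

# Hypothetical data: gender coded 1 = female, 2 = male, plus some outcome score.
gender_12 = [1, 1, 1, 1, 2, 2, 2, 2]
outcome = [5.0, 6.0, 7.0, 6.0, 4.0, 5.0, 3.0, 4.0]

# Same variable recoded as a 0/1 dummy (0 = female, 1 = male).
gender_01 = [g - 1 for g in gender_12]

slope_12 = standardized_slope(gender_12, outcome)
slope_01 = standardized_slope(gender_01, outcome)

female_mean = mean(y for g, y in zip(gender_12, outcome) if g == 1)
male_mean = mean(y for g, y in zip(gender_12, outcome) if g == 2)

print(slope_12, slope_01)      # identical: standardization removes the coding shift
print(female_mean, male_mean)  # males score lower, so the slope is negative
```

Both codings give the same standardized slope, and its sign follows the direction of the mean difference between the group with the higher code and the group with the lower code.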

Age groups are more complicated: If the categories of the ordinal variable are equidistant, you may use it as it is. Many people use Likert scales (5 or 7 rating points) directly in PLS as quasi-metric. If your age groups are not equidistant, you may need to create several dummies and add them all to the model as single-indicator constructs. The effects are then all interpreted relative to the reference group (the one group that does not get a dummy variable and is therefore zero on all dummies).
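The dummy coding described here could be prepared before importing the data, for example along these lines. This is only a sketch: the function `dummy_code`, the column names `age_2` to `age_5`, and the sample values are hypothetical.

```python
def dummy_code(groups, reference):
    """Return one 0/1 column per category, skipping the reference category."""
    levels = sorted(set(groups))
    return {
        f"age_{level}": [1 if g == level else 0 for g in groups]
        for level in levels
        if level != reference
    }

# Hypothetical age-group codes (1 to 5) for eight respondents.
age_group = [1, 2, 3, 4, 5, 1, 1, 2]

# Group 1 is the reference: it gets no column and is zero on all dummies.
dummies = dummy_code(age_group, reference=1)

for name, column in dummies.items():
    print(name, column)
```

Each resulting column would then enter the model as its own single-indicator construct, and each coefficient is read as a difference from the reference group.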
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer
Researchgate: https://www.researchgate.net/profile/Jan_Michael_Becker

skr

### Re: Control Variable

Thank you Dr. Becker.

skr

### Re: Control Variable

Dr. Becker,
If the values for 4 different age groups are coded as 18-26 = 1, 27-34 = 2, 35-50 = 3, 51 onward = 4, could I skip the dummies and include the variable as it is, as a single-indicator construct?

jmbecker

### Re: Control Variable

That does not quite sound like equidistant categories, as the groups are of unequal width. You would need to justify that a change from 1 to 2 is conceptually the same as a change from 2 to 3 or from 3 to 4.

skr

### Re: Control Variable

Dr. Becker,
Please correct me if I am wrong. Since there are four age groups, I have to create three dummies (e.g. Age1, Age2, Age3). I can take the first age group, i.e. 18-26, as the reference group, since most of the participants are from this group. I then include Age1, Age2, and Age3 in the model as single-indicator constructs; the first group doesn't get a dummy. I just want to ask whether the effect of the first group (not directly visible in the model) is embedded in the joint effect of the three dummies.

jmbecker

### Re: Control Variable

Yes, that is correct. The dummies Age1, Age2, and Age3 show the difference from the first group (the reference group), which does not have a dummy. In that way, its effect is embedded within the model.

skr

### Re: Control Variable

Thank you.

skr

### Re: Control Variable

Dr. Becker,
I have been facing another problem. Bootstrapping does not produce usable results (all values are zero) when I include the three dummies (age1, age2, age3) as control variables with the first age group, i.e. 18-26, as the reference group (the group with the most participants). However, bootstrapping works well when the last group, i.e. 51 onward = 4, is the reference group (the group with the fewest participants, only 6 out of 581). Please comment.

jmbecker

### Re: Control Variable

Dummy variables may cause problems with the bootstrapping procedure if they are unevenly distributed. That is because bootstrapping is a random sampling procedure: it samples with replacement from the original dataset. Hence, there might be subsamples in which one of your dummies contains only ones or only zeros, because only those observations were sampled. For example, if you have a dummy with only a few ones, then it is likely that in some subsamples the random procedure does not pick any of them and you get a variable with only zeros.
That would create a variable with zero variance, which cannot be estimated in the standardized regressions used for the PLS path model.
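A rough back-of-the-envelope check of this argument, using the figures from the question (6 ones among 581 observations), might look like the sketch below. The function names and the count of 5000 subsamples are assumptions made for illustration, and the code only mimics plain resampling with replacement, not the SmartPLS bootstrapping routine itself. It also ignores the opposite case of a dummy that ends up with only ones.

```python
import random

def prob_dummy_all_zero(n, k):
    """Chance that one resample of size n misses all k observations coded 1."""
    return (1 - k / n) ** n

def count_degenerate_resamples(n, k, reps, seed=0):
    """Simulate reps resamples and count those where the dummy is all zeros."""
    rng = random.Random(seed)
    degenerate = 0
    for _ in range(reps):
        # Treat rows 0..k-1 as the ones; draw n row indices with replacement.
        if not any(rng.randrange(n) < k for _ in range(n)):
            degenerate += 1
    return degenerate

n_obs, n_ones = 581, 6   # figures from the question above
n_resamples = 5000       # assumed number of bootstrap subsamples

p_single = prob_dummy_all_zero(n_obs, n_ones)
p_any = 1 - (1 - p_single) ** n_resamples
hits = count_degenerate_resamples(n_obs, n_ones, reps=2000)

print(f"P(dummy is all zeros in one subsample)    = {p_single:.4f}")
print(f"P(at least one such subsample in {n_resamples})    = {p_any:.6f}")
print(f"Simulated degenerate subsamples out of 2000 = {hits}")
```

Under these assumptions a single subsample is rarely affected, but across thousands of subsamples at least one zero-variance dummy is almost certain, which fits the explanation above.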

Anna
PLS Junior User
Posts: 1
Joined: Mon Feb 10, 2020 5:26 pm
Real name and title: Anna Mayr

### Re: Control Variable

Hello,
I have the same problem: my dummy variables are unevenly distributed, and the bootstrapping doesn't produce appropriate results. Is there any solution to this problem?

Thanks a lot!

Kind regards,
Anna