Control Variable
Hi,
Suppose age and gender are considered control variables in my model. Gender is coded 1=female, 2=male (nominal qualitative). Five different age groups are coded 1 to 5 (ordinal qualitative). Should I simply include both as single-indicator constructs in the same manner, irrespective of their measurement level, or is some other modification required? What does a negative path coefficient for gender signify if the path is significant?
Thanks

 SmartPLS Developer
 Posts: 1129
 Joined: Tue Mar 28, 2006 11:09 am
 Real name and title: Dr. Jan-Michael Becker
Re: Control Variable
Gender is easy to include: Treat it as a dummy variable (0/1) single-indicator construct (you can also leave it as 1/2, because it will be standardized). The effect is then interpreted as the average difference in your DV between females and males, weighted by the sample sizes of females and males (if both are equal, it is simply the difference). Hence, with your coding (1=female, 2=male), a negative coefficient means that on average your DV is lower for males than for females.
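As a rough illustration of why the 1/2 and 0/1 codings are interchangeable, the following sketch standardizes both codings and compares the resulting standardized slope on a dependent variable. The data values and the helper names (`standardize`, `standardized_slope`) are made up for this example; it uses plain Python rather than SmartPLS, and it assumes a model with a single predictor.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores (mean 0, standard deviation 1)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

def standardized_slope(x, y):
    """Standardized slope of y on a single predictor x.

    With one predictor this equals the Pearson correlation.
    """
    zx, zy = standardize(x), standardize(y)
    return mean([a * b for a, b in zip(zx, zy)])

# Hypothetical data: 1 = female, 2 = male, as in the question.
gender_12 = [1, 1, 1, 2, 2, 1, 2, 2]
gender_01 = [g - 1 for g in gender_12]  # same information, coded 0/1
dv = [5.0, 4.5, 4.8, 3.9, 4.1, 5.2, 3.5, 4.0]

slope_12 = standardized_slope(gender_12, dv)
slope_01 = standardized_slope(gender_01, dv)

# Both codings give the same standardized slope; it is negative here
# because the males (higher code) have the lower mean on the DV.
print(slope_12, slope_01)
```

Recoding only shifts the variable by a constant, and standardization removes that shift, so the two slopes agree.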
Age groups are more complicated: If the ordinal variable's categories are equidistant, you may use it as it is. Many people use Likert scales (5 or 7 rating points) as quasi-metric variables directly in PLS. If your age groups are not equidistant, you may need to create several dummies and add them all as single-indicator constructs to the model. The effects are then all interpreted relative to the reference group (the one group that does not get a dummy variable, so all of its dummies are zero).
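A minimal sketch of the dummy coding described above, done in plain Python before importing the data into SmartPLS. The function name `dummy_code`, the column names `age_2` to `age_4`, and the example data are assumptions for illustration only.

```python
def dummy_code(groups, reference):
    """Create one 0/1 dummy per category, except for the reference category."""
    levels = sorted(set(groups))
    return {
        f"age_{level}": [1 if g == level else 0 for g in groups]
        for level in levels
        if level != reference
    }

# Hypothetical age-group codes (1 to 4) for eight respondents.
age_group = [1, 2, 4, 3, 1, 2, 1, 3]

# Group 1 is the reference group, so it gets no dummy of its own.
dummies = dummy_code(age_group, reference=1)

for name, column in dummies.items():
    print(name, column)
```

Respondents in the reference group have zeros on every dummy, which is what makes each dummy's effect a comparison against that group.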
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer
Researchgate: https://www.researchgate.net/profile/Jan_Michael_Becker
GoogleScholar: http://scholar.google.de/citations?user ... AAAJ&hl=de
Re: Control Variable
Thank you Dr. Becker.
Re: Control Variable
Dr. Becker,
If the values for 4 different age groups are coded as 18-26=1, 27-34=2, 35-50=3, and 51 onward=4, could I go without creating dummies and include the variable as it is, as a single-indicator construct?

Re: Control Variable
That does not quite sound like equidistant categories, as the groups are of unequal width. You would need to justify that a change from 1 to 2 is conceptually the same as a change from 2 to 3 or from 3 to 4.
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer
Re: Control Variable
Dr. Becker,
Please correct me if I am wrong. Since there are four different age groups, I have to create three dummies (e.g. Age1, Age2, Age3). I can take the first age group, i.e. 18-26, as the reference group, since most of the participants are from this group. I then have to include Age1, Age2, and Age3 as single-indicator constructs in the model, and the first group doesn't get a dummy. I just want to ask whether the effect of the first group (not directly visible in the model) is embedded in the joint effect of the three dummies.

Re: Control Variable
Yes, that is correct. The dummies Age1, Age2, and Age3 show the difference from the first group (the reference group), which does not have a dummy. Thereby, the effect is embedded within the model.
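To make the "difference from the reference group" reading concrete: in an ordinary unstandardized least-squares regression that contains only the dummies, the intercept equals the reference-group mean and each dummy coefficient equals that group's mean minus the reference-group mean. The sketch below computes those quantities directly instead of fitting a model. The data and the function name `differences_from_reference` are hypothetical, and standardized PLS path coefficients (or models with further predictors) will differ in scale from these raw differences.

```python
from statistics import mean

def differences_from_reference(groups, outcome, reference):
    """Mean of each non-reference group minus the reference-group mean."""
    by_group = {}
    for g, y in zip(groups, outcome):
        by_group.setdefault(g, []).append(y)
    ref_mean = mean(by_group[reference])
    return {
        g: mean(ys) - ref_mean
        for g, ys in sorted(by_group.items())
        if g != reference
    }

# Hypothetical data: age-group code and DV score per respondent.
age_group = [1, 1, 2, 2, 3, 3, 4, 4]
dv = [4.0, 5.0, 5.0, 6.0, 3.0, 4.0, 6.0, 7.0]

diffs = differences_from_reference(age_group, dv, reference=1)
print(diffs)
```

The reference group never appears as a key in the result, yet its mean is the baseline for every entry, which is the sense in which its effect is embedded in the dummies.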
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer
Re: Control Variable
Thank you.
Re: Control Variable
Dr. Becker,
I have been facing another problem. Bootstrapping does not produce appropriate results (all zero values) when I include three dummies (age1, age2, age3) as control variables with the first age group, i.e. 18-26, as the reference group (the group with the most participants). However, bootstrapping works well when the last group, i.e. 51 onward=4, is the reference group (the group with the fewest participants, only 6 out of 581). Please comment.

Re: Control Variable
Dummy variables may cause problems with the bootstrapping procedure if they are unevenly distributed. That is because bootstrapping is a random sampling procedure: it samples with replacement from the original dataset. Hence, there might be subsamples in which one of your dummies contains only ones or only zeros, because only those observations were drawn. For example, if you have a dummy with only a few ones, then it is likely that some subsamples do not pick any of those, and you end up with a variable that contains only zeros.
That would create a variable with zero variance, which cannot be estimated in the standardized regression that is used for the PLS path model.
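A back-of-the-envelope sketch of how often this can happen, using the figures from the question (6 respondents in the smallest group out of 581). It assumes each bootstrap subsample draws 581 cases with replacement, and the 5000 subsamples are an assumed setting, not something stated in this thread. The function names are hypothetical.

```python
def prob_dummy_all_zero(n_total, n_ones):
    """Probability that one bootstrap subsample of size n_total
    contains none of the n_ones cases for which the dummy equals 1."""
    return (1 - n_ones / n_total) ** n_total

def prob_any_degenerate(p_single, n_resamples):
    """Probability that at least one of n_resamples subsamples is degenerate."""
    return 1 - (1 - p_single) ** n_resamples

# 6 of 581 respondents fall in the oldest age group (from the question).
p_single = prob_dummy_all_zero(n_total=581, n_ones=6)

# Assumed number of bootstrap subsamples.
p_any = prob_any_degenerate(p_single, n_resamples=5000)

print(f"per subsample: {p_single:.4f}, at least once: {p_any:.4f}")
```

Under these assumptions a single subsample rarely loses all six cases, but across thousands of subsamples it becomes very likely that at least one does, and that one subsample has a zero-variance dummy.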
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer
Re: Control Variable
Hello,
I have the same problem: my dummy variables are unevenly distributed, and the bootstrapping doesn't produce appropriate results. Is there any solution for this problem?
Thanks a lot!
Kind regards,
Anna

Re: Control Variable
I am not aware of a methodological fix for the problem.
You may want to reconsider why one of the categories is so severely undersampled, and whether the results are even meaningful given the few answers in that category. In addition, increasing the sample size, and especially sampling more of the underrepresented units, would help.
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer