## Sample Selection Bias and Gaussian Copulas

sineoriz@gmail.com
PLS Junior User
Posts: 2
Joined: Wed Jun 14, 2023 12:08 pm
Real name and title: Hyeon Jo Dr.

### Sample Selection Bias and Gaussian Copulas

Hello SmartPLS users and experts,

I am currently analyzing the behavior of online platform users and have received the following comments from reviewers:

Reviewer A:
Questionnaires may suffer from selection bias. How can this be mitigated?

Reviewer B:
Regarding the empirical analysis, the small sample size could result in estimation bias and sample selection bias. This study's results are based on data from individuals willing to participate in the survey. They might display a stronger relationship between MV1 and DV1 compared to those who decline the survey. It's suggested that the author considers the endogeneity stemming from survey samples.

Q1. Is there a way to check and address sample selection bias in SmartPLS? If so, could you guide me through the process?
Q2. I plan to apply Gaussian Copulas for endogeneity, but I'm unsure about it. My research model is as follows:

There are four independent variables (IV1, IV2, IV3, IV4), two mediators (MV1, MV2), and one final dependent variable (DV1).
All four IVs are determinants of MV1, MV2, and DV1. MV1 and MV2 are determinants of DV1.

In this case, should I first remove MV2 and DV1, set the outcome as MV1, and apply copulas (endogenous variable) one at a time, in pairs (e.g., IV1, IV2), in triples (IV1, IV2, IV3), and then all together (IV1, IV2, IV3, IV4) performing bootstrapping on each possible model?
Then, after removing MV1 and DV1, should I set the outcome to MV2 and repeat the same steps?
Afterward, in the complete model with DV1 as the outcome, should I include MV1 and MV2 as endogenous variables and examine every combination, ranging from one to six variables?
If so, there are over 100 possible scenarios. Is it appropriate to test each one individually?

jmbecker
SmartPLS Developer
Posts: 1265
Joined: Tue Mar 28, 2006 11:09 am
Real name and title: Dr. Jan-Michael Becker

### Re: Sample Selection Bias and Gaussian Copulas

1) Selection bias cannot be checked with a PLS analysis or any similar method. The only thing you can do is compare your sample characteristics with external information about the population to show that your sample is adequate.
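One common way to make such a comparison is a chi-square goodness-of-fit check of the sample's composition against known population shares. The following is only a minimal sketch: the age groups, counts, and population shares are invented for illustration, and the function name `chi_square_gof` is hypothetical. A statistic below the critical value only indicates that the sample resembles the population on that one characteristic; it does not prove the absence of selection bias.

```python
# Hypothetical data: respondent counts per age group in the sample
# and the corresponding shares in the population (e.g. from census data).
sample_counts = [40, 35, 25]            # assumed, for illustration only
population_shares = [0.30, 0.40, 0.30]  # assumed, must sum to 1


def chi_square_gof(observed, expected_shares):
    """Return the chi-square goodness-of-fit statistic and its degrees of freedom."""
    n = sum(observed)
    expected = [share * n for share in expected_shares]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df


statistic, df = chi_square_gof(sample_counts, population_shares)

# Critical value of the chi-square distribution for df = 2 at the 5% level.
CRITICAL_VALUE_DF2_05 = 5.991
sample_differs = statistic > CRITICAL_VALUE_DF2_05

print(f"chi-square = {statistic:.3f}, df = {df}, differs at 5%: {sample_differs}")
```

With these made-up numbers the statistic stays below the critical value, so the sample's age distribution would not be flagged as different from the population's at the 5% level.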

2) If you use Gaussian copulas, you should add the copula term to your original model. There is no need to run reduced models. Deleting IVs from the model will only lead to more endogeneity bias, because it introduces omitted variable bias.
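To illustrate what "adding the copula term" means, here is a rough sketch of how a Gaussian copula term is typically constructed: the inverse standard normal CDF applied to the empirical CDF of the potentially endogenous regressor. The resulting term then enters the original model as an additional predictor next to the regressor itself. This is not SmartPLS's implementation; the function name `gaussian_copula_term` and the example scores are hypothetical, and capping the empirical CDF at n/(n+1) to avoid an infinite value for the largest observation is an assumption made here for illustration.

```python
from statistics import NormalDist


def gaussian_copula_term(x):
    """Compute a Gaussian copula term for each value of a regressor.

    For each observation, evaluate the empirical CDF and map it through
    the inverse standard normal CDF.
    """
    n = len(x)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    terms = []
    for xi in x:
        # Empirical CDF: share of observations less than or equal to xi.
        ecdf = sum(1 for xj in x if xj <= xi) / n
        # Assumption: cap at n/(n+1) so the maximum does not map to infinity.
        ecdf = min(ecdf, n / (n + 1))
        terms.append(std_normal.inv_cdf(ecdf))
    return terms


# Hypothetical scores of one regressor (e.g. IV1).
iv1_scores = [3.0, 1.0, 2.0]
iv1_copula = gaussian_copula_term(iv1_scores)
print(iv1_copula)
```

The copula term is a monotone transformation of the regressor, which is why the approach needs the regressor to be sufficiently nonnormal: for a normally distributed regressor the term would be nearly collinear with the regressor itself.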

However:
If you have a small sample, the Gaussian copula approach may not work very well. Please review the following article:
Becker, J. M., Proksch, D., & Ringle, C. M. (2022). Revisiting Gaussian copulas to handle endogenous regressors. Journal of the Academy of Marketing Science, 50(1), 46-66. https://doi.org/10.1007/s11747-021-00805-y

I am not sure, but I suspect that Gaussian copulas will not help with endogeneity bias that stems from sample selection. The reason is that this bias comes from data not included in the analysis (the people who did not participate), not from information omitted from the model. I don't have a reference for this at the moment; it is just an intuition.
Dr. Jan-Michael Becker, BI Norwegian Business School, SmartPLS Developer
Researchgate: https://www.researchgate.net/profile/Jan_Michael_Becker