Hello,
I have a formative model with 1 IV (12 indicators) and 2 DVs (6 and 7 indicators).
I ran a bootstrap with 5,000 subsamples, and the results I get are a bit strange.
The path coefficient between the IV and DV1 is 0.501 for the original sample, but in the "Sample mean" column the path coefficient is 0.079.
Can someone explain why the difference between the original sample and the sample mean is so large?
Kind regards.
K
Difference between Original sample and sample mean

 SmartPLS Developer
 Posts: 908
 Joined: Tue Mar 28, 2006 11:09 am
 Real name and title: Dr. JanMichael Becker
Re: Difference between Original sample and sample mean
The original sample estimate is the parameter you get from estimating the model on your original dataset, exactly as a normal PLS algorithm estimation would report it.
The sample mean estimate is the average of the estimates from all the subsamples of your dataset drawn during the bootstrapping procedure.
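To make the two columns concrete, here is a minimal Python sketch. It is not SmartPLS and not a PLS estimation: it assumes a simple regression slope as a stand-in for a path coefficient and uses synthetic data. The helper names `slope` and `bootstrap_estimates` are hypothetical.

```python
import random
import statistics

def slope(xs, ys):
    """OLS slope of y on x; a stand-in for a path coefficient (assumption)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def bootstrap_estimates(xs, ys, n_subsamples=5000, seed=0):
    """Re-estimate the slope on subsamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_subsamples):
        # Each subsample has the same size as the original dataset.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return estimates

# Synthetic, well-behaved data (illustration only).
data_rng = random.Random(42)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [0.5 * x + data_rng.gauss(0, 1) for x in xs]

original = slope(xs, ys)                      # "Original sample" column
boot = bootstrap_estimates(xs, ys, n_subsamples=1000)
sample_mean = statistics.fmean(boot)          # "Sample mean" column

print(f"original sample: {original:.3f}")
print(f"sample mean:     {sample_mean:.3f}")
```

With clean data like this, the two printed values should be close to each other, which is the behaviour you would normally expect from the two columns.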
If the two deviate strongly, it is likely that there is a data problem in your sample or a model problem that causes large outliers in the sampling distribution of your parameter estimates (you may also want to inspect the histogram of the parameter estimates from the bootstrapping).
This can have many causes: wrong coding of variables, severe multicollinearity, model problems (e.g., using PLSc although your model is not a common factor model), a very small sample size, excessive missing values, or wrongly coded missing values, to name only a few. You need to trace the problem carefully by looking more deeply into your model and data.
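One way to check for such outliers outside of SmartPLS is to compare the mean and median of the bootstrap estimates and bin them into a rough histogram. The sketch below is only an illustration: the estimates are contrived numbers (not the poster's data), the 1.5 × IQR rule is just one common outlier convention, and `summarize_bootstrap` and `histogram_counts` are hypothetical helper names.

```python
import random
import statistics

def summarize_bootstrap(estimates):
    """Mean, median, and count of estimates outside the 1.5 * IQR fences."""
    q1, _, q3 = statistics.quantiles(estimates, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.fmean(estimates),
        "median": statistics.median(estimates),
        "n_outliers": sum(1 for e in estimates if e < low or e > high),
    }

def histogram_counts(values, n_bins=10):
    """Counts per equal-width bin between min(values) and max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        i = min(int((v - lo) / width), n_bins - 1)  # put the maximum in the last bin
        counts[i] += 1
    return counts

# Contrived bootstrap estimates: most lie near 0.5, a few are extreme.
rng = random.Random(1)
estimates = [rng.gauss(0.5, 0.05) for _ in range(950)] + [-8.0] * 50

summary = summarize_bootstrap(estimates)
counts = histogram_counts(estimates)
print(summary)
print(counts)
```

In this made-up case only 5% of the subsamples produce extreme estimates, yet the mean is pulled far below the median of about 0.5, and the histogram shows two clearly separated clusters. A pattern like that in your own bootstrap results would point to the kinds of data or model problems listed above.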
Dr. JanMichael Becker, University of Cologne, SmartPLS Developer
Researchgate: https://www.researchgate.net/profile/Ja ... v=hdr_xprf
GoogleScholar: http://scholar.google.de/citations?user ... AAAJ&hl=de