I have a model in which three blocks predict a fourth block. There are no hypothesized relations among the three predictor blocks, but they are substantially correlated. When I delete one of the predictor blocks, R square for the dependent block actually increases (slightly) compared with the model containing all three predictor blocks. Moreover, when I impose predictive relations among the predictors--but leave all three predictor blocks predicting the dependent block--R square declines substantially.
I understand that PLS uses a piecewise estimation approach, and so may not behave the same as a simultaneous estimation procedure, but is this sort of behavior common? It causes me to doubt the meaningfulness of the numbers produced by PLS.
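For comparison, with ordinary regression on fixed, observed scores, R square can never decrease when a predictor is added to a nested model, which is what makes the behavior above so surprising. A quick illustrative sketch with simulated data (plain OLS on observed variables, not PLS):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 3))                       # three observed predictors
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)  # third predictor is irrelevant

def r_square(X, y):
    """R square of an OLS fit (no intercept; the simulated data are centered)."""
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_two = r_square(X[:, :2], y)   # model without the third predictor
r2_three = r_square(X, y)        # full model
print(r2_two <= r2_three)        # always True for nested OLS models
```

With PLS, of course, the latent variable scores themselves are re-estimated when the model changes, so this guarantee need not carry over.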
--Ed Rigdon
R square changes and changes in the model
Edward E. Rigdon
Marketing Department
Georgia State University
Atlanta, GA USA
- cringle
- SmartPLS Developer
- Posts: 820
- Joined: Tue Sep 20, 2005 9:13 am
- Real name and title: Prof. Dr. Christian M. Ringle
- Location: Hamburg (Germany)
- Contact:
Dear Ed,
the PLS methodology – as extensively described statistically by Lohmöller (viewtopic.php?t=16) – and the results it produces appear reasonable to me. The case you describe has never occurred in my research and is not common PLS behaviour. However, "wild" results do sometimes occur, for example when multicollinearity reaches critical levels. That might be the source of your trouble as well.
Best
Christian
Prof. Dr. Christian M. Ringle, Hamburg University of Technology (TUHH), SmartPLS
- Literature on PLS-SEM: https://www.smartpls.com/documentation
- Google Scholar: https://scholar.google.de/citations?use ... AAAJ&hl=de
- joerghenseler
- PLS Expert User
- Posts: 39
- Joined: Fri Oct 14, 2005 9:59 am
- Real name and title:
Decreasing R-square when adding independent latent variables
I do not think that multicollinearity is the reason: although multicollinearity may give you strange path coefficients, it does not (negatively) affect the prediction of the latent variables.
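That distinction can be seen in a small simulated example (a toy numpy sketch, not SmartPLS output): two nearly collinear predictors yield unstable individual coefficients, yet the fitted values, and hence R square, barely change.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # nearly collinear with x1
y = x1 + rng.normal(size=n)

def r_square(X, y):
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_single = r_square(x1[:, None], y)               # x1 alone
r2_both = r_square(np.column_stack([x1, x2]), y)   # both collinear predictors

beta = np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)[0]
print(beta)  # individual coefficients can stray far from (1, 0) ...
# ... but prediction is essentially unchanged:
print(abs(r2_both - r2_single) < 0.01)
```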
If you draw connections among the independent latent variables, their latent variable scores, which before were (to some extent) optimized to predict the dependent latent variable, are now also optimized to explain the hypothesized mediating variables (whose correlation with the dependent variable is smaller than one). Because the latent variable scores of the independent variables have changed, they may now explain less of the dependent variable than before.
This effect is probably quite strong if you work with Mode B (which is most often interpreted as formative) or if the measurement model is not very reliable, so that the latent variable scores of a particular latent variable can vary strongly depending on the context in which it is placed.
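The mechanism can be mimicked outside PLS with a minimal numpy sketch (illustrative only; "Mode-B-style" weights are obtained here by simply regressing a criterion on the indicators, which is not an actual PLS algorithm): once the composite's weights must also serve a mediator, the composite explains less of the dependent variable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))                        # two indicators of one construct
y = x @ np.array([0.8, 0.1]) + rng.normal(size=n)  # dependent variable
m = x @ np.array([0.1, 0.8]) + rng.normal(size=n)  # hypothesized mediator

def r_square(target, comp):
    """R square of a simple regression of target on a single composite."""
    return np.corrcoef(target, comp)[0, 1] ** 2

# Mode-B-style weights: regress the criterion on the indicators.
w_dv = np.linalg.lstsq(x, y, rcond=None)[0]              # weights serving y only
w_shared = np.linalg.lstsq(x, (y + m) / 2, rcond=None)[0]  # weights also serving m

r2_dv = r_square(y, x @ w_dv)
r2_shared = r_square(y, x @ w_shared)
print(r2_dv > r2_shared)   # the "shared" composite explains less of y
```

The composite's weights are re-optimized for a compromise criterion, loosely analogous to how the PLS inner estimation lets neighbouring constructs pull on a latent variable's scores.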
I am pretty sure that the same happens with simultaneous estimation procedures; I guess you would find similar results with LISREL, for example. LISREL is not designed to maximize R square, but to fit the model-implied covariance matrix to the observed one as closely as possible.