Selection of PLS-POS groups across repeated runs

ahg · Post by **ahg** » Mon Apr 17, 2017 3:16 pm

Hi all,

We are trying to apply a 3-group PLS-POS (N=799) and, as suggested by Becker et al. (2013) ("However, like the expectation–maximization (EM) algorithm in FIMIX-PLS, PLS-POS can face the problem of ending in local optima due to its use of a hill-climbing approach. Thus, a repeated application of PLS-POS with different starting partitions is advisable.", p. 676), we are performing repeated runs of the algorithm (random assignment with pre-segmentation).

The model includes 2 exogenous formative variables and 3 endogenous (1 formative, 2 reflective) variables (see image below).

In the first round of runs, the optimization criterion was to maximize the sum of R-squares for the target construct (PI in this model). So far, after 25 runs, the results are mostly consistent in terms of variance explained and path coefficients (around 70-80% of the time, the three groups differ from one another within a run but are similar across runs). Interestingly, despite this consistency, group memberships (i.e., the assignment of observations to groups) are not consistent at all: only a low proportion of observations is consistently assigned to the same group, even after relabeling the groups so that they match across runs. The results have also been problematic in another respect, with serious discriminant validity issues (judging from the results, our guess is that the algorithm created the groups based on each of the antecedents: OBE, MOT, FBI).

After this, we are exploring another optimization criterion (sum of all construct weighted R-squares). Again, across several runs we have consistent results in terms of R-square and path coefficients, and inconsistent results in group memberships, but no discriminant validity issues.
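One way to put a number on the membership inconsistency described above is to compare two runs after relabeling. The following is only a minimal sketch: the function name `best_label_agreement` and the toy label vectors are hypothetical, and it assumes group labels are coded 0, 1, 2. It tries every relabeling of the second run and reports the highest share of observations that land in the same group in both runs.

```python
from itertools import permutations


def best_label_agreement(run_a, run_b, n_groups=3):
    """Share of observations placed in the same group by both runs,
    after relabeling run_b's groups to match run_a as well as possible."""
    if len(run_a) != len(run_b):
        raise ValueError("both runs must label the same observations")
    best = 0
    for perm in permutations(range(n_groups)):
        # perm[g] is the label given to group g of run_b under this relabeling
        matches = sum(1 for a, b in zip(run_a, run_b) if a == perm[b])
        best = max(best, matches)
    return best / len(run_a)


# Hypothetical toy data: group labels for eight observations in two runs.
run_1 = [0, 0, 1, 1, 2, 2, 0, 1]
run_2 = [2, 2, 0, 0, 1, 1, 2, 1]  # same partition with permuted labels, except the last observation

print(best_label_agreement(run_1, run_2))  # 0.875
```

With three groups there are only six relabelings, so brute force is fine; a low value across many pairs of runs would confirm that the partitions themselves, and not just the group names, differ.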

Now, my questions would be:

Given that the formation of groups is not consistent in terms of membership assignment, what would be the criterion for selection of group membership for multi-group analysis (MGA)?
Obviously, classifying observations by averaging their group assignments across runs does not make sense given the inconsistency of group membership. Our guess is that a good choice would be to select the PLS-POS run with the best average R-square improvement over the original model. Would that be enough?

Thanks in advance!

jmbecker · Post by **jmbecker** » Sun Apr 23, 2017 9:20 am

The problem of inconsistent assignment results from observations that do not fit well in any of the groups (or that fit well in all groups). In the first case these are outliers that have a large distance to all groups; in the second case these are observations in the overlapping parts of the groups, with a small distance to all groups.
If you had very homogeneous groups, you would approach an R² of 1. Of course, this will rarely happen in practice, and you will always have observations that are not well predicted by your model (in any of the groups).
It will probably not matter very much to which group you assign them for MGA if the path coefficients for the groups are the same.
It could pose a problem when you try to relate the grouping to external variables in order to explain the groups and make them accessible.
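The two cases above can be illustrated with a small sketch. Everything here is hypothetical: the function `flag_ambiguous`, the `low` and `high` thresholds, and the distance values are made up for illustration, and the actual PLS-POS distance measure is the one defined in the MISQ paper, not this one.

```python
def flag_ambiguous(distances, low, high):
    """Label each observation from its distances to all groups.

    distances: one list per observation, holding one distance per group.
    low, high: hypothetical cut-offs chosen by the analyst.
    """
    flags = []
    for row in distances:
        if min(row) > high:
            flags.append("outlier")   # far from every group
        elif max(row) < low:
            flags.append("overlap")   # close to every group
        else:
            flags.append("clear")     # clearly closer to some groups than others
    return flags


# Hypothetical distances of three observations to three groups.
example = [
    [0.1, 2.0, 3.0],    # fits group 0 well
    [0.2, 0.3, 0.25],   # fits all groups about equally well
    [4.0, 5.0, 4.5],    # fits no group
]
print(flag_ambiguous(example, low=0.5, high=3.0))  # ['clear', 'overlap', 'outlier']
```

Observations flagged as "outlier" or "overlap" are the ones one would expect to switch groups between runs.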

Generally, I would use the average weighted R² as the optimization (and selection) criterion. This was not part of the original MISQ paper, but it performs better with unequal groups because it accounts for the unequal sample sizes of the groups. That way, you do not create (or select) a solution with small, very well-fitting outlier groups that are hard to interpret.
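To see why the weighting matters, here is a rough sketch that compares an unweighted sum of group R² values with a size-weighted average. It assumes that "weighted" means weighting each group's R² by its share of the sample; the function names and both candidate solutions are hypothetical numbers, not results from the model discussed here.

```python
def sum_r2(groups):
    """Unweighted sum of group R² values; groups is a list of (n, r2) pairs."""
    return sum(r2 for _, r2 in groups)


def weighted_avg_r2(groups):
    """Average R² with each group weighted by its share of the sample."""
    total_n = sum(n for n, _ in groups)
    return sum(n * r2 for n, r2 in groups) / total_n


# Two hypothetical 3-group solutions for N = 799, given as (group size, R²).
solutions = {
    "A": [(400, 0.60), (379, 0.62), (20, 0.95)],   # tiny, very well-fitting group
    "B": [(270, 0.70), (265, 0.68), (264, 0.69)],  # balanced groups
}

best_by_sum = max(solutions, key=lambda k: sum_r2(solutions[k]))
best_by_weighted = max(solutions, key=lambda k: weighted_avg_r2(solutions[k]))
print(best_by_sum, best_by_weighted)  # A B
```

In this made-up case the unweighted sum rewards solution A for its 20-observation group, while the weighted average prefers the balanced solution B. The same `max(...)` pattern could be used to pick one run out of many repeated runs.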

ahg · Post by **ahg** » Mon Apr 24, 2017 4:28 pm

Hi Jan-Michael,

Thanks for your reply. You have confirmed our suspicions about why group assignment seemed erratic while the overall model results were consistent. (We had also already started to work with the average weighted R-square, as it seemed to perform much better, but the confirmation is very much appreciated.)

Your post was really helpful.

PS: Is there any additional reference material on the mathematical process behind PLS-POS? That would greatly help in understanding how the method works.

Best,

Ángel.

jmbecker · Post by **jmbecker** » Mon Apr 24, 2017 7:53 pm

The MISQ article has an online appendix that explains the algorithm in detail.

ahg · Post by **ahg** » Mon Apr 24, 2017 11:06 pm

jmbecker wrote: The MISQ article has an online appendix that explains the algorithm in detail.

Great! I was not aware of the Appendix.

Thanks again!

forum.smartpls.com
