We are trying to apply a 3-group PLS-POS (N=799). Following Becker et al. (2013), we are performing multiple runs of the algorithm with different starting partitions (random assignment with pre-segmentation): "However, like the expectation–maximization (EM) algorithm in FIMIX-PLS, PLS-POS can face the problem of ending in local optima due to its use of a hill-climbing approach. Thus, a repeated application of PLS-POS with different starting partitions is advisable." (p. 676)
The model includes 2 exogenous formative variables and 3 endogenous (1 formative, 2 reflective) variables (see image below).
In the first round of runs, the optimization criterion was to maximize the sum of R-square for the target construct (PI in this model). So far, after 25 runs, the results are mostly consistent in terms of variance explained and path coefficients (in around 70-80% of the runs, the three groups differ from one another within a run but look similar across runs). Interestingly, despite this consistency, group memberships (i.e. the assignment of observations to the groups) are not consistent at all. That is, only a low proportion of observations is consistently assigned to the same group (after relabeling the groups so that they match across runs). In addition, the results have been problematic, with serious discriminant validity issues (judging from the results, our guess is that the algorithm formed the groups around each of the antecedents: OBE, MOT, FBI).
We are now exploring another optimization criterion (sum of all construct weighted R-squares). Again, across several runs we get consistent results in terms of R-square and path coefficients and inconsistent results in group memberships, but this time no discriminant validity issues.
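One way to put a number on the membership inconsistency without relabeling groups by hand is the adjusted Rand index (ARI), which compares two partitions regardless of how the groups are named. The following is only a minimal sketch, assuming each run's assignments are exported as a list of group labels in the same observation order; the function names and the toy data are hypothetical and are not our actual results.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Agreement between two partitions, invariant to group relabeling.

    1.0 means identical partitions; values near 0 mean chance-level agreement.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("Both runs must cover the same observations.")
    n = len(labels_a)
    if n < 2:
        raise ValueError("At least two observations are required.")

    # Contingency counts: how many observations fall in group i of run A
    # and group j of run B, plus the marginal group sizes of each run.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())

    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. every observation in one group in both runs).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def mean_pairwise_ari(runs):
    """Average ARI over all pairs of runs (needs at least two runs)."""
    pairs = list(combinations(runs, 2))
    return sum(adjusted_rand_index(a, b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): three runs over six observations.
run_1 = [1, 1, 2, 2, 3, 3]
run_2 = [3, 3, 1, 1, 2, 2]  # same partition as run_1, groups renamed
run_3 = [1, 2, 3, 1, 2, 3]  # a genuinely different partition

print(adjusted_rand_index(run_1, run_2))
print(adjusted_rand_index(run_1, run_3))
print(mean_pairwise_ari([run_1, run_2, run_3]))
```

A mean pairwise ARI close to 1 would indicate that the runs recover essentially the same segments, while values near 0 would confirm that memberships agree no better than chance even though the group-level estimates look alike.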
Now, my questions would be:
- Given that the formation of groups is not consistent in terms of membership assignment, what would be the criterion for selection of group membership for multi-group analysis (MGA)?
- Given the inconsistency of group membership, classifying observations by averaging their group assignments across runs does not make sense. Our guess is that a good choice would be to select the PLS-POS run with the best average R-square improvement over the original model. Would that be enough?