Formative constructs with problematic indicator weights

Post by cgrimpe »

I've run several PLS models with varying sets of data. In nearly all models where formative constructs were employed, any one construct has indicators with positive as well as negative signs. On the one hand, you have to clean up your measurement model before estimating the structural model. On the other hand, formative indicators are assigned to constructs for the sake of completeness, and eliminating one indicator might alter the nature of the whole construct.

In contrast to that, reflective constructs seem to be rather unproblematic. If you have covarying indicators, you can be quite sure that your loadings will be correspondingly high. Moreover, formative weights usually appear to be much lower than loadings. Is there a way to obtain "nicer" indicator weights from the very beginning?

To me, this also seems to be a problem when it comes to the review process for a journal. "Unsightly" indicator weights may raise criticism you can't disprove and may also lead to a predominant use of reflective indicators. What is your opinion on that?

Post by jjsailors »

ALL indicators in PLS are formative, whether the outer, "measurement" model is specified as being inner or outer directed. All that varies is the estimation method by which the weights are determined. To avoid the problems you describe (due to multicollinearity) you should specify your outer model as being outer directed. In a diagram this gives the appearance of reflective indicators, but they are not. ALL latent variables in PLS are weighted sums of their indicators, no matter how the weights are determined. The problem is really one of how we diagram PLS models, not a conceptual problem.
John J. Sailors, PhD
Associate Professor of Marketing
The University of St. Thomas
Opus College of Business
Minneapolis, MN

Post by cgrimpe »

I'm not quite sure if I understand your point. You typically handle multicollinearity by centering or standardizing your indicators which is automatically done by PLS when the option "mean=0, variance=1" for the variable treatment is selected. Moreover, a maximum variance inflation factor of say 2 should not affect the indicator weights so much.
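
For readers who want to relate the VIF figure mentioned here to indicator correlations, the following is a minimal sketch for the simplest case of a block with exactly two indicators, where VIF = 1 / (1 - r^2). The function names are made up for illustration, and the cutoff of 2 is only the value quoted in the post, not a recommendation.

```python
def vif_two_indicators(r):
    """VIF for a block with exactly two indicators that correlate r.

    With two predictors, the R^2 of one indicator regressed on the
    other is simply r squared, so VIF = 1 / (1 - r^2).
    """
    return 1.0 / (1.0 - r * r)


def correlation_for_vif(vif):
    """Absolute correlation between two indicators implied by a given VIF."""
    return (1.0 - 1.0 / vif) ** 0.5


# Hypothetical values: a correlation of 0.7 and the cutoff of 2 from the post.
print("VIF at r = 0.7:", round(vif_two_indicators(0.7), 2))
print("r implied by VIF = 2:", round(correlation_for_vif(2.0), 2))
```

Under these assumptions, a VIF of 2 already corresponds to two indicators that correlate at roughly 0.71. Blocks with more than two indicators need the R^2 of a full regression of each indicator on all the others.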

It's clear that all constructs are formative in the way their score is calculated. But if you have a well-founded formative construct, such as socio-economic status, is it really justified to have it as being outer directed, i.e. being reflective? And what do you look at then? Loadings or weights?

Formative versus reflective

Post by joerghenseler »

John is right when he says that the direction of the arrows in PLS is not necessarily an indication of whether the measurement model is formative or reflective.
Actually, Chin, Ringle and all the other developers of GUIs for PLS Path Modeling just use the direction of the arrows to differentiate between Mode A and Mode B.
Usually, Mode A is interpreted as reflective, Mode B as formative. But this is more a question of philosophy than of statistics. Actually, Mode A is the first component of a PLS regression (this is by the way a tool to overcome multicollinearity, and implemented in SPAD-PLS).
PLS Path Modeling determines the latent variable scores (as John says) always as weighted sum of its indicators. In Mode A, the weights are the correlations between the indicators and the inner estimation of the latent variable; in Mode B the weights are regression weights.
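
The difference between the two modes can be made concrete with a small sketch. It assumes one block with two standardized indicators and an already computed inner estimate (proxy) of the latent variable, and it works from correlations only. All numbers and function names are hypothetical, and the final rescaling of the weights that PLS software applies (so that the latent variable has unit variance) is left out because it does not change the signs.

```python
def mode_a_weights(c1, c2):
    """Mode A: each weight comes from a simple regression, so it is
    proportional to the indicator's own correlation with the proxy."""
    return (c1, c2)


def mode_b_weights(c1, c2, r):
    """Mode B: weights are multiple-regression coefficients of the proxy
    on both standardized indicators (closed form for two predictors)."""
    det = 1.0 - r * r  # determinant of the 2x2 indicator correlation matrix
    return ((c1 - r * c2) / det, (c2 - r * c1) / det)


# Hypothetical correlations: both indicators relate positively to the proxy
# (0.6 and 0.3), but they also correlate 0.7 with each other.
c1, c2, r = 0.6, 0.3, 0.7
weights_a = mode_a_weights(c1, c2)
weights_b = mode_b_weights(c1, c2, r)
print("Mode A weights:", weights_a)
print("Mode B weights:", weights_b)
```

With these made-up values, both Mode A weights are positive, while the second Mode B weight turns negative even though that indicator correlates positively with the proxy. That is one way the mixed signs discussed in this thread can arise from collinearity alone.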

However, I do not agree with John that determining a latent variable as a linear combination of indicators means that it is formative. Take the example of summated scales: They are by definition linear combinations, and many summated scales are regarded as reflective!
If you have a latent variable determined by linear combination of its indicators, you CAN determine the measurement error of an indicator, and thus regard your measurement model as reflective.

Conclusion:
"Reflective vs. formative" is more a philosophical question (are the indicators the cause of the LV or is the LV the cause of the indicators) than a statistical question.

Re: Formative versus reflective

Post by cgrimpe »

joerghenseler wrote:PLS Path Modeling determines the latent variable scores (as John says) always as weighted sum of its indicators. In Mode A, the weights are the correlations between the indicators and the inner estimation of the latent variable; in Mode B the weights are regression weights.
Yes, in mode A the loadings are used to calculate the LV score and in mode B the weights. So there is actually a difference and, in my opinion, you can't just model all of your constructs in mode A, i.e. reflective, as John suggested, because the differences in LV score estimation will lead to differences in path coefficients as well. Apart from that, many summated scales that have appeared in the literature are misspecified, as Eggert/Fassott (2003) show.

So back to my original question: How do you get rid of negative indicator weights in formative constructs?

Post by joerghenseler »

You get rid of them if you use Mode PLS (only available with SPAD-PLS so far) or if you restrict yourself to looking at just the first component of the PLS regression, which actually is the same as Mode A.
If you use Mode A and report the weights instead of the loadings, you are at least one step nearer to what you are looking for.

Post by T_Hansen »

The argument that reflective and formative models differ only slightly in statistical terms has been supported in a set of studies. Albers/Hildebrandt, for example, modelled both reflective and formative constructs for the same problem domain and concluded, just as Dr. Henseler has stated, that rather few differences were to be found.

However, I have not really come across a satisfying solution for what to do with intercorrelated constructs of a formative nature, as in Dr. Grimpe's problem. Diamantopoulos advises eliminating intercorrelated variables in formative constructs, thus modifying the semantics. Johnson advises building an (unweighted) index in order not to eliminate any variables. Then there is the possibility of drawing it with the appearance of a reflective construct in PLS; however, I would have a hard time explaining this in an academic paper ("misspecification"), and I would not be able to say when it would formally be formative in PLS.

If I understand you correctly, in Dr. Grimpe's case you advise the latter. Can someone elaborate on a.) when to use inbound vs. outbound arrows for PLS models? b.) whether to use loadings vs. weights in such a case?

Post by jjsailors »

It is the weights that are always used in PLS to calculate latent variable scores, never the loadings. Graphic conventions of Mode A vs Mode B are just that—conventions for pictorial representation. In all PLS models the weights are always used to construct the latent variables, all that differs between modes is how the weights are estimated.

Furthermore, with all due respect, Reflective vs Formative or Constructive isn't simply a philosophical question, it is a question of causality.

Reflective indicators model the situation where the latent variable causes the indicators. Having a common cause, the indicators will be correlated (low correlations suggest a problem with the model), and as that common cause changes (increases or decreases) so too will all the indicators. In other words, a reflective model means that all the indicators will move in tandem (due to differing measurement error they will not all move equally, but they will all move).

Formative indicators do not require a common cause and need not move in tandem. They may or may not be correlated. Rather than being caused by the latent variable, they are seen as causing the latent variable.
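
The causal distinction drawn above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the loadings (0.8), error spread (0.6) and composite weights (0.7, 0.3) are arbitrary, and the formative indicators are generated as independent purely to show that they need not correlate.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(0)  # fixed seed so the sketch is reproducible
n = 5000

# Reflective block: one latent cause drives both indicators, plus error.
latent = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.8 * v + random.gauss(0, 0.6) for v in latent]
x2 = [0.8 * v + random.gauss(0, 0.6) for v in latent]

# Formative block: unrelated indicators jointly define a composite.
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
composite = [0.7 * a + 0.3 * b for a, b in zip(z1, z2)]

print("reflective indicators:", round(pearson(x1, x2), 2))
print("formative indicators: ", round(pearson(z1, z2), 2))
print("z2 with its composite:", round(pearson(z2, composite), 2))
```

In this setup the reflective indicators correlate clearly because they share a cause, while the formative indicators are close to uncorrelated and still each relate to the composite they define.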

With regard to measurement error, all measures have error, but in PLS this variance is not isolated or treated differently than true score variance.

With respect to a construct like socio-economic status, yes, it is a formative construct (education level is not caused by the same thing that causes the prestige of one's job), as all constructs in PLS are, whether you use Mode A or Mode B (inner or outer directed). As such, no matter which mode you employ for your estimation, the latent variable is modeled by formative indicators and they should be presented and discussed as such.

Post by cgrimpe »

jjsailors wrote:It is the weights that are always used in PLS to calculate latent variable scores, never the loadings.
Not necessarily. I exported the LV scores (estimated in Mode A) and did a correlation analysis between them and composite variables calculated "by hand" using the indicator values multiplied by (1) the loadings and (2) the weights. In both cases, the correlation is 1. So there is virtually no difference except for the absolute value of the loading/weight which is typically higher for a loading.
jjsailors wrote:With respect to a construct like socio-economic status, yes, it is a formative construct (education level is not caused by the same thing that causes the prestige of one's job), as all constructs in PLS are, whether you use Mode A or Mode B (inner or outer directed). As such, no matter which mode you employ for your estimation, the latent variable is modeled by formative indicators and they should be presented and discussed as such.
So as a rule: Estimate all your LVs in Mode A no matter whether you conceptualize them to be formative or reflective. Then take the loadings for the reflective and the weights for the formative for interpretation. Would you agree?

Post by joerghenseler »

If your construct is reflective, you should use Mode A.

If your construct is formative, you can either use Mode A or Mode B.
Mode B is fine if you have no multicollinearity and/or if you want to use a formative independent variable to maximize the explanation of the dependent variable.
If you have multicollinearity, using Mode A is one solution.
Another possibility is using Mode B and later decompose the variance of the indicators or use other techniques recommended in statistics textbooks to overcome multicollinearity in multiple regression.

Post by jjsailors »

cgrimpe wrote: In both cases, the correlation is 1.
Yes, (AX1 + BX2 + CX3) will have a correlation of 1 with (JX1 + KX2 + LX3) for all values of A, B, C, J, K, and L, so long as the pairs (A, J), (B, K) and (C, L) have the same sign (positive or negative). The two weighted sums will have different values, but those values will move in perfect lock step, i.e., will have a correlation of 1.
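
A quick numerical check of the weighted-sum statement may be useful here. Two weighted sums of the same indicators correlate exactly 1 when the two weight vectors are proportional; with same-sign but non-proportional weights the correlation is usually high but can fall below 1. The sketch below uses a tiny made-up data set and hypothetical weights. Within a Mode A block, weights and loadings are often close to proportional, which may be why the exported scores correlated so strongly in both cases.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def weighted_sum(weights, columns):
    """Row-wise weighted sum of several indicator columns."""
    rows = range(len(columns[0]))
    return [sum(w * col[i] for w, col in zip(weights, columns)) for i in rows]


# Made-up indicator values for five observations.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]

base = weighted_sum((1.0, 1.0), (x1, x2))
proportional = weighted_sum((2.0, 2.0), (x1, x2))  # same ratio of weights
same_sign = weighted_sum((3.0, 0.5), (x1, x2))     # same signs, other ratio

print("proportional weights:", round(pearson(base, proportional), 4))
print("same-sign weights:   ", round(pearson(base, same_sign), 4))
```

The first correlation is 1 up to rounding; the second is high but visibly below 1, so a correlation near 1 between two composites should be checked rather than assumed.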
cgrimpe wrote: So as a rule: Estimate all your LVs in Mode A no matter whether you conceptualize them to be formative or reflective. Then take the loadings for the reflective and the weights for the formative for interpretation. Would you agree?
I prefer Mode A because it circumvents the problem of multicollinearity at the indicator level, which is typically an issue in applied research, where I have most often used PLS. Mathematically all LVs are weighted sums, using the weights and not the loadings. Conceptually PLS never really models reflective indicators.

The only time I would use the loading is in the following situation. Imagine that we have a variable X1 that is an indicator for an exogenous variable and a variable Y1 that is an indicator of an endogenous variable that is predicted, directly or indirectly, by the construct which X1 helps define. Now imagine that we are interested in calculating the impact of X1 on Y1. I would take the weight for X1, multiply it by the direct path coefficient, if any, linking its construct to the construct that Y1 helps define, sum that with all the products of its weight times each indirect path coefficient linking its construct to the construct that Y1 helps define, and then multiply that by the loading of Y1 on its construct.

In other words, if you need to make a prediction of the impact of a construct on an indicator, you would use the loading but that should not be confused with the fact that in PLS the loadings are never used for calculating latent variable scores and so play no role in the formation of latent variables nor in determining path coefficients.
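
The calculation described above can be sketched as follows. The function name, its arguments and all coefficient values are hypothetical; each indirect route is given as the list of path coefficients along that route, and its contribution is their product.

```python
from math import prod


def indicator_to_indicator_effect(weight_x, direct_path, indirect_chains, loading_y):
    """Impact of an exogenous indicator X1 on an endogenous indicator Y1.

    weight_x        -- outer weight of X1 on its construct
    direct_path     -- direct path coefficient between the two constructs
                       (use 0.0 if there is no direct path)
    indirect_chains -- one list of path coefficients per indirect route
    loading_y       -- loading of Y1 on its construct
    """
    # Total effect of the X construct on the Y construct: the direct path
    # plus the product of coefficients along every indirect route.
    construct_effect = direct_path + sum(prod(chain) for chain in indirect_chains)
    return weight_x * construct_effect * loading_y


# Made-up example: weight 0.4, direct path 0.5, one indirect route
# through a mediator (0.3 then 0.6), and a loading of 0.8 for Y1.
effect = indicator_to_indicator_effect(0.4, 0.5, [[0.3, 0.6]], 0.8)
print(round(effect, 4))
```

Because the weight is a common factor, multiplying it by each path separately and summing, as in the description, gives the same result as multiplying it once by the total effect.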
joerghenseler wrote: If your construct is reflective, you should use Mode A.
If your construct is truly reflective you should reevaluate the use of PLS because all the choice of mode does is determine the way that weights are calculated and defining a construct as a weighted sum of indicators means it is formative, no matter where the weights come from.

Post by jjsailors »

By the way, please don't take my comments as implying anything negative about PLS. I am a huge PLS fan, but it reflects a different model than covariance-based SEM does, regardless of estimation mode or how we graph our models. Ignoring the differences between PLS and covariance-based SEM serves no good purpose.

Post by joerghenseler »

John, how would you determine latent variable scores if not as a linear combination of indicators (whether belonging to the block in question or not)?

Post by jjsailors »

joerghenseler wrote:John, how would you determine latent variable scores if not as a linear combination of indicators (whether belonging to the block in question or not)?
This is another area where PLS and covariance-based SEM butt heads. True latent variables or constructs don't have scores that we can measure. They can only be estimated, as in the case of true factor analysis and the corresponding factor scores. A true factor model--which is what a reflective indicator model represents--can only have estimates of the values of the latent variable.

Formative indicators define what should really be called composite variables, which is always what we have in PLS, whether we use Mode A or Mode B estimation. For example, socio-economic status is a composite variable defined as being jointly determined by its components.

The fact that latent scores (composite variable scores, really) are determinate and known in PLS is one of its advantages over covariance-based SEM, which is marked by estimates of latent scores that have an indeterminate true score.

In an earlier post the question of summated scales came up. For example, an IQ test containing 40 questions where an individual's score is an adjustment made to their total number of correctly answered questions such that the average across individuals is 100. One might ask, isn't intelligence a true latent variable, which should be measured by reflective indicators? But don't we use our IQ test score, an adjusted sum, as a measure of intelligence?

The answer, of course, is yes, but that IQ score, that adjusted sum is just one measure, an imperfect measure and is not itself synonymous with intelligence. In other words, that sum is a single (imperfect) measure of intelligence. It would, ideally, be used in a structural model along with other (also imperfect) measures of intelligence. Sometimes, we use a single measure alone. When we do so we are ignoring measurement error but it is, of course, still there, and our single, summed measure is a reflection of the construct of interest, but it does not equal that construct.

Post by joerghenseler »

jjsailors wrote: This is another area where PLS and covariance-based SEM butt heads. True latent variables or constructs don't have scores that we can measure.
Do you regard "not having scores" as a constitutive element of reflective constructs?
If so, what about PLS path modelling using covariance matrices as input?
In that case, you do not have scores, either.