## PLS Predict: Questions about Information Theoretic Criteria

skono
PLS Junior User
Posts: 8
Joined: Wed Aug 17, 2016 4:02 pm
Real name and title: Shintaro Kono, Ph.D Candidate

### PLS Predict: Questions about Information Theoretic Criteria

Greetings,

I am writing a paper using the PLS Predict function, as well as the information theoretic criteria advocated by Sharma et al. (forthcoming) (see the following link for the Excel sheet to calculate these indices: https://www.pls-sem.net/downloads/). I am comparing multiple models from the literature in my research area, using RMSE and MAE (for predictive ability) and BIC and GM (for the balance between explanation and prediction). Here are a few questions.

(a) What exactly do we mean by "saturated model R2" in the Excel sheet for the information theoretic criteria calculation, WHEN you have 2nd-order measurement models? Should lower-order variables also directly predict the target outcome variable, or only the higher-orders (I assumed the latter is the case, but am asking because of the following problems)?

(b) I have a negative BIC (as well as negative values for many other indices, such as AIC, AICu, HQ, HQc). What does this mean? Indeed, the Excel file already shows a negative value for BIC when you download it. Are there any problems with the equations, or is this just fine? If it's fine, how do we interpret negative values in model comparisons (e.g., is closer to zero better, or do negative values mean those models are poor anyway)?

(c) I have one model that has a moderated path to the target outcome variable. The RMSE and MAE for this model are remarkably lower (and thus indicate better prediction) than those of the other models. However, this moderated model has lower explanatory power. I understand that explanation and prediction are two different things. Would this be one of those cases where a model with low explanatory power predicts well, OR are there any problems with applying PLS Predict and RMSE/MAE to moderated models?

All the best,
Shin

jmbecker
SmartPLS Developer
Posts: 932
Joined: Tue Mar 28, 2006 11:09 am
Real name and title: Dr. Jan-Michael Becker

### Re: PLS Predict: Questions about Information Theoretic Criteria

skono wrote:
Fri Jun 29, 2018 8:59 pm

> (a) What exactly do we mean by "saturated model R2" in the Excel sheet for the information theoretic criteria calculation, WHEN you have 2nd-order measurement models? Should lower-order variables also directly predict the target outcome variable, or only the higher-orders (I assumed the latter is the case, but am asking because of the following problems)?

Very good question. I think you should ask the authors of the paper. I also found similar questions regarding the saturated model.

skono wrote:
Fri Jun 29, 2018 8:59 pm

> (b) I have a negative BIC (as well as negative values for many other indices, such as AIC, AICu, HQ, HQc). What does this mean? Indeed, the Excel file already shows a negative value for BIC when you download it. Are there any problems with the equations, or is this just fine? If it's fine, how do we interpret negative values in model comparisons (e.g., is closer to zero better, or do negative values mean those models are poor anyway)?

I think the paper addresses this in the table "Possible misconceptions and clarifications":

Misconception: "Model selection criteria have values restricted to a specific range."

Clarification: "Unlike R2, which varies between 0 and 1 and has a useful interpretation, the model selection criteria do not have a scale. Thus, a wide range of values (including negative values) are possible. Furthermore, there are no strict “cut-off” values to indicate which models are important."

Therefore, negative values are possible and not necessarily a sign of a bad model. The model selection criteria are only meaningful for comparing competing models estimated on the same data, where the model with the smallest value (for negative values, the most negative one) is preferred.
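
To illustrate why negative values arise so easily, here is a minimal Python sketch. It assumes the common regression form of the criterion, BIC = n·ln(SSE/n) + p·ln(n), which may differ in detail from the formulas in the Excel sheet. The function name `bic`, the sample size, the R2 values, and the parameter counts are all hypothetical. With standardized construct scores, SSE/n is roughly 1 − R2, which is below 1, so the logarithm is negative.

```python
import math

def bic(sse, n, p):
    """BIC in the assumed regression form n*ln(SSE/n) + p*ln(n).

    sse: sum of squared errors for the target construct
    n:   sample size
    p:   number of estimated parameters (hypothetical counting rule)
    """
    return n * math.log(sse / n) + p * math.log(n)

# Hypothetical example: standardized outcome, so SSE is roughly (1 - R2) * n.
n = 200
models = {
    "model_A": {"r2": 0.45, "p": 3},
    "model_B": {"r2": 0.47, "p": 6},  # slightly higher R2, but more parameters
}

scores = {name: bic((1 - m["r2"]) * n, n, m["p"]) for name, m in models.items()}

# The preferred model is the one with the smallest (here: most negative) value.
best = min(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: BIC = {value:.2f}")
print("preferred:", best)
```

Both values come out negative, and the more parsimonious model is preferred even though its R2 is slightly lower, because the penalty term p·ln(n) outweighs the small gain in fit.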

skono wrote:
Fri Jun 29, 2018 8:59 pm

> (c) I have one model that has a moderated path to the target outcome variable. The RMSE and MAE for this model are remarkably lower (and thus indicate better prediction) than those of the other models. However, this moderated model has lower explanatory power. I understand that explanation and prediction are two different things. Would this be one of those cases where a model with low explanatory power predicts well, OR are there any problems with applying PLS Predict and RMSE/MAE to moderated models?

There should not be a problem with using PLSpredict on moderation models. Hence, you have probably encountered a case where your moderated model predicts better than it explains.
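
For reference, the two prediction metrics discussed here can be sketched in a few lines of Python. This is only an illustration of the definitions, not the PLSpredict procedure itself, which obtains its predictions for held-out cases via k-fold cross-validation. The data values and the names `pred_m1` and `pred_m2` are made up for the example.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(actual, predicted):
    """Mean absolute error: every error counts in proportion to its size."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical held-out indicator values and predictions from two models.
actual = [3.0, 4.0, 5.0, 2.0]
pred_m1 = [2.5, 4.5, 4.0, 2.0]
pred_m2 = [3.0, 3.0, 5.0, 3.0]

for name, pred in (("model 1", pred_m1), ("model 2", pred_m2)):
    print(f"{name}: RMSE = {rmse(actual, pred):.3f}, MAE = {mae(actual, pred):.3f}")
```

Both metrics are computed from out-of-sample errors only, so a model can score well on them while having a lower in-sample R2 than a competitor.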
Dr. Jan-Michael Becker, University of Cologne, SmartPLS Developer
Researchgate: https://www.researchgate.net/profile/Ja ... v=hdr_xprf