### About Outliers

Posted: **Wed Dec 02, 2015 12:46 pm**

by **thomasmakz@gmail.com**

Dear all,

I have a question. My primary data set has 472 observations after removing straight-lining cases, and I found more than 30 outliers (univariate outliers on manifest variables). I do not think removing these outliers is a wise solution. Should I keep them as a subgroup and run a multigroup analysis, or is there another treatment I could apply?

Thank you all.

Thomas

### Re: About Outliers

Posted: **Wed Dec 02, 2015 3:20 pm**

by **thomasmakz@gmail.com**

I have run PLS; here are my findings.

Full set (including outliers)

Construct A: AVE = 0.697; CR = 0.920, Cronbach's Alpha = 0.891

Construct B: AVE = 0.756; CR = 0.925, Cronbach's Alpha = 0.892

Construct C: AVE = 0.691; CR = 0.870, Cronbach's Alpha = 0.779

Construct D: AVE = 0.887; CR = 0.959, Cronbach's Alpha = 0.936

Path A-->D: 0.456 (p<0.001)

Path B-->D: 0.204 (p<0.001)

Path C-->D: 0.132 (ns)

D adjusted R-square: 0.5, Q-square: 0.441

Dataset without outliers (30 outliers in total out of 481 cases)

Construct A: AVE = 0.736; CR = 0.933, Cronbach's Alpha = 0.910

Construct B: AVE = 0.776; CR = 0.933, Cronbach's Alpha = 0.904

Construct C: AVE = 0.733; CR = 0.892, Cronbach's Alpha = 0.819

Construct D: AVE = 0.917; CR = 0.959, Cronbach's Alpha = 0.954

Path A-->D: 0.381 (p<0.001)

Path B-->D: 0.236 (p<0.001)

Path C-->D: 0.189 (p<0.01)

D adjusted R-square: 0.514, Q-square: 0.441

Outlier dataset

Construct A: AVE = 0.495; CR = 0.826, Cronbach's Alpha = 0.738

Construct B: AVE = 0.662; CR = 0.887, Cronbach's Alpha = 0.836

Construct C: AVE = 0.418; CR = 0.586, Cronbach's Alpha = 0.418

Construct D: AVE = 0.732; CR = 0.891, Cronbach's Alpha = 0.817

Path A-->D: 0.673 (p<0.001)

Path B-->D: 0.117 (ns)

Path C-->D: -0.008 (ns)

D adjusted R-square: 0.552, Q-square: 0.441

However, I also ran the multigroup analysis:

PLS-MGA:

A-->D, path mean diff = 0.292 (ns)

B-->D, path mean diff = 0.119 (ns)

C-->D, path mean diff = 0.197 (ns)

So, should I drop the 30 outliers to get better reliability, validity and R-square? On the other hand, the multigroup analysis shows no significant difference in the path coefficients. Please help.
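For readers comparing the AVE and CR figures across the three datasets, here is a minimal sketch of how these two statistics are commonly computed from standardized outer loadings. The function names and the example loadings are hypothetical and are not taken from the results above; the PLS software may report values that differ slightly depending on its settings.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability (rho_c) from standardized loadings.

    Each indicator's error variance is taken as 1 - loading^2.
    """
    sum_loadings_sq = sum(loadings) ** 2
    error_var = sum(1 - l ** 2 for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_var)


# Hypothetical loadings for a four-item reflective construct (illustration only).
example_loadings = [0.80, 0.85, 0.82, 0.88]
print("AVE:", round(ave(example_loadings), 3))
print("CR: ", round(composite_reliability(example_loadings), 3))
```

Both statistics depend only on the loadings, so a subgroup whose loadings drop (as in the outlier dataset) will show lower AVE and CR even if the path coefficients look similar.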

### Re: About Outliers

Posted: **Wed Dec 02, 2015 5:14 pm**

by **jmbecker**

I would always be careful with excluding outliers. You need to explain why you think that the outliers are not meaningful and not just extreme cases of the population. Just because there are some very unsatisfied customers in a customer satisfaction survey does not make them worth excluding. You would lose valuable information.

However, given the lower reliability of your measures, it seems that these respondents have different response styles and interpret the measurement items differently (or answer randomly?). You really need to investigate the outliers to judge their situation.
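One way to start such an investigation is to compute a few simple per-respondent diagnostics. The sketch below is only an illustration: the function names and the choice of indicators (standard deviation, number of distinct scale points, longest run of identical answers) are assumptions, not a method prescribed in this thread, and each respondent is assumed to be a non-empty list of numeric Likert answers.

```python
from statistics import pstdev


def longest_run(answers):
    """Length of the longest run of identical consecutive answers."""
    best = run = 1
    for prev, cur in zip(answers, answers[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def response_style(answers):
    """Simple diagnostics for one respondent's row of Likert answers."""
    return {
        "sd": pstdev(answers),                # 0.0 means every answer is identical
        "distinct": len(set(answers)),        # number of different scale points used
        "longest_run": longest_run(answers),  # long runs hint at straight-lining
    }


# Hypothetical respondents (illustration only).
print(response_style([5, 5, 5, 5, 5, 5]))
print(response_style([1, 7, 2, 6, 3, 5]))
```

Comparing these indicators between the flagged cases and the rest of the sample can help judge whether the outliers are careless respondents or genuine extreme cases.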

### Re: About Outliers

Posted: **Fri Dec 11, 2015 9:21 am**

by **Hengkov**

Hi,

In some situations, outliers cannot be excluded, because the data would then no longer describe the real situation. If the results remain good with the outliers present, why should they be removed?

Merely pursuing the fulfillment of assumptions and the like is not a good reason to remove outliers. Replacing them by winsorizing is not a good option either.

Best,

### Re: About Outliers

Posted: **Mon Mar 09, 2020 8:15 am**

by **PLStudent**

Hello! I have a dilemma regarding data cleaning and preparation before conducting a PLS-SEM analysis. I want to test a model that contains 5 latent constructs, all measured reflectively. All manifest variables are Likert-type, with responses ranging from 1 (strongly disagree) to 9 (strongly agree). The results show fairly low outer loadings and few significant path coefficients, with coefficients of determination of 0.01, 0.06 and 0.144. My research is in the social sciences, so this can be considered acceptable.

My main concern is outliers. Which methods for identifying outliers are acceptable, bearing in mind that all my manifest variables are Likert-type? I examined the data for cases where respondents answered with only one or two values (they did not try to discriminate between their answers) or answered in patterns (straight-lining). I also used SPSS boxplots to identify extreme values for each manifest item, because I saw in some papers that ordinal Likert data can be treated as interval-scaled. Also, some researchers propose using the median absolute deviation (MAD) method to detect outliers. Could you please give me some advice on how to approach outlier identification?
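To make the MAD idea concrete, here is a minimal sketch of a MAD-based flag for a single manifest item. The function name and the default threshold of 3.0 are assumptions chosen for illustration; the constant 1.4826 is the usual scaling that makes the MAD comparable to a standard deviation under normality. Note the caveat for Likert items: when at least half of the answers equal the median, the MAD is zero and the rule breaks down.

```python
from statistics import median


def mad_outliers(values, threshold=3.0):
    """Return the indices whose robust z-score exceeds `threshold`.

    The robust z-score is |x - median| / (1.4826 * MAD).
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the answers equal the median (common for Likert
        # items), so the rule is undefined; flag nothing rather than
        # every value that differs from the median.
        return []
    scaled = 1.4826 * mad
    return [i for i, v in enumerate(values) if abs(v - med) / scaled > threshold]


# Hypothetical answers to one 9-point item (illustration only).
item = [5, 6, 5, 7, 6, 5, 6, 7, 6, 1]
print(mad_outliers(item))
```

Because a Likert item has only a handful of possible values, a flag like this is best treated as a prompt to look at the respondent's whole answer pattern, not as a reason for automatic deletion.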