Hi,
I am struggling with missing values when importing my CSV files.
The data are stored in SAV files, and missing values are coded as "-999999". I have a lot of those files, so I am using an R script to automatically save them as CSV files to make them usable for PLS.
The CSV files seem to be correct, because the number of missing values is identical to that in the CSV file I save manually from SPSS. So the R script does not seem to be the problem. But when I import the data into SmartPLS (and after coding the missing values there!), the number of missing values is far below the actual number. The funny thing is that if I do the same thing with the CSV file that I saved manually from SPSS, the number of missing values is correct.
So both CSV files have the same number of missing values, but when I import them, one has fewer missing values than the other. Any ideas how to solve this? Or any suggestions as to where the problem lies?
Thanks in advance!
Best Regards
Flo
Problem with Importing Data (Missing Values) from SAV to CSV to SmartPLS
- SmartPLS Developer
- Posts: 1284
- Joined: Tue Mar 28, 2006 11:09 am
- Real name and title: Dr. Jan-Michael Becker
Re: Problem with Importing Data (Missing Values) from SAV to CSV to SmartPLS
First, are you using SmartPLS 3 or SmartPLS 2? You are posting here in the SmartPLS 2 forum.
Apparently, your script does something strange, as the CSV export directly from SPSS seems to give correct results in SmartPLS. It seems that there is a formatting problem.
Two ideas:
Do your missing values have decimal places in the CSV? That could lead to problems if you write them as -999999.00 or -999999,00. Generally, import problems often come from an incorrect decimal delimiter setting (i.e., using a comma instead of a dot, or vice versa).
SmartPLS 3 also treats empty cells as missing values. Maybe you have empty cells in addition to your -999999 missing values?
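Both ideas can be checked on the CSV text itself before importing. The following is only a rough sketch in Python, not anything SmartPLS provides: the function name `audit_missing_codes`, the semicolon delimiter, and the sample data are assumptions made for illustration. It counts cells that match the missing code exactly, cells that equal it numerically but are written differently (e.g. with decimal places), and empty cells.

```python
import csv
import io


def audit_missing_codes(csv_text, delimiter=";", missing_code="-999999"):
    """Count exact, differently formatted, and empty missing-value cells."""
    counts = {"exact": 0, "reformatted": 0, "empty": 0}
    rows = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(rows, None)  # skip the header row with the variable names
    for row in rows:
        for cell in row:
            text = cell.strip()
            if text == "":
                counts["empty"] += 1
            elif text == missing_code:
                counts["exact"] += 1
            else:
                try:
                    # Accept a decimal comma as well as a decimal point.
                    value = float(text.replace(",", "."))
                except ValueError:
                    continue  # not a number at all; ignored by this check
                if value == float(missing_code):
                    counts["reformatted"] += 1
    return counts


# Hypothetical sample data; the delimiter is assumed to be a semicolon,
# so a decimal comma does not split a cell in two.
sample = "x1;x2;x3\n1;-999999;3\n-999999,00;;2\n4;-999999.00;-999999\n"
print(audit_missing_codes(sample))
```

If the "reformatted" or "empty" counts differ between the two CSV files, that would point to the formatting difference described above.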
Dr. Jan-Michael Becker, BI Norwegian Business School, SmartPLS Developer
Researchgate: https://www.researchgate.net/profile/Jan_Michael_Becker
GoogleScholar: http://scholar.google.de/citations?user ... AAAJ&hl=de
Re: Problem with Importing Data (Missing Values) from SAV to CSV to SmartPLS
Thank you for your reply!
First, I have to admit that I am indeed using SmartPLS 3. I somehow got into the wrong section. Sorry for that!
Regarding your ideas, I also suspect a formatting problem.
But there are no empty cells, and when I compare both files, all missing values are coded in the same way, with no differences in the delimiter settings.
As I just discovered, there are differences when it comes to very large or very small numbers. The export from SPSS gives exactly the same number as in SPSS, while the script version transforms them to e.g. 4,64E+09 instead of 4636000000. This might be a reason for the different median etc. calculations when I import the data into SmartPLS as well. But I don't know if this is also the reason for the missing values problem, because the missing values themselves seem to be correct.
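To see what a value like 4,64E+09 stands for once it is written out in plain digits, here is a small Python sketch. The helper name `expand_scientific` and the pattern are assumptions for illustration only. Note that it cannot bring back digits that were already rounded away: 4,64E+09 expands to 4640000000, not to the original 4636000000, so the rounding would have to be avoided at the export step.

```python
import re
from decimal import Decimal

# Matches numbers such as 4,64E+09 or 4.64e-3 (decimal comma or point).
SCI_PATTERN = re.compile(r"^[+-]?\d+(?:[.,]\d+)?[eE][+-]?\d+$")


def expand_scientific(cell):
    """Return the cell as plain digits if it is in scientific notation."""
    text = cell.strip()
    if not SCI_PATTERN.match(text):
        return text  # anything else is passed through unchanged
    # Decimal parses the exponent exactly; "f" formatting drops the "E".
    return format(Decimal(text.replace(",", ".")), "f")


print(expand_scientific("4,64E+09"))    # 4640000000
print(expand_scientific("4636000000"))  # 4636000000
```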
- cringle
- SmartPLS Developer
- Posts: 818
- Joined: Tue Sep 20, 2005 9:13 am
- Real name and title: Prof. Dr. Christian M. Ringle
- Location: Hamburg (Germany)
Re: Problem with Importing Data (Missing Values) from SAV to CSV to SmartPLS
When the exported data file includes values such as "4,64E+09", SmartPLS 3 cannot read the data file correctly. "E" is a string, and SmartPLS 3 expects only numeric values.
The rule for SmartPLS data:
- Only use text in the first row (i.e., the header which contains the variable names); sometimes, we also experienced problems when using special characters in the first row (e.g., names in French with an accent). Hence, no special characters in the variable names.
- Otherwise, only use numbers (and never a string).
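The two rules above can be checked before importing a file. The following Python sketch is only one possible reading of them, not an official SmartPLS check: the function name `check_rules` is hypothetical, "special characters" is interpreted as any non-ASCII character, the delimiter is assumed to be a semicolon, and empty cells are skipped because SmartPLS 3 treats them as missing values, as mentioned earlier in this thread.

```python
import csv
import io
import re

# A plain number: optional sign, digits, optional decimal part, no exponent.
PLAIN_NUMBER = re.compile(r"^[+-]?\d+(?:[.,]\d+)?$")


def check_rules(csv_text, delimiter=";"):
    """Return a list of messages for cells that break the two rules."""
    problems = []
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    if not rows:
        return problems
    header, body = rows[0], rows[1:]
    for name in header:
        # Assumption: "special character" means anything outside ASCII.
        if not name.isascii():
            problems.append(f"header: special character in {name!r}")
    for line_no, row in enumerate(body, start=2):
        for name, cell in zip(header, row):
            text = cell.strip()
            if text == "":
                continue  # empty cells are left alone (treated as missing)
            if not PLAIN_NUMBER.match(text):
                problems.append(
                    f"line {line_no}, {name}: not a plain number: {cell!r}"
                )
    return problems


# Hypothetical sample with an accented name, an "E" value, and a string.
sample = "x1;prénom;x3\n1;2;3\n4,64E+09;abc;-999999\n"
for message in check_rules(sample):
    print(message)
```

A file that produces no messages here satisfies this sketch's interpretation of the rules; whether SmartPLS accepts it may still depend on the delimiter settings chosen during import.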
Best regards
Christian
Prof. Dr. Christian M. Ringle, Hamburg University of Technology (TUHH), SmartPLS
- Literature on PLS-SEM: https://www.smartpls.com/documentation
- Google Scholar: https://scholar.google.de/citations?use ... AAAJ&hl=de