How to Import Data from Files
Importing SPSS, SAS, Stata Data
Unlike the CSV file format, the SPSS/SAS/Stata file formats contain their own data dictionary. Opal uses the haven R package (and then require a functional R server) to import data in these formats: value types, categories, missing values etc. are extracted from the R data structure read by haven.
Procedure
Note
Go to the destination project’s tables page,
Select Import,
Select the Data Format SPSS (R), or SAS (R) etc., follow instructions (upload data file, set options etc.) and launch the import task,
When the import task is completed, go to the new table’s page.
⇒ The table is created/updated with the imported data
Importing CSV Data
The following applies to any delimited textual file formats, CSV (comma delimiter) being the most common one.
Opal supports two CSV importers:
CSV, which is the built-in one,
CSV (R), available as a plugin (opal-datasource-readr, see Plugins administration), based on the readr R package.
The main difference between these two CSV importers is how the data types are handled. CSV (R) is recommended for its capability to detect data types.
CSV (R) Importer
This R-based importer will attempt to detect the data types by reading the first 1000 rows. In case a variable has no values in these first rows, the text value type will be used.
Procedure
Note
Preliminary: having the opal-datasource-readr plugin installed and a functional R server,
Go to the destination project’s tables page,
Select Import,
Select the Data Format CSV (R), follow instructions (upload CSV file, set options etc.) and launch the import task,
When the CSV (R) import task is completed, go to the new table’s page.
⇒ The table is created/updated with the imported data
CSV Importer
In the case of CSV importer, the data types are NOT guessed from the provided CSV file:
Either the destination table already exists and then the variable value types are the ones declared (if there is an inconsistency with the provided data, import will fail),
Or all the value types are text.
Procedure
Setting up the destination table variables prior to the CSV import can be counter intuitive and error prone (missing variables, wrong data types): this is NOT a recommended procedure. The recommended approach is to import first the CSV data as-is and then make a view (i.e. a logical table) to transform variable value types to the correct ones.
Note
Go to the destination project’s tables page,
Select Import,
Select the Data Format CSV, follow instructions (upload CSV file, set options etc.) and launch the import task,
When the CSV import task is completed, go to the new table’s page (note that all variables have text type),
Make a view based on the imported table: How to Transform Tables with Views.
⇒ The table is created/updated with the imported data, and is accessed through a view