Project Tables¶
Tables give access to the project data along with their description. A table can be a raw table (i.e. with data persisted in the project’s database) or a logical table (also called view) which is a set of derived variables (data are computed on-demand).
See also details about how to manage tables and variables individually:
SQL¶
SQL is a very powerful language for data manipulation. SQL support in Opal allows to easily join tables, filtering, sorting and grouping data, make aggregations etc. SQL queries can be executed on one or more tables (or views) of a project from the web interface (or all projects using the programmatic API). Permission to access the values of the considered tables is required.
The supported SQL syntax is the the one of SQLite. More specifically see the SQL syntax and functions documentation.
The result of the SQL query can be downloaded from the web interface in CSV format. For a programmatic access to the SQL API, see the SQL python command and the opal.sql()
function in the opalr R package.
Note that in Opal, there is no variable for accessing the identifiers. Then when performing assignment of the table data into the SQL environment, an identifiers column is added and called _id
by default.
Note
- SQL API is read-only. Statements like CREATE, ALTER, DROP, INSERT, DELETE are not supported.
- Maximum number of columns is 2,000. SQL queries cannot be executed on tables with more than 1,999 variables (+ identifier). Use views to extract the variables of interest.
Table and Variable Naming¶
When executed in the context of a project, the simple table name can be used:
SELECT * FROM CNSIM1 LIMIT 10
If this simple table name contains a .
character it must be escaped by backquotes:
SELECT * FROM `StandingHeight.Baseline` LIMIT 10
When there is no project context, or when referring a table that is not in the current project, use the fully qualified table name with backquotes:
SELECT * FROM `CNSIM.CNSIM1` LIMIT 10
To desambiguate the column names, the table name can be used in SELECT
, JOIN
etc. statements:
SELECT CNSIM1._id, CNSIM1.GENDER, `CNSIM.CNSIM2`.PM_BMI_CATEGORICAL
FROM CNSIM1
LEFT JOIN `CNSIM.CNSIM2` ON CNSIM1._id = `CNSIM.CNSIM2`._id
The same escape rule applies to variable names, when they contain a .
character:
SELECT `InstrumentRun.timeStart`
FROM StandingHeight
LIMIT 10
or with fully qualified table name:
SELECT `baseline.StandingHeight`.`InstrumentRun.timeStart`
FROM `baseline.StandingHeight`
LIMIT 10
Functions¶
Agregate Functions
See the aggregation functions documentation.
SELECT avg(LAB_HDL) as HDL_AVG, GENDER
FROM CNSIM1
WHERE LAB_HDL is not null
GROUP BY GENDER
Date and Time Functions
See the date and time functions documentation.
SELECT *, date('now') AS extraction_date
FROM CNSIM1
LIMIT 10
Scalar Functions
See the scalar functions documentation.
SELECT round(LAB_HDL, 1) as HDL_ABS, GENDER
FROM CNSIM1
LIMIT 10
Paging¶
The directive LIMIT
and OFFSET
(combined with ORDER BY
) can be used to extract some part of the data. The following query gets the 101st to 111th lines of the query result output ordered by the identifiers:
SELECT *
FROM CNSIM1
ORDER BY _id
LIMIT 10 OFFSET 100
Union of Tables¶
When tables have the same columns, they can be stacked as follow:
SELECT * FROM CNSIM1
UNION ALL SELECT * FROM CNSIM2
LIMIT 10
Join Tables¶
Let’s join the tables samples
(columns _id
, Donor
, ConsentStatus
) and donors
(columns _id
, Gender
):
SELECT Donor AS ID, samples._id AS sample_id, ConsentStatus, Gender
FROM samples
LEFT JOIN donors ON donors._id = samples.Donor
LIMIT 10
And then make aggregations, for instance counting the number of donors having at least one “unknown” consent, per gender:
SELECT count(DISTINCT Donor) AS DonorsCount, Gender
FROM samples
LEFT JOIN donors ON donors._id = samples.Donor
WHERE ConsentStatus LIKE "%unknown"
GROUP BY Gender
Permissions¶
Specify the access rights to any table of the project and its content.
View dictionary and values of all tables Permission¶
Allow the user to see the data dictionary with individual-level data of any table.
Add table Permission¶
Allow the addition of a table or of a view, directly or via import or copy tasks.
Administrate Permission¶
Allow all operations on any table/view (including removing it).
Operations¶
Search Variables¶
Variables of the project’s tables can be searched. See Search Variables.
Download Dictionary¶
The whole project data dictionary can be download as an Excel file. This file is compatible with the operations of Add/Update Tables and Add View. When no tables are selected, the downloaded dictionary contains the definition of all tables. When some tables are selected, the dictionary contains only their definitions.
Backup Views¶
Create an archived backup of views selection (or all views).
Import Data¶
When importing data, Opal relies on the concept of datasource. This allows Opal to abstract the data importation process from the source datasource to the destination datasource regardless of their underlying implementations (file, SQL database etc.).
The importation process follows several steps:
- Data format selection: file-based (CSV, Opal Archive), server-based (SQL, Limesurvey, Opal)
- Data format specific options
- Incremental options
- Identifiers mapping options
- Data dictionary update review and table to to import selection
- Data to import review
- Archiving options when dealing with a file-based datasource
Some import options can be described as follow:
Option | Description |
---|---|
Incremental | Opal is able to detect new or updated data, relying on entity identifier and some timestamps. By default the data import is not
incremental, i.e. already existing data will be overridden.
|
Limit | A maximum number of data rows to be imported. Combined with the option, this allow to import small chunks of data at a time. |
Identifier Mapping | If an identifiers mapping is selected, each participant identifier encountered in the imported datasource is expected to be one of
the identifiers registered for this mapping. Depending on the identifiers mapping strategy the import could fail:
If no identifiers mapping is selected, the participant identifiers are imported as-is. Unknown participant identifiers will be automatically added in Opal.
|
Export Data¶
Selected tables (or currently viewed table) can be exported. The exportation process offers several options:
- Data format selection; file-based (CSV, Opal Archive), server-based (SQL)
- Data format specific options; destination folder or export database name
- Values filter options; available when a filter has been applied on the table’s values
- Identifier mapping options:
- if an identifiers mapping is selected each entity to be exported must have a mapped identifier. Otherwise the export will fail
- if no identifiers is selected the data are exported with system identifiers
Copy Data¶
Selected tables (or all tables) can be copied into another project or in the same project but with a different name (table renaming is available only when one table is selected for copy).
Add Table¶
Adds a table to the project. Each table must have a unique name and an entity type.
Add/Update table from Dictionary¶
A table can be added or updated from a data dictionary file. This data dictionary can be an Excel file (see Excel file template) or a view XML file (this can be obtained from an existing view by selecting “Download View XML”). For importing data dictionary from another format (SPSS file for instance), an alternate solution is to follow the process of importing data with the setting of limiting to 0 data rows (3rd screen in the import data wizard).
An advanced option offers the possibility to merge the data dictionaries when doing an update (otherwise default behavior is to override the properties and attributes).
Add View¶
This operation follows a step-by-step procedure:
- Specify the view name and the data dictionary (optional). The data dictionary can be provided as an XML file (this can be obtained from an existing view by selecting “Download View XML”) or an Excel file (see Excel file template). If a view with same name already exists, confirmation for overriding it is required. If a plain table already exists with same name, the operation is not allowed.
- Specify which tables this view refers to (required).
Derived variable algorithms are expressed using Magma Javascript API.
Restore Views¶
Restore backed up views. Restored views of the same name as those of existing views will be skipped unless the override options is checked.
Remove¶
Removes the selected tables/views from the project and deletes its data.