Project Tables

Tables give access to the project data along with their description. A table can be a raw table (i.e. with data persisted in the project’s database) or a logical table (also called view) which is a set of derived variables (data are computed on-demand).

See also details about how to manage tables and variables individually:

SQL

SQL is a very powerful language for data manipulation. SQL support in Opal allows to easily join tables, filtering, sorting and grouping data, make aggregations etc. SQL queries can be executed on one or more tables (or views) of a project from the web interface (or all projects using the programmatic API). Permission to access the values of the considered tables is required.

The supported SQL syntax is the the one of SQLite. More specifically see the SQL syntax and functions documentation.

The result of the SQL query can be downloaded from the web interface in CSV format. For a programmatic access to the SQL API, see the SQL python command and the opal.sql() function in the opalr R package.

Note that in Opal, there is no variable for accessing the identifiers. Then when performing assignment of the table data into the SQL environment, an identifiers column is added and called _id by default.

Note

SQL API is read-only. Statements like CREATE, ALTER, DROP, INSERT, DELETE are not supported.
Maximum number of columns is 2,000. SQL queries cannot be executed on tables with more than 1,999 variables (+ identifier). Use views to extract the variables of interest.

Table and Variable Naming

When executed in the context of a project, the simple table name can be used:

SELECT * FROM CNSIM1 LIMIT 10

If this simple table name contains a . character it must be escaped by backquotes:

SELECT * FROM `StandingHeight.Baseline` LIMIT 10

When there is no project context, or when referring a table that is not in the current project, use the fully qualified table name with backquotes:

SELECT * FROM `CNSIM.CNSIM1` LIMIT 10

To desambiguate the column names, the table name can be used in SELECT, JOIN etc. statements:

SELECT CNSIM1._id, CNSIM1.GENDER, `CNSIM.CNSIM2`.PM_BMI_CATEGORICAL
  FROM CNSIM1
  LEFT JOIN `CNSIM.CNSIM2` ON CNSIM1._id = `CNSIM.CNSIM2`._id

The same escape rule applies to variable names, when they contain a . character:

SELECT `InstrumentRun.timeStart`
  FROM StandingHeight
  LIMIT 10

or with fully qualified table name:

SELECT `baseline.StandingHeight`.`InstrumentRun.timeStart`
  FROM `baseline.StandingHeight`
  LIMIT 10

Functions

Agregate Functions

See the aggregation functions documentation.

SELECT avg(LAB_HDL) as HDL_AVG, GENDER
    FROM CNSIM1
    WHERE LAB_HDL is not null
    GROUP BY GENDER

Date and Time Functions

See the date and time functions documentation.

SELECT *, date('now') AS extraction_date
    FROM CNSIM1
    LIMIT 10

Scalar Functions

See the scalar functions documentation.

SELECT round(LAB_HDL, 1) as HDL_ABS, GENDER
    FROM CNSIM1
    LIMIT 10

Paging

The directive LIMIT and OFFSET (combined with ORDER BY) can be used to extract some part of the data. The following query gets the 101st to 111th lines of the query result output ordered by the identifiers:

SELECT *
  FROM CNSIM1
  ORDER BY _id
  LIMIT 10 OFFSET 100

Union of Tables

When tables have the same columns, they can be stacked as follow:

SELECT * FROM CNSIM1
  UNION ALL SELECT * FROM CNSIM2
  LIMIT 10

Join Tables

Let’s join the tables samples (columns _id, Donor, ConsentStatus) and donors (columns _id, Gender):

SELECT Donor AS ID, samples._id AS sample_id, ConsentStatus, Gender
    FROM samples
    LEFT JOIN donors ON donors._id = samples.Donor
    LIMIT 10

And then make aggregations, for instance counting the number of donors having at least one “unknown” consent, per gender:

SELECT count(DISTINCT Donor) AS DonorsCount, Gender
    FROM samples
    LEFT JOIN donors ON donors._id = samples.Donor
    WHERE ConsentStatus LIKE "%unknown"
    GROUP BY Gender

Permissions

Specify the access rights to any table of the project and its content.

View dictionary and values of all tables Permission

Allow the user to see the data dictionary with individual-level data of any table.

Add table Permission

Allow the addition of a table or of a view, directly or via import or copy tasks.

Administrate Permission

Allow all operations on any table/view (including removing it).

Operations

Search Variables

Variables of the project’s tables can be searched. See Search Variables.

Download Dictionary

The whole project data dictionary can be download as an Excel file. This file is compatible with the operations of Add/Update Tables and Add View. When no tables are selected, the downloaded dictionary contains the definition of all tables. When some tables are selected, the dictionary contains only their definitions.

Backup Views

Create an archived backup of views selection (or all views).

Import Data

When importing data, Opal relies on the concept of datasource. This allows Opal to abstract the data importation process from the source datasource to the destination datasource regardless of their underlying implementations (file, SQL database etc.).

The importation process follows several steps:

Data format selection: file-based (CSV, Opal Archive), server-based (SQL, Limesurvey, Opal)
Data format specific options
Incremental options
Identifiers mapping options
Data dictionary update review and table to to import selection
Data to import review
Archiving options when dealing with a file-based datasource

Some import options can be described as follow:

Option	Description
Incremental	Opal is able to detect new or updated data, relying on entity identifier and some timestamps. By default the data import is not incremental, i.e. already existing data will be overridden.
Limit	A maximum number of data rows to be imported. Combined with the option, this allow to import small chunks of data at a time.
Identifier Mapping	If an identifiers mapping is selected, each participant identifier encountered in the imported datasource is expected to be one of the identifiers registered for this mapping. Depending on the identifiers mapping strategy the import could fail: Each identifiers must be mapped prior importation (default): the import will fail if an imported identifier does not have corresponding system identifier for the selected mapping Ignore unknown identifiers at import: only data with identifier having a corresponding system identifier in the selected mapping will be imported Generate a system identifier for each unknown imported identifiers: a system identifier will be generated for each unknown imported identifier If no identifiers mapping is selected, the participant identifiers are imported as-is. Unknown participant identifiers will be automatically added in Opal.

Export Data

Selected tables (or currently viewed table) can be exported. The exportation process offers several options:

Data format selection; file-based (CSV, Opal Archive), server-based (SQL)
Data format specific options; destination folder or export database name
Values filter options; available when a filter has been applied on the table’s values
Identifier mapping options:
- if an identifiers mapping is selected each entity to be exported must have a mapped identifier. Otherwise the export will fail
- if no identifiers is selected the data are exported with system identifiers

Copy Data

Selected tables (or all tables) can be copied into another project or in the same project but with a different name (table renaming is available only when one table is selected for copy).

Add Table

Adds a table to the project. Each table must have a unique name and an entity type.

Add/Update table from Dictionary

A table can be added or updated from a data dictionary file. This data dictionary can be an Excel file (see Excel file template) or a view XML file (this can be obtained from an existing view by selecting “Download View XML”). For importing data dictionary from another format (SPSS file for instance), an alternate solution is to follow the process of importing data with the setting of limiting to 0 data rows (3rd screen in the import data wizard).

An advanced option offers the possibility to merge the data dictionaries when doing an update (otherwise default behavior is to override the properties and attributes).

Add View

This operation follows a step-by-step procedure:

Specify the view name and the data dictionary (optional). The data dictionary can be provided as an XML file (this can be obtained from an existing view by selecting “Download View XML”) or an Excel file (see Excel file template). If a view with same name already exists, confirmation for overriding it is required. If a plain table already exists with same name, the operation is not allowed.
Specify which tables this view refers to (required).

Derived variable algorithms are expressed using Magma Javascript API.

Restore Views

Restore backed up views. Restored views of the same name as those of existing views will be skipped unless the override options is checked.

Remove

Removes the selected tables/views from the project and deletes its data.