Variables and Data

The variables are organized in an abstract way, independently of the way they are persisted.

The following diagram presents a ‘traditional’ view of what is a table:

  • the ‘columns’ are the variables,

  • the ‘rows’ are the value sets for each entity,

  • the ‘cells’ are the variable entity values.

_images/magma-table.png

The following diagram shows the relationships between the different concepts:

_images/magma.png

Variables

Variables and Categories

A variable describes a set of values. The values of a variable are all of the same type. Possible value types are:

  • integer

  • decimal

  • text

  • binary

  • locale

  • boolean

  • datetime

  • date

  • point

  • linestring

  • polygon

A variable is about an entity, i.e. all the values for a variable are from the same entity type. Possible entity types are:

  • Participant

  • Instrument

A category describe some of the possible values of a variable. A category is associated to one and only one variable.

Datasources and Tables

A variable is in one and only one table.

A table has several variables and is in one and only one datasource.

A datasource has several tables. A datasource is not a database: it can be persisted in a database, using different schema. It can also be persisted in a file in xml or Excel formats. It is important to understand that Opal separates the formal description of the variables from the way they are persisted. This gives to Opal a lot of versatility.

Attributes

Datasources, variables and categories have attributes. These attributes provide additional meta-information. An attribute is made of:

  • a namespace (optional),

  • a name (required),

  • a locale (optional), that specifies in which language is the attribute value,

  • a value (required even if null).

Example of a variable asked_age which has the following attributes:

Name

Locale

Value

label

en

What is your age ?

label

fr

Quel est votre age ?

questionnaire

IdentificationQuestionnaire

page

P1

The variable asked_age has also some categories (with their attributes):

Name

Attributes

888

label:en=Don’t know
label:fr=Ne sait pas

999

label:en=Prefer not to answer
label:en=Préfère ne pas répondre

Fully Qualified Names

Each of these elements has a short name. A fully qualified name will identify them uniquely:

  • Datasource fully qualified name: <datasource_name>

  • Table fully qualified name: <datasource_name>.<table_name>

  • Variable fully qualified name: <datasource_name>.<table_name>:<variable_name>

The fully qualified name is useful for disambiguation.

Following the example of the asked_age variable, its fully qualified name could be: opal-data.IdentificationQuestionnaire:asked_age

Derived Variables

A derived variable is a variable which values are computed using a script. This script is expressed using the Magma Javascript API.

Views

Opal deals with variables and values in tables. Views are here to:

  • define a subset of a table, both in terms of variables and values,

  • define a subset of many tables in terms of variables and values,

  • define #Derived Variables that are to be resolved against ‘real’ ones.

These virtual tables are then manipulated just like standard tables (for instance they can be copied to a datasource).

Given table Table1:

ID

Var1

Var2

Var3

1

Value 1.1

Value 2.1

Value 3.1

2

Value 1.2

Value 2.2

Value 3.2

3

Value 1.3

Value 2.3

Value 3.3

A view can be defined so that the resulting ‘table’ may be View1:

ID

Var1

Var3

1

Value 1.1

Value 3.1

3

Value 1.3

Value 3.3

or View2:

ID

DerivedVar = function(Var1, Var2)

1

function(Value1.1, Value2.1)

3

function(Value1.3, Value2.3)

Given Table1 above and the following table Table2:

ID

VarA

VarB

VarC

100

Value A.100

Value B.100

Value C.100

200

Value A.200

Value B.200

Value C.200

300

Value A.300

Value B.300

Value C.300

A view can also be a combination or a ‘join’ of both tables, as in View3:

ID

Var1

Var3

VarA

VarC

1

Value 1.1

Value 3.1

3

Value 1.3

Value 3.3

100

Value A.100

Value C.100

300

Value A.300

Value C.300

Data

Entities

The entities can be of different types:

  • Participant (most common)

  • Instrument (provided by Onyx)

  • Workstation (provided by Onyx)

  • … (any that might fit your needs)

Each entity has a unique identifier. An entity can have several value sets, but only one value set for a particular table.

Value Types

The following table gives more information about the textual representation of a value, given a value type:

Value Type

Value as a String

integer

The string value must all be decimal digits, except that the first character may be an ASCII minus sign ‘-’ to indicate a negative value. The resulting integer has radix 10 and the supported range is [-2 63, 2 63-1].

decimal

As described by Java Double documentation.

text

As-is.

binary

Base64 encoded.

locale

String representation of a locale is <language>[_<country>[_<variant>]] (for instance en, en_CA etc.) where:

language: lowercase two-letter ISO-639 code.
country: uppercase two-letter ISO-3166 code.
variant: vendor specific code, see Java Locale.

boolean

True value if is equal, ignoring case, to the string “true”.

datetime

Date times are represented in ISO_8601 format: “yyyy-MM-dd’T’HH:mm:ss.SSSZ”
Supported input formats are (four digits year is required):

yyyy-MM-dd’T’HH:mm:ss.SSSZ
yyyy-MM-dd’T’HH:mm:ssZ
yyyy-MM-dd’T’HH:mmZ
yyyy-MM-dd’T’HH:mm:ss.SSSzzz
yyyy-MM-dd HH:mm:ss
yyyy/MM/dd HH:mm:ss
yyyy.MM.dd HH:mm:ss
yyyy MM dd HH:mm:ss
yyyy-MM-dd HH:mm
yyyy/MM/dd HH:mm
yyyy.MM.dd HH:mm
yyyy MM dd HH:mm

date

Dates are represented in ISO_8601 format: “yyyy-MM-dd”
Supported input formats are (four digits year is required):

yyyy-MM-dd
yyyy/MM/dd
yyyy.MM.dd
yyyy MM dd
dd-MM-yyyy
dd/MM/yyyy
dd.MM.yyyy
dd MM yyyy

point

Point coordinates (longitude, latitude).
Supported input formats are:

GeoJSON
JSON
Google Map

# GeoJSON
{type: “Point”, coordinates: [-71.34, 41.12]}

# GeoJSON coordinates only: [lon,lat]
[-71.34, 41.12]

# JSON (different flavours of keys)
{“lat” : 41.12,”lon” : -71.34}
{“lat” : 41.12,”lng” : -71.34}
{“latitude” : 41.12,”longitude” : -71.34}
{“lt” : 41.12,”lg” : -71.34}

# String, comma separated latitude and longitude (Google map like): lat,lon
41.12,-71.34

linestring

Array of point coordinates.
Supported input format is GeoJSON:

# GeoJSON
{type: “LineString”, coordinates: [[22.2,44.1],[33.4,55.3],[32.12,44]]}

# GeoJSON coordinates only
[[22.2,44.1],[33.4,55.3],[32.12,44]]

polygon

Array of shapes. A shape is a list of points. The last point must be equal to the first point.
Supported input format is GeoJSON:

# GeoJSON
{type: “Polygon”, coordinates: [[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]]}

# GeoJSON coordinates only: one shape polygon
[
[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]
]

# GeoJSON coordinates only: polygon with outter and inner shapes
[
[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ],
[ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2] ]
]

Value Sets and Values

A value is associated to a variable and is part of a value set. Each value set is for a particular entity and a particular table. An entity has a maximum of one value set in one table.

A value is always associated with a type and a data (or a sequence of data if the variable is repeatable).