Instruments

Overview

In Open Data Capture (ODC), an instrument is the unit of data collection: it defines what the user sees, what data is produced, and how that data is validated.

Instruments range from simple forms (e.g., a questionnaire assessing depressive symptoms) to complex interactive tasks (e.g., the Stroop Task).

This page is organized in two halves:

  • Instrument Model: What an instrument is, how it narrows by kind, how scalar vs series instruments differ, and how schema-driven typing works.
  • Instrument Sources and Bundling: How multi-file instrument sources become a single executable bundle that can be stored and executed at runtime.

Instrument Model

Runtime Representation

At runtime, an instrument is a plain JavaScript object. It is not a class instance, and it does not require any runtime type metadata; what makes an object a valid instrument is that it matches the shape expected by the ODC runtime and UI.
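
To make the "shape, not class" point concrete, here is a minimal sketch of a structural check. The function looksLikeInstrument and the constant INSTRUMENT_KINDS are hypothetical names invented for this illustration; they are not part of the ODC runtime, and the real runtime may check more than this.

```typescript
// Hypothetical structural check; not part of the ODC runtime API.
type InstrumentKind = 'FORM' | 'INTERACTIVE' | 'SERIES';

const INSTRUMENT_KINDS: readonly string[] = ['FORM', 'INTERACTIVE', 'SERIES'];

function looksLikeInstrument(value: unknown): value is { kind: InstrumentKind; content: unknown } {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  // Only the shape is inspected: a known kind and a content property.
  return (
    typeof candidate.kind === 'string' &&
    INSTRUMENT_KINDS.indexOf(candidate.kind) !== -1 &&
    'content' in candidate
  );
}

// A plain object literal passes; no class or prototype is involved.
const plainObject = { kind: 'SERIES', content: [{ edition: 1, name: 'CONSENT_FORM' }] };
```

Calling looksLikeInstrument(plainObject) returns true, while an object with an unknown kind, or a non-object value, is rejected.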

TypeScript: Compile-Time Structure

Although instruments are plain objects at runtime, ODC uses TypeScript to provide static type checking and compile-time data-shape enforcement.

The type system is designed to enforce and enable things like:

  • Discriminated narrowing: Based on kind, the type of content (and other fields) narrows to the correct variant.
  • Schema-inferred data typing: The instrument output data type is inferred from the Zod validation schema.
  • Localization shaping: Based on language, UI-facing values become either a single value or a per-language mapping.

These rules are implemented using TypeScript features such as discriminated unions, conditional types, mapped types, and generic type inference.

It is important to understand that TypeScript types do not exist at runtime. Their role is to help instrument authors catch mistakes early.

Mental Model: “Instrument” Is a Discriminated Family

At the highest level, an Instrument is not one shape: it is a family of shapes that share a common envelope and then diverge based on a discriminator.

Two discriminators drive most of the design:

  1. kind selects what type of instrument this is, and TypeScript uses that to narrow the allowed shape of content (and other fields).
  2. language selects how UI-facing values are represented (single value vs per-language map).

There is also a higher-level split:

  • Scalar instruments: “completable” instruments that produce a single output payload, validated by a schema.
  • Series instruments: “compositional” instruments that reference multiple scalar instruments by identity.

The Base Instrument

Every instrument, regardless of kind, carries:

  • Runtime Compatibility: a fixed __runtimeVersion written by the helper functions (authors do not set it).
  • Metadata:
    • details: research/administrative metadata (title, description, license, etc.)
    • clientDetails (optional, but highly recommended): subject-facing metadata (instructions, duration, optional display title)
  • Tagging: tags are filterable labels for discovery.
  • Supported Languages: language decides whether UI fields are a plain value or a per-language map.
  • Discriminator: kind decides which variant you are in (form vs interactive vs series).
  • Content: content is the kind-specific payload.

Think of it like this simplified shape:

type Language = 'en' | 'fr';

// conceptual shape (not the exact type definition)
type BaseInstrument = {
  __runtimeVersion: 1;
  kind: 'FORM' | 'INTERACTIVE' | 'SERIES';
  language: Language | Language[];
  details: {
    /* researcher-facing metadata; localized fields depend on language */
  };
  clientDetails?: {
    /* subject-facing metadata; localized fields depend on language */
  };
  content: {
    /* becomes specific after narrowing by kind */
  };
};

The important point is not the exact properties; it is that the system is deliberately set up so that once kind and language are known, everything downstream becomes more specific.

How Narrowing Works: kind Drives Shape

The type system is built around the idea:

  • If kind === 'FORM', then content must be form content, and the instrument must have scalar-instrument capabilities (schema, measures, internal identity).
  • If kind === 'INTERACTIVE', then content must be interactive content (a render callback contract), and it is still scalar.
  • If kind === 'SERIES', then content is a list of references to scalar instruments, and it is not scalar.

Therefore, in the codebase, consumers can narrow the type of Instrument based on kind:

import type { AnyInstrument } from '/runtime/v1/@opendatacapture/runtime-core';

export function handleInstrument(instrument: AnyInstrument) {
  switch (instrument.kind) {
    case 'FORM':
      // instrument.content is declarative form content
      break;
    case 'INTERACTIVE':
      // instrument.content has render(done)
      break;
    case 'SERIES':
      // instrument.content is an array of { edition, name } references
      break;
  }
}

The benefit is that the type system becomes predictable for both authors and consumers: kind selects the variant, and TypeScript follows.
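
The same narrowing can be tried without the runtime package. The sketch below uses simplified stand-in types (FormInstrument, InteractiveInstrument, SeriesInstrument, and SketchInstrument are assumptions for illustration, not the real ODC definitions) and adds an exhaustiveness check, so that introducing a new kind without handling it becomes a compile error.

```typescript
// Simplified stand-ins for the real ODC types (assumed shapes, for illustration only).
type FormInstrument = { kind: 'FORM'; content: Record<string, { kind: string }> };
type InteractiveInstrument = {
  kind: 'INTERACTIVE';
  content: { render: (done: (data: unknown) => void) => void };
};
type SeriesInstrument = { kind: 'SERIES'; content: { edition: number; name: string }[] };
type SketchInstrument = FormInstrument | InteractiveInstrument | SeriesInstrument;

function describeInstrument(instrument: SketchInstrument): string {
  switch (instrument.kind) {
    case 'FORM':
      // content is narrowed to the field map
      return `form with ${Object.keys(instrument.content).length} field(s)`;
    case 'INTERACTIVE':
      // content is narrowed to the render contract
      return 'interactive task';
    case 'SERIES':
      // content is narrowed to the reference list
      return `series of ${instrument.content.length} instrument(s)`;
    default: {
      // If a new kind is added to the union, this assignment stops compiling.
      const unreachable: never = instrument;
      return unreachable;
    }
  }
}
```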

Scalar vs Series Instruments

Scalar instruments are those that can be completed. They:

  • Have a stable internal identity (edition + name)
  • Define a validationSchema (Zod v3 or v4)
  • Produce one output payload (data) whose static type is derived from the schema output type
  • Can define measures derived from that output

Series instruments are orchestration containers. They:

  • Define an ordered list of scalar instrument identities (the same { edition, name } shape used by scalar internal)
  • Do not define validationSchema, measures, or internal (they are not directly completed)

Quick Comparison

Capability                     Scalar (FORM / INTERACTIVE)   Series (SERIES)
Produces one output payload    Yes                           No
Has validationSchema           Yes                           No
Has stable internal identity   Yes                           No
Can define measures            Yes                           No
content describes UI           Yes                           No

Where “Data” Comes From: Schema as the Source of Truth

A central design rule is:

  • The instrument output data type is derived from validationSchema.

This is why defineInstrument is so important. It creates a pipeline where one authoring artifact (the schema) becomes the source of truth for:

  • What the runtime will validate
  • What TypeScript believes the output type is
  • What downstream parts of the instrument (e.g., form fields) are allowed to be

Conceptually, the inference pipeline is:

  1. You choose kind (FORM or INTERACTIVE).
  2. You provide a Zod schema as validationSchema.
  3. TypeScript infers the output type from that schema.
  4. That inferred type flows into:
    • Scalar instrument content (“data drives UI”)
    • Scalar instrument measures (refs and computed values)

This accomplishes two things:

  1. Runtime validation and compile-time typing stay aligned by construction.
  2. Everything that depends on TData (measures, form field typing, etc.) becomes automatically type-safe.
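
The inference pipeline can be sketched without Zod. In the example below, SchemaLike is a hand-rolled stand-in for a Zod schema, and defineScalarSketch and summarize are hypothetical names; the real defineInstrument has a much richer signature. The point is only that TData is inferred from the schema and then flows into other parts of the definition.

```typescript
// Minimal stand-in for a Zod schema: anything with a parse method (assumption for this sketch).
type SchemaLike<TData> = { parse: (input: unknown) => TData };

type ScalarDefinition<TData> = {
  validationSchema: SchemaLike<TData>;
  // Typed from the schema-derived output, so `data` needs no annotation at the call site.
  summarize: (data: TData) => string;
};

// Hypothetical helper: TData is inferred from validationSchema, never written by hand.
function defineScalarSketch<TData>(definition: ScalarDefinition<TData>): ScalarDefinition<TData> {
  return definition;
}

const happinessSchema: SchemaLike<{ overallHappiness: number }> = {
  parse(input) {
    const value = (input as { overallHappiness?: unknown } | null)?.overallHappiness;
    if (typeof value !== 'number') {
      throw new Error('overallHappiness must be a number');
    }
    return { overallHappiness: value };
  }
};

const happinessSketch = defineScalarSketch({
  validationSchema: happinessSchema,
  // `data` is inferred as { overallHappiness: number }
  summarize: (data) => `happiness=${data.overallHappiness}`
});
```

Renaming the key in the schema output type without updating summarize produces a compile error, which is the alignment "by construction" described above.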

Localization Model: One Rule Used Everywhere

ODC supports unilingual and multilingual instruments. The instrument’s language value selects the mode:

  • Unilingual: language is a single language (e.g. 'en').
    • UI fields are single values (e.g. title: string).
  • Multilingual: language is an array of languages (e.g. ['en', 'fr']).
    • UI fields become per-language objects (e.g. title: { en: string; fr: string }).

This rule is reused consistently across UI-facing properties:

  • details.title, details.description
  • clientDetails.instructions, clientDetails.title
  • tags
  • form field label/description
  • measure labels

Conceptually:

// Conceptual behavior
InstrumentUIOption<TLanguage, TValue> =
  TLanguage is single language -> TValue
  TLanguage is language array  -> { [each language]: TValue }

The practical implication is that changing language changes the required structure of many fields.
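
One way to express this rule in real TypeScript is a conditional type combined with a mapped type. The names UIOptionSketch and resolveText below are assumptions made for this sketch; the actual InstrumentUIOption definition may differ in detail.

```typescript
type Language = 'en' | 'fr';

// Single language -> plain value; language array -> one value per listed language.
type UIOptionSketch<TLanguage extends Language | Language[], TValue> = TLanguage extends Language
  ? TValue
  : TLanguage extends Language[]
    ? { [K in TLanguage[number]]: TValue }
    : never;

const unilingualTitle: UIOptionSketch<'en', string> = 'Happiness Questionnaire';
const multilingualTitle: UIOptionSketch<['en', 'fr'], string> = {
  en: 'Happiness Questionnaire',
  fr: 'Questionnaire sur le bonheur'
};

// Hypothetical runtime helper: pick the string to display for the active language.
function resolveText(value: string | { [K in Language]?: string }, active: Language): string | undefined {
  return typeof value === 'string' ? value : value[active];
}
```

With this definition, omitting fr from multilingualTitle is a compile error, and assigning an object to unilingualTitle is rejected as well.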

Instruments by Kind (Narrowed View)

Form Instruments

Form instruments are scalar instruments whose content is a declarative description of fields. The key design is “data drives UI”:

  • The form’s schema defines the output data shape.
  • The form content is typed so that each key in the data shape can only be assigned fields that match the value type for that key.

Practically, you author:

  • A Zod schema for the output data (e.g. { overallHappiness: number })
  • content with fields keyed by those data keys (e.g. overallHappiness: { kind: 'number', ... })

The form system also supports two authoring styles:

  1. Flat: a direct mapping from data keys to field definitions.
  2. Grouped: an array of “sections”, each with a title/description and a subset of fields.

Dynamic behavior is modeled explicitly:

  • A field can be static (always present) or dynamic (conditionally rendered).
  • A dynamic field declares deps (the keys it depends on) and a render(...) function that returns a suitable static field or null.

The type system uses the output data type to ensure:

  • The field kind matches the expected value type
  • Dynamic field rendering returns a compatible field definition for that value type

This is a compile-time guarantee that the form authoring surface stays consistent with the instrument’s output.
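
A reduced version of this idea can be written with a conditional type per value and a mapped type per key. Everything in the sketch below is an assumption for illustration: only the 'number' field kind appears on this page, and the 'string', 'boolean', and 'dynamic' shapes, as well as the names StaticFieldFor, DynamicFieldFor, and FormContentSketch, are simplified stand-ins.

```typescript
// Assumed field kinds for illustration; only 'number' appears in the example on this page.
type StaticFieldFor<TValue> = TValue extends number
  ? { kind: 'number'; label: string }
  : TValue extends string
    ? { kind: 'string'; label: string }
    : TValue extends boolean
      ? { kind: 'boolean'; label: string }
      : never;

type DynamicFieldFor<TData, TValue> = {
  kind: 'dynamic';
  deps: readonly (keyof TData)[];
  // Returns a static field compatible with the value type, or null to hide the field.
  render: (data: Partial<TData>) => StaticFieldFor<TValue> | null;
};

// Every data key must be given a field whose kind matches that key's value type.
type FormContentSketch<TData> = {
  [K in keyof TData]-?: StaticFieldFor<TData[K]> | DynamicFieldFor<TData, TData[K]>;
};

type SurveyData = { isEmployed: boolean; jobTitle?: string };

const surveyContent: FormContentSketch<SurveyData> = {
  isEmployed: { kind: 'boolean', label: 'Are you employed?' },
  jobTitle: {
    kind: 'dynamic',
    deps: ['isEmployed'],
    render: (data) => (data.isEmployed ? { kind: 'string', label: 'Job title' } : null)
  }
};
```

Declaring isEmployed with kind: 'number', or returning a number field from the jobTitle render function, fails to compile under these types.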

Interactive Instruments

Interactive instruments are scalar instruments whose content is code-driven rather than declarative.

The contract is:

  • content.render(done) runs your instrument UI logic.
  • When finished, you call done(data) to complete with an output payload.

Two notable constraints are enforced by types:

  • Interactive output data is JSON-shaped (the interactive Data base is a JSON type).
  • The done(data) payload type must match what the validationSchema accepts/produces (because TData is inferred from schema output).
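
The render/done contract is easiest to see from the host side. The sketch below is DOM-free and uses simplified, assumed names (InteractiveContentSketch and runInteractive are hypothetical); it shows how a host could turn the callback contract into a Promise of the output payload.

```typescript
// Simplified contract (assumption): render receives a done callback and calls it when finished.
type InteractiveContentSketch<TData> = {
  render: (done: (data: TData) => void) => void;
};

// Hypothetical host-side wrapper: resolves with whatever the instrument passes to done.
function runInteractive<TData>(content: InteractiveContentSketch<TData>): Promise<TData> {
  return new Promise<TData>((resolve) => {
    // A synchronous throw inside render rejects the promise automatically.
    content.render(resolve);
  });
}

// A DOM-free task: it completes immediately with a JSON-shaped payload.
const timerTask: InteractiveContentSketch<{ seconds: number }> = {
  render(done) {
    const start = Date.now();
    done({ seconds: (Date.now() - start) / 1000 });
  }
};
```

Because TData is fixed to { seconds: number } here, calling done with a differently shaped payload would not compile.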

Interactive content may also provide:

  • Optional html scaffolding
  • meta tags
  • staticAssets (key/value asset map)
  • A restricted head injection surface for legacy script/style (__injectHead)

Series Instruments

Series instruments are not scalar. Their job is to refer to scalar instruments, not define output themselves.

Conceptually, a series is:

  • metadata + localization + tags
  • plus content: [{ edition, name }, ...]

The runtime can interpret those references as “run these scalar instruments in order”.

The type system keeps the distinction sharp:

  • Series content is references, not executable UI
  • Series has no schema/measures/internal identity
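
As a rough illustration of "run these scalar instruments in order", the sketch below resolves a list of references against a set of available scalar instruments. The names InstrumentRef, ScalarSketch, and resolveSeries are hypothetical, and the error behavior for a missing reference is an assumption of this sketch rather than documented runtime behavior.

```typescript
type InstrumentRef = { edition: number; name: string };
type ScalarSketch = { internal: InstrumentRef; kind: 'FORM' | 'INTERACTIVE' };

// Hypothetical lookup: resolve series references, preserving the order of the series content.
function resolveSeries(refs: InstrumentRef[], available: ScalarSketch[]): ScalarSketch[] {
  return refs.map((ref) => {
    const match = available.filter(
      (candidate) => candidate.internal.name === ref.name && candidate.internal.edition === ref.edition
    )[0];
    if (!match) {
      throw new Error(`Unknown instrument: ${ref.name} (edition ${ref.edition})`);
    }
    return match;
  });
}

const availableScalars: ScalarSketch[] = [
  { internal: { edition: 3, name: 'DEMOGRAPHICS' }, kind: 'FORM' },
  { internal: { edition: 1, name: 'CONSENT_FORM' }, kind: 'FORM' }
];

const onboardingOrder = resolveSeries(
  [
    { edition: 1, name: 'CONSENT_FORM' },
    { edition: 3, name: 'DEMOGRAPHICS' }
  ],
  availableScalars
);
```

The result follows the order of the references, not the order in which the scalar instruments happen to be stored.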

Measures: Derived Values From Scalar Output

Scalar instruments can define measures, which are named derived values displayed/used by the system.

There are two measure modes:

  • Constant (ref): point at a key in the output data.
    • The type system attempts to constrain ref to keys whose values are “measure-compatible” (primitive-ish types the UI can display).
    • If the data type is too broad to analyze, the constraint relaxes (so authors can still write measures, but with less compiler help).
  • Computed: compute a value from the full data payload.
    • The function receives typed data, so computed measures are checked against the schema-derived output type.

Measures are also localized via the same language rule (measure labels are single strings or per-language maps depending on language).
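
The two measure modes can be sketched as a small discriminated union. The shapes below (the 'const' and 'computed' kind values, the value function, and the names MeasureSketch, MeasureCompatibleKey, and evaluateMeasures) are assumptions for illustration; labels are kept unilingual for brevity.

```typescript
type MeasureValue = boolean | number | string;

// Keys of TData whose values are displayable; other keys are excluded from `ref`.
type MeasureCompatibleKey<TData> = {
  [K in keyof TData]-?: TData[K] extends MeasureValue ? K : never;
}[keyof TData];

// Assumed measure shapes; the real definitions may carry more fields.
type MeasureSketch<TData> =
  | { kind: 'const'; label: string; ref: MeasureCompatibleKey<TData> }
  | { kind: 'computed'; label: string; value: (data: TData) => MeasureValue };

// Hypothetical evaluator: read a key for 'const', call the function for 'computed'.
function evaluateMeasures<TData>(
  measures: Record<string, MeasureSketch<TData>>,
  data: TData
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(measures)) {
    const measure = measures[key];
    result[key] = measure.kind === 'const' ? data[measure.ref] : measure.value(data);
  }
  return result;
}

type TaskData = { correct: number; total: number; trials: number[] };

const taskMeasures: Record<string, MeasureSketch<TaskData>> = {
  total: { kind: 'const', label: 'Total trials', ref: 'total' },
  accuracy: { kind: 'computed', label: 'Accuracy', value: (data) => data.correct / data.total }
};

const measureValues = evaluateMeasures<TaskData>(taskMeasures, { correct: 8, total: 10, trials: [1, 0, 1] });
```

In this sketch, ref: 'trials' is rejected at compile time because an array is not a displayable value, while a computed measure is free to derive a number from it.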

Authoring Instruments

It is possible to export a plain object as an Instrument (because Instruments are just JavaScript objects at runtime). However, most instruments should use the public helper functions because they set runtime-controlled fields and make TypeScript inference and narrowing work as intended.

Public Runtime API

In the runtime environment, users should define instruments by importing helper functions from the runtime v1 entrypoint:

import { defineInstrument, defineSeriesInstrument } from '/runtime/v1/@opendatacapture/runtime-core';

Scalar instruments (FORM and INTERACTIVE) define a validation schema using Zod:

import { z } from '/runtime/v1/zod@3.x';

defineInstrument (Scalar)

Use defineInstrument(...) for FORM and INTERACTIVE instruments.

It is recommended because it:

  • Sets runtime-controlled fields (notably the runtime version marker)
  • Makes TypeScript infer the output data type from validationSchema

Minimal form example:

import { defineInstrument } from '/runtime/v1/@opendatacapture/runtime-core';
import { z } from '/runtime/v1/zod@3.x';

export default defineInstrument({
  kind: 'FORM',
  language: 'en',
  tags: ['Example'],
  internal: { edition: 1, name: 'HAPPINESS_QUESTIONNAIRE' },
  clientDetails: {
    estimatedDuration: 1,
    instructions: ['Please answer based on your current feelings.']
  },
  content: {
    overallHappiness: {
      kind: 'number',
      label: 'How happy are you overall?',
      description: 'Please select a number from 1 to 10 (inclusive)',
      min: 1,
      max: 10,
      variant: 'slider'
    }
  },
  details: {
    title: 'Happiness Questionnaire',
    description: 'A questionnaire about happiness.',
    license: 'Apache-2.0'
  },
  measures: null,
  validationSchema: z.object({
    overallHappiness: z.number().int().min(1).max(10)
  })
});

Minimal interactive example:

import { defineInstrument } from '/runtime/v1/@opendatacapture/runtime-core';
import { z } from '/runtime/v1/zod@3.x';

export default defineInstrument({
  kind: 'INTERACTIVE',
  language: 'en',
  tags: ['Example'],
  internal: { edition: 1, name: 'CLICK_THE_BUTTON_TASK' },
  clientDetails: {
    estimatedDuration: 1,
    instructions: ['Please click the button when you are done.']
  },
  content: {
    render(done) {
      const start = Date.now();
      const button = document.createElement('button');
      button.textContent = 'Submit Instrument';
      document.body.appendChild(button);
      button.addEventListener('click', () => {
        done({ seconds: (Date.now() - start) / 1000 });
      });
    }
  },
  details: {
    title: 'Click the Button Task',
    description: 'A very simple interactive instrument.',
    license: 'Apache-2.0'
  },
  measures: null,
  validationSchema: z.object({
    seconds: z.number()
  })
});

defineSeriesInstrument (Series)

Use defineSeriesInstrument(...) for SERIES instruments.

A series instrument typically looks like:

import { defineSeriesInstrument } from '/runtime/v1/@opendatacapture/runtime-core';

export default defineSeriesInstrument({
  kind: 'SERIES',
  language: 'en',
  tags: ['Onboarding'],
  details: {
    title: 'Onboarding',
    description: 'A series of onboarding instruments.',
    license: 'Apache-2.0'
  },
  content: [
    { edition: 1, name: 'CONSENT_FORM' },
    { edition: 3, name: 'DEMOGRAPHICS' }
  ]
});

Repo-Only Constraint: License Narrowing

When this package is compiled inside the ODC repo, a global type (OpenDataCaptureContext) marks isRepo: true, and the definition type is intersected with an extra requirement:

  • inside repo: details.license must be an ApprovedLicense
  • outside repo: details.license can be any LicenseIdentifier

This is a type-level policy hook: it changes authoring constraints without changing runtime behavior. If you are authoring instruments in the instrument playground for your own instance, this will not affect you.
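
A type-level hook of this kind can be sketched with a conditional type keyed on a context type. The license lists below are tiny stand-ins, and LicenseFor and DetailsSketch are hypothetical names; the real implementation reads a global OpenDataCaptureContext type rather than taking a type parameter.

```typescript
// Stand-in license lists (assumptions); the real identifier sets are much larger.
type LicenseIdentifier = 'Apache-2.0' | 'MIT' | 'UNLICENSED';
type ApprovedLicense = 'Apache-2.0' | 'MIT';

// The context type decides which licenses the definition type accepts.
type LicenseFor<TContext> = TContext extends { isRepo: true } ? ApprovedLicense : LicenseIdentifier;

type DetailsSketch<TContext> = { title: string; license: LicenseFor<TContext> };

const insideRepo: DetailsSketch<{ isRepo: true }> = { title: 'Example', license: 'Apache-2.0' };
const outsideRepo: DetailsSketch<{ isRepo: false }> = { title: 'Example', license: 'UNLICENSED' };

// Would not compile: 'UNLICENSED' is not an ApprovedLicense inside the repo.
// const rejected: DetailsSketch<{ isRepo: true }> = { title: 'Example', license: 'UNLICENSED' };
```

Both objects are identical in structure at runtime; only the set of values the compiler accepts for license differs.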

Instrument Sources and Bundling

Instrument Sources

A single-file instrument source (like the examples above) is the simplest case, but many instruments are multi-file:

  • interactive instruments often include separate HTML/CSS/JS modules
  • assets such as fonts and images may be imported

A single-file source is just an index.ts or index.js file whose default export is a valid Instrument object (typically produced by defineInstrument(...) or defineSeriesInstrument(...)).

If instruments were defined directly in the Open Data Capture codebase, adding or updating instruments would require rebuilding and redeploying the application.

To avoid that, ODC treats “instrument sources” as the authoring units (a set of source files), and a bundling pipeline turns those sources into a single executable artifact.

Instrument Bundler

Although we often talk about files, the instrument bundler is platform-agnostic and does not require a filesystem.

Instead, it accepts an array of inputs:

  • each input is a virtual file with a name and source

Those inputs are bundled into a single output using esbuild with a custom plugin.

The bundler identifies the entrypoint by searching the inputs for an index file name (in order):

  • index.tsx
  • index.jsx
  • index.ts
  • index.js

Once found, the bundler injects a tiny entry module into esbuild. For example, if the index file is index.js, the injected entry looks like:

import instrument from './index.js'; var __exports = instrument;

The effect is that the instrument’s default export becomes available under a known name (__exports) in the bundle.
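
The entrypoint search and the injected entry module can be sketched as two small functions. BundlerInput, findIndexInput, and createEntryModule are hypothetical names for this illustration, and the error thrown when no index file exists is an assumption; the actual bundler code may be organized differently.

```typescript
type BundlerInput = { name: string; source: string };

// Candidate names, in the search order listed above.
const INDEX_CANDIDATES = ['index.tsx', 'index.jsx', 'index.ts', 'index.js'];

// Hypothetical helper: return the first candidate present among the inputs.
function findIndexInput(inputs: BundlerInput[]): BundlerInput {
  for (const candidate of INDEX_CANDIDATES) {
    const match = inputs.filter((input) => input.name === candidate)[0];
    if (match) {
      return match;
    }
  }
  throw new Error(`No index file found; expected one of: ${INDEX_CANDIDATES.join(', ')}`);
}

// Build the tiny entry module that exposes the default export as __exports.
function createEntryModule(indexInput: BundlerInput): string {
  return `import instrument from './${indexInput.name}'; var __exports = instrument;`;
}
```

Note that the search order is defined by the candidate list, not by the order of the inputs: if both index.js and index.ts are supplied, index.ts is chosen.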

The custom plugin is responsible for resolving every import it encounters:

  • Relative imports must exist in the inputs (or the bundler throws)
  • HTTP/runtime imports are treated as external (e.g., import React from '/runtime/v1/react@18.x')
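
These two rules can be approximated by a pure function that classifies an import specifier. This is a sketch under stated assumptions, not the plugin itself: classifyImport and ImportResolution are hypothetical names, input names are assumed to be flat (no subdirectories), and rejecting any other kind of specifier is a choice made for this example.

```typescript
type ImportResolution = { kind: 'external' } | { kind: 'internal'; name: string };

// Hypothetical classifier; the real plugin's rules may be more detailed.
function classifyImport(specifier: string, inputNames: string[]): ImportResolution {
  // HTTP(S) URLs and absolute runtime paths are left for the browser to load.
  if (/^(https?:\/\/|\/)/.test(specifier)) {
    return { kind: 'external' };
  }
  // Relative imports must match one of the virtual input files.
  if (/^\.\//.test(specifier)) {
    const name = specifier.slice(2);
    if (inputNames.indexOf(name) === -1) {
      throw new Error(`Cannot resolve '${specifier}': no input named '${name}'`);
    }
    return { kind: 'internal', name };
  }
  // Assumption of this sketch: anything else is rejected.
  throw new Error(`Unsupported import specifier: '${specifier}'`);
}
```

For example, '/runtime/v1/react@18.x' is classified as external, while './styles.css' resolves only if an input named styles.css was provided.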

The bundling step produces:

  • one JavaScript virtual file
  • and, if CSS was imported, one CSS virtual file that aggregates all stylesheets used by the instrument

When CSS is present, the bundler converts the CSS bundle to base64 and injects it into the JavaScript bundle by attaching it to __exports:

// lots of stuff generated by esbuild
var __exports = index_default;
__exports.content.__injectHead.style = '...'; // base64 encoded string

Then, when an interactive instrument is rendered, a script checks for the special content.__injectHead property and, if it is present, decodes the styles. The decoded styles are inserted into the document head before the render method is called.

The resulting JavaScript (with styles injected, if applicable) is then wrapped in a single asynchronous Immediately Invoked Function Expression (IIFE) and passed to the esbuild transpiler. When executed, the IIFE returns a Promise that resolves to an Instrument. The output conforms to the ECMAScript 2022 Language Specification and can be executed in any modern browser. At this stage, the code can also be minified and tree-shaken (i.e., dead code removed).

For example:

(async () => {
  // code with styles injected (if applicable)
  return __exports;
})();

Storage and Execution

The final bundle output is stored in the database.

At runtime, it can be evaluated in the global scope (using indirect eval or the Function constructor) to produce a Promise of an Instrument object, which the runtime can then render/execute.
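
A minimal sketch of the indirect-eval route is shown below. The function evaluateBundle and the hand-written exampleBundle string are hypothetical stand-ins for the real loader and a real stored bundle; a production loader would also validate the resolved object before using it.

```typescript
// Hypothetical loader: evaluate a stored bundle and obtain a Promise of its instrument.
function evaluateBundle(bundle: string): Promise<unknown> {
  // Calling eval through another name makes it an indirect eval, which runs in the global scope.
  const indirectEval = eval;
  return Promise.resolve(indirectEval(bundle));
}

// Hand-written stand-in for a stored bundle (a real one is generated by the bundler).
const exampleBundle = `(async () => {
  var __exports = { kind: 'SERIES', content: [] };
  return __exports;
})();`;
```

Calling evaluateBundle(exampleBundle) yields a Promise that resolves to the object assigned to __exports inside the bundle.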