Workflow Parameters & Configuration Guide

This guide explains how parameters and compute settings work in Pennsieve workflows.

How Parameters Work

Every processor in a workflow can accept parameters that control its behavior. For example,
an alignment processor might accept a reference genome and a number of threads.

Parameters are defined by the processor author when they register their application.
When you create a workflow or run it, you can set values for these parameters.

Where Parameter Values Come From

There are three places a parameter value can come from, listed from highest to lowest priority:

  1. Run -- You provide a value when you trigger the run. This always wins.
  2. Workflow -- The workflow author sets a default when creating or editing the workflow.
  3. Application -- The processor author sets a default when registering the processor.

When a run starts, the system merges all three levels. If the same parameter is set
at multiple levels, the highest-priority value is used.

Example:

  Parameter   Application default   Workflow default   Run value   What the processor gets
  reference   hg38                  --                 mm10        mm10 (run wins)
  threads     4                     8                  --          8 (workflow wins over app)
  mode        fast                  --                 --          fast (app default used)
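
The precedence rule can be sketched in a few lines of Python. This is a minimal illustration using the values from the example above, not the platform's actual implementation. The function name resolve_params and the use of plain dictionaries are assumptions made for this sketch.

```python
def resolve_params(app_defaults, workflow_defaults, run_values):
    """Merge the three levels. Later updates overwrite earlier ones,
    so run values take priority over workflow and application defaults."""
    resolved = {}
    for level in (app_defaults, workflow_defaults, run_values):
        resolved.update(level)
    return resolved


# Values taken from the example above
app = {"reference": "hg38", "threads": "4", "mode": "fast"}
workflow = {"threads": "8"}
run = {"reference": "mm10"}

resolved = resolve_params(app, workflow, run)
print(resolved)  # {'reference': 'mm10', 'threads': '8', 'mode': 'fast'}
```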

Required vs Optional Parameters

Whether a parameter is required or optional is determined by the application author
when they register their processor. The rule is simple:

  • If the application author provides a default value for a parameter, it is optional.
    You can override it, but you don't have to.
  • If the application author does not provide a default value, the parameter is required.
    You must provide a value either in the workflow defaults or when you trigger a run.
    If it's missing, the run will be rejected.
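
As a rough sketch of that rule, the check below reports which parameters would cause a run to be rejected. The function name missing_required, the accepted list, and the sample_id parameter are hypothetical and exist only for illustration; the platform performs its own validation.

```python
def missing_required(accepted, app_defaults, workflow_defaults, run_values):
    """Return accepted parameters that have no value at any level.

    A parameter without an application default is required, so it must
    appear in the workflow defaults or in the run values.
    """
    provided = set(app_defaults) | set(workflow_defaults) | set(run_values)
    return sorted(name for name in accepted if name not in provided)


accepted = ["reference", "threads", "sample_id"]    # sample_id is hypothetical
app_defaults = {"reference": "hg38", "threads": "4"}  # no default for sample_id

print(missing_required(accepted, app_defaults, {}, {}))
# ['sample_id']
print(missing_required(accepted, app_defaults, {}, {"sample_id": "S1"}))
# []
```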

Viewing Parameters

When you view a workflow (GET /definitions/{id}), each processor node includes:

  • paramSchema -- The full list of parameters the processor accepts, including their
    types, descriptions, and default values. Use this to understand what you can configure.
  • defaultParams -- The values the workflow author has explicitly set for this workflow.
  • resolvedParams -- A preview of what the processor would receive if you ran the
    workflow right now without any overrides. This merges the workflow defaults with
    the application defaults.

Setting Parameters

When creating a workflow (POST /definitions):

Set defaults per processor node using defaultParams:

{
  "processors": [
    {
      "id": "proc-1",
      "type": "processor",
      "sourceUrl": "https://github.com/org/alignment-tool",
      "defaultParams": {
        "threads": "8",
        "reference": "hg38"
      }
    }
  ]
}

When updating a workflow (PATCH /definitions/{id}):

Update defaults for specific nodes without recreating the workflow:

{
  "defaultParams": {
    "proc-1": {
      "threads": "16"
    }
  }
}

When running a workflow (POST /runs):

Override any parameter for this specific run using processorParams:

{
  "processorParams": {
    "proc-1": {
      "reference": "mm10"
    }
  }
}

Only include the parameters you want to override. Everything else uses the
workflow or application defaults.
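
One way to keep a run request minimal is to compare the values you want against the values the workflow already resolves to, and send only the differences. The sketch below assumes you already have the resolved values per node (for example from resolvedParams). The helper name build_run_overrides is hypothetical, and any other fields the run request may need are omitted.

```python
import json


def build_run_overrides(resolved, desired):
    """Keep only parameters whose desired value differs from the resolved default."""
    return {name: value for name, value in desired.items()
            if resolved.get(name) != value}


# Illustrative values; in practice these come from the workflow definition
resolved_by_node = {"proc-1": {"reference": "hg38", "threads": "8"}}
desired_by_node = {"proc-1": {"reference": "mm10", "threads": "8"}}

processor_params = {}
for node_id, desired in desired_by_node.items():
    overrides = build_run_overrides(resolved_by_node.get(node_id, {}), desired)
    if overrides:  # skip nodes with nothing to override
        processor_params[node_id] = overrides

body = {"processorParams": processor_params}
print(json.dumps(body, indent=2))
```

Here threads already resolves to "8", so only reference is sent for proc-1.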


How Compute Settings Work

Each processor runs in a container that needs CPU, memory, and a compute type
(where it runs). These settings are determined automatically but can be overridden.

Compute Type

The compute type controls where the processor runs:

  Compute type   What it means
  standard       Runs on AWS Fargate (general purpose, supports any workload)
  lambda         Runs on AWS Lambda (fast startup, limited to shorter tasks)
  gpu            Runs on GPU-enabled Fargate (for machine learning, etc.)

How it's determined (highest to lowest priority):

  1. Run -- Set executionTarget in processorConfigs when triggering a run
  2. Workflow node -- Set computeType on the processor node when creating the workflow
  3. Default -- Falls back to "standard"

If you specify a compute type that the processor doesn't support (e.g., requesting gpu
for a processor that only supports standard), the run will be rejected.
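
The selection and validation logic can be sketched as follows. This is an illustration of the priority order described above, not the platform's code. The function name resolve_compute_type and the representation of supported types as a set are assumptions.

```python
def resolve_compute_type(supported, node_compute_type=None, run_execution_target=None):
    """Pick the compute type: run value first, then workflow node, then "standard".

    Raises ValueError when the chosen type is not supported by the processor,
    mirroring the rejection of the run.
    """
    chosen = run_execution_target or node_compute_type or "standard"
    if chosen not in supported:
        raise ValueError(
            f"compute type {chosen!r} is not supported (supported: {sorted(supported)})"
        )
    return chosen


print(resolve_compute_type({"standard", "lambda"}))                               # standard
print(resolve_compute_type({"standard", "lambda"}, node_compute_type="lambda"))   # lambda

try:
    resolve_compute_type({"standard"}, run_execution_target="gpu")
except ValueError as err:
    print("rejected:", err)
```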

CPU and Memory

CPU and memory control how much compute power the container gets.

  Setting   Typical values                 Unit
  CPU       256, 512, 1024, 2048, 4096     CPU units (1024 = 1 vCPU)
  Memory    512, 1024, 2048, 4096, 8192    MiB

How they're determined (highest to lowest priority):

  1. Run -- Set cpu and memory in processorConfigs when triggering a run
  2. Application -- The processor author sets sensible defaults when registering the processor

There are no workflow-level defaults for CPU and memory. The processor author
knows their application's resource needs best.

Example -- overriding CPU and memory for a specific run:

{
  "workflowInstanceConfiguration": {
    "workflowId": "...",
    "computeNodeId": "...",
    "processorConfigs": [
      {
        "nodeId": "proc-1",
        "executionTarget": "standard",
        "cpu": "4096",
        "memory": "8192"
      }
    ]
  }
}

If you leave cpu or memory empty, the processor's registered defaults are used.
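
The fallback behavior, together with the CPU unit conversion (1024 = 1 vCPU), can be sketched as below. The function names and the registered default values are hypothetical and chosen only for illustration.

```python
def resolve_resources(app_defaults, run_config):
    """Use the run-time cpu/memory when given; fall back to the
    processor's registered defaults when a value is missing or empty."""
    return {
        key: run_config.get(key) or app_defaults[key]
        for key in ("cpu", "memory")
    }


def cpu_units_to_vcpu(cpu_units):
    """Convert CPU units to vCPUs (1024 units = 1 vCPU)."""
    return int(cpu_units) / 1024


app_defaults = {"cpu": "1024", "memory": "2048"}  # hypothetical registered defaults
run_config = {"nodeId": "proc-1", "cpu": "4096", "memory": ""}  # memory left empty

resources = resolve_resources(app_defaults, run_config)
print(resources)                            # {'cpu': '4096', 'memory': '2048'}
print(cpu_units_to_vcpu(resources["cpu"]))  # 4.0
```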

Version

Each processor can be pinned to a specific version (a git tag or commit hash)
by setting version in processorConfigs when triggering a run.
If no version is specified, it defaults to "latest".


Summary

  Setting        Set by app author   Set by workflow author   Set at run time
  Parameters     Default values      defaultParams per node   processorParams per node
  Compute type   Supported types     computeType per node     executionTarget in processorConfigs
  CPU / Memory   Default values      --                       cpu / memory in processorConfigs
  Version        --                  --                       version in processorConfigs