Workflow Parameters & Configuration Guide
This guide explains how parameters and compute settings work in Pennsieve workflows.
How Parameters Work
Every processor in a workflow can accept parameters that control its behavior. For example,
an alignment processor might accept a reference genome and a number of threads.
Parameters are defined by the processor author when they register their application.
When you create a workflow or run it, you can set values for these parameters.
Where Parameter Values Come From
There are three places a parameter value can come from, listed from highest to lowest priority:
- Run -- You provide a value when you trigger the run. This always wins.
- Workflow -- The workflow author sets a default when creating or editing the workflow.
- Application -- The processor author sets a default when registering the processor.
When a run starts, the system merges all three levels. If the same parameter is set
at multiple levels, the highest-priority value is used.
Example:
| Parameter | Application default | Workflow default | Run value | What the processor gets |
|---|---|---|---|---|
| reference | hg38 | -- | mm10 | mm10 (run wins) |
| threads | 4 | 8 | -- | 8 (workflow wins over app) |
| mode | fast | -- | -- | fast (app default used) |
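The merge in the table above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the function name `resolve_params` and the plain-dict representation are assumptions, and the values mirror the example table (written as strings, as in the request bodies later in this guide).

```python
def resolve_params(app_defaults, workflow_defaults, run_values):
    """Merge the three levels; later dicts override earlier ones."""
    # Lowest priority first, highest priority last.
    return {**app_defaults, **workflow_defaults, **run_values}


app_defaults = {"reference": "hg38", "threads": "4", "mode": "fast"}
workflow_defaults = {"threads": "8"}
run_values = {"reference": "mm10"}

resolved = resolve_params(app_defaults, workflow_defaults, run_values)
print(resolved)
# {'reference': 'mm10', 'threads': '8', 'mode': 'fast'}
```

Because each level is unpacked in order, a key set at a higher-priority level replaces the same key from a lower one, and keys set at only one level pass through unchanged.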
Required vs Optional Parameters
Whether a parameter is required or optional is determined by the application author
when they register their processor. The rule is simple:
- If the application author provides a default value for a parameter, it is optional.
You can override it, but you don't have to.
- If the application author does not provide a default value, the parameter is required.
You must provide a value either in the workflow defaults or when you trigger a run.
If it's missing, the run will be rejected.
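As a rough sketch of the rule above, the check below reports which required parameters are still missing before a run. The function name `missing_required`, the dict shapes, and the `sample_id` parameter are hypothetical and used only for illustration; the real validation happens on the server.

```python
def missing_required(param_names, app_defaults, workflow_defaults, run_values):
    """Return the required parameters that no level supplies a value for."""
    # A parameter is required when the application registers no default for it.
    required = [name for name in param_names if name not in app_defaults]
    # Required values may come from the workflow defaults or from the run.
    provided = {**workflow_defaults, **run_values}
    return [name for name in required if name not in provided]


# Hypothetical processor: "sample_id" has no application default, so it is required.
param_names = ["reference", "threads", "sample_id"]
app_defaults = {"reference": "hg38", "threads": "4"}

print(missing_required(param_names, app_defaults, {}, {}))
# ['sample_id']
print(missing_required(param_names, app_defaults, {}, {"sample_id": "S-001"}))
# []
```

A non-empty result corresponds to the case where the run would be rejected.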
Viewing Parameters
When you view a workflow (GET /definitions/{id}), each processor node includes:
- paramSchema -- The full list of parameters the processor accepts, including their
types, descriptions, and default values. Use this to understand what you can configure.
- defaultParams -- The values the workflow author has explicitly set for this workflow.
- resolvedParams -- A preview of what the processor would receive if you ran the
workflow right now without any overrides. This merges the workflow defaults with
the application defaults.
Setting Parameters
When creating a workflow (POST /definitions):
Set defaults per processor node using defaultParams:
{
  "processors": [
    {
      "id": "proc-1",
      "type": "processor",
      "sourceUrl": "https://github.com/org/alignment-tool",
      "defaultParams": {
        "threads": "8",
        "reference": "hg38"
      }
    }
  ]
}
When updating a workflow (PATCH /definitions/{id}):
Update defaults for specific nodes without recreating the workflow:
{
  "defaultParams": {
    "proc-1": {
      "threads": "16"
    }
  }
}
When running a workflow (POST /runs):
Override any parameter for this specific run using processorParams:
{
  "processorParams": {
    "proc-1": {
      "reference": "mm10"
    }
  }
}
Only include the parameters you want to override. Everything else uses the
workflow or application defaults.
How Compute Settings Work
Each processor runs in a container that needs CPU, memory, and a compute type
(where it runs). These settings are determined automatically but can be overridden.
Compute Type
The compute type controls where the processor runs:
| Compute type | What it means |
|---|---|
| standard | Runs on AWS Fargate (general purpose, supports any workload) |
| lambda | Runs on AWS Lambda (fast startup, limited to shorter tasks) |
| gpu | Runs on GPU-enabled Fargate (for machine learning, etc.) |
How it's determined (highest to lowest priority):
- Run -- Set executionTarget in processorConfigs when triggering a run
- Workflow node -- Set computeType on the processor node when creating the workflow
- Default -- Falls back to "standard"
If you specify a compute type that the processor doesn't support (e.g., requesting gpu
for a processor that only supports standard), the run will be rejected.
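The priority order and the rejection rule above can be sketched as follows. This is only an illustration under stated assumptions: the function name `resolve_compute_type`, its arguments, and the use of `ValueError` to stand in for a rejected run are not part of the API.

```python
DEFAULT_COMPUTE_TYPE = "standard"


def resolve_compute_type(supported_types, workflow_compute_type=None,
                         run_execution_target=None):
    """Pick the compute type: run value, then workflow node value, then default."""
    chosen = run_execution_target or workflow_compute_type or DEFAULT_COMPUTE_TYPE
    # Requesting a type the processor does not support rejects the run.
    if chosen not in supported_types:
        raise ValueError(f"compute type {chosen!r} is not supported by this processor")
    return chosen


print(resolve_compute_type({"standard", "lambda"}, workflow_compute_type="lambda"))
# lambda
print(resolve_compute_type({"standard"}))
# standard
try:
    resolve_compute_type({"standard"}, run_execution_target="gpu")
except ValueError as err:
    print(f"run rejected: {err}")
```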
CPU and Memory
CPU and memory control how much compute power the container gets.
| Setting | Typical values | Unit |
|---|---|---|
| CPU | 256, 512, 1024, 2048, 4096 | CPU units (1024 = 1 vCPU) |
| Memory | 512, 1024, 2048, 4096, 8192 | MiB |
How they're determined (highest to lowest priority):
- Run -- Set cpu and memory in processorConfigs when triggering a run
- Application -- The processor author sets sensible defaults when registering the processor
There are no workflow-level defaults for CPU and memory. The processor author
knows their application's resource needs best.
Example -- overriding CPU and memory for a specific run:
{
  "workflowInstanceConfiguration": {
    "workflowId": "...",
    "computeNodeId": "...",
    "processorConfigs": [
      {
        "nodeId": "proc-1",
        "executionTarget": "standard",
        "cpu": "4096",
        "memory": "8192"
      }
    ]
  }
}
If you leave cpu or memory empty, the processor's registered defaults are used.
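A minimal sketch of that fallback, including the conversion from CPU units to vCPUs (1024 units = 1 vCPU). The function name `resolve_resources`, the shape of `app_defaults`, and the example default values are assumptions for illustration; only the `cpu` and `memory` field names come from the request body above.

```python
CPU_UNITS_PER_VCPU = 1024


def resolve_resources(app_defaults, processor_config):
    """Use run-level cpu/memory when set; otherwise fall back to app defaults."""
    # Empty strings and missing keys both count as "not set".
    cpu = processor_config.get("cpu") or app_defaults["cpu"]
    memory = processor_config.get("memory") or app_defaults["memory"]
    return {
        "cpu": int(cpu),                            # CPU units
        "memory_mib": int(memory),                  # MiB
        "vcpus": int(cpu) / CPU_UNITS_PER_VCPU,     # derived, for readability
    }


# Hypothetical registered defaults for the processor.
app_defaults = {"cpu": "1024", "memory": "2048"}
# The run overrides cpu but leaves memory empty.
run_config = {"nodeId": "proc-1", "cpu": "4096", "memory": ""}

resolved = resolve_resources(app_defaults, run_config)
print(resolved)
# {'cpu': 4096, 'memory_mib': 2048, 'vcpus': 4.0}
```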
Version
Each processor can be pinned to a specific version (a git tag or commit hash)
by setting version in processorConfigs when triggering a run.
If not specified, it defaults to "latest".
Summary
| Setting | Set by app author | Set by workflow author | Set at run time |
|---|---|---|---|
| Parameters | Default values | defaultParams per node | processorParams per node |
| Compute type | Supported types | computeType per node | executionTarget in processorConfigs |
| CPU / Memory | Default values | -- | cpu / memory in processorConfigs |
| Version | -- | -- | version in processorConfigs |