Pennsieve Analytics -- Operator Guide

Provisioning, deployment modes, logging, cost, and security for compute node operators

🚧

This documentation applies to Pennsieve Workflow Services V2, which is not enabled by default in every workspace. We expect a broad rollout in Q2 2026.

This guide covers provisioning, deployment modes, logging, cost, and security for compute node operators. It takes a close look at how Pennsieve implements compute nodes. Typically, users do not have to understand or interact with this level of infrastructure; they let Pennsieve manage the infrastructure and develop only the applications that run on it.

For processor development (container contract, environment variables, dual-mode images), see the Processor Guide.


1. Infrastructure Overview

A compute node is a self-contained AWS environment that executes data processing workflows on the Pennsieve platform. Each compute node is provisioned with Terraform and runs entirely within a single AWS account.

Architecture

Pennsieve API                      AWS Account (Compute Node)
┌──────────┐    HTTP POST     ┌─────────────────────────────────────────────────┐
│ Workflow │ ───────────────► │  Compute Gateway (Lambda)                       │
│ Service  │                  │       │                                         │
└──────────┘                  │       ▼                                         │
                              │  Step Functions (Master Executor)               │
                              │       │                                         │
                              │       ▼                                         │
                              │  ASL Converter ── registers ECS task defs       │
                              │       │            AND/OR Lambda functions      │
                              │       │            per processor image          │
                              │       ▼                                         │
                              │  Dynamic Workflow State Machine                 │
                              │       │                                         │
                              │       ├─► Init (resolve packages, get URLs)     │
                              │       ├─► Data Transfer (S3 → EFS)              │
                              │       ├─► Status: STARTED                       │
                              │       ├─► Processor Stage 1 (ECS or Lambda)     │
                              │       ├─► Processor Stage 2 (ECS or Lambda)     │
                              │       ├─► ...                                   │
                              │       ├─► Cleanup (remove EFS temp files)       │
                              │       └─► Finalize (archive logs, cost report)  │
                              │                                                 │
                              │  Shared Resources:                              │
                              │    EFS ──── persistent file system              │
                              │    ECS ──── Fargate cluster (no idle compute)   │
                              │    S3  ──── log archive bucket                  │
                              │    SM  ──── per-execution credential secrets    │
                              └─────────────────────────────────────────────────┘
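
The stage order in the diagram can be written down as a small helper. This is only an illustrative sketch: the function name workflow_states is hypothetical, the stage names are copied from the diagram, and the state machine that the ASL Converter actually generates may contain additional states (for example the ResolveToken state described under Security).

```python
def workflow_states(processor_count: int) -> list[str]:
    """Return the ordered stage names from the diagram for N processor stages."""
    if processor_count < 1:
        raise ValueError("a workflow needs at least one processor stage")
    # Fixed stages that run before any processor.
    states = ["Init", "Data Transfer", "Status: STARTED"]
    # One stage per processor, each backed by ECS or Lambda.
    states += [f"Processor Stage {i}" for i in range(1, processor_count + 1)]
    # Fixed stages that run after the last processor.
    states += ["Cleanup", "Finalize"]
    return states


if __name__ == "__main__":
    for name in workflow_states(2):
        print(name)
```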

Deployment Modes

Mode       Networking                                           Monthly Base Cost  Use Case
---------  ---------------------------------------------------  -----------------  ----------------------
Basic      Default VPC, public subnets                          ~$2-4              Development, testing
Secure     Custom VPC + NAT Gateway + Flow Logs                 ~$49-53            Production
Compliant  Custom VPC + VPC Endpoints + Flow Logs, no internet  ~$40-43            Regulated environments

Basic mode uses the account's default VPC with public subnets. ECS tasks get public IPs and can access the internet directly. This is the simplest and cheapest option, suitable for development and testing. Note that Lambda processors in basic mode do not have internet access — Lambda functions in a VPC never receive a public IP, and basic mode has no NAT Gateway. Lambda processors can still read and write files on EFS normally; only outbound internet calls (e.g., calling external APIs) are unavailable.

Secure mode provisions a dedicated VPC with public and private subnets. All processors run in private subnets behind a NAT Gateway, which gives them full internet access while keeping them isolated from other resources in the account. All network traffic (inbound and outbound) is logged via VPC Flow Logs, providing a complete audit trail of every connection — what was contacted, when, and whether it was allowed or denied. The NAT Gateway is the main cost driver (~$45/month) but can be shared across multiple compute nodes in the same account to reduce per-node cost.

Compliant mode also provisions a dedicated VPC, but with no internet access at all. Instead, all AWS service calls (S3, ECR, CloudWatch, etc.) go through VPC endpoints that keep traffic entirely within the AWS network. This is designed for regulated environments where outbound internet access is prohibited by policy. Like secure mode, all network traffic is logged via VPC Flow Logs. Processors that require external API calls will not work in this mode.

For more details, see Compute Node Deployment Modes.
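
The trade-offs between the three modes can be summarized as a lookup. The sketch below is not part of any Pennsieve tooling: ModeProfile and modes_supporting are hypothetical names, and the capability flags are simply transcribed from the mode descriptions in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeProfile:
    ecs_internet: bool      # can ECS processors reach the internet?
    lambda_internet: bool   # can Lambda processors reach the internet?
    flow_logs: bool         # is traffic recorded in VPC Flow Logs?


# Transcribed from the mode descriptions; not read from any real API.
MODES = {
    "basic": ModeProfile(ecs_internet=True, lambda_internet=False, flow_logs=False),
    "secure": ModeProfile(ecs_internet=True, lambda_internet=True, flow_logs=True),
    "compliant": ModeProfile(ecs_internet=False, lambda_internet=False, flow_logs=True),
}


def modes_supporting(*, ecs_needs_internet=False, lambda_needs_internet=False,
                     needs_flow_logs=False, internet_prohibited=False) -> list[str]:
    """Return the modes whose profile satisfies every stated requirement."""
    matches = []
    for name, profile in MODES.items():
        if ecs_needs_internet and not profile.ecs_internet:
            continue
        if lambda_needs_internet and not profile.lambda_internet:
            continue
        if needs_flow_logs and not profile.flow_logs:
            continue
        if internet_prohibited and (profile.ecs_internet or profile.lambda_internet):
            continue
        matches.append(name)
    return matches


if __name__ == "__main__":
    print(modes_supporting(lambda_needs_internet=True))
    print(modes_supporting(internet_prohibited=True))
```

An empty result means the requirements conflict, for example a processor that needs external API calls in an environment where internet access is prohibited.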


Security

Credential Isolation

Session and refresh tokens are never stored in the Step Functions state in plaintext. The compute gateway writes both tokens to an AWS Secrets Manager secret at the start of each execution (wf-session/{nodeIdentifier}/{executionRunId}). Before each processor runs, a ResolveToken state reads the secret and injects the tokens into the processor's environment. The secret is deleted by the finalizer when the workflow completes.
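
The lifecycle described above (write at start, resolve before each processor, delete at the end) can be illustrated with an in-memory stand-in for Secrets Manager. Everything except the secret name pattern is an assumption made for the example: InMemorySecretStore, run_workflow, the SESSION_TOKEN and REFRESH_TOKEN variable names, and the idea that the workflow state carries only the secret name are hypothetical and do not describe the actual gateway or finalizer code.

```python
class InMemorySecretStore:
    """Stand-in for AWS Secrets Manager that keeps secrets in a dict."""

    def __init__(self):
        self._secrets = {}

    def put(self, name, value):
        self._secrets[name] = dict(value)

    def get(self, name):
        return dict(self._secrets[name])

    def delete(self, name):
        self._secrets.pop(name, None)

    def exists(self, name):
        return name in self._secrets


def secret_name(node_identifier: str, execution_run_id: str) -> str:
    """Build the per-execution secret name using the documented pattern."""
    return f"wf-session/{node_identifier}/{execution_run_id}"


def run_workflow(store, node_identifier, execution_run_id, tokens, processors):
    name = secret_name(node_identifier, execution_run_id)
    store.put(name, tokens)            # gateway: write tokens at execution start
    state = {"secretName": name}       # assumed: state holds a reference, not tokens
    results = []
    try:
        for processor in processors:
            env = store.get(name)      # ResolveToken: read tokens just in time
            results.append(processor(env))
    finally:
        store.delete(name)             # finalizer: remove the secret when done
    return state, results


if __name__ == "__main__":
    store = InMemorySecretStore()
    tokens = {"SESSION_TOKEN": "s-123", "REFRESH_TOKEN": "r-456"}
    state, results = run_workflow(
        store, "node-1", "run-42", tokens,
        [lambda env: sorted(env), lambda env: len(env["SESSION_TOKEN"])],
    )
    print(state, results, store.exists(state["secretName"]))
```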

All deployment modes use this Secrets Manager-based credential flow. Secrets Manager encrypts the secrets at rest with the key listed for each mode in the Encryption table below.

Encryption

Resource                 Basic Mode             Secure/Compliant Mode
-----------------------  ---------------------  ---------------------
SFN state data           Default encryption     KMS CMK
SFN CloudWatch logs      Default encryption     KMS CMK
Lambda CloudWatch logs   Default encryption     KMS CMK
ECS cluster              Default encryption     KMS CMK
S3 log archive           SSE-KMS (AWS-managed)  SSE-KMS (AWS-managed)
Secrets Manager secrets  AWS-managed key        AWS-managed key
EFS                      Encrypted at rest      Encrypted at rest

In secure and compliant modes, KMS customer-managed keys (CMKs) provide CloudTrail audit trails for all encrypt/decrypt operations and allow key policy controls on who can access the data.


2. Logging Strategy

Log Sources

A single workflow execution produces logs from multiple sources:

Source                    CloudWatch Log Group                   Content
------------------------  -------------------------------------  ------------------------------------------------
ECS processor containers  /ecs/processor/{nodeId}                Processor stdout/stderr
Lambda processors         /aws/lambda/proc-lambda-{nodeId}-*     Processor stdout/stderr
Workflow init             /aws/lambda/workflow-init-*            Package resolution, URL generation
Data transfer             /aws/lambda/data-transfer-*            File downloads, merges, cleanup
Status updater            /aws/lambda/workflow-status-updater-*  Status transition confirmations
Workflow finalizer        /aws/lambda/workflow-finalizer-*       Log archival, cost report
ASL converter             /aws/lambda/workflow-asl-converter-*   Task def registration, ASL generation
Step Functions            /aws/vendedlogs/states/workflow-*      State machine transitions
VPC flow logs             /vpc/flow-logs/{nodeId}                Network traffic metadata (secure/compliant only)

Per-Processor Log Isolation

ECS processors write to a unique CloudWatch log stream scoped by processor UUID:

{processorUUID}/processor/{ecsTaskId}

Lambda processors write to their own CloudWatch log group (/aws/lambda/proc-lambda-{nodeId}-{hash}). The finalizer filters these logs by workflow instance ID to extract only the entries for the current workflow run.

Logs from different processors never mix, even when multiple processors run in parallel. When retrieving logs via the gateway API, pass applicationUuid to get only that processor's logs:

GET /logs?workflowInstanceId={id}&applicationUuid={processorUUID}
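
A request URL of this shape can be assembled with the standard library. In the sketch below, logs_url and the gateway base URL are placeholders invented for the example, and authentication is not covered; only the /logs path and the two query parameter names come from this guide. Treating applicationUuid as optional is an assumption based on the wording above.

```python
from urllib.parse import urlencode


def logs_url(gateway_base_url, workflow_instance_id, application_uuid=None):
    """Build the gateway log-retrieval URL for one workflow run."""
    params = {"workflowInstanceId": workflow_instance_id}
    if application_uuid is not None:
        # Restrict the result to a single processor's logs.
        params["applicationUuid"] = application_uuid
    return f"{gateway_base_url.rstrip('/')}/logs?{urlencode(params)}"


if __name__ == "__main__":
    # https://gateway.example.com is a placeholder, not a real endpoint.
    print(logs_url("https://gateway.example.com", "wf-1", "abc-123"))
```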

Log Archival

The workflow finalizer automatically archives logs to S3 after each execution:

s3://workflow-logs-{accountId}-{env}-{nodeId}/
  logs/{computeNodeId}/{workflowInstanceId}/
    processor-logs.json                          # all processor logs combined
    {processorUUID}/processor-logs.json          # per-processor logs
    workflow-init/lambda-logs.json               # init Lambda logs
    data-transfer/lambda-logs.json               # data transfer Lambda logs
    workflow-finalizer/lambda-logs.json          # finalizer Lambda logs
    vpc-flow-logs/flow-logs.json                 # VPC flow logs (secure/compliant only)
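
When scripting against the archive, the object keys can be derived from the layout above. This is a sketch under the assumption that the layout is exactly as listed; archive_bucket and archive_keys are hypothetical helper names, and the identifiers in the usage example are made up.

```python
def archive_bucket(account_id: str, env: str, node_id: str) -> str:
    """Bucket name following the workflow-logs-{accountId}-{env}-{nodeId} pattern."""
    return f"workflow-logs-{account_id}-{env}-{node_id}"


def archive_keys(compute_node_id, workflow_instance_id, processor_uuids,
                 include_flow_logs=False):
    """List the object keys the layout above implies for one workflow run."""
    prefix = f"logs/{compute_node_id}/{workflow_instance_id}"
    keys = [f"{prefix}/processor-logs.json"]  # all processor logs combined
    keys += [f"{prefix}/{uuid}/processor-logs.json" for uuid in processor_uuids]
    keys += [
        f"{prefix}/{source}/lambda-logs.json"
        for source in ("workflow-init", "data-transfer", "workflow-finalizer")
    ]
    if include_flow_logs:
        # Only present in secure and compliant modes.
        keys.append(f"{prefix}/vpc-flow-logs/flow-logs.json")
    return keys


if __name__ == "__main__":
    print(archive_bucket("123456789012", "prod", "node-1"))
    for key in archive_keys("node-1", "wf-1", ["abc-123"], include_flow_logs=True):
        print(key)
```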

Each log file is a JSON array of entries:

[
  {
    "timestamp": 1706000000000,
    "message": "Processing file: dataset.csv\n",
    "stream": "abc-123/processor/ee5b55a4b163470d"
  }
]
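
An archived file in this format can be read back and grouped by processor. The sketch assumes that timestamp is milliseconds since the Unix epoch and that ECS streams follow the {processorUUID}/processor/{ecsTaskId} pattern shown earlier; group_by_processor is a hypothetical helper, not part of the platform.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

# The sample entry from this section, kept as raw JSON text.
SAMPLE = r'''
[
  {
    "timestamp": 1706000000000,
    "message": "Processing file: dataset.csv\n",
    "stream": "abc-123/processor/ee5b55a4b163470d"
  }
]
'''


def group_by_processor(entries):
    """Group log entries by the processor UUID embedded in the stream name."""
    grouped = defaultdict(list)
    for entry in entries:
        # Streams that lack "/processor/" end up keyed by the full stream name.
        processor_uuid, _, task_id = entry["stream"].partition("/processor/")
        # Assumed unit: milliseconds since the epoch.
        when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
        grouped[processor_uuid].append((when, task_id, entry["message"].rstrip("\n")))
    return dict(grouped)


if __name__ == "__main__":
    for uuid, lines in group_by_processor(json.loads(SAMPLE)).items():
        for when, task_id, message in lines:
            print(uuid, task_id, when.isoformat(), message)
```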

Retention

Logs move through three storage tiers:

Tier          Location     Duration       Access Speed
------------  -----------  -------------  -------------------------------------------
Live          CloudWatch   30 days        Instant (console, API, CLI)
Warm archive  S3 Standard  90 days        Instant (S3 download)
Cold archive  S3 Glacier   Up to 7 years  Minutes to hours (restore request required)

At the end of each workflow, the finalizer copies logs from CloudWatch to S3. CloudWatch retains its own copy for 30 days (useful for quick debugging via the console), after which it expires automatically. The S3 copy stays in Standard storage for 90 days and then transitions to Glacier for long-term retention at minimal cost (~$0.004/GB/month). Glacier objects are automatically deleted after 7 years (2555 days), aligning with HIPAA data retention requirements.
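
To reason about where a given run's logs can still be found, the tiers can be expressed as a function of log age. This sketch assumes all three periods are counted from the day the workflow finished and treats each boundary as exclusive; the real expiry and transition times depend on how CloudWatch retention and the S3 lifecycle rules are evaluated. log_locations is a hypothetical name.

```python
CLOUDWATCH_DAYS = 30      # live copy in CloudWatch
S3_STANDARD_DAYS = 90     # warm archive before the Glacier transition
S3_EXPIRY_DAYS = 2555     # roughly 7 years, after which the archive is deleted


def log_locations(age_days: int) -> list[str]:
    """Return where copies of a run's logs should exist at a given age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    locations = []
    if age_days < CLOUDWATCH_DAYS:
        locations.append("CloudWatch")
    if age_days < S3_STANDARD_DAYS:
        locations.append("S3 Standard")
    elif age_days < S3_EXPIRY_DAYS:
        locations.append("S3 Glacier")
    return locations


if __name__ == "__main__":
    for age in (10, 45, 400, 3000):
        print(age, log_locations(age))
```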


3. Cost

Infrastructure Cost (Idle)

When no workflows are running, you pay only for the always-on infrastructure:

Resource                             Basic Mode  Secure Mode  Compliant Mode
-----------------------------------  ----------  -----------  --------------
EFS (empty)                          ~$0         ~$0          ~$0
EFS mount targets                    ~$0.30      ~$0.30       ~$0.30
Secrets Manager                      $0.40       $0.40        $0.40
KMS keys (SFN, ECS, log encryption)  $0          $2.00        $2.00
S3 bucket (empty)                    ~$0         ~$0          ~$0
NAT Gateway                          -           $45.00       -
VPC Endpoints (5)                    -           -            ~$36.00
VPC Flow Logs                        -           ~$1-3        ~$1-3
Total                                ~$2-4       ~$49-53      ~$40-43

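Lambda functions, ECS tasks, and Step Functions incur no charges when idle. CloudWatch charges nothing for ingestion when idle, although logs that are already stored continue to be billed for storage until they expire.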

NAT Gateways can be shared across compute nodes in the same account. With 10 nodes sharing one NAT, the per-node cost drops from $45 to $4.50.
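
The sharing arithmetic is a simple division of the fixed monthly charge. The sketch below uses the $45 figure from the table and a hypothetical function name; it covers only the fixed charge and ignores any per-GB data processing fees, which scale with traffic rather than with the number of nodes.

```python
NAT_GATEWAY_MONTHLY_USD = 45.00  # fixed monthly figure from the table above


def nat_cost_per_node(node_count: int) -> float:
    """Split the fixed NAT Gateway charge evenly across the nodes sharing it."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return NAT_GATEWAY_MONTHLY_USD / node_count


if __name__ == "__main__":
    for nodes in (1, 2, 5, 10):
        print(f"{nodes:>2} node(s): ${nat_cost_per_node(nodes):.2f} per node")
```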

VPC Flow Logs cost depends on traffic volume ($0.50/GB ingested to CloudWatch). The estimate above assumes light to moderate workflow traffic. Flow logs capture metadata only (source, destination, port, protocol, action) — not packet contents. Logs are retained for 90 days.

Per-Workflow Execution Cost

Each workflow execution incurs pay-per-use charges. The finalizer calculates a cost estimate automatically and logs it as COST_ESTIMATE.

Cost components:

Component             Rate                          Typical Cost
--------------------  ----------------------------  -------------------------------
ECS Fargate (vCPU)    $0.04048/vCPU-hour            Depends on processor duration
ECS Fargate (memory)  $0.004445/GB-hour             Depends on processor duration
Lambda invocations    $0.20 per 1M requests         ~$0.000001 per workflow
Lambda compute        $0.0000166667/GB-second       ~$0.000625 per workflow
Step Functions        $0.000025/transition          ~$0.000175-0.000375 per workflow
CloudWatch Logs       $0.50/GB ingested             Depends on processor output
EFS throughput        $0.03/GB read, $0.06/GB write Depends on data volume

Example: simple 2-processor workflow

A workflow with two sequential processors, each running for 60 seconds on the default 0.5 vCPU / 1 GB configuration, processing 100 MB of input data:

Component                           Cost
----------------------------------  --------
ECS Fargate (2 tasks, 60s each)     ~$0.0019
Lambda (7 invocations, ~37.5 GB-s)  ~$0.0006
Step Functions (~9 transitions)     ~$0.0002
CloudWatch Logs (~110 KB)           ~$0.0001
EFS throughput (~300 MB)            ~$0.0003
Total                               ~$0.003

At 1,000 workflows per month with this profile, the execution cost is roughly $3/month on top of the infrastructure base.
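
Two of the components in the example, Lambda and Step Functions, can be reproduced directly from the published rates, as can the monthly projection. The sketch below is limited to those, because their inputs (invocations, GB-seconds, transitions) are fully stated in the example. The function names are hypothetical, and this is not the finalizer's actual calculation.

```python
LAMBDA_USD_PER_REQUEST = 0.20 / 1_000_000    # $0.20 per 1M requests
LAMBDA_USD_PER_GB_SECOND = 0.0000166667
SFN_USD_PER_TRANSITION = 0.000025


def lambda_cost(invocations: int, gb_seconds: float) -> float:
    """Request charge plus compute charge for the workflow's Lambda functions."""
    return invocations * LAMBDA_USD_PER_REQUEST + gb_seconds * LAMBDA_USD_PER_GB_SECOND


def step_functions_cost(transitions: int) -> float:
    """Charge for state transitions in the workflow's state machine."""
    return transitions * SFN_USD_PER_TRANSITION


def monthly_execution_cost(per_workflow_usd: float, workflows_per_month: int) -> float:
    """Scale a per-workflow estimate to a monthly execution cost."""
    return per_workflow_usd * workflows_per_month


if __name__ == "__main__":
    print(f"Lambda:         ${lambda_cost(7, 37.5):.6f}")
    print(f"Step Functions: ${step_functions_cost(9):.6f}")
    print(f"Monthly:        ${monthly_execution_cost(0.003, 1000):.2f}")
```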

Cost Tracking

Every workflow execution gets a cost estimate in the finalizer logs:

{
  "ecs": {"taskCount": 2, "totalSeconds": 120, "estimatedCost": 0.001892},
  "lambda": {"invocationCount": 7, "estimatedCost": 0.000626},
  "stepFunctions": {"stateTransitions": 9, "estimatedCost": 0.000225},
  "cloudWatchLogs": {"estimatedCost": 0.000054},
  "efsThroughput": {"estimatedCost": 0.000027},
  "totalEstimatedCost": 0.002824
}
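
A quick consistency check on an estimate like this one is to confirm that the component costs add up to the reported total. The sketch assumes the JSON object has already been extracted from the finalizer log entry (the exact log line format is not specified here); component_total is a hypothetical helper.

```python
import json
import math

# The sample estimate from this section, kept as JSON text.
COST_ESTIMATE = '''
{
  "ecs": {"taskCount": 2, "totalSeconds": 120, "estimatedCost": 0.001892},
  "lambda": {"invocationCount": 7, "estimatedCost": 0.000626},
  "stepFunctions": {"stateTransitions": 9, "estimatedCost": 0.000225},
  "cloudWatchLogs": {"estimatedCost": 0.000054},
  "efsThroughput": {"estimatedCost": 0.000027},
  "totalEstimatedCost": 0.002824
}
'''


def component_total(estimate: dict) -> float:
    """Sum estimatedCost across every component object in the estimate."""
    return sum(
        component["estimatedCost"]
        for component in estimate.values()
        if isinstance(component, dict)
    )


if __name__ == "__main__":
    estimate = json.loads(COST_ESTIMATE)
    total = component_total(estimate)
    matches = math.isclose(total, estimate["totalEstimatedCost"], abs_tol=1e-9)
    print(f"components sum to ${total:.6f}; matches reported total: {matches}")
```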

All resources are tagged with ComputeNodeId and Environment for cost allocation in AWS Cost Explorer.