File Handling in the Hub

Overview

The Zenoo Hub includes a file handling system that lets client applications upload, cache, and manage files as part of execution workflows. The file caching architecture is designed to:
  • Improve user experience - Users can upload files separately from form submissions, providing faster perceived performance
  • Simplify workflow processing - Components work with file descriptors instead of multipart data
  • Ensure execution isolation - Files are strictly scoped to their owning execution with authorization enforcement
  • Enable handoff scenarios - Files can be transferred between parent and child executions (e.g., mobile handoffs)
  • Provide automatic cleanup - Files are automatically removed when executions expire or terminate

Architecture

The file handling system consists of several key components:
  • FileCache - Core service managing file storage, retrieval, and lifecycle
  • Storage Backends - Pluggable storage implementations (Redis or local filesystem)
  • Redis Caches - Two cache structures for file metadata and execution-file mappings
  • REST API - Client-facing endpoints for upload, download, and metadata queries
  • Cleanup Processors - Kafka Streams processors handling automatic file deletion

Execution Ownership Model

Files are always owned by exactly one execution:
  • Files uploaded with a JWT token are associated with the execution in that token
  • Execution ownership determines access rights (users can only download files from their execution)
  • Ownership can be transferred to parent executions in handoff scenarios
  • When an execution terminates or expires, all its files are automatically deleted

File Lifecycle

1. Upload Phase

When a file is uploaded via POST /api/files/cache:
  1. Client submits multipart file with valid JWT token
  2. Hub extracts execution UUID from JWT
  3. Apache Tika detects the file’s MIME type
  4. File bytes are stored in the active storage backend (Redis or local filesystem)
  5. File metadata is cached in CachedFileCache with execution TTL
  6. File UUID is added to ExecutionFileCache list for the execution
  7. A FileDescriptor is returned to the client containing:
    • uuid - Unique file identifier
    • fileName - Original filename
    • mimeType - Detected MIME type
    • size - File size in bytes
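The bookkeeping in steps 4 to 7 can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the Map-based stores stand in for the storage backend, CachedFileCache, and ExecutionFileCache, and `createFileStore`, `cacheFile`, and the injected `newUuid` generator are hypothetical names, not Hub APIs. MIME detection (Tika in the Hub) is assumed to have happened already and is passed in as a value.

```javascript
// In-memory stand-ins for the three structures described above (assumption:
// the real Hub keeps these in Redis or on disk, not in process memory).
function createFileStore() {
  return {
    content: new Map(),        // file UUID -> bytes (storage backend)
    metadata: new Map(),       // file UUID -> { execution, descriptor } (CachedFileCache)
    executionFiles: new Map()  // execution UUID -> [file UUID] (ExecutionFileCache)
  };
}

// Hypothetical helper mirroring upload steps 4 to 7.
function cacheFile(store, { executionUuid, fileName, mimeType, bytes }, newUuid) {
  const uuid = newUuid();
  const descriptor = { uuid, fileName, mimeType, size: bytes.length };

  store.content.set(uuid, bytes);
  store.metadata.set(uuid, { execution: executionUuid, descriptor });

  // Append the file to the list owned by this execution.
  const owned = store.executionFiles.get(executionUuid) ?? [];
  store.executionFiles.set(executionUuid, [...owned, uuid]);

  return descriptor;
}
```

The returned object has the same four fields as the FileDescriptor listed above, which is all a client needs to reference the file later.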

2. Caching Duration

Files inherit the TTL (time-to-live) of their owning execution:
  • Default TTL: 1 hour (configurable via hub.execution.expiration)
  • Redis storage: Automatic expiration when TTL expires
  • Local storage: No TTL-based expiration; files stay on disk until the cleanup processor deletes them

3. File Transfer (Handoff Scenarios)

In mobile handoff workflows where a mobile execution needs to transfer files to its parent:
  1. Child execution uploads files during its workflow
  2. Child execution terminates with its propagate field set to the parent execution UUID
  3. ExecutionCleanupProcessor calls fileCache.transferFiles(childUuid, parentUuid)
  4. Each file is rebuilt with new execution ownership:
    • Metadata updated with parent execution UUID
    • Cache entries updated
    • File removed from child’s ExecutionFileCache list
    • File added to parent’s ExecutionFileCache list
  5. Transfer completes synchronously before cleanup proceeds
File transfer has a 2-minute timeout and uses retry logic (150 retries x 200ms) for cache operations.
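The ownership change in step 4 can be sketched with two plain Maps. This is an illustrative model, not the Hub's fileCache.transferFiles implementation: the function signature, the Map shapes, and the choice to skip files whose metadata is already gone are all assumptions. Timeouts and retries are left out.

```javascript
// Hypothetical in-memory model of ownership transfer.
// metadata: Map<fileUuid, { execution, descriptor }>
// executionFiles: Map<executionUuid, fileUuid[]>
function transferFiles(metadata, executionFiles, childUuid, parentUuid) {
  const moving = executionFiles.get(childUuid) ?? [];
  const parentFiles = [...(executionFiles.get(parentUuid) ?? [])];

  for (const fileUuid of moving) {
    const entry = metadata.get(fileUuid);
    if (!entry) continue; // metadata already gone; nothing to re-own

    // Rebuild the metadata entry with the parent as owner.
    metadata.set(fileUuid, { ...entry, execution: parentUuid });
    parentFiles.push(fileUuid);
  }

  executionFiles.set(parentUuid, parentFiles);
  executionFiles.delete(childUuid); // the child no longer lists any files
  return parentFiles;
}
```

Because the child's list is emptied as part of the transfer, a later cleanup pass over the child finds nothing to delete in this model.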

4. Automatic Cleanup

Files are automatically deleted when their owning execution terminates or expires:

Expiration-Based Cleanup

  • ExecutionExpirationProcessor runs every 60 seconds
  • Scans execution state store for expired executions (age > TTL)
  • Emits ExecutionExpiredEvent for each expired execution
  • Triggers cleanup via ExecutionCleanupProcessor
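The expiry test in the scan step (age > TTL) amounts to a simple filter. The sketch below assumes that age is measured from a creation timestamp and that each record carries `uuid` and `createdAtMs` fields; both field names and the function name are hypothetical.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000; // the documented default TTL of 1 hour

// Returns the UUIDs of executions whose age strictly exceeds the TTL.
function findExpiredExecutions(executions, nowMs, ttlMs = ONE_HOUR_MS) {
  return executions
    .filter((execution) => nowMs - execution.createdAtMs > ttlMs)
    .map((execution) => execution.uuid);
}
```

Since the processor runs every 60 seconds, an execution may outlive its TTL by up to about a minute before the scan picks it up.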

Termination-Based Cleanup

  • When an execution reaches a terminal state (completed/failed), an ExecutionTerminatedEvent is emitted
  • Triggers cleanup via ExecutionCleanupProcessor

Cleanup Process

When ExecutionCleanupProcessor receives an expiration or termination event:
  1. Transfer files to parent (if propagate UUID is set):
    • Blocks to ensure transfer completes
    • 2-minute timeout per transfer operation
    • Continues cleanup even if transfer fails
  2. Delete all execution files:
    • Retrieves all file UUIDs from ExecutionFileCache
    • For each file (with 30-second individual timeout):
      • Deletes file bytes from storage backend
      • Removes metadata from CachedFileCache
    • Clears ExecutionFileCache list for execution
    • Total cleanup timeout: 5 minutes
    • Continues on individual file errors (logged)
  3. Remove callbacks (unrelated to files)
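The "continue on individual file errors" behaviour of step 2 can be sketched as a loop that records failures instead of aborting. This is a simplified, synchronous model: `cleanupExecutionFiles` and the injected `deleteFile` callback are hypothetical, and the 30-second and 5-minute timeouts are omitted.

```javascript
// Deletes every file owned by an execution and keeps going when a single
// deletion fails. executionFiles: Map<executionUuid, fileUuid[]>.
function cleanupExecutionFiles(executionUuid, executionFiles, deleteFile) {
  const fileUuids = executionFiles.get(executionUuid) ?? [];
  const failures = [];

  for (const fileUuid of fileUuids) {
    try {
      deleteFile(fileUuid); // stands in for removing bytes and metadata
    } catch (error) {
      failures.push({ fileUuid, message: error.message }); // recorded, not fatal
    }
  }

  executionFiles.delete(executionUuid); // clear the list even after failures
  return { deleted: fileUuids.length - failures.length, failures };
}
```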

Storage Backends

The Hub supports two pluggable storage backends for file content:

Redis File Storage

Profile: redis-file-storage
Stores file bytes directly in Redis:
  • Key pattern: {prefix}:file-content:{uuid}
  • TTL support: Yes, automatic expiration
  • Use case: Single-server deployments, development environments
  • Limitations: Redis memory constraints, not suitable for large files or high volumes
Configuration:
spring:
  profiles:
    active: redis-file-storage
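When inspecting Redis by hand, it can help to build the content key from the documented pattern. The helper below is a trivial sketch; the function name is hypothetical and the prefix depends on the deployment, so the value used in the usage note is a placeholder rather than a known default.

```javascript
// Builds the Redis key for a file's bytes: {prefix}:file-content:{uuid}
function fileContentKey(prefix, uuid) {
  return `${prefix}:file-content:${uuid}`;
}
```

For example, `fileContentKey('hub', '89828e1e-c834-42a2-86f1-893209f63ab5')` yields a key whose remaining lifetime can then be checked with the Redis TTL command.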

Local File Storage

Profile: Default (when redis-file-storage is not active)
Stores files in the local filesystem:
  • Directory: Configurable via hub.uploader.cache.dir (default: ./file-cache)
  • TTL support: No automatic expiration
  • Use case: Development, single-server deployments
  • Limitations: Not suitable for multi-instance deployments (files not shared)
Configuration:
hub:
  uploader:
    cache:
      dir: ./cache/
Note: For production multi-instance deployments, integrate a cloud storage backend (S3, Azure Blob, GCS) by implementing the FileStorage interface.

Configuration and Constraints

File Size Limits

Max In-Memory Size: hub.gateway.max-in-memory
  • Default: 10MB
  • Controls Spring WebFlux codec max in-memory size
  • Limits multipart file upload size
  • Larger files will be rejected with HTTP 413 (Payload Too Large)
Configuration:
hub:
  gateway:
    max-in-memory: 10485760  # 10MB in bytes
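A client can check the file size before uploading and avoid a rejected request. The sketch below assumes the default limit; `checkUploadSize` is a hypothetical helper. Whether the Hub applies the limit to the file part alone or to the whole multipart request is not stated here, so treat the client-side check as approximate.

```javascript
const DEFAULT_MAX_IN_MEMORY_BYTES = 10 * 1024 * 1024; // 10485760, the documented default

// Returns a user-facing error message, or null when the file fits.
function checkUploadSize(sizeBytes, limitBytes = DEFAULT_MAX_IN_MEMORY_BYTES) {
  if (sizeBytes <= limitBytes) return null;
  const limitMb = (limitBytes / (1024 * 1024)).toFixed(1);
  return `File is too large: limit is ${limitMb} MB`;
}
```

In a browser this would typically be called with `file.size` from a file input before building the FormData.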

Retention Policy

Execution Expiration: hub.execution.expiration
  • Default: 1h (1 hour)
  • Files are automatically deleted when execution expires
  • Redis storage expires file content through its TTL; local storage relies on the cleanup processor to delete files
Configuration:
hub:
  execution:
    expiration: 2h  # Keep files for 2 hours

JWT Token Expiration

Token TTL: jwt.expiration
  • Default: 1800 seconds (30 minutes)
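  • Must be less than the execution expiration (hub.execution.expiration); the 3600-second example below therefore assumes an execution expiration above 1h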
  • Tokens become invalid before files are cleaned up
Configuration:
jwt:
  expiration: 3600  # 1 hour
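The constraint between the two settings can be checked mechanically. The sketch below is an assumption-laden helper, not part of the Hub: `durationToSeconds` only understands a simple number-plus-unit form such as "1h" or "30m", which may be narrower than what the Hub's configuration accepts.

```javascript
// Converts values such as "30m", "1h", or "2h" to seconds.
// Only a simple <number><s|m|h|d> form is handled here (assumption).
function durationToSeconds(value) {
  const match = /^(\d+)([smhd])$/.exec(value.trim());
  if (!match) throw new Error(`Unsupported duration: ${value}`);
  const unitSeconds = { s: 1, m: 60, h: 3600, d: 86400 };
  return Number(match[1]) * unitSeconds[match[2]];
}

// True when the token expires strictly before the execution does.
function isTokenTtlValid(jwtExpirationSeconds, executionExpiration) {
  return jwtExpirationSeconds < durationToSeconds(executionExpiration);
}
```

With the defaults (1800 seconds and 1h) the check passes; a 3600-second token only passes once the execution expiration is raised, for example to 2h.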

Cache Retry Configuration

File cache operations use retry logic for eventual consistency:
  • Max retries: 150
  • Backoff: 200ms fixed
  • Total timeout: 30 seconds
  • Progressive logging at 10, 50, 100 iterations
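The numbers above fit together as 150 × 200 ms = 30 seconds. A fixed-backoff polling loop with the same parameters might look like the sketch below; the function names are hypothetical, and the idea that a retry waits for a cache entry to become visible is an assumption about what "eventual consistency" means here.

```javascript
// Values taken from the list above.
const RETRY_CONFIG = { maxRetries: 150, backoffMs: 200, logAt: [10, 50, 100] };

// Worst-case time spent waiting between attempts.
function totalRetryBudgetMs({ maxRetries, backoffMs }) {
  return maxRetries * backoffMs;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls `lookup` until it yields a value, waiting a fixed interval between tries.
async function retryUntilPresent(lookup, config = RETRY_CONFIG, wait = sleep) {
  for (let attempt = 1; attempt <= config.maxRetries; attempt++) {
    const value = await lookup();
    if (value != null) return value;

    // Log only at the configured thresholds to avoid flooding the logs.
    if (config.logAt.includes(attempt)) {
      console.warn(`Cache entry still missing after ${attempt} attempts`);
    }
    await wait(config.backoffMs);
  }
  throw new Error(`Gave up after ${config.maxRetries} attempts`);
}
```

A fixed interval keeps the worst-case wait easy to reason about, which is why the total timeout is simply the product of the two settings.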

REST API Endpoints

Upload File

POST /api/files/cache

Uploads a new file to cache. Requires valid JWT token with execution context.
Request
Headers:
  • Authorization: Bearer <jwt-token> (required)
  • Content-Type: multipart/form-data (required)
Form Parameters:
  • file: Multipart representation of uploaded file (required)
Example:
POST /api/files/cache HTTP/1.1
Host: hub.example.com
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary

------WebKitFormBoundary
Content-Disposition: form-data; name="file"; filename="document.pdf"
Content-Type: application/pdf

[binary file content]
------WebKitFormBoundary--
Response
201 Created - File successfully cached
{
  "uuid": "89828e1e-c834-42a2-86f1-893209f63ab5",
  "fileName": "document.pdf",
  "mimeType": "application/pdf",
  "size": 123123
}
Response Fields:
  • uuid - Unique identifier for the cached file (use this in form submissions)
  • fileName - Original filename from upload
  • mimeType - Detected MIME type (via Apache Tika)
  • size - File size in bytes
Error Responses:
  • 400 Bad Request - Invalid file upload or missing file parameter
  • 401 Unauthorized - Invalid or missing JWT token
  • 413 Payload Too Large - File exceeds hub.gateway.max-in-memory limit

Get File Metadata

GET /api/files/cache/{uuid}

Returns file descriptor and metadata from cache. Requires valid JWT token.
Request
Path Parameters:
  • uuid - UUID of file in cache (required)
Headers:
  • Authorization: Bearer <jwt-token> (required)
Example:
GET /api/files/cache/89828e1e-c834-42a2-86f1-893209f63ab5 HTTP/1.1
Host: hub.example.com
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
Response
200 OK - File metadata retrieved
{
  "execution": "59bb233b-2d2b-41e7-ad13-f17c14513603",
  "uri": "/api/files/cache/89828e1e-c834-42a2-86f1-893209f63ab5",
  "descriptor": {
    "uuid": "89828e1e-c834-42a2-86f1-893209f63ab5",
    "fileName": "document.pdf",
    "mimeType": "application/pdf",
    "size": 123123
  }
}
Response Fields:
  • execution - UUID of execution that owns this file
  • uri - API endpoint for downloading file content
  • descriptor - File metadata (uuid, fileName, mimeType, size)
Error Responses:
  • 401 Unauthorized - Invalid or missing JWT token
  • 404 Not Found - File descriptor not found or expired
Timeout: 25 seconds (returns 504 Gateway Timeout if exceeded)

Download File

GET /api/files/{uuid}

Downloads file content. Requires valid JWT token and enforces execution ownership.
Request
Path Parameters:
  • uuid - UUID of file to download (required)
Headers:
  • Authorization: Bearer <jwt-token> (required)
Example:
GET /api/files/89828e1e-c834-42a2-86f1-893209f63ab5 HTTP/1.1
Host: hub.example.com
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
Response
200 OK - File content returned
Headers:
  • Content-Type - MIME type from file descriptor
  • Content-Disposition: attachment; filename="document.pdf" - Triggers browser download
  • Content-Length - File size in bytes
Body: Binary file content
Error Responses:
  • 401 Unauthorized - Invalid or missing JWT token
  • 404 Not Found - File not found, expired, or execution ownership mismatch
Security Note: Users can only download files owned by their execution. Attempting to download files from other executions returns 404 (not 403) to prevent information disclosure.
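The status-code rules for the download endpoint can be summarised as a small decision function. This is a sketch of the documented behaviour, not the Hub's controller code: `downloadStatus` and its arguments are hypothetical, and token verification is assumed to have happened before the call.

```javascript
// tokenExecution: execution UUID taken from a verified JWT, or null when the
//                 token is missing or invalid.
// file: cached metadata entry ({ execution, descriptor }), or undefined.
function downloadStatus(tokenExecution, file) {
  if (!tokenExecution) return 401;
  if (!file) return 404;                             // unknown or expired file
  if (file.execution !== tokenExecution) return 404; // hide files of other executions
  return 200;
}
```

Returning the same 404 for "missing" and "owned by someone else" means a caller cannot probe which file UUIDs exist in other executions.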

Using Files in Workflows

File Descriptors in DSL

Files are represented as FileDescriptor objects in Hub DSL workflows:
// Route with file validation
route('upload-documents') {
    uri '/upload'
    namespace documents
    validate {
        file  // Required by default
    }
}

// Optional file upload
route('upload-optional') {
    uri '/upload-optional'
    namespace docs
    validate {
        file {
            optional()
        }
    }
}

// Terminal route that exports the uploaded file
route('complete') {
    uri '/complete'
    terminal(documents)  // Exports all data from 'documents' namespace
}

// Alternative: explicit export with custom keys
route('result') {
    uri '/result'
    export uploadedFile: documents, status: 'success'
    terminal()
}

Client Workflow

1. Upload File Separately

// Upload file first using multipart form data
const formData = new FormData();
formData.append('file', fileBlob, 'document.pdf');

const uploadResponse = await fetch('/api/files/cache', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${jwtToken}`
  },
  body: formData
});

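// Fail fast on error statuses (400, 401, 413) before parsing the body
if (!uploadResponse.ok) {
  throw new Error(`Upload failed with HTTP ${uploadResponse.status}`);
}

const fileDescriptor = await uploadResponse.json();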
// FileDescriptor: { uuid: "89828e1e-...", fileName: "document.pdf", mimeType: "application/pdf", size: 123123 }

2. Submit Route with File Descriptor

// Get current route from execution
const routeResponse = await fetch(`/api/request/${executionUuid}/route`, {
  headers: {
    'Authorization': `Bearer ${jwtToken}`
  }
});
const currentRoute = await routeResponse.json();

// Submit route with file descriptor (not file content)
const submitResponse = await fetch(`/api/request/${executionUuid}/route/${currentRoute.uuid}/submit`, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${jwtToken}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    file: fileDescriptor  // Reference to cached file
  })
});

3. Query File Metadata

// Query file metadata from cache
const metadataResponse = await fetch(`/api/files/cache/${fileDescriptor.uuid}`, {
  headers: {
    'Authorization': `Bearer ${jwtToken}`
  }
});

const cachedFile = await metadataResponse.json();
// CachedFileResource: { execution: "59bb233b-...", uri: "/api/files/cache/...", descriptor: {...} }

4. Download File Content

// Download file bytes
const downloadResponse = await fetch(`/api/files/${fileDescriptor.uuid}`, {
  headers: {
    'Authorization': `Bearer ${jwtToken}`
  }
});

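if (!downloadResponse.ok) {
  throw new Error(`Download failed with HTTP ${downloadResponse.status}`);
}

const blob = await downloadResponse.blob();
// Create a temporary link to trigger the browser download
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = fileDescriptor.fileName;
a.click();
URL.revokeObjectURL(url);  // Release the object URL once the download has been triggered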

Automatic Base64 Processing

The Hub automatically detects and processes Base64-encoded image data in payloads:
// Client submits Base64 image
fetch('/api/execute', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    payload: {
      photo: "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."  // Long Base64 string
    }
  })
});

// Hub automatically converts to FileDescriptor
// Component receives:
// ctx.photo = { uuid: "...", fileName: "image.png", mimeType: "image/png", size: 45678 }
Requirements for automatic processing:
  • String length > 100 characters
  • Valid Base64 character set
  • Decodes to valid image MIME type (image/*)
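The three requirements can be approximated on the client to predict whether a value will be converted. The sketch below is a rough stand-in, not the Hub's detection logic: it applies the length rule to the whole string, accepts an optional data URI prefix, and recognises images by the leading bytes of three common formats instead of running full MIME detection. The function name and the signature table are assumptions.

```javascript
const BASE64_PATTERN = /^[A-Za-z0-9+/]+={0,2}$/;

// Leading bytes of a few common image formats (illustrative subset).
const IMAGE_SIGNATURES = {
  'image/png': [0x89, 0x50, 0x4e, 0x47],
  'image/jpeg': [0xff, 0xd8, 0xff],
  'image/gif': [0x47, 0x49, 0x46, 0x38]
};

// Returns the detected image MIME type, or null if the value does not qualify.
function detectBase64Image(value) {
  if (typeof value !== 'string' || value.length <= 100) return null;

  // Drop an optional data URI prefix such as "data:image/png;base64,".
  const payload = value.replace(/^data:[^;,]+;base64,/, '');
  if (!BASE64_PATTERN.test(payload)) return null;

  let binary;
  try {
    // The first 16 characters decode to 12 bytes, enough for the signature check.
    binary = atob(payload.slice(0, 16));
  } catch {
    return null;
  }

  for (const [mimeType, signature] of Object.entries(IMAGE_SIGNATURES)) {
    if (signature.every((byte, i) => binary.charCodeAt(i) === byte)) return mimeType;
  }
  return null;
}
```

A value that fails this check may still be handled differently by the Hub, so use the result as a hint rather than a guarantee.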

File Transfer in Handoff Scenarios

When using mobile handoffs, files uploaded in the child execution can be transferred to the parent:
// Parent component that creates a handoff execution
static Component PARENT = ComponentBuilder.component('parent') {
    dependencies {
        component 'child'
    }

    target {
        payload ->
            // Create handoff execution
            sharable(payload) {
                workflow('upload-workflow@child')
                namespace childData
            }

            // Wait for child to complete
            await(childData, "2m")

            // Access transferred files
            route('result') {
                uri '/result'
                export childData  // Includes transferred file
                terminal()
            }
    }
}

// Child component with file upload workflow
static Component CHILD = ComponentBuilder.component('child') {
    workflow('upload-workflow') {
        payload ->
            route('upload') {
                uri '/upload'
                namespace uploaded
                validate {
                    file  // File is required
                }
            }

            route('process') {
                uri '/process'
                terminal(uploaded)  // Returns file to parent
            }
    }
}
When the child execution completes with a handoff:
  1. Files are transferred from child to parent execution
  2. Parent can access files via the namespace (e.g., childData.file)
  3. Child execution’s files are cleaned up
  4. Parent owns the files until its expiration/termination
Accessing transferred files in parent:
// In the parent's result route export
def childFile = route.export.childData.file
// childFile.uuid - file UUID
// childFile.fileName - original filename
// childFile.mimeType - MIME type
// childFile.size - file size in bytes

Security and Best Practices

Authorization Model

  • JWT-based authentication - All file operations require valid JWT token
  • Execution ownership - Files are strictly scoped to their owning execution
  • Cross-execution isolation - Users cannot access files from other executions
  • Ownership enforcement - Download endpoint validates execution ownership on every request

Security Considerations

MIME Type Detection:
  • Apache Tika automatically detects MIME types from file content
  • Prevents MIME type spoofing attacks
  • MIME type stored in descriptor for download Content-Type header
File Validation:
  • Implement validation in DSL routes to enforce file requirements
  • Check file size, MIME type, and filename in validation functions
  • Reject files that don’t meet business requirements before processing
Storage Security:
  • Redis storage should use authentication and TLS in production
  • Local file storage directory should have restricted permissions
  • Consider implementing virus scanning for uploaded files in connectors
Token Expiration:
  • JWT tokens should expire before execution expiration
  • Prevents token reuse after execution completes
  • Default: 30 minutes (configurable)

Best Practices

Upload Strategy:
  • Upload files separately before form submission for better UX
  • Show upload progress indicators to users
  • Validate file size client-side before upload
  • Handle upload failures gracefully with retry logic
Workflow Design:
  • Use file descriptors in payloads, not file content
  • Validate files early in workflow (first route after upload)
  • Export file descriptors in terminal routes for client access
  • Consider file size when designing workflows (large files impact performance)
Error Handling:
  • Handle 404 errors when accessing expired files
  • Implement client-side caching of file descriptors
  • Provide clear user feedback for upload failures
  • Retry failed uploads with exponential backoff
Performance:
  • Keep file sizes reasonable (under 10MB default limit)
  • Use appropriate storage backend for deployment model
  • Monitor Redis memory usage if using Redis storage
  • Consider CDN for frequently accessed files
Cleanup:
  • Files are automatically cleaned up - no manual intervention needed
  • Execution expiration controls file retention duration
  • Increase execution TTL if users need longer file access
  • Monitor storage usage and adjust TTL if needed
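The advice to retry failed uploads with exponential backoff could be implemented along these lines. This is a sketch, not a prescribed client: the helper names, the base delay, the cap, the attempt count, and the policy of retrying only 5xx responses are all assumptions. Network-level failures (where fetch itself rejects) are not retried here.

```javascript
// Retrying 400, 401, or 413 cannot help because the same request would fail
// again, so only server-side errors are treated as retryable (assumption).
function isRetryableStatus(status) {
  return status >= 500;
}

// Delay before the next try after the given attempt (1-based):
// baseMs * 2^(attempt - 1), capped at capMs.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

async function uploadWithRetry(formData, jwtToken, maxAttempts = 4) {
  for (let attempt = 1; ; attempt++) {
    const response = await fetch('/api/files/cache', {
      method: 'POST',
      headers: { Authorization: `Bearer ${jwtToken}` },
      body: formData
    });
    if (response.ok) return response.json();

    if (!isRetryableStatus(response.status) || attempt >= maxAttempts) {
      throw new Error(`Upload failed with HTTP ${response.status}`);
    }
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}
```

With these defaults the waits between attempts are 500 ms, 1 s, and 2 s, which keeps the total delay short enough to fit inside the default 30-minute token lifetime.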

Monitoring and Troubleshooting

Key Metrics to Monitor

  • Upload failure rate - High rate indicates size limits or storage issues
  • File cache hit/miss ratio - Cache effectiveness
  • Storage backend latency - Redis or filesystem performance
  • Cleanup processor lag - Indicates backlog in file deletion
  • Storage usage - Redis memory or disk space consumption

Common Issues

Files disappear unexpectedly:
  • Check execution expiration configuration (hub.execution.expiration)
  • Verify execution hasn’t terminated prematurely
  • Check Redis TTL settings and memory eviction policy
Upload failures:
  • Verify file size is under hub.gateway.max-in-memory limit
  • Check Redis connectivity and authentication
  • Ensure storage directory exists and has write permissions (local storage)
  • Monitor Redis memory usage (may be full)
Slow file operations:
  • Check Redis latency and network issues
  • Monitor filesystem I/O (local storage)
  • Check cache retry logs for repeated retries
  • Check execution cleanup processor lag
404 on file download:
  • Verify JWT token contains correct execution UUID
  • Check if file has expired (execution TTL)
  • Confirm file was successfully uploaded (check logs)
  • Validate execution ownership (cannot download other execution’s files)
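To check which execution a token belongs to, the JWT payload can be decoded locally. The sketch below does not verify the signature and is meant for debugging only. The name of the claim that carries the execution UUID is not given in this document, so the helper returns the whole payload for inspection; `decodeJwtPayload` is a hypothetical name.

```javascript
// Decodes the payload (second segment) of a JWT WITHOUT verifying the signature.
// For debugging only; never use the result for authorization decisions.
function decodeJwtPayload(token) {
  const segments = token.split('.');
  if (segments.length !== 3) throw new Error('Not a JWT: expected three segments');

  // JWT segments use base64url; convert to standard Base64 for atob.
  const base64 = segments[1].replace(/-/g, '+').replace(/_/g, '/');
  const bytes = Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

Comparing the decoded payload with the `execution` field returned by the metadata endpoint shows quickly whether a 404 comes from an ownership mismatch.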

Logging

File cache operations log at various levels:
  • INFO: File upload, transfer, and deletion completions
  • WARN: Retry attempts exceeding thresholds (10, 50, 100 iterations)
  • ERROR: Storage backend failures, timeout errors, transfer failures
Relevant log categories:
  • com.zenoo.hub.uploader.cache.FileCacheBean - Core file operations
  • com.zenoo.hub.execution.ExecutionCleanupProcessor - Cleanup and transfer
  • com.zenoo.hub.cache.RedisTTLCacheSupport - Cache retry operations

Implementation References

Core Classes

  • FileCache - Backend interface for file operations (backend/src/main/java/com/zenoo/hub/uploader/cache/FileCache.java)
  • FileCacheBean - Main implementation (backend/src/main/java/com/zenoo/hub/uploader/cache/FileCacheBean.java)
  • ExecutionCleanupProcessor - Cleanup orchestration (backend/src/main/java/com/zenoo/hub/execution/ExecutionCleanupProcessor.java)
  • FileDescriptor - DSL model (backend/src/main/java/com/zenoo/hub/uploader/api/FileDescriptor.groovy)

REST Controllers

  • FileCacheAPI - Upload endpoint (backend/src/main/java/com/zenoo/hub/gateway/api/uploader/FileCacheAPI.java)
  • CachedFileAPI - Metadata query (backend/src/main/java/com/zenoo/hub/gateway/api/uploader/CachedFileAPI.java)
  • FileAPI - Download endpoint (backend/src/main/java/com/zenoo/hub/gateway/api/uploader/FileAPI.java)

Storage Backends

  • RedisFileStorage - Redis implementation (backend/src/main/java/com/zenoo/hub/uploader/storage/RedisFileStorage.java)
  • LocalFileStorage - Filesystem implementation (backend/src/main/java/com/zenoo/hub/uploader/storage/LocalFileStorage.java)

Cache Structures

  • CachedFileCache - File metadata cache (backend/src/main/java/com/zenoo/hub/cache/CachedFileCacheBean.java)
  • ExecutionFileCache - Execution-file mapping (backend/src/main/java/com/zenoo/hub/cache/ExecutionFileCacheBean.java)