File Handling in the Hub
Overview
The Zenoo Hub provides a comprehensive file handling system that allows client applications to upload, cache, and manage files as part of execution workflows. The file caching architecture is designed to:
- Improve user experience - Users can upload files separately from form submissions, providing faster perceived performance
- Simplify workflow processing - Components work with file descriptors instead of multipart data
- Ensure execution isolation - Files are strictly scoped to their owning execution with authorization enforcement
- Enable handoff scenarios - Files can be transferred between parent and child executions (e.g., mobile handoffs)
- Provide automatic cleanup - Files are automatically removed when executions expire or terminate
Architecture
The file handling system consists of several key components:
- FileCache - Core service managing file storage, retrieval, and lifecycle
- Storage Backends - Pluggable storage implementations (Redis or local filesystem)
- Redis Caches - Two cache structures for file metadata and execution-file mappings
- REST API - Client-facing endpoints for upload, download, and metadata queries
- Cleanup Processors - Kafka Streams processors handling automatic file deletion
Execution Ownership Model
Files are always owned by exactly one execution:
- Files uploaded with a JWT token are associated with the execution in that token
- Execution ownership determines access rights (users can only download files from their execution)
- Ownership can be transferred to parent executions in handoff scenarios
- When an execution terminates or expires, all its files are automatically deleted
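The isolation rules above boil down to a single ownership check: the execution UUID on the file must match the execution UUID in the caller's token. A minimal sketch in Python (the Hub's backend is Java; the names here are illustrative, not the actual API):

```python
from dataclasses import dataclass

@dataclass
class CachedFile:
    uuid: str
    execution: str  # UUID of the owning execution

def can_download(file: CachedFile, token_execution: str) -> bool:
    """A file may be downloaded only by the execution that owns it."""
    return file.execution == token_execution
```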
File Lifecycle
1. Upload Phase
When a file is uploaded via POST /api/files/cache:
- Client submits multipart file with valid JWT token
- Hub extracts execution UUID from JWT
- Apache Tika detects the file’s MIME type
- File bytes are stored in the active storage backend (Redis or local filesystem)
- File metadata is cached in CachedFileCache with the execution TTL
- The file UUID is added to the ExecutionFileCache list for the execution
- A FileDescriptor is returned to the client containing:
  - uuid - Unique file identifier
  - fileName - Original filename
  - mimeType - Detected MIME type
  - size - File size in bytes
2. Caching Duration
Files inherit the TTL (time-to-live) of their owning execution:
- Default TTL: 1 hour (configurable via hub.execution.expiration)
- Redis storage: automatic expiration when the TTL expires
- Local storage: no automatic expiration; requires manual cleanup
3. File Transfer (Handoff Scenarios)
In mobile handoff workflows where a mobile execution needs to transfer files to its parent:
- The child execution uploads files during its workflow
- When the child execution terminates with a propagate field set to the parent execution UUID, ExecutionCleanupProcessor calls fileCache.transferFiles(childUuid, parentUuid)
- Each file is rebuilt with new execution ownership:
  - Metadata updated with the parent execution UUID
  - Cache entries updated
  - File removed from the child's ExecutionFileCache list
  - File added to the parent's ExecutionFileCache list
- Transfer completes synchronously before cleanup proceeds
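The transfer steps above can be sketched with a small in-memory model (illustrative only; the real caches are Redis-backed and the Java API differs):

```python
# Minimal in-memory model of the handoff transfer described above.
# Cache names mirror the doc (CachedFileCache, ExecutionFileCache);
# the dict-based structures here are illustrative, not the Hub's API.

cached_file_cache = {}      # file uuid -> metadata dict (incl. owning execution)
execution_file_cache = {}   # execution uuid -> list of file uuids

def transfer_files(child_uuid: str, parent_uuid: str) -> None:
    """Re-own every file of the child execution to its parent."""
    for file_uuid in execution_file_cache.pop(child_uuid, []):
        meta = cached_file_cache[file_uuid]
        meta["execution"] = parent_uuid          # metadata updated with parent UUID
        execution_file_cache.setdefault(parent_uuid, []).append(file_uuid)
```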
4. Automatic Cleanup
Files are automatically deleted when their owning execution terminates or expires:
Expiration-Based Cleanup
- ExecutionExpirationProcessor runs every 60 seconds
- Scans the execution state store for expired executions (age > TTL)
- Emits an ExecutionExpiredEvent for each expired execution
- Triggers cleanup via ExecutionCleanupProcessor
Termination-Based Cleanup
- When an execution reaches a terminal state (completed/failed), an ExecutionTerminatedEvent is emitted
- Triggers cleanup via ExecutionCleanupProcessor
Cleanup Process
When ExecutionCleanupProcessor receives an expiration or termination event:
1. Transfer files to parent (if a propagate UUID is set):
   - Blocks to ensure the transfer completes
   - 2-minute timeout per transfer operation
   - Continues cleanup even if the transfer fails
2. Delete all execution files:
   - Retrieves all file UUIDs from ExecutionFileCache
   - For each file (with a 30-second individual timeout):
     - Deletes file bytes from the storage backend
     - Removes metadata from CachedFileCache
   - Clears the ExecutionFileCache list for the execution
   - Total cleanup timeout: 5 minutes
   - Continues on individual file errors (logged)
3. Remove callbacks (unrelated to files)
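The continue-on-error behaviour of the file-deletion step can be sketched as follows (a Python illustration; the per-file and total timeouts are omitted for brevity, and the storage/cache objects are hypothetical stand-ins):

```python
import logging

def cleanup_execution_files(execution_uuid, execution_file_cache,
                            cached_file_cache, storage):
    """Delete every file owned by an execution, continuing past
    individual failures, as the cleanup process above describes."""
    for file_uuid in execution_file_cache.get(execution_uuid, []):
        try:
            storage.delete(file_uuid)               # delete file bytes
            cached_file_cache.pop(file_uuid, None)  # remove metadata
        except Exception:
            # An individual failure is logged, then cleanup continues.
            logging.exception("cleanup failed for file %s", file_uuid)
    execution_file_cache.pop(execution_uuid, None)  # clear the mapping list
```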
Storage Backends
The Hub supports two pluggable storage backends for file content:
Redis File Storage
Profile: redis-file-storage
Stores file bytes directly in Redis:
- Key pattern: {prefix}:file-content:{uuid}
- TTL support: yes, automatic expiration
- Use case: single-server deployments, development environments
- Limitations: Redis memory constraints; not suitable for large files or high volumes
Local File Storage
Profile: default (when redis-file-storage is not active)
Stores files in the local filesystem:
- Directory: configurable via hub.uploader.cache.dir (default: ./file-cache)
- TTL support: no automatic expiration
- Use case: development, single-server deployments
- Limitations: not suitable for multi-instance deployments (files are not shared)
Both backends implement the FileStorage interface.
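The pluggable-backend idea can be sketched as a small interface plus a filesystem implementation (a Python analogue; the real FileStorage interface is Java, and the store/load/delete method names here are assumptions):

```python
# Sketch of the pluggable FileStorage idea: one abstract contract,
# multiple interchangeable backends. Method names are illustrative.
import abc
import os
import tempfile

class FileStorage(abc.ABC):
    @abc.abstractmethod
    def store(self, uuid: str, data: bytes) -> None: ...
    @abc.abstractmethod
    def load(self, uuid: str) -> bytes: ...
    @abc.abstractmethod
    def delete(self, uuid: str) -> None: ...

class LocalFileStorage(FileStorage):
    """Filesystem backend, analogous to the hub.uploader.cache.dir setting."""
    def __init__(self, directory: str):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
    def _path(self, uuid: str) -> str:
        return os.path.join(self.directory, uuid)
    def store(self, uuid, data):
        with open(self._path(uuid), "wb") as f:
            f.write(data)
    def load(self, uuid):
        with open(self._path(uuid), "rb") as f:
            return f.read()
    def delete(self, uuid):
        os.remove(self._path(uuid))
```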
Configuration and Constraints
File Size Limits
Max In-Memory Size: hub.gateway.max-in-memory
- Default: 10MB
- Controls the Spring WebFlux codec max in-memory size
- Limits multipart file upload size
- Larger files are rejected with HTTP 413 (Payload Too Large)
Retention Policy
Execution Expiration: hub.execution.expiration
- Default: 1h (1 hour)
- Files are automatically deleted when the execution expires
- Redis storage respects the TTL; local storage requires the cleanup processor
JWT Token Expiration
Token TTL: jwt.expiration
- Default: 1800 seconds (30 minutes)
- Must be less than the execution expiration
- Tokens become invalid before files are cleaned up
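Taken together, the three settings above might appear in a Spring-style configuration file like this (property names come from this page; the exact YAML nesting is an assumption):

```yaml
# Illustrative application.yml fragment tying the settings together.
hub:
  execution:
    expiration: 1h          # execution (and file) TTL
  gateway:
    max-in-memory: 10MB     # multipart upload size limit
jwt:
  expiration: 1800          # seconds; keep below hub.execution.expiration
```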
Cache Retry Configuration
File cache operations use retry logic for eventual consistency:
- Max retries: 150
- Backoff: 200ms fixed
- Total timeout: 30 seconds
- Progressive logging at 10, 50, 100 iterations
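That policy (150 attempts x 200 ms fixed backoff = 30 seconds total, with progress logging at 10, 50, and 100 iterations) can be sketched as a simple loop (illustrative Python, not the Hub's Java implementation):

```python
import time

def with_retries(fetch, max_retries=150, backoff=0.2):
    """Retry a cache read until it returns a value, mirroring the
    documented policy: 150 attempts x 200 ms backoff = 30 s total,
    with progress logging at iterations 10, 50, and 100."""
    for attempt in range(1, max_retries + 1):
        value = fetch()
        if value is not None:
            return value
        if attempt in (10, 50, 100):
            print(f"cache read still pending after {attempt} attempts")
        time.sleep(backoff)
    raise TimeoutError("cache value not available after max retries")
```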
REST API Endpoints
Upload File
POST /api/files/cache
Uploads a new file to the cache. Requires a valid JWT token with execution context.
Request
Headers:
- Authorization: Bearer <jwt-token> (required)
- Content-Type: multipart/form-data (required)
Body:
- file: multipart representation of the uploaded file (required)
Response
201 Created - File successfully cached:
- uuid - Unique identifier for the cached file (use this in form submissions)
- fileName - Original filename from the upload
- mimeType - Detected MIME type (via Apache Tika)
- size - File size in bytes
400 Bad Request - Invalid file upload, file too large, or missing file parameter
401 Unauthorized - Invalid or missing JWT token
413 Payload Too Large - File exceeds the hub.gateway.max-in-memory limit
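A successful upload response carrying the FileDescriptor might look like this (field names from the list above; the values are invented):

```json
{
  "uuid": "3f8a9c1e-5b7d-4a2e-9c41-8f0d6e2b1a57",
  "fileName": "document.pdf",
  "mimeType": "application/pdf",
  "size": 482133
}
```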
Get File Metadata
GET /api/files/cache/
Returns the file descriptor and metadata from the cache. Requires a valid JWT token.
Request
Path Parameters:
- uuid - UUID of the file in cache (required)
Headers:
- Authorization: Bearer <jwt-token> (required)
Response
200 OK - File metadata retrieved:
- execution - UUID of the execution that owns this file
- uri - API endpoint for downloading the file content
- descriptor - File metadata (uuid, fileName, mimeType, size)
401 Unauthorized - Invalid or missing JWT token
404 Not Found - File descriptor not found or expired
Download File
GET /api/files/
Downloads file content. Requires a valid JWT token and enforces execution ownership.
Request
Path Parameters:
- uuid - UUID of the file to download (required)
Headers:
- Authorization: Bearer <jwt-token> (required)
Response
200 OK - File content returned
Headers:
- Content-Type - MIME type from the file descriptor
- Content-Disposition: attachment; filename="document.pdf" - Triggers browser download
- Content-Length - File size in bytes
401 Unauthorized - Invalid or missing JWT token
404 Not Found - File not found, expired, or execution ownership mismatch
Using Files in Workflows
File Descriptors in DSL
Files are represented as FileDescriptor objects in Hub DSL workflows:
Client Workflow
1. Upload File Separately
2. Submit Route with File Descriptor
3. Query File Metadata
4. Download File Content
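Steps 1 and 3 of the workflow above can be sketched with the standard library (the base URL is a placeholder, and treating the file UUID as a path segment of the metadata endpoint is an assumption; the HTTP functions are defined but not executed here, since they need a running Hub):

```python
import json
import urllib.request

BASE = "https://hub.example.com"  # placeholder, not a real Hub URL

def build_multipart(boundary: str, filename: str, data: bytes) -> bytes:
    """Assemble a single-part multipart/form-data body for the `file` field."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    return head + data + f"\r\n--{boundary}--\r\n".encode()

def upload_file(jwt: str, filename: str, data: bytes) -> dict:
    """Step 1: POST /api/files/cache; returns the FileDescriptor JSON."""
    boundary = "hub-upload-boundary"
    req = urllib.request.Request(
        f"{BASE}/api/files/cache",
        data=build_multipart(boundary, filename, data),
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def file_metadata(jwt: str, file_uuid: str) -> dict:
    """Step 3: query metadata for a cached file (uuid-in-path is assumed)."""
    req = urllib.request.Request(
        f"{BASE}/api/files/cache/{file_uuid}",
        headers={"Authorization": f"Bearer {jwt}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```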
Automatic Base64 Processing
The Hub automatically detects and processes Base64-encoded image data in payloads when all of the following hold:
- String length > 100 characters
- Valid Base64 character set
- Decodes to a valid image MIME type (image/*)
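The three criteria can be re-implemented as a quick illustration (the Hub uses Apache Tika for the MIME check; this sketch substitutes a magic-byte test for a few common image formats):

```python
import base64
import binascii

# Magic bytes for PNG, JPEG, and GIF -- a stand-in for Tika's detection.
IMAGE_MAGIC = (b"\x89PNG", b"\xff\xd8\xff", b"GIF8")

def looks_like_base64_image(value: str) -> bool:
    if len(value) <= 100:                      # rule 1: length > 100
        return False
    try:                                       # rule 2: valid Base64
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return raw.startswith(IMAGE_MAGIC)         # rule 3: image content
```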
File Transfer in Handoff Scenarios
When using mobile handoffs, files uploaded in the child execution can be transferred to the parent:
- Files are transferred from the child to the parent execution
- The parent can access files via the namespace (e.g., childData.file)
- The child execution's files are cleaned up
- Parent owns the files until its expiration/termination
Security and Best Practices
Authorization Model
- JWT-based authentication - All file operations require valid JWT token
- Execution ownership - Files are strictly scoped to their owning execution
- Cross-execution isolation - Users cannot access files from other executions
- Ownership enforcement - Download endpoint validates execution ownership on every request
Security Considerations
MIME Type Detection:
- Apache Tika automatically detects MIME types from file content
- Prevents MIME type spoofing attacks
- The MIME type is stored in the descriptor for the download Content-Type header
File Validation:
- Implement validation in DSL routes to enforce file requirements
- Check file size, MIME type, and filename in validation functions
- Reject files that don't meet business requirements before processing
Storage Security:
- Redis storage should use authentication and TLS in production
- The local file storage directory should have restricted permissions
- Consider implementing virus scanning for uploaded files in connectors
Token Expiration:
- JWT tokens should expire before the execution expiration
- This prevents token reuse after the execution completes
- Default: 30 minutes (configurable)
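The validation guidance above might translate into a check like this (an illustrative sketch; real validation lives in DSL routes, and the MIME allowlist is an example policy, not a Hub default, while the 10MB figure mirrors hub.gateway.max-in-memory):

```python
MAX_SIZE = 10 * 1024 * 1024          # mirrors the 10MB default upload limit
ALLOWED_MIME = {"application/pdf", "image/png", "image/jpeg"}  # example policy

def validate_descriptor(descriptor: dict) -> list[str]:
    """Return a list of validation errors for a FileDescriptor payload."""
    errors = []
    if descriptor.get("size", 0) > MAX_SIZE:
        errors.append("file too large")
    if descriptor.get("mimeType") not in ALLOWED_MIME:
        errors.append("unsupported MIME type")
    if not descriptor.get("fileName"):
        errors.append("missing filename")
    return errors
```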
Best Practices
Upload Strategy:
- Upload files separately before form submission for better UX
- Show upload progress indicators to users
- Validate file size client-side before upload
- Handle upload failures gracefully with retry logic
Workflow Design:
- Use file descriptors in payloads, not file content
- Validate files early in the workflow (first route after upload)
- Export file descriptors in terminal routes for client access
- Consider file size when designing workflows (large files impact performance)
Error Handling:
- Handle 404 errors when accessing expired files
- Implement client-side caching of file descriptors
- Provide clear user feedback for upload failures
- Retry failed uploads with exponential backoff
Storage and Performance:
- Keep file sizes reasonable (under the 10MB default limit)
- Use the appropriate storage backend for your deployment model
- Monitor Redis memory usage if using Redis storage
- Consider a CDN for frequently accessed files
Retention:
- Files are automatically cleaned up - no manual intervention needed
- Execution expiration controls file retention duration
- Increase the execution TTL if users need longer file access
- Monitor storage usage and adjust TTL if needed
Monitoring and Troubleshooting
Key Metrics to Monitor
- Upload failure rate - High rate indicates size limits or storage issues
- File cache hit/miss ratio - Cache effectiveness
- Storage backend latency - Redis or filesystem performance
- Cleanup processor lag - Indicates backlog in file deletion
- Storage usage - Redis memory or disk space consumption
Common Issues
Files disappear unexpectedly:
- Check the execution expiration configuration (hub.execution.expiration)
- Verify the execution hasn't terminated prematurely
- Check Redis TTL settings and the memory eviction policy
Upload failures:
- Verify the file size is under the hub.gateway.max-in-memory limit
- Check Redis connectivity and authentication
- Ensure the storage directory exists and has write permissions (local storage)
- Monitor Redis memory usage (it may be full)
Slow file operations:
- Check Redis latency and network issues
- Monitor filesystem I/O (local storage)
- Verify cache retry logs (they may indicate repeated retries)
- Check execution cleanup processor lag
Download failures:
- Verify the JWT token contains the correct execution UUID
- Check whether the file has expired (execution TTL)
- Confirm the file was successfully uploaded (check logs)
- Validate execution ownership (you cannot download another execution's files)
Logging
File cache operations log at various levels:
- INFO: file upload, transfer, and deletion completions
- WARN: retry attempts exceeding thresholds (10, 50, 100 iterations)
- ERROR: storage backend failures, timeout errors, transfer failures
Relevant loggers:
- com.zenoo.hub.uploader.cache.FileCacheBean - Core file operations
- com.zenoo.hub.execution.ExecutionCleanupProcessor - Cleanup and transfer
- com.zenoo.hub.cache.RedisTTLCacheSupport - Cache retry operations
Implementation References
Core Classes
- FileCache - Backend interface for file operations (backend/src/main/java/com/zenoo/hub/uploader/cache/FileCache.java)
- FileCacheBean - Main implementation (backend/src/main/java/com/zenoo/hub/uploader/cache/FileCacheBean.java)
- ExecutionCleanupProcessor - Cleanup orchestration (backend/src/main/java/com/zenoo/hub/execution/ExecutionCleanupProcessor.java)
- FileDescriptor - DSL model (backend/src/main/java/com/zenoo/hub/uploader/api/FileDescriptor.groovy)
REST Controllers
- FileCacheAPI - Upload endpoint (backend/src/main/java/com/zenoo/hub/gateway/api/uploader/FileCacheAPI.java)
- CachedFileAPI - Metadata query (backend/src/main/java/com/zenoo/hub/gateway/api/uploader/CachedFileAPI.java)
- FileAPI - Download endpoint (backend/src/main/java/com/zenoo/hub/gateway/api/uploader/FileAPI.java)
Storage Backends
- RedisFileStorage - Redis implementation (backend/src/main/java/com/zenoo/hub/uploader/storage/RedisFileStorage.java)
- LocalFileStorage - Filesystem implementation (backend/src/main/java/com/zenoo/hub/uploader/storage/LocalFileStorage.java)
Cache Structures
- CachedFileCache - File metadata cache (backend/src/main/java/com/zenoo/hub/cache/CachedFileCacheBean.java)
- ExecutionFileCache - Execution-file mapping (backend/src/main/java/com/zenoo/hub/cache/ExecutionFileCacheBean.java)