Local Resources
This page describes what data the AI Kit stores on your servers when you run it on-premise — the storage areas, what lives in each, and how to protect them.
What is stored locally
For an on-premise installation, the AI Kit keeps all application data on the host filesystem (or whichever volume you mounted as its data directory). There is no separate database server to install or back up.
The data falls into a few categories:
Configuration
For each workspace, the AI Kit stores:
- Workflow and agent definitions.
- Knowledge configurations (sources, refresh schedules, embedding model references).
- Model configurations (provider, model ID, the encrypted API key).
- Persona definitions.
- OAuth provider configurations (the issuer, the scopes, but not the per-user tokens).
This is the "source code" of the workspace — small in size, high in value.
Job history
For each run of each automation:
- The full sequence of steps that ran.
- The values written to memory at each step.
- Logs of each step's execution.
- A status (running, completed, failed, waiting).
- Metrics (tokens, duration, cost estimate).
Job history grows with usage. The platform automatically prunes finished workflow jobs older than a configurable retention period. Agent jobs are kept forever by design — they are intended as a permanent record.
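The retention rule above can be illustrated with a small sketch. This is not the platform's implementation: the `Job` record, its field names, and the assumption that "finished" means a status of completed or failed are all hypothetical and serve only to make the rule concrete.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "finished" covers these two of the four documented statuses.
FINISHED_STATUSES = {"completed", "failed"}


@dataclass
class Job:
    kind: str                        # "workflow" or "agent" (hypothetical field)
    status: str                      # running, completed, failed, waiting
    finished_at: Optional[datetime]  # None while the job has not finished


def should_prune(job: Job, now: datetime, retention: timedelta) -> bool:
    """Return True if the job falls under the pruning rule described above."""
    if job.kind != "workflow":
        return False  # agent jobs are kept forever
    if job.status not in FINISHED_STATUSES or job.finished_at is None:
        return False  # running and waiting jobs are never pruned
    return now - job.finished_at > retention
```

With a 30-day retention period, a workflow job that completed five months ago is pruned, while an agent job of the same age is kept.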
Knowledge content
For each knowledge base:
- The chunks extracted from each source.
- The embeddings (vectors) for those chunks.
- A small index used to make queries fast.
Knowledge content is by far the largest category in most installations. A knowledge base with a few thousand documents can occupy several gigabytes.
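A rough back-of-the-envelope estimate shows where the gigabytes come from. Every number below is an assumption chosen for illustration (5,000 documents, 100 chunks per document, 1,536-dimensional embeddings stored as 4-byte floats); your embedding model and chunking settings will give different figures, and the chunk text and index add to the total.

```python
def embedding_bytes(documents, chunks_per_document, dimensions, bytes_per_value=4):
    """Estimate raw storage for the embedding vectors alone."""
    return documents * chunks_per_document * dimensions * bytes_per_value


# Illustrative numbers only; replace them with your own.
size = embedding_bytes(documents=5_000, chunks_per_document=100, dimensions=1_536)
print(f"{size / 1e9:.1f} GB")  # prints "3.1 GB"
```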
Authentication
The AI Kit keeps its own authentication data, stored alongside the rest of the application data:
- User records (email, name, language, hashed passwords, registered Passkeys).
- Workspace memberships and roles.
- One-time codes during their (very short) lifetime.
Secrets
Every field marked as a secured field across the platform (API keys, SMTP passwords, database passwords, OAuth tokens) is encrypted with a workspace-specific key before being written to disk. Even an attacker who reads the raw files cannot recover these values without also obtaining the encryption key.
Where on disk things live
The AI Kit reads two environment variables for its data paths:
- AIKIT_CONFIG_DIR: workspace configuration. Defaults to /_config inside the container.
- AIKIT_JOB_DIR: job history and knowledge content. Defaults to /_job inside the container.
In a Docker deployment, both are typically mounted under a single host directory (for example /opt/aikit/data/config and /opt/aikit/data/job).
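Helper scripts (for backups or health checks, say) can resolve the two paths the same way. The sketch below is a minimal example, not part of the AI Kit: the function name `resolve_data_dirs` is hypothetical, and treating an empty variable as unset is an assumption that may differ from the platform's own behavior.

```python
import os
from pathlib import Path

# Defaults as documented above (paths inside the container).
DEFAULT_CONFIG_DIR = "/_config"
DEFAULT_JOB_DIR = "/_job"


def resolve_data_dirs(env=None):
    """Return (config_dir, job_dir), falling back to the documented defaults.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    config_dir = Path(env.get("AIKIT_CONFIG_DIR") or DEFAULT_CONFIG_DIR)
    job_dir = Path(env.get("AIKIT_JOB_DIR") or DEFAULT_JOB_DIR)
    return config_dir, job_dir


if __name__ == "__main__":
    config_dir, job_dir = resolve_data_dirs()
    print(f"config: {config_dir}")
    print(f"job:    {job_dir}")
```

On the host, the values to pass are the mounted host directories (for example /opt/aikit/data/config), not the in-container defaults.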
A useful rule of thumb:
- Config is small, stable, and irreplaceable. Back it up.
- Job history is large, growing, and replaceable. Back it up if you need the history (agent jobs in particular are meant as a permanent record); otherwise focus elsewhere.
- Knowledge content is large and expensive to re-create (re-embedding costs money and time). Back it up.
Backups
A simple backup strategy is enough for most installations:
- Stop the AI Kit (or rely on a filesystem snapshot if your storage supports atomic snapshots).
- Copy the data volume to a backup location.
- Restart the AI Kit.
Recovery is the reverse: restore the volume, restart, done.
The platform does not require any complex multi-step backup procedure. There is no separate database to dump, no separate index to rebuild.
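The stop, copy, restart sequence can be scripted in a few lines. The following is a sketch under stated assumptions rather than an official tool: `backup_data_volume` and the archive naming are hypothetical, and how you stop and restart the AI Kit depends on your deployment, so those steps are passed in as callables.

```python
import tarfile
import time
from pathlib import Path


def backup_data_volume(data_dir, backup_dir, before=None, after=None):
    """Archive data_dir into a timestamped .tar.gz under backup_dir.

    `before` and `after` are optional callables that stop and restart the
    AI Kit. `after` runs even if the copy fails, so the service is not
    left stopped. backup_dir must live outside data_dir.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive = backup_dir / f"aikit-data-{stamp}.tar.gz"

    if before is not None:
        before()
    try:
        with tarfile.open(archive, "w:gz") as tar:
            # arcname keeps paths in the archive relative to the volume root.
            tar.add(data_dir, arcname=data_dir.name)
    finally:
        if after is not None:
            after()
    return archive
```

A call such as `backup_data_volume("/opt/aikit/data", "/mnt/backup/aikit")` would archive the example host directory mentioned earlier; the backup path here is only a placeholder.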
📷 SCREENSHOT: A simple architecture diagram showing the AI Kit container with two mounted volumes labeled "config" and "job", and a backup arrow pointing to an external backup target.
File permissions
The AI Kit's user inside the container owns the data files. On the host:
- The mounted directories should be owned by a user that maps to the container's user (usually UID 1000).
- They should be readable and writable only by that user. Avoid world-readable mounts.
- They should not be accessible to other unrelated services on the same host.
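These rules are easy to check from a script on the host. The sketch below is illustrative only: `check_mount_permissions` is a hypothetical helper, UID 1000 is just the usual value mentioned above, and the check looks at the top-level directory rather than walking every file beneath it.

```python
import os
import stat


def check_mount_permissions(path, expected_uid=1000):
    """Return a list of human-readable problems found for a mounted directory."""
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append(f"{path}: owned by UID {info.st_uid}, expected {expected_uid}")
    # Any group or other permission bit lets another account reach the data.
    if info.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        mode = stat.S_IMODE(info.st_mode)
        problems.append(f"{path}: mode {mode:o} grants access beyond the owner")
    # The owner needs read, write and traverse permission on the directory.
    if (info.st_mode & stat.S_IRWXU) != stat.S_IRWXU:
        problems.append(f"{path}: owner lacks read, write or execute permission")
    return problems
```

An empty list means the directory is owned by the expected user and closed to everyone else (mode 700).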
Encryption at rest
The AI Kit encrypts secured fields with an application-level key. For storage-level encryption (the whole volume), use the operating system or the storage system:
- Linux: LUKS / dm-crypt for the partition.
- Cloud storage: most cloud providers encrypt volumes by default. Confirm this in your provider's volume settings.
Both layers are useful. Application-level encryption protects the secured fields if a copy of the data files leaks (for example, a misplaced backup); storage-level encryption protects everything on the volume if a disk is stolen.
Recommendations
- ✅ Use a filesystem snapshot (ZFS, Btrfs, LVM) for cheap, frequent backups of the data volume.
- ✅ Encrypt the host's storage at rest. The AI Kit's own encryption is a complement, not a replacement.
- ✅ Document where backups live, who has access, and how to restore. The day you need them is not the day to figure this out.
- ✅ Test a restore at least once. Untested backups are wishes, not backups.
- ⚠️ Some files inside the data volume change continuously while the AI Kit runs. Without filesystem snapshots, take backups during a brief stop-the-world window for consistency.
- ⚠️ Knowledge content can be re-embedded from the original sources if lost, but re-embedding has a real cost. Prefer to back up.
- ❌ Do not edit files inside the data volume by hand. Use the platform's user interface or its export/import features.
- ❌ Do not store backups on the same disk as the production volume. A failed disk takes both with it.
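The last recommendation can be partly automated. As a minimal sketch (the helper name `on_same_device` is hypothetical), a script can compare the device IDs reported by the operating system for the data volume and the backup target. A match means both sit on the same filesystem. A mismatch is necessary but not sufficient: two partitions of one physical disk also report different device IDs.

```python
import os


def on_same_device(path_a, path_b):
    """Return True if both paths live on the same filesystem (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

A backup script could refuse to run when `on_same_device(data_dir, backup_dir)` returns True.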
What to do next
- Network Access — the network side of an on-premise installation.
- Best Practices → Logging and Compliance — what is written to logs.