Storage Backends

Artifact Keeper supports multiple storage backends to accommodate different deployment scenarios and scale requirements.

Storage Backend Types

Filesystem Storage

The default backend stores artifacts on the local filesystem or network-attached storage.

Configuration

STORAGE_BACKEND=filesystem
STORAGE_PATH=/var/lib/artifact-keeper/artifacts

Advantages

  • Simple setup, no external dependencies
  • Predictable performance
  • Easy to back up with standard tools
  • Works well with NFS/NAS for shared storage

Limitations

  • Scaling requires network storage
  • No built-in redundancy
  • Manual backup procedures

Directory Structure

/var/lib/artifact-keeper/artifacts/
├── repositories/
│   ├── repo-{id}/
│   │   ├── packages/
│   │   │   ├── {package-name}/
│   │   │   │   ├── {version}/
│   │   │   │   │   ├── {artifact-file}
│   │   │   │   │   └── metadata.json
├── temp/     # Temporary upload staging
└── cache/    # Downloaded edge cache

Permissions

Ensure the backend process has read/write access:

sudo mkdir -p /var/lib/artifact-keeper/artifacts
sudo chown -R artifact-keeper:artifact-keeper /var/lib/artifact-keeper
sudo chmod -R 750 /var/lib/artifact-keeper/artifacts
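
To confirm access before starting the service, you can probe the directory as the service user; this assumes the artifact-keeper account created above:

sudo -u artifact-keeper test -w /var/lib/artifact-keeper/artifacts \
  && echo "writable" || echo "not writable"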

S3-Compatible Storage

Use Amazon S3 or compatible object storage (MinIO, Wasabi, DigitalOcean Spaces, etc.) for cloud-native deployments.

Configuration

STORAGE_BACKEND=s3
S3_BUCKET=artifact-keeper-prod
S3_REGION=us-east-1
S3_ENDPOINT=https://s3.amazonaws.com # Optional, for S3-compatible services
S3_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
S3_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
S3_PATH_PREFIX=artifacts/ # Optional, prefix for all keys

AWS S3

STORAGE_BACKEND=s3
S3_BUCKET=my-artifact-bucket
S3_REGION=us-west-2
# Use IAM roles for credentials (recommended)
# Or set S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY
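
If you use an IAM role, attach a policy granting the backend access to the bucket. The exact action list your deployment needs may vary; this sketch covers the object, listing, and multipart operations described on this page:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::my-artifact-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:ListBucketMultipartUploads"],
      "Resource": "arn:aws:s3:::my-artifact-bucket"
    }
  ]
}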

MinIO (Self-Hosted)

STORAGE_BACKEND=s3
S3_BUCKET=artifacts
S3_REGION=us-east-1
S3_ENDPOINT=https://minio.example.com
S3_ACCESS_KEY_ID=minioadmin
S3_SECRET_ACCESS_KEY=minioadmin
S3_FORCE_PATH_STYLE=true # Required for MinIO

DigitalOcean Spaces

STORAGE_BACKEND=s3
S3_BUCKET=my-spaces-bucket
S3_REGION=nyc3
S3_ENDPOINT=https://nyc3.digitaloceanspaces.com
S3_ACCESS_KEY_ID=your-spaces-key
S3_SECRET_ACCESS_KEY=your-spaces-secret

Advantages

  • Unlimited scalability
  • Built-in redundancy and durability
  • Geographic distribution
  • No filesystem management
  • Pay-as-you-go pricing

Considerations

  • Network latency for uploads/downloads
  • Data transfer costs
  • Requires network connectivity to the object store
  • Credential management

MinIO as S3 Alternative

MinIO provides S3-compatible object storage that you can self-host.

Docker Compose Setup

version: '3.8'

services:
  minio:
    image: minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio_data:/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  artifact-keeper:
    image: artifact-keeper-backend:latest
    environment:
      STORAGE_BACKEND: s3
      S3_BUCKET: artifacts
      S3_REGION: us-east-1
      S3_ENDPOINT: http://minio:9000
      S3_ACCESS_KEY_ID: minioadmin
      S3_SECRET_ACCESS_KEY: minioadmin
      S3_FORCE_PATH_STYLE: "true"
    depends_on:
      - minio

volumes:
  minio_data:

Create Bucket

Access the MinIO console at http://localhost:9001 and create the artifacts bucket, or use the CLI:

mc alias set local http://localhost:9000 minioadmin minioadmin
mc mb local/artifacts
mc anonymous set download local/artifacts # Optional: public read (older mc releases use mc policy set)

Storage Layout

Regardless of backend, artifacts are organized hierarchically:

Key/Path Structure

{repository_id}/packages/{package_name}/{version}/{artifact_filename}

Examples:

repo-123/packages/my-app/1.0.0/my-app-1.0.0.tar.gz
repo-456/packages/@scope/package/2.1.3/package-2.1.3.tgz
repo-789/packages/my-image/latest/manifest.json
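
Since the layout is identical on every backend, you can browse it directly. On the S3 backend, listing one package's artifacts might look like this (prepend S3_PATH_PREFIX to the key if you set one, and add --endpoint-url for S3-compatible services):

aws s3 ls s3://artifact-keeper-prod/repo-123/packages/my-app/ --recursive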

Metadata Storage

Artifact metadata is stored in PostgreSQL, not in the storage backend. The storage backend only contains the binary artifact files.

Garbage Collection

Remove orphaned artifacts that are no longer referenced in the database.

Manual Cleanup

curl -X POST https://registry.example.com/api/v1/admin/cleanup \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "dry_run": true,
    "older_than_days": 30
  }'
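
Once the dry-run report looks correct, repeat the call with dry_run set to false to actually remove the orphaned artifacts:

curl -X POST https://registry.example.com/api/v1/admin/cleanup \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"dry_run": false, "older_than_days": 30}'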

Scheduled Cleanup

Configure automatic garbage collection:

GC_ENABLED=true
GC_SCHEDULE="0 2 * * *" # Daily at 2 AM
GC_RETENTION_DAYS=90 # Keep artifacts for 90 days

What Gets Cleaned

  • Artifacts marked as deleted but still on disk
  • Incomplete multipart uploads (>24 hours old; see the CLI check after this list)
  • Temporary files from failed uploads
  • Orphaned chunks from interrupted edge transfers
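
On the S3 backend you can also inspect stale multipart uploads yourself with the AWS CLI, independent of the built-in collector:

# List in-progress (possibly abandoned) multipart uploads
aws s3api list-multipart-uploads --bucket artifact-keeper-prod
# Abort one by key and upload ID from the listing
aws s3api abort-multipart-upload \
  --bucket artifact-keeper-prod \
  --key repo-123/packages/my-app/1.0.0/my-app-1.0.0.tar.gz \
  --upload-id <upload-id>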

Dry Run Mode

Always test with a dry run first:

GC_DRY_RUN=true

This logs what would be deleted without actually removing anything.

Storage Migration

From Filesystem to S3

  1. Configure the S3 backend settings
  2. Run the migration tool:

     cargo run --bin migrate-storage -- \
       --from filesystem \
       --from-path /var/lib/artifact-keeper/artifacts \
       --to s3 \
       --s3-bucket artifact-keeper-prod

  3. Verify the migration:

     cargo run --bin migrate-storage -- --verify

  4. Update the backend configuration to use S3
  5. Restart the backend services

From S3 to Filesystem

Same process in reverse:

cargo run --bin migrate-storage -- \
  --from s3 \
  --s3-bucket artifact-keeper-prod \
  --to filesystem \
  --to-path /var/lib/artifact-keeper/artifacts
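
As a rough cross-check after migrating in either direction, compare object counts between the two backends; this assumes no uploads happen during the migration:

# Artifact files on the filesystem backend (skip staging and cache dirs)
find /var/lib/artifact-keeper/artifacts -type f \
  -not -path '*/temp/*' -not -path '*/cache/*' | wc -l
# Objects in the S3 bucket
aws s3 ls s3://artifact-keeper-prod --recursive | wc -l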

Performance Tuning

Filesystem

# Put artifact storage on a faster filesystem (e.g., SSD-backed)
STORAGE_PATH=/mnt/ssd/artifacts
# Enable direct I/O for large files
STORAGE_DIRECT_IO=true
# Adjust buffer sizes
STORAGE_BUFFER_SIZE=1048576 # 1 MB

S3

# Multipart upload threshold
S3_MULTIPART_THRESHOLD=104857600 # 100 MB
# Chunk size for multipart uploads
S3_MULTIPART_CHUNK_SIZE=10485760 # 10 MB
# Connection pooling
S3_MAX_CONNECTIONS=50
# Enable transfer acceleration (AWS S3 only)
S3_USE_TRANSFER_ACCELERATION=true
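
With these defaults, a 1 GB artifact uploads as roughly one hundred 10 MB parts. S3 caps a multipart upload at 10,000 parts, so a 10 MB chunk size supports objects up to about 100 GB; raise S3_MULTIPART_CHUNK_SIZE if you store larger artifacts.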

Backup Considerations

Filesystem Backend

Use standard backup tools:

# rsync to backup location
rsync -av /var/lib/artifact-keeper/artifacts/ /backup/artifacts/
# Tar archive
tar -czf artifacts-backup.tar.gz /var/lib/artifact-keeper/artifacts
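
These commands capture only the artifact files. Because metadata lives in PostgreSQL (see Metadata Storage above), back up the database at the same time so files and metadata stay consistent.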

S3 Backend

Enable versioning and lifecycle policies:

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket artifact-keeper-prod \
  --versioning-configuration Status=Enabled

# Lifecycle rule for old versions
aws s3api put-bucket-lifecycle-configuration \
  --bucket artifact-keeper-prod \
  --lifecycle-configuration file://lifecycle.json
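
The lifecycle configuration file is not shown elsewhere on this page; a minimal sketch of a lifecycle.json that expires noncurrent versions after 90 days might look like this (adjust the window to your retention needs):

{
  "Rules": [
    {
      "ID": "expire-old-versions",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 90 }
    }
  ]
}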

Use S3 replication for disaster recovery:

aws s3api put-bucket-replication \
  --bucket artifact-keeper-prod \
  --replication-configuration file://replication.json

Monitoring

Storage Metrics

Monitor these metrics:

  • Total storage size
  • Number of artifacts
  • Upload/download throughput
  • Error rates (failed uploads/downloads)
  • Storage backend latency

Prometheus Metrics

artifact_keeper_storage_size_bytes
artifact_keeper_storage_objects_total
artifact_keeper_storage_upload_duration_seconds
artifact_keeper_storage_download_duration_seconds
artifact_keeper_storage_errors_total
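
These metrics can drive alerting. A sketch of a Prometheus alerting rule on the error counter, assuming your Prometheus already scrapes the backend:

groups:
  - name: artifact-keeper-storage
    rules:
      - alert: StorageErrorRateHigh
        expr: rate(artifact_keeper_storage_errors_total[5m]) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Artifact Keeper storage error rate is elevated"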

Health Checks

# Check storage backend connectivity
curl https://registry.example.com/api/v1/admin/health/storage

Troubleshooting

Filesystem Permission Errors

# Check ownership
ls -la /var/lib/artifact-keeper/artifacts
# Fix permissions
sudo chown -R artifact-keeper:artifact-keeper /var/lib/artifact-keeper

S3 Connection Issues

# Test credentials with AWS CLI
aws s3 ls s3://artifact-keeper-prod --profile your-profile
# Verify endpoint connectivity
curl -v https://s3.us-east-1.amazonaws.com

High Storage Costs

  • Enable garbage collection
  • Set retention policies
  • Use S3 lifecycle rules to move to cheaper storage classes
  • Compress artifacts before upload
  • Deduplicate using content-addressable storage