Caching

HBIA includes an optional SHA-256 hash-based caching system for pure vertices. When enabled, pure vertices skip re-execution if their inputs haven't changed.

How It Works

Cache Key Generation

For each vertex, the cache manager computes a key by hashing:

SHA256(vertex_name + handler_repr + JSON(resolved_inputs))

A cache hit therefore requires all three of the following to match:

  • Same vertex name
  • Same handler function
  • Exact same input values
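
The key formula above can be sketched in plain Python. This is only an illustration, not HBIA's actual implementation: the function name `build_cache_key`, the use of `repr(handler)` for `handler_repr`, and `sort_keys=True` in the JSON encoding are all assumptions.

```python
import hashlib
import json


def build_cache_key(vertex_name: str, handler, resolved_inputs: dict) -> str:
    """Hash the vertex name, the handler's repr, and the JSON-encoded inputs."""
    # Assumption: sort_keys makes the JSON independent of dict insertion order.
    inputs_json = json.dumps(resolved_inputs, sort_keys=True)
    # Assumption: repr(handler) stands in for handler_repr.
    payload = vertex_name + repr(handler) + inputs_json
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def normalize(values):
    """Example handler used only to build keys."""
    return [v / max(values) for v in values]


key_a = build_cache_key("normalize", normalize, {"values": [1, 2, 4]})
key_b = build_cache_key("normalize", normalize, {"values": [1, 2, 4]})
key_c = build_cache_key("normalize", normalize, {"values": [1, 2, 5]})
```

Here `key_a` and `key_b` are identical, while `key_c` differs because one input value changed. Note that `repr()` of a plain Python function includes a memory address, so keys built this way are stable only within a single process; how HBIA derives `handler_repr` is not specified on this page.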

Cache Flow

Vertex Execution:
  1. Compute cache key from vertex + inputs
  2. Check cache: key exists?
     ├─ HIT:  Return cached result (skip handler execution)
     └─ MISS: Execute handler → save result to cache → return
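
The hit/miss flow above might look roughly like this. It is a minimal sketch that uses an in-memory dict as a stand-in for the cache store; `run_cached` and `_cache` are hypothetical names, not part of HBIA.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory stand-in for the cache store.
_cache: Dict[str, Any] = {}


def run_cached(key: str, handler: Callable[..., Any], inputs: Dict[str, Any]) -> Any:
    """Return the cached result on a hit; otherwise run the handler and store it."""
    if key in _cache:
        return _cache[key]          # HIT: skip handler execution
    result = handler(**inputs)      # MISS: execute the handler
    _cache[key] = result            # save the result for next time
    return result


calls = []


def double(x):
    """Example handler that records each real execution."""
    calls.append(x)
    return {"result": x * 2}


first = run_cached("double:21", double, {"x": 21})
second = run_cached("double:21", double, {"x": 21})
```

The second call returns the stored value without invoking `double` again, so `calls` contains a single entry.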

What Gets Cached

Only pure vertices (effect: pure) are cached. Side-effect vertices are always executed, even if the cache is enabled.

normalize:
  effect: pure          # ✅ Cached when cache_enabled=True

save_to_db:
  effect: side_effect   # ❌ Never cached (side effects)

Enabling Caching

Per-Call

result = run_flow(
    graph, handlers={...}, initial_data={...},
    cache_enabled=True,
)

Via Settings

from honey_badgeria.conf import Settings, configure

configure(Settings(CACHE_ENABLED=True))

Via Environment

export HBIA_CACHE_ENABLED=true

CacheManager

The CacheManager handles storage, retrieval, and maintenance:

from honey_badgeria.back.runtime.cache import CacheManager

manager = CacheManager(cache_dir=".hbia_cache")

# Build a cache key
key = manager.build_key(vertex, resolved_inputs)

# Load cached result (None on miss)
cached = manager.load(key)

# Save result
manager.save(key, {"result": 42})

Cache Directory

Results are stored as JSON files in the cache directory (default: .hbia_cache/):

.hbia_cache/
├── a1b2c3d4e5f6...json
├── f7e8d9c0b1a2...json
└── ...

Management

# Get cache statistics
info = manager.info()
# {"entries": 42, "size_bytes": 512000}

# Remove old entries
manager.prune()

# Clear all cached data
manager.clear()
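
For intuition about what statistics and pruning could involve, the sketch below works directly on a directory of JSON files. The helpers `cache_info` and `prune_oldest` are hypothetical, and the policy shown (keep the newest `max_entries` files by modification time) is an assumption; this page does not define which entries `prune()` treats as old.

```python
import os
import tempfile
from pathlib import Path


def cache_info(cache_dir: Path) -> dict:
    """Count the JSON entries and sum their sizes in bytes."""
    files = list(cache_dir.glob("*.json"))
    return {"entries": len(files), "size_bytes": sum(f.stat().st_size for f in files)}


def prune_oldest(cache_dir: Path, max_entries: int) -> int:
    """Delete the oldest files until at most max_entries remain (assumed policy)."""
    files = sorted(cache_dir.glob("*.json"), key=lambda f: f.stat().st_mtime)
    excess = files[: max(0, len(files) - max_entries)]
    for f in excess:
        f.unlink()
    return len(excess)


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    for i in range(5):
        path = cache_dir / f"entry{i}.json"
        path.write_text("{}", encoding="utf-8")
        # Give each file a distinct, increasing modification time.
        os.utime(path, (1_000_000 + i, 1_000_000 + i))
    before = cache_info(cache_dir)
    removed = prune_oldest(cache_dir, max_entries=3)
    after = cache_info(cache_dir)
    remaining = sorted(f.name for f in cache_dir.glob("*.json"))
```

In this run the two oldest files are removed and the three newest remain.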

CLI Commands

# View cache statistics
hbia cache-info

# Remove old entries
hbia cache-prune

# Clear all cache data
hbia clear-cache

Cache Bypass

Caching is automatically bypassed in these cases:

| Condition | Why |
| --- | --- |
| effect: side_effect | Side effects must always execute |
| Vertex in atomic group with no_cache: true | Atomic groups need fresh execution |
| CACHE_ENABLED=False (default) | Caching is opt-in |
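
The bypass conditions can be read as a single predicate. The function below is a hypothetical sketch of that decision, not HBIA's code; the name `should_use_cache` and its parameters are assumptions chosen to mirror the conditions listed above.

```python
def should_use_cache(
    effect: str,
    cache_enabled: bool = False,
    group_no_cache: bool = False,
) -> bool:
    """Return True only when none of the bypass conditions applies."""
    if not cache_enabled:
        return False  # caching is opt-in
    if effect != "pure":
        return False  # side effects must always execute
    if group_no_cache:
        return False  # atomic group marked no_cache: true
    return True


pure_enabled = should_use_cache("pure", cache_enabled=True)
pure_default = should_use_cache("pure")
side_effect_enabled = should_use_cache("side_effect", cache_enabled=True)
atomic_no_cache = should_use_cache("pure", cache_enabled=True, group_no_cache=True)
```

Only the first call evaluates to `True`; each of the other three trips one of the bypass conditions.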

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| CACHE_ENABLED | False | Enable/disable caching |
| CACHE_DIR | ".hbia_cache" | Directory for cache files |
| MAX_CACHE_ENTRIES | 500 | Maximum number of cached results |