feat: adding a semantic cache layer
parent c3d47c7ffe
commit dd4b12da6a
13 changed files with 1138 additions and 22 deletions
@@ -82,10 +82,23 @@ sudo systemctl status nomyo-router
## 2. Docker Deployment

### Image variants

| Tag | Semantic cache | Image size |
|---|---|---|
| `latest` | ❌ exact match only | ~300 MB |
| `latest-semantic` | ✅ sentence-transformers + `all-MiniLM-L6-v2` pre-baked | ~800 MB |

The semantic variant (`latest-semantic`) supports `cache_similarity` values below `1.0` in `config.yaml`, which turn on similarity-based cache matching. If semantic mode is configured on the lean `latest` image, it emits a warning and falls back to exact-match caching.

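To make the choice between the two tags concrete, here is a minimal sketch of a helper that maps a `cache_similarity` value to the image tag that can honour it. `suggest_tag` is a hypothetical function and not part of nomyo-router. It assumes the value has already been read out of `config.yaml` and is numeric.

```bash
# Hypothetical helper (not shipped with nomyo-router): suggest an image tag
# for a given cache_similarity value. Assumes a numeric argument.
suggest_tag() {
  # awk handles the floating-point comparison that plain sh cannot do.
  awk -v s="$1" 'BEGIN {
    if (s + 0 < 1.0) print "latest-semantic"   # semantic matching requested
    else             print "latest"            # exact match only
  }'
}

suggest_tag 0.9   # prints: latest-semantic
suggest_tag 1.0   # prints: latest
```

A value of exactly `1.0` maps to the lean image because, per the table above, that image already covers exact-match caching.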
### Build the Image

```bash
# Lean build (exact match cache, default)
docker build -t nomyo-router .

# Semantic build (~500 MB larger, all-MiniLM-L6-v2 model baked in at build time)
docker build --build-arg SEMANTIC_CACHE=true -t nomyo-router:semantic .
```

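If both variants are built regularly, the two commands above could be wrapped in a small function. The sketch below is illustrative only: `build_image`, the variant names `lean` and `semantic`, and the `DRY_RUN` variable are assumptions introduced here, not something the repository provides. The `docker build` invocations themselves are copied from the block above.

```bash
# Hypothetical wrapper around the two documented build commands.
# Set DRY_RUN to any non-empty value to print the command instead of running it.
build_image() {
  case "${1:-lean}" in
    lean)     set -- docker build -t nomyo-router . ;;
    semantic) set -- docker build --build-arg SEMANTIC_CACHE=true -t nomyo-router:semantic . ;;
    *)        echo "unknown variant: $1" >&2; return 1 ;;
  esac
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"      # show what would be executed
  else
    "$@"           # run the selected docker build command
  fi
}

DRY_RUN=1 build_image semantic
# prints: docker build --build-arg SEMANTIC_CACHE=true -t nomyo-router:semantic .
```

Calling `build_image` with no argument selects the lean build, mirroring the default described above.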
### Run the Container