plano/demos/function_calling
Function calling

This demo shows how you can use the intelligent prompt gateway to do function calling. It assumes you are running ollama natively; if you want to run ollama inside Docker instead, update the ollama endpoint in the docker-compose file.
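For the Docker case, the endpoint override could look something like the fragment below. This is a hedged sketch only — the service name and environment variable are assumptions for illustration, not taken from the demo's actual docker-compose.yaml; adjust them to match your file.

```yaml
# Hypothetical docker-compose.yaml fragment — real service and variable
# names in the demo may differ.
services:
  gateway:
    environment:
      # From inside a container, natively-run ollama (default port 11434)
      # is reachable via host.docker.internal rather than localhost.
      - OLLAMA_ENDPOINT=http://host.docker.internal:11434
```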

Starting the demo

  1. Ensure that the submodule is up to date
    git submodule sync --recursive
    
  2. Create a .env file and set your OpenAI key in the env var OPENAI_API_KEY
  3. Start the services
    docker compose up
    
  4. Download the Bolt-FC model. This demo assumes you have downloaded Bolt-Function-Calling-1B:Q4_K_M to the local folder.
  5. If running ollama natively, run
    ollama serve
    
  6. Create the model in the ollama repository from the model file
    ollama create Bolt-Function-Calling-1B:Q4_K_M -f Bolt-FC-1B-Q4_K_M.model_file
    
  7. Navigate to http://localhost:18080/
  8. You can type in queries like "how is the weather in Seattle"
    • You can also ask follow up questions like "show me sunny days"
  9. To see metrics, navigate to "http://localhost:3000/" (log in with admin/grafana)
    • Open the dashboard named "Intelligent Gateway Overview"
    • On this dashboard you can see request latency and the number of requests
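Under the hood, function calling works by sending the model a schema of callable tools alongside the user query, so the model can answer "how is the weather in Seattle" by emitting a structured call instead of free text. The snippet below sketches what such a request payload could look like in the OpenAI-style chat-completions format; the get_weather function name and its parameters are illustrative assumptions, not the demo's actual tool definitions.

```python
import json

# Hypothetical tool schema in the OpenAI function-calling format.
# The demo's real tool names and parameters may differ.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. Seattle",
                    },
                },
                "required": ["city"],
            },
        },
    }
]

# A request body like the one the gateway might forward to the model.
payload = {
    "model": "Bolt-Function-Calling-1B:Q4_K_M",
    "messages": [
        {"role": "user", "content": "how is the weather in Seattle"}
    ],
    "tools": tools,
}

print(json.dumps(payload, indent=2))
```

Given a payload like this, a function-calling model responds with the tool name and arguments (here, city="Seattle") that the gateway can then execute before producing the final answer.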

Here is a sample interaction:

(screenshot of a sample interaction)