# Function calling
This demo shows how you can use the intelligent prompt gateway as a copilot for calling the correct function, by capturing the required and optional parameters from the prompt. It assumes you are running ollama natively. If you want to run ollama inside Docker instead, please update the ollama endpoint in the docker-compose file.
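When ollama runs inside Docker rather than natively, the gateway container cannot reach it via `localhost` (which resolves to the container itself). Below is a hypothetical sketch of the kind of docker-compose change this involves; the actual service and variable names in this repo's `docker-compose.yaml` may differ.

```yaml
# Illustrative only — check docker-compose.yaml for the real service
# and environment-variable names used by this repo.
services:
  archgw:
    environment:
      # host.docker.internal reaches the Docker host, where a natively
      # running ollama listens on its default port 11434.
      OLLAMA_ENDPOINT: http://host.docker.internal:11434
```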
## Starting the demo
- Create a `.env` file and set your OpenAI key using the env var `OPENAI_API_KEY`
- Start services: `docker compose up`
- Download the Bolt-FC model. This demo assumes you have downloaded Bolt-Function-Calling-1B:Q4_K_M to a local folder.
- If running ollama natively, run `ollama serve`
- Create the model file in the ollama repository: `ollama create Bolt-Function-Calling-1B:Q4_K_M -f Bolt-FC-1B-Q4_K_M.model_file`
- Navigate to http://localhost:18080/
  - You can type in queries like "show me the top 5 employees in each department with highest salary"
  - You can also ask follow-up questions like "just show the top 2"
- To see metrics, navigate to http://localhost:3000/ (log in with admin/grafana)
  - Open the dashboard named "Intelligent Gateway Overview"
  - On this dashboard you can see request latency and the number of requests
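Under the hood, function calling of this kind maps a natural-language prompt onto a function name plus its required and optional parameters. A minimal Python sketch of what an OpenAI-style tool definition and a resolved call could look like for the demo prompt above — the function name `get_top_employees` and its parameters are illustrative assumptions, not taken from this repo:

```python
import json

# Hypothetical tool definition: the model resolves the prompt
# "show me the top 5 employees in each department with highest salary"
# to this function, filling in required and optional parameters.
tool = {
    "type": "function",
    "function": {
        "name": "get_top_employees",  # illustrative name, not from this repo
        "parameters": {
            "type": "object",
            "properties": {
                "limit": {"type": "integer"},    # how many employees to return
                "order_by": {"type": "string"},  # column to rank by
            },
            "required": ["limit"],
        },
    },
}

# A resolved call for the demo prompt might carry these arguments;
# the follow-up "just show the top 2" would simply update "limit" to 2.
call = {
    "name": "get_top_employees",
    "arguments": json.dumps({"limit": 5, "order_by": "salary"}),
}

args = json.loads(call["arguments"])
print(args["limit"], args["order_by"])
```

This mirrors the general shape of OpenAI-style function calling, where `arguments` is a JSON-encoded string the caller must parse before dispatching to the real function.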