
# Function calling
This demo shows how to use the intelligent prompt gateway for function calling. It assumes that ollama is running natively. If you want to run ollama inside Docker instead, update the ollama endpoint in the docker-compose file.

# Starting the demo
1. Ensure that the submodule is up to date
   ```sh
   git submodule sync --recursive
   ```
1. Create a `.env` file and set your OpenAI key using the env var `OPENAI_API_KEY`
1. Start the services
   ```sh
   docker compose up
   ```
1. Download the Bolt-FC model. This demo assumes that you have downloaded [Bolt-Function-Calling-1B:Q4_K_M](https://huggingface.co/katanemolabs/Bolt-Function-Calling-1B.gguf/blob/main/Bolt-Function-Calling-1B-Q4_K_M.gguf) to a local folder.
1. If you are running ollama natively, run
   ```sh
   ollama serve
   ```
2. Create the model in the ollama repository from the model file
   ```sh
   ollama create Bolt-Function-Calling-1B:Q4_K_M -f Bolt-FC-1B-Q4_K_M.model_file
   ```
3. Navigate to http://localhost:18080/
4. You can type in queries like "how is the weather in Seattle"
   - You can also ask follow-up questions like "show me sunny days"
5. To see metrics, navigate to http://localhost:3000/ (log in with admin/grafana)
   - Open the dashboard named "Intelligent Gateway Overview"
   - On this dashboard you can see request latency and the number of requests
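
The `.env` step above can be scripted. The sketch below is one possible way to do it, not part of the demo itself: the helper names `write_env_file` and `has_openai_key` are hypothetical, and the key you pass in is your own. It assumes that the `.env` file uses the plain `KEY=value` format that Docker Compose reads.

```sh
# Hypothetical helpers for the ".env" step; not part of the demo repository.

# Write a .env file that sets OPENAI_API_KEY to the given value.
# Note: this overwrites any existing file at the given path.
write_env_file() {
  env_path="$1"
  api_key="$2"
  printf 'OPENAI_API_KEY=%s\n' "$api_key" > "$env_path"
}

# Succeed only if the file defines a non-empty OPENAI_API_KEY.
has_openai_key() {
  grep -q '^OPENAI_API_KEY=..*' "$1"
}
```

For example, `write_env_file .env "<your key>"` followed by `has_openai_key .env || echo "OPENAI_API_KEY is missing"` gives a quick check before running `docker compose up`. After the `ollama create` step, `ollama list` shows whether the model was registered.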

Here is a sample interaction:

<img width="575" alt="image" src="https://github.com/user-attachments/assets/e0929490-3eb2-4130-ae87-a732aea4d059">