feat: enhance load balancing
correct me if I am wrong, but what you describe is a WRR (weighted round robin) mechanism.
the existing implementation would do:
2…
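for reference, a minimal sketch of what a WRR picker could look like. this is not the project's existing implementation; it uses the "smooth" weighted round robin variant, and the class name, endpoint URLs and weights are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass
class _Endpoint:
    url: str
    weight: int       # static weight, e.g. relative capacity of the endpoint
    current: int = 0  # running score used by the smooth WRR algorithm


class WeightedRoundRobin:
    """Smooth weighted round robin over a fixed set of endpoints (illustrative only)."""

    def __init__(self, weights: Mapping[str, int]) -> None:
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty mapping of positive integers")
        self._endpoints: List[_Endpoint] = [_Endpoint(u, w) for u, w in weights.items()]
        self._total = sum(weights.values())

    def next(self) -> str:
        # Every endpoint gains its weight, then the highest score wins
        # and is pushed back by the sum of all weights.
        for ep in self._endpoints:
            ep.current += ep.weight
        best = max(self._endpoints, key=lambda ep: ep.current)
        best.current -= self._total
        return best.url


if __name__ == "__main__":
    # Hypothetical endpoints: one roughly three times as capable as the other.
    picker = WeightedRoundRobin({
        "http://gpu-big:11434": 3,
        "http://gpu-small:11434": 1,
    })
    for _ in range(8):
        print(picker.next())
```

with weights 3 and 1, the first endpoint receives three out of every four requests, and the picks are interleaved rather than sent in one burst.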
issue: api/version not found
the refactored code should now work reliably for all endpoints. feel free to re-open if it doesn't work with your specific config. thanks for reporting this @JTHesse - highly appreciated!
dev-v0.7.x -> main
feat: enhance load balancing
the current load-balancing approach with model-aware routing requires equally powerful endpoints. as this is rather an ideal-world scenario, I like the idea of having a mechanism to support…
issue: api/version not found
as ollama also provides a /v1 endpoint, there is logic to separate ollama endpoints from "regular" openai-compatible endpoints. it does not seem to be triggered successfully in this case.
can you…