openai compatible endpoints not used #128

Closed

opened 2026-06-25 10:08:20 +02:00 by JTHesse · 3 comments

JTHesse commented

2026-06-25 10:08:20 +02:00

Hi alpha nerd, I am not entirely sure I have this right, but I think OpenAI-compatible endpoints are not used when an Ollama endpoint advertises the same model.
I am currently adding a vLLM server to the nomyo config. If that endpoint is the only one configured, everything works just fine.
However, when nomyo checks whether a model is already loaded in `_fetch_loaded_models_internal`, OpenAI-compatible endpoints are filtered out. I guess an exception for those should fix this, so that models on OpenAI endpoints always count as loaded.
Thank you for considering this ;)
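
A minimal sketch of the exception suggested above, assuming the router keeps a per-endpoint set of advertised models and, for Ollama, a separate set of currently loaded models. `Endpoint`, `loaded_models`, `endpoints_with_model_loaded`, and the example URLs are hypothetical names made up for illustration; they are not taken from the nomyo-router code, where only `_fetch_loaded_models_internal` is named in this issue.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Endpoint:
    # Hypothetical stand-in for a configured endpoint, not nomyo-router's real type.
    url: str
    kind: str  # "ollama" or "openai" (OpenAI-compatible, e.g. a vLLM server)
    advertised: set[str] = field(default_factory=set)
    loaded: set[str] = field(default_factory=set)  # only tracked for Ollama here


def loaded_models(ep: Endpoint) -> set[str]:
    """Return the models that count as loaded on this endpoint."""
    if ep.kind == "openai":
        # Proposed exception: an OpenAI-compatible server has no separate
        # "loaded" state in this sketch, so every advertised model counts.
        return set(ep.advertised)
    return set(ep.loaded)


def endpoints_with_model_loaded(endpoints: list[Endpoint], model: str) -> list[str]:
    """Return the URLs of endpoints on which the model counts as loaded."""
    return [ep.url for ep in endpoints if model in loaded_models(ep)]


# Both endpoints advertise the model, but Ollama has not loaded it yet.
ollama = Endpoint(url="http://ollama.example:11434", kind="ollama",
                  advertised={"demo-model"})
vllm = Endpoint(url="http://vllm.example:8000/v1", kind="openai",
                advertised={"demo-model"})

print(endpoints_with_model_loaded([ollama, vllm], "demo-model"))
# prints ['http://vllm.example:8000/v1']
```

Without the `kind == "openai"` branch the vLLM endpoint would never appear in the result, which matches the behaviour described in this report.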

JTHesse commented

2026-06-25 10:19:54 +02:00

Author

In addition to this, nomyo also sometimes tries to use models that are not advertised by the endpoint:
`[generate_proxy] upstream error from (http://192.168.0.1:8000/v1, qwen2.5-coder:1.5b-base) status=404 type=NotFoundError: Error code: 404 - {'error': {'message': 'The model qwen2.5-coder:1.5b-base does not exist.', 'type': 'NotFoundError', 'param': 'model', 'code': 404}}`
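
One way to avoid this kind of 404 is to route a request only to endpoints whose model list contains the requested name. The sketch below assumes the OpenAI-style model listing shape (`{"object": "list", "data": [{"id": ...}]}`); the helper names, the Ollama URL, and the model id `some-served-model` are hypothetical and only serve the illustration. It is not how nomyo-router selects endpoints.

```python
from __future__ import annotations


def advertised_model_ids(models_payload: dict) -> set[str]:
    """Extract model ids from an OpenAI-style model listing payload."""
    return {
        entry["id"]
        for entry in models_payload.get("data", [])
        if "id" in entry
    }


def candidate_endpoints(model: str, advertised: dict[str, set[str]]) -> list[str]:
    """Keep only the endpoints that advertise the requested model."""
    return [url for url, models in advertised.items() if model in models]


# Illustrative payload: this server advertises a different model id
# than the one the router asked for in the log line above.
vllm_payload = {
    "object": "list",
    "data": [{"id": "some-served-model", "object": "model"}],
}

advertised = {
    "http://192.168.0.1:8000/v1": advertised_model_ids(vllm_payload),
    # Hypothetical Ollama endpoint that does advertise the requested model.
    "http://192.168.0.2:11434": {"qwen2.5-coder:1.5b-base"},
}

print(candidate_endpoints("qwen2.5-coder:1.5b-base", advertised))
# prints ['http://192.168.0.2:11434']
```

With this filter in place, the endpoint at `http://192.168.0.1:8000/v1` is never asked for `qwen2.5-coder:1.5b-base`, because that name is not in its advertised list.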

alpha-nerd added the

bug

label 2026-06-25 10:55:39 +02:00

alpha-nerd self-assigned this 2026-06-25 10:56:15 +02:00

alpha-nerd commented

2026-06-25 14:11:14 +02:00

Owner

@JTHesse wrote in #128 (comment):

Hi alpha nerd, I am not entirely sure if I get this right, but I think that openai compatible endpoints won't be used in combination with ollama endpoints that promote the same model. I am currently adding a vllm server to the nomyo config. If the endpoint is used solely, everything works just fine. However, if nomyo is trying to check whether a model is already loaded in _fetch_loaded_models_internal openai compatible endpoints will be filtered out. I guess an exception for those should fix this, so for openai endpoints models are always loaded. Thank you for considering this ;)

This is a clear bug in the routing engine and might have been introduced during refactoring.
Thank you for pointing this out. A fix is on the way.

👍 1

alpha-nerd referenced this issue from a commit

2026-06-25 14:23:01 +02:00

fix: routing error for openai compatible endpoints #128

alpha-nerd referenced this issue from a commit

2026-06-25 14:23:01 +02:00

fix: treat external openai compatible endpoint always as loaded for the advertised models #128

alpha-nerd referenced this issue from a pull request that will close it,

2026-06-25 14:32:23 +02:00

dev-1.0.x -> main #129

alpha-nerd closed this issue

2026-06-26 08:56:42 +02:00