Commit graph

3 commits

Author SHA1 Message Date
434b6d4cca finalize feat:
Mixture of Experts:
- prefix any ollama model with "moe-" on api/chat and the original user request gets passed to the selected model 3 times with temp=1 to get response variants. Each variant is then revisited and finally scored to find the best repsonse among them all and finally returned to the user. Runs longer, uses more tokens for expected better quality response.
2025-12-16 09:46:36 +01:00
19a13cc613 fix(enhance.py): correct typo in function name from 'moe_select_candiadate' to 'moe_select_candidate'
feat(router.py): add helper function _make_chat_request for handling enhancing chat requests to endpoints
2025-12-15 10:35:56 +01:00
34d6abd28b refactor: optimize token aggregation query and enhance chat proxy
- Refactored token aggregation query in db.py to use a single SQL query with SUM() instead of iterating through rows, improving performance
- Combined import statements in db.py and router.py to reduce lines of code
- Enhanced chat proxy in router.py to handle "moe-" prefixed models with multiple query execution and critique generation
- Added last_user_content() helper function to extract user content from messages
- Improved code readability and maintainability through these structural changes
2025-12-13 11:58:49 +01:00