- Renamed Feedback class to follow PascalCase convention
- Fixed candidate enumeration start index from 0 to 1
- Simplified candidate content access by removing .message.content
- Updated CONFIG_PATH environment variable name to CONFIG_PATH_ARG
- Bumped version from 0.5 to 0.6
- Removed unnecessary return statement and trailing newline
Mixture of Experts:
- prefix any ollama model with "moe-" on api/chat and the original user request gets passed to the selected model 3 times with temp=1 to get response variants. Each variant is then revisited and finally scored to find the best repsonse among them all and finally returned to the user. Runs longer, uses more tokens for expected better quality response.
- Refactored token aggregation query in db.py to use a single SQL query with SUM() instead of iterating through rows, improving performance
- Combined import statements in db.py and router.py to reduce lines of code
- Enhanced chat proxy in router.py to handle "moe-" prefixed models with multiple query execution and critique generation
- Added last_user_content() helper function to extract user content from messages
- Improved code readability and maintainability through these structural changes