Commit graph

141 commits

Author SHA1 Message Date
3ccaf78e5d fix: simplify model version handling in proxy functions
Simplify the logic for handling model versions in `openai_chat_completions_proxy` and `openai_completions_proxy` by removing redundant conditions and initializing `local_model` earlier. This makes the code more readable and maintains the same functionality.
2025-12-13 12:34:24 +01:00
34d6abd28b refactor: optimize token aggregation query and enhance chat proxy
- Refactored token aggregation query in db.py to use a single SQL query with SUM() instead of iterating through rows, improving performance
- Combined import statements in db.py and router.py to reduce lines of code
- Enhanced chat proxy in router.py to handle "moe-" prefixed models with multiple query execution and critique generation
- Added last_user_content() helper function to extract user content from messages
- Improved code readability and maintainability through these structural changes
2025-12-13 11:58:49 +01:00
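The token aggregation change in 34d6abd28b can be illustrated with a minimal sketch. The table layout below (`token_counts` with `input_tokens` and `output_tokens` columns) and both function names are assumptions made for illustration, not the actual schema or code in db.py, and the standard-library `sqlite3` module is used only to keep the example self-contained.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical token_counts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_counts ("
        "endpoint TEXT, model TEXT, "
        "input_tokens INTEGER NOT NULL, output_tokens INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO token_counts VALUES (?, ?, ?, ?)",
        [("a", "m1", 10, 100), ("b", "m1", 20, 200), ("a", "m2", 30, 300)],
    )
    return conn


def totals_by_iteration(conn: sqlite3.Connection) -> tuple:
    """Older style: fetch every row and add the counts up in Python."""
    total_in = total_out = 0
    query = "SELECT input_tokens, output_tokens FROM token_counts"
    for input_tokens, output_tokens in conn.execute(query):
        total_in += input_tokens
        total_out += output_tokens
    return total_in, total_out


def totals_by_sum(conn: sqlite3.Connection) -> tuple:
    """Single query: let SQLite aggregate and return one row."""
    row = conn.execute(
        "SELECT COALESCE(SUM(input_tokens), 0), "
        "COALESCE(SUM(output_tokens), 0) FROM token_counts"
    ).fetchone()
    return row[0], row[1]


if __name__ == "__main__":
    demo = make_demo_db()
    print(totals_by_iteration(demo), totals_by_sum(demo))
```

`COALESCE` is there because `SUM()` returns NULL on an empty table, so the single-query version still yields zeros where the loop would.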
59a8ef3abb refactor: use a persistent WAL-enabled connection with async locks
- Introduce a lazily initialized, shared aiosqlite connection stored in self._db and two asyncio locks (_db_lock, _operation_lock) for safe concurrent access
- Ensure the database directory exists before connecting and enable WAL journaling and foreign keys on first connect
- Add close method to gracefully close the persistent connection
- Guard initialization and write operations with _operation_lock to ensure single-threaded schema setup
- Switch to ON CONFLICT UPSERT for token_counts updates and initialize token_time_series table
- Add typing for _db (Optional[aiosqlite.Connection]) and adjust imports accordingly

addition: frontend button that triggers a total stats aggregation task, plus a feedback span element, to keep the user informed and the database footprint small
2025-12-02 12:18:23 +01:00
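The connection handling described in 59a8ef3abb might look roughly like the sketch below. It is not the project's code: the class name `TokenDB`, the table columns, and the method names other than `close` are hypothetical, and the standard-library `sqlite3` module stands in for `aiosqlite` so the example runs without extra dependencies. Only the attribute names `_db`, `_db_lock`, and `_operation_lock` come from the commit message.

```python
import asyncio
import os
import sqlite3
import tempfile
from typing import Optional


class TokenDB:
    """Hypothetical stand-in for the project's database wrapper."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._db: Optional[sqlite3.Connection] = None
        self._db_lock = asyncio.Lock()         # guards lazy connection setup
        self._operation_lock = asyncio.Lock()  # serializes schema setup and writes

    async def _get_db(self) -> sqlite3.Connection:
        async with self._db_lock:
            if self._db is None:
                # Make sure the directory exists before connecting.
                os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
                db = sqlite3.connect(self.path)
                db.execute("PRAGMA journal_mode=WAL")
                db.execute("PRAGMA foreign_keys=ON")
                self._db = db
            return self._db

    async def init(self) -> None:
        db = await self._get_db()
        async with self._operation_lock:
            db.execute(
                "CREATE TABLE IF NOT EXISTS token_counts ("
                "endpoint TEXT, model TEXT, total_tokens INTEGER NOT NULL, "
                "PRIMARY KEY (endpoint, model))"
            )
            db.commit()

    async def add_tokens(self, endpoint: str, model: str, tokens: int) -> None:
        db = await self._get_db()
        async with self._operation_lock:
            # UPSERT: insert a new row or add to the existing counter.
            db.execute(
                "INSERT INTO token_counts (endpoint, model, total_tokens) "
                "VALUES (?, ?, ?) "
                "ON CONFLICT(endpoint, model) DO UPDATE SET "
                "total_tokens = total_tokens + excluded.total_tokens",
                (endpoint, model, tokens),
            )
            db.commit()

    async def total(self) -> int:
        db = await self._get_db()
        row = db.execute(
            "SELECT COALESCE(SUM(total_tokens), 0) FROM token_counts"
        ).fetchone()
        return row[0]

    async def close(self) -> None:
        async with self._db_lock:
            if self._db is not None:
                self._db.close()
                self._db = None


async def demo() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        db = TokenDB(os.path.join(tmp, "data", "tokens.db"))
        await db.init()
        await asyncio.gather(*(db.add_tokens("ep", "m", 5) for _ in range(4)))
        total = await db.total()
        await db.close()
        return total


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With `aiosqlite` the `execute` and `commit` calls would be awaited, but the locking pattern would stay the same: one lock around the lazy connect, another around schema setup and writes.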
0ffb321154 fixing total stats model, button, labels and code clean up 2025-11-28 14:59:29 +01:00
1c3f9a9dc4 fix model naming so the usage counter decrements correctly in /v1 endpoints 2025-11-24 09:33:54 +01:00
7b50a5a299 adding usage metrics to /v1 endpoints if stream == True 2025-11-21 09:56:42 +01:00
45d1d442ee sqlite: adding connection pooling and WAL 2025-11-20 15:37:04 +01:00
aa23a4dd81 fixing timezone issues 2025-11-20 12:53:18 +01:00
0d187e91b9 fixing chart timescales 2025-11-20 09:53:28 +01:00
e0c6861f2f aggregating token_counts for stats over all endpoints and adjusting the color mapping 2025-11-20 09:22:45 +01:00
3f77a8ec62 chart enhancements 2025-11-19 17:28:31 +01:00
79a7ca972b initial chart view 2025-11-19 17:05:25 +01:00
541f2826e0 fixing token_queue, prepping chart view 2025-11-18 19:02:36 +01:00
baf5d98318 adding token timeseries counting in db for future data viz 2025-11-18 11:16:21 +01:00
8a05f2ac44 cache loaded models to decrease load on ollamas 2025-11-17 14:40:24 +01:00
06103e5f01 cache loaded models to decrease load on ollamas 2025-11-17 14:40:22 +01:00
4c7ebb5af4 cancel token_worker_task only if running 2025-11-14 15:53:26 +01:00
b9933e000f rollback - needs more logic in v1/embedding 2025-11-13 13:32:46 +01:00
9f90bc9cd0 fixing /v1/embedding ollama notations 2025-11-13 12:40:40 +01:00
8aef941385 stopping the token_worker_task gracefully on shutdown 2025-11-13 10:13:10 +01:00
f14d9dc7da don't query non-Ollama endpoints for health status 2025-11-13 10:06:23 +01:00
1427e98e6d various performance improvements and replacing json with orjson 2025-11-10 15:37:46 +01:00
c6c1059ede
Merge pull request #12 from nomyo-ai/dev-v0.4.x
token usage counter for non-stream openai ollama endpoints and improvements
2025-11-08 11:54:33 +01:00
4e0b2f9fee
Merge pull request #11 from YetheSamartaka:main
Add Docker support
2025-11-07 16:17:45 +01:00
YetheSamartaka
9a4bcb6f97 Add Docker support
Adds comprehensive docker support
2025-11-07 13:59:16 +01:00
47a39184ad token usage counter for non-stream openai ollama endpoints added 2025-11-06 14:27:34 +01:00
f0f6069577 Merge branch 'dev-v0.4.x' of https://github.com/nomyo-ai/nomyo-router into dev-v0.4.x 2025-11-04 17:55:22 +01:00
4c9ec5b1b2 record and display total token usage on ollama endpoints using ollama client 2025-11-04 17:55:19 +01:00
60694e885b
Update README.md 2025-10-31 13:54:22 +01:00
9007f686c2 performance increase of iso8601_ns ~49% 2025-10-30 10:17:18 +01:00
20f4d1ac96
Merge pull request #10 from nomyo-ai/dev-v0.3.x
Fixes and Improvements for 0.4 release
2025-10-30 09:26:12 +01:00
b55f56333f
Merge pull request #9 from nomyo-ai/dependabot/pip/starlette-0.49.1
Bump starlette from 0.47.2 to 0.49.1
2025-10-30 09:23:00 +01:00
26dcbf9c02 fixing app logic and eventListeners in frontend 2025-10-30 09:06:21 +01:00
dependabot[bot]
b04f0e3a44
Bump starlette from 0.47.2 to 0.49.1
Bumps [starlette](https://github.com/Kludex/starlette) from 0.47.2 to 0.49.1.
- [Release notes](https://github.com/Kludex/starlette/releases)
- [Changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md)
- [Commits](https://github.com/Kludex/starlette/compare/0.47.2...0.49.1)

---
updated-dependencies:
- dependency-name: starlette
  dependency-version: 0.49.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-28 22:30:50 +00:00
3585f90437 fixing typos and smaller issues 2025-10-28 11:08:52 +01:00
b72673d693 check for base64 encoded images and remove alpha channel 2025-10-03 10:04:50 +02:00
11f6e2dca6 data-url handling and removing alpha channel in images 2025-09-24 18:10:17 +02:00
ac25feadf8 requirements fix 2025-09-24 16:40:26 +02:00
e66c0ed0fc new requirement for image preprocessing to downsize and convert to png for faster and safer transaction 2025-09-24 11:46:38 +02:00
738d981157 poc: message translation with images 2025-09-23 17:33:15 +02:00
8327ab4ae1 rm print statements 2025-09-23 14:47:55 +02:00
1668cb1577
Merge pull request #7 from nomyo-ai/dev-v0.3.x
Dev v0.3.x to v0.3.2
2025-09-23 13:12:58 +02:00
fcfabbe926 mitigating div by zero due to google genai sending completion_token=0 in first chunk 2025-09-23 13:08:17 +02:00
a74cc5be0f fixing endpoint usage metrics 2025-09-23 12:51:37 +02:00
19df75afa9 fixing types and params 2025-09-22 19:01:14 +02:00
c43dc4139f adding optional parameters in ollama to openai translation 2025-09-22 14:04:19 +02:00
18d2fca027 formatting Response Objects in rechunk and fixing TypeErrors in /api/chat and /api/generate 2025-09-22 09:30:27 +02:00
aeca77c1a1 formatting, condensing rechunk 2025-09-21 16:33:43 +02:00
ffee2baab8
Merge pull request #6 from nomyo-ai/dev-v0.3.x
This adds quite a few improvements and fixes, plus early preparations for v0.4,
increasing compatibility, stability, and even performance.
2025-09-21 16:23:18 +02:00
43d95fbf38 fixing headers, using ollama.Responses in rechunk class, fixing reserved words var usage, fixing embedding output, fixing model naming in frontend 2025-09-21 16:20:36 +02:00