3ccaf78e5d
fix: simplify model version handling in proxy functions
...
Simplify the logic for handling model versions in `openai_chat_completions_proxy` and `openai_completions_proxy` by removing redundant conditions and initializing `local_model` earlier. This makes the code more readable and maintains the same functionality.
2025-12-13 12:34:24 +01:00
34d6abd28b
refactor: optimize token aggregation query and enhance chat proxy
...
- Refactored token aggregation query in db.py to use a single SQL query with SUM() instead of iterating through rows, improving performance
- Combined import statements in db.py and router.py to reduce lines of code
- Enhanced chat proxy in router.py to handle "moe-" prefixed models with multiple query execution and critique generation
- Added last_user_content() helper function to extract user content from messages
- Improved code readability and maintainability through these structural changes
2025-12-13 11:58:49 +01:00
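The token aggregation change described in this commit can be sketched as follows. This is a minimal illustration, not the code from db.py: the `token_counts` column names (`endpoint`, `model`, `input_tokens`, `output_tokens`) and the `total_tokens` helper are assumptions, and the standard-library `sqlite3` module is used so the example runs on its own.

```python
import sqlite3


def total_tokens(conn: sqlite3.Connection) -> tuple:
    """Return (input_tokens, output_tokens) summed over all rows in one query."""
    # COALESCE turns the NULL that SUM() yields on an empty table into 0.
    row = conn.execute(
        "SELECT COALESCE(SUM(input_tokens), 0), "
        "COALESCE(SUM(output_tokens), 0) FROM token_counts"
    ).fetchone()
    return row[0], row[1]


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE token_counts ("
    "endpoint TEXT, model TEXT, input_tokens INTEGER, output_tokens INTEGER)"
)
conn.executemany(
    "INSERT INTO token_counts VALUES (?, ?, ?, ?)",
    [("http://a", "m1", 10, 5), ("http://b", "m1", 7, 3)],
)
print(total_tokens(conn))
```

Letting SQLite compute the sums avoids transferring every row into Python just to add the values up.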
59a8ef3abb
refactor: use a persistent WAL-enabled connection with async locks
...
- Introduce a lazily initialized, shared aiosqlite connection stored in self._db and two asyncio locks (_db_lock, _operation_lock) for safe concurrent access
- Ensure the database directory exists before connecting and enable WAL journaling and foreign keys on first connect
- Add close method to gracefully close the persistent connection
- Guard initialization and write operations with _operation_lock to ensure single-threaded schema setup
- Switch to ON CONFLICT UPSERT for token_counts updates and initialize token_time_series table
- Add typing for _db (Optional[aiosqlite.Connection]) and adjust imports accordingly
addition: frontend button that triggers a total stats aggregation task, with a feedback span element to keep the user informed; the aggregation keeps the database footprint small
2025-12-02 12:18:23 +01:00
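The pattern described in this commit (lazy shared connection, WAL journaling, two asyncio locks, ON CONFLICT upsert) might look roughly like the sketch below. It is not the project's implementation: the standard-library `sqlite3` module stands in for aiosqlite so the example is self-contained, and the `TokenDB` class, the `add_tokens` and `total` methods, and the `token_counts` columns are hypothetical names chosen for illustration.

```python
import asyncio
import os
import sqlite3
import tempfile
from typing import Optional


class TokenDB:
    def __init__(self, path: str) -> None:
        self._path = path
        self._db: Optional[sqlite3.Connection] = None
        self._db_lock = asyncio.Lock()         # guards lazy connection setup
        self._operation_lock = asyncio.Lock()  # serializes write operations

    async def _get_db(self) -> sqlite3.Connection:
        async with self._db_lock:
            if self._db is None:
                # Make sure the database directory exists before connecting.
                os.makedirs(os.path.dirname(self._path) or ".", exist_ok=True)
                self._db = sqlite3.connect(self._path)
                self._db.execute("PRAGMA journal_mode=WAL")
                self._db.execute("PRAGMA foreign_keys=ON")
                self._db.execute(
                    "CREATE TABLE IF NOT EXISTS token_counts ("
                    "endpoint TEXT, model TEXT, total INTEGER NOT NULL, "
                    "PRIMARY KEY (endpoint, model))"
                )
                self._db.commit()
            return self._db

    async def add_tokens(self, endpoint: str, model: str, n: int) -> None:
        db = await self._get_db()
        async with self._operation_lock:
            # Insert a new counter row, or add to the existing one.
            db.execute(
                "INSERT INTO token_counts (endpoint, model, total) "
                "VALUES (?, ?, ?) "
                "ON CONFLICT(endpoint, model) "
                "DO UPDATE SET total = total + excluded.total",
                (endpoint, model, n),
            )
            db.commit()

    async def total(self, endpoint: str, model: str) -> int:
        db = await self._get_db()
        row = db.execute(
            "SELECT total FROM token_counts WHERE endpoint = ? AND model = ?",
            (endpoint, model),
        ).fetchone()
        return row[0] if row else 0

    async def close(self) -> None:
        async with self._db_lock:
            if self._db is not None:
                self._db.close()
                self._db = None


async def main() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        db = TokenDB(os.path.join(tmp, "data", "tokens.db"))
        await db.add_tokens("http://a", "m1", 10)
        await db.add_tokens("http://a", "m1", 5)
        total = await db.total("http://a", "m1")
        await db.close()
        return total


result = asyncio.run(main())
print(result)
```

With aiosqlite the connection calls would be awaited instead of called directly, but the locking and upsert structure would stay the same.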
0ffb321154
fixing total stats model, button, labels and code clean up
2025-11-28 14:59:29 +01:00
1c3f9a9dc4
fix model naming so the usage counter decrements correctly in /v1 endpoints
2025-11-24 09:33:54 +01:00
7b50a5a299
adding usage metrics to /v1 endpoints if stream == True
2025-11-21 09:56:42 +01:00
45d1d442ee
sqlite: adding connection pooling and WAL
2025-11-20 15:37:04 +01:00
aa23a4dd81
fixing timezone issues
2025-11-20 12:53:18 +01:00
0d187e91b9
fixing chart timescales
2025-11-20 09:53:28 +01:00
e0c6861f2f
aggregating token_counts for stats over all endpoints and adjusting the color mapping
2025-11-20 09:22:45 +01:00
3f77a8ec62
chart enhancements
2025-11-19 17:28:31 +01:00
79a7ca972b
initial chart view
2025-11-19 17:05:25 +01:00
541f2826e0
fixing token_queue, prepping chart view
2025-11-18 19:02:36 +01:00
baf5d98318
adding token timeseries counting in db for future data viz
2025-11-18 11:16:21 +01:00
8a05f2ac44
cache loaded models to decrease load on ollamas
2025-11-17 14:40:24 +01:00
06103e5f01
cache loaded models to decrease load on ollamas
2025-11-17 14:40:22 +01:00
4c7ebb5af4
cancel token_worker_task only if running
2025-11-14 15:53:26 +01:00
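A guard of the kind this commit describes could be sketched as below. The `token_worker` coroutine and the `stop_worker` helper are hypothetical stand-ins; only the idea of cancelling the task solely when it is still running comes from the commit message.

```python
import asyncio
from typing import Optional


async def token_worker(queue: asyncio.Queue) -> None:
    # Placeholder worker: drains the queue forever until cancelled.
    while True:
        await queue.get()
        queue.task_done()


async def stop_worker(task: Optional[asyncio.Task]) -> bool:
    """Cancel the task only if it exists and has not finished yet.

    Returns True if a cancellation was actually issued.
    """
    if task is None or task.done():
        return False
    task.cancel()
    try:
        await task  # wait for the cancellation to be processed
    except asyncio.CancelledError:
        pass
    return True


async def main() -> tuple:
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(token_worker(queue))
    await asyncio.sleep(0)  # give the worker a chance to start
    first = await stop_worker(task)    # task is running: cancelled
    second = await stop_worker(task)   # task already done: skipped
    third = await stop_worker(None)    # task never created: skipped
    return first, second, third


result = asyncio.run(main())
print(result)
```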
b9933e000f
rollback - needs more logic in v1/embedding
2025-11-13 13:32:46 +01:00
9f90bc9cd0
fixing /v1/embedding ollama notations
2025-11-13 12:40:40 +01:00
8aef941385
stopping the token_worker_task gracefully on shutdown
2025-11-13 10:13:10 +01:00
f14d9dc7da
don't query non-Ollama endpoints for health status
2025-11-13 10:06:23 +01:00
1427e98e6d
various performance improvements and replacing json with orjson
2025-11-10 15:37:46 +01:00
c6c1059ede
Merge pull request #12 from nomyo-ai/dev-v0.4.x
...
token usage counter for non-stream openai ollama endpoints and improvements
2025-11-08 11:54:33 +01:00
4e0b2f9fee
Merge pull request #11 from YetheSamartaka:main
...
Add Docker support
2025-11-07 16:17:45 +01:00
YetheSamartaka
9a4bcb6f97
Add Docker support
...
Adds comprehensive docker support
2025-11-07 13:59:16 +01:00
47a39184ad
token usage counter for non-stream openai ollama endpoints added
2025-11-06 14:27:34 +01:00
f0f6069577
Merge branch 'dev-v0.4.x' of https://github.com/nomyo-ai/nomyo-router into dev-v0.4.x
2025-11-04 17:55:22 +01:00
4c9ec5b1b2
record and display total token usage on ollama endpoints using ollama client
2025-11-04 17:55:19 +01:00
60694e885b
Update README.md
2025-10-31 13:54:22 +01:00
9007f686c2
performance increase of iso8601_ns ~49%
2025-10-30 10:17:18 +01:00
20f4d1ac96
Merge pull request #10 from nomyo-ai/dev-v0.3.x
...
Fixes and Improvements for 0.4 release
2025-10-30 09:26:12 +01:00
b55f56333f
Merge pull request #9 from nomyo-ai/dependabot/pip/starlette-0.49.1
...
Bump starlette from 0.47.2 to 0.49.1
2025-10-30 09:23:00 +01:00
26dcbf9c02
fixing app logic and eventListeners in frontend
2025-10-30 09:06:21 +01:00
dependabot[bot]
b04f0e3a44
Bump starlette from 0.47.2 to 0.49.1
...
Bumps [starlette](https://github.com/Kludex/starlette) from 0.47.2 to 0.49.1.
- [Release notes](https://github.com/Kludex/starlette/releases)
- [Changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md)
- [Commits](https://github.com/Kludex/starlette/compare/0.47.2...0.49.1)
---
updated-dependencies:
- dependency-name: starlette
  dependency-version: 0.49.1
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-10-28 22:30:50 +00:00
3585f90437
fixing typos and smaller issues
2025-10-28 11:08:52 +01:00
b72673d693
check for base64 encoded images and remove alpha channel
2025-10-03 10:04:50 +02:00
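The base64 check mentioned in this commit could be approximated with the standard library as in the sketch below. The `decode_image_payload` helper is a hypothetical name, not a function from the repository, and the alpha channel removal is left out because it needs an imaging library.

```python
import base64
import binascii
from typing import Optional


def decode_image_payload(value: str) -> Optional[bytes]:
    """Return the decoded bytes for a base64 data URL or bare base64 string.

    Returns None when the value is not valid base64 (for example a plain URL).
    """
    if value.startswith("data:"):
        # Data URLs look like "data:image/png;base64,<payload>".
        header, sep, payload = value.partition(",")
        if not sep or not header.endswith(";base64"):
            return None
        value = payload
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None


raw = b"\x89PNG\r\n"
encoded = base64.b64encode(raw).decode("ascii")
print(decode_image_payload("data:image/png;base64," + encoded) == raw)
print(decode_image_payload("http://example.com/a.png"))
```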
11f6e2dca6
data-url handling and removing alpha channel in images
2025-09-24 18:10:17 +02:00
ac25feadf8
requirements fix
2025-09-24 16:40:26 +02:00
e66c0ed0fc
new requirement for image preprocessing: downsize and convert to PNG for faster and safer transfer
2025-09-24 11:46:38 +02:00
738d981157
poc: message translation with images
2025-09-23 17:33:15 +02:00
8327ab4ae1
rm print statements
2025-09-23 14:47:55 +02:00
1668cb1577
Merge pull request #7 from nomyo-ai/dev-v0.3.x
...
Dev v0.3.x to v0.3.2
2025-09-23 13:12:58 +02:00
fcfabbe926
mitigating div by zero due to google genai sending completion_token=0 in first chunk
2025-09-23 13:08:17 +02:00
a74cc5be0f
fixing endpoint usage metrics
2025-09-23 12:51:37 +02:00
19df75afa9
fixing types and params
2025-09-22 19:01:14 +02:00
c43dc4139f
adding optional parameters in ollama to openai translation
2025-09-22 14:04:19 +02:00
18d2fca027
formatting Response Objects in rechunk and fixing TypeErrors in /api/chat and /api/generate
2025-09-22 09:30:27 +02:00
aeca77c1a1
formatting, condensing rechunk
2025-09-21 16:33:43 +02:00
ffee2baab8
Merge pull request #6 from nomyo-ai/dev-v0.3.x
...
This adds quite a few improvements and fixes, plus early preparations for v0.4,
increasing compatibility, stability, and even performance.
2025-09-21 16:23:18 +02:00
43d95fbf38
fixing headers, using ollama.Responses in rechunk class, fixing reserved words var usage, fixing embedding output, fixing model naming in frontend
2025-09-21 16:20:36 +02:00