nomyo-router

Author	SHA1	Message	Date
alpha-nerd-nomyo	3ccaf78e5d	fix: simplify model version handling in proxy functions Simplify the logic for handling model versions in `openai_chat_completions_proxy` and `openai_completions_proxy` by removing redundant conditions and initializing `local_model` earlier. This makes the code more readable and maintains the same functionality.	2025-12-13 12:34:24 +01:00
alpha-nerd-nomyo	34d6abd28b	refactor: optimize token aggregation query and enhance chat proxy - Refactored token aggregation query in db.py to use a single SQL query with SUM() instead of iterating through rows, improving performance - Combined import statements in db.py and router.py to reduce lines of code - Enhanced chat proxy in router.py to handle "moe-" prefixed models with multiple query execution and critique generation - Added last_user_content() helper function to extract user content from messages - Improved code readability and maintainability through these structural changes	2025-12-13 11:58:49 +01:00
alpha-nerd-nomyo	59a8ef3abb	refactor: use a persistent WAL-enabled connection with async locks - Introduce a lazily initialized, shared aiosqlite connection stored in self._db and two asyncio locks (_db_lock, _operation_lock) for safe concurrent access - Ensure the database directory exists before connecting and enable WAL journaling and foreign keys on first connect - Add close method to gracefully close the persistent connection - Guard initialization and write operations with _operation_lock to ensure single-threaded schema setup - Switch to ON CONFLICT UPSERT for token_counts updates and initialize token_time_series table - Add typing for _db (Optional[aiosqlite.Connection]) and adjust imports accordingly addition: Frontend button with total stats aggregation task and feedback span element to keep user informed and a small database footprint	2025-12-02 12:18:23 +01:00
alpha-nerd-nomyo	0ffb321154	fixing total stats model, button, labels and code clean up	2025-11-28 14:59:29 +01:00
alpha-nerd-nomyo	1c3f9a9dc4	fix model naming to allow correct decrement usage counter in /v1 endpoints	2025-11-24 09:33:54 +01:00
alpha-nerd-nomyo	7b50a5a299	adding usage metrics to /v1 endpoints if stream == True	2025-11-21 09:56:42 +01:00
alpha-nerd-nomyo	45d1d442ee	sqlite: adding connection pooling and WAL	2025-11-20 15:37:04 +01:00
alpha-nerd-nomyo	aa23a4dd81	fixing timezone issues	2025-11-20 12:53:18 +01:00
alpha-nerd-nomyo	0d187e91b9	fixing chart timescales	2025-11-20 09:53:28 +01:00
alpha-nerd-nomyo	e0c6861f2f	aggregating token_counts for stats over all endpoints and adjusting the color mapping	2025-11-20 09:22:45 +01:00
alpha-nerd-nomyo	3f77a8ec62	chart enhancements	2025-11-19 17:28:31 +01:00
alpha-nerd-nomyo	79a7ca972b	initial chart view	2025-11-19 17:05:25 +01:00
alpha-nerd-nomyo	541f2826e0	fixing token_queue, prepping chart view	2025-11-18 19:02:36 +01:00
alpha-nerd-nomyo	baf5d98318	adding token timeseries counting in db for future data viz	2025-11-18 11:16:21 +01:00
alpha-nerd-nomyo	8a05f2ac44	cache loaded models to decrease load on ollamas	2025-11-17 14:40:24 +01:00
alpha-nerd-nomyo	06103e5f01	cache loaded models to decrease load on ollamas	2025-11-17 14:40:22 +01:00
alpha-nerd-nomyo	4c7ebb5af4	cancel token_worker_task only if running	2025-11-14 15:53:26 +01:00
alpha-nerd-nomyo	b9933e000f	rollback - needs more logic in v1/embedding	2025-11-13 13:32:46 +01:00
alpha-nerd-nomyo	9f90bc9cd0	fixing /v1/embedding ollama notations	2025-11-13 12:40:40 +01:00
alpha-nerd-nomyo	8aef941385	stopping the token_worker_task gracefully on shutdown	2025-11-13 10:13:10 +01:00
alpha-nerd-nomyo	f14d9dc7da	don't query non-Ollama endpoints for health status	2025-11-13 10:06:23 +01:00
alpha-nerd-nomyo	1427e98e6d	various performance improvements and json replacement orjson	2025-11-10 15:37:46 +01:00
alpha-nerd-nomyo	c6c1059ede	Merge pull request #12 from nomyo-ai/dev-v0.4.x token usage counter for non-stream openai ollama endpoints and improvements	2025-11-08 11:54:33 +01:00
alpha-nerd-nomyo	4e0b2f9fee	Merge pull request #11 from YetheSamartaka:main Add Docker support	2025-11-07 16:17:45 +01:00
YetheSamartaka	9a4bcb6f97	Add Docker support Adds comprehensive docker support	2025-11-07 13:59:16 +01:00
alpha-nerd-nomyo	47a39184ad	token usage counter for non-stream openai ollama endpoints added	2025-11-06 14:27:34 +01:00
alpha-nerd-nomyo	f0f6069577	Merge branch 'dev-v0.4.x' of https://github.com/nomyo-ai/nomyo-router into dev-v0.4.x	2025-11-04 17:55:22 +01:00
alpha-nerd-nomyo	4c9ec5b1b2	record and display total token usage on ollama endpoints using ollama client	2025-11-04 17:55:19 +01:00
alpha-nerd-nomyo	60694e885b	Update README.md	2025-10-31 13:54:22 +01:00
alpha-nerd-nomyo	9007f686c2	performance increase of iso8601_ns ~49%	2025-10-30 10:17:18 +01:00
alpha-nerd-nomyo	20f4d1ac96	Merge pull request #10 from nomyo-ai/dev-v0.3.x Fixes and Improvements for 0.4 release	2025-10-30 09:26:12 +01:00
alpha-nerd-nomyo	b55f56333f	Merge pull request #9 from nomyo-ai/dependabot/pip/starlette-0.49.1 Bump starlette from 0.47.2 to 0.49.1	2025-10-30 09:23:00 +01:00
alpha-nerd-nomyo	26dcbf9c02	fixing app logic and eventListeners in frontend	2025-10-30 09:06:21 +01:00
dependabot[bot]	b04f0e3a44	Bump starlette from 0.47.2 to 0.49.1 Bumps [starlette](https://github.com/Kludex/starlette) from 0.47.2 to 0.49.1. - [Release notes](https://github.com/Kludex/starlette/releases) - [Changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md) - [Commits](https://github.com/Kludex/starlette/compare/0.47.2...0.49.1) --- updated-dependencies: - dependency-name: starlette dependency-version: 0.49.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-28 22:30:50 +00:00
alpha-nerd-nomyo	3585f90437	fixing typos and smaller issues	2025-10-28 11:08:52 +01:00
alpha-nerd-nomyo	b72673d693	check for base64 encoded images and remove alpha channel	2025-10-03 10:04:50 +02:00
alpha-nerd-nomyo	11f6e2dca6	data-url handling and removing alpha channel in images	2025-09-24 18:10:17 +02:00
alpha-nerd-nomyo	ac25feadf8	requirements fix	2025-09-24 16:40:26 +02:00
alpha-nerd-nomyo	e66c0ed0fc	new requirement for image preprocessing to downsize and convert to png for faster and safer transaction	2025-09-24 11:46:38 +02:00
alpha-nerd-nomyo	738d981157	poc: message translation with images	2025-09-23 17:33:15 +02:00
alpha-nerd-nomyo	8327ab4ae1	rm print statements	2025-09-23 14:47:55 +02:00
alpha-nerd-nomyo	1668cb1577	Merge pull request #7 from nomyo-ai/dev-v0.3.x Dev v0.3.x to v0.3.2	2025-09-23 13:12:58 +02:00
alpha-nerd-nomyo	fcfabbe926	mitigating div by zero due to google genai sending completion_token=0 in first chunk	2025-09-23 13:08:17 +02:00
alpha-nerd-nomyo	a74cc5be0f	fixing endpoint usage metrics	2025-09-23 12:51:37 +02:00
alpha-nerd-nomyo	19df75afa9	fixing types and params	2025-09-22 19:01:14 +02:00
alpha-nerd-nomyo	c43dc4139f	adding optional parameters in ollama to openai translation	2025-09-22 14:04:19 +02:00
alpha-nerd-nomyo	18d2fca027	formatting Response Objects in rechunk and fixing TypeErrors in /api/chat and /api/generate	2025-09-22 09:30:27 +02:00
alpha-nerd-nomyo	aeca77c1a1	formatting, condensing rechunk	2025-09-21 16:33:43 +02:00
alpha-nerd-nomyo	ffee2baab8	Merge pull request #6 from nomyo-ai/dev-v0.3.x This is adding quite a few improvements, fixes and already preparations for v0.4 increasing compatibility and stability and even performance.	2025-09-21 16:23:18 +02:00
alpha-nerd-nomyo	43d95fbf38	fixing headers, using ollama.Responses in rechunk class, fixing reserved words var usage, fixing embedding output, fixing model naming in frontend	2025-09-21 16:20:36 +02:00

141 commits