Commit graph

6814 commits

Author SHA1 Message Date
CREDO23
a987ef81b2 add format_title helper for notification titles 2026-06-17 15:06:05 +02:00
CREDO23
5d20cf7c03 add notification TITLE_MAX_LENGTH constant 2026-06-17 15:06:05 +02:00
CREDO23
aee0f1ef7d add persist_scratch_index unit tests 2026-06-17 14:59:24 +02:00
CREDO23
a8a1f01945 update index batch parallel tests 2026-06-17 14:59:24 +02:00
CREDO23
aca23b4731 wire persist_scratch_index into scratch reindex 2026-06-17 14:59:24 +02:00
CREDO23
34de6c6f87 batch chunk inserts in persist_scratch_index 2026-06-17 14:59:24 +02:00
CREDO23
220d9c4fbb add INDEXING_CHUNK_INSERT_BATCH_SIZE config 2026-06-17 14:59:19 +02:00
DESKTOP-RTLN3BA\$punk
0fe650fd8e Merge commit '7ce409c580' into dev 2026-06-16 22:48:14 -07:00
Rohan Verma
f75878f907
Merge pull request #1506 from okxint/fix/xinference-relative-image-url
fix(image-gen): resolve relative URLs returned by Xinference and compatible backends
2026-06-16 22:41:52 -07:00
okxint
a12cd21f2f fix(image-gen): resolve relative URLs returned by Xinference and compatible backends
Some OpenAI-compatible image backends (e.g. Xinference) return a relative
URL like /files/image.png in data[0].url instead of an absolute one.
Browsers cannot resolve these, causing images to fail to load.

Track the provider's api_base after resolving model config via to_litellm().
When the returned URL starts with "/", extract the origin (scheme + host + port)
from api_base and prepend it to produce a full absolute URL.

No behaviour change for providers that return absolute URLs (OpenAI, Azure, etc).

Closes #1496
2026-06-17 10:57:39 +05:30
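The URL handling described in a12cd21f2f can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code; the function name `resolve_image_url` is hypothetical, and `api_base` is assumed to be the provider base URL tracked after model config resolution.

```python
from typing import Optional
from urllib.parse import urlsplit


def resolve_image_url(url: str, api_base: Optional[str]) -> str:
    """Return an absolute URL, prefixing the api_base origin to relative paths.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    # Absolute URLs (OpenAI, Azure, etc.) pass through unchanged.
    if not url.startswith("/") or not api_base:
        return url
    parts = urlsplit(api_base)
    if not parts.scheme or not parts.netloc:
        # api_base has no usable origin, so leave the URL as returned.
        return url
    # netloc carries host and port; any path on api_base (e.g. /v1) is dropped.
    return f"{parts.scheme}://{parts.netloc}{url}"
```

With an `api_base` of `http://localhost:9997/v1`, the relative path `/files/image.png` resolves to `http://localhost:9997/files/image.png`.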
Rohan Verma
a49103870b
Merge pull request #1503 from dmitrymaranik/fix/connector-index-cross-tenant-authz
fix(connectors): scope index endpoint authorization to the connector's own search space
2026-06-16 17:01:13 -07:00
Rohan Verma
7ce409c580
Merge pull request #1502 from MODSetter/fix/db-startup-index-lock-hang
hotpatch: Fix/db startup index lock hang
2026-06-16 16:28:38 -07:00
DESKTOP-RTLN3BA\$punk
b9702b3245 chore: linting 2026-06-16 16:27:16 -07:00
DESKTOP-RTLN3BA\$punk
da64433439 fix(db): reap orphaned idle-in-transaction sessions on the Celery engine
The long-running ingestion/podcast/video tasks run on a separate Celery
engine (NullPool), so the web engine's idle_in_transaction_session_timeout
did not cover them — which is exactly where the original 11h zombie
(INSERT INTO chunks) came from. Apply the same protection to the Celery
engine with a generous 60-minute default so a worker that hangs/crashes
mid-transaction can't hold locks on documents/chunks indefinitely, while
never reaping a legitimate per-document embed window.

- config + .env.example: DB_CELERY_IDLE_IN_TX_TIMEOUT_MS (default 3600000).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-16 16:26:04 -07:00
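One way the Celery-side timeout from da64433439 might be wired is sketched below. It assumes the Celery engine connects through asyncpg, whose `server_settings` connect option accepts session-level parameters as strings. The helper name `celery_connect_args` is hypothetical, and treating a non-positive value as "omit the setting" is an assumption, not something the commit states.

```python
import os
from typing import Mapping


def celery_connect_args(env: Mapping[str, str] = os.environ) -> dict:
    """Build connect_args carrying the idle-in-transaction timeout.

    Hypothetical helper; only the env var name and its default come from the commit.
    """
    timeout_ms = int(env.get("DB_CELERY_IDLE_IN_TX_TIMEOUT_MS", "3600000"))
    if timeout_ms <= 0:
        # Assumption: a non-positive value means "do not set a timeout".
        return {}
    # PostgreSQL reads a unitless value for this setting as milliseconds.
    return {
        "server_settings": {
            "idle_in_transaction_session_timeout": str(timeout_ms),
        }
    }
```

The resulting dict could be passed as `connect_args` when the NullPool engine is created, so every worker connection starts with the timeout applied.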
DESKTOP-RTLN3BA\$punk
89cc3b37ee fix(db): prevent boot-time index DDL from hanging FastAPI startup
A single abandoned "idle in transaction" session held locks on the
documents table, which blocked the non-concurrent CREATE INDEX (hnsw)
run inside the FastAPI lifespan. Each API restart queued another
CREATE INDEX behind an advisory lock, leaving the server stuck at
"Waiting for application startup." indefinitely and freezing ingestion
writes.

Changes:
- setup_indexes(): build every index with CREATE INDEX CONCURRENTLY
  (non-blocking ShareUpdateExclusiveLock) under a per-session
  lock_timeout, and make each statement non-fatal so a contended/slow
  build is retried next boot instead of wedging startup. Drop leftover
  invalid indexes before rebuilding.
- create_db_and_tables(): apply lock_timeout to extension/create_all
  DDL and gate the whole bootstrap behind DB_BOOTSTRAP_ON_STARTUP.
- engine: set idle_in_transaction_session_timeout (asyncpg) so an
  abandoned transaction is reaped automatically.
- config + .env.example: DB_BOOTSTRAP_ON_STARTUP, DB_DDL_LOCK_TIMEOUT_MS,
  DB_IDLE_IN_TX_TIMEOUT_MS.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-16 16:18:49 -07:00
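The non-fatal, concurrent index build described in 89cc3b37ee could look something like the sketch below. It is not the project's `setup_indexes()`: the `build_index` function, the `execute` and `is_invalid` callables, and the example index definition are all hypothetical stand-ins so the flow can be shown without a database.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def build_index(
    execute: Callable[[str], None],
    is_invalid: Callable[[str], bool],
    name: str,
    table: str,
    definition: str,
    lock_timeout_ms: int,
) -> bool:
    """Try to build one index without blocking writes; never raise.

    `execute` must run each statement in autocommit mode, because
    CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
    """
    try:
        # Bound how long any statement in this session may wait for a lock.
        execute(f"SET lock_timeout = {int(lock_timeout_ms)}")
        # An interrupted concurrent build leaves an invalid index behind.
        if is_invalid(name):
            execute(f"DROP INDEX CONCURRENTLY IF EXISTS {name}")
        execute(
            f"CREATE INDEX CONCURRENTLY IF NOT EXISTS {name} ON {table} {definition}"
        )
        return True
    except Exception as exc:
        # Non-fatal: log and let the next boot retry instead of wedging startup.
        logger.warning("index %s not built this boot: %s", name, exc)
        return False
```

In a real deployment `is_invalid` would typically consult `pg_index.indisvalid`. Returning `False` rather than raising is what lets startup continue when a build is contended or slow.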
Dmitry Maranik
81fc467187 test(connectors): regression tests for cross-search-space index authorization
Two integration tests pinning the connector index endpoint's authorization:

- cross-space index (attacker owns space B, connector lives in victim's
  space A, request passes search_space_id=B) is rejected with 404 at the
  search-space reconciliation, before the permission check (which would
  otherwise pass for the attacker's own space).
- same-space index authorizes check_permission against the connector's
  own search space, not the caller-supplied query param.

Mirrors the existing tests/integration harness (direct handler calls with
the savepoint-rolled-back db_session; check_permission patched so the test
needs no real RBAC wiring).
2026-06-16 16:18:40 -07:00
Dmitry Maranik
e1ea82d7cf fix(connectors): scope index endpoint authorization to the connector's own search space
The POST /search-source-connectors/{connector_id}/index endpoint loaded
the connector by id and then called check_permission() against the
client-supplied search_space_id query parameter (the caller's own space)
rather than the connector's own search_space_id, and never verified that
the two matched.

A user could therefore index another user's connector by passing their
own search_space_id: the indexer ran with the victim connector's stored
credentials and wrote the fetched content into the attacker's search
space. The read/update/delete handlers already authorize against
connector.search_space_id; this brings the index handler in line.

Reject a connector that does not belong to the requested search space
(404, to avoid disclosing connectors in other spaces) and authorize the
permission check against connector.search_space_id.
2026-06-16 15:58:30 -07:00
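The authorization order described in e1ea82d7cf can be illustrated with a small sketch. The `Connector` dataclass, the `NotFound` exception, and the `authorize_index` function are hypothetical stand-ins; the real handler uses the project's own models, HTTP errors, and `check_permission()` signature.

```python
from dataclasses import dataclass
from typing import Callable


class NotFound(Exception):
    """Stand-in for an HTTP 404 response."""


@dataclass
class Connector:
    id: int
    search_space_id: int


def authorize_index(
    connector: Connector,
    requested_search_space_id: int,
    check_permission: Callable[[int], None],
) -> int:
    """Reconcile the search space first, then authorize against the connector's own space."""
    # Reject a mismatch with 404 so connectors in other spaces are not disclosed.
    if connector.search_space_id != requested_search_space_id:
        raise NotFound("connector not found")
    # Authorize against the stored space, never the caller-supplied parameter.
    check_permission(connector.search_space_id)
    return connector.search_space_id
```

Because the mismatch check runs before the permission check, a caller who holds permissions only in their own space never reaches `check_permission` for another user's connector.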
DESKTOP-RTLN3BA\$punk
8172f0f586 chore(migration): added dead users cleanup 2026-06-16 15:48:17 -07:00
DESKTOP-RTLN3BA\$punk
5d99489f4b feat(migration): implement chunk position backfill with batched updates and indexing for improved performance 2026-06-16 15:19:56 -07:00
Thierry CH.
284df841ef
Merge pull request #1501 from CREDO23/feat/podcast-brief-duration-seconds
feat(podcasts): short default brief length with seconds and unit picker
2026-06-16 15:04:07 -07:00
CREDO23
7584312712 style(podcasts): fix ruff issues in podcast spec schema
Remove duplicate typing import and format legacy minute coercion guard.
2026-06-16 23:57:36 +02:00
CREDO23
7a415b61ea test: align QuotaInsufficientError fixtures with balance_micros API
Billable calls now raise quota errors with balance_micros instead of
used_micros/limit_micros; update mocks so CI passes on main.
2026-06-16 23:56:11 +02:00
CREDO23
fd96c930bf test(podcasts): cover seconds duration and legacy minute specs 2026-06-16 23:38:28 +02:00
CREDO23
f997b6464e test(podcasts): update renderer test for second-based duration 2026-06-16 23:38:28 +02:00
CREDO23
cb70b64a70 test(podcasts): update unit fixtures for second-based duration 2026-06-16 23:38:28 +02:00
CREDO23
38991c7db8 test(podcasts): update integration fixtures for second-based duration 2026-06-16 23:38:28 +02:00
CREDO23
bab3f7c0d4 feat(web): add unit dropdown for podcast brief target length 2026-06-16 23:38:28 +02:00
CREDO23
608620d649 feat(web): add seconds-based podcast duration types with legacy support 2026-06-16 23:38:28 +02:00
CREDO23
16d226e5ce refactor(podcasts): plan transcript length from midpoint seconds 2026-06-16 23:38:28 +02:00
CREDO23
116c38feac refactor(podcasts): build DurationTarget from brief seconds config 2026-06-16 23:38:28 +02:00
CREDO23
af08e2f033 refactor(podcasts): propose brief with min_seconds and max_seconds 2026-06-16 23:38:28 +02:00
CREDO23
d0ed5b94d9 refactor(podcasts): use shared second-based brief duration defaults 2026-06-16 23:38:28 +02:00
CREDO23
845653cbac feat(podcasts): pass min_seconds and max_seconds when proposing brief 2026-06-16 23:38:27 +02:00
CREDO23
085442ed9a feat(podcasts): use seconds defaults on create podcast request 2026-06-16 23:38:27 +02:00
CREDO23
32e0d21604 feat(podcasts): store brief duration in seconds with legacy load 2026-06-16 23:38:27 +02:00
CREDO23
9583e8f250 feat(podcasts): add shared duration limit constants 2026-06-16 23:38:27 +02:00
Rohan Verma
b6d25d3828
Merge pull request #1499 from AnishSarkar22/feat/reverse-proxy
feat: Add single-origin reverse proxy deployment with runtime web config
2026-06-16 14:03:27 -07:00
Anish Sarkar
61f071ae68 refactor(web): replace Card component with Alert for messaging channels notification to enhance user experience 2026-06-17 00:06:41 +05:30
Anish Sarkar
9b7e278114 refactor(config): update GATEWAY_ENABLED variable to FALSE and adjust related configurations for improved messaging gateway handling 2026-06-16 23:49:26 +05:30
Thierry CH.
683a827300
Merge pull request #1500 from CREDO23/fix/podcast-stream-missing-audio
fix(podcasts): guard stream when audio missing and share object store volume
2026-06-16 11:16:27 -07:00
CREDO23
a7be41d50a fix(docker): share persistent object_store volume in dev 2026-06-16 20:09:08 +02:00
CREDO23
fc045d200d fix(docker): share persistent object_store volume across services 2026-06-16 20:09:08 +02:00
CREDO23
1048d0afc3 test(podcasts): cover public stream missing-object 404 2026-06-16 20:09:08 +02:00
CREDO23
810ded2dde test(podcasts): cover in-flight 409 and missing-object 404 2026-06-16 20:09:08 +02:00
CREDO23
86a8833fb4 test(podcasts): add exists to fake storage backend 2026-06-16 20:09:08 +02:00
CREDO23
1d70af4684 fix(podcasts): guard public stream against missing audio 2026-06-16 20:09:08 +02:00
CREDO23
0c2808640a fix(podcasts): guard stream against missing audio 2026-06-16 20:09:08 +02:00
CREDO23
d2558e546e feat(podcasts): add audio_exists storage helper 2026-06-16 20:09:08 +02:00
Anish Sarkar
4ed6343b91 refactor(docker): remove docker-entrypoint.sh and update Dockerfile to use CMD for server execution 2026-06-16 22:01:23 +05:30
Anish Sarkar
55c2e5c0d8 refactor(web): enhance redirect response in callback route 2026-06-16 21:00:53 +05:30