Fix ontology selector defaults, add bypass mode, enforce domain/range (#929)

- Align similarity_threshold default to 0.3 everywhere (class signature
  had stale 0.7). Fix matching contradiction in tech-spec.
- Add bypass_selector_below parameter (default 5) to skip vector
  similarity selection when ontology element count is small enough.
- Enforce domain/range constraints in TripleConverter for object
  properties and datatype properties, with subclass hierarchy support.
  Properties with no declared domain/range pass through unchanged.
- Add unit tests for domain/range validation, subclass acceptance,
  polymorphic pass-through, and selector bypass.
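
  A minimal sketch of the two behaviours described above, for reviewers.
  All function and argument names here are hypothetical, not the
  project's code; only similarity_threshold and bypass_selector_below
  come from this commit. It assumes "below" means a strict less-than
  comparison and that each class has at most one parent.

```python
# Illustrative only: hypothetical helpers, not the actual selector or
# TripleConverter implementation.

def select_elements(scores, similarity_threshold=0.3, bypass_selector_below=5):
    """scores maps ontology element id -> best similarity score."""
    # Small ontology: skip similarity selection and keep every element.
    # (Assumption: "below" is a strict less-than on the element count.)
    if len(scores) < bypass_selector_below:
        return sorted(scores)
    # Otherwise keep only elements at or above the threshold.
    return sorted(e for e, s in scores.items() if s >= similarity_threshold)


def is_subclass(cls, ancestor, parents):
    """True if cls is ancestor or reaches it via parents (child -> parent)."""
    seen = set()
    while cls is not None and cls not in seen:
        if cls == ancestor:
            return True
        seen.add(cls)  # guard against cycles in the hierarchy
        cls = parents.get(cls)
    return False


def triple_allowed(subject_type, object_type, domain, range_, parents):
    """A domain or range of None means 'not declared' and passes through."""
    if domain is not None and not is_subclass(subject_type, domain, parents):
        return False
    if range_ is not None and not is_subclass(object_type, range_, parents):
        return False
    return True
```

  With parents = {"Dog": "Animal"}, a property declared with domain
  Animal accepts a Dog subject through the subclass walk, while a
  property with no declared domain or range accepts any types.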

Fixes #908, #920
cybermaggedon 2026-05-16 15:13:38 +01:00 committed by GitHub
parent aea4c2df8e
commit 38d9c746a8
5 changed files with 501 additions and 13 deletions

@@ -278,7 +278,7 @@ The system uses **FAISS (Facebook AI Similarity Search)** with IndexFlatIP for e
3. **Similarity Search**:
- For each text segment embedding, search the vector store
- Retrieve top-k (e.g., 10) most similar ontology elements
- - Apply similarity threshold (e.g., 0.7) to filter weak matches
+ - Apply similarity threshold (e.g., 0.3) to filter weak matches
- Aggregate results across all segments, tracking match frequencies
4. **Dependency Resolution**: