mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-24 02:38:06 +02:00
* exec/query: pushdown IR filters via DataFusion Expr (Scanner::filter_expr)

Switches `execute_node_scan` from string-flattened Lance SQL pushdown (`build_lance_filter` + `scanner.filter(&str)`) to structured DataFusion Expr pushdown (`build_lance_filter_expr` + `scanner.filter_expr(Expr)`).

## What this enables

1. **`CompOp::Contains` now pushes down.** `ir_filter_to_sql` returned `None` for list-contains (the comment said *"Can't pushdown list contains"*) because string SQL can't easily express it. With Expr, it lowers to DataFusion's `array_has(col, value)` builtin via the `nested_expressions` feature, and pushes down to Lance's scan layer the same way Eq/Lt/etc. do. Pinned by the new regression test `end_to_end::ir_filter_with_list_contains_pushes_down`.

2. **DataFusion 53's optimizer rules now reach our predicates.** Once the Expr lands at the Lance scanner, DF's planner runs:
   - `IN`-list vectorized eq kernel (DF #20528)
   - `PhysicalExprSimplifier` (DF #20111)
   - CASE WHEN x THEN y ELSE NULL shortcut (DF #20097)
   - Push limit into hash join (DF #20228)

   None of these were applicable before because the string SQL path short-circuited the optimizer.

## Scope

This is one of three string-flattened pushdown sites; the other two (`hydrate_nodes`/Expand pushdown at query.rs:771-796 and the mutation delete path in `exec/mutation.rs::predicate_to_sql`) stay on the SQL string path for now:

- The Expand pushdown still serializes through `hydrate_nodes`'s `extra_filter_sql: Option<&str>` parameter. Migrating it changes the `TableStorage` trait surface (`scan_stream(filter: Option<&str>)` → `Option<Expr>`) and the cascading call sites, which is out of scope for this MR.
- The mutation delete predicate still goes through `Dataset::delete(&str)` in Lance 6.0.1. MR-A (delete two-phase via Lance #6658, gated on the Lance v7 bump per issue #112) will migrate that path to `DeleteBuilder::execute_uncommitted` taking an Expr.

The existing `ir_filter_to_sql` / `ir_expr_to_sql` / `literal_to_sql` helpers stay in place to serve the remaining string-SQL consumers (mutation predicates). They get retired when the other call sites migrate.

## Cargo

Enables the `nested_expressions` feature on the `datafusion` workspace dep. Lance already pulls in `datafusion-functions-nested` transitively (it's listed in their feature set), so this just exposes the `datafusion::functions_nested::expr_fn::array_has` re-export. No transitive dep change (Cargo.lock unchanged).

## Tests

- New: `ir_filter_with_list_contains_pushes_down`, which pins the case that was previously impossible (`ir_filter_to_sql` returning `None`).
- 906/906 workspace tests still pass.
- 417/417 engine integration tests pass (was 416 + the new one).
- 19/19 failpoints (recovery canary).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: pin rustfs/rustfs to 1.0.0-beta.3 (last known-good before creds-policy break)

The RustFS S3 Integration job started failing 2026-05-23 with all 3 tests panicking on the first PUT:

    HTTP error: error sending request

The "Dump RustFS logs on failure" step revealed the container was dying at startup:

    [FATAL] Server encountered an error and is shutting down: Default root credentials are not allowed on non-loopback listeners; set RUSTFS_ACCESS_KEY and RUSTFS_SECRET_KEY to non-default values, bind to loopback, or set RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true for local development only

`rustfs/rustfs:latest` was updated 2026-05-21 (1.0.0-beta.4) with a credentials-policy check that rejects `rustfsadmin`/`rustfsadmin` as "default" values. PR #111 passed yesterday because it ran against beta.3; today's runs against beta.4 fail at container startup.

This is unrelated to PR #113's Expr-pushdown refactor; the bump just happened to hit the same week.

Pin to 1.0.0-beta.3 (2026-05-14, last tag before the change). The right long-term fix is one of:

- Rotate the CI creds to less-default values (less coupling to RustFS's "default" set definition)
- Set `RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true` per the error message
- Use a workflow service container with controlled lifecycle

Deferred; pinning is the minimal restore. Also incidentally documents *which* version we tested against, which `:latest` never did.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
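The difference between the two pushdown paths described in the first commit message can be illustrated with a small, std-only sketch. Everything in it is a hypothetical stand-in: `IrFilter`, `Literal`, and the toy `Expr` enum are not the project's IR or DataFusion's `Expr`, and `filter_to_sql` / `filter_to_expr` only mimic the roles of `ir_filter_to_sql` and `build_lance_filter_expr`. The point it tries to show is why a string lowering gives up on `Contains` while a structured lowering can emit an `array_has`-style call.

```rust
// Hypothetical stand-ins for illustration only; not the real omnigraph or DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum CompOp {
    Eq,
    Lt,
    Contains,
}

#[derive(Debug, Clone, PartialEq)]
enum Literal {
    Int(i64),
    Str(String),
}

#[derive(Debug, Clone, PartialEq)]
struct IrFilter {
    column: String,
    op: CompOp,
    value: Literal,
}

// Toy structured expression tree, loosely shaped like a query-engine Expr.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(Literal),
    Binary { left: Box<Expr>, op: &'static str, right: Box<Expr> },
    Call { func: &'static str, args: Vec<Expr> },
}

// Render a literal as SQL text, doubling single quotes in strings.
fn literal_to_sql(value: &Literal) -> String {
    match value {
        Literal::Int(i) => i.to_string(),
        Literal::Str(s) => format!("'{}'", s.replace('\'', "''")),
    }
}

// String path: list-contains has no lowering here, so the filter is not pushed down.
fn filter_to_sql(filter: &IrFilter) -> Option<String> {
    let op = match filter.op {
        CompOp::Eq => "=",
        CompOp::Lt => "<",
        CompOp::Contains => return None,
    };
    Some(format!("{} {} {}", filter.column, op, literal_to_sql(&filter.value)))
}

// Structured path: every operator, including Contains, lowers to an expression node.
fn filter_to_expr(filter: &IrFilter) -> Expr {
    let col = Expr::Column(filter.column.clone());
    let lit = Expr::Literal(filter.value.clone());
    match filter.op {
        CompOp::Eq => Expr::Binary { left: Box::new(col), op: "=", right: Box::new(lit) },
        CompOp::Lt => Expr::Binary { left: Box::new(col), op: "<", right: Box::new(lit) },
        CompOp::Contains => Expr::Call { func: "array_has", args: vec![col, lit] },
    }
}
```

In this sketch a `Contains` filter on a hypothetical `tags` column yields `None` from `filter_to_sql` but an `Expr::Call` named `array_has` from `filter_to_expr`, which mirrors the behavior change the commit message describes.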
1930 lines
58 KiB
Rust
mod helpers;

use arrow_array::{Array, Int32Array, RecordBatch, StringArray};
use futures::TryStreamExt;

use omnigraph::db::{Omnigraph, ReadTarget};
use omnigraph::loader::{LoadMode, load_jsonl, load_jsonl_file};
use omnigraph_compiler::ir::ParamMap;

use helpers::*;

// ─── Init + Load ────────────────────────────────────────────────────────────

#[tokio::test]
async fn init_creates_schema_file_and_manifest() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();

    let db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();

    assert!(dir.path().join("_schema.pg").exists());
    assert!(dir.path().join("__manifest").exists());
    assert_eq!(db.catalog().node_types.len(), 2);
    assert_eq!(db.catalog().edge_types.len(), 2);
}

#[tokio::test]
async fn open_restores_full_state() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();

    let original = init_and_load(&dir).await;
    let v = version_main(&original).await.unwrap();
    drop(original);

    let reopened = Omnigraph::open(uri).await.unwrap();
    assert_eq!(reopened.catalog().node_types.len(), 2);
    assert_eq!(reopened.catalog().edge_types.len(), 2);
    // Version should be what we left it at
    // (manifest was committed during load)
    assert!(version_main(&reopened).await.unwrap() >= v);
}

#[tokio::test]
async fn load_populates_all_types() {
    let dir = tempfile::tempdir().unwrap();
    let db = init_and_load(&dir).await;

    let snap = snapshot_main(&db).await.unwrap();

    // 4 persons
    let person_ds = snap.open("node:Person").await.unwrap();
    assert_eq!(person_ds.count_rows(None).await.unwrap(), 4);

    // 2 companies
    let company_ds = snap.open("node:Company").await.unwrap();
    assert_eq!(company_ds.count_rows(None).await.unwrap(), 2);

    // 3 Knows edges
    let knows_ds = snap.open("edge:Knows").await.unwrap();
    assert_eq!(knows_ds.count_rows(None).await.unwrap(), 3);

    // 2 WorksAt edges
    let works_at_ds = snap.open("edge:WorksAt").await.unwrap();
    assert_eq!(works_at_ds.count_rows(None).await.unwrap(), 2);
}

// ─── Read consistency ───────────────────────────────────────────────────────

#[tokio::test]
async fn node_ids_are_key_values() {
    let dir = tempfile::tempdir().unwrap();
    let db = init_and_load(&dir).await;

    let batches = read_table(&db, "node:Person").await;
    let mut ids = collect_column_strings(&batches, "id");
    ids.sort();
    assert_eq!(ids, vec!["Alice", "Bob", "Charlie", "Diana"]);
}

#[tokio::test]
async fn node_properties_are_correct() {
    let dir = tempfile::tempdir().unwrap();
    let db = init_and_load(&dir).await;

    let batches = read_table(&db, "node:Person").await;
    let batch = &batches[0];
    let ids = batch
        .column_by_name("id")
        .unwrap()
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    let ages = batch
        .column_by_name("age")
        .unwrap()
        .as_any()
        .downcast_ref::<Int32Array>()
        .unwrap();

    // Find Alice's row and check age
    let alice_idx = (0..ids.len()).find(|&i| ids.value(i) == "Alice").unwrap();
    assert_eq!(ages.value(alice_idx), 30);
}

#[tokio::test]
async fn entity_at_returns_typed_json_values() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let schema = r#"
node Flagged {
    slug: String @key
    active: Bool
    rating: I32?
}
"#;
    let data = r#"{"type":"Flagged","data":{"slug":"alpha","active":true,"rating":42}}"#;

    let mut db = Omnigraph::init(uri, schema).await.unwrap();
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let entity = db
        .entity_at_target(ReadTarget::branch("main"), "node:Flagged", "alpha")
        .await
        .unwrap()
        .unwrap();
    assert_eq!(entity["id"], serde_json::json!("alpha"));
    assert_eq!(entity["active"], serde_json::json!(true));
    assert_eq!(entity["rating"], serde_json::json!(42));
}

#[tokio::test]
async fn nullable_vectors_round_trip_as_null() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let schema = r#"
node Doc {
    slug: String @key
    embedding: Vector(2)?
}
"#;
    let data = r#"{"type":"Doc","data":{"slug":"a"}}
{"type":"Doc","data":{"slug":"b","embedding":[1.0,2.0]}}"#;

    let mut db = Omnigraph::init(uri, schema).await.unwrap();
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let missing = db
        .entity_at_target(ReadTarget::branch("main"), "node:Doc", "a")
        .await
        .unwrap()
        .unwrap();
    let present = db
        .entity_at_target(ReadTarget::branch("main"), "node:Doc", "b")
        .await
        .unwrap()
        .unwrap();

    assert!(missing["embedding"].is_null());
    assert_eq!(present["embedding"], serde_json::json!([1.0, 2.0]));
}

#[tokio::test]
async fn edge_src_dst_reference_node_ids() {
    let dir = tempfile::tempdir().unwrap();
    let db = init_and_load(&dir).await;

    let batches = read_table(&db, "edge:Knows").await;
    let batch = &batches[0];
    let srcs = batch
        .column_by_name("src")
        .unwrap()
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    let dsts = batch
        .column_by_name("dst")
        .unwrap()
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();

    // Collect all (src, dst) pairs
    let mut edges: Vec<(&str, &str)> = (0..batch.num_rows())
        .map(|i| (srcs.value(i), dsts.value(i)))
        .collect();
    edges.sort();

    assert_eq!(
        edges,
        vec![("Alice", "Bob"), ("Alice", "Charlie"), ("Bob", "Diana")]
    );
}

#[tokio::test]
async fn edge_ids_are_unique_strings() {
    let dir = tempfile::tempdir().unwrap();
    let db = init_and_load(&dir).await;

    let batches = read_table(&db, "edge:Knows").await;
    let batch = &batches[0];
    let ids = batch
        .column_by_name("id")
        .unwrap()
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();

    let id_values: Vec<&str> = (0..ids.len()).map(|i| ids.value(i)).collect();
    // All unique
    let mut deduped = id_values.clone();
    deduped.sort();
    deduped.dedup();
    assert_eq!(id_values.len(), deduped.len());
    // All non-empty
    assert!(id_values.iter().all(|id| !id.is_empty()));
}

// ─── Load modes ─────────────────────────────────────────────────────────────

#[tokio::test]
async fn overwrite_replaces_data() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();

    // Load full data
    load_jsonl(&mut db, TEST_DATA, LoadMode::Overwrite)
        .await
        .unwrap();

    // Overwrite with just one person
    let small = r#"{"type": "Person", "data": {"name": "Zara", "age": 40}}"#;
    load_jsonl(&mut db, small, LoadMode::Overwrite)
        .await
        .unwrap();

    let batches = read_table(&db, "node:Person").await;
    let batch = &batches[0];
    assert_eq!(batch.num_rows(), 1);
    let ids = batch
        .column_by_name("id")
        .unwrap()
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(ids.value(0), "Zara");
}

#[tokio::test]
async fn append_adds_rows() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();

    let batch1 = r#"{"type": "Person", "data": {"name": "Alice", "age": 30}}"#;
    let batch2 = r#"{"type": "Person", "data": {"name": "Bob", "age": 25}}"#;

    load_jsonl(&mut db, batch1, LoadMode::Overwrite)
        .await
        .unwrap();
    load_jsonl(&mut db, batch2, LoadMode::Append).await.unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Person").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 2);
}

// ─── Load from fixture file ─────────────────────────────────────────────────

#[tokio::test]
async fn load_from_file_works() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();

    let fixture_path = concat!(env!("CARGO_MANIFEST_DIR"), "/tests/fixtures/test.jsonl");
    load_jsonl_file(&mut db, fixture_path, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Person").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 4);
}

// ─── Signals fixture (complex @key schema) ──────────────────────────────────

#[tokio::test]
async fn signals_fixture_loads_correctly() {
    let schema = include_str!("fixtures/signals.pg");
    let data = include_str!("fixtures/signals.jsonl");

    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();

    // Verify some types have data
    let company_ds = snap.open("node:Company").await.unwrap();
    assert!(company_ds.count_rows(None).await.unwrap() > 0);

    // Verify node IDs are @key values (slug)
    let batches: Vec<arrow_array::RecordBatch> = company_ds
        .scan()
        .try_into_stream()
        .await
        .unwrap()
        .try_collect()
        .await
        .unwrap();
    let ids = collect_column_strings(&batches, "id");
    // Should contain slug values like "aws", "openai", etc.
    assert!(ids.contains(&"aws".to_string()));
    assert!(ids.contains(&"openai".to_string()));
}

// ─── Query execution ────────────────────────────────────────────────────────

#[tokio::test]
async fn query_get_person_by_name() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();

    assert_eq!(result.num_rows(), 1);
    let batch = &result.batches()[0];
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.value(0), "Alice");

    let ages = batch
        .column(1)
        .as_any()
        .downcast_ref::<Int32Array>()
        .unwrap();
    assert_eq!(ages.value(0), 30);
}

#[tokio::test]
async fn query_get_person_not_found() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Nobody")]),
    )
    .await
    .unwrap();

    assert_eq!(result.num_rows(), 0);
}

#[tokio::test]
async fn query_adults_filtered_and_ordered() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(&mut db, TEST_QUERIES, "adults", &ParamMap::new())
        .await
        .unwrap();

    // Only Charlie (35) matches age > 30, ordered desc
    assert_eq!(result.num_rows(), 1);
    let batch = &result.batches()[0];
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.value(0), "Charlie");
}

#[tokio::test]
async fn query_top_by_age_with_limit() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(&mut db, TEST_QUERIES, "top_by_age", &ParamMap::new())
        .await
        .unwrap();

    // Top 2 by age desc: Charlie (35), Alice (30)
    assert_eq!(result.num_rows(), 2);
    let batch = &result.batches()[0];
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.value(0), "Charlie");
    assert_eq!(names.value(1), "Alice");

    let ages = batch
        .column(1)
        .as_any()
        .downcast_ref::<Int32Array>()
        .unwrap();
    assert_eq!(ages.value(0), 35);
    assert_eq!(ages.value(1), 30);
}

// ─── Graph traversal ─────────────────────────────────────────────────────

#[tokio::test]
async fn query_friends_of() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(
        &mut db,
        TEST_QUERIES,
        "friends_of",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();

    // Alice knows Bob and Charlie
    let batch = result.concat_batches().unwrap();
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    let mut friend_names: Vec<&str> = (0..names.len()).map(|i| names.value(i)).collect();
    friend_names.sort();
    assert_eq!(friend_names, vec!["Bob", "Charlie"]);
}

#[tokio::test]
async fn query_employees_of() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(
        &mut db,
        TEST_QUERIES,
        "employees_of",
        &params(&[("$company", "Acme")]),
    )
    .await
    .unwrap();

    // Alice works at Acme (reverse traversal)
    let batch = result.concat_batches().unwrap();
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.len(), 1);
    assert_eq!(names.value(0), "Alice");
}

#[tokio::test]
async fn query_friends_of_friends() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(
        &mut db,
        TEST_QUERIES,
        "friends_of_friends",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();

    // Alice→Bob→Diana (Alice→Charlie→nobody)
    let batch = result.concat_batches().unwrap();
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    let mut fof_names: Vec<&str> = (0..names.len()).map(|i| names.value(i)).collect();
    fof_names.sort();
    assert_eq!(fof_names, vec!["Diana"]);
}

#[tokio::test]
async fn query_unemployed() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = query_main(&mut db, TEST_QUERIES, "unemployed", &ParamMap::new())
        .await
        .unwrap();

    // Charlie and Diana have no WorksAt edges
    let batch = result.concat_batches().unwrap();
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    let mut unemployed: Vec<&str> = (0..names.len()).map(|i| names.value(i)).collect();
    unemployed.sort();
    assert_eq!(unemployed, vec!["Charlie", "Diana"]);
}

#[tokio::test]
async fn query_anti_join_all_have_edges() {
    let schema = r#"
node Person { name: String @key }
node Company { name: String @key }
edge WorksAt: Person -> Company
"#;
    let data = r#"{"type": "Person", "data": {"name": "Alice"}}
{"type": "Person", "data": {"name": "Bob"}}
{"type": "Company", "data": {"name": "Acme"}}
{"edge": "WorksAt", "from": "Alice", "to": "Acme"}
{"edge": "WorksAt", "from": "Bob", "to": "Acme"}
"#;
    let queries = r#"
query unemployed() {
    match {
        $p: Person
        not { $p worksAt $_ }
    }
    return { $p.name }
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let result = query_main(&mut db, queries, "unemployed", &ParamMap::new())
        .await
        .unwrap();

    // Everyone has a WorksAt edge → empty result
    assert_eq!(result.num_rows(), 0);
}

// ─── Mutations ───────────────────────────────────────────────────────────────

#[tokio::test]
async fn mutation_insert_node() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "insert_person",
        &mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);
    assert_eq!(result.affected_edges, 0);

    // Query it back
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Eve")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);
    let batch = &qr.batches()[0];
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.value(0), "Eve");
}

#[tokio::test]
async fn mutation_insert_edge() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // Insert Eve
    mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "insert_person",
        &mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
    )
    .await
    .unwrap();

    // Add edge Eve → Alice
    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "add_friend",
        &params(&[("$from", "Eve"), ("$to", "Alice")]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 0);
    assert_eq!(result.affected_edges, 1);

    // Verify traversal
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "friends_of",
        &params(&[("$name", "Eve")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);
    let batch = qr.concat_batches().unwrap();
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.value(0), "Alice");
}

#[tokio::test]
async fn mutation_multi_insert_node_and_edge() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // In one atomic mutation: insert Eve + edge Eve→Alice
    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "insert_person_and_friend",
        &mixed_params(&[("$name", "Eve"), ("$friend", "Alice")], &[("$age", 22)]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);
    assert_eq!(result.affected_edges, 1);

    // Verify traversal: Eve → Alice
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "friends_of",
        &params(&[("$name", "Eve")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);
    let batch = qr.concat_batches().unwrap();
    let names = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    assert_eq!(names.value(0), "Alice");
}

#[tokio::test]
async fn mutation_update_node() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "set_age",
        &mixed_params(&[("$name", "Alice")], &[("$age", 31)]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);
    assert_eq!(result.affected_edges, 0);

    // Verify the update
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);
    let batch = &qr.batches()[0];
    let ages = batch
        .column(1)
        .as_any()
        .downcast_ref::<Int32Array>()
        .unwrap();
    assert_eq!(ages.value(0), 31);
}

#[tokio::test]
async fn mutation_delete_node_cascades_edges() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // Alice has: 2 outgoing Knows (Alice→Bob, Alice→Charlie) + 1 WorksAt (Alice→Acme) = 3 edges
    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "remove_person",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);
    assert!(
        result.affected_edges >= 3,
        "expected at least 3 cascaded edges, got {}",
        result.affected_edges
    );

    // Alice should be gone
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 0);

    // Verify no edges reference Alice
    let snap = snapshot_main(&db).await.unwrap();
    for edge_key in &["edge:Knows", "edge:WorksAt"] {
        let ds = snap.open(edge_key).await.unwrap();
        let batches: Vec<arrow_array::RecordBatch> = ds
            .scan()
            .try_into_stream()
            .await
            .unwrap()
            .try_collect()
            .await
            .unwrap();
        for batch in &batches {
            let srcs = batch
                .column_by_name("src")
                .unwrap()
                .as_any()
                .downcast_ref::<StringArray>()
                .unwrap();
            let dsts = batch
                .column_by_name("dst")
                .unwrap()
                .as_any()
                .downcast_ref::<StringArray>()
                .unwrap();
            for i in 0..batch.num_rows() {
                assert_ne!(
                    srcs.value(i),
                    "Alice",
                    "found edge src=Alice in {}",
                    edge_key
                );
                assert_ne!(
                    dsts.value(i),
                    "Alice",
                    "found edge dst=Alice in {}",
                    edge_key
                );
            }
        }
    }
}

#[tokio::test]
async fn mutation_delete_edge() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // Delete all Knows edges from Alice (Alice→Bob, Alice→Charlie)
    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "remove_friendship",
        &params(&[("$from", "Alice")]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 0);
    assert_eq!(result.affected_edges, 2);

    // Alice should still exist
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);

    // But has no friends
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "friends_of",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 0);
}

#[tokio::test]
async fn mutation_insert_duplicate_key_upserts() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // Alice already exists with age=30. Insert again with age=99.
    let result = mutate_main(
        &mut db,
        MUTATION_QUERIES,
        "insert_person",
        &mixed_params(&[("$name", "Alice")], &[("$age", 99)]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);

    // Should still be exactly 1 Alice (upsert, not duplicate)
    let qr = query_main(
        &mut db,
        TEST_QUERIES,
        "get_person",
        &params(&[("$name", "Alice")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);

    // Age should be updated to 99
    let batch = &qr.batches()[0];
    let ages = batch
        .column(1)
        .as_any()
        .downcast_ref::<Int32Array>()
        .unwrap();
    assert_eq!(ages.value(0), 99);
}

#[tokio::test]
async fn mutation_update_key_property_rejected() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    let queries = r#"
query rename_person($old_name: String, $new_name: String) {
    update Person set { name: $new_name } where name = $old_name
}
"#;

    let result = mutate_main(
        &mut db,
        queries,
        "rename_person",
        &params(&[("$old_name", "Alice"), ("$new_name", "Bob")]),
    )
    .await;

    assert!(result.is_err());
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@key"), "error should mention @key: {}", err);
}

// ─── Blob support ────────────────────────────────────────────────────────────

const BLOB_SCHEMA: &str = r#"
node Document {
    title: String @key
    content: Blob?
}
"#;

const BLOB_QUERIES: &str = r#"
query all_docs() {
    match { $d: Document }
    return { $d.title, $d.content }
}

query get_doc($title: String) {
    match { $d: Document { title: $title } }
    return { $d.title, $d.content }
}
"#;

const BLOB_MUTATIONS: &str = r#"
query insert_doc($title: String, $content: Blob) {
    insert Document { title: $title, content: $content }
}

query update_doc_content($title: String, $content: Blob) {
    update Document set { content: $content } where title = $title
}
"#;

#[tokio::test]
async fn blob_schema_parses_and_init_succeeds() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    assert!(
        db.catalog().node_types["Document"]
            .blob_properties
            .contains("content")
    );
    assert_eq!(db.catalog().node_types["Document"].properties.len(), 2);
}

#[tokio::test]
async fn blob_load_base64_inline() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    // "Hello World" = "SGVsbG8gV29ybGQ="
    let data = r#"{"type": "Document", "data": {"title": "readme", "content": "base64:SGVsbG8gV29ybGQ="}}
{"type": "Document", "data": {"title": "empty"}}
"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Document").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 2);
}

#[tokio::test]
async fn blob_query_returns_metadata() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    let data = r#"{"type": "Document", "data": {"title": "readme", "content": "base64:SGVsbG8gV29ybGQ="}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let result = query_main(
        &mut db,
        BLOB_QUERIES,
        "get_doc",
        &params(&[("$title", "readme")]),
    )
    .await
    .unwrap();

    assert_eq!(result.num_rows(), 1);

    let json = result.to_sdk_json();
    let row = json.as_array().unwrap().first().unwrap();
    assert_eq!(row["d.title"], "readme");
    // Blob columns return null in query projections; data is accessed via take_blobs API.
    // (Lance bug: BlobsDescriptions + filter triggers assertion, so blobs are excluded from scan)
    assert!(
        row["d.content"].is_null(),
        "blob column should return null in query projection"
    );
}

#[tokio::test]
async fn blob_null_returns_null_in_query() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    let data = r#"{"type": "Document", "data": {"title": "empty"}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let result = query_main(
        &mut db,
        BLOB_QUERIES,
        "get_doc",
        &params(&[("$title", "empty")]),
    )
    .await
    .unwrap();

    assert_eq!(result.num_rows(), 1);
    let json = result.to_sdk_json();
    let row = json.as_array().unwrap().first().unwrap();
    assert_eq!(row["d.title"], "empty");
    // Nullable blob with no value should return null
    assert!(
        row["d.content"].is_null(),
        "null blob should return null, got: {}",
        row["d.content"]
    );
}

#[tokio::test]
async fn blob_insert_mutation() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    let result = mutate_main(
        &mut db,
        BLOB_MUTATIONS,
        "insert_doc",
        &params(&[("$title", "new-doc"), ("$content", "base64:AQID")]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);

    // Query it back
    let qr = query_main(
        &mut db,
        BLOB_QUERIES,
        "get_doc",
        &params(&[("$title", "new-doc")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);
    let json = qr.to_sdk_json();
    let row = json.as_array().unwrap().first().unwrap();
    assert_eq!(row["d.title"], "new-doc");
    // Blob column present but null in query projection (data accessed via take_blobs)
    assert!(
        row.get("d.content").is_some(),
        "content column should be present"
    );
}

#[tokio::test]
async fn blob_update_mutation() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    // First insert a doc with blob
    mutate_main(
        &mut db,
        BLOB_MUTATIONS,
        "insert_doc",
        &params(&[("$title", "updatable"), ("$content", "base64:AQID")]),
    )
    .await
    .unwrap();

    // Update the blob
    let result = mutate_main(
        &mut db,
        BLOB_MUTATIONS,
        "update_doc_content",
        &params(&[("$title", "updatable"), ("$content", "base64:BAUG")]),
    )
    .await
    .unwrap();

    assert_eq!(result.affected_nodes, 1);

    let blob = db
        .read_blob("Document", "updatable", "content")
        .await
        .unwrap();
    let bytes = blob.read().await.unwrap();
    assert_eq!(&bytes[..], &[4, 5, 6]);
}

// ─── Blob read API ───────────────────────────────────────────────────────

#[tokio::test]
async fn blob_read_returns_bytes() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    // "Hello World" = base64 "SGVsbG8gV29ybGQ="
    let data = r#"{"type": "Document", "data": {"title": "readme", "content": "base64:SGVsbG8gV29ybGQ="}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let blob = db.read_blob("Document", "readme", "content").await.unwrap();
    assert_eq!(blob.size(), 11); // "Hello World" = 11 bytes

    let bytes = blob.read().await.unwrap();
    assert_eq!(&bytes[..], b"Hello World");
}

#[tokio::test]
async fn blob_read_not_found_errors() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    let data = r#"{"type": "Document", "data": {"title": "readme", "content": "base64:SGVsbG8="}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    // Non-existent ID
    let err = db.read_blob("Document", "nonexistent", "content").await;
    assert!(err.is_err());

    // Non-blob property
    let err = db.read_blob("Document", "readme", "title").await;
    assert!(err.is_err());
}

#[tokio::test]
async fn blob_read_after_mutation_insert() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    // Insert via mutation (base64 for bytes [1, 2, 3])
    mutate_main(
        &mut db,
        BLOB_MUTATIONS,
        "insert_doc",
        &params(&[("$title", "inserted"), ("$content", "base64:AQID")]),
    )
    .await
    .unwrap();

    let blob = db
        .read_blob("Document", "inserted", "content")
        .await
        .unwrap();
    let bytes = blob.read().await.unwrap();
    assert_eq!(&bytes[..], &[1, 2, 3]);
}

// ─── Blob low-level: probe BlobHandling::BlobsDescriptions ───────────────

#[tokio::test]
async fn blob_scan_with_descriptions_on_nonempty_dataset() {
    use lance::datatypes::BlobHandling;

    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    let data = r#"{"type": "Document", "data": {"title": "readme", "content": "base64:SGVsbG8gV29ybGQ="}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    // Open the dataset directly and try BlobsDescriptions
    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Document").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 1);

    // BlobsDescriptions works without filter
    let mut scanner = ds.scan();
    scanner.blob_handling(BlobHandling::BlobsDescriptions);
    let stream = scanner.try_into_stream().await.unwrap();
    let batches: Vec<RecordBatch> = stream.try_collect().await.unwrap();
    assert_eq!(batches.len(), 1);
    assert_eq!(batches[0].num_rows(), 1);

    // Blob descriptor is a struct with kind, position, size, blob_id, blob_uri
    let content_col = batches[0].column_by_name("content").unwrap();
    assert!(
        matches!(content_col.data_type(), arrow_schema::DataType::Struct(_)),
        "blob column should be Struct, got {:?}",
        content_col.data_type()
    );
}

// ─── Constraint enforcement ──────────────────────────────────────────────────

#[tokio::test]
async fn range_constraint_rejects_out_of_bounds() {
    let schema = r#"
node Person {
    name: String @key
    age: I32?
    @range(age, 0..200)
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    // age = 300 exceeds max of 200
    let data = r#"{"type": "Person", "data": {"name": "Old", "age": 300}}"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "expected range violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@range violation"), "error: {}", err);
}

#[tokio::test]
async fn range_constraint_allows_within_bounds() {
    let schema = r#"
node Person {
    name: String @key
    age: I32?
    @range(age, 0..200)
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let data = r#"{"type": "Person", "data": {"name": "Alice", "age": 30}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Person").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 1);
}

#[tokio::test]
async fn range_constraint_float_rejects_out_of_bounds() {
    let schema = r#"
node Measurement {
    name: String @key
    temperature: F64?
    @range(temperature, 0.0..100.0)
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let data = r#"{"type": "Measurement", "data": {"name": "hot", "temperature": 150.5}}"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "expected range violation for float");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@range violation"), "error: {}", err);
}

#[tokio::test]
async fn range_constraint_float_allows_within_bounds() {
    let schema = r#"
node Measurement {
    name: String @key
    temperature: F64?
    @range(temperature, 0.0..100.0)
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let data = r#"{"type": "Measurement", "data": {"name": "warm", "temperature": 37.5}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Measurement").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 1);
}

#[tokio::test]
async fn range_constraint_negative_float_bounds() {
    let schema = r#"
node Measurement {
    name: String @key
    temperature: F64?
    @range(temperature, -40.0..60.0)
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    // Within bounds: should succeed
    let data = r#"{"type": "Measurement", "data": {"name": "cold", "temperature": -20.0}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    // Below minimum: should fail
    let data = r#"{"type": "Measurement", "data": {"name": "arctic", "temperature": -50.0}}"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "expected range violation for -50.0");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@range violation"), "error: {}", err);
}

#[tokio::test]
async fn check_constraint_rejects_bad_pattern() {
    let schema = r#"
node Order {
    code: String @key
    @check(code, "^[A-Z]{3}-[0-9]+$")
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let data = r#"{"type": "Order", "data": {"code": "invalid"}}"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "expected check violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@check violation"), "error: {}", err);
}

#[tokio::test]
async fn check_constraint_allows_matching_pattern() {
    let schema = r#"
node Order {
    code: String @key
    @check(code, "^[A-Z]{3}-[0-9]+$")
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let data = r#"{"type": "Order", "data": {"code": "ABC-123"}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Order").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 1);
}

#[tokio::test]
async fn mutation_insert_rejects_range_violation() {
    let schema = r#"
node Person {
    name: String @key
    age: I32?
    @range(age, 0..200)
}
"#;
    let queries = r#"
query insert_person($name: String, $age: I32) {
    insert Person { name: $name, age: $age }
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let result = mutate_main(&mut db, queries, "insert_person", &{
        let mut p = omnigraph_compiler::ir::ParamMap::new();
        p.insert(
            "name".to_string(),
            omnigraph_compiler::query::ast::Literal::String("Old".to_string()),
        );
        p.insert(
            "age".to_string(),
            omnigraph_compiler::query::ast::Literal::Integer(300),
        );
        p
    })
    .await;
    assert!(result.is_err(), "expected range violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@range violation"), "error: {}", err);
}

#[tokio::test]
async fn mutation_update_rejects_range_violation() {
    let schema = r#"
node Person {
    name: String @key
    age: I32?
    @range(age, 0..200)
}
"#;
    let queries = r#"
query set_age($name: String, $age: I32) {
    update Person set { age: $age } where name = $name
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();
    load_jsonl(
        &mut db,
        r#"{"type": "Person", "data": {"name": "Alice", "age": 30}}"#,
        LoadMode::Overwrite,
    )
    .await
    .unwrap();

    let result = mutate_main(&mut db, queries, "set_age", &{
        let mut p = omnigraph_compiler::ir::ParamMap::new();
        p.insert(
            "name".to_string(),
            omnigraph_compiler::query::ast::Literal::String("Alice".to_string()),
        );
        p.insert(
            "age".to_string(),
            omnigraph_compiler::query::ast::Literal::Integer(300),
        );
        p
    })
    .await;
    assert!(result.is_err(), "expected range violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@range violation"), "error: {}", err);
}

#[tokio::test]
async fn mutation_insert_rejects_check_violation() {
    let schema = r#"
node Order {
    code: String @key
    @check(code, "^[A-Z]{3}-[0-9]+$")
}
"#;
    let queries = r#"
query insert_order($code: String) {
    insert Order { code: $code }
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let result = mutate_main(&mut db, queries, "insert_order", &{
        let mut p = omnigraph_compiler::ir::ParamMap::new();
        p.insert(
            "code".to_string(),
            omnigraph_compiler::query::ast::Literal::String("invalid".to_string()),
        );
        p
    })
    .await;
    assert!(result.is_err(), "expected check violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@check violation"), "error: {}", err);
}

#[tokio::test]
async fn mutation_update_rejects_check_violation() {
    let schema = r#"
node Order {
    code: String @key
    label: String?
    @check(label, "^[A-Z]+$")
}
"#;
    let queries = r#"
query set_label($code: String, $label: String) {
    update Order set { label: $label } where code = $code
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();
    load_jsonl(
        &mut db,
        r#"{"type": "Order", "data": {"code": "ABC-123", "label": "VALID"}}"#,
        LoadMode::Overwrite,
    )
    .await
    .unwrap();

    let result = mutate_main(&mut db, queries, "set_label", &{
        let mut p = omnigraph_compiler::ir::ParamMap::new();
        p.insert(
            "code".to_string(),
            omnigraph_compiler::query::ast::Literal::String("ABC-123".to_string()),
        );
        p.insert(
            "label".to_string(),
            omnigraph_compiler::query::ast::Literal::String("invalid".to_string()),
        );
        p
    })
    .await;
    assert!(result.is_err(), "expected check violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@check violation"), "error: {}", err);
}

#[tokio::test]
async fn edge_cardinality_max_enforced() {
    let schema = r#"
node Person { name: String @key }
node Company { name: String @key }
edge WorksAt: Person -> Company @card(0..1)
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    // Alice works at two companies, which violates @card(0..1)
    let data = r#"{"type": "Person", "data": {"name": "Alice"}}
{"type": "Company", "data": {"name": "Acme"}}
{"type": "Company", "data": {"name": "Globex"}}
{"edge": "WorksAt", "from": "Alice", "to": "Acme"}
{"edge": "WorksAt", "from": "Alice", "to": "Globex"}
"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "expected cardinality violation");
    let err = result.unwrap_err().to_string();
    assert!(err.contains("@card violation"), "error: {}", err);
}

#[tokio::test]
async fn edge_cardinality_allows_within_bounds() {
    let schema = r#"
node Person { name: String @key }
node Company { name: String @key }
edge WorksAt: Person -> Company @card(0..1)
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    let data = r#"{"type": "Person", "data": {"name": "Alice"}}
{"type": "Company", "data": {"name": "Acme"}}
{"edge": "WorksAt", "from": "Alice", "to": "Acme"}
"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("edge:WorksAt").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 1);
}

// ─── Regression: apply_assignments with blob mid-schema ──────────────────────

#[tokio::test]
async fn update_with_blob_mid_schema_does_not_panic() {
    // Blob column in the MIDDLE of schema, not last. This previously caused
    // a column-index mismatch in apply_assignments (batch.column(idx) used
    // schema position but the batch had blob columns excluded from projection).
    let schema = r#"
node Article {
    slug: String @key
    attachment: Blob?
    summary: String?
    rating: I32?
}
"#;
    let mutations = r#"
query insert_article($slug: String, $summary: String, $rating: I32) {
    insert Article { slug: $slug, summary: $summary, rating: $rating }
}
query update_summary($slug: String, $summary: String) {
    update Article set { summary: $summary } where slug = $slug
}
query get_article($slug: String) {
    match { $a: Article { slug: $slug } }
    return { $a.slug, $a.summary, $a.rating }
}
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    mutate_main(
        &mut db,
        mutations,
        "insert_article",
        &mixed_params(
            &[("$slug", "a1"), ("$summary", "hello")],
            &[("$rating", 42)],
        ),
    )
    .await
    .unwrap();

    // This would panic with the old batch.column(idx) code
    let result = mutate_main(
        &mut db,
        mutations,
        "update_summary",
        &params(&[("$slug", "a1"), ("$summary", "updated")]),
    )
    .await
    .unwrap();
    assert_eq!(result.affected_nodes, 1);

    // Verify the update applied correctly
    let qr = query_main(
        &mut db,
        mutations,
        "get_article",
        &params(&[("$slug", "a1")]),
    )
    .await
    .unwrap();
    assert_eq!(qr.num_rows(), 1);
}

// ─── Regression: blob update null → non-null ─────────────────────────────────

#[tokio::test]
async fn blob_update_null_to_non_null() {
    // Regression: updating a blob column that was previously all-null panicked
    // with assertion `left: 0, right: 1` in lance-table stream.rs because the
    // two-phase blob update sent a blob-only batch to merge_insert on a dataset
    // with zero blob fragments.
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();

    // Load a row with blob = null (no blob data in dataset)
    let data = r#"{"type": "Document", "data": {"title": "kid-a"}}"#;
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    // Update: null → non-null blob. Previously panicked with assertion
    // `left: 0, right: 1` in lance-table stream.rs.
    let result = mutate_main(
        &mut db,
        BLOB_MUTATIONS,
        "update_doc_content",
        &params(&[("$title", "kid-a"), ("$content", "base64:AQID")]),
    )
    .await
    .unwrap();
    assert_eq!(result.affected_nodes, 1);

    let blob = db.read_blob("Document", "kid-a", "content").await.unwrap();
    let bytes = blob.read().await.unwrap();
    assert_eq!(&bytes[..], &[1, 2, 3]);
}

// ─── Regression: blob load with external file URI ────────────────────────────

#[tokio::test]
async fn blob_load_external_file_uri() {
    // Regression: loading blobs with external file:// URIs was rejected with
    // "External blob URI '...' is outside registered external bases" because
    // allow_external_blob_outside_bases was not set on data table write paths.
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();

    // Create a temp file to reference
    let blob_dir = tempfile::tempdir().unwrap();
    let blob_path = blob_dir.path().join("test.txt");
    std::fs::write(&blob_path, b"Hello from file").unwrap();
    let file_uri = format!("file://{}", blob_path.display());

    let mut db = Omnigraph::init(uri, BLOB_SCHEMA).await.unwrap();
    let data = format!(
        r#"{{"type": "Document", "data": {{"title": "from-file", "content": "{}"}}}}"#,
        file_uri
    );

    // Load with external URI
    load_jsonl(&mut db, &data, LoadMode::Overwrite)
        .await
        .unwrap();

    // Verify the blob is accessible
    let blob = db
        .read_blob("Document", "from-file", "content")
        .await
        .unwrap();
    assert!(blob.uri().is_some(), "external blob should have a URI");
}

// ─── Regression: execute_update on edge type ─────────────────────────────────

#[tokio::test]
async fn update_edge_type_returns_error_not_panic() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // The typechecker should reject this, but even if bypassed,
    // execute_update must not panic with HashMap key-not-found.
    let mutations = r#"
query update_edge($from: String) {
    update Knows set { since: "2025-01-01" } where from = $from
}
"#;
    let result = mutate_main(
        &mut db,
        mutations,
        "update_edge",
        &params(&[("$from", "Alice")]),
    )
    .await;
    assert!(result.is_err(), "should return error, not panic");
}

// ─── Regression: Date/DateTime SQL literal escaping ──────────────────────────

#[tokio::test]
async fn date_literal_with_quote_is_escaped() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;

    // A date-like value with a single-quote must not cause SQL injection.
    // This tests that literal_to_sql escapes Date/DateTime values.
    let queries = r#"
query filter_date($d: String) {
    match { $p: Person { name: $d } }
    return { $p.name }
}
"#;
    // Pass a value with a single-quote; it should not error or return all rows
    let result = query_main(
        &mut db,
        queries,
        "filter_date",
        &params(&[("$d", "2025-01-01' OR '1'='1")]),
    )
    .await
    .unwrap();
    assert_eq!(result.num_rows(), 0);
}

// ─── Regression: manifest row_count tracks total, not batch size ─────────────

#[tokio::test]
async fn append_mode_manifest_row_count_is_total() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await; // Overwrite: 4 persons

    let extra = r#"{"type": "Person", "data": {"name": "Eve", "age": 22}}"#;
    load_jsonl(&mut db, extra, LoadMode::Append).await.unwrap();

    let snap = snapshot_main(&db).await.unwrap();
    let entry = snap.entry("node:Person").unwrap();
    // Must be total rows (4 + 1 = 5), not just the appended batch size (1)
    assert_eq!(entry.row_count, 5);

    // Verify actual dataset count matches manifest
    let ds = snap.open("node:Person").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap() as u64, entry.row_count);
}

// ─── Regression: cardinality violation must not commit manifest ───────────────

#[tokio::test]
async fn cardinality_violation_does_not_commit_manifest() {
    let schema = r#"
node Person { name: String @key }
node Company { name: String @key }
edge WorksAt: Person -> Company @card(0..1)
"#;
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, schema).await.unwrap();

    // Alice works at two companies, which violates @card(0..1) (at most 1)
    let data = r#"
{"type": "Person", "data": {"name": "Alice"}}
{"type": "Company", "data": {"name": "Acme"}}
{"type": "Company", "data": {"name": "Beta"}}
{"edge": "WorksAt", "from": "Alice", "to": "Acme"}
{"edge": "WorksAt", "from": "Alice", "to": "Beta"}
"#;

    let v_before = version_main(&db).await.unwrap();
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "cardinality violation should be rejected");
    assert!(
        result.unwrap_err().to_string().contains("@card violation"),
        "error should mention @card"
    );

    // Manifest must NOT have advanced: invalid data was not committed
    assert_eq!(version_main(&db).await.unwrap(), v_before);
}

// ─── Regression: dangling edge references are rejected ───────────────────────

#[tokio::test]
async fn dangling_edge_dst_rejected_on_load() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();

    let data = r#"
{"type": "Person", "data": {"name": "Alice", "age": 30}}
{"type": "Company", "data": {"name": "Acme"}}
{"edge": "Knows", "from": "Alice", "to": "NonExistent"}
"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "dangling edge dst should be rejected");
    let err = result.unwrap_err().to_string();
    assert!(
        err.contains("not found"),
        "error should mention 'not found': {}",
        err
    );
}

#[tokio::test]
async fn dangling_edge_src_rejected_on_load() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();

    let data = r#"
{"type": "Person", "data": {"name": "Alice", "age": 30}}
{"type": "Company", "data": {"name": "Acme"}}
{"edge": "WorksAt", "from": "Ghost", "to": "Acme"}
"#;
    let result = load_jsonl(&mut db, data, LoadMode::Overwrite).await;
    assert!(result.is_err(), "dangling edge src should be rejected");
    let err = result.unwrap_err().to_string();
    assert!(
        err.contains("not found"),
        "error should mention 'not found': {}",
        err
    );
}

// ─── Regression: ensure_indices is idempotent ────────────────────────────────

#[tokio::test]
async fn ensure_indices_does_not_error_on_repeated_call() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;
    let version_after_load = version_main(&db).await.unwrap();

    // load commits now enforce required indices; repeated ensure_indices calls
    // should be a no-op at the manifest level.
    db.ensure_indices().await.unwrap();
    let version_after_first = version_main(&db).await.unwrap();
    db.ensure_indices().await.unwrap();
    let version_after_second = version_main(&db).await.unwrap();

    assert_eq!(version_after_first, version_after_load);
    assert_eq!(version_after_second, version_after_load);

    // Data should still be queryable after index operations
    let snap = snapshot_main(&db).await.unwrap();
    let ds = snap.open("node:Person").await.unwrap();
    assert_eq!(ds.count_rows(None).await.unwrap(), 4);
}

// ─── DataFusion-Expr filter pushdown (Tier-1 follow-up to the Lance v6 bump) ──

/// Regression for `CompOp::Contains` pushdown via `array_has` in
/// `ir_filter_to_expr`. Before the Expr-pushdown refactor, the
/// `ir_filter_to_sql` family returned `None` for list-contains (the
/// comment said *"Can't pushdown list contains"*) and the predicate was
/// applied post-scan in memory. With `Scanner::filter_expr(Expr)` and
/// DF's `array_has` builtin, the contains predicate now pushes down to
/// Lance. This test exercises that path but pins only correctness: if
/// the pushdown regressed, all rows would land in the scan and be
/// filtered post-hoc, which still produces the same result.
/// `lance_surface_guards.rs` is the structural pin for the surface itself.
#[tokio::test]
async fn ir_filter_with_list_contains_pushes_down() {
    let schema = r#"
node Doc {
    slug: String @key
    tags: [String]
}
"#;
    let data = r#"{"type":"Doc","data":{"slug":"alpha","tags":["red","blue"]}}
{"type":"Doc","data":{"slug":"bravo","tags":["green"]}}
{"type":"Doc","data":{"slug":"charlie","tags":["red","green"]}}
{"type":"Doc","data":{"slug":"delta","tags":[]}}"#;

    let dir = tempfile::tempdir().unwrap();
    let mut db = Omnigraph::init(dir.path().to_str().unwrap(), schema)
        .await
        .unwrap();
    load_jsonl(&mut db, data, LoadMode::Overwrite)
        .await
        .unwrap();

    let queries = r#"
query docs_with_tag($tag: String) {
    match {
        $d: Doc
        $d.tags contains $tag
    }
    return { $d.slug }
}
"#;
    let result = query_main(&mut db, queries, "docs_with_tag", &params(&[("$tag", "red")]))
        .await
        .unwrap();

    let batch = result.concat_batches().unwrap();
    let slugs = batch
        .column(0)
        .as_any()
        .downcast_ref::<StringArray>()
        .unwrap();
    let mut got: Vec<&str> = (0..slugs.len()).map(|i| slugs.value(i)).collect();
    got.sort();
    assert_eq!(
        got,
        vec!["alpha", "charlie"],
        "contains-pushdown should return exactly the rows whose tags list contains 'red'"
    );
}