Merge pull request #227 from trustgraph-ai/maint/update-generation

Update generation
2026-05-30 17:55:13 +02:00 · 2024-12-29 10:57:08 +00:00 · 2024-12-29 10:57:08 +00:00 · 5d3a33f85c
commit 5d3a33f85c
parent f80424c795 4ed5e34ab2
5 changed files with 149 additions and 9 deletions
--- a/templates/README.md
+++ b/templates/README.md
@ -0,0 +1,125 @@
 # TrustGraph template generation
 There are two utilities here:
 - `generate`: Generates a single Docker Compose launch configuration
  based on configuration you provide.
 - `generate-all`: Generates the release bundle.  You won't need to use
  this unless you are managing releases.
 ## `generate-all`
 Previously, this generated a full set of all vector DB / triple store / LLM
 combinations and put them in a single ZIP file.  That got out of hand, so
 at the time of writing it generates a single configuration using the
 Qdrant vector DB, Ollama for LLM support and Cassandra as the triple store.
 The combinations are defined within the code.  The utility takes two arguments:
 - output ZIP file (overwritten if it already exists)
 - TrustGraph version number
 ```
 templates/generate-all output.zip 0.18.11
 ```
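To illustrate why the full matrix got out of hand, here is a minimal sketch that counts the combinations.  The lists are copied from the pre-change loops in `generate-all` shown in this pull request; the variable names are illustrative, and the real script may vary other dimensions (such as the vector store) that are not modelled here.

```python
from itertools import product

# Values as they appeared in the generate-all loops before this change.
platforms = ["docker-compose", "minikube-k8s", "gcp-k8s"]
models = [
    "azure", "azure-openai", "bedrock", "claude", "cohere",
    "googleaistudio", "llamafile", "ollama", "openai", "vertexai",
]
graph_stores = ["cassandra", "neo4j", "falkordb"]

# Each (platform, model, graph store) triple produced one configuration.
combinations = list(product(platforms, models, graph_stores))
print(len(combinations))  # 3 * 10 * 3 = 90
```

With the lists trimmed to one entry each, the same product yields a single configuration.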
 ## `generate`
 This utility takes a configuration file describing the components to bundle,
 and outputs a Docker Compose YAML file.
 ### Input configuration
 The input configuration is a JSON file containing an array of components to
 pull into the configuration.  Each component has a name and a (possibly
 empty) object describing additional parameters for that component.
 Example:
 ```
 [
    {
        "name": "cassandra",
        "parameters": {}
    },
    {
        "name": "pulsar",
        "parameters": {}
    },
    {
        "name": "qdrant",
        "parameters": {}
    },
    {
        "name": "embeddings-hf",
        "parameters": {}
    },
    {
        "name": "graph-rag",
        "parameters": {}
    },
    {
        "name": "grafana",
        "parameters": {}
    },
    {
        "name": "trustgraph",
        "parameters": {}
    },
    {
        "name": "googleaistudio",
        "parameters": {
            "googleaistudio-temperature": 0.3,
            "googleaistudio-max-output-tokens": 2048,
            "googleaistudio-model": "gemini-1.5-pro-002"
        }
    },
    {
        "name": "prompt-template",
        "parameters": {}
    },
    {
        "name": "override-recursive-chunker",
        "parameters": {
            "chunk-size": 1000,
            "chunk-overlap": 50
        }
    },
    {
        "name": "workbench-ui",
        "parameters": {}
    },
    {
        "name": "agent-manager-react",
        "parameters": {}
    }
 ]
 ```
 If you want to make your own configuration you could try changing the
 configuration above:
 - Components which are essential: pulsar, trustgraph, graph-rag, grafana,
  agent-manager-react
 - You need a triple store, one of: cassandra, memgraph, falkordb, neo4j
 - You need a vector store, one of: qdrant, pinecone
 - You need an LLM, one of: azure, azure-openai, bedrock, claude, cohere,
  googleaistudio, llamafile, ollama, openai, vertexai
 - You need an embeddings implementation, one of: embeddings-hf,
  embeddings-ollama
 - Optionally add the Workbench tool: workbench-ui
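The rules above can be checked mechanically before running `generate`.  The following is a minimal sketch, not part of the TrustGraph tooling: `check_config` is a hypothetical helper, the groupings are transcribed from the list above (with `googleaistudio` added because the example configuration uses it), and it only reports roles with no component selected.

```python
# Groupings transcribed from the README list; generate itself is not
# known to perform this check.
ESSENTIAL = {"pulsar", "trustgraph", "graph-rag", "grafana",
             "agent-manager-react"}
ONE_OF = {
    "triple store": {"cassandra", "memgraph", "falkordb", "neo4j"},
    "vector store": {"qdrant", "pinecone"},
    "LLM": {"azure", "azure-openai", "bedrock", "claude", "cohere",
            "googleaistudio", "llamafile", "ollama", "openai", "vertexai"},
    "embeddings": {"embeddings-hf", "embeddings-ollama"},
}


def check_config(components):
    """Return a list of problems found in a parsed configuration array."""
    names = {c["name"] for c in components}
    problems = [
        f"missing essential component: {name}"
        for name in sorted(ESSENTIAL - names)
    ]
    for role, options in ONE_OF.items():
        # Report a role only when none of its options is present.
        if not names & options:
            problems.append(f"no {role} selected")
    return problems


example = [
    {"name": name, "parameters": {}}
    for name in ["pulsar", "trustgraph", "graph-rag", "grafana",
                 "cassandra", "qdrant", "ollama", "embeddings-hf"]
]
print(check_config(example))
```

The example deliberately omits `agent-manager-react`, so the check reports that one essential component is missing.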
 Components have overridable parameters.  Look in the component definition
 in `templates/components/` to see what you can override.
 ### Invocation
 Two parameters:
 - The output ZIP file
 - The version number
 The configuration file described above is provided on standard input:
 ```
 templates/generate out.zip 0.18.9 < config.json
 ```
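The same invocation can be driven from Python when the configuration is built programmatically.  This is a sketch under assumptions: `component` and `run_generate` are hypothetical helpers, and the script path assumes the current directory is the repository root.

```python
import json
import subprocess


def component(name, parameters=None):
    """Build one entry of the configuration array."""
    return {"name": name, "parameters": parameters or {}}


def run_generate(config, outfile, version, script="templates/generate"):
    """Run generate with the configuration supplied on standard input."""
    return subprocess.run(
        [script, outfile, version],
        input=json.dumps(config, indent=4),
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )


# Component names are taken from the example configuration above.
config = [
    component("cassandra"),
    component("pulsar"),
    component("qdrant"),
    component("embeddings-hf"),
    component("graph-rag"),
    component("grafana"),
    component("trustgraph"),
    component("ollama"),
    component("prompt-template"),
    component("override-recursive-chunker",
              {"chunk-size": 1000, "chunk-overlap": 50}),
    component("workbench-ui"),
    component("agent-manager-react"),
]
```

Calling `run_generate(config, "out.zip", "0.18.9")` would then correspond to the shell command above, with the JSON written to the script's standard input instead of being redirected from `config.json`.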
--- a/templates/generate
+++ b/templates/generate
@ -81,7 +81,7 @@ def main():
        print("Usage:")
        print("  generate <outfile> <version> < input.json")
        print()
-        raise RuntimeError("Arg error")
+        sys.exit(1)
    outfile = sys.argv[1]
    version = sys.argv[2]
--- a/templates/generate-all
+++ b/templates/generate-all
@ -88,7 +88,7 @@ def full_config_object(
    return config_object([
        graph_store, "pulsar", vector_store, embeddings,
-        "graph-rag", "grafana", "trustgraph", llm
+        "graph-rag", "grafana", "trustgraph", llm, "workbench-ui",
    ])
 def generate_config(
@ -119,13 +119,19 @@ def generate_config(
 def generate_all(output, version):
    for platform in [
-            "docker-compose", "minikube-k8s", "gcp-k8s"
+            "docker-compose",
+            # "minikube-k8s", "gcp-k8s"
    ]:
        for model in [
-                "azure", "azure-openai", "bedrock", "claude", "cohere",
+                # "azure", "azure-openai", "bedrock", "claude", "cohere",
-                "googleaistudio", "llamafile", "ollama", "openai", "vertexai",
+                # "googleaistudio", "llamafile",
+                "ollama",
+                # "openai", "vertexai",
        ]:
-            for graph in [ "cassandra", "neo4j", "falkordb" ]:
+            for graph in [
+                    "cassandra",
+                    # "neo4j", "falkordb"
+            ]:
                y = generate_config(
                    llm=model, graph_store=graph, platform=platform,
--- a/templates/zip-readme.md
+++ b/templates/zip-readme.md
@ -1,3 +1,15 @@
 Note: this is a subset of the possible configurations.  To generate your own
 launch configuration, use the config util:
 - Production: https://config-ui.demo.trustgraph.ai
 - Early release: https://dev.config-ui.demo.trustgraph.ai
 The config util auto-generates deployment instructions for your
 configuration, so that's the recommended way to deploy.
 ----------------------------------------------------------------------------
 These are launch configurations for TrustGraph.  See https://trustgraph.ai for
 the quickstart using docker compose.
--- a/test-api/test-load-document
+++ b/test-api/test-load-document
@ -83,9 +83,6 @@ input = {
    # Additional metadata in the form of RDF triples
    "metadata": metadata,
    # Text character set.  Default is UTF-8
    "charset": "utf-8",
    # The PDF document, is presented as a base64 encoded document.
    "data": base64.b64encode(text).decode("utf-8")