Release/v1.2 (#457)

* Bump setup.py versions for 1.1

* PoC MCP server (#419)

* Very initial MCP server PoC for TrustGraph

* Put service on port 8000

* Add MCP container and packages to buildout

* Update docs for API/CLI changes in 1.0 (#421)

* Update some API basics for the 0.23/1.0 API change

* Add MCP container push (#425)

* Add command args to the MCP server (#426)

* Host and port parameters

* Added websocket arg

* More docs
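
A hedged sketch of what the server invocation might look like with those arguments (the flag names and the default websocket URL are assumptions taken from the bullets above, not verified):

```python
# Hypothetical launch sketch -- argument names mirror the bullets above.
import argparse

parser = argparse.ArgumentParser(description="TrustGraph MCP server (sketch)")
parser.add_argument("--host", default="0.0.0.0", help="listen address")
parser.add_argument("--port", type=int, default=8000, help="listen port (8000 per the PoC)")
parser.add_argument("--websocket", default="ws://localhost:8088/api/v1/socket",
                    help="TrustGraph websocket API the server talks to (assumed URL)")
args = parser.parse_args()
print(f"MCP server would listen on {args.host}:{args.port}, using {args.websocket}")
```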

* MCP client support (#427)

- MCP client service
- Tool request/response schema
- API gateway support for mcp-tool
- Message translation for tool request & response
- Make mcp-tool use the configuration service for information
  about where the MCP services are.
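
A rough sketch of what the tool request/response pair might look like (illustrative only; the field names below are assumptions, not the shipped schema):

```python
# Illustrative sketch only -- field names are assumptions, not the actual schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class McpToolRequest:
    name: str                                        # tool id, resolved via the config service
    parameters: dict = field(default_factory=dict)   # tool arguments (JSON-serialisable)

@dataclass
class McpToolResponse:
    text: Optional[str] = None      # plain-text tool output
    object: Optional[str] = None    # JSON-encoded structured output
    error: Optional[str] = None     # populated when the MCP call fails
```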

* Feature/react call mcp (#428)

Key Features

  - MCP Tool Integration: Added core MCP tool support with ToolClientSpec and ToolClient classes
  - API Enhancement: New mcp_tool method for flow-specific tool invocation
  - CLI Tooling: New tg-invoke-mcp-tool command for testing MCP integration
  - React Agent Enhancement: Fixed and improved multi-tool invocation capabilities
  - Tool Management: Enhanced CLI for tool configuration and management

Changes

  - Added MCP tool invocation to API with flow-specific integration
  - Implemented ToolClientSpec and ToolClient for tool call handling
  - Updated agent-manager-react to invoke MCP tools with configurable types
  - Enhanced CLI with new commands and improved help text
  - Added comprehensive documentation for new CLI commands
  - Improved tool configuration management

Testing

  - Added tg-invoke-mcp-tool CLI command for isolated MCP integration testing
  - Enhanced agent capability to invoke multiple tools simultaneously
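
As a sketch of how the new flow-scoped tool invocation might be driven from Python (the import path, method, and argument names here are assumptions based on the description above, not a verified API):

```python
# Hypothetical usage sketch -- import path, method and parameter names are assumptions.
from trustgraph.api import Api

api = Api("http://localhost:8088/")
flow = api.flow().id("default")

# Invoke an MCP tool within the context of a specific flow.
result = flow.mcp_tool(name="weather", parameters={"city": "London"})
print(result)
```

The tg-invoke-mcp-tool command is intended for exercising the same integration from the shell, without writing any code.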

* Test suite executed from CI pipeline (#433)

* Test strategy & test cases

* Unit tests

* Integration tests

* Extending test coverage (#434)

* Contract tests

* Testing embeddings

* Agent unit tests

* Knowledge pipeline tests

* Turn on contract tests

* Increase storage test coverage (#435)

* Fixing storage and adding tests

* PR pipeline only runs quick tests
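
One conventional way to make that split (a sketch, not necessarily how this pipeline is configured) is pytest markers, with the PR job running `pytest -m "not slow"`:

```python
# Sketch: tag long-running suites so the PR pipeline can exclude them.
import pytest

@pytest.mark.slow
def test_full_knowledge_pipeline():
    ...   # exercised only in the full CI run

def test_token_chunker_quick():
    ...   # cheap unit test, always runs
```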

* Empty configuration is returned as an empty list; previously it was omitted from the response (#436)
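
Roughly, the behaviour change looks like this (illustrative response shapes, not the exact wire format):

```python
# Before: querying a config type with no entries omitted the values field entirely.
# After: the same query always carries an explicit empty list.
response_before = {"version": 3}
response_after = {"version": 3, "values": []}
```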

* Update config util to take files as well as command-line text (#437)
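
A minimal sketch of the "inline text or file" pattern (not the actual utility's code; the @file convention is just one plausible choice):

```python
# Sketch of accepting a config value either inline or from a file.
import argparse

def read_value(spec: str) -> str:
    # "@path" means "load the value from this file"; anything else is literal text.
    if spec.startswith("@"):
        with open(spec[1:], "r") as f:
            return f.read()
    return spec

parser = argparse.ArgumentParser()
parser.add_argument("--config", default="{}",
                    help="inline JSON, or @file.json to read it from a file")
args = parser.parse_args()
print(read_value(args.config))
```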

* Updated CLI invocation and config model for tools and mcp (#438)

* Updated CLI invocation and config model for tools and mcp

* CLI anomalies

* Tweaked the MCP tool implementation for new model

* Update agent implementation to match the new model

* Fix agent tools, now all tested

* Fixed integration tests

* Fix MCP delete tool params

* Update Python deps to 1.2

* Update to enable knowledge extraction using the agent framework (#439)

* Implement KG extraction agent (kg-extract-agent)

* Using ReAct framework (agent-manager-react)
 
* ReAct manager had an issue when emitting JSON, which conflicted with the ReAct manager's own JSON messages, so the ReAct manager was refactored to use traditional ReAct messages with a non-JSON structure.
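
For context, a traditional ReAct turn has a plain-text structure along these lines (illustrative trace, not actual agent output), so JSON produced by a tool or extraction step can sit inside the Observation without colliding with the manager's own framing:

```python
# Illustrative ReAct-style turn emitted by the refactored manager (plain text, no JSON framing).
example_turn = """\
Thought: I need the entities mentioned in this chunk.
Action: extract-definitions
Action Input: <chunk text>
Observation: [{"entity": "Acme Corp", "definition": "A manufacturing company"}]
"""
```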
 
* Minor refactor to take the prompt template client out of prompt-template so it can be more readily used by other modules. kg-extract-agent uses this framework.

* Migrate from setup.py to pyproject.toml (#440)

* Converted setup.py to pyproject.toml

* Modern packaging infrastructure as recommended by the Python packaging docs
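
For reference, the replacement takes the shape of a standard setuptools-backed pyproject.toml; the values below are placeholders, not the actual package metadata:

```toml
# Placeholder metadata -- illustrative of the migration, not copied from the repo.
[build-system]
requires = ["setuptools>=61", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "trustgraph-example"
version = "1.2.0"
description = "Placeholder package description"
requires-python = ">=3.10"
dependencies = []
```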

* Install missing build deps (#441)

* Install missing build deps (#442)

* Implement logging strategy (#444)

* Logging strategy; convert all print() calls to logging invocations
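
The pattern applied throughout (visible in the diffs below) is a module-level logger in place of print(), with verbose payloads at DEBUG and operational detail at INFO:

```python
# Module-level logger pattern used across the converted services.
import logging

logger = logging.getLogger(__name__)

def report_usage(resp, inputtokens, outputtokens):
    logger.debug(f"LLM response: {resp}")         # full payloads only at DEBUG
    logger.info(f"Input Tokens: {inputtokens}")   # operational metrics at INFO
    logger.info(f"Output Tokens: {outputtokens}")
```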

* Fix/startup failure (#445)

* Fix logging startup problems

* Fix logging startup problems (#446)

* Fix logging startup problems (#447)

* Fixed Mistral OCR to use current API (#448)

* Fixed Mistral OCR to use current API

* Added PDF decoder tests

* Fix Mistral OCR ident to be standard pdf-decoder (#450)

* Fix Mistral OCR ident to be standard pdf-decoder

* Correct test

* Schema structure refactor (#451)

* Write schema refactor spec

* Implemented schema refactor spec

* Structure data mvp (#452)

* Structured data tech spec

* Architecture principles

* New schemas

* Updated schemas and specs

* Object extractor

* Add .coveragerc

* New tests

* Cassandra object storage

* Trying to get object extraction working; issues remain
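
A rough sketch of the object-extraction flow being assembled here (schema-driven row extraction; the names are illustrative, not the shipped implementation):

```python
# Illustrative only: a schema describes the objects to pull out of each text chunk.
from dataclasses import dataclass, field

@dataclass
class Field:
    name: str
    type: str
    description: str

@dataclass
class Schema:
    name: str
    description: str
    fields: list = field(default_factory=list)

customer = Schema(
    name="customer",
    description="A customer record",
    fields=[
        Field("name", "string", "Customer name"),
        Field("city", "string", "City of residence"),
    ],
)

# The object extractor renders a prompt from the schema plus a text chunk, asks the
# LLM for a JSON array of matching objects, validates each row against the schema's
# field names, and writes the rows to object storage (Cassandra in this MVP).
```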

* Validate librarian collection (#453)

* Fix token chunker, broken API invocation (#454)

* Fix token chunker, broken API invocation (#455)

* Knowledge load utility CLI (#456)

* Knowledge loader

* More tests
Commit: 89be656990 (parent: c85ba197be)
Author: cybermaggedon, 2025-08-18 20:56:09 +01:00; committed by GitHub
509 changed files with 49632 additions and 5159 deletions


@ -1,3 +0,0 @@
from . service import *


@ -1,7 +0,0 @@
#!/usr/bin/env python3
from . service import run
if __name__ == '__main__':
run()


@ -1,176 +0,0 @@
def to_relationships(text):
prompt = f"""You are a helpful assistant that performs information extraction tasks for a provided text.
Read the provided text. You will model the text as an information network for a RDF knowledge graph in JSON.
Information Network Rules:
- An information network has subjects connected by predicates to objects.
- A subject is a named-entity or a conceptual topic.
- One subject can have many predicates and objects.
- An object is a property or attribute of a subject.
- A subject can be connected by a predicate to another subject.
Reading Instructions:
- Ignore document formatting in the provided text.
- Study the provided text carefully.
Here is the text:
{text}
Response Instructions:
- Obey the information network rules.
- Do not return special characters.
- Respond only with well-formed JSON.
- The JSON response shall be an array of JSON objects with keys "subject", "predicate", "object", and "object-entity".
- The JSON response shall use the following structure:
```json
[{{"subject": string, "predicate": string, "object": string, "object-entity": boolean}}]
```
- The key "object-entity" is TRUE only if the "object" is a subject.
- Do not write any additional text or explanations.
"""
return prompt
def to_topics(text):
prompt = f"""You are a helpful assistant that performs information extraction tasks for a provided text.\nRead the provided text. You will identify topics and their definitions in JSON.
Reading Instructions:
- Ignore document formatting in the provided text.
- Study the provided text carefully.
Here is the text:
{text}
Response Instructions:
- Do not respond with special characters.
- Return only topics that are concepts and unique to the provided text.
- Respond only with well-formed JSON.
- The JSON response shall be an array of objects with keys "topic" and "definition".
- The JSON response shall use the following structure:
```json
[{{"topic": string, "definition": string}}]
```
- Do not write any additional text or explanations.
"""
return prompt
def to_definitions(text):
prompt = f"""You are a helpful assistant that performs information extraction tasks for a provided text.\nRead the provided text. You will identify entities and their definitions in JSON.
Reading Instructions:
- Ignore document formatting in the provided text.
- Study the provided text carefully.
Here is the text:
{text}
Response Instructions:
- Do not respond with special characters.
- Return only entities that are named-entities such as: people, organizations, physical objects, locations, animals, products, commodities, or substances.
- Respond only with well-formed JSON.
- The JSON response shall be an array of objects with keys "entity" and "definition".
- The JSON response shall use the following structure:
```json
[{{"entity": string, "definition": string}}]
```
- Do not write any additional text or explanations.
"""
return prompt
def to_rows(schema, text):
field_schema = [
f"- Name: {f.name}\n Type: {f.type}\n Definition: {f.description}"
for f in schema.fields
]
field_schema = "\n".join(field_schema)
schema = f"""Object name: {schema.name}
Description: {schema.description}
Fields:
{field_schema}"""
prompt = f"""<instructions>
Study the following text and derive objects which match the schema provided.
You must output an array of JSON objects for each object you discover
which matches the schema. For each object, output a JSON object whose fields
carry the name field specified in the schema.
</instructions>
<schema>
{schema}
</schema>
<text>
{text}
</text>
<requirements>
You will respond only with raw JSON format data. Do not provide
explanations. Do not add markdown formatting or headers or prefixes.
</requirements>"""
return prompt
def get_cypher(kg):
sg2 = []
for f in kg:
print(f)
sg2.append(f"({f.s})-[{f.p}]->({f.o})")
print(sg2)
kg = "\n".join(sg2)
kg = kg.replace("\\", "-")
return kg
def to_kg_query(query, kg):
cypher = get_cypher(kg)
prompt=f"""Study the following set of knowledge statements. The statements are written in Cypher format that has been extracted from a knowledge graph. Use only the provided set of knowledge statements in your response. Do not speculate if the answer is not found in the provided set of knowledge statements.
Here's the knowledge statements:
{cypher}
Use only the provided knowledge statements to respond to the following:
{query}
"""
return prompt
def to_document_query(query, documents):
documents = "\n\n".join(documents)
prompt=f"""Study the following context. Use only the information provided in the context in your response. Do not speculate if the answer is not found in the provided set of knowledge statements.
Here is the context:
{documents}
Use only the provided knowledge statements to respond to the following:
{query}
"""
return prompt


@ -1,485 +0,0 @@
"""
Language service abstracts prompt engineering from LLM.
"""
#
# FIXME: This module is broken, it doesn't conform to the prompt API change
# made in 0.14, nor the prompt template support.
#
# It could be made to conform by using prompt-template as a starting
# point, and hard-coding all the information.
#
import json
import re
from .... schema import Definition, Relationship, Triple
from .... schema import Topic
from .... schema import PromptRequest, PromptResponse, Error
from .... schema import TextCompletionRequest, TextCompletionResponse
from .... schema import text_completion_request_queue
from .... schema import text_completion_response_queue
from .... schema import prompt_request_queue, prompt_response_queue
from .... base import ConsumerProducer
from .... clients.llm_client import LlmClient
from . prompts import to_definitions, to_relationships, to_topics
from . prompts import to_kg_query, to_document_query, to_rows
module = "prompt"
default_input_queue = prompt_request_queue
default_output_queue = prompt_response_queue
default_subscriber = module
class Processor(ConsumerProducer):
def __init__(self, **params):
input_queue = params.get("input_queue", default_input_queue)
output_queue = params.get("output_queue", default_output_queue)
subscriber = params.get("subscriber", default_subscriber)
tc_request_queue = params.get(
"text_completion_request_queue", text_completion_request_queue
)
tc_response_queue = params.get(
"text_completion_response_queue", text_completion_response_queue
)
super(Processor, self).__init__(
**params | {
"input_queue": input_queue,
"output_queue": output_queue,
"subscriber": subscriber,
"input_schema": PromptRequest,
"output_schema": PromptResponse,
"text_completion_request_queue": tc_request_queue,
"text_completion_response_queue": tc_response_queue,
}
)
self.llm = LlmClient(
subscriber=subscriber,
input_queue=tc_request_queue,
output_queue=tc_response_queue,
pulsar_host = self.pulsar_host,
pulsar_api_key=self.pulsar_api_key,
)
def parse_json(self, text):
json_match = re.search(r'```(?:json)?(.*?)```', text, re.DOTALL)
if json_match:
json_str = json_match.group(1).strip()
else:
# If no delimiters, assume the entire output is JSON
json_str = text.strip()
return json.loads(json_str)
async def handle(self, msg):
v = msg.value()
# Sender-produced ID
id = msg.properties()["id"]
kind = v.kind
print(f"Handling kind {kind}...", flush=True)
if kind == "extract-definitions":
await self.handle_extract_definitions(id, v)
return
elif kind == "extract-topics":
await self.handle_extract_topics(id, v)
return
elif kind == "extract-relationships":
await self.handle_extract_relationships(id, v)
return
elif kind == "extract-rows":
await self.handle_extract_rows(id, v)
return
elif kind == "kg-prompt":
await self.handle_kg_prompt(id, v)
return
elif kind == "document-prompt":
await self.handle_document_prompt(id, v)
return
else:
print("Invalid kind.", flush=True)
return
async def handle_extract_definitions(self, id, v):
try:
prompt = to_definitions(v.chunk)
ans = self.llm.request(prompt)
# Silently ignore JSON parse error
try:
defs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
defs = []
output = []
for defn in defs:
try:
e = defn["entity"]
d = defn["definition"]
if e == "": continue
if e is None: continue
if d == "": continue
if d is None: continue
output.append(
Definition(
name=e, definition=d
)
)
except:
print("definition fields missing, ignored", flush=True)
print("Send response...", flush=True)
r = PromptResponse(definitions=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_extract_topics(self, id, v):
try:
prompt = to_topics(v.chunk)
ans = self.llm.request(prompt)
# Silently ignore JSON parse error
try:
defs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
defs = []
output = []
for defn in defs:
try:
e = defn["topic"]
d = defn["definition"]
if e == "": continue
if e is None: continue
if d == "": continue
if d is None: continue
output.append(
Topic(
name=e, definition=d
)
)
except:
print("definition fields missing, ignored", flush=True)
print("Send response...", flush=True)
r = PromptResponse(topics=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_extract_relationships(self, id, v):
try:
prompt = to_relationships(v.chunk)
ans = self.llm.request(prompt)
# Silently ignore JSON parse error
try:
defs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
defs = []
output = []
for defn in defs:
try:
s = defn["subject"]
p = defn["predicate"]
o = defn["object"]
o_entity = defn["object-entity"]
if s == "": continue
if s is None: continue
if p == "": continue
if p is None: continue
if o == "": continue
if o is None: continue
if o_entity == "" or o_entity is None:
o_entity = False
output.append(
Relationship(
s = s,
p = p,
o = o,
o_entity = o_entity,
)
)
except Exception as e:
print("relationship fields missing, ignored", flush=True)
print("Send response...", flush=True)
r = PromptResponse(relationships=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_extract_rows(self, id, v):
try:
fields = v.row_schema.fields
prompt = to_rows(v.row_schema, v.chunk)
print(prompt)
ans = self.llm.request(prompt)
print(ans)
# Silently ignore JSON parse error
try:
objs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
objs = []
output = []
for obj in objs:
try:
row = {}
for f in fields:
if f.name not in obj:
print(f"Object ignored, missing field {f.name}")
row = {}
break
row[f.name] = obj[f.name]
if row == {}:
continue
output.append(row)
except Exception as e:
print("row fields missing, ignored", flush=True)
for row in output:
print(row)
print("Send response...", flush=True)
r = PromptResponse(rows=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_kg_prompt(self, id, v):
try:
prompt = to_kg_query(v.query, v.kg)
print(prompt)
ans = self.llm.request(prompt)
print(ans)
print("Send response...", flush=True)
r = PromptResponse(answer=ans, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_document_prompt(self, id, v):
try:
prompt = to_document_query(v.query, v.documents)
print("prompt")
print(prompt)
print("Call LLM...")
ans = self.llm.request(prompt)
print(ans)
print("Send response...", flush=True)
r = PromptResponse(answer=ans, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
@staticmethod
def add_args(parser):
ConsumerProducer.add_args(
parser, default_input_queue, default_subscriber,
default_output_queue,
)
parser.add_argument(
'--text-completion-request-queue',
default=text_completion_request_queue,
help=f'Text completion request queue (default: {text_completion_request_queue})',
)
parser.add_argument(
'--text-completion-response-queue',
default=text_completion_response_queue,
help=f'Text completion response queue (default: {text_completion_response_queue})',
)
def run():
raise RuntimeError("NOT IMPLEMENTED")
Processor.launch(module, __doc__)


@ -1,25 +0,0 @@
prompt-template \
-p pulsar://localhost:6650 \
--system-prompt 'You are a {{attitude}}, you are called {{name}}' \
--global-term \
'name=Craig' \
'attitude=LOUD, SHOUTY ANNOYING BOT' \
--prompt \
'question={{question}}' \
'french-question={{question}}' \
"analyze=Find the name and age in this text, and output a JSON structure containing just the name and age fields: {{description}}. Don't add markup, just output the raw JSON object." \
"graph-query=Study the following knowledge graph, and then answer the question.\\n\nGraph:\\n{% for edge in knowledge %}({{edge.0}})-[{{edge.1}}]->({{edge.2}})\\n{%endfor%}\\nQuestion:\\n{{question}}" \
"extract-definition=Analyse the text provided, and then return a list of terms and definitions. The output should be a JSON array, each item in the array is an object with fields 'term' and 'definition'.Don't add markup, just output the raw JSON object. Here is the text:\\n{{text}}" \
--prompt-response-type \
'question=text' \
'analyze=json' \
'graph-query=text' \
'extract-definition=json' \
--prompt-term \
'question=name:Bonny' \
'french-question=attitude:French-speaking bot' \
--prompt-schema \
'analyze={ "type" : "object", "properties" : { "age": { "type" : "number" }, "name": { "type" : "string" } } }' \
'extract-definition={ "type": "array", "items": { "type": "object", "properties": { "term": { "type": "string" }, "definition": { "type": "string" } }, "required": [ "term", "definition" ] } }'


@ -1,3 +0,0 @@
from . service import *


@ -1,7 +0,0 @@
#!/usr/bin/env python3
from . service import run
if __name__ == '__main__':
run()


@ -1,92 +0,0 @@
import ibis
import json
from jsonschema import validate
import re
class PromptConfiguration:
def __init__(self, system_template, global_terms={}, prompts={}):
self.system_template = system_template
self.global_terms = global_terms
self.prompts = prompts
class Prompt:
def __init__(self, template, response_type = "text", terms=None, schema=None):
self.template = template
self.response_type = response_type
self.terms = terms
self.schema = schema
class PromptManager:
def __init__(self, config):
self.config = config
self.terms = config.global_terms
self.prompts = config.prompts
try:
self.system_template = ibis.Template(config.system_template)
except:
raise RuntimeError("Error in system template")
self.templates = {}
for k, v in self.prompts.items():
try:
self.templates[k] = ibis.Template(v.template)
except:
raise RuntimeError(f"Error in template: {k}")
if v.terms is None:
v.terms = {}
def parse_json(self, text):
json_match = re.search(r'```(?:json)?(.*?)```', text, re.DOTALL)
if json_match:
json_str = json_match.group(1).strip()
else:
# If no delimiters, assume the entire output is JSON
json_str = text.strip()
return json.loads(json_str)
async def invoke(self, id, input, llm):
print("Invoke...", flush=True)
if id not in self.prompts:
raise RuntimeError("ID invalid")
terms = self.terms | self.prompts[id].terms | input
resp_type = self.prompts[id].response_type
prompt = {
"system": self.system_template.render(terms),
"prompt": self.templates[id].render(terms)
}
resp = await llm(**prompt)
if resp_type == "text":
return resp
if resp_type != "json":
raise RuntimeError(f"Response type {resp_type} not known")
try:
obj = self.parse_json(resp)
except:
print("Parse fail:", resp, flush=True)
raise RuntimeError("JSON parse fail")
if self.prompts[id].schema:
try:
validate(instance=obj, schema=self.prompts[id].schema)
print("Validated", flush=True)
except Exception as e:
raise RuntimeError(f"Schema validation fail: {e}")
return obj


@ -1,244 +0,0 @@
"""
Language service abstracts prompt engineering from LLM.
"""
import asyncio
import json
import re
from .... schema import Definition, Relationship, Triple
from .... schema import Topic
from .... schema import PromptRequest, PromptResponse, Error
from .... schema import TextCompletionRequest, TextCompletionResponse
from .... base import FlowProcessor
from .... base import ProducerSpec, ConsumerSpec, TextCompletionClientSpec
from . prompt_manager import PromptConfiguration, Prompt, PromptManager
default_ident = "prompt"
default_concurrency = 1
class Processor(FlowProcessor):
def __init__(self, **params):
id = params.get("id")
concurrency = params.get("concurrency", 1)
# Config key for prompts
self.config_key = params.get("config_type", "prompt")
super(Processor, self).__init__(
**params | {
"id": id,
"concurrency": concurrency,
}
)
self.register_specification(
ConsumerSpec(
name = "request",
schema = PromptRequest,
handler = self.on_request,
concurrency = concurrency,
)
)
self.register_specification(
TextCompletionClientSpec(
request_name = "text-completion-request",
response_name = "text-completion-response",
)
)
self.register_specification(
ProducerSpec(
name = "response",
schema = PromptResponse
)
)
self.register_config_handler(self.on_prompt_config)
# Null configuration, should reload quickly
self.manager = PromptManager(
config = PromptConfiguration("", {}, {})
)
async def on_prompt_config(self, config, version):
print("Loading configuration version", version)
if self.config_key not in config:
print(f"No key {self.config_key} in config", flush=True)
return
config = config[self.config_key]
try:
system = json.loads(config["system"])
ix = json.loads(config["template-index"])
prompts = {}
for k in ix:
pc = config[f"template.{k}"]
data = json.loads(pc)
prompt = data.get("prompt")
rtype = data.get("response-type", "text")
schema = data.get("schema", None)
prompts[k] = Prompt(
template = prompt,
response_type = rtype,
schema = schema,
terms = {}
)
self.manager = PromptManager(
PromptConfiguration(
system,
{},
prompts
)
)
print("Prompt configuration reloaded.", flush=True)
except Exception as e:
print("Exception:", e, flush=True)
print("Configuration reload failed", flush=True)
async def on_request(self, msg, consumer, flow):
v = msg.value()
# Sender-produced ID
id = msg.properties()["id"]
kind = v.id
try:
print(v.terms, flush=True)
input = {
k: json.loads(v)
for k, v in v.terms.items()
}
print(f"Handling kind {kind}...", flush=True)
async def llm(system, prompt):
print(system, flush=True)
print(prompt, flush=True)
resp = await flow("text-completion-request").text_completion(
system = system, prompt = prompt,
)
try:
return resp
except Exception as e:
print("LLM Exception:", e, flush=True)
return None
try:
resp = await self.manager.invoke(kind, input, llm)
except Exception as e:
print("Invocation exception:", e, flush=True)
raise e
print(resp, flush=True)
if isinstance(resp, str):
print("Send text response...", flush=True)
r = PromptResponse(
text=resp,
object=None,
error=None,
)
await flow("response").send(r, properties={"id": id})
return
else:
print("Send object response...", flush=True)
print(json.dumps(resp, indent=4), flush=True)
r = PromptResponse(
text=None,
object=json.dumps(resp),
error=None,
)
await flow("response").send(r, properties={"id": id})
return
except Exception as e:
print(f"Exception: {e}", flush=True)
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await flow("response").send(r, properties={"id": id})
except Exception as e:
print(f"Exception: {e}", flush=True)
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
@staticmethod
def add_args(parser):
parser.add_argument(
'-c', '--concurrency',
type=int,
default=default_concurrency,
help=f'Concurrent processing threads (default: {default_concurrency})'
)
FlowProcessor.add_args(parser)
parser.add_argument(
'--config-type',
default="prompt",
help=f'Configuration key for prompts (default: prompt)',
)
def run():
Processor.launch(default_ident, __doc__)


@ -8,10 +8,14 @@ import requests
import json
from prometheus_client import Histogram
import os
import logging
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
# Module logger
logger = logging.getLogger(__name__)
default_ident = "text-completion"
default_temperature = 0.0
@ -111,11 +115,11 @@ class Processor(LlmService):
inputtokens = response['usage']['prompt_tokens']
outputtokens = response['usage']['completion_tokens']
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
print("Send response...", flush=True)
logger.debug("Sending response...")
resp = LlmResult(
text = resp,
@ -128,7 +132,7 @@ class Processor(LlmService):
except TooManyRequests:
print("Rate limit...")
logger.warning("Rate limit exceeded")
# Leave rate limit retries to the base handler
raise TooManyRequests()
@ -137,10 +141,10 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Azure LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
print("Done.", flush=True)
logger.debug("Azure LLM processing complete")
@staticmethod
def add_args(parser):


@ -8,6 +8,10 @@ import json
from prometheus_client import Histogram
from openai import AzureOpenAI, RateLimitError
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -84,10 +88,10 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
print("Send response...", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
logger.debug("Sending response...")
r = LlmResult(
text = resp.choices[0].message.content,
@ -100,7 +104,7 @@ class Processor(LlmService):
except RateLimitError:
print("Send rate limit response...", flush=True)
logger.warning("Rate limit exceeded")
# Leave rate limit retries to the base handler
raise TooManyRequests()
@ -108,10 +112,10 @@ class Processor(LlmService):
except Exception as e:
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Azure OpenAI LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
print("Done.", flush=True)
logger.debug("Azure OpenAI LLM processing complete")
@staticmethod
def add_args(parser):


@ -6,10 +6,14 @@ Input is prompt, output is response.
import anthropic
import os
import logging
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
# Module logger
logger = logging.getLogger(__name__)
default_ident = "text-completion"
default_model = 'claude-3-5-sonnet-20240620'
@ -42,7 +46,7 @@ class Processor(LlmService):
self.temperature = temperature
self.max_output = max_output
print("Initialised", flush=True)
logger.info("Claude LLM service initialized")
async def generate_content(self, system, prompt):
@ -69,9 +73,9 @@ class Processor(LlmService):
resp = response.content[0].text
inputtokens = response.usage.input_tokens
outputtokens = response.usage.output_tokens
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp,
@ -91,7 +95,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Claude LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -7,6 +7,10 @@ Input is prompt, output is response.
import cohere
from prometheus_client import Histogram
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -39,7 +43,7 @@ class Processor(LlmService):
self.temperature = temperature
self.cohere = cohere.Client(api_key=api_key)
print("Initialised", flush=True)
logger.info("Cohere LLM service initialized")
async def generate_content(self, system, prompt):
@ -59,9 +63,9 @@ class Processor(LlmService):
inputtokens = int(output.meta.billed_units.input_tokens)
outputtokens = int(output.meta.billed_units.output_tokens)
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp,
@ -83,7 +87,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Cohere LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -17,6 +17,10 @@ from google.genai import types
from google.genai.types import HarmCategory, HarmBlockThreshold
from google.api_core.exceptions import ResourceExhausted
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -77,7 +81,7 @@ class Processor(LlmService):
# HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY: block_level,
]
print("Initialised", flush=True)
logger.info("GoogleAIStudio LLM service initialized")
async def generate_content(self, system, prompt):
@ -102,9 +106,9 @@ class Processor(LlmService):
resp = response.text
inputtokens = int(response.usage_metadata.prompt_token_count)
outputtokens = int(response.usage_metadata.candidates_token_count)
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp,
@ -117,7 +121,7 @@ class Processor(LlmService):
except ResourceExhausted as e:
print("Hit rate limit:", e, flush=True)
logger.warning("Rate limit exceeded")
# Leave rate limit retries to the default handler
raise TooManyRequests()
@ -126,8 +130,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(type(e), flush=True)
print(f"Exception: {e}", flush=True)
logger.error(f"GoogleAIStudio LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from openai import OpenAI
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -44,7 +48,7 @@ class Processor(LlmService):
api_key = "sk-no-key-required",
)
print("Initialised", flush=True)
logger.info("Llamafile LLM service initialized")
async def generate_content(self, system, prompt):
@ -70,9 +74,9 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -87,7 +91,7 @@ class Processor(LlmService):
except Exception as e:
print(f"Exception: {e}")
logger.error(f"Llamafile LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from openai import OpenAI
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -44,7 +48,7 @@ class Processor(LlmService):
api_key = "sk-no-key-required",
)
print("Initialised", flush=True)
logger.info("LMStudio LLM service initialized")
async def generate_content(self, system, prompt):
@ -52,7 +56,7 @@ class Processor(LlmService):
try:
print(prompt)
logger.debug(f"Prompt: {prompt}")
resp = self.openai.chat.completions.create(
model=self.model,
@ -69,14 +73,14 @@ class Processor(LlmService):
#}
)
print(resp)
logger.debug(f"Full response: {resp}")
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -91,7 +95,7 @@ class Processor(LlmService):
except Exception as e:
print(f"Exception: {e}")
logger.error(f"LMStudio LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from mistralai import Mistral
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -42,7 +46,7 @@ class Processor(LlmService):
self.max_output = max_output
self.mistral = Mistral(api_key=api_key)
print("Initialised", flush=True)
logger.info("Mistral LLM service initialized")
async def generate_content(self, system, prompt):
@ -75,9 +79,9 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -105,7 +109,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Mistral LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from ollama import Client
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -41,8 +45,8 @@ class Processor(LlmService):
response = self.llm.generate(self.model, prompt)
response_text = response['response']
print("Send response...", flush=True)
print(response_text, flush=True)
logger.debug("Sending response...")
logger.debug(f"LLM response: {response_text}")
inputtokens = int(response['prompt_eval_count'])
outputtokens = int(response['eval_count'])
@ -60,7 +64,7 @@ class Processor(LlmService):
except Exception as e:
print(f"Exception: {e}")
logger.error(f"Ollama LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,10 +6,14 @@ Input is prompt, output is response.
from openai import OpenAI, RateLimitError
import os
import logging
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
# Module logger
logger = logging.getLogger(__name__)
default_ident = "text-completion"
default_model = 'gpt-3.5-turbo'
@ -52,7 +56,7 @@ class Processor(LlmService):
else:
self.openai = OpenAI(api_key=api_key)
print("Initialised", flush=True)
logger.info("OpenAI LLM service initialized")
async def generate_content(self, system, prompt):
@ -85,9 +89,9 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -109,7 +113,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {type(e)} {e}")
logger.error(f"OpenAI LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
import os
import aiohttp
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -41,9 +45,8 @@ class Processor(LlmService):
self.session = aiohttp.ClientSession()
print("Using TGI service at", base_url)
print("Initialised", flush=True)
logger.info(f"Using TGI service at {base_url}")
logger.info("TGI LLM service initialized")
async def generate_content(self, system, prompt):
@ -85,9 +88,9 @@ class Processor(LlmService):
inputtokens = resp["usage"]["prompt_tokens"]
outputtokens = resp["usage"]["completion_tokens"]
ans = resp["choices"][0]["message"]["content"]
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
print(ans, flush=True)
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
logger.debug(f"LLM response: {ans}")
resp = LlmResult(
text = ans,
@ -104,7 +107,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {type(e)} {e}")
logger.error(f"TGI LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
import os
import aiohttp
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -45,9 +49,8 @@ class Processor(LlmService):
self.session = aiohttp.ClientSession()
print("Using vLLM service at", base_url)
print("Initialised", flush=True)
logger.info(f"Using vLLM service at {base_url}")
logger.info("vLLM LLM service initialized")
async def generate_content(self, system, prompt):
@ -80,9 +83,9 @@ class Processor(LlmService):
inputtokens = resp["usage"]["prompt_tokens"]
outputtokens = resp["usage"]["completion_tokens"]
ans = resp["choices"][0]["text"]
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
print(ans, flush=True)
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
logger.debug(f"LLM response: {ans}")
resp = LlmResult(
text = ans,
@ -99,7 +102,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {type(e)} {e}")
logger.error(f"vLLM LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod