Release/v1.2 (#457)

* Bump setup.py versions for 1.1

* PoC MCP server (#419)

* Very initial MCP server PoC for TrustGraph

* Put service on port 8000

* Add MCP container and packages to buildout

* Update docs for API/CLI changes in 1.0 (#421)

* Update some API basics for the 0.23/1.0 API change

* Add MCP container push (#425)

* Add command args to the MCP server (#426)

* Host and port parameters

* Added websocket arg

* More docs
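
A hedged sketch of what the server invocation might look like with those arguments (the flag names and the default websocket URL are assumptions taken from the bullets above, not verified):

```python
# Hypothetical launch sketch -- argument names mirror the bullets above.
import argparse

parser = argparse.ArgumentParser(description="TrustGraph MCP server (sketch)")
parser.add_argument("--host", default="0.0.0.0", help="listen address")
parser.add_argument("--port", type=int, default=8000, help="listen port (8000 per the PoC)")
parser.add_argument("--websocket", default="ws://localhost:8088/api/v1/socket",
                    help="TrustGraph websocket API the server talks to (assumed URL)")
args = parser.parse_args()
print(f"MCP server would listen on {args.host}:{args.port}, using {args.websocket}")
```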

* MCP client support (#427)

- MCP client service
- Tool request/response schema
- API gateway support for mcp-tool
- Message translation for tool request & response
- Make mcp-tool use the configuration service for information
  about where the MCP services are.
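
A rough sketch of what the tool request/response pair might look like (illustrative only; the field names below are assumptions, not the shipped schema):

```python
# Illustrative sketch only -- field names are assumptions, not the actual schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class McpToolRequest:
    name: str                                        # tool id, resolved via the config service
    parameters: dict = field(default_factory=dict)   # tool arguments (JSON-serialisable)

@dataclass
class McpToolResponse:
    text: Optional[str] = None      # plain-text tool output
    object: Optional[str] = None    # JSON-encoded structured output
    error: Optional[str] = None     # populated when the MCP call fails
```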

* Feature/react call mcp (#428)

Key Features

  - MCP Tool Integration: Added core MCP tool support with ToolClientSpec and ToolClient classes
  - API Enhancement: New mcp_tool method for flow-specific tool invocation
  - CLI Tooling: New tg-invoke-mcp-tool command for testing MCP integration
  - React Agent Enhancement: Fixed and improved multi-tool invocation capabilities
  - Tool Management: Enhanced CLI for tool configuration and management

Changes

  - Added MCP tool invocation to API with flow-specific integration
  - Implemented ToolClientSpec and ToolClient for tool call handling
  - Updated agent-manager-react to invoke MCP tools with configurable types
  - Enhanced CLI with new commands and improved help text
  - Added comprehensive documentation for new CLI commands
  - Improved tool configuration management

Testing

  - Added tg-invoke-mcp-tool CLI command for isolated MCP integration testing
  - Enhanced agent capability to invoke multiple tools simultaneously
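
As a sketch of how the new flow-scoped tool invocation might be driven from Python (the import path, method, and argument names here are assumptions based on the description above, not a verified API):

```python
# Hypothetical usage sketch -- import path, method and parameter names are assumptions.
from trustgraph.api import Api

api = Api("http://localhost:8088/")
flow = api.flow().id("default")

# Invoke an MCP tool within the context of a specific flow.
result = flow.mcp_tool(name="weather", parameters={"city": "London"})
print(result)
```

The tg-invoke-mcp-tool command is intended for exercising the same integration from the shell, without writing any code.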

* Test suite executed from CI pipeline (#433)

* Test strategy & test cases

* Unit tests

* Integration tests

* Extending test coverage (#434)

* Contract tests

* Testing embeddings

* Agent unit tests

* Knowledge pipeline tests

* Turn on contract tests

* Increase storage test coverage (#435)

* Fixing storage and adding tests

* PR pipeline only runs quick tests
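
One conventional way to make that split (a sketch, not necessarily how this pipeline is configured) is pytest markers, with the PR job running `pytest -m "not slow"`:

```python
# Sketch: tag long-running suites so the PR pipeline can exclude them.
import pytest

@pytest.mark.slow
def test_full_knowledge_pipeline():
    ...   # exercised only in the full CI run

def test_token_chunker_quick():
    ...   # cheap unit test, always runs
```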

* Empty configuration is returned as an empty list; previously it was omitted from the response (#436)
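
Roughly, the behaviour change looks like this (illustrative response shapes, not the exact wire format):

```python
# Before: querying a config type with no entries omitted the values field entirely.
# After: the same query always carries an explicit empty list.
response_before = {"version": 3}
response_after = {"version": 3, "values": []}
```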

* Update config util to take files as well as command-line text (#437)
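
A minimal sketch of the "inline text or file" pattern (not the actual utility's code; the @file convention is just one plausible choice):

```python
# Sketch of accepting a config value either inline or from a file.
import argparse

def read_value(spec: str) -> str:
    # "@path" means "load the value from this file"; anything else is literal text.
    if spec.startswith("@"):
        with open(spec[1:], "r") as f:
            return f.read()
    return spec

parser = argparse.ArgumentParser()
parser.add_argument("--config", default="{}",
                    help="inline JSON, or @file.json to read it from a file")
args = parser.parse_args()
print(read_value(args.config))
```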

* Updated CLI invocation and config model for tools and mcp (#438)

* Updated CLI invocation and config model for tools and mcp

* CLI anomalies

* Tweaked the MCP tool implementation for new model

* Update agent implementation to match the new model

* Fix agent tools, now all tested

* Fixed integration tests

* Fix MCP delete tool params

* Update Python deps to 1.2

* Update to enable knowledge extraction using the agent framework (#439)

* Implement KG extraction agent (kg-extract-agent)

* Using ReAct framework (agent-manager-react)
 
* ReAct manager had an issue when emitting JSON, which conflicted with the ReAct manager's own JSON messages, so the ReAct manager was refactored to use traditional ReAct messages with a non-JSON structure.
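
For context, a traditional ReAct turn has a plain-text structure along these lines (illustrative trace, not actual agent output), so JSON produced by a tool or extraction step can sit inside the Observation without colliding with the manager's own framing:

```python
# Illustrative ReAct-style turn emitted by the refactored manager (plain text, no JSON framing).
example_turn = """\
Thought: I need the entities mentioned in this chunk.
Action: extract-definitions
Action Input: <chunk text>
Observation: [{"entity": "Acme Corp", "definition": "A manufacturing company"}]
"""
```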
 
* Minor refactor to take the prompt template client out of prompt-template so it can be more readily used by other modules. kg-extract-agent uses this framework.

* Migrate from setup.py to pyproject.toml (#440)

* Converted setup.py to pyproject.toml

* Modern packaging infrastructure as recommended by the Python packaging docs
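
For reference, the replacement takes the shape of a standard setuptools-backed pyproject.toml; the values below are placeholders, not the actual package metadata:

```toml
# Placeholder metadata -- illustrative of the migration, not copied from the repo.
[build-system]
requires = ["setuptools>=61", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "trustgraph-example"
version = "1.2.0"
description = "Placeholder package description"
requires-python = ">=3.10"
dependencies = []
```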

* Install missing build deps (#441)

* Install missing build deps (#442)

* Implement logging strategy (#444)

* Logging strategy; convert all print() calls to logging invocations
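
The pattern applied throughout (visible in the diffs below) is a module-level logger in place of print(), with verbose payloads at DEBUG and operational detail at INFO:

```python
# Module-level logger pattern used across the converted services.
import logging

logger = logging.getLogger(__name__)

def report_usage(resp, inputtokens, outputtokens):
    logger.debug(f"LLM response: {resp}")         # full payloads only at DEBUG
    logger.info(f"Input Tokens: {inputtokens}")   # operational metrics at INFO
    logger.info(f"Output Tokens: {outputtokens}")
```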

* Fix/startup failure (#445)

* Fix logging startup problems

* Fix logging startup problems (#446)

* Fix logging startup problems (#447)

* Fixed Mistral OCR to use current API (#448)

* Fixed Mistral OCR to use current API

* Added PDF decoder tests

* Fix Mistral OCR ident to be standard pdf-decoder (#450)

* Fix Mistral OCR ident to be standard pdf-decoder

* Correct test

* Schema structure refactor (#451)

* Write schema refactor spec

* Implemented schema refactor spec

* Structure data mvp (#452)

* Structured data tech spec

* Architecture principles

* New schemas

* Updated schemas and specs

* Object extractor

* Add .coveragerc

* New tests

* Cassandra object storage

* Trying to get object extraction working; issues remain
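
A rough sketch of the object-extraction flow being assembled here (schema-driven row extraction; the names are illustrative, not the shipped implementation):

```python
# Illustrative only: a schema describes the objects to pull out of each text chunk.
from dataclasses import dataclass, field

@dataclass
class Field:
    name: str
    type: str
    description: str

@dataclass
class Schema:
    name: str
    description: str
    fields: list = field(default_factory=list)

customer = Schema(
    name="customer",
    description="A customer record",
    fields=[
        Field("name", "string", "Customer name"),
        Field("city", "string", "City of residence"),
    ],
)

# The object extractor renders a prompt from the schema plus a text chunk, asks the
# LLM for a JSON array of matching objects, validates each row against the schema's
# field names, and writes the rows to object storage (Cassandra in this MVP).
```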

* Validate librarian collection (#453)

* Fix token chunker, broken API invocation (#454)

* Fix token chunker, broken API invocation (#455)

* Knowledge load utility CLI (#456)

* Knowledge loader

* More tests
Commit: 89be656990 (parent: c85ba197be)
Author: cybermaggedon, 2025-08-18 20:56:09 +01:00; committed by GitHub
509 changed files with 49632 additions and 5159 deletions


@ -1,3 +0,0 @@
from . service import *


@ -1,7 +0,0 @@
#!/usr/bin/env python3
from . service import run
if __name__ == '__main__':
run()


@ -1,176 +0,0 @@
def to_relationships(text):
prompt = f"""You are a helpful assistant that performs information extraction tasks for a provided text.
Read the provided text. You will model the text as an information network for a RDF knowledge graph in JSON.
Information Network Rules:
- An information network has subjects connected by predicates to objects.
- A subject is a named-entity or a conceptual topic.
- One subject can have many predicates and objects.
- An object is a property or attribute of a subject.
- A subject can be connected by a predicate to another subject.
Reading Instructions:
- Ignore document formatting in the provided text.
- Study the provided text carefully.
Here is the text:
{text}
Response Instructions:
- Obey the information network rules.
- Do not return special characters.
- Respond only with well-formed JSON.
- The JSON response shall be an array of JSON objects with keys "subject", "predicate", "object", and "object-entity".
- The JSON response shall use the following structure:
```json
[{{"subject": string, "predicate": string, "object": string, "object-entity": boolean}}]
```
- The key "object-entity" is TRUE only if the "object" is a subject.
- Do not write any additional text or explanations.
"""
return prompt
def to_topics(text):
prompt = f"""You are a helpful assistant that performs information extraction tasks for a provided text.\nRead the provided text. You will identify topics and their definitions in JSON.
Reading Instructions:
- Ignore document formatting in the provided text.
- Study the provided text carefully.
Here is the text:
{text}
Response Instructions:
- Do not respond with special characters.
- Return only topics that are concepts and unique to the provided text.
- Respond only with well-formed JSON.
- The JSON response shall be an array of objects with keys "topic" and "definition".
- The JSON response shall use the following structure:
```json
[{{"topic": string, "definition": string}}]
```
- Do not write any additional text or explanations.
"""
return prompt
def to_definitions(text):
prompt = f"""You are a helpful assistant that performs information extraction tasks for a provided text.\nRead the provided text. You will identify entities and their definitions in JSON.
Reading Instructions:
- Ignore document formatting in the provided text.
- Study the provided text carefully.
Here is the text:
{text}
Response Instructions:
- Do not respond with special characters.
- Return only entities that are named-entities such as: people, organizations, physical objects, locations, animals, products, commodities, or substances.
- Respond only with well-formed JSON.
- The JSON response shall be an array of objects with keys "entity" and "definition".
- The JSON response shall use the following structure:
```json
[{{"entity": string, "definition": string}}]
```
- Do not write any additional text or explanations.
"""
return prompt
def to_rows(schema, text):
field_schema = [
f"- Name: {f.name}\n Type: {f.type}\n Definition: {f.description}"
for f in schema.fields
]
field_schema = "\n".join(field_schema)
schema = f"""Object name: {schema.name}
Description: {schema.description}
Fields:
{field_schema}"""
prompt = f"""<instructions>
Study the following text and derive objects which match the schema provided.
You must output an array of JSON objects for each object you discover
which matches the schema. For each object, output a JSON object whose fields
carry the name field specified in the schema.
</instructions>
<schema>
{schema}
</schema>
<text>
{text}
</text>
<requirements>
You will respond only with raw JSON format data. Do not provide
explanations. Do not add markdown formatting or headers or prefixes.
</requirements>"""
return prompt
def get_cypher(kg):
sg2 = []
for f in kg:
print(f)
sg2.append(f"({f.s})-[{f.p}]->({f.o})")
print(sg2)
kg = "\n".join(sg2)
kg = kg.replace("\\", "-")
return kg
def to_kg_query(query, kg):
cypher = get_cypher(kg)
prompt=f"""Study the following set of knowledge statements. The statements are written in Cypher format that has been extracted from a knowledge graph. Use only the provided set of knowledge statements in your response. Do not speculate if the answer is not found in the provided set of knowledge statements.
Here's the knowledge statements:
{cypher}
Use only the provided knowledge statements to respond to the following:
{query}
"""
return prompt
def to_document_query(query, documents):
documents = "\n\n".join(documents)
prompt=f"""Study the following context. Use only the information provided in the context in your response. Do not speculate if the answer is not found in the provided set of knowledge statements.
Here is the context:
{documents}
Use only the provided knowledge statements to respond to the following:
{query}
"""
return prompt


@ -1,485 +0,0 @@
"""
Language service abstracts prompt engineering from LLM.
"""
#
# FIXME: This module is broken, it doesn't conform to the prompt API change
# made in 0.14, nor the prompt template support.
#
# It could be made to conform by using prompt-template as a starting
# point, and hard-coding all the information.
#
import json
import re
from .... schema import Definition, Relationship, Triple
from .... schema import Topic
from .... schema import PromptRequest, PromptResponse, Error
from .... schema import TextCompletionRequest, TextCompletionResponse
from .... schema import text_completion_request_queue
from .... schema import text_completion_response_queue
from .... schema import prompt_request_queue, prompt_response_queue
from .... base import ConsumerProducer
from .... clients.llm_client import LlmClient
from . prompts import to_definitions, to_relationships, to_topics
from . prompts import to_kg_query, to_document_query, to_rows
module = "prompt"
default_input_queue = prompt_request_queue
default_output_queue = prompt_response_queue
default_subscriber = module
class Processor(ConsumerProducer):
def __init__(self, **params):
input_queue = params.get("input_queue", default_input_queue)
output_queue = params.get("output_queue", default_output_queue)
subscriber = params.get("subscriber", default_subscriber)
tc_request_queue = params.get(
"text_completion_request_queue", text_completion_request_queue
)
tc_response_queue = params.get(
"text_completion_response_queue", text_completion_response_queue
)
super(Processor, self).__init__(
**params | {
"input_queue": input_queue,
"output_queue": output_queue,
"subscriber": subscriber,
"input_schema": PromptRequest,
"output_schema": PromptResponse,
"text_completion_request_queue": tc_request_queue,
"text_completion_response_queue": tc_response_queue,
}
)
self.llm = LlmClient(
subscriber=subscriber,
input_queue=tc_request_queue,
output_queue=tc_response_queue,
pulsar_host = self.pulsar_host,
pulsar_api_key=self.pulsar_api_key,
)
def parse_json(self, text):
json_match = re.search(r'```(?:json)?(.*?)```', text, re.DOTALL)
if json_match:
json_str = json_match.group(1).strip()
else:
# If no delimiters, assume the entire output is JSON
json_str = text.strip()
return json.loads(json_str)
async def handle(self, msg):
v = msg.value()
# Sender-produced ID
id = msg.properties()["id"]
kind = v.kind
print(f"Handling kind {kind}...", flush=True)
if kind == "extract-definitions":
await self.handle_extract_definitions(id, v)
return
elif kind == "extract-topics":
await self.handle_extract_topics(id, v)
return
elif kind == "extract-relationships":
await self.handle_extract_relationships(id, v)
return
elif kind == "extract-rows":
await self.handle_extract_rows(id, v)
return
elif kind == "kg-prompt":
await self.handle_kg_prompt(id, v)
return
elif kind == "document-prompt":
await self.handle_document_prompt(id, v)
return
else:
print("Invalid kind.", flush=True)
return
async def handle_extract_definitions(self, id, v):
try:
prompt = to_definitions(v.chunk)
ans = self.llm.request(prompt)
# Silently ignore JSON parse error
try:
defs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
defs = []
output = []
for defn in defs:
try:
e = defn["entity"]
d = defn["definition"]
if e == "": continue
if e is None: continue
if d == "": continue
if d is None: continue
output.append(
Definition(
name=e, definition=d
)
)
except:
print("definition fields missing, ignored", flush=True)
print("Send response...", flush=True)
r = PromptResponse(definitions=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_extract_topics(self, id, v):
try:
prompt = to_topics(v.chunk)
ans = self.llm.request(prompt)
# Silently ignore JSON parse error
try:
defs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
defs = []
output = []
for defn in defs:
try:
e = defn["topic"]
d = defn["definition"]
if e == "": continue
if e is None: continue
if d == "": continue
if d is None: continue
output.append(
Topic(
name=e, definition=d
)
)
except:
print("definition fields missing, ignored", flush=True)
print("Send response...", flush=True)
r = PromptResponse(topics=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_extract_relationships(self, id, v):
try:
prompt = to_relationships(v.chunk)
ans = self.llm.request(prompt)
# Silently ignore JSON parse error
try:
defs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
defs = []
output = []
for defn in defs:
try:
s = defn["subject"]
p = defn["predicate"]
o = defn["object"]
o_entity = defn["object-entity"]
if s == "": continue
if s is None: continue
if p == "": continue
if p is None: continue
if o == "": continue
if o is None: continue
if o_entity == "" or o_entity is None:
o_entity = False
output.append(
Relationship(
s = s,
p = p,
o = o,
o_entity = o_entity,
)
)
except Exception as e:
print("relationship fields missing, ignored", flush=True)
print("Send response...", flush=True)
r = PromptResponse(relationships=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_extract_rows(self, id, v):
try:
fields = v.row_schema.fields
prompt = to_rows(v.row_schema, v.chunk)
print(prompt)
ans = self.llm.request(prompt)
print(ans)
# Silently ignore JSON parse error
try:
objs = self.parse_json(ans)
except:
print("JSON parse error, ignored", flush=True)
objs = []
output = []
for obj in objs:
try:
row = {}
for f in fields:
if f.name not in obj:
print(f"Object ignored, missing field {f.name}")
row = {}
break
row[f.name] = obj[f.name]
if row == {}:
continue
output.append(row)
except Exception as e:
print("row fields missing, ignored", flush=True)
for row in output:
print(row)
print("Send response...", flush=True)
r = PromptResponse(rows=output, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_kg_prompt(self, id, v):
try:
prompt = to_kg_query(v.query, v.kg)
print(prompt)
ans = self.llm.request(prompt)
print(ans)
print("Send response...", flush=True)
r = PromptResponse(answer=ans, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
async def handle_document_prompt(self, id, v):
try:
prompt = to_document_query(v.query, v.documents)
print("prompt")
print(prompt)
print("Call LLM...")
ans = self.llm.request(prompt)
print(ans)
print("Send response...", flush=True)
r = PromptResponse(answer=ans, error=None)
await self.send(r, properties={"id": id})
print("Done.", flush=True)
except Exception as e:
print(f"Exception: {e}")
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
@staticmethod
def add_args(parser):
ConsumerProducer.add_args(
parser, default_input_queue, default_subscriber,
default_output_queue,
)
parser.add_argument(
'--text-completion-request-queue',
default=text_completion_request_queue,
help=f'Text completion request queue (default: {text_completion_request_queue})',
)
parser.add_argument(
'--text-completion-response-queue',
default=text_completion_response_queue,
help=f'Text completion response queue (default: {text_completion_response_queue})',
)
def run():
raise RuntimeError("NOT IMPLEMENTED")
Processor.launch(module, __doc__)


@ -1,25 +0,0 @@
prompt-template \
-p pulsar://localhost:6650 \
--system-prompt 'You are a {{attitude}}, you are called {{name}}' \
--global-term \
'name=Craig' \
'attitude=LOUD, SHOUTY ANNOYING BOT' \
--prompt \
'question={{question}}' \
'french-question={{question}}' \
"analyze=Find the name and age in this text, and output a JSON structure containing just the name and age fields: {{description}}. Don't add markup, just output the raw JSON object." \
"graph-query=Study the following knowledge graph, and then answer the question.\\n\nGraph:\\n{% for edge in knowledge %}({{edge.0}})-[{{edge.1}}]->({{edge.2}})\\n{%endfor%}\\nQuestion:\\n{{question}}" \
"extract-definition=Analyse the text provided, and then return a list of terms and definitions. The output should be a JSON array, each item in the array is an object with fields 'term' and 'definition'.Don't add markup, just output the raw JSON object. Here is the text:\\n{{text}}" \
--prompt-response-type \
'question=text' \
'analyze=json' \
'graph-query=text' \
'extract-definition=json' \
--prompt-term \
'question=name:Bonny' \
'french-question=attitude:French-speaking bot' \
--prompt-schema \
'analyze={ "type" : "object", "properties" : { "age": { "type" : "number" }, "name": { "type" : "string" } } }' \
'extract-definition={ "type": "array", "items": { "type": "object", "properties": { "term": { "type": "string" }, "definition": { "type": "string" } }, "required": [ "term", "definition" ] } }'


@ -1,3 +0,0 @@
from . service import *


@ -1,7 +0,0 @@
#!/usr/bin/env python3
from . service import run
if __name__ == '__main__':
run()


@ -1,92 +0,0 @@
import ibis
import json
from jsonschema import validate
import re
class PromptConfiguration:
def __init__(self, system_template, global_terms={}, prompts={}):
self.system_template = system_template
self.global_terms = global_terms
self.prompts = prompts
class Prompt:
def __init__(self, template, response_type = "text", terms=None, schema=None):
self.template = template
self.response_type = response_type
self.terms = terms
self.schema = schema
class PromptManager:
def __init__(self, config):
self.config = config
self.terms = config.global_terms
self.prompts = config.prompts
try:
self.system_template = ibis.Template(config.system_template)
except:
raise RuntimeError("Error in system template")
self.templates = {}
for k, v in self.prompts.items():
try:
self.templates[k] = ibis.Template(v.template)
except:
raise RuntimeError(f"Error in template: {k}")
if v.terms is None:
v.terms = {}
def parse_json(self, text):
json_match = re.search(r'```(?:json)?(.*?)```', text, re.DOTALL)
if json_match:
json_str = json_match.group(1).strip()
else:
# If no delimiters, assume the entire output is JSON
json_str = text.strip()
return json.loads(json_str)
async def invoke(self, id, input, llm):
print("Invoke...", flush=True)
if id not in self.prompts:
raise RuntimeError("ID invalid")
terms = self.terms | self.prompts[id].terms | input
resp_type = self.prompts[id].response_type
prompt = {
"system": self.system_template.render(terms),
"prompt": self.templates[id].render(terms)
}
resp = await llm(**prompt)
if resp_type == "text":
return resp
if resp_type != "json":
raise RuntimeError(f"Response type {resp_type} not known")
try:
obj = self.parse_json(resp)
except:
print("Parse fail:", resp, flush=True)
raise RuntimeError("JSON parse fail")
if self.prompts[id].schema:
try:
validate(instance=obj, schema=self.prompts[id].schema)
print("Validated", flush=True)
except Exception as e:
raise RuntimeError(f"Schema validation fail: {e}")
return obj


@ -1,244 +0,0 @@
"""
Language service abstracts prompt engineering from LLM.
"""
import asyncio
import json
import re
from .... schema import Definition, Relationship, Triple
from .... schema import Topic
from .... schema import PromptRequest, PromptResponse, Error
from .... schema import TextCompletionRequest, TextCompletionResponse
from .... base import FlowProcessor
from .... base import ProducerSpec, ConsumerSpec, TextCompletionClientSpec
from . prompt_manager import PromptConfiguration, Prompt, PromptManager
default_ident = "prompt"
default_concurrency = 1
class Processor(FlowProcessor):
def __init__(self, **params):
id = params.get("id")
concurrency = params.get("concurrency", 1)
# Config key for prompts
self.config_key = params.get("config_type", "prompt")
super(Processor, self).__init__(
**params | {
"id": id,
"concurrency": concurrency,
}
)
self.register_specification(
ConsumerSpec(
name = "request",
schema = PromptRequest,
handler = self.on_request,
concurrency = concurrency,
)
)
self.register_specification(
TextCompletionClientSpec(
request_name = "text-completion-request",
response_name = "text-completion-response",
)
)
self.register_specification(
ProducerSpec(
name = "response",
schema = PromptResponse
)
)
self.register_config_handler(self.on_prompt_config)
# Null configuration, should reload quickly
self.manager = PromptManager(
config = PromptConfiguration("", {}, {})
)
async def on_prompt_config(self, config, version):
print("Loading configuration version", version)
if self.config_key not in config:
print(f"No key {self.config_key} in config", flush=True)
return
config = config[self.config_key]
try:
system = json.loads(config["system"])
ix = json.loads(config["template-index"])
prompts = {}
for k in ix:
pc = config[f"template.{k}"]
data = json.loads(pc)
prompt = data.get("prompt")
rtype = data.get("response-type", "text")
schema = data.get("schema", None)
prompts[k] = Prompt(
template = prompt,
response_type = rtype,
schema = schema,
terms = {}
)
self.manager = PromptManager(
PromptConfiguration(
system,
{},
prompts
)
)
print("Prompt configuration reloaded.", flush=True)
except Exception as e:
print("Exception:", e, flush=True)
print("Configuration reload failed", flush=True)
async def on_request(self, msg, consumer, flow):
v = msg.value()
# Sender-produced ID
id = msg.properties()["id"]
kind = v.id
try:
print(v.terms, flush=True)
input = {
k: json.loads(v)
for k, v in v.terms.items()
}
print(f"Handling kind {kind}...", flush=True)
async def llm(system, prompt):
print(system, flush=True)
print(prompt, flush=True)
resp = await flow("text-completion-request").text_completion(
system = system, prompt = prompt,
)
try:
return resp
except Exception as e:
print("LLM Exception:", e, flush=True)
return None
try:
resp = await self.manager.invoke(kind, input, llm)
except Exception as e:
print("Invocation exception:", e, flush=True)
raise e
print(resp, flush=True)
if isinstance(resp, str):
print("Send text response...", flush=True)
r = PromptResponse(
text=resp,
object=None,
error=None,
)
await flow("response").send(r, properties={"id": id})
return
else:
print("Send object response...", flush=True)
print(json.dumps(resp, indent=4), flush=True)
r = PromptResponse(
text=None,
object=json.dumps(resp),
error=None,
)
await flow("response").send(r, properties={"id": id})
return
except Exception as e:
print(f"Exception: {e}", flush=True)
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await flow("response").send(r, properties={"id": id})
except Exception as e:
print(f"Exception: {e}", flush=True)
print("Send error response...", flush=True)
r = PromptResponse(
error=Error(
type = "llm-error",
message = str(e),
),
response=None,
)
await self.send(r, properties={"id": id})
@staticmethod
def add_args(parser):
parser.add_argument(
'-c', '--concurrency',
type=int,
default=default_concurrency,
help=f'Concurrent processing threads (default: {default_concurrency})'
)
FlowProcessor.add_args(parser)
parser.add_argument(
'--config-type',
default="prompt",
help=f'Configuration key for prompts (default: prompt)',
)
def run():
Processor.launch(default_ident, __doc__)


@ -8,10 +8,14 @@ import requests
import json
from prometheus_client import Histogram
import os
import logging
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
# Module logger
logger = logging.getLogger(__name__)
default_ident = "text-completion"
default_temperature = 0.0
@ -111,11 +115,11 @@ class Processor(LlmService):
inputtokens = response['usage']['prompt_tokens']
outputtokens = response['usage']['completion_tokens']
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
print("Send response...", flush=True)
logger.debug("Sending response...")
resp = LlmResult(
text = resp,
@ -128,7 +132,7 @@ class Processor(LlmService):
except TooManyRequests:
print("Rate limit...")
logger.warning("Rate limit exceeded")
# Leave rate limit retries to the base handler
raise TooManyRequests()
@ -137,10 +141,10 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Azure LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
print("Done.", flush=True)
logger.debug("Azure LLM processing complete")
@staticmethod
def add_args(parser):


@ -8,6 +8,10 @@ import json
from prometheus_client import Histogram
from openai import AzureOpenAI, RateLimitError
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -84,10 +88,10 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
print("Send response...", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
logger.debug("Sending response...")
r = LlmResult(
text = resp.choices[0].message.content,
@ -100,7 +104,7 @@ class Processor(LlmService):
except RateLimitError:
print("Send rate limit response...", flush=True)
logger.warning("Rate limit exceeded")
# Leave rate limit retries to the base handler
raise TooManyRequests()
@ -108,10 +112,10 @@ class Processor(LlmService):
except Exception as e:
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Azure OpenAI LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
print("Done.", flush=True)
logger.debug("Azure OpenAI LLM processing complete")
@staticmethod
def add_args(parser):


@ -6,10 +6,14 @@ Input is prompt, output is response.
import anthropic
import os
import logging
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
# Module logger
logger = logging.getLogger(__name__)
default_ident = "text-completion"
default_model = 'claude-3-5-sonnet-20240620'
@ -42,7 +46,7 @@ class Processor(LlmService):
self.temperature = temperature
self.max_output = max_output
print("Initialised", flush=True)
logger.info("Claude LLM service initialized")
async def generate_content(self, system, prompt):
@ -69,9 +73,9 @@ class Processor(LlmService):
resp = response.content[0].text
inputtokens = response.usage.input_tokens
outputtokens = response.usage.output_tokens
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp,
@ -91,7 +95,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Claude LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -7,6 +7,10 @@ Input is prompt, output is response.
import cohere
from prometheus_client import Histogram
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -39,7 +43,7 @@ class Processor(LlmService):
self.temperature = temperature
self.cohere = cohere.Client(api_key=api_key)
print("Initialised", flush=True)
logger.info("Cohere LLM service initialized")
async def generate_content(self, system, prompt):
@ -59,9 +63,9 @@ class Processor(LlmService):
inputtokens = int(output.meta.billed_units.input_tokens)
outputtokens = int(output.meta.billed_units.output_tokens)
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp,
@ -83,7 +87,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Cohere LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -17,6 +17,10 @@ from google.genai import types
from google.genai.types import HarmCategory, HarmBlockThreshold
from google.api_core.exceptions import ResourceExhausted
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -77,7 +81,7 @@ class Processor(LlmService):
# HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY: block_level,
]
print("Initialised", flush=True)
logger.info("GoogleAIStudio LLM service initialized")
async def generate_content(self, system, prompt):
@ -102,9 +106,9 @@ class Processor(LlmService):
resp = response.text
inputtokens = int(response.usage_metadata.prompt_token_count)
outputtokens = int(response.usage_metadata.candidates_token_count)
print(resp, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp,
@ -117,7 +121,7 @@ class Processor(LlmService):
except ResourceExhausted as e:
print("Hit rate limit:", e, flush=True)
logger.warning("Rate limit exceeded")
# Leave rate limit retries to the default handler
raise TooManyRequests()
@ -126,8 +130,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(type(e), flush=True)
print(f"Exception: {e}", flush=True)
logger.error(f"GoogleAIStudio LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from openai import OpenAI
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -44,7 +48,7 @@ class Processor(LlmService):
api_key = "sk-no-key-required",
)
print("Initialised", flush=True)
logger.info("Llamafile LLM service initialized")
async def generate_content(self, system, prompt):
@ -70,9 +74,9 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -87,7 +91,7 @@ class Processor(LlmService):
except Exception as e:
print(f"Exception: {e}")
logger.error(f"Llamafile LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from openai import OpenAI
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -44,7 +48,7 @@ class Processor(LlmService):
api_key = "sk-no-key-required",
)
print("Initialised", flush=True)
logger.info("LMStudio LLM service initialized")
async def generate_content(self, system, prompt):
@ -52,7 +56,7 @@ class Processor(LlmService):
try:
print(prompt)
logger.debug(f"Prompt: {prompt}")
resp = self.openai.chat.completions.create(
model=self.model,
@ -69,14 +73,14 @@ class Processor(LlmService):
#}
)
print(resp)
logger.debug(f"Full response: {resp}")
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -91,7 +95,7 @@ class Processor(LlmService):
except Exception as e:
print(f"Exception: {e}")
logger.error(f"LMStudio LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from mistralai import Mistral
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -42,7 +46,7 @@ class Processor(LlmService):
self.max_output = max_output
self.mistral = Mistral(api_key=api_key)
print("Initialised", flush=True)
logger.info("Mistral LLM service initialized")
async def generate_content(self, system, prompt):
@ -75,9 +79,9 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -105,7 +109,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {e}")
logger.error(f"Mistral LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
from ollama import Client
import os
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -41,8 +45,8 @@ class Processor(LlmService):
response = self.llm.generate(self.model, prompt)
response_text = response['response']
print("Send response...", flush=True)
print(response_text, flush=True)
logger.debug("Sending response...")
logger.debug(f"LLM response: {response_text}")
inputtokens = int(response['prompt_eval_count'])
outputtokens = int(response['eval_count'])
@ -60,7 +64,7 @@ class Processor(LlmService):
except Exception as e:
print(f"Exception: {e}")
logger.error(f"Ollama LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,10 +6,14 @@ Input is prompt, output is response.
from openai import OpenAI, RateLimitError
import os
import logging
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
# Module logger
logger = logging.getLogger(__name__)
default_ident = "text-completion"
default_model = 'gpt-3.5-turbo'
@ -52,7 +56,7 @@ class Processor(LlmService):
else:
self.openai = OpenAI(api_key=api_key)
print("Initialised", flush=True)
logger.info("OpenAI LLM service initialized")
async def generate_content(self, system, prompt):
@ -85,9 +89,9 @@ class Processor(LlmService):
inputtokens = resp.usage.prompt_tokens
outputtokens = resp.usage.completion_tokens
print(resp.choices[0].message.content, flush=True)
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
logger.debug(f"LLM response: {resp.choices[0].message.content}")
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
resp = LlmResult(
text = resp.choices[0].message.content,
@ -109,7 +113,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {type(e)} {e}")
logger.error(f"OpenAI LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
import os
import aiohttp
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -41,9 +45,8 @@ class Processor(LlmService):
self.session = aiohttp.ClientSession()
print("Using TGI service at", base_url)
print("Initialised", flush=True)
logger.info(f"Using TGI service at {base_url}")
logger.info("TGI LLM service initialized")
async def generate_content(self, system, prompt):
@ -85,9 +88,9 @@ class Processor(LlmService):
inputtokens = resp["usage"]["prompt_tokens"]
outputtokens = resp["usage"]["completion_tokens"]
ans = resp["choices"][0]["message"]["content"]
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
print(ans, flush=True)
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
logger.debug(f"LLM response: {ans}")
resp = LlmResult(
text = ans,
@ -104,7 +107,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {type(e)} {e}")
logger.error(f"TGI LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod


@ -6,6 +6,10 @@ Input is prompt, output is response.
import os
import aiohttp
import logging
# Module logger
logger = logging.getLogger(__name__)
from .... exceptions import TooManyRequests
from .... base import LlmService, LlmResult
@ -45,9 +49,8 @@ class Processor(LlmService):
self.session = aiohttp.ClientSession()
print("Using vLLM service at", base_url)
print("Initialised", flush=True)
logger.info(f"Using vLLM service at {base_url}")
logger.info("vLLM LLM service initialized")
async def generate_content(self, system, prompt):
@ -80,9 +83,9 @@ class Processor(LlmService):
inputtokens = resp["usage"]["prompt_tokens"]
outputtokens = resp["usage"]["completion_tokens"]
ans = resp["choices"][0]["text"]
print(f"Input Tokens: {inputtokens}", flush=True)
print(f"Output Tokens: {outputtokens}", flush=True)
print(ans, flush=True)
logger.info(f"Input Tokens: {inputtokens}")
logger.info(f"Output Tokens: {outputtokens}")
logger.debug(f"LLM response: {ans}")
resp = LlmResult(
text = ans,
@ -99,7 +102,7 @@ class Processor(LlmService):
# Apart from rate limits, treat all exceptions as unrecoverable
print(f"Exception: {type(e)} {e}")
logger.error(f"vLLM LLM exception ({type(e).__name__}): {e}", exc_info=True)
raise e
@staticmethod