---
layout: default
title: "Maelezo ya Muundo wa Data"
parent: "Swahili (Beta)"
---

# Structured Data Descriptor

> **Beta Translation:** This document was translated via Machine Learning and as such may not be 100% accurate. All non-English languages are currently classified as Beta.

## Overview

The Structured Data Descriptor is a JSON-based configuration language that describes how to parse, transform, and import structured data into TrustGraph. It provides a declarative approach to data ingestion, supporting multiple input formats and complex transformation pipelines without requiring custom code.

## Core Concepts

### 1. Format Definition
Describes the input file type and parsing options. Determines which parser to use and how to interpret the source data.

### 2. Field Mappings
Maps source paths to target fields, together with transformations. Defines how data flows from input sources to output schema fields.

### 3. Transform Pipeline
A chain of data transformations that can be applied to field values, including:
- Data cleaning (trim, normalize)
- Format conversion (date parsing, type casting)
- Calculations (arithmetic, string manipulation)
- Lookups (reference tables, substitutions)

### 4. Validation Rules
Data quality checks applied to ensure data integrity:
- Type validation
- Range checks
- Pattern matching (regex)
- Required field validation
- Custom validation logic

### 5. Global Settings
Configuration that applies across the entire import process:
- Lookup tables for data enrichment
- Global variables and constants
- Output format specifications
- Error handling policies

## Implementation Strategy

The importer implementation follows this pipeline:

1. **Parse Configuration** - Load and validate the JSON descriptor
2. **Initialize Parser** - Load the appropriate parser (CSV, XML, JSON, etc.) based on `format.type`
3. **Apply Preprocessing** - Execute global filters and transformations
4. **Process Records** - For each input record:
   - Extract data using source paths (JSONPath, XPath, column names)
   - Apply field-level transforms in sequence
   - Validate results against the defined rules
   - Apply default values for missing data
5. **Apply Postprocessing** - Execute deduplication, aggregation, etc.
6. **Generate Output** - Produce data in the specified target format
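
The stages above can be sketched as a single function. This is a minimal illustration rather than TrustGraph's importer: `run_import` and the small `TRANSFORMS` table are hypothetical names, only CSV input is handled, and preprocessing, validation, and postprocessing (stages 3 and 5) are left out.

```python
import csv
import io
import json

# Hypothetical transform table; a real importer would support many more types.
TRANSFORMS = {
    "trim": lambda v, spec: v.strip() if isinstance(v, str) else v,
    "lower": lambda v, spec: v.lower() if isinstance(v, str) else v,
    "to_int": lambda v, spec: int(v) if v not in (None, "") else None,
    "default": lambda v, spec: spec["value"] if v in (None, "") else v,
}


def run_import(descriptor_json: str, raw_text: str) -> list[dict]:
    """Parse the descriptor, read CSV records, apply mappings, return objects."""
    descriptor = json.loads(descriptor_json)           # 1. parse configuration
    if descriptor["format"]["type"] != "csv":          # 2. initialize parser
        raise ValueError("this sketch only handles csv")
    options = descriptor["format"].get("options", {})
    reader = csv.DictReader(io.StringIO(raw_text),
                            delimiter=options.get("delimiter", ","))
    output = []
    for record in reader:                              # 4. process records
        obj = {}
        for mapping in descriptor["mappings"]:
            value = record.get(mapping["source"])
            for spec in mapping.get("transforms", []):
                value = TRANSFORMS[spec["type"]](value, spec)
            obj[mapping["target_field"]] = value
        output.append(obj)
    return output                                      # 6. generate output
```

Keeping each transform as a small function keyed by its `type` means a new transform can be added without touching the record loop.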

## Path Expression Support

Different input formats use the path expression language that suits them:

- **CSV**: Column names or indices (`"column_name"` or `"[2]"`)
- **JSON**: JSONPath syntax (`"$.user.profile.email"`)
- **XML**: XPath expressions (`"//product[@id='123']/price"`)
- **Fixed-width**: Field names from the field definitions
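
As a rough sketch of how such source paths might be resolved, the helpers below handle the CSV forms and a small dotted subset of JSONPath. The function names are hypothetical, CSV indices are assumed to be zero-based (the specification does not say), and full JSONPath or XPath support would need a dedicated library.

```python
import re

_INDEX = re.compile(r"^\[(\d+)\]$")


def resolve_csv(row: list[str], header: list[str], path: str) -> str:
    """Resolve a column name or a "[n]" index against one CSV row.

    Assumes indices are zero-based.
    """
    match = _INDEX.match(path)
    if match:
        return row[int(match.group(1))]
    return row[header.index(path)]


def resolve_json(document: dict, path: str):
    """Resolve a dotted JSONPath such as "$.user.profile.email".

    Only the "$.a.b.c" subset is handled; filters, wildcards and array
    slices are out of scope for this sketch.
    """
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = document
    for key in path[2:].split("."):
        value = value[key]
    return value
```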

## Benefits

- **Single Codebase** - One importer handles multiple input formats
- **User-Friendly** - Non-technical users can create configurations
- **Reusable** - Configurations can be shared and versioned
- **Flexible** - Complex transformations without custom code
- **Robust** - Built-in validation and comprehensive error handling
- **Maintainable** - The declarative approach reduces implementation complexity

## Language Specification

The Structured Data Descriptor uses a JSON configuration format with the following top-level structure:

```json
{
  "version": "1.0",
  "metadata": {
    "name": "Configuration Name",
    "description": "Description of what this config does",
    "author": "Author Name",
    "created": "2024-01-01T00:00:00Z"
  },
  "format": { ... },
  "globals": { ... },
  "preprocessing": [ ... ],
  "mappings": [ ... ],
  "postprocessing": [ ... ],
  "output": { ... }
}
```
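
A loader could begin with a quick check of the top-level keys before doing anything else. The sketch below is illustrative only: `check_top_level` is a hypothetical helper, and the choice of which keys are required is an assumption, since the specification does not mark any section as mandatory.

```python
KNOWN_KEYS = {
    "version", "metadata", "format", "globals",
    "preprocessing", "mappings", "postprocessing", "output",
}
# Assumption: the minimum a usable descriptor needs.
REQUIRED_KEYS = {"version", "format", "mappings", "output"}


def check_top_level(descriptor: dict) -> list[str]:
    """Return a list of problems found in the descriptor's top-level keys."""
    present = set(descriptor)
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - present)]
    problems += [f"unknown key: {key}" for key in sorted(present - KNOWN_KEYS)]
    return problems
```

Reporting unknown keys catches typos such as `mapping` for `mappings`, which would otherwise be ignored silently.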

### Format Definition

Describes the input data format and parsing options:

```json
{
  "format": {
    "type": "csv|json|xml|fixed-width|excel|parquet",
    "encoding": "utf-8",
    "options": {
      // Format-specific options
    }
  }
}
```

#### CSV Format Options
```json
{
  "format": {
    "type": "csv",
    "options": {
      "delimiter": ",",
      "quote_char": "\"",
      "escape_char": "\\",
      "skip_rows": 1,
      "has_header": true,
      "null_values": ["", "NULL", "null", "N/A"]
    }
  }
}
```
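
One possible way to honor these options with Python's standard `csv` module is sketched below. `read_csv` is a hypothetical helper. It assumes that `skip_rows` counts lines dropped before the header row; the specification leaves the interaction between `skip_rows` and `has_header` open.

```python
import csv
import io


def read_csv(text: str, options: dict) -> list[dict]:
    """Read CSV text into dicts, honoring the options shown above."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=options.get("delimiter", ","),
        quotechar=options.get("quote_char", '"'),
        escapechar=options.get("escape_char"),
    )
    # Assumption: skip_rows counts lines dropped before the header row.
    rows = list(reader)[options.get("skip_rows", 0):]
    if not rows:
        return []
    if options.get("has_header", True):
        header, rows = rows[0], rows[1:]
    else:
        # Without a header, name the columns by their "[n]" index path.
        header = [f"[{i}]" for i in range(len(rows[0]))]
    nulls = set(options.get("null_values", []))
    return [
        {name: (None if cell in nulls else cell) for name, cell in zip(header, row)}
        for row in rows
    ]
```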

#### JSON Format Options
```json
{
  "format": {
    "type": "json",
    "options": {
      "root_path": "$.data",
      "array_mode": "records|single",
      "flatten": false
    }
  }
}
```
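
The sketch below shows one reading of these options. It is an interpretation, not the reference behavior: `read_json` and `flatten_record` are hypothetical names, `"records"` is taken to mean that the root is an array of records and `"single"` that the root is itself one record, and `flatten` is assumed to produce dotted keys.

```python
import json


def flatten_record(record: dict, prefix: str = "") -> dict:
    """Collapse nested dicts into dotted keys: {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_record(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def read_json(text: str, options: dict) -> list[dict]:
    """Return the records selected by root_path, array_mode and flatten."""
    node = json.loads(text)
    root_path = options.get("root_path", "$")
    # Only "$" and dotted "$.a.b" root paths are handled in this sketch.
    for key in filter(None, root_path.lstrip("$").split(".")):
        node = node[key]
    # Assumption: "records" = root is a list of records, "single" = one record.
    records = node if options.get("array_mode", "records") == "records" else [node]
    if options.get("flatten", False):
        records = [flatten_record(record) for record in records]
    return records
```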

#### XML Format Options
```json
{
  "format": {
    "type": "xml",
    "options": {
      "root_element": "//records/record",
      "namespaces": {
        "ns": "http://example.com/namespace"
      }
    }
  }
}
```
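
With Python's standard library, record selection could look roughly like the sketch below. `read_xml` is a hypothetical helper. `xml.etree.ElementTree` supports only a subset of XPath and rejects absolute paths on an element, so the sketch rewrites `//a/b` to `.//a/b`; a full XPath engine would need a third-party library.

```python
import xml.etree.ElementTree as ET


def read_xml(text: str, options: dict) -> list[dict]:
    """Select record elements and turn each one into a flat dict."""
    root = ET.fromstring(text)
    path = options["root_element"]
    # ElementTree rejects absolute paths on an element, so "//a/b" is
    # rewritten to the relative form ".//a/b".
    if path.startswith("/"):
        path = "." + path
    namespaces = options.get("namespaces", {})
    records = []
    for element in root.findall(path, namespaces):
        # Child tag -> text; namespace URIs are stripped from the keys.
        records.append({child.tag.split("}")[-1]: child.text for child in element})
    return records
```

Because `.//` searches descendants only, this rewrite does not match when the first path step names the document's own root element.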

### Global Settings

Define lookup tables, variables, and global configuration:

```json
{
  "globals": {
    "variables": {
      "current_date": "2024-01-01",
      "batch_id": "BATCH_001",
      "default_confidence": 0.8
    },
    "lookup_tables": {
      "country_codes": {
        "US": "United States",
        "UK": "United Kingdom",
        "CA": "Canada"
      },
      "status_mapping": {
        "1": "active",
        "0": "inactive"
      }
    },
    "constants": {
      "source_system": "legacy_crm",
      "import_type": "full"
    }
  }
}
```

### Field Mappings

Define how source data maps to target fields, including transformations:

```json
{
  "mappings": [
    {
      "target_field": "person_name",
      "source": "$.name",
      "transforms": [
        {"type": "trim"},
        {"type": "title_case"}
      ],
      "validation": [
        {"type": "required"},
        {"type": "min_length", "value": 2},
        {"type": "max_length", "value": 100},
        {"type": "pattern", "value": "^[A-Za-z\\s]+$"}
      ]
    },
    {
      "target_field": "age",
      "source": "$.age",
      "transforms": [
        {"type": "to_int"},
        {"type": "default", "value": 0}
      ],
      "validation": [
        {"type": "range", "min": 0, "max": 150}
      ]
    },
    {
      "target_field": "country",
      "source": "$.country_code",
      "transforms": [
        {"type": "lookup", "table": "country_codes"},
        {"type": "default", "value": "Unknown"}
      ]
    }
  ]
}
```

### Transform Types

Available transformation functions:

#### String Transforms
```json
{"type": "trim"},
{"type": "upper"},
{"type": "lower"},
{"type": "title_case"},
{"type": "replace", "pattern": "old", "replacement": "new"},
{"type": "regex_replace", "pattern": "\\d+", "replacement": "XXX"},
{"type": "substring", "start": 0, "end": 10},
{"type": "pad_left", "length": 10, "char": "0"}
```
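
The string transforms map closely onto Python's built-in string methods. The sketch below is one possible reading: `apply_string_transform` is a hypothetical name, `replace` is treated as a literal substitution, and `substring` is assumed to be zero-based with an exclusive `end`.

```python
import re


def apply_string_transform(value: str, spec: dict) -> str:
    """Apply one string transform spec to a value."""
    kind = spec["type"]
    if kind == "trim":
        return value.strip()
    if kind == "upper":
        return value.upper()
    if kind == "lower":
        return value.lower()
    if kind == "title_case":
        return value.title()
    if kind == "replace":
        # Assumption: literal, non-regex replacement.
        return value.replace(spec["pattern"], spec["replacement"])
    if kind == "regex_replace":
        return re.sub(spec["pattern"], spec["replacement"], value)
    if kind == "substring":
        # Assumption: zero-based start, exclusive end.
        return value[spec["start"]:spec["end"]]
    if kind == "pad_left":
        return value.rjust(spec["length"], spec["char"])
    raise ValueError(f"unknown string transform: {kind}")
```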

#### Type Conversions
```json
{"type": "to_string"},
{"type": "to_int"},
{"type": "to_float"},
{"type": "to_bool"},
{"type": "to_date", "format": "YYYY-MM-DD"},
{"type": "parse_json"}
```
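
A sketch of these conversions follows. The names are hypothetical. Two details are assumptions: the `YYYY-MM-DD` tokens are translated to `strptime` directives through a small table covering only year, month, and day, and `to_bool` accepts a fixed set of spellings.

```python
import json
from datetime import date, datetime

# Assumed token mapping from "YYYY-MM-DD" style formats to strptime directives.
_DATE_TOKENS = [("YYYY", "%Y"), ("MM", "%m"), ("DD", "%d")]
_TRUE = {"true", "1", "yes", "y"}
_FALSE = {"false", "0", "no", "n"}


def to_bool(value) -> bool:
    """Interpret common textual spellings of true and false."""
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def to_date(value: str, fmt: str) -> date:
    """Parse a date using a "YYYY-MM-DD" style format string."""
    for token, directive in _DATE_TOKENS:
        fmt = fmt.replace(token, directive)
    return datetime.strptime(value, fmt).date()


def convert(value, spec: dict):
    """Apply one type conversion spec to a value."""
    kind = spec["type"]
    if kind == "to_string":
        return str(value)
    if kind == "to_int":
        return int(value)
    if kind == "to_float":
        return float(value)
    if kind == "to_bool":
        return to_bool(value)
    if kind == "to_date":
        return to_date(value, spec["format"])
    if kind == "parse_json":
        return json.loads(value)
    raise ValueError(f"unknown conversion: {kind}")
```

`to_bool` is written out explicitly because Python's `bool("0")` is `True`, which is rarely what an import wants.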

#### Data Operations
```json
{"type": "default", "value": "default_value"},
{"type": "lookup", "table": "table_name"},
{"type": "concat", "values": ["field1", " - ", "field2"]},
{"type": "calculate", "expression": "${field1} + ${field2}"},
{"type": "conditional", "condition": "${age} > 18", "true_value": "adult", "false_value": "minor"}
```
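
The `${field}` expressions need an evaluator. The sketch below uses Python's `ast` module to evaluate arithmetic and a single comparison without calling `eval`. It is an assumption-laden illustration: `evaluate` and `apply_operation` are hypothetical names, boolean operators such as `&&` are not supported, and `concat` is assumed to substitute entries that name a record field and keep the rest as literals.

```python
import ast
import operator
import re

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def evaluate(expression: str, record: dict):
    """Evaluate arithmetic or one comparison over ${field} references."""
    # "${age} > 18" becomes "age > 18", which parses as a Python expression.
    tree = ast.parse(re.sub(r"\$\{(\w+)\}", r"\1", expression), mode="eval")

    def walk(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return record[node.id]
        if isinstance(node, ast.BinOp):
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            return _OPS[type(node.ops[0])](walk(node.left), walk(node.comparators[0]))
        raise ValueError(f"unsupported expression: {expression}")

    return walk(tree.body)


def apply_operation(value, spec: dict, record: dict, lookup_tables: dict):
    """Apply one data operation spec to a value."""
    kind = spec["type"]
    if kind == "default":
        return spec["value"] if value in (None, "") else value
    if kind == "lookup":
        # Unknown keys yield None so that a following "default" can fill in.
        return lookup_tables[spec["table"]].get(value)
    if kind == "concat":
        # Assumption: entries naming a record field are substituted.
        return "".join(str(record.get(part, part)) for part in spec["values"])
    if kind == "calculate":
        return evaluate(spec["expression"], record)
    if kind == "conditional":
        branch = "true_value" if evaluate(spec["condition"], record) else "false_value"
        return spec[branch]
    raise ValueError(f"unknown operation: {kind}")
```

Returning `None` from a failed lookup is what lets a `lookup` followed by a `default` map an unknown code to a fallback value.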

### Validation Rules

Data quality checks with configurable error handling:

#### Basic Validations
```json
{"type": "required"},
{"type": "not_null"},
{"type": "min_length", "value": 5},
{"type": "max_length", "value": 100},
{"type": "range", "min": 0, "max": 1000},
{"type": "pattern", "value": "^[A-Z]{2,3}$"},
{"type": "in_list", "values": ["active", "inactive", "pending"]}
```
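
These rules could be checked with a small dispatcher like the one below. It is a sketch under stated assumptions: `validate` and `failed_rules` are hypothetical names, `required` is taken to reject both `None` and the empty string while `not_null` rejects only `None`, `range` is inclusive at both ends, and `pattern` must match the whole value.

```python
import re


def validate(value, rule: dict) -> bool:
    """Return True when value satisfies a single validation rule."""
    kind = rule["type"]
    if kind == "required":
        return value not in (None, "")
    if kind == "not_null":
        return value is not None
    if kind == "min_length":
        return len(value) >= rule["value"]
    if kind == "max_length":
        return len(value) <= rule["value"]
    if kind == "range":
        return rule["min"] <= value <= rule["max"]
    if kind == "pattern":
        return re.fullmatch(rule["value"], value) is not None
    if kind == "in_list":
        return value in rule["values"]
    raise ValueError(f"unknown validation: {kind}")


def failed_rules(value, rules: list[dict]) -> list[str]:
    """Return the types of every rule the value fails, in rule order."""
    return [rule["type"] for rule in rules if not validate(value, rule)]
```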

#### Custom Validations
```json
{
  "type": "custom",
  "expression": "${age} >= 18 && ${country} == 'US'",
  "message": "Must be 18+ and in US"
},
{
  "type": "cross_field",
  "fields": ["start_date", "end_date"],
  "expression": "${start_date} < ${end_date}",
  "message": "Start date must be before end date"
}
```

### Preprocessing and Postprocessing

Global operations applied before and after field mapping:

```json
{
  "preprocessing": [
    {
      "type": "filter",
      "condition": "${status} != 'deleted'"
    },
    {
      "type": "sort",
      "field": "created_date",
      "order": "asc"
    }
  ],
  "postprocessing": [
    {
      "type": "deduplicate",
      "key_fields": ["email", "phone"]
    },
    {
      "type": "aggregate",
      "group_by": ["country"],
      "functions": {
        "total_count": {"type": "count"},
        "avg_age": {"type": "avg", "field": "age"}
      }
    }
  ]
}
```
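
The `sort`, `deduplicate`, and `aggregate` steps could be implemented along these lines. The function names are hypothetical. The sketch assumes that `key_fields` form one composite key, that the first duplicate wins, and that only the `count` and `avg` functions shown above exist.

```python
from collections import defaultdict


def sort_records(records: list[dict], field: str, order: str = "asc") -> list[dict]:
    """Sort records by one field, ascending unless order is "desc"."""
    return sorted(records, key=lambda r: r[field], reverse=(order == "desc"))


def deduplicate(records: list[dict], key_fields: list[str]) -> list[dict]:
    """Keep the first record seen for each combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def aggregate(records: list[dict], group_by: list[str], functions: dict) -> list[dict]:
    """Group records and compute the named count/avg functions per group."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[field] for field in group_by)].append(record)
    results = []
    for key, members in groups.items():
        row = dict(zip(group_by, key))
        for name, spec in functions.items():
            if spec["type"] == "count":
                row[name] = len(members)
            elif spec["type"] == "avg":
                row[name] = sum(m[spec["field"]] for m in members) / len(members)
            else:
                raise ValueError(f"unknown aggregate: {spec['type']}")
        results.append(row)
    return results
```

Note that `aggregate` replaces the record stream with one row per group, so it would normally be the last postprocessing step.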

### Output Configuration

Define how processed data should be output:

```json
{
  "output": {
    "format": "trustgraph-objects",
    "schema_name": "person",
    "options": {
      "batch_size": 1000,
      "confidence": 0.9,
      "source_span_field": "raw_text",
      "metadata": {
        "source": "crm_import",
        "version": "1.0"
      }
    },
    "error_handling": {
      "on_validation_error": "skip|fail|log",
      "on_transform_error": "skip|fail|default",
      "max_errors": 100,
      "error_output": "errors.json"
    }
  }
}
```
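
The `error_handling` policies could be enforced as in the sketch below. The semantics are an assumption, since the specification only names the options: here `skip` and `log` both drop the failing record (with `log` also emitting a warning), `fail` aborts on the first error, and exceeding `max_errors` aborts the import. `apply_error_policy` and `ImportAborted` are hypothetical names.

```python
import logging

logger = logging.getLogger("importer")


class ImportAborted(Exception):
    """Raised when the error policy says the import must stop."""


def apply_error_policy(records, check, error_handling):
    """Split records into accepted ones and collected errors.

    check(record) returns an error message, or None when the record is valid.
    """
    policy = error_handling.get("on_validation_error", "fail")
    max_errors = error_handling.get("max_errors")
    accepted, errors = [], []
    for record in records:
        message = check(record)
        if message is None:
            accepted.append(record)
            continue
        errors.append({"record": record, "error": message})
        if policy == "fail":
            raise ImportAborted(message)
        if policy == "log":
            logger.warning("validation failed: %s", message)
        if max_errors is not None and len(errors) > max_errors:
            raise ImportAborted(f"more than {max_errors} errors")
    return accepted, errors
```

The returned `errors` list is the kind of content that could be written to the file named by `error_output`.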

## Complete Example

```json
{
  "version": "1.0",
  "metadata": {
    "name": "Customer Import from CRM CSV",
    "description": "Imports customer data from legacy CRM system",
    "author": "Data Team",
    "created": "2024-01-01T00:00:00Z"
  },
  "format": {
    "type": "csv",
    "encoding": "utf-8",
    "options": {
      "delimiter": ",",
      "has_header": true,
      "skip_rows": 1
    }
  },
  "globals": {
    "variables": {
      "import_date": "2024-01-01",
      "default_confidence": 0.85
    },
    "lookup_tables": {
      "country_codes": {
        "US": "United States",
        "CA": "Canada",
        "UK": "United Kingdom"
      }
    }
  },
  "preprocessing": [
    {
      "type": "filter",
      "condition": "${status} == 'active'"
    }
  ],
  "mappings": [
    {
      "target_field": "full_name",
      "source": "customer_name",
      "transforms": [
        {"type": "trim"},
        {"type": "title_case"}
      ],
      "validation": [
        {"type": "required"},
        {"type": "min_length", "value": 2}
      ]
    },
    {
      "target_field": "email",
      "source": "email_address",
      "transforms": [
        {"type": "trim"},
        {"type": "lower"}
      ],
      "validation": [
        {"type": "pattern", "value": "^[\\w.-]+@[\\w.-]+\\.[a-zA-Z]{2,}$"}
      ]
    },
    {
      "target_field": "age",
      "source": "age",
      "transforms": [
        {"type": "to_int"},
        {"type": "default", "value": 0}
      ],
      "validation": [
        {"type": "range", "min": 0, "max": 120}
      ]
    },
    {
      "target_field": "country",
      "source": "country_code",
      "transforms": [
        {"type": "lookup", "table": "country_codes"},
        {"type": "default", "value": "Unknown"}
      ]
    }
  ],
  "output": {
    "format": "trustgraph-objects",
    "schema_name": "customer",
    "options": {
      "confidence": "${default_confidence}",
      "batch_size": 500
    },
    "error_handling": {
      "on_validation_error": "log",
      "max_errors": 50
    }
  }
}
```

## LLM Prompt for Descriptor Generation

The following prompt can be used to have an LLM analyze sample data and generate a descriptor configuration:

```
I need you to analyze the provided data sample and create a Structured Data Descriptor configuration in JSON format.

The descriptor should follow this specification:
- version: "1.0"
- metadata: Configuration name, description, author, and creation date
- format: Input format type and parsing options
- globals: Variables, lookup tables, and constants
- preprocessing: Filters and transformations applied before mapping
- mappings: Field-by-field mapping from source to target with transformations and validations
- postprocessing: Operations like deduplication or aggregation
- output: Target format and error handling configuration

ANALYZE THE DATA:
1. Identify the format (CSV, JSON, XML, etc.)
2. Detect delimiters, encodings, and structure
3. Find data types for each field
4. Identify patterns and constraints
5. Look for fields that need cleaning or transformation
6. Find relationships between fields
7. Identify lookup opportunities (codes that map to values)
8. Detect required vs optional fields

CREATE THE DESCRIPTOR:
For each field in the sample data:
- Map it to an appropriate target field name
- Add necessary transformations (trim, case conversion, type casting)
- Include appropriate validations (required, patterns, ranges)
- Set defaults for missing values

Include preprocessing if needed:
- Filters to exclude invalid records
- Sorting requirements

Include postprocessing if beneficial:
- Deduplication on key fields
- Aggregation for summary data

Configure output for TrustGraph:
- format: "trustgraph-objects"
- schema_name: Based on the data entity type
- Appropriate error handling

DATA SAMPLE:
[Insert data sample here]

ADDITIONAL CONTEXT (optional):
- Target schema name: [if known]
- Business rules: [any specific requirements]
- Data quality issues to address: [known problems]

Generate a complete, valid Structured Data Descriptor configuration that will properly import this data into TrustGraph. Include comments explaining key decisions.
```

### Usage Example

````
I need you to analyze the provided data sample and create a Structured Data Descriptor configuration in JSON format.

[Standard instructions from above...]

DATA SAMPLE:
```csv
Customer ID,Name,Email,Age,Country,Status,Join Date,Total Purchases
1001,"Smith, John",john.smith@email.com,35,US,1,2023-01-15,5420.50
1002,"doe, jane",JANE.DOE@GMAIL.COM,28,CA,1,2023-03-22,3200.00
1003,"Bob Johnson",bob@,62,UK,0,2022-11-01,0
1004,"Alice Chen","alice.chen@company.org",41,US,1,2023-06-10,8900.25
1005,,invalid-email,25,XX,1,2024-01-01,100
```

ADDITIONAL CONTEXT:
- Target schema name: customer
- Business rules: Email should be valid and lowercase, names should be title case
- Data quality issues: Some emails are invalid, some names are missing, country codes need mapping
````

### Prompt for Analyzing Existing Data Without a Sample

```
I need you to help me create a Structured Data Descriptor configuration for importing [data type] data.

The source data has these characteristics:
- Format: [CSV/JSON/XML/etc]
- Fields: [list the fields]
- Data quality issues: [describe any known issues]
- Volume: [approximate number of records]

Requirements:
- [List any specific transformation needs]
- [List any validation requirements]
- [List any business rules]

Please generate a Structured Data Descriptor configuration that will:
1. Parse the input format correctly
2. Clean and standardize the data
3. Validate according to the requirements
4. Handle errors gracefully
5. Output in TrustGraph ExtractedObject format

Focus on making the configuration robust and reusable.
```