trustgraph/docs/cli/tg-start-library-processing.md

# tg-start-library-processing

Submits a library document for processing through TrustGraph workflows.

## Synopsis

```bash
tg-start-library-processing -d DOCUMENT_ID --id PROCESSING_ID [options]
```

## Description

The `tg-start-library-processing` command initiates processing of a document stored in TrustGraph's document library. This triggers workflows that can extract text, generate embeddings, create knowledge graphs, and enable document search and analysis.

Each processing job is assigned a unique processing ID for tracking and management purposes.

## Options

### Required Arguments

- `-d, --document-id ID`: Document ID from the library to process
- `--id, --processing-id ID`: Unique identifier for this processing job

### Optional Arguments

- `-u, --api-url URL`: TrustGraph API URL (default: `$TRUSTGRAPH_URL` or `http://localhost:8088/`)
- `-U, --user USER`: User ID for processing context (default: `trustgraph`)
- `-i, --flow-id ID`: Flow instance to use for processing (default: `default`)
- `--collection COLLECTION`: Collection to assign processed data (default: `default`)
- `--tags TAGS`: Comma-separated tags for the processing job

## Examples

### Basic Document Processing
```bash
tg-start-library-processing -d "doc_123456789" --id "proc_001"
```

### Processing with Custom Collection
```bash
tg-start-library-processing \
  -d "research_paper_456" \
  --id "research_proc_001" \
  --collection "research-papers" \
  --tags "nlp,research,2023"
```

### Processing with Specific Flow
```bash
tg-start-library-processing \
  -d "technical_manual" \
  --id "manual_proc_001" \
  -i "document-analysis-flow" \
  -U "technical-team" \
  --collection "technical-docs"
```

### Processing Multiple Documents
```bash
# Process several documents in sequence
documents=("doc_001" "doc_002" "doc_003")
for i in "${!documents[@]}"; do
  doc_id="${documents[$i]}"
  proc_id="batch_proc_$(printf %03d $((i+1)))"
  
  echo "Processing document: $doc_id"
  tg-start-library-processing \
    -d "$doc_id" \
    --id "$proc_id" \
    --collection "batch-processing" \
    --tags "batch,automated"
done
```

## Processing Workflow

### Document Processing Steps
1. **Document Retrieval**: Fetch document from library
2. **Content Extraction**: Extract text and metadata
3. **Text Processing**: Clean and normalize content
4. **Embedding Generation**: Create vector embeddings
5. **Knowledge Extraction**: Generate triples and entities
6. **Index Creation**: Make content searchable

### Processing Types
Different document types may trigger different processing workflows:
- **PDF Documents**: Text extraction, OCR if needed
- **Text Files**: Direct text processing
- **Images**: OCR and image analysis
- **Structured Data**: Schema extraction and mapping

## Use Cases

### Batch Document Processing
```bash
# Process all unprocessed documents
process_all_documents() {
  local collection="$1"
  local batch_id="batch_$(date +%Y%m%d_%H%M%S)"
  
  echo "Starting batch processing for collection: $collection"
  
  # Get all document IDs
  tg-show-library-documents | \
    grep "| id" | \
    awk '{print $3}' | \
    while read -r doc_id; do
      proc_id="${batch_id}_${doc_id}"
      
      echo "Processing document: $doc_id"
      tg-start-library-processing \
        -d "$doc_id" \
        --id "$proc_id" \
        --collection "$collection" \
        --tags "batch,automated,$(date +%Y%m%d)"
      
      # Add delay to avoid overwhelming the system
      sleep 2
    done
}

# Process all documents
process_all_documents "processed-docs"
```

### Department-Specific Processing
```bash
# Process documents by department
process_by_department() {
  local dept="$1"
  local flow="$2"
  
  echo "Processing documents for department: $dept"
  
  # Find documents with department tag
  tg-show-library-documents -U "$dept" | \
    grep "| id" | \
    awk '{print $3}' | \
    while read -r doc_id; do
      proc_id="${dept}_proc_$(date +%s)_${doc_id}"
      
      echo "Processing $dept document: $doc_id"
      tg-start-library-processing \
        -d "$doc_id" \
        --id "$proc_id" \
        -i "$flow" \
        -U "$dept" \
        --collection "${dept}-processed" \
        --tags "$dept,departmental"
    done
}

# Process documents for different departments
process_by_department "research" "research-flow"
process_by_department "finance" "document-flow"
process_by_department "legal" "compliance-flow"
```

### Priority Processing
```bash
# Process high-priority documents first
priority_processing() {
  local priority_tags=("urgent" "high-priority" "critical")
  
  for tag in "${priority_tags[@]}"; do
    echo "Processing $tag documents..."
    
    tg-show-library-documents | \
      grep -B5 -A5 "$tag" | \
      grep "| id" | \
      awk '{print $3}' | \
      while read -r doc_id; do
        proc_id="priority_$(date +%s)_${doc_id}"
        
        echo "Processing priority document: $doc_id"
        tg-start-library-processing \
          -d "$doc_id" \
          --id "$proc_id" \
          --collection "priority-processed" \
          --tags "priority,$tag"
      done
  done
}

priority_processing
```

### Conditional Processing
```bash
# Process documents based on criteria
conditional_processing() {
  local criteria="$1"
  local flow="$2"
  
  echo "Processing documents matching criteria: $criteria"
  
  tg-show-library-documents | \
    grep -B10 -A10 "$criteria" | \
    grep "| id" | \
    awk '{print $3}' | \
    while read -r doc_id; do
      # Check if already processed
      if tg-invoke-document-rag -q "test" 2>/dev/null | grep -q "$doc_id"; then
        echo "Document $doc_id already processed, skipping"
        continue
      fi
      
      proc_id="conditional_$(date +%s)_${doc_id}"
      
      echo "Processing document: $doc_id"
      tg-start-library-processing \
        -d "$doc_id" \
        --id "$proc_id" \
        -i "$flow" \
        --collection "conditional-processed" \
        --tags "conditional,$criteria"
    done
}

# Process technical documents
conditional_processing "technical" "technical-flow"
```

## Advanced Usage

### Processing with Validation
```bash
# Process with pre and post validation
validated_processing() {
  local doc_id="$1"
  local proc_id="$2"
  local collection="$3"
  
  echo "Starting validated processing for: $doc_id"
  
  # Pre-processing validation
  if ! tg-show-library-documents | grep -q "$doc_id"; then
    echo "ERROR: Document $doc_id not found"
    return 1
  fi
  
  # Check if processing ID is unique
  if tg-show-flows | grep -q "$proc_id"; then
    echo "ERROR: Processing ID $proc_id already in use"
    return 1
  fi
  
  # Start processing
  echo "Starting processing..."
  tg-start-library-processing \
    -d "$doc_id" \
    --id "$proc_id" \
    --collection "$collection" \
    --tags "validated,$(date +%Y%m%d)"
  
  # Monitor processing
  echo "Monitoring processing progress..."
  timeout=300  # 5 minutes
  elapsed=0
  interval=10
  
  while [ $elapsed -lt $timeout ]; do
    if tg-invoke-document-rag -q "test" -C "$collection" 2>/dev/null | grep -q "$doc_id"; then
      echo "✓ Processing completed successfully"
      return 0
    fi
    
    echo "Processing in progress... (${elapsed}s elapsed)"
    sleep $interval
    elapsed=$((elapsed + interval))
  done
  
  echo "⚠ Processing timeout reached"
  return 1
}

# Usage
validated_processing "doc_123" "validated_proc_001" "validated-docs"
```

### Parallel Processing with Limits
```bash
# Process multiple documents in parallel with concurrency limits
parallel_processing() {
  local doc_list=("$@")
  local max_concurrent=5
  local current_jobs=0
  
  echo "Processing ${#doc_list[@]} documents with max $max_concurrent concurrent jobs"
  
  for doc_id in "${doc_list[@]}"; do
    # Wait if max concurrent jobs reached
    while [ $current_jobs -ge $max_concurrent ]; do
      wait -n  # Wait for any job to complete
      current_jobs=$((current_jobs - 1))
    done
    
    # Start processing in background
    (
      proc_id="parallel_$(date +%s)_${doc_id}"
      echo "Starting processing: $doc_id"
      
      tg-start-library-processing \
        -d "$doc_id" \
        --id "$proc_id" \
        --collection "parallel-processed" \
        --tags "parallel,batch"
      
      echo "Completed processing: $doc_id"
    ) &
    
    current_jobs=$((current_jobs + 1))
  done
  
  # Wait for all remaining jobs
  wait
  echo "All processing jobs completed"
}

# Get document list and process in parallel
doc_list=($(tg-show-library-documents | grep "| id" | awk '{print $3}'))
parallel_processing "${doc_list[@]}"
```

### Processing with Retry Logic
```bash
# Process with automatic retry on failure
processing_with_retry() {
  local doc_id="$1"
  local proc_id="$2"
  local max_retries=3
  local retry_delay=30
  
  for attempt in $(seq 1 $max_retries); do
    echo "Processing attempt $attempt/$max_retries for document: $doc_id"
    
    if tg-start-library-processing \
        -d "$doc_id" \
        --id "${proc_id}_attempt_${attempt}" \
        --collection "retry-processed" \
        --tags "retry,attempt_$attempt"; then
      
      # Wait and check if processing succeeded
      sleep $retry_delay
      
      if tg-invoke-document-rag -q "test" 2>/dev/null | grep -q "$doc_id"; then
        echo "✓ Processing succeeded on attempt $attempt"
        return 0
      else
        echo "Processing started but content not yet accessible"
      fi
    else
      echo "✗ Processing failed on attempt $attempt"
    fi
    
    if [ $attempt -lt $max_retries ]; then
      echo "Retrying in ${retry_delay}s..."
      sleep $retry_delay
    fi
  done
  
  echo "✗ Processing failed after $max_retries attempts"
  return 1
}

# Usage
processing_with_retry "doc_123" "retry_proc_001"
```

### Configuration-Driven Processing
```bash
# Process documents based on configuration file
config_driven_processing() {
  local config_file="$1"
  
  if [ ! -f "$config_file" ]; then
    echo "Configuration file not found: $config_file"
    return 1
  fi
  
  echo "Processing documents based on configuration: $config_file"
  
  # Example configuration format:
  # doc_id,flow_id,collection,tags
  # doc_123,research-flow,research-docs,nlp research
  
  while IFS=',' read -r doc_id flow_id collection tags; do
    # Skip header line
    if [ "$doc_id" = "doc_id" ]; then
      continue
    fi
    
    proc_id="config_$(date +%s)_${doc_id}"
    
    echo "Processing: $doc_id -> $collection (flow: $flow_id)"
    
    tg-start-library-processing \
      -d "$doc_id" \
      --id "$proc_id" \
      -i "$flow_id" \
      --collection "$collection" \
      --tags "$tags"
    
  done < "$config_file"
}

# Create example configuration
cat > processing_config.csv << EOF
doc_id,flow_id,collection,tags
doc_123,research-flow,research-docs,nlp research
doc_456,finance-flow,finance-docs,financial quarterly
doc_789,general-flow,general-docs,general processing
EOF

# Process based on configuration
config_driven_processing "processing_config.csv"
```

## Error Handling

### Document Not Found
```bash
Exception: Document not found
```
**Solution**: Verify document exists with `tg-show-library-documents`.

### Processing ID Conflict
```bash
Exception: Processing ID already exists
```
**Solution**: Use a unique processing ID or check existing jobs with `tg-show-flows`.

### Flow Not Found
```bash
Exception: Flow instance not found
```
**Solution**: Verify flow exists with `tg-show-flows` or `tg-show-flow-classes`.

### Insufficient Resources
```bash
Exception: Processing queue full
```
**Solution**: Wait for current jobs to complete or scale processing resources.

## Monitoring and Management

### Processing Status
```bash
# Monitor processing progress
monitor_processing() {
  local proc_id="$1"
  local timeout="${2:-300}"  # 5 minutes default
  
  echo "Monitoring processing: $proc_id"
  
  elapsed=0
  interval=10
  
  while [ $elapsed -lt $timeout ]; do
    # Check if processing is active
    if tg-show-flows | grep -q "$proc_id"; then
      echo "Processing active... (${elapsed}s elapsed)"
    else
      echo "Processing completed or stopped"
      break
    fi
    
    sleep $interval
    elapsed=$((elapsed + interval))
  done
  
  if [ $elapsed -ge $timeout ]; then
    echo "Monitoring timeout reached"
  fi
}

# Monitor specific processing job
monitor_processing "proc_001" 600
```

### Batch Monitoring
```bash
# Monitor multiple processing jobs
monitor_batch() {
  local proc_pattern="$1"
  
  echo "Monitoring batch processing: $proc_pattern"
  
  while true; do
    active_jobs=$(tg-show-flows | grep -c "$proc_pattern" || echo "0")
    
    if [ "$active_jobs" -eq 0 ]; then
      echo "All batch processing jobs completed"
      break
    fi
    
    echo "Active jobs: $active_jobs"
    sleep 30
  done
}

# Monitor batch processing
monitor_batch "batch_proc_"
```

## Environment Variables

- `TRUSTGRAPH_URL`: Default API URL

## Related Commands

- [`tg-show-library-documents`](tg-show-library-documents.md) - List available documents
- [`tg-stop-library-processing`](tg-stop-library-processing.md) - Stop processing jobs
- [`tg-show-flows`](tg-show-flows.md) - Monitor processing flows
- [`tg-invoke-document-rag`](tg-invoke-document-rag.md) - Query processed documents

## API Integration

This command uses the [Library API](../apis/api-librarian.md) to initiate document processing workflows.

## Best Practices

1. **Unique IDs**: Always use unique processing IDs to avoid conflicts
2. **Resource Management**: Monitor system resources during batch processing
3. **Error Handling**: Implement retry logic for robust processing
4. **Monitoring**: Track processing progress and completion
5. **Collection Organization**: Use meaningful collection names
6. **Tagging**: Apply consistent tagging for better organization
7. **Documentation**: Document processing procedures and configurations

## Troubleshooting

### Processing Not Starting
```bash
# Check document exists
tg-show-library-documents | grep "document-id"

# Check flow is available
tg-show-flows | grep "flow-id"

# Check system resources
free -h
df -h
```

### Slow Processing
```bash
# Check processing queue
tg-show-flows | grep processing | wc -l

# Monitor system load
top
htop
```

### Processing Failures
```bash
# Check processing logs
# (Log location depends on TrustGraph configuration)

# Retry with different flow
tg-start-library-processing -d "doc-id" --id "retry-proc" -i "alternative-flow"
```
Update docs for API/CLI changes in 1.0 (#421) * Update some API basics for the 0.23/1.0 API change 2025-07-03 14:58:32 +01:00			`# tg-start-library-processing`

			`Submits a library document for processing through TrustGraph workflows.`

			`## Synopsis`

			```bash
			`tg-start-library-processing -d DOCUMENT_ID --id PROCESSING_ID [options]`
			```

			`## Description`

			The `tg-start-library-processing` command initiates processing of a document stored in TrustGraph's document library. This triggers workflows that can extract text, generate embeddings, create knowledge graphs, and enable document search and analysis.

			`Each processing job is assigned a unique processing ID for tracking and management purposes.`

			`## Options`

			`### Required Arguments`

			- `-d, --document-id ID`: Document ID from the library to process
			- `--id, --processing-id ID`: Unique identifier for this processing job

			`### Optional Arguments`

			- `-u, --api-url URL`: TrustGraph API URL (default: `$TRUSTGRAPH_URL` or `http://localhost:8088/`)
			- `-U, --user USER`: User ID for processing context (default: `trustgraph`)
			- `-i, --flow-id ID`: Flow instance to use for processing (default: `default`)
			- `--collection COLLECTION`: Collection to assign processed data (default: `default`)
			- `--tags TAGS`: Comma-separated tags for the processing job

			`## Examples`

			`### Basic Document Processing`
			```bash
			`tg-start-library-processing -d "doc_123456789" --id "proc_001"`
			```

			`### Processing with Custom Collection`
			```bash
			`tg-start-library-processing \`
			`-d "research_paper_456" \`
			`--id "research_proc_001" \`
			`--collection "research-papers" \`
			`--tags "nlp,research,2023"`
			```

			`### Processing with Specific Flow`
			```bash
			`tg-start-library-processing \`
			`-d "technical_manual" \`
			`--id "manual_proc_001" \`
			`-i "document-analysis-flow" \`
			`-U "technical-team" \`
			`--collection "technical-docs"`
			```

			`### Processing Multiple Documents`
			```bash
			`# Process several documents in sequence`
			`documents=("doc_001" "doc_002" "doc_003")`
			`for i in "${!documents[@]}"; do`
			`doc_id="${documents[$i]}"`
			`proc_id="batch_proc_$(printf %03d $((i+1)))"`

			`echo "Processing document: $doc_id"`
			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`--collection "batch-processing" \`
			`--tags "batch,automated"`
			`done`
			```

			`## Processing Workflow`

			`### Document Processing Steps`
			`1. Document Retrieval: Fetch document from library`
			`2. Content Extraction: Extract text and metadata`
			`3. Text Processing: Clean and normalize content`
			`4. Embedding Generation: Create vector embeddings`
			`5. Knowledge Extraction: Generate triples and entities`
			`6. Index Creation: Make content searchable`

			`### Processing Types`
			`Different document types may trigger different processing workflows:`
			`- PDF Documents: Text extraction, OCR if needed`
			`- Text Files: Direct text processing`
			`- Images: OCR and image analysis`
			`- Structured Data: Schema extraction and mapping`

			`## Use Cases`

			`### Batch Document Processing`
			```bash
			`# Process all unprocessed documents`
			`process_all_documents() {`
			`local collection="$1"`
			`local batch_id="batch_$(date +%Y%m%d_%H%M%S)"`

			`echo "Starting batch processing for collection: $collection"`

			`# Get all document IDs`
			`tg-show-library-documents \| \`
			`grep "\| id" \| \`
			`awk '{print $3}' \| \`
			`while read -r doc_id; do`
			`proc_id="${batch_id}_${doc_id}"`

			`echo "Processing document: $doc_id"`
			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`--collection "$collection" \`
			`--tags "batch,automated,$(date +%Y%m%d)"`

			`# Add delay to avoid overwhelming the system`
			`sleep 2`
			`done`
			`}`

			`# Process all documents`
			`process_all_documents "processed-docs"`
			```

			`### Department-Specific Processing`
			```bash
			`# Process documents by department`
			`process_by_department() {`
			`local dept="$1"`
			`local flow="$2"`

			`echo "Processing documents for department: $dept"`

			`# Find documents with department tag`
			`tg-show-library-documents -U "$dept" \| \`
			`grep "\| id" \| \`
			`awk '{print $3}' \| \`
			`while read -r doc_id; do`
			`proc_id="${dept}_proc_$(date +%s)_${doc_id}"`

			`echo "Processing $dept document: $doc_id"`
			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`-i "$flow" \`
			`-U "$dept" \`
			`--collection "${dept}-processed" \`
			`--tags "$dept,departmental"`
			`done`
			`}`

			`# Process documents for different departments`
			`process_by_department "research" "research-flow"`
			`process_by_department "finance" "document-flow"`
			`process_by_department "legal" "compliance-flow"`
			```

			`### Priority Processing`
			```bash
			`# Process high-priority documents first`
			`priority_processing() {`
			`local priority_tags=("urgent" "high-priority" "critical")`

			`for tag in "${priority_tags[@]}"; do`
			`echo "Processing $tag documents..."`

			`tg-show-library-documents \| \`
			`grep -B5 -A5 "$tag" \| \`
			`grep "\| id" \| \`
			`awk '{print $3}' \| \`
			`while read -r doc_id; do`
			`proc_id="priority_$(date +%s)_${doc_id}"`

			`echo "Processing priority document: $doc_id"`
			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`--collection "priority-processed" \`
			`--tags "priority,$tag"`
			`done`
			`done`
			`}`

			`priority_processing`
			```

			`### Conditional Processing`
			```bash
			`# Process documents based on criteria`
			`conditional_processing() {`
			`local criteria="$1"`
			`local flow="$2"`

			`echo "Processing documents matching criteria: $criteria"`

			`tg-show-library-documents \| \`
			`grep -B10 -A10 "$criteria" \| \`
			`grep "\| id" \| \`
			`awk '{print $3}' \| \`
			`while read -r doc_id; do`
			`# Check if already processed`
			`if tg-invoke-document-rag -q "test" 2>/dev/null \| grep -q "$doc_id"; then`
			`echo "Document $doc_id already processed, skipping"`
			`continue`
			`fi`

			`proc_id="conditional_$(date +%s)_${doc_id}"`

			`echo "Processing document: $doc_id"`
			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`-i "$flow" \`
			`--collection "conditional-processed" \`
			`--tags "conditional,$criteria"`
			`done`
			`}`

			`# Process technical documents`
			`conditional_processing "technical" "technical-flow"`
			```

			`## Advanced Usage`

			`### Processing with Validation`
			```bash
			`# Process with pre and post validation`
			`validated_processing() {`
			`local doc_id="$1"`
			`local proc_id="$2"`
			`local collection="$3"`

			`echo "Starting validated processing for: $doc_id"`

			`# Pre-processing validation`
			`if ! tg-show-library-documents \| grep -q "$doc_id"; then`
			`echo "ERROR: Document $doc_id not found"`
			`return 1`
			`fi`

			`# Check if processing ID is unique`
			`if tg-show-flows \| grep -q "$proc_id"; then`
			`echo "ERROR: Processing ID $proc_id already in use"`
			`return 1`
			`fi`

			`# Start processing`
			`echo "Starting processing..."`
			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`--collection "$collection" \`
			`--tags "validated,$(date +%Y%m%d)"`

			`# Monitor processing`
			`echo "Monitoring processing progress..."`
			`timeout=300 # 5 minutes`
			`elapsed=0`
			`interval=10`

			`while [ $elapsed -lt $timeout ]; do`
			`if tg-invoke-document-rag -q "test" -C "$collection" 2>/dev/null \| grep -q "$doc_id"; then`
			`echo "✓ Processing completed successfully"`
			`return 0`
			`fi`

			`echo "Processing in progress... (${elapsed}s elapsed)"`
			`sleep $interval`
			`elapsed=$((elapsed + interval))`
			`done`

			`echo "⚠ Processing timeout reached"`
			`return 1`
			`}`

			`# Usage`
			`validated_processing "doc_123" "validated_proc_001" "validated-docs"`
			```

			`### Parallel Processing with Limits`
			```bash
			`# Process multiple documents in parallel with concurrency limits`
			`parallel_processing() {`
			`local doc_list=("$@")`
			`local max_concurrent=5`
			`local current_jobs=0`

			`echo "Processing ${#doc_list[@]} documents with max $max_concurrent concurrent jobs"`

			`for doc_id in "${doc_list[@]}"; do`
			`# Wait if max concurrent jobs reached`
			`while [ $current_jobs -ge $max_concurrent ]; do`
			`wait -n # Wait for any job to complete`
			`current_jobs=$((current_jobs - 1))`
			`done`

			`# Start processing in background`
			`(`
			`proc_id="parallel_$(date +%s)_${doc_id}"`
			`echo "Starting processing: $doc_id"`

			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`--collection "parallel-processed" \`
			`--tags "parallel,batch"`

			`echo "Completed processing: $doc_id"`
			`) &`

			`current_jobs=$((current_jobs + 1))`
			`done`

			`# Wait for all remaining jobs`
			`wait`
			`echo "All processing jobs completed"`
			`}`

			`# Get document list and process in parallel`
			`doc_list=($(tg-show-library-documents \| grep "\| id" \| awk '{print $3}'))`
			`parallel_processing "${doc_list[@]}"`
			```

			`### Processing with Retry Logic`
			```bash
			`# Process with automatic retry on failure`
			`processing_with_retry() {`
			`local doc_id="$1"`
			`local proc_id="$2"`
			`local max_retries=3`
			`local retry_delay=30`

			`for attempt in $(seq 1 $max_retries); do`
			`echo "Processing attempt $attempt/$max_retries for document: $doc_id"`

			`if tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "${proc_id}_attempt_${attempt}" \`
			`--collection "retry-processed" \`
			`--tags "retry,attempt_$attempt"; then`

			`# Wait and check if processing succeeded`
			`sleep $retry_delay`

			`if tg-invoke-document-rag -q "test" 2>/dev/null \| grep -q "$doc_id"; then`
			`echo "✓ Processing succeeded on attempt $attempt"`
			`return 0`
			`else`
			`echo "Processing started but content not yet accessible"`
			`fi`
			`else`
			`echo "✗ Processing failed on attempt $attempt"`
			`fi`

			`if [ $attempt -lt $max_retries ]; then`
			`echo "Retrying in ${retry_delay}s..."`
			`sleep $retry_delay`
			`fi`
			`done`

			`echo "✗ Processing failed after $max_retries attempts"`
			`return 1`
			`}`

			`# Usage`
			`processing_with_retry "doc_123" "retry_proc_001"`
			```

			`### Configuration-Driven Processing`
			```bash
			`# Process documents based on configuration file`
			`config_driven_processing() {`
			`local config_file="$1"`

			`if [ ! -f "$config_file" ]; then`
			`echo "Configuration file not found: $config_file"`
			`return 1`
			`fi`

			`echo "Processing documents based on configuration: $config_file"`

			`# Example configuration format:`
			`# doc_id,flow_id,collection,tags`
			`# doc_123,research-flow,research-docs,nlp research`

			`while IFS=',' read -r doc_id flow_id collection tags; do`
			`# Skip header line`
			`if [ "$doc_id" = "doc_id" ]; then`
			`continue`
			`fi`

			`proc_id="config_$(date +%s)_${doc_id}"`

			`echo "Processing: $doc_id -> $collection (flow: $flow_id)"`

			`tg-start-library-processing \`
			`-d "$doc_id" \`
			`--id "$proc_id" \`
			`-i "$flow_id" \`
			`--collection "$collection" \`
			`--tags "$tags"`

			`done < "$config_file"`
			`}`

			`# Create example configuration`
			`cat > processing_config.csv << EOF`
			`doc_id,flow_id,collection,tags`
			`doc_123,research-flow,research-docs,nlp research`
			`doc_456,finance-flow,finance-docs,financial quarterly`
			`doc_789,general-flow,general-docs,general processing`
			`EOF`

			`# Process based on configuration`
			`config_driven_processing "processing_config.csv"`
			```

			`## Error Handling`

			`### Document Not Found`
			```bash
			`Exception: Document not found`
			```
			Solution: Verify document exists with `tg-show-library-documents`.

			`### Processing ID Conflict`
			```bash
			`Exception: Processing ID already exists`
			```
			Solution: Use a unique processing ID or check existing jobs with `tg-show-flows`.

			`### Flow Not Found`
			```bash
			`Exception: Flow instance not found`
			```
			Solution: Verify flow exists with `tg-show-flows` or `tg-show-flow-classes`.

			`### Insufficient Resources`
			```bash
			`Exception: Processing queue full`
			```
			`Solution: Wait for current jobs to complete or scale processing resources.`

			`## Monitoring and Management`

			`### Processing Status`
			```bash
			`# Monitor processing progress`
			`monitor_processing() {`
			`local proc_id="$1"`
			`local timeout="${2:-300}" # 5 minutes default`

			`echo "Monitoring processing: $proc_id"`

			`elapsed=0`
			`interval=10`

			`while [ $elapsed -lt $timeout ]; do`
			`# Check if processing is active`
			`if tg-show-flows \| grep -q "$proc_id"; then`
			`echo "Processing active... (${elapsed}s elapsed)"`
			`else`
			`echo "Processing completed or stopped"`
			`break`
			`fi`

			`sleep $interval`
			`elapsed=$((elapsed + interval))`
			`done`

			`if [ $elapsed -ge $timeout ]; then`
			`echo "Monitoring timeout reached"`
			`fi`
			`}`

			`# Monitor specific processing job`
			`monitor_processing "proc_001" 600`
			```

			`### Batch Monitoring`
			```bash
			`# Monitor multiple processing jobs`
			`monitor_batch() {`
			`local proc_pattern="$1"`

			`echo "Monitoring batch processing: $proc_pattern"`

			`while true; do`
			`active_jobs=$(tg-show-flows \| grep -c "$proc_pattern" \|\| echo "0")`

			`if [ "$active_jobs" -eq 0 ]; then`
			`echo "All batch processing jobs completed"`
			`break`
			`fi`

			`echo "Active jobs: $active_jobs"`
			`sleep 30`
			`done`
			`}`

			`# Monitor batch processing`
			`monitor_batch "batch_proc_"`
			```

			`## Environment Variables`

			- `TRUSTGRAPH_URL`: Default API URL

			`## Related Commands`

			- [`tg-show-library-documents`](tg-show-library-documents.md) - List available documents
			- [`tg-stop-library-processing`](tg-stop-library-processing.md) - Stop processing jobs
			- [`tg-show-flows`](tg-show-flows.md) - Monitor processing flows
			- [`tg-invoke-document-rag`](tg-invoke-document-rag.md) - Query processed documents

			`## API Integration`

			`This command uses the [Library API](../apis/api-librarian.md) to initiate document processing workflows.`

			`## Best Practices`

			`1. Unique IDs: Always use unique processing IDs to avoid conflicts`
			`2. Resource Management: Monitor system resources during batch processing`
			`3. Error Handling: Implement retry logic for robust processing`
			`4. Monitoring: Track processing progress and completion`
			`5. Collection Organization: Use meaningful collection names`
			`6. Tagging: Apply consistent tagging for better organization`
			`7. Documentation: Document processing procedures and configurations`

			`## Troubleshooting`

			`### Processing Not Starting`
			```bash
			`# Check document exists`
			`tg-show-library-documents \| grep "document-id"`

			`# Check flow is available`
			`tg-show-flows \| grep "flow-id"`

			`# Check system resources`
			`free -h`
			`df -h`
			```

			`### Slow Processing`
			```bash
			`# Check processing queue`
			`tg-show-flows \| grep processing \| wc -l`

			`# Monitor system load`
			`top`
			`htop`
			```

			`### Processing Failures`
			```bash
			`# Check processing logs`
			`# (Log location depends on TrustGraph configuration)`

			`# Retry with different flow`
			`tg-start-library-processing -d "doc-id" --id "retry-proc" -i "alternative-flow"`
			```