refactor: Update GitHub connector to use gitingest CLI

- Refactored GitHubConnector to call the gitingest CLI via subprocess, improving performance and avoiding async issues with Celery.
- Updated the ingestion method to handle repository digests more efficiently and added error handling for subprocess execution.
- Adjusted the GitHub indexer to call the new synchronous ingestion method.
- Clarified in the documentation that the Personal Access Token is optional for public repositories.
Anish Sarkar 2026-01-20 23:24:33 +05:30
parent 49b8a46d10
commit 35888144eb
8 changed files with 221 additions and 256 deletions
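
For readers skimming the commit message, the synchronous subprocess approach it describes might look roughly like the sketch below. This is not the code from this commit: `IngestError`, `build_command`, and `run_ingest` are hypothetical names, and the command layout (a repository URL followed by `-o <path>`) is an assumption about the gitingest CLI that should be checked against the installed version.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class IngestError(RuntimeError):
    """Hypothetical error type raised when the ingestion subprocess fails."""


def build_command(repo_full_name, output_path, executable="gitingest"):
    # Assumption: the CLI accepts a source URL and an `-o` output path.
    url = f"https://github.com/{repo_full_name}"
    return [executable, url, "-o", str(output_path)]


def run_ingest(command, output_path, timeout=300):
    """Run the command synchronously and return the digest text it wrote."""
    try:
        # A blocking call keeps the Celery task free of event-loop handling.
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except FileNotFoundError as exc:
        raise IngestError(f"executable not found: {command[0]}") from exc
    except subprocess.TimeoutExpired as exc:
        raise IngestError(f"timed out after {timeout}s") from exc

    if result.returncode != 0:
        raise IngestError(
            f"exit code {result.returncode}: {result.stderr.strip()}"
        )

    path = Path(output_path)
    if not path.exists():
        raise IngestError(f"no digest written to {path}")
    return path.read_text(encoding="utf-8")
```

A caller would combine the two, for example `run_ingest(build_command("owner/repo", out), out)` with `out` pointing into a temporary directory. Passing the command as a list avoids shell quoting issues, and mapping every failure mode to a single exception type lets the indexer log and skip one repository without aborting the whole run.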

@@ -530,7 +530,10 @@ def validate_connector_config(
     # "validators": {},
     # },
     "GITHUB_CONNECTOR": {
-        "required": ["GITHUB_PAT", "repo_full_names"],
+        # GITHUB_PAT is optional - only required for private repositories
+        # Public repositories can be indexed without authentication
+        "required": ["repo_full_names"],
+        "optional": ["GITHUB_PAT"],  # Optional - only needed for private repos
         "validators": {
             "repo_full_names": lambda: validate_list_field(
                 "repo_full_names", "repo_full_names"