mirror of
https://github.com/VectifyAI/PageIndex.git
synced 2026-05-05 21:12:37 +02:00
Refactor issue dedup system to use claude-code-action with /dedupe command
Replace the copilot-generated inline search logic with a claude-code-action based architecture inspired by anthropic/claude-code's approach:

- Add .claude/commands/dedupe.md with 5-parallel-search strategy
- Add scripts/comment-on-duplicates.sh with 3-day grace period warning
- Rewrite issue-dedupe.yml to use claude-code-action + /dedupe command
- Rewrite autoclose script to check bot comments, human activity, and thumbsdown
- Rewrite backfill to trigger dedupe workflow per issue with rate limiting
- Add concurrency control, timeout, input validation, and rate limit retry
- Remove gh.sh (unnecessary), backfill-dedupe.js (replaced by workflow trigger)
parent b3cb9531a4
commit fd9330c434
8 changed files with 413 additions and 752 deletions
69
.claude/commands/dedupe.md
Normal file
@@ -0,0 +1,69 @@
---
allowed-tools:
- Bash(gh:*)
- Bash(./scripts/comment-on-duplicates.sh:*)
---

You are a GitHub issue deduplication assistant. Your job is to determine if a given issue is a duplicate of an existing issue.

## Input

The issue to check: $ARGUMENTS

## Steps

### 1. Pre-checks

First, check if the issue should be skipped:

```
gh issue view <number> --json state,labels,title,body,comments
```

Skip if:
- The issue is already closed
- The issue already has a `duplicate` label
- The issue already has a dedupe comment (check comments for "possible duplicate")
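
The three skip conditions above can be sketched as a small predicate over the JSON printed by `gh issue view`. This is only an illustration, not part of the command: the function name `should_skip` is hypothetical, and the field shapes (`state` as `"OPEN"`/`"CLOSED"`, labels with a `name` key, comments with a `body` key) are assumptions about the CLI output.

```python
import json


def should_skip(issue: dict) -> bool:
    """Return True when any pre-check says the issue needs no dedupe pass."""
    # Assumption: `state` is "OPEN" or "CLOSED"; compare case-insensitively.
    if str(issue.get("state", "")).upper() == "CLOSED":
        return True

    # Assumption: each label is an object with a "name" field.
    label_names = {label.get("name", "").lower() for label in issue.get("labels", [])}
    if "duplicate" in label_names:
        return True

    # Assumption: each comment is an object with a "body" field.
    # A previous dedupe comment is detected by the phrase "possible duplicate".
    return any(
        "possible duplicate" in comment.get("body", "").lower()
        for comment in issue.get("comments", [])
    )


# Usage: parse the JSON text captured from the gh command, then test it.
sample = json.loads('{"state": "OPEN", "labels": [{"name": "bug"}], "comments": []}')
print(should_skip(sample))  # prints False: none of the skip conditions apply
```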

### 2. Understand the issue

Read the issue carefully and generate a concise summary of the core problem or feature request. Extract 3-5 key technical terms or concepts.

### 3. Search for duplicates

Launch 5 parallel searches using different keyword strategies to maximize coverage:

1. **Exact terms**: Use the most specific technical terms from the issue title
2. **Synonyms**: Use alternative phrasings for the core problem
3. **Error messages**: If the issue contains error messages, search for those
4. **Component names**: Search by the specific component/module mentioned
5. **Broad category**: Search by the general category of the issue

For each search, use:
```
gh search issues "<keywords>" --repo $REPOSITORY --limit 20
```
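
Because the five searches overlap, their results need to be merged into one list of unique candidates before analysis. A minimal sketch of that merge follows. It assumes each search result has been parsed into a list of dicts carrying a `number` key (for example by adding `--json number,title` to the search command, which the command above does not do), and the helper name `merge_candidates` is hypothetical.

```python
from typing import Dict, Iterable, List


def merge_candidates(search_results: Iterable[List[dict]], base_issue: int) -> List[dict]:
    """Merge several search result lists into unique candidates.

    The issue being checked (`base_issue`) is dropped, and the first
    occurrence of each issue number wins, so earlier (more specific)
    searches keep their ordering.
    """
    unique: Dict[int, dict] = {}
    for results in search_results:
        for item in results:
            number = item["number"]
            if number == base_issue or number in unique:
                continue
            unique[number] = item
    return list(unique.values())


# Usage with made-up results from two of the five strategies.
exact_terms = [{"number": 12, "title": "Crash on empty PDF"}, {"number": 40, "title": "This issue"}]
synonyms = [{"number": 12, "title": "Crash on empty PDF"}, {"number": 7, "title": "Fails on blank file"}]
for candidate in merge_candidates([exact_terms, synonyms], base_issue=40):
    print(candidate["number"], candidate["title"])
```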

### 4. Analyze candidates

For each unique candidate issue found:
- Compare the core problem being described
- Look past superficial wording differences
- Consider whether they describe the same root cause
- Only flag as duplicate if you are at least 85% confident

### 5. Filter false positives

Remove candidates that:
- Are only superficially similar (same area but different problems)
- Are related but describe distinct issues
- Are too old or already resolved differently
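
Steps 4 and 5 are judgment calls, but the selection rule they imply can be written down. The sketch below assumes the analysis has already produced, for each candidate, a confidence between 0 and 1 and a yes/no verdict on whether it shares the root cause. Both fields, the `Candidate` class, and the function name are hypothetical; only the 85% threshold comes from the text.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_THRESHOLD = 0.85  # "at least 85% confident"


@dataclass
class Candidate:
    number: int
    confidence: float      # hypothetical score in [0.0, 1.0] from step 4
    same_root_cause: bool  # hypothetical verdict: False for merely related issues


def keep_likely_duplicates(candidates: List[Candidate]) -> List[Candidate]:
    """Drop false positives and order the survivors by confidence, highest first."""
    kept = [
        c for c in candidates
        if c.same_root_cause and c.confidence >= CONFIDENCE_THRESHOLD
    ]
    return sorted(kept, key=lambda c: c.confidence, reverse=True)


# Usage: only #12 and #7 clear both checks.
analysed = [
    Candidate(number=7, confidence=0.88, same_root_cause=True),
    Candidate(number=12, confidence=0.95, same_root_cause=True),
    Candidate(number=31, confidence=0.90, same_root_cause=False),  # related, distinct
    Candidate(number=44, confidence=0.60, same_root_cause=True),   # below threshold
]
print([c.number for c in keep_likely_duplicates(analysed)])  # prints [12, 7]
```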

### 6. Report results

If you found duplicates (max 3), call:
```
./scripts/comment-on-duplicates.sh --base-issue <number> --potential-duplicates <dup1> <dup2> ...
```

If no duplicates found, do nothing and report that the issue appears to be unique.
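
The reporting rule can be sketched as a helper that builds the argument list for the script, capped at three duplicates, and returns nothing when the issue is unique. The script path and its two flags are taken from the command above; the helper name `build_comment_command` is hypothetical, and the sketch only builds the command without running it.

```python
from typing import List, Optional

MAX_DUPLICATES = 3  # "max 3" from the reporting step


def build_comment_command(base_issue: int, duplicates: List[int]) -> Optional[List[str]]:
    """Return the argv for comment-on-duplicates.sh, or None if there is nothing to report."""
    if not duplicates:
        return None
    capped = duplicates[:MAX_DUPLICATES]
    return [
        "./scripts/comment-on-duplicates.sh",
        "--base-issue", str(base_issue),
        "--potential-duplicates", *[str(number) for number in capped],
    ]


# Usage: four candidates are passed in, only the first three are reported.
print(build_comment_command(40, [12, 7, 50, 63]))
print(build_comment_command(40, []))  # prints None
```

Passing the result as a list to something like `subprocess.run(cmd, check=True)` avoids shell quoting problems, since each issue number stays a separate argument.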