2026-04-16 19:44:23 +02:00
# Code Review Hunter MVP Plan
## Real-world rejected PRs → GitHub-style feedback hints → earn XP fixing bugs
---
## Architecture Overview
### Option A: Quarkus Backend + Astro Frontend (Recommended)
**Backend**: Java microservices using Hibernate ORM with Panache (reactive persistence), GraphQL/REST via the Vert.x HTTP server layer, a MySQL database, Flyway migrations, and a Kubernetes-ready deployment.
**Frontend**: Astro with SSR rendering and static builds, giving minimal runtime overhead and easier VPS hosting without complex build pipelines.
### Option B: Node.js Server + SQLite Embedded DB, HTML/Tailwind CSS/Vanilla JS
Targets the web on macOS/Linux only; no native mobile code is required.
---
## Core Scoring Formula (Confirmed)
| Action | Points | Notes |
| --- | --- | --- |
| Flag the PR as buggy | **+x** base points | |
| Identify the correct line numbers | additional **+0.5x** bonus | Immediate feedback: the ground-truth lines are revealed on submission for learning purposes |
| Submit a valid, working patch (compared via the side-by-side diff viewer UI helper) | final reward of **+(2 * x) total** | |
| Specify incorrect bug locations | no penalty | Only a missed opportunity; zero negative impact |
| Submit a wrong fix | **-0.5x** deduction | |
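
To make the arithmetic concrete, here is a minimal sketch of how these rules might combine. It assumes the additive reading used by the scoring logic later in this plan (flag points + line bonus + 2x on a working fix) and that the line bonus is only paid out together with a working fix. The names `Attempt` and `scoreAttempt`, and the sample value x = 100, are hypothetical.

```typescript
// Hypothetical shape of one complete attempt at a challenge.
interface Attempt {
  flagged: boolean;      // user marked the PR as buggy
  linesCorrect: boolean; // at least one guessed line matches the ground truth
  fixWorking: boolean;   // submitted patch was accepted as a working fix
}

// x is the challenge's base point value.
function scoreAttempt(x: number, attempt: Attempt): number {
  if (!attempt.flagged) {
    return 0; // nothing is scored without the initial flag
  }
  if (attempt.fixWorking) {
    const lineBonus = attempt.linesCorrect ? 0.5 * x : 0;
    return x + lineBonus + 2 * x; // flag + line bonus + fix reward
  }
  return x - 0.5 * x; // wrong fix: flag points minus the 0.5x penalty
}

// Worked values for x = 100.
const full = scoreAttempt(100, { flagged: true, linesCorrect: true, fixWorking: true });    // 350
const noLines = scoreAttempt(100, { flagged: true, linesCorrect: false, fixWorking: true }); // 300
const wrongFix = scoreAttempt(100, { flagged: true, linesCorrect: true, fixWorking: false }); // 50
console.log(full, noLines, wrongFix);
```

If the intended reading is instead that a working fix brings the total to exactly 2x, only the `fixWorking` branch would need to change.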
---
## Content Delivery System
### GitHub-Style Comments as Ground Truth Data
Pull rejected PR metadata with:
```text
File path | line number | comment text "@username this has a null pointer dereference"
Line 42: "Memory leak - never closing connection handle" (from maintainer feedback)
```
Difficulty stars are auto-calculated via the formula:
`star_rating = min(5, number_of_annotated_comments × average_lines_between_comments)`
Issues that are more dispersed across the file, and a higher comment count, automatically increase the difficulty tier.
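
The formula could be sketched roughly as follows. The function names `averageGap` and `starRating` are hypothetical; rounding up and the lower bound of 1 are assumptions (the lower bound is taken from the 1-to-5 sanity check in the risk section of this plan).

```typescript
// Average distance, in lines, between consecutive annotated comments.
function averageGap(commentLines: number[]): number {
  if (commentLines.length < 2) {
    return 0; // a single comment has no spread
  }
  const sorted = [...commentLines].sort((a, b) => a - b);
  let total = 0;
  for (let i = 1; i < sorted.length; i++) {
    total += sorted[i] - sorted[i - 1];
  }
  return total / (sorted.length - 1);
}

// comment count × average gap, rounded up and clamped to the 1..5 star range.
function starRating(commentLines: number[]): number {
  const raw = commentLines.length * averageGap(commentLines);
  return Math.min(5, Math.max(1, Math.ceil(raw)));
}

console.log(starRating([42]));         // one comment, no spread
console.log(starRating([10, 12]));     // two comments close together
console.log(starRating([1, 50, 120])); // three widely spread comments
```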
---
## Frontend Component Flow
### ChallengeView Page Structure
1. Load the rejected PR from the `/challenges/:id` route: query the database for the code snippet and its GitHub-style inline annotations, with the review comments serving as hints.
2. Display the challenge title and the auto-calculated star rating above the diff viewer UI.
3. Show the original base commit SHA and the source repo ownership metadata in a header card. This context is optional and appears only if the user has enabled the show-full-history flag.
**Step A - Flag Mode**:
The user clicks a checkbox or input field to submit "This is buggy", which sends a `{ flagged: true, lines_guessing_array_if_any }` payload.
**Step B - Line Detection**:
The user selects the line numbers that contain issues, using diff highlighting on hover. The editor component validates the input against the expected bug locations from the ground truth data.
**Step C - Side-by-Side Diff Viewer Helper**:
The user writes a working fix in a text area, in the format `- old content\n+ new corrected line`. The UI generates a unified diff from it automatically, comparing the code before and after the change.
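
The validation in Step B might look something like the sketch below. `LineCheck` and `checkLineGuesses` are hypothetical names, and the three-way split of the result is an assumption about what the UI would want to display.

```typescript
// Result of comparing the user's guessed lines with the ground truth.
interface LineCheck {
  hits: number[];    // guessed lines that really contain an issue
  wrong: number[];   // guessed lines with no known issue (no penalty)
  unfound: number[]; // ground-truth lines the user did not guess
}

function checkLineGuesses(guessed: number[], groundTruth: number[]): LineCheck {
  // Drop duplicate guesses so a line is only counted once.
  const unique = guessed.filter((n, i) => guessed.indexOf(n) === i);
  return {
    hits: unique.filter((n) => groundTruth.indexOf(n) !== -1),
    wrong: unique.filter((n) => groundTruth.indexOf(n) === -1),
    unfound: groundTruth.filter((n) => unique.indexOf(n) === -1),
  };
}

const check = checkLineGuesses([42, 7, 42], [42, 90]);
console.log(check.hits, check.wrong, check.unfound);
```

A non-empty `hits` array would correspond to the line-detection bonus; `wrong` carries no penalty under the scoring formula.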
Final Scoring Calculator Logic:
```typescript
// scoreFromFlags is x; lineDetectionBonus is 0.5x when userCorrectLines is non-empty, else 0.
if (!isFlaggedBuggy) {
  totalScore = 0;
  disableFeedbackReveal();
} else if (patchValidator.isValid && patchFixIsWorking) {
  // Correct lines earn nothing on their own; the bonus is paid out with the working fix.
  totalScore = scoreFromFlags + lineDetectionBonus + 2 * x;
  revealGroundTruthWithComments();
  awardXP(user, totalScore);
} else {
  // Wrong submission: 0.5x penalty.
  totalScore = scoreFromFlags - 0.5 * x;
  showSuggestionOnlyHints();
}
```
---
## Backend Components (Quarkus)
### Authentication Layer - Commented Out for MVP Prototyping:
Disable auth during the initial development phase to focus entirely on challenge flow validation and scoring accuracy; re-enable it later with full GitHub OAuth integration once backend stability metrics reach production-ready levels. Still include basic session state tracking via `@SessionScoped` beans, with Panache reactive streams support enabled.
```sql
-- Schema outline; column types are still to be decided.
users_table(user_id, username_hash, reputation_points, github_oauth_token, last_login) -- nullable fields initially
challenges(id, repository_owner, repo_name, base_sha1_commit_tag, affected_lines_array JSON, bug_type_category, difficulty_rating_star_count, x_value_points)
bug_patches_submission_attempts(attempt_id, user_id, challenge_id /* foreign keys */, timestamp_submitted, calculated_score, lines_detected, flags_is_correct, patch_content BLOB, submitted_status /* pending | verified | completed */)
```
---
### Scrape Import Service - CSV Seeding Pipeline
An async job, triggered by the Quarkus scheduler via the `@Scheduled` annotation, reads challenge files from the `/challenges/import/*.csv` directory path and persists them into the core tables through Hibernate reactive streams. An admin user account then curates the imported data manually before public release.
CSV Format example:
```text
challenge_id,repo_owner,base_sha,baseline_tag,diff_lines_json,hint_comments,count_star_rating,x_point_value
1,facebook/react,abc123456main_0x7f8b9c,[...],"[[""File.js"",""Line 42"",""Memory leak - never closing conn""]]",auto-rated-3-star-complexity-from-comment-spread
```
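
Because hint comments can contain commas, quotes, and line breaks, the import step needs quote-aware parsing rather than a plain split on commas. The sketch below is one possible approach, written in TypeScript for illustration only (the plan's backend is Java); `parseCsv` is a hypothetical name, and it assumes the common CSV convention of doubling quotes inside quoted fields.

```typescript
// Minimal quote-aware CSV reader: handles quoted fields, doubled quotes ("")
// inside quoted fields, and newlines embedded in quoted fields.
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // doubled quote is an escaped literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // commas and newlines are literal inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      field = "";
      rows.push(row);
      row = [];
    } else {
      field += ch;
    }
  }
  // Flush the last record when the input does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

const sample = 'id,hint\n1,"Line 42: ""Memory leak"",\nnever closing conn"\n';
console.log(parseCsv(sample));
```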
---
### User Challenge Logic Pipeline:
A reactive Quarkus stream handles the full workflow end-to-end:
1. Fetch the user's current session state from the DB cache.
2. Load the challenge metadata, including the auto-calculated bug locations and the GitHub-style review annotations, with hints embedded in the payload.
3. Call the `ScoringService.evaluateSubmissions()` helper, which evaluates every outcome defined by the scoring formula above: correct or incorrect line detection, and fix verification using a side-by-side diff comparison against the expected working solution, if the administrator provided one during challenge creation.
---
## Frontend Implementation Details (Astro)
### Core Page Components
- `components/DiffViewer.astro`: SSR-rendered code blocks with GitHub-style inline comments shown on hover.
- **challenge_input_component**: a checkbox to flag the code as buggy, a text area for diff patches, and a line-number selector (array input). Client-side JavaScript is shipped only where interactive behavior needs it.
- **/profile/dashboard page** (`UserProfileDashboard.astro`): shows XP progress bars toward level-up targets and weekly activity charts. The page is rendered server-side; the chart library (chart.js or the Chartable plugin) is loaded dynamically after the page mounts.
- Settings panel (`/settings`): a basic logout toggle and a GitHub OAuth connect button. The button is disabled in the MVP version and re-enabled post-launch, once the auth layer is fully tested and validated against the acceptance criteria.
---
## Development Priority Order (MVP Strict Sequencing)
### Phase 1 (Week One) - Setup Foundation:
- [ ] Initialize the Quarkus project with the MySQL connector, Hibernate ORM, and Panache reactive streams support enabled, via `gradlew init` CLI tooling
- [ ] Apply the SQL schema using Flyway migration scripts that create all core tables from the definitions above
- [ ] Build an initial CSV import utility script that loads sample challenge data files into the local database for manual test seeding
---
### **Phase 2 - Core Challenge Logic (Weeks Two and Three): Highest Priority**
1. Build the `/challenge/:id` SSR page, rendering markdown-formatted code snippets with GitHub-style inline comments embedded as annotation hints above the diff viewer component.
2. Complete the score validation helper functions and test each branch of the scoring formula: correct lines, wrong lines, penalty scenarios, and so on. Document edge cases in Javadoc comments, including null pointer exceptions, memory safety concerns, and race conditions where applicable.
3. Have the frontend client send the scoring payload to the backend API endpoint `POST /challenge/submit` as a JSON body containing `{ flagged: boolean, line_numbers_guess: [] }` and an optional `diff_patch` string holding the text area content.
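
A typed view of that submission payload, with basic validation, might look like the sketch below. The field names follow the payload described above; the `SubmissionPayload` and `parseSubmission` names and the specific validation rules (positive integer line numbers, optional string patch) are assumptions.

```typescript
// Assumed shape of the JSON body sent to POST /challenge/submit.
interface SubmissionPayload {
  flagged: boolean;
  line_numbers_guess: number[];
  diff_patch?: string;
}

// Validates an untrusted request body and returns a typed payload, or throws.
function parseSubmission(body: unknown): SubmissionPayload {
  if (typeof body !== "object" || body === null) {
    throw new Error("body must be a JSON object");
  }
  const raw = body as Record<string, unknown>;

  const flagged = raw.flagged;
  if (typeof flagged !== "boolean") {
    throw new Error("flagged must be a boolean");
  }

  const lines = raw.line_numbers_guess;
  if (!Array.isArray(lines)) {
    throw new Error("line_numbers_guess must be an array");
  }
  const guesses: number[] = [];
  for (const n of lines) {
    if (typeof n !== "number" || !isFinite(n) || Math.floor(n) !== n || n < 1) {
      throw new Error("line numbers must be positive integers");
    }
    guesses.push(n);
  }

  let diffPatch: string | undefined;
  const patch = raw.diff_patch;
  if (typeof patch === "string") {
    diffPatch = patch;
  } else if (patch !== undefined) {
    throw new Error("diff_patch must be a string when present");
  }

  return { flagged, line_numbers_guess: guesses, diff_patch: diffPatch };
}

const payload = parseSubmission(JSON.parse('{"flagged": true, "line_numbers_guess": [42]}'));
console.log(payload.flagged, payload.line_numbers_guess);
```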
---
### **Phase 3 - Nice-to-Have (Week Four, After the Core Is Working)**
- Add a user profile stats dashboard page showing total XP earned, streak bonuses (calculated weekly, reset every Sunday at midnight UTC), and a chart of challenges solved over the last 30 days, broken down by difficulty star tier.
- Re-enable the GitHub OAuth authentication layer in the production build, ready for live deployment, with the full login/signup flow working end-to-end once the auth refactor is complete.
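
For the weekly streak reset, the boundary of the current week could be computed as in this small sketch. `weekStartUtc` is a hypothetical helper; it assumes a week begins on Sunday at 00:00 UTC, as stated above.

```typescript
// Returns the most recent Sunday at 00:00 UTC on or before the given moment.
function weekStartUtc(now: Date): Date {
  const daysSinceSunday = now.getUTCDay(); // 0 = Sunday ... 6 = Saturday
  // Date.UTC normalizes day values below 1 into the previous month.
  return new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - daysSinceSunday)
  );
}

console.log(weekStartUtc(new Date("2026-04-16T17:44:23Z")).toISOString());
```

Submissions with a timestamp at or after this boundary would count toward the current week's streak.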
---
## Comparison Summary Table
| Factor | Quarkus + Astro (Enterprise Path) | Node.js Server-Side (Start-fast MVP approach) |
| --- | --- | --- |
| **Scalability** | High: Kubernetes-ready, multi-threading supported via Vert.x event loops | Moderate: fine for local testing and a smaller-scale initial launch until demand is proven |
| **Database** | ACID guaranteed with MySQL/PostgreSQL | Single-file SQLite: simple, with faster queries and no distributed lock management |
| **TypeScript** | Required; Astro uses a client-side runtime | Vanilla JS possible |
| **Deployment** | Standard VPS hosting with straightforward systemd startup scripts | macOS-only binary builds |
| **Tooling** | Production-grade: built-in logging, tracing, and health checks, ready for CI/CD pipelines | Lower overhead; good for learning and for building a prototype quickly |
---
## Risk Mitigation Strategy:
1. **Comment Parsing Ambiguity**: Implement a robust CSV parser that handles escaped quotes and embedded newlines properly. For remaining edge cases, fall back to a manual inspection workflow if auto-parsing fails more than 3 times on the same file during import job processing.
2. **Auto-Calculated Difficulty Inaccuracies**: Add a difficulty rating sanity check ensuring the computed stars fall between 1 and 5, clamping to the ceiling of 5 before returning the challenge metadata.
3. **User Fix Validity Testing**: Start with a simple diff comparison against the stored expected patches only. Add more complex runtime execution validation later, once the maintainers consider the MVP's stability metrics acceptable.
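
The simple diff comparison in item 3 might start out as in the sketch below. `changedLines` and `matchesExpectedPatch` are hypothetical names; ignoring whitespace differences and requiring the changed lines to match in order are assumptions, not settled rules.

```typescript
// Extracts the added/removed lines of a patch, normalizing whitespace so that
// indentation and spacing differences do not cause a mismatch.
function changedLines(patch: string): string[] {
  const out: string[] = [];
  for (const raw of patch.split(/\r?\n/)) {
    const sign = raw.charAt(0);
    if (sign !== "+" && sign !== "-") continue; // context lines, hunk headers
    // Skip unified-diff file headers. Simplification: a changed line whose
    // own content starts with "--" or "++" would also be skipped.
    if (raw.indexOf("+++") === 0 || raw.indexOf("---") === 0) continue;
    out.push(sign + raw.slice(1).trim().replace(/\s+/g, " "));
  }
  return out;
}

// True when the submitted patch changes the same lines, in the same order,
// as the expected patch stored with the challenge.
function matchesExpectedPatch(submitted: string, expected: string): boolean {
  const a = changedLines(submitted);
  const b = changedLines(expected);
  return a.length === b.length && a.every((line, i) => line === b[i]);
}

const expected = "-conn.open()\n+conn.open(); conn.close()";
const submitted = "- conn.open()\n+   conn.open();   conn.close()";
console.log(matchesExpectedPatch(submitted, expected));
```

An exact textual match like this rejects fixes that are correct but written differently, which is the gap that later runtime execution validation would close.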
---
## MVP Feature Checklist - Must-Have Before Release
- ✅ CSV seeding pipeline imports sample challenges into the local database via Flyway DB migration scripts
- ✅ `/challenge/:id` page loads the GitHub-style inline hints and renders the side-by-side diff viewer component for the comparison input format
- ✅ Scoring calculator handles all permutations (wrong lines earn no bonus, an incorrect fix incurs the penalty, a correct submission is awarded the total points) and reveals the ground truth immediately, per specification, after the user completes the first attempt

⚠️ Optional Post-Launch Enhancements:
- GitHub OAuth login re-enabled later, once the auth layer is thoroughly tested with test coverage passing the 90% acceptance threshold
- Time-attack mode implemented as a separate challenge variant, if analytics show that a time constraint would provide value
---
## Timeline Estimate (Week-by-Week Breakdown):
**Week One**: Set up the Quarkus project, apply the SQL schema via Flyway, and import the CSV seeding scripts.

**Weeks Two & Three**: Build the `/challenge/:id` page with the SSR renderer, a markdown parser for code block syntax highlighting, and the diff comparison UI component. Test the scoring helper functions against known test cases, with manual regression testing after each commit.
---
## Post-MVP Questions To Consider Later (Not Blocking MVP Launch):
- LLM-powered bug detection integration as an opt-in upgrade path, instead of the rule-based pattern matching used in the first phase
- Real-time multiplayer leaderboard updates via WebSockets, using the Vert.x messaging layer for live score tracking across concurrent users
- Peer review submission flow where other users can vote on proposed patch fixes, with weekly community-generated badges for the most helpful submissions
---
## Next Steps: Confirm MVP Scope & Approve Plan Document
The plan is complete. Once you approve it, the codebase scaffolding files can be generated. Manual code reviews or automated build validation tests can then verify, as needed, that all core functionality meets the defined acceptance criteria before the full public release.