feat: add cv benchmark workflow and admin visibility

2026-04-01 12:25:45 +02:00
parent 0551a525a8
commit 0d65835857
16 changed files with 832 additions and 95 deletions
@@ -0,0 +1,224 @@
+# CV builder, parser, benchmark, and Ollama admin integration
+
+## What changed
+
+This branch turns the Profile CV flow from a text-only rewrite helper into a template-driven CV builder backed by the server-side renderer/PDF pipeline. It also tightens CV normalization around location and qualification handling, adds a repeatable local corpus benchmark workflow, and expands the admin system page with richer Ollama visibility.
+
+## Profile CV builder
+
+### New backend capabilities
+
+`JobTrackerApi/Controllers/ProfileCvController.cs`
+
+- Hardened `POST /api/profile-cv/rewrite-section`
+  - accepts flexible `jobApplicationId` payloads (number or blank string)
+  - uses richer saved-job context for tailoring
+  - logs empty AI responses with useful context
+- Added `GET /api/profile-cv/templates`
+- Added `POST /api/profile-cv/rewrite-preview`
+  - rewrites either the whole CV or one selected section
+  - rebuilds structured CV from the rewritten full text
+  - maps the result into the shared template renderer
+  - returns rendered HTML, file name, rewritten text, and full replacement text
+- Added `POST /api/profile-cv/export-pdf`
+  - exports the same rendered HTML as the preview through the shared Playwright exporter
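
As an illustration of the "number or blank string" handling for `jobApplicationId` described above, here is a minimal Python sketch. The real endpoint is implemented in C# and its code is not shown here, so the function name `parse_job_application_id` and the decision to also accept numeric strings are assumptions made for the example.

```python
from typing import Optional, Union


def parse_job_application_id(value: Union[int, str, None]) -> Optional[int]:
    """Coerce a flexible jobApplicationId payload into an int or None.

    Hypothetical helper: a number is returned as-is, a missing or blank
    value means "no saved job selected", anything else is rejected.
    """
    if value is None:
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise ValueError(f"unsupported jobApplicationId: {value!r}")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        text = value.strip()
        if text == "":
            return None
        # Assumption: numeric strings such as "17" are also accepted.
        if text.isascii() and text.isdigit():
            return int(text)
    raise ValueError(f"unsupported jobApplicationId: {value!r}")
```

Treating a blank string as "no job" lets a form send its empty select value unchanged instead of having to omit the field.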
+
+### Frontend flow
+
+`job-tracker-ui/src/pages/ProfilePage.tsx`
+
+- Replaced the old rewrite draft box with a template-driven builder section.
+- Users can:
+  - choose from 6 templates
+  - optionally target one section
+  - target by free-text role or saved job
+  - inspect the rewritten content
+  - inspect the actual rendered preview
+  - download a PDF
+  - replace the master CV with the rebuilt full-text result
+
+## Templates
+
+Shared renderer: `JobTrackerApi/Services/CvTemplateRenderer.cs`
+
+Available templates:
+- `ats-minimal`
+- `harvard`
+- `auckland`
+- `edinburgh`
+- `monarch`
+- `fjord`
+
+### Adding a new template
+
+1. Add the new template id to `NormalizeTemplateId()` in:
+   - `JobTrackerApi/Services/CvTemplateRenderer.cs`
+   - `JobTrackerApi/Controllers/ProfileCvController.cs`
+2. Add a render branch in `CvTemplateRenderer.Render()`.
+3. Add a descriptor to `GetCvTemplateDescriptors()`.
+4. Add the matching card entry in `job-tracker-ui/src/pages/ProfilePage.tsx` if you want a custom preview card.
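
Step 1 matters because an id that the normalizer does not know is presumably mapped to some fallback before rendering. The Python sketch below shows that idea only; the actual `NormalizeTemplateId()` is C# and is not reproduced in this document, so the fallback to `ats-minimal` and the trim/lowercase behaviour are assumptions.

```python
# Template ids taken from the "Available templates" list.
KNOWN_TEMPLATE_IDS = (
    "ats-minimal",
    "harvard",
    "auckland",
    "edinburgh",
    "monarch",
    "fjord",
)

# Assumption: the fallback template is not stated in the source.
DEFAULT_TEMPLATE_ID = "ats-minimal"


def normalize_template_id(raw):
    """Return a known template id, falling back to the default otherwise."""
    candidate = (raw or "").strip().lower()
    if candidate in KNOWN_TEMPLATE_IDS:
        return candidate
    return DEFAULT_TEMPLATE_ID
```

Under these assumptions, a new template that is rendered but never added to the known list would silently come out as the default, which is why the normalizer is the first thing to update.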
+
+## PDF generation
+
+The master CV builder now reuses the existing server-side pipeline:
+
+1. rewrite full text / section
+2. rebuild structured CV
+3. map to `TailoredCvDocument`
+4. render HTML via `ICvTemplateRenderer`
+5. export PDF via `ICvPdfExporter` / Playwright
+
+This keeps PDF output visually aligned with the selected template and avoids a separate client-only print implementation.
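
The five steps above can be sketched as one shared pipeline in which the PDF export consumes exactly the HTML that the preview produces. This is a Python illustration with hypothetical names (`CvPipeline`, `preview`, `pdf`); the real services are the C# `ICvTemplateRenderer` and `ICvPdfExporter`, whose signatures are not shown in this document.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CvPipeline:
    """Hypothetical wiring of the five pipeline steps as plain callables."""

    rewrite: Callable[[str, Optional[str]], str]  # 1. rewrite full text / section
    rebuild: Callable[[str], dict]                # 2. rebuild structured CV
    to_document: Callable[[dict], dict]           # 3. map to a document model
    render_html: Callable[[dict, str], str]       # 4. render HTML for a template
    export_pdf: Callable[[str], bytes]            # 5. export PDF from that HTML

    def preview(self, cv_text: str, template_id: str,
                section: Optional[str] = None) -> str:
        rewritten = self.rewrite(cv_text, section)
        structured = self.rebuild(rewritten)
        document = self.to_document(structured)
        return self.render_html(document, template_id)

    def pdf(self, cv_text: str, template_id: str,
            section: Optional[str] = None) -> bytes:
        # The PDF is produced from the same HTML as the preview, so the
        # two cannot drift apart visually.
        return self.export_pdf(self.preview(cv_text, template_id, section))
```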
+
+## Parser and structured CV changes
+
+### Shared schema
+
+`Models/StructuredCvProfile.cs`
+
+Added:
+- `education[].qualificationLevel`
+- top-level `certifications[]`
+- top-level `projects[]`
+
+`education[].qualification` still holds the original qualification text; `qualificationLevel` is stored alongside it rather than replacing it.
+
+### Normalization improvements
+
+`Models/StructuredCvProfileJson.cs`
+
+- tighter location sanitization to avoid skill or role spillover into location fields
+- qualification level normalization to one of:
+  - `Secondary`
+  - `Diploma/Certificate`
+  - `Bachelor`
+  - `Master`
+  - `PhD`
+  - `Other`
+- first-class normalization for certifications and projects
+- section reconstruction now includes certifications and projects
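
The qualification-level normalization above could look roughly like the following Python sketch. The real logic lives in C# in `Models/StructuredCvProfileJson.cs` and is not shown in this document, so the keyword rules and the name `normalize_qualification_level` are illustrative assumptions. Only the six output labels come from the list above.

```python
import re

# The six allowed labels, as listed in the normalization section.
QUALIFICATION_LEVELS = (
    "Secondary",
    "Diploma/Certificate",
    "Bachelor",
    "Master",
    "PhD",
    "Other",
)

# Hypothetical keyword rules, checked in order; the first match wins.
# Secondary is checked before Diploma/Certificate so that a phrase like
# "high school diploma" resolves to Secondary.
_RULES = (
    ("PhD", re.compile(r"\b(ph\.?d|doctorate|doctoral)\b")),
    ("Master", re.compile(r"\b(master(?:'s|s)?|m\.?sc|mba|m\.?eng)\b")),
    ("Bachelor", re.compile(r"\b(bachelor(?:'s|s)?|b\.?sc|b\.?eng|undergraduate)\b")),
    ("Secondary", re.compile(r"\b(secondary|high school|matric|gcse|a-levels?)\b")),
    ("Diploma/Certificate", re.compile(r"\b(diploma|certificate)\b")),
)


def normalize_qualification_level(qualification: str) -> str:
    """Map free-text qualification wording to one of QUALIFICATION_LEVELS."""
    text = (qualification or "").strip().lower()
    for level, pattern in _RULES:
        if pattern.search(text):
            return level
    return "Other"
```

The original wording is left untouched by this function; it only derives the level, which matches the schema note that the qualification text itself is preserved.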
+
+### Extraction prompt improvements
+
+`JobTrackerApi/Controllers/ProfileCvController.cs`
+
+The LLM extraction prompt now explicitly asks for:
+- qualification level enum
+- certifications
+- projects
+- strict location separation rules
+- preservation of original qualification text
+
+## Benchmark workflow
+
+### Runner
+
+Use:
+
+```bash
+./scripts/run-cv-benchmark.sh
+```
+
+Optional overrides:
+
+```bash
+CV_BENCHMARK_OUTPUT_DIR=/absolute/output/path \
+CV_BENCHMARK_APPROVED_DIR=/absolute/approved/fixtures/path \
+./scripts/run-cv-benchmark.sh
+```
+
+### Inputs
+
+The runner scans:
+
+- `/home/pi/cvs`
+
+Supported corpus file types:
+- PDF
+- DOCX
+- TXT
+- MD
+
+### Outputs
+
+The runner writes:
+
+- `index.json` — machine-readable summary
+- `report.md` — markdown overview
+- `outputs/*.json` — latest normalized structured output per CV
+- `candidate-fixtures/*.json` — created when no approved fixture exists yet
+
+Approved fixtures are local by design. Review each candidate fixture manually before promoting it into the approved fixture path you use for regression comparisons.
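
To make the candidate/approved distinction concrete, here is a minimal Python sketch of how one output file might be checked. It is not the runner's actual code: the function name `check_against_fixture`, the matching of fixtures by file name, and the exact-equality comparison are all assumptions.

```python
import json
from pathlib import Path


def check_against_fixture(output_path: Path, approved_dir: Path,
                          candidate_dir: Path) -> str:
    """Classify one benchmark output as 'match', 'mismatch', or 'candidate'.

    Assumption: an approved fixture has the same file name as the output.
    """
    actual = json.loads(output_path.read_text(encoding="utf-8"))
    approved_path = approved_dir / output_path.name

    if not approved_path.exists():
        # No approved fixture yet: write a candidate for manual review.
        candidate_dir.mkdir(parents=True, exist_ok=True)
        candidate_path = candidate_dir / output_path.name
        candidate_path.write_text(
            json.dumps(actual, indent=2, sort_keys=True), encoding="utf-8"
        )
        return "candidate"

    expected = json.loads(approved_path.read_text(encoding="utf-8"))
    return "match" if actual == expected else "mismatch"
```

Promotion stays a manual step in this sketch: nothing copies a candidate into the approved directory automatically.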
+
+### Admin review
+
+`GET /api/admin/system/cv-benchmark`
+
+The admin system page surfaces:
+- benchmark root path
+- last benchmark update time
+- latest markdown summary
+
+## Ollama admin visibility
+
+### Python health endpoint
+
+`tools/summarizer/app.py`
+
+`GET /health` now returns additional Ollama metadata when configured/reachable:
+- `ollama_version`
+- `ollama_installed_models`
+- `ollama_loaded_models`
+- `ollama_loaded_count`
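
A minimal sketch of how those four fields could be assembled is shown below. It assumes the metadata comes from Ollama's `/api/version`, `/api/tags`, and `/api/ps` endpoints and that their JSON has already been fetched. Whether `tools/summarizer/app.py` works this way is not stated here, and the helper name `build_ollama_health_fields` is hypothetical.

```python
def build_ollama_health_fields(version_payload, tags_payload, ps_payload):
    """Build the extra /health fields from already-fetched Ollama JSON.

    Any payload may be None (for example when Ollama is unreachable); the
    corresponding fields then fall back to None or empty values.
    """

    def model_names(payload):
        models = (payload or {}).get("models") or []
        return sorted(m["name"] for m in models if "name" in m)

    installed = model_names(tags_payload)  # assumed source: /api/tags
    loaded = model_names(ps_payload)       # assumed source: /api/ps

    return {
        "ollama_version": (version_payload or {}).get("version"),
        "ollama_installed_models": installed,
        "ollama_loaded_models": loaded,
        "ollama_loaded_count": len(loaded),
    }
```

Keeping the assembly separate from the HTTP calls makes the payload shape easy to unit test without a running Ollama instance.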
+
+### Backend propagation
+
+`JobTrackerApi/Services/SummarizerService.cs`
+
+The backend metrics shape now carries those fields through to admin consumers.
+
+### Admin UI
+
+`job-tracker-ui/src/pages/AdminSystemPage.tsx`
+
+The system page now shows:
+- Ollama version
+- loaded model count
+- installed model chips
+- loaded model chips
+- benchmark summary panel
+
+## Verification used on this branch
+
+### Backend
+
+```bash
+dotnet test JobTrackerApi.Tests/JobTrackerApi.Tests.csproj --filter ProfileCvControllerTests
+dotnet test JobTrackerApi.Tests/JobTrackerApi.Tests.csproj --filter "ProfileCvControllerTests|AuthAndSystemControllerTests|JobApplicationsApplicationPackageTests"
+dotnet test JobTrackerApi.Tests/JobTrackerApi.Tests.csproj --filter CvCorpusHarnessTests
+```
+
+### Frontend
+
+```bash
+cd job-tracker-ui && CI=true npm test -- --runInBand --watch=false src/profile-page.test.tsx
+cd job-tracker-ui && CI=true npm test -- --runInBand --watch=false src/admin-system-page.test.tsx
+cd job-tracker-ui && CI=true npm test -- --runInBand --watch=false src/profile-page.test.tsx src/admin-system-page.test.tsx src/job-details-generated-drafts.test.tsx
+```
+
+### Benchmark runner
+
+```bash
+CV_BENCHMARK_OUTPUT_DIR="$(pwd)/tmp/cv-benchmarks/latest" \
+CV_BENCHMARK_APPROVED_DIR="$(pwd)/tmp/cv-benchmarks/approved" \
+./scripts/run-cv-benchmark.sh
+```
+
+### Python service tests
+
+The summarizer Python unit tests were updated for the new health payload, but they could not be run on this machine. It currently lacks `pip` / `venv` support: `python3 -m venv` fails because `python3.12-venv` is not installed. Once Python packaging is available, run:
+
+```bash
+cd tools/summarizer
+python3 -m pytest -q tests/test_app.py
+```