
Sprint2/prompt design v1 2.3 #8

Open
usha-sj wants to merge 4 commits into main from sprint2/prompt-design-v1-2.3

Conversation


@usha-sj usha-sj commented Feb 8, 2026

Feature (PromptV1): Centralized study generation prompt system SOC-2.3

🎉 New feature (extends backend architecture, non-breaking)


PR Summary

Adds a centralized, versioned prompt system for AI study generation.
The Gemini prompt is moved out of main.py into a dedicated module with strict JSON enforcement and few-shot examples to improve reliability without changing the API contract.


Overview

What feature/problem does this PR address?

  • The hardcoded prompt in main.py was hard to maintain
  • AI output inconsistencies could break JSON parsing
  • No structure existed for prompt versioning or iteration

What approach was taken?

  • Created prompts/study_gen_v1.py as the single source of truth
  • Added build_study_generation_prompt() to construct prompts (sketched below)
  • Enforced a strict JSON schema matching the API contract
  • Added few-shot examples to stabilize Gemini output
  • Wired the endpoint to use the prompt builder
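A minimal sketch of the builder's likely shape, assuming the section names (SYSTEM_PROMPT, EXAMPLES, OUTPUT_FORMAT) that appear in the review sequence diagrams below; the constants' contents here are illustrative, not the actual prompt text:

```python
# prompts/study_gen_v1.py -- illustrative sketch, not the actual module contents

PROMPT_VERSION = "v1"

SYSTEM_PROMPT = (
    "You are a study assistant. Respond with ONLY valid JSON matching the "
    "schema below. Do not include markdown fences or commentary."
)

# Frozen schema matching the API contract: {"summary": [...], "quiz": [...]}
OUTPUT_FORMAT = (
    '{"summary": ["..."], '
    '"quiz": [{"question": "...", "options": ["..."], "answer": "..."}]}'
)

EXAMPLES = "..."  # few-shot input/output pairs, elided here


def build_study_generation_prompt(user_notes: str, include_examples: bool = True) -> str:
    """Assemble the v1 prompt in the order shown in the sequence diagrams."""
    parts = [SYSTEM_PROMPT]
    if include_examples:
        parts.append(EXAMPLES)  # costs extra tokens, improves consistency
    parts.append(f"NOTES:\n{user_notes}")
    parts.append(f"OUTPUT FORMAT:\n{OUTPUT_FORMAT}")
    return "\n\n".join(parts)
```

Call sites that need to save tokens can pass include_examples=False, per the trade-off noted below.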

Important design decisions / trade-offs

  • Prompt is versioned (v1) to allow safe future upgrades
  • JSON schema is frozen to preserve frontend compatibility
  • Few-shot examples increase token usage but improve consistency (include_examples can be set to False when the builder is called from main.py)

Files Changed

| File | Action | Description |
| --- | --- | --- |
| backend/prompts/study_gen_v1.py | Created | Centralized prompt system (v1) |
| backend/main.py | Modified | Uses prompt builder instead of inline prompt |
| backend/README.md | Modified | Documents prompt architecture |

Test Cases / Edge Cases

  • Empty notes rejected by request validator
  • Invalid AI JSON returns safe 500 error
  • Markdown-wrapped AI output cleaned and parsed (see the sketch after this list)
  • Quiz quality issues logged (non-blocking)
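A minimal sketch of that cleaning step, assuming the endpoint's inline logic works roughly like this (parse_model_output is a hypothetical name; main.py does this inline):

```python
import json


def parse_model_output(raw: str) -> dict:
    """Strip markdown fences from model output, then parse it as JSON."""
    cleaned = raw.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (``` or ```json) ...
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # ... and the trailing closing fence, if present.
        if cleaned.rstrip().endswith("```"):
            cleaned = cleaned.rstrip()[:-3]
    return json.loads(cleaned)  # raises on invalid JSON; the endpoint maps this to a 500
```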

Checklist

  • Added a clear description
  • Documented edge cases
  • Updated backend documentation
  • Preserved API contract

Additional Notes

  • No frontend changes
  • No API schema changes
  • Future upgrades should create study_gen_v2.py instead of modifying v1
  • The hardcoded prompt is still in main.py just in case, but commented out for now; it should probably be deleted before merge.

Jira Ticket

Jira Ticket(s) - [SOC-2.3]

@usha-sj usha-sj requested review from Arhum2 and alextgu February 8, 2026 23:48
@usha-sj usha-sj self-assigned this Feb 8, 2026

greptile-apps bot commented Feb 8, 2026

Greptile Overview

Greptile Summary

This PR centralizes the Gemini study-generation prompt into a versioned module (backend/prompts/study_gen_v1.py) and updates the /api/v1/generate endpoint to build prompts via build_study_generation_prompt(...) (optionally with few-shot examples). It also adds a lightweight validate_quiz_quality pass that logs warnings after parsing model output, and documents the new prompt architecture in backend/README.md.

The main correctness concerns are around runtime importability and Python version compatibility: the new import in backend/main.py appears to assume prompts is a top-level module, and the new type annotations in study_gen_v1.py can be a syntax error on Python versions prior to 3.9/3.10 depending on your deployment target.

Confidence Score: 3/5

  • This PR is mergeable after fixing a couple of runtime compatibility issues.
  • Core change is straightforward (prompt moved to a module), but there are two likely runtime footguns: the prompts.* import may fail depending on how the app is launched, and list[dict] / list[str] annotations can be a hard syntax error on older Python runtimes. Once those are addressed, the change should be low risk.
  • backend/main.py; backend/prompts/study_gen_v1.py

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/README.md | Documents the new centralized prompt module and how the /api/v1/generate endpoint uses it; no code-level issues found. |
| backend/main.py | Switches endpoint to use build_study_generation_prompt and adds quiz quality warnings; current import path likely breaks when running backend as a package. |
| backend/prompts/study_gen_v1.py | Adds centralized v1 prompt builders and quiz validation; validate_quiz_quality annotations may be incompatible with Python <3.9. |

Sequence Diagram

sequenceDiagram
    autonumber
    participant Client
    participant API as backend/main.py
    participant Prompt as backend/prompts/study_gen_v1.py
    participant Gemini as Gemini Model

    Client->>API: POST /study (notes, options)
    API->>API: Validate request (reject empty notes)
    API->>Prompt: build_study_generation_prompt(notes, include_examples)
    Prompt-->>API: Prompt string (system + schema + examples)
    API->>Gemini: generate_content(prompt)
    Gemini-->>API: Model output (may be JSON / markdown-wrapped)
    API->>API: Clean output (strip markdown fences)
    API->>API: Parse JSON
    alt JSON valid
        API-->>Client: 200 Study JSON (API contract)
    else JSON invalid
        API-->>Client: 500 Safe error
    end

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 2 comments


Arhum2 commented Feb 9, 2026

@greptileai review this again and give me the updated confidence score


greptile-apps bot commented Feb 9, 2026

Greptile Overview

Greptile Summary

Refactors prompt logic from inline string in main.py into a versioned, centralized module (prompts/study_gen_v1.py) with few-shot examples and quality validation.

Key improvements:

  • Structured prompt system with system instructions, output format spec, and examples
  • build_study_generation_prompt() constructs prompts programmatically
  • validate_quiz_quality() detects common issues (duplicate options, invalid answers); a sketch follows this list
  • Comprehensive documentation in README.md with versioning strategy
  • Non-breaking API changes (preserves frontend compatibility)
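A minimal sketch of what those checks could look like, assuming only the two issue types the review names (duplicate options, answers not among the options); the real function may check more:

```python
from typing import Dict, List  # typing generics keep this Python 3.8-compatible


def validate_quiz_quality(quiz: List[Dict]) -> List[str]:
    """Return non-blocking warnings about structural quiz issues."""
    warnings = []
    for i, question in enumerate(quiz):
        options = question.get("options", [])
        if len(options) != len(set(options)):
            warnings.append(f"question {i}: duplicate options")
        if question.get("answer") not in options:
            warnings.append(f"question {i}: answer not found among options")
    return warnings
```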

Issues found:

  • Python 3.9+ type syntax (list[str], list[QuizQuestion]) in main.py breaks compatibility with Python 3.8
  • Missing __init__.py in the prompts/ directory (see the layout after this list)
  • Commented-out code should be removed before merge
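For reference, the package layout being asked for would look like this (only main.py and prompts/study_gen_v1.py are confirmed by the diff; the rest of the tree is assumed):

```text
backend/
├── main.py
└── prompts/
    ├── __init__.py      # empty file; makes prompts/ an importable package
    └── study_gen_v1.py
```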

Confidence Score: 3/5

  • Safe to merge after fixing the Python 3.8 compatibility issues in main.py.
  • The refactoring is well-structured and non-breaking, but Python 3.9+ type hints in main.py will cause syntax errors on Python 3.8 deployments. The commented-out code and missing __init__.py are minor style issues.
  • backend/main.py requires type hint fixes for Python 3.8 compatibility (lines 54, 64-65)

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/prompts/study_gen_v1.py | New centralized prompt module with versioning, few-shot examples, and validation. Minor: missing __init__.py for proper package structure. |
| backend/main.py | Integrates new prompt system. Issues: Python 3.9+ type syntax breaks 3.8 compatibility; commented-out code should be removed. |

Sequence Diagram

sequenceDiagram
    participant Client as Frontend
    participant API as /api/v1/generate
    participant Prompt as study_gen_v1.py
    participant Gemini as GeminiService
    participant Validator as validate_quiz_quality()

    Client->>API: POST /api/v1/generate<br/>{text: "user notes"}
    API->>API: Validate request.text<br/>(not empty)
    API->>Prompt: build_study_generation_prompt()<br/>(user_notes, include_examples=True)
    Prompt->>Prompt: Construct prompt with:<br/>SYSTEM_PROMPT + EXAMPLES +<br/>user_notes + OUTPUT_FORMAT
    Prompt-->>API: Complete prompt string
    API->>Gemini: call_gemini(prompt)
    Gemini-->>API: JSON response<br/>(may have markdown wrapping)
    API->>API: Clean markdown fences<br/>(remove ```json markers)
    API->>API: Parse JSON and validate<br/>(summary, quiz fields)
    API->>Validator: validate_quiz_quality(quiz_data)
    Validator-->>API: warnings list<br/>(logged, non-blocking)
    API-->>Client: GenerateResponse<br/>{summary: [], quiz: []}

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 2 comments


greptile-apps bot commented Feb 9, 2026

Additional Comments (2)

backend/main.py
list[str] and list[QuizQuestion] use Python 3.9+ syntax. For Python 3.8 compatibility, import List from typing and use List[str] and List[QuizQuestion]

    options: List[str]
    answer: str

backend/main.py
Same Python 3.9+ syntax issue

    summary: List[str]
    quiz: List[QuizQuestion]
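Combining both suggestions, a Python 3.8-compatible version of the two models would look like this (a sketch; the question field and the pydantic BaseModel parent are assumptions based on the FastAPI endpoint, not confirmed by the snippets):

```python
from typing import List

from pydantic import BaseModel


class QuizQuestion(BaseModel):
    question: str        # assumed field; only options/answer appear in the snippets
    options: List[str]   # typing.List[...] parses on Python 3.8, unlike list[...]
    answer: str


class GenerateResponse(BaseModel):
    summary: List[str]
    quiz: List[QuizQuestion]
```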


Arhum2 commented Feb 10, 2026

@greptileai review this again with the new version of main and give me the confidence score as well


greptile-apps bot commented Feb 10, 2026

Greptile Overview

Greptile Summary

This PR centralizes the Gemini study-generation prompt into backend/prompts/study_gen_v1.py and wires /api/v1/generate to use build_study_generation_prompt() instead of an inline f-string prompt. It also adds a lightweight validate_quiz_quality() pass to emit warnings about quiz output structure.

Main issues to address before merge are around integration/robustness: the new import path in backend/main.py conflicts with the documented uvicorn main:app run mode (startup import failure), the few-shot example scaffolding includes non-JSON tokens that can prompt Gemini to emit non-JSON (breaking json.loads()), and the endpoint prints raw model output (which may contain user notes) directly to stdout.

Confidence Score: 2/5

  • Not safe to merge until startup/import and output-parsing risks are addressed.
  • The PR introduces a run-context import that will break the common uvicorn main:app invocation, and the prompt’s few-shot scaffolding can cause non-JSON model output that the endpoint can’t parse. Additionally, raw model output is printed to stdout, which can leak user-provided notes into logs.
  • backend/main.py, backend/prompts/study_gen_v1.py

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/README.md | Documents the new centralized study prompt module and how it’s used by /api/v1/generate. |
| backend/main.py | Switches generate endpoint to use the prompt builder and adds quiz quality warnings; introduces an import/run-context mismatch and logs raw AI output to stdout. |
| backend/prompts/study_gen_v1.py | Adds centralized prompt builders and quiz-quality validation; few-shot examples include non-JSON scaffolding that can induce non-JSON model outputs. |

Sequence Diagram

sequenceDiagram
  autonumber
  participant Client
  participant API as FastAPI (/api/v1/generate)
  participant Prompt as study_gen_v1.build_study_generation_prompt
  participant Gemini as GeminiService.call_gemini

  Client->>API: POST /api/v1/generate {text}
  API->>Prompt: build_study_generation_prompt(user_notes, include_examples)
  Prompt-->>API: prompt string
  API->>Gemini: call_gemini(prompt)
  Gemini-->>API: response.text (string)
  API->>API: strip markdown fences + json.loads
  API->>API: validate quiz items + validate_quiz_quality
  API-->>Client: 200 {summary, quiz}
  alt Gemini returns None
    API-->>Client: 500 Failed to generate
  else JSON decode / structure error
    API-->>Client: 500 Failed to parse / Invalid format
  end

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 3 comments

Comment on lines +10 to +13
from backend.prompts.study_gen_v1 import (
    build_study_generation_prompt,
    validate_quiz_quality
)

Broken import when run in backend/

from backend.prompts.study_gen_v1 import ... will fail when starting the app from within backend/ (as documented via uvicorn main:app --reload in backend/README.md), because backend won’t be a top-level package in that execution context. This makes the server crash on startup in the common local/dev invocation; use an import that works from backend/ (e.g. from prompts.study_gen_v1 ...) or adjust the run command to uvicorn backend.main:app so the package import is valid.
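One way to satisfy both run contexts, as a sketch (the try/except fallback is an illustration, not something the review prescribes):

```python
# Works when launched as `uvicorn main:app` from backend/ (first branch)
# and as `uvicorn backend.main:app` from the repo root (fallback).
try:
    from prompts.study_gen_v1 import build_study_generation_prompt, validate_quiz_quality
except ImportError:
    from backend.prompts.study_gen_v1 import (
        build_study_generation_prompt,
        validate_quiz_quality,
    )
```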


Comment on lines +176 to +180
    # Optional: Run quality checks on the quiz
    quality_warnings = validate_quiz_quality(data.get("quiz", []))
    if quality_warnings:
        print(f"[generate] Quality warnings: {quality_warnings}")
        # Can log these or return them to the frontend in the future

Sensitive data logged to stdout

On invalid/failed responses, this endpoint prints Raw response: {response} and also prints quality_warnings unconditionally. Gemini output can contain user-provided notes verbatim, so this will leak user content into server logs. Since this PR adds additional logging paths, it should be gated/removed or switched to structured logging with redaction (and avoid printing raw model output).
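A minimal redaction-friendly alternative, as a sketch (the logger name is illustrative; response and quality_warnings are the endpoint's variables):

```python
import logging

logger = logging.getLogger("generate")

# Log counts and metadata, never raw model output or user notes.
if quality_warnings:
    logger.warning("quiz quality check produced %d warning(s)", len(quality_warnings))

# On a parse failure, record only the size of the response, not its contents.
logger.error("failed to parse model output (%d chars)", len(response or ""))
```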


Comment on lines +83 to +90
EXAMPLES = """Here are examples of correct output format:

--- EXAMPLE 1 ---

INPUT NOTES:
"Photosynthesis is the process by which plants convert sunlight into energy. It occurs in chloroplasts and requires carbon dioxide and water. The outputs are glucose and oxygen."

CORRECT OUTPUT:

Few-shot violates JSON-only rule

SYSTEM_PROMPT requires “ONLY valid JSON”, but EXAMPLES includes headers like --- EXAMPLE 1 ---, INPUT NOTES:, and CORRECT OUTPUT:. Because the final prompt contains these non-JSON tokens, the model may mirror that structure and prepend similar headers in its response, directly causing json.loads() to fail. If strict JSON output is required, keep the few-shot content itself purely JSON (or clearly isolate examples as non-output content and add an explicit anti-pattern warning).

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
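As an illustration of the pure-JSON option, the few-shot block could carry a single bare JSON object per example, with the framing folded into one instruction line (a sketch reusing the photosynthesis example above):

```python
# No headers like "--- EXAMPLE 1 ---" for the model to mirror; the example
# output itself is one bare JSON object.
EXAMPLES = (
    "The next line is one complete, correct output. Your response must be a "
    "single JSON object like it, and nothing else:\n"
    '{"summary": ["Photosynthesis converts sunlight into energy in chloroplasts.", '
    '"Inputs are carbon dioxide and water; outputs are glucose and oxygen."], '
    '"quiz": [{"question": "Where does photosynthesis occur?", '
    '"options": ["Chloroplasts", "Mitochondria", "Nucleus", "Ribosomes"], '
    '"answer": "Chloroplasts"}]}'
)
```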

