Conversation
Greptile Overview

Greptile Summary

This PR centralizes the Gemini study-generation prompt into a versioned module (`backend/prompts/study_gen_v1.py`). The main correctness concerns are around runtime importability and Python version compatibility: the new import in `backend/main.py` likely breaks depending on how the backend is launched, and the type annotations around `validate_quiz_quality` may be incompatible with Python <3.9.

Confidence Score: 3/5
| Filename | Overview |
|---|---|
| backend/README.md | Documents the new centralized prompt module and how the /api/v1/generate endpoint uses it; no code-level issues found. |
| backend/main.py | Switches endpoint to use build_study_generation_prompt and adds quiz quality warnings; current import path likely breaks when running backend as a package. |
| backend/prompts/study_gen_v1.py | Adds centralized v1 prompt builders and quiz validation; validate_quiz_quality annotations may be incompatible with Python <3.9. |
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant API as backend/main.py
    participant Prompt as backend/prompts/study_gen_v1.py
    participant Gemini as Gemini Model
    Client->>API: POST /study (notes, options)
    API->>API: Validate request (reject empty notes)
    API->>Prompt: build_study_generation_prompt(notes, include_examples)
    Prompt-->>API: Prompt string (system + schema + examples)
    API->>Gemini: generate_content(prompt)
    Gemini-->>API: Model output (may be JSON / markdown-wrapped)
    API->>API: Clean output (strip markdown fences)
    API->>API: Parse JSON
    alt JSON valid
        API-->>Client: 200 Study JSON (API contract)
    else JSON invalid
        API-->>Client: 500 Safe error
    end
```
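The clean-and-parse steps above are where this flow most often fails. A minimal sketch of that cleanup, assuming the model may wrap its JSON in markdown code fences (the helper name `clean_model_output` and the `model_output` variable are illustrative, not from the PR):

```python
import json

def clean_model_output(raw: str) -> str:
    """Strip markdown code fences the model may wrap around its JSON."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line (e.g. a line reading ```json)
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence
        if text.rstrip().endswith("```"):
            text = text.rstrip()[:-3]
    return text.strip()

try:
    data = json.loads(clean_model_output(model_output))
except json.JSONDecodeError:
    data = None  # maps to the 500 "Safe error" branch in the diagram
```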
@greptileai review this again and give me the updated confidence score
Greptile Overview

Greptile Summary

Refactors prompt logic from an inline string in `backend/main.py` into a dedicated, versioned module.

Key improvements: prompt versioning, few-shot examples, and quiz validation are centralized in one module.

Issues found: Python 3.9+ type syntax breaks 3.8 compatibility, commented-out code remains in `main.py`, and the new package is missing an `__init__.py`.

Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| backend/prompts/study_gen_v1.py | New centralized prompt module with versioning, few-shot examples, and validation. Minor: missing `__init__.py` for proper package structure |
| backend/main.py | Integrates new prompt system. Issues: Python 3.9+ type syntax breaks 3.8 compatibility, commented code should be removed |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Frontend
    participant API as /api/v1/generate
    participant Prompt as study_gen_v1.py
    participant Gemini as GeminiService
    participant Validator as validate_quiz_quality()
    Client->>API: POST /api/v1/generate<br/>{text: "user notes"}
    API->>API: Validate request.text<br/>(not empty)
    API->>Prompt: build_study_generation_prompt()<br/>(user_notes, include_examples=True)
    Prompt->>Prompt: Construct prompt with:<br/>SYSTEM_PROMPT + EXAMPLES +<br/>user_notes + OUTPUT_FORMAT
    Prompt-->>API: Complete prompt string
    API->>Gemini: call_gemini(prompt)
    Gemini-->>API: JSON response<br/>(may have markdown wrapping)
    API->>API: Clean markdown fences<br/>(remove ```json markers)
    API->>API: Parse JSON and validate<br/>(summary, quiz fields)
    API->>Validator: validate_quiz_quality(quiz_data)
    Validator-->>API: warnings list<br/>(logged, non-blocking)
    API-->>Client: GenerateResponse<br/>{summary: [], quiz: []}
```
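Pieced together from the calls in this diagram, the endpoint's use of the builder looks roughly like this (a sketch; the signatures are inferred from the diagram, not confirmed against the PR code):

```python
from prompts.study_gen_v1 import build_study_generation_prompt, validate_quiz_quality

# Build the full prompt: SYSTEM_PROMPT + EXAMPLES + user notes + OUTPUT_FORMAT
prompt = build_study_generation_prompt(
    user_notes=request.text,
    include_examples=True,  # toggles the few-shot EXAMPLES block
)
raw = call_gemini(prompt)

# ...after cleaning and parsing `raw` into `data`...
warnings = validate_quiz_quality(data.get("quiz", []))
if warnings:
    # Non-blocking: warnings are logged, not returned to the client
    print(f"[generate] Quality warnings: {warnings}")
```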
Additional Comments (2)
Prompt To Fix With AI

This is a comment left during a code review.
Path: backend/main.py
Line: 54:55
Comment:
`list[str]` and `list[QuizQuestion]` use Python 3.9+ syntax. For Python 3.8 compatibility, import `List` from typing and use `List[str]` and `List[QuizQuestion]`
```suggestion
options: List[str]
answer: str
```
How can I resolve this? If you propose a fix, please make it concise.
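Applied to both flagged models, the Python 3.8-compatible version would look like the sketch below (the `question` field is assumed for completeness; only `options`, `answer`, `summary`, and `quiz` appear in the suggestions):

```python
from typing import List

from pydantic import BaseModel

class QuizQuestion(BaseModel):
    question: str        # assumed field, not shown in the suggestions
    options: List[str]   # typing.List works on Python 3.8; bare list[...] does not
    answer: str

class GenerateResponse(BaseModel):
    summary: List[str]
    quiz: List[QuizQuestion]
```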
Prompt To Fix With AI

This is a comment left during a code review.
Path: backend/main.py
Line: 64:65
Comment:
Same Python 3.9+ syntax issue
```suggestion
summary: List[str]
quiz: List[QuizQuestion]
```
How can I resolve this? If you propose a fix, please make it concise.
@greptileai review this again with the new version of main and give me the confidence score as well
Greptile Overview

Greptile Summary

This PR centralizes the Gemini study-generation prompt into `backend/prompts/study_gen_v1.py`. Main issues to address before merge are around integration/robustness: the new import path in `backend/main.py` breaks when the app is started from within `backend/`, raw model output is logged to stdout, and the few-shot examples contain non-JSON scaffolding that can induce non-JSON model outputs.

Confidence Score: 2/5
| Filename | Overview |
|---|---|
| backend/README.md | Documents the new centralized study prompt module and how it’s used by /api/v1/generate. |
| backend/main.py | Switches generate endpoint to use the prompt builder and adds quiz quality warnings; introduces an import/run-context mismatch and logs raw AI output to stdout. |
| backend/prompts/study_gen_v1.py | Adds centralized prompt builders and quiz-quality validation; few-shot examples include non-JSON scaffolding that can induce non-JSON model outputs. |
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant API as FastAPI (/api/v1/generate)
    participant Prompt as study_gen_v1.build_study_generation_prompt
    participant Gemini as GeminiService.call_gemini
    Client->>API: POST /api/v1/generate {text}
    API->>Prompt: build_study_generation_prompt(user_notes, include_examples)
    Prompt-->>API: prompt string
    API->>Gemini: call_gemini(prompt)
    Gemini-->>API: response.text (string)
    API->>API: strip markdown fences + json.loads
    API->>API: validate quiz items + validate_quiz_quality
    API-->>Client: 200 {summary, quiz}
    alt Gemini returns None
        API-->>Client: 500 Failed to generate
    else JSON decode / structure error
        API-->>Client: 500 Failed to parse / Invalid format
    end
```
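The two failure branches in this diagram map naturally onto FastAPI's HTTPException, roughly as follows (a sketch that reuses the `clean_model_output` helper from the earlier sketch; the detail strings are illustrative):

```python
import json

from fastapi import HTTPException

raw = call_gemini(prompt)
if raw is None:
    # "Gemini returns None" branch
    raise HTTPException(status_code=500, detail="Failed to generate study content")

try:
    data = json.loads(clean_model_output(raw))  # strip fences, then parse
except json.JSONDecodeError:
    # "JSON decode / structure error" branch
    raise HTTPException(status_code=500, detail="Failed to parse model response")
```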
```python
from backend.prompts.study_gen_v1 import (
    build_study_generation_prompt,
    validate_quiz_quality
)
```
**Broken import when run in backend/**
from backend.prompts.study_gen_v1 import ... will fail when starting the app from within backend/ (as documented via uvicorn main:app --reload in backend/README.md), because backend won’t be a top-level package in that execution context. This makes the server crash on startup in the common local/dev invocation; use an import that works from backend/ (e.g. from prompts.study_gen_v1 ...) or adjust the run command to uvicorn backend.main:app so the package import is valid.
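A third option, if both launch styles must keep working, is an import fallback (a sketch; the reviewer's two suggested fixes are simpler if you control the run command):

```python
try:
    # Works when launched from the repo root: uvicorn backend.main:app
    from backend.prompts.study_gen_v1 import (
        build_study_generation_prompt,
        validate_quiz_quality,
    )
except ImportError:
    # Works when launched from within backend/: uvicorn main:app --reload
    from prompts.study_gen_v1 import (
        build_study_generation_prompt,
        validate_quiz_quality,
    )
```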
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/main.py
Line: 10:13
Comment:
**Broken import when run in backend/**
`from backend.prompts.study_gen_v1 import ...` will fail when starting the app from within `backend/` (as documented via `uvicorn main:app --reload` in `backend/README.md`), because `backend` won’t be a top-level package in that execution context. This makes the server crash on startup in the common local/dev invocation; use an import that works from `backend/` (e.g. `from prompts.study_gen_v1 ...`) or adjust the run command to `uvicorn backend.main:app` so the package import is valid.
How can I resolve this? If you propose a fix, please make it concise.

```python
# Optional: Run quality checks on the quiz
quality_warnings = validate_quiz_quality(data.get("quiz", []))
if quality_warnings:
    print(f"[generate] Quality warnings: {quality_warnings}")
    # Can log these or return them to the frontend in the future
```
**Sensitive data logged to stdout**
On invalid/failed responses, this endpoint prints Raw response: {response} and also prints quality_warnings unconditionally. Gemini output can contain user-provided notes verbatim, so this will leak user content into server logs. Since this PR adds additional logging paths, it should be gated/removed or switched to structured logging with redaction (and avoid printing raw model output).
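A minimal sketch of the gated, redacting logging this comment asks for (the logger name, preview cap, and helper names are illustrative, not from the PR):

```python
import logging

logger = logging.getLogger("generate")
PREVIEW_CHARS = 120  # illustrative cap on how much raw output ever reaches logs

def log_model_failure(raw: str) -> None:
    # Log the size plus a short truncated preview instead of the full raw output,
    # so user notes echoed back by the model don't land verbatim in server logs
    preview = raw[:PREVIEW_CHARS].replace("\n", " ")
    logger.warning("Unparseable model output (%d chars); preview: %r", len(raw), preview)

def log_quality_warnings(warnings: list) -> None:
    # Non-blocking quality warnings: keep at DEBUG so production stdout stays clean
    if warnings:
        logger.debug("Quiz quality warnings: %s", warnings)
```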
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/main.py
Line: 176:180
Comment:
**Sensitive data logged to stdout**
On invalid/failed responses, this endpoint prints `Raw response: {response}` and also prints `quality_warnings` unconditionally. Gemini output can contain user-provided notes verbatim, so this will leak user content into server logs. Since this PR adds additional logging paths, it should be gated/removed or switched to structured logging with redaction (and avoid printing raw model output).
How can I resolve this? If you propose a fix, please make it concise.| EXAMPLES = """Here are examples of correct output format: | ||
|
|
||
| --- EXAMPLE 1 --- | ||
|
|
||
| INPUT NOTES: | ||
| "Photosynthesis is the process by which plants convert sunlight into energy. It occurs in chloroplasts and requires carbon dioxide and water. The outputs are glucose and oxygen." | ||
|
|
||
| CORRECT OUTPUT: |
**Few-shot violates JSON-only rule**
SYSTEM_PROMPT requires “ONLY valid JSON”, but EXAMPLES includes headers like --- EXAMPLE 1 ---, INPUT NOTES:, and CORRECT OUTPUT:. Because the final prompt contains these non-JSON tokens, the model may mirror that structure and prepend similar headers in its response, directly causing json.loads() to fail. If strict JSON output is required, keep the few-shot content itself purely JSON (or clearly isolate examples as non-output content and add an explicit anti-pattern warning).
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
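One way to follow the suggestion is to keep each example body strictly JSON and move the scaffolding into plain instruction sentences (a sketch; the JSON keys follow the response model shown earlier, and the example content is condensed from the snippet above):

```python
EXAMPLES = """The following JSON object is an example of a correctly formatted
response for notes about photosynthesis. Respond with a single JSON object in
exactly this shape and nothing else: no headers, labels, or code fences.

{
  "summary": [
    "Photosynthesis converts sunlight into energy inside chloroplasts.",
    "Inputs are carbon dioxide and water; outputs are glucose and oxygen."
  ],
  "quiz": [
    {
      "question": "Where does photosynthesis occur?",
      "options": ["Mitochondria", "Chloroplasts", "Nucleus", "Ribosomes"],
      "answer": "Chloroplasts"
    }
  ]
}"""
```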
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/prompts/study_gen_v1.py
Line: 83:90
Comment:
**Few-shot violates JSON-only rule**
`SYSTEM_PROMPT` requires “ONLY valid JSON”, but `EXAMPLES` includes headers like `--- EXAMPLE 1 ---`, `INPUT NOTES:`, and `CORRECT OUTPUT:`. Because the final prompt contains these non-JSON tokens, the model may mirror that structure and prepend similar headers in its response, directly causing `json.loads()` to fail. If strict JSON output is required, keep the few-shot content itself purely JSON (or clearly isolate examples as non-output content and add an explicit anti-pattern warning).
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.
Feature (PromptV1): Centralized study generation prompt system SOC-2.3
🎉 New feature (Extends backend architecture, non-breaking feature)
PR Summary
Adds a centralized, versioned prompt system for AI study generation.
The Gemini prompt is moved out of `main.py` into a dedicated module with strict JSON enforcement and few-shot examples to improve reliability without changing the API contract.

Overview
What feature/problem does this PR address?
- Prompt logic previously lived inline in `main.py`, which was hard to maintain

What approach was taken?

- Added `prompts/study_gen_v1.py` as the single source of truth
- `main.py` calls `build_study_generation_prompt()` to construct prompts

Important design decisions / trade-offs
Files Changed
Test Cases / Edge Cases
Checklist
Additional Notes
- Future prompt changes should go into `study_gen_v2.py` instead of modifying v1

Jira Ticket
Jira Ticket(s) - [SOC-2.3]