fix: add revision parameter support and escape quotes in chat templates #3793
Conversation
- Fix unslothai#3544: Add `revision` parameter to `AutoConfig`, `AutoModelForCausalLM`, `AutoModelForSequenceClassification`, and `load_correct_tokenizer` calls in `FastLlamaModel.from_pretrained`. This enables loading specific model revisions/branches from the HuggingFace Hub.
- Fix unslothai#3667: Escape single quotes in system messages before substituting them into Jinja2 templates. This prevents a `TemplateSyntaxError` when system messages contain apostrophes (e.g., "user's" in Vicuna templates).

Signed-off-by: majiayu000 <1835304752@qq.com>
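The gist of the first fix can be sketched with a toy stand-in (the function and dict names below are illustrative, not Unsloth's real internals): a `revision` argument that is accepted by a wrapper but never forwarded to the underlying loader is silently dropped.

```python
# Toy sketch of the bug class the first fix addresses: `revision` is
# accepted by the wrapper's signature but never forwarded onward.
def load_config(model_name, revision=None):
    # Stand-in for AutoConfig.from_pretrained: records what it received.
    return {"model": model_name, "revision": revision}

def from_pretrained_before(model_name, revision=None):
    return load_config(model_name)  # `revision` silently dropped

def from_pretrained_after(model_name, revision=None):
    return load_config(model_name, revision=revision)  # forwarded

print(from_pretrained_before("my-model", revision="v2")["revision"])  # None
print(from_pretrained_after("my-model", revision="v2")["revision"])   # v2
```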
Summary of Changes

Hello @majiayu000, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request resolves two distinct issues to enhance the robustness and functionality of the system: it ensures that model and tokenizer loading respect the specified revision.
Code Review
This PR fixes two bugs: adding revision parameter support and escaping single quotes in chat templates. The changes for the revision parameter look correct and are well-implemented. For the chat template fix, while the escaping logic is correct, it introduces code duplication and there's a pre-existing logic bug that makes some of the new code unreachable. I've provided a suggestion to refactor the function to address these issues for better maintainability and correctness.
```diff
     if has_placeholder:
         if system_message is None:
             raise ValueError("Unsloth: You need to provide a system message for custom templates.")
-        new_template = re.sub(system_message_pattern, system_message, template)
+        # Escape single quotes to prevent Jinja2 template syntax errors
+        escaped_message = system_message.replace("'", "\\'")
+        new_template = re.sub(system_message_pattern, escaped_message, template)
         return new_template, system_message

     return template, system_message

     # For predefined templates with default system message
     message_to_use = system_message if system_message is not None else default_system_message
-    new_template = re.sub(system_message_pattern, message_to_use, template)
+    # Escape single quotes to prevent Jinja2 template syntax errors
+    escaped_message = message_to_use.replace("'", "\\'")
+    new_template = re.sub(system_message_pattern, escaped_message, template)

     return new_template, message_to_use
```
This function has some duplicated logic for escaping and substituting the system message. More importantly, there seems to be a logic bug making some code unreachable.
Dead Code Issue
When type_chat_template is None (for custom templates), default_system_message on line 2094 becomes None. This causes the function to enter the if default_system_message is None: block and return early on line 2102. Consequently, the logic for custom templates starting on line 2105 is never executed.
Duplication Issue
The logic for escaping single quotes and substituting the message is repeated for custom templates (lines 2112-2113) and predefined templates (lines 2121-2122).
To fix both issues and improve clarity, I suggest refactoring the entire _change_system_message function. Since I can't suggest a change for the whole function, I'll provide the suggested implementation here for you to apply:
```python
def _change_system_message(template: str, type_chat_template: str, system_message: str = None):
    system_message_pattern = r"\{system_message\}"

    # For custom templates
    if type_chat_template is None:
        if not re.search(system_message_pattern, template):
            return template, system_message
        if system_message is None:
            raise ValueError("Unsloth: You need to provide a system message for custom templates.")
        message_to_process = system_message
        message_to_return  = system_message

    # For predefined templates
    else:
        default_system_message = DEFAULT_SYSTEM_MESSAGE.get(f"{type_chat_template}", None)
        if default_system_message is None:
            if system_message is not None:
                logger.warning_once(
                    f"Unsloth: You tried to change the system message for {type_chat_template}, "
                    "but it doesn't have a default system message. "
                    "You need to manually add the system message in your data."
                )
            return template, system_message
        message_to_process = system_message if system_message is not None else default_system_message
        message_to_return  = message_to_process

    # Common logic for substitution
    # Escape single quotes to prevent Jinja2 template syntax errors
    escaped_message = message_to_process.replace("'", "\\'")
    new_template = re.sub(system_message_pattern, escaped_message, template)
    return new_template, message_to_return
```

This refactoring first separates the logic for custom and predefined templates, then uses a common block for the substitution, making the code cleaner and fixing the dead code issue.
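The escaping step shared by both branches can be checked in isolation with a self-contained snippet (only the substitution logic, extracted from the function above):

```python
import re

# Isolated escaping + substitution step from _change_system_message.
def substitute_system_message(template, message):
    # Escape single quotes so a {system_message} placeholder inside a
    # single-quoted Jinja2 string literal stays a valid literal.
    escaped = message.replace("'", "\\'")
    return re.sub(r"\{system_message\}", escaped, template)

out = substitute_system_message("{{ '{system_message}' }}", "user's questions")
print(out)  # {{ 'user\'s questions' }}
```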
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```python
    model_name,
    token = token,
    attn_implementation = "sdpa",
    revision = revision,
```
Thread revision into vLLM fast inference
When revision is set and fast_inference=True, AutoConfig.from_pretrained now loads that specific revision, but the fast‑inference path later calls load_vllm without any revision parameter. That means the config can come from revision X while vLLM still fetches the default (latest) weights, which can yield a config/weights mismatch or silently ignore the user’s requested revision. This mismatch is introduced by adding revision here without propagating it to the vLLM loader.
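The propagation this comment asks for amounts to threading the same argument into the fast-inference loader's kwargs. A hedged sketch (`load_vllm`'s real signature lives in Unsloth; the helper name and keys below are assumptions for illustration):

```python
# Hypothetical sketch: keep the config and the vLLM weights pinned to
# one and the same revision by forwarding it into the loader kwargs.
def build_load_vllm_kwargs(model_name, revision=None):
    kwargs = {"model_name": model_name}
    if revision is not None:
        # Without this, the config could come from `revision` while
        # vLLM silently fetches the latest (default) weights.
        kwargs["revision"] = revision
    return kwargs

kwargs = build_load_vllm_kwargs("my-model", revision="refs/pr/1")
print(kwargs["revision"])  # refs/pr/1
```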
Hey @majiayu000
- Add revision to load_vllm_kwargs in llama.py to fix config/weights mismatch
- Add revision to PEFT AutoConfig calls in loader.py (FastLanguageModel & FastModel)

Addresses reviewer feedback from @chatgpt-codex-connector and @Datta0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Hi @Datta0, thanks for the review! I've added the

I also addressed the feedback from @chatgpt-codex-connector by propagating

Regarding Gemini's suggestion to refactor

Please let me know if there's anything else you'd like me to change!
```python
# Escape single quotes to prevent Jinja2 template syntax errors
escaped_message = system_message.replace("'", "\\'")
new_template = re.sub(system_message_pattern, escaped_message, template)
```
Avoid escaping apostrophes in raw custom templates
In the custom-template branch, {system_message} is replaced directly in the raw template string. Unconditionally escaping ' here means any custom template that inserts {system_message} as literal text (e.g., SYSTEM: {system_message} outside a quoted Jinja2 string) will now render backslashes in the prompt (SYSTEM: user\'s ...). That changes the prompt content rather than just preventing Jinja2 parse errors. Consider escaping only when the placeholder is inside a quoted literal, or deferring escaping to render-time.
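The over-escaping this comment describes is easy to reproduce with the substitution in isolation (stdlib-only sketch; the helper name is illustrative):

```python
import re

def escape_and_substitute(template, message):
    # Unconditional escaping, as in the current patch.
    escaped = message.replace("'", "\\'")
    return re.sub(r"\{system_message\}", escaped, template)

# Placeholder inside a quoted Jinja2 literal: escaping is what we want.
quoted = escape_and_substitute("{{ '{system_message}' }}", "user's questions")
# Placeholder as raw prompt text: the backslash leaks into the output.
raw = escape_and_substitute("SYSTEM: {system_message}", "user's questions")
print(raw)  # SYSTEM: user\'s questions
```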
Hey @majiayu000, thanks for the changes, but I think there are still a few places where we might benefit from adding the same parameter. Like this, for example. If you can do a thorough check on
Propagate revision parameter to all from_pretrained calls in vision.py to ensure consistent version pinning for vision models. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Thanks for the review! Added revision parameter to FastBaseModel in vision.py (4 places: the function signature, and 3 from_pretrained calls).
Yeah, I think this looks fine now. If you would like to continue on this, you can explore adding the same to the load_vllm function here and its calls in llama.py and loader.py (I assume) to complete it and make this work for fast_inference as well. But that can be done as a separate PR too, if you prefer it that way.
Datta0 left a comment
LGTM
As a follow-up, maybe we can think about passing the revision parameter on to load_vllm to ensure full end-to-end integration.
Hi @Datta0, I had a try as you suggested. The code is on another computer; I'll push it later.
Summary

This PR addresses two bugs:

Fix #3544: Add `revision` parameter support

The `revision` parameter in `FastLlamaModel.from_pretrained` was defined in the function signature but never passed to the underlying HuggingFace calls. This prevented users from loading specific model revisions/branches.

Changes:
- Pass `revision` to `AutoConfig.from_pretrained`
- Pass `revision` to `AutoModelForCausalLM.from_pretrained`
- Pass `revision` to `AutoModelForSequenceClassification.from_pretrained`
- Pass `revision` to the `load_correct_tokenizer` and `_load_correct_tokenizer` functions

Fix #3667: Escape single quotes in Vicuna chat template

When using the Vicuna chat template with `get_chat_template()`, Jinja2 throws a `TemplateSyntaxError` because the default system message contains apostrophes (e.g., "user's questions") that break the single-quoted string in the template.

Changes:
- Escape single quotes (`'` → `\'`) in the `_change_system_message` function before substituting system messages into templates

Test Plan

Before the fix, rendering the Vicuna template failed with:

```
expected token 'end of print statement', got 's'
```
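The error above is Jinja2 choking on an apostrophe that terminates a single-quoted template literal early. The same failure mode can be shown with only the standard library (Python's `ast` parser stands in for Jinja2's here, purely for illustration):

```python
import ast

def parses_as_single_quoted(text):
    # Embed `text` in a single-quoted literal and try to parse it,
    # mimicking how Jinja2 parses {{ '...' }} string expressions.
    try:
        ast.parse(f"x = '{text}'")
        return True
    except SyntaxError:
        return False

raw = "Answer the user's questions."
escaped = raw.replace("'", "\\'")
print(parses_as_single_quoted(raw))      # False: apostrophe ends the literal early
print(parses_as_single_quoted(escaped))  # True: the escaped quote survives
```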