Release Stage Advantage module #6

Closed
ChonghaoSima wants to merge 4 commits into main from dev

@ChonghaoSima (Contributor)

Summary

  • Add the full Stage Advantage pipeline: GT data labeling, advantage estimator training, advantage estimation on new data, and AWBC training
  • Update README with Stage Advantage quick start, pipeline overview, and module checklist
  • Update training configs and policy files (agilex, arx) to support advantage-weighted training
  • Remove stale .vscode/settings.json
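
For reviewers less familiar with advantage-weighted training, the sketch below shows one common way to turn per-sample advantage estimates into loss weights for behavior cloning. It is not taken from this PR's code: the function names `advantage_weights` and `weighted_bc_loss`, the exponential weighting, the `temperature` parameter, and the `max_weight` clip are all illustrative assumptions, and the actual AWBC configs in `src/openpi/training/config.py` may use advantages differently.

```python
import math


def advantage_weights(advantages, temperature=1.0, max_weight=20.0):
    """Map advantages to positive weights with a clipped exponential.

    Hypothetical helper: exp(A / temperature), capped at max_weight so a
    few high-advantage samples cannot dominate the batch.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    log_cap = math.log(max_weight)
    weights = []
    for a in advantages:
        exponent = a / temperature
        # Clip before exponentiating to avoid overflow on large advantages.
        weights.append(max_weight if exponent >= log_cap else math.exp(exponent))
    return weights


def weighted_bc_loss(per_sample_losses, advantages, temperature=1.0, max_weight=20.0):
    """Weighted mean of per-sample imitation losses, normalized by total weight."""
    if len(per_sample_losses) != len(advantages):
        raise ValueError("losses and advantages must have the same length")
    weights = advantage_weights(advantages, temperature, max_weight)
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, per_sample_losses)) / total


if __name__ == "__main__":
    # Toy batch: the low-loss sample has the higher advantage, so the
    # weighted loss falls below the plain mean of the two losses.
    print(weighted_bc_loss([1.0, 3.0], [2.0, -2.0]))
```

With all advantages equal, the weights cancel and the result reduces to the ordinary mean loss, which makes this form easy to sanity-check against an unweighted baseline.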

Changes

  • stage_advantage/ — new module with annotation pipeline (gt_label.py, eval.py, evaluator.py), shell scripts (gt_labeling.sh, train_estimator.sh, eval.sh, train_awbc.sh), and READMEs
  • src/openpi/training/config.py — add AWBC and advantage estimator training configs
  • src/openpi/policies/agilex_policy.py, arx_policy.py — support advantage data loading
  • README.md — add Stage Advantage documentation and update to-do list

Test plan

  • Verify GT labeling pipeline on sample dataset
  • Verify advantage estimator training launches correctly
  • Verify AWBC training with advantage-labeled data
  • Confirm README rendering on GitHub

🤖 Generated with Claude Code

Tradewindycc and others added 4 commits February 15, 2026 02:48

Mark Train-Deploy Alignment as released, add update log entry,
check off to-do item, and replace Coming Soon placeholder with
full content (data augmentation, DAgger, inference quick start).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update stage_advantage READMEs to note that Task_A/advantage/ is
available on both HuggingFace and ModelScope dataset repos. Check off
the HuggingFace & ModelScope to-do item and add update log entry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>