Conversation
Greptile Overview

Greptile Summary

This PR adds per-agent goal-reaching radius randomization/conditioning in the Drive environment. On the C side it introduces a per-agent goal radius (agent.goal_radius) that is sampled for each active agent and appended to the ego observation.

Key issues to address before merge: the drive.ini default enables goal_radius_randomization even though the PR description says it is off by default (see the drive.ini comment below).
Confidence Score: 3/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
participant Py as Python (Drive/torch.py)
participant Bind as C binding (binding.c)
participant Ini as INI parser (env_config.h)
participant Env as C Env (Drive in drive.h)
Py->>Bind: env_init(..., ini_file=".../drive.ini", goal_radius=..., ...)
Bind->>Ini: ini_parse(ini_file, handler, &conf)
Ini-->>Bind: conf.goal_radius_randomization
Bind->>Env: env.goal_radius_randomization = conf.goal_radius_randomization
Bind->>Env: init(env)
Env->>Env: init_goal_positions()
Env->>Env: for each active agent: sample/set agent.goal_radius
loop each step
Py->>Bind: vec_step()
Bind->>Env: c_step(env)
Env->>Env: compute distance_to_goal
Env->>Env: within_distance = dist < agent.goal_radius
Env->>Env: compute_observations(): append agent.goal_radius/12
Bind-->>Py: observations updated (ego_dim includes goal radius)
end
```
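A minimal C sketch of the per-agent goal-radius path the diagram describes. The struct layout, helper names (init_goal_radii, rndf, within_goal), and the 2-12 m range are assumptions pieced together from the diagram and the drive.ini comment, not the actual drive.h code:

```c
#include <stdlib.h>

#define GOAL_RADIUS_MIN 2.0f   /* per the config comment: random 2-12 m per agent */
#define GOAL_RADIUS_MAX 12.0f

typedef struct {
    float goal_radius;
    float distance_to_goal;
} Agent;

typedef struct {
    Agent *agents;
    int num_agents;
    int goal_radius_randomization;  /* parsed from drive.ini by the INI handler */
    float goal_radius;              /* global fallback when randomization is disabled */
} Env;

/* Uniform float in [lo, hi]; a placeholder for whatever RNG the env uses. */
static float rndf(float lo, float hi) {
    return lo + (hi - lo) * ((float)rand() / (float)RAND_MAX);
}

/* Called from init(): sample or copy each agent's goal radius. */
static void init_goal_radii(Env *env) {
    for (int i = 0; i < env->num_agents; i++) {
        env->agents[i].goal_radius = env->goal_radius_randomization
            ? rndf(GOAL_RADIUS_MIN, GOAL_RADIUS_MAX)
            : env->goal_radius;
    }
}

/* In c_step(): per-agent success check against the agent's own radius. */
static int within_goal(const Agent *a) {
    return a->distance_to_goal < a->goal_radius;
}

/* In compute_observations(): the radius is appended as goal_radius / 12,
 * so the ego observation dimension grows by one. */
static float goal_radius_obs(const Agent *a) {
    return a->goal_radius / GOAL_RADIUS_MAX;
}
```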
pufferlib/config/ocean/drive.ini
Outdated
```ini
; 0=disabled (use env goal_radius), 1=enabled (random 2-12m per agent)
goal_radius_randomization = 1
; Max target speed in m/s for the agent to maintain towards the goal
```
Default flag is enabled
PR description says goal_radius_randomization is off by default, but pufferlib/config/ocean/drive.ini sets goal_radius_randomization = 1. This will change training behavior for anyone using the default config; set the default to 0 (or update the PR description if the intent is to enable by default).
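If the documented default is the intent, the drive.ini change the comment asks for would look like this (a sketch of the suggested edit, not the shipped file):

```ini
; 0=disabled (use env goal_radius), 1=enabled (random 2-12m per agent)
goal_radius_randomization = 0
```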
Additional Comments (1)
Force-pushed from 7123438 to 2a5b484
Variable Goal Radius + Collision/Offroad Reward Conditioning
Use per-agent `reward_coefs` for goal distance checks, collision/offroad penalties, and rendering instead of global `env->` values.

Files changed: `drive.h`, `visualize.c`

Changes (see the sketch after this list):
- `drive.h`: Added `REWARD_COEF_GOAL_RADIUS`, `REWARD_COEF_COLLISION`, `REWARD_COEF_OFFROAD` to `normalize_reward_coef()` so the policy sees them as normalized [-1, 1] features
- `drive.h:c_step`: `distance_to_goal < agent->reward_coefs[REWARD_COEF_GOAL_RADIUS]` (was `env->goal_radius`)
- `drive.h:c_step`: Uses `agent->reward_coefs[REWARD_COEF_COLLISION]` (was `env->reward_vehicle_collision`)
- `drive.h:c_step`: Uses `agent->reward_coefs[REWARD_COEF_OFFROAD]` (was `env->reward_offroad_collision`)
- `drive.h:draw_agent_obs`, `drive.h:draw_scene`: Goal circles use per-agent `reward_coefs[REWARD_COEF_GOAL_RADIUS]` instead of global `env->goal_radius`
- `visualize.c`: Pass `reward_randomization` and `reward_conditioning` from config to Drive struct init
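A hedged sketch of how these per-agent checks could look inside `c_step` and `normalize_reward_coef`. The enum names match the list above; the surrounding struct fields, reward values, and normalization divisors are illustrative assumptions, not the real drive.h code:

```c
enum {
    REWARD_COEF_GOAL_RADIUS,
    REWARD_COEF_COLLISION,
    REWARD_COEF_OFFROAD,
    NUM_REWARD_COEFS
};

typedef struct {
    float reward_coefs[NUM_REWARD_COEFS];
    float distance_to_goal;
    int collided_vehicle;
    int collided_offroad;
    float reward;
} Agent;

/* Per-agent reward conditioning inside c_step(), replacing the old global
 * env->goal_radius / env->reward_vehicle_collision / env->reward_offroad_collision. */
static void apply_agent_rewards(Agent *agent) {
    if (agent->distance_to_goal < agent->reward_coefs[REWARD_COEF_GOAL_RADIUS]) {
        agent->reward += 1.0f;  /* goal-reached bonus; actual value not given in the PR text */
    }
    if (agent->collided_vehicle) {
        agent->reward += agent->reward_coefs[REWARD_COEF_COLLISION];
    }
    if (agent->collided_offroad) {
        agent->reward += agent->reward_coefs[REWARD_COEF_OFFROAD];
    }
}

/* Map each coefficient into [-1, 1] so the policy can observe it as a feature.
 * The per-coefficient scales below are assumptions for illustration. */
static float normalize_reward_coef(int idx, float value) {
    switch (idx) {
        case REWARD_COEF_GOAL_RADIUS: return value / 12.0f;  /* radius sampled in 2-12 m */
        case REWARD_COEF_COLLISION:   return value;          /* penalty assumed already in [-1, 0] */
        case REWARD_COEF_OFFROAD:     return value;
        default:                      return value;
    }
}
```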