@WenqingLan1
Contributor

This pull request adds support for NVBench-based GPU micro-benchmarks to SuperBench.

  • Integrated the NVBench submodule
  • Implemented two benchmarks
    • nvbench-sleep-kernel
    • nvbench-kernel-launch
  • Updated documentation and added example scripts

Example config:

version: v0.12
superbench:
  enable:
  # nvbench benchmarks
  - nvbench-sleep-kernel:single
  - nvbench-sleep-kernel:list
  - nvbench-sleep-kernel:range
  - nvbench-sleep-kernel:range-step
  - nvbench-kernel-launch
  var:
    default_local_mode: &default_local_mode
      modes:
      - name: local
        proc_num: 4
        prefix: CUDA_VISIBLE_DEVICES={proc_rank}
        parallel: yes
  benchmarks:
    nvbench-sleep-kernel:single:
      <<: *default_local_mode
      timeout: 300
      parameters:
        duration_us: "50"  # Single value format
        timeout: 30
    nvbench-sleep-kernel:list:
      <<: *default_local_mode
      timeout: 300
      parameters:
        duration_us: "[25,50,75]"  # List format, no spaces after commas
        timeout: 30
    nvbench-sleep-kernel:range:
      <<: *default_local_mode
      timeout: 300
      parameters:
        duration_us: "[0:5]"  # Range format
        timeout: 30
    nvbench-sleep-kernel:range-step:
      <<: *default_local_mode
      timeout: 300
      parameters:
        duration_us: "[0:50:10]"  # Range with step format
        timeout: 30
    nvbench-kernel-launch:
      <<: *default_local_mode
      timeout: 300
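
The four `duration_us` formats in the config above can be told apart with a small validator. The following is only an illustrative sketch: `classify_duration_us` is a hypothetical helper, not the parsing code in `nvbench_base.py`, and it assumes values are non-negative integers written exactly as in the example (no spaces after commas).

```python
import re

# Hypothetical patterns for the four formats shown in the example config.
# Assumption: every value is a non-negative integer with no whitespace.
_INT = r"\d+"
_FORMATS = {
    "single": re.compile(rf"{_INT}"),
    "list": re.compile(rf"\[{_INT}(?:,{_INT})*\]"),
    "range": re.compile(rf"\[{_INT}:{_INT}\]"),
    "range-step": re.compile(rf"\[{_INT}:{_INT}:{_INT}\]"),
}


def classify_duration_us(value: str) -> str:
    """Return the format name of a duration_us string, or raise ValueError."""
    for name, pattern in _FORMATS.items():
        if pattern.fullmatch(value):
            return name
    raise ValueError(f"unrecognized duration_us format: {value!r}")


if __name__ == "__main__":
    for sample in ("50", "[25,50,75]", "[0:5]", "[0:50:10]"):
        print(sample, "->", classify_duration_us(sample))
```

A value such as `"[25, 50]"` is rejected by this sketch, which mirrors the "no spaces after commas" note in the config comment.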

@WenqingLan1 WenqingLan1 requested a review from a team as a code owner October 9, 2025 23:12
@WenqingLan1 WenqingLan1 added benchmarks SuperBench Benchmarks micro-benchmarks Micro Benchmark Test for SuperBench Benchmarks labels Oct 9, 2025
codecov bot commented Oct 10, 2025

Codecov Report

❌ Patch coverage is 89.11917% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.79%. Comparing base (575859b) to head (498d551).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rbench/benchmarks/micro_benchmarks/nvbench_base.py | 80.39% | 20 Missing ⚠️ |
| ...enchmarks/micro_benchmarks/nvbench_sleep_kernel.py | 98.07% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #750      +/-   ##
==========================================
+ Coverage   85.70%   85.79%   +0.08%     
==========================================
  Files         102      105       +3     
  Lines        7703     7896     +193     
==========================================
+ Hits         6602     6774     +172     
- Misses       1101     1122      +21     
| Flag | Coverage Δ |
|---|---|
| cpu-python3.10-unit-test | 71.40% <88.94%> (+0.43%) ⬆️ |
| cpu-python3.12-unit-test | 71.40% <88.94%> (+0.43%) ⬆️ |
| cpu-python3.7-unit-test | 70.90% <89.11%> (+0.46%) ⬆️ |
| cuda-unit-test | 83.72% <88.94%> (+0.13%) ⬆️ |
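
The percentages in the report follow from the hit and line counts in the coverage diff. The sketch below reproduces them under two assumptions: that Codecov truncates (rounds down) to the displayed precision, and that the patch figure equals the change in hits over the change in lines, which holds here because the 193 added lines include the 21 uncovered ones. `coverage_pct` is a hypothetical helper name.

```python
def coverage_pct(hits: int, lines: int, precision: int = 2) -> float:
    """Coverage percentage, truncated (not rounded) to `precision` decimals."""
    scale = 10 ** precision
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (hits * 100 * scale // lines) / scale


# Counts taken from the coverage diff: base (main) versus head (#750).
base = coverage_pct(6602, 7703)
head = coverage_pct(6774, 7896)

# Patch coverage: newly hit lines over newly added lines.
patch = coverage_pct(6774 - 6602, 7896 - 7703, precision=5)

# The reported delta uses the untruncated percentages.
delta = (6774 / 7896 - 6602 / 7703) * 100

if __name__ == "__main__":
    print(f"base={base:.2f}% head={head:.2f}% patch={patch:.5f}% delta={delta:+.2f}%")
```

With these inputs the values agree with the report: 85.70% and 85.79% project coverage, 89.11917% patch coverage, and a delta of about +0.08%.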

@guoshzhao guoshzhao self-assigned this Oct 17, 2025
@polarG polarG requested a review from Copilot January 23, 2026 00:00
Contributor

Copilot AI left a comment

Pull request overview

Adds NVBench-based CUDA GPU micro-benchmarks to SuperBench, including build integration, result parsing, tests, examples, and documentation updates.

Changes:

  • Adds NVBench submodule integration and a cuda_nvbench third-party build target.
  • Introduces two new micro-benchmarks (nvbench-sleep-kernel, nvbench-kernel-launch) with parsing + unit tests.
  • Updates Docker images, docs, and CI workflow to support required tooling (notably newer CMake for NVBench).

Reviewed changes

Copilot reviewed 20 out of 23 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| third_party/nvbench | Adds NVBench as a git submodule dependency. |
| third_party/Makefile | Adds cuda_nvbench build/install target and adjusts recipe indentation. |
| tests/data/nvbench_sleep_kernel.log | Adds a sample NVBench sleep-kernel output fixture for parsing tests. |
| tests/data/nvbench_kernel_launch.log | Adds a sample NVBench kernel-launch output fixture for parsing tests. |
| tests/benchmarks/micro_benchmarks/test_nvbench_sleep_kernel.py | Adds unit tests for sleep-kernel preprocess and parsing. |
| tests/benchmarks/micro_benchmarks/test_nvbench_kernel_launch.py | Adds unit tests for kernel-launch preprocess and parsing. |
| superbench/benchmarks/micro_benchmarks/nvbench_sleep_kernel.py | Implements the NVBench sleep-kernel benchmark wrapper + output parser. |
| superbench/benchmarks/micro_benchmarks/nvbench_kernel_launch.py | Implements the NVBench kernel-launch benchmark wrapper + output parser. |
| superbench/benchmarks/micro_benchmarks/nvbench_base.py | Adds a shared NVBench benchmark base class (CLI args, parsing helpers). |
| superbench/benchmarks/micro_benchmarks/nvbench/sleep_kernel.cu | Adds NVBench CUDA benchmark implementing a sleep/busy-wait kernel. |
| superbench/benchmarks/micro_benchmarks/nvbench/kernel_launch.cu | Adds NVBench CUDA benchmark for empty-kernel launch overhead. |
| superbench/benchmarks/micro_benchmarks/nvbench/CMakeLists.txt | Adds CMake build for NVBench-based benchmark executables. |
| superbench/benchmarks/micro_benchmarks/\_\_init\_\_.py | Exports the new NVBench benchmarks from the micro-benchmarks package. |
| examples/benchmarks/nvbench_sleep_kernel.py | Adds an example runner for the sleep-kernel benchmark. |
| examples/benchmarks/nvbench_kernel_launch.py | Adds an example runner for the kernel-launch benchmark. |
| docs/user-tutorial/benchmarks/micro-benchmarks.md | Documents the new NVBench benchmarks and their metrics. |
| dockerfile/rocm5.0.x.dockerfile | Updates Intel MLC download version used in the ROCm image. |
| dockerfile/cuda13.0.dockerfile | Installs newer CMake and builds cuda_nvbench in the CUDA image. |
| dockerfile/cuda12.9.dockerfile | Installs newer CMake and builds cuda_nvbench in the CUDA image. |
| dockerfile/cuda12.8.dockerfile | Installs newer CMake and builds cuda_nvbench in the CUDA image. |
| .gitmodules | Registers the third_party/nvbench submodule. |
| .gitignore | Ignores compile_commands.json. |
| .github/workflows/codeql-analysis.yml | Upgrades CodeQL actions to v3 and adds CMake setup for the C++ job. |

DEBIAN_FRONTEND=noninteractive apt-get install -y ffmpeg libavcodec-dev libavformat-dev libavutil-dev libswresample-dev sudo
DEBIAN_FRONTEND=noninteractive apt-get install -y ffmpeg libavcodec-dev libavformat-dev libavutil-dev libswresample-dev sudo build-essential
- name: Setup CMake
uses: lukka/get-cmake@latest

Copilot AI Jan 23, 2026

Using @latest for third-party GitHub Actions is a supply-chain risk and can lead to non-reproducible CI behavior. Pin this action to a specific tagged version or commit SHA.

Suggested change
uses: lukka/get-cmake@latest
uses: lukka/get-cmake@v3.20.0
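
A check like the one this comment describes could be automated. The sketch below is a hypothetical helper, not part of this PR: it scans workflow text line by line for `uses:` entries whose ref is missing or is a floating name. The set of floating names (`latest`, `main`, `master`) is an assumption, and the regex-based scan is a simplification that does not parse YAML.

```python
import re

# Matches "uses: owner/repo@ref", optionally preceded by a list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")

# Assumption: refs that move over time and therefore are not reproducible.
FLOATING_REFS = {"latest", "main", "master"}


def find_floating_refs(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line number, action reference) for each unpinned `uses:` entry."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue
        reference = match.group(1)
        # Local actions and Docker images use a different pinning scheme.
        if reference.startswith(("./", "docker://")):
            continue
        _, _, ref = reference.partition("@")
        if not ref or ref in FLOATING_REFS:
            findings.append((lineno, reference))
    return findings


if __name__ == "__main__":
    sample = (
        "steps:\n"
        "  - name: Setup CMake\n"
        "    uses: lukka/get-cmake@latest\n"
        "  - uses: actions/checkout@v4\n"
    )
    for lineno, reference in find_floating_refs(sample):
        print(f"line {lineno}: {reference} is not pinned")
```

Tags such as `v4` pass this check even though a tag can be moved; pinning to a full commit SHA, as the comment also suggests, is the stricter option.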

&& git -C msccl checkout 87048bd && git -C msccl submodule update --recursive --init
else ifeq ($(shell echo $(CUDA_VER)">=12.8" | bc -l), 1)
# Get commit 87048bd from msscl to support updated nccl and sm_100
# Get commit 87048bd from msscl to support updated nccl and sm_100

Copilot AI Jan 23, 2026

Typo in comment: change msscl to msccl.

Suggested change
# Get commit 87048bd from msscl to support updated nccl and sm_100
# Get commit 87048bd from msccl to support updated nccl and sm_100

Copilot AI review requested due to automatic review settings February 6, 2026 00:03
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 23 changed files in this pull request and generated 2 comments.

