Skip to content

Conversation

@chhwang
Copy link
Contributor

@chhwang chhwang commented Jan 7, 2026

No description provided.

Binyang2014 and others added 24 commits December 4, 2025 19:20
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- [x] Move hash specialization and equality operator from std/global
namespace to custom namespace
- [x] Update unordered_map to use custom hash and equality as template
parameters
- [x] Add noexcept to equality operator
- [x] Verify the changes build correctly
- [x] Run code review and security checks

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for
you](https://github.com/microsoft/mscclpp/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot)
— coding agent works faster and does higher quality work when set up for
your repo.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Binyang2014 <9415966+Binyang2014@users.noreply.github.com>
Co-authored-by: Binyang Li <binyli@microsoft.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors NVLS (NVLink Sharp) connection handling to use the unified GpuIpcMem abstraction instead of directly managing CUDA multicast APIs. This simplifies the code by delegating multicast memory management to the existing GpuIpcMem infrastructure.

Key Changes

  • Replaces manual multicast handle management with GpuIpcMem and GpuIpcMemHandle abstractions
  • Removes manual buffer allocation tracking (allocatedRanges_, freeRanges_) as this is now handled internally by GpuIpcMem
  • Removes explicit synchronization barriers in connectNvlsCollective, relying instead on the blocking behavior of cuMulticastBindAddr within GpuIpcMem::mapMulticast()

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/switch_channel.cc Complete refactoring of NvlsConnection::Impl to use GpuIpcMem; updated license header; removed buffer allocation logic and synchronization barriers; simplified bindMemory implementation
python/csrc/switch_channel_py.cpp Removed Python binding for getMultiCastMinGranularity method (breaking API change)
include/mscclpp/switch_channel.hpp Removed public API methods addDevice() and getMultiCastMinGranularity() (breaking API changes)

@chhwang
Copy link
Contributor Author

chhwang commented Jan 7, 2026

/azp run mscclpp-ut

@azure-pipelines
Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@chhwang chhwang changed the base branch from chhwang/new-ipc-mem to main January 7, 2026 08:49
@chhwang
Copy link
Contributor Author

chhwang commented Jan 7, 2026

/azp run mscclpp-ut

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants