@tohtana tohtana commented Jan 22, 2026

This PR adds examples of AutoTP training, including custom partitioning patterns.

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
@tohtana tohtana requested a review from tjruwase as a code owner January 22, 2026 00:13
tohtana added a commit to deepspeedai/DeepSpeed that referenced this pull request Jan 31, 2026
This PR introduces a flexible, configuration-driven API for AutoTP
(Automatic Tensor Parallelism) that allows users to define custom layer
partitioning patterns for training.
@inkcherry @delock 

## Motivation

Previously, AutoTP relied on hardcoded layer detection logic that was
difficult to customize for new model architectures. This PR enables:

1. **Custom models**: Users can define exact regex patterns to match
their model's parameter names
2. **Fused layers**: Support for fused QKV, gate_up_proj, and other
packed weight matrices with unequal sub-parameter sizes (e.g., GQA with
different Q/K/V dimensions)
3. **Extensibility**: Easy to add new model presets or customize
existing ones

Here is an example of a config including custom partitioning patterns:

```json
{
    "tensor_parallel": {
        "autotp_size": 4,
        "partition_config": {
            "use_default_specs": false,
            "layer_specs": [
                {
                    "patterns": [".*\\.o_proj\\.weight$", ".*\\.down_proj\\.weight$"],
                    "partition_type": "row"
                },
                {
                    "patterns": [".*\\.[qkv]_proj\\.weight$"],
                    "partition_type": "column"
                },
                {
                    "patterns": [".*\\.gate_up_proj\\.weight$"],
                    "partition_type": "column",
                    "shape": [2, -1],
                    "partition_dim": 0
                }
            ]
        }
    }
}
```
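As a quick sanity check on how such patterns resolve against parameter names, here is a minimal sketch using Python's `re`. The `match_spec` helper and the first-match-wins rule are illustrative assumptions for this example, not part of DeepSpeed's API:

```python
import re

# Layer specs mirroring the JSON config above (patterns -> partition_type).
layer_specs = [
    {"patterns": [r".*\.o_proj\.weight$", r".*\.down_proj\.weight$"],
     "partition_type": "row"},
    {"patterns": [r".*\.[qkv]_proj\.weight$"],
     "partition_type": "column"},
    {"patterns": [r".*\.gate_up_proj\.weight$"],
     "partition_type": "column"},
]

def match_spec(param_name):
    """Return the partition type of the first matching spec, else None.

    Hypothetical helper for illustration; shown here only to make the
    regex semantics of the config concrete.
    """
    for spec in layer_specs:
        if any(re.match(p, param_name) for p in spec["patterns"]):
            return spec["partition_type"]
    return None

print(match_spec("model.layers.0.self_attn.q_proj.weight"))  # column
print(match_spec("model.layers.0.mlp.down_proj.weight"))     # row
print(match_spec("model.embed_tokens.weight"))               # None (not sharded)
```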

Refer to the
[document](https://github.com/tohtana/DeepSpeed/blob/tohtana/autotp_custom_patterns/docs/code-docs/source/training.rst)
for more details, including preset models and how to define partitioning
for fused models.
We also opened a new
[PR](deepspeedai/DeepSpeedExamples#998) demonstrating
the usage.
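To make the fused-layer case concrete, here is a minimal pure-Python sketch of how a fused weight with unequal sub-parameter sizes (e.g., GQA with different Q/K/V dimensions) can be split along the fused dimension: each sub-parameter is sharded independently across ranks, then the per-rank slices are concatenated. The function name and exact semantics are assumptions for illustration, not DeepSpeed's implementation:

```python
def shard_fused_rows(weight, sub_sizes, tp_size, rank):
    """Shard a fused weight (list of rows) along its fused dimension.

    Illustrative only: splits each sub-parameter (of possibly unequal size)
    evenly across tp_size ranks, then concatenates this rank's slices.
    Assumes each sub-parameter size is divisible by tp_size.
    """
    shard, start = [], 0
    for size in sub_sizes:
        block = weight[start:start + size]
        start += size
        per_rank = size // tp_size
        shard.extend(block[rank * per_rank:(rank + 1) * per_rank])
    return shard

# Toy fused QKV weight: 12 rows = Q (8 rows) + K (2 rows) + V (2 rows).
fused = list(range(12))
print(shard_fused_rows(fused, [8, 2, 2], tp_size=2, rank=0))  # [0, 1, 2, 3, 8, 10]
print(shard_fused_rows(fused, [8, 2, 2], tp_size=2, rank=1))  # [4, 5, 6, 7, 9, 11]
```

Note how a naive even split of all 12 rows would give rank 0 only Q rows; splitting per sub-parameter keeps each rank's shard a valid Q/K/V triple.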


## Simplified initialization step

AutoTP previously required calling ``set_autotp_mode(training=True)``
and ``deepspeed.tp_model_init`` before ``deepspeed.initialize``. Now all
the necessary settings can be included in the DeepSpeed config.

The traditional initialization path is still supported for backward
compatibility.
If you use both (i.e., call ``set_autotp_mode(training=True)`` and
``deepspeed.tp_model_init`` while also passing the config to
``deepspeed.initialize``), the settings are merged at initialization;
conflicting settings raise an error.

---------

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>