Develop/ozan#8

Open
ozanpali wants to merge 7 commits into iit-DLSLab:feature/moe from ozanpali:develop/ozan
@ozanpali

No description provided.

@ozanpali
Author

Ciao Giulio, the additional self._last_router_probs holds the unmasked gate probabilities, which are used to compute the load_balance_loss under sparse routing. With dense routing, the masked and unmasked probabilities are identical.
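
To illustrate the point, here is a minimal sketch in plain Python, not the code from this PR. It assumes a top-k router and a Switch-Transformer-style auxiliary loss (number of experts times the sum over experts of dispatch fraction times mean router probability); the PR does not show its actual formula, and the helper names `softmax`, `top_k_mask`, and the signature of `load_balance_loss` are hypothetical. `router_probs` plays the role of `self._last_router_probs`.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one token's router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_mask(probs, k):
    """Keep the k largest probabilities, zero the rest, renormalise (assumed masking scheme)."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    masked = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(masked)
    return [p / total for p in masked]

def load_balance_loss(router_probs, gates):
    """Assumed Switch-style loss: N * sum_i f_i * P_i.

    f_i is the fraction of (token, expert) assignments sent to expert i,
    taken from the masked gates; P_i is the mean UNMASKED router
    probability for expert i.
    """
    num_tokens = len(router_probs)
    num_experts = len(router_probs[0])
    counts = [sum(1 for g in gates if g[i] > 0.0) for i in range(num_experts)]
    total_assignments = sum(counts)
    frac = [c / total_assignments for c in counts]
    mean_prob = [sum(p[i] for p in router_probs) / num_tokens
                 for i in range(num_experts)]
    return num_experts * sum(f * m for f, m in zip(frac, mean_prob))

# Three tokens, four experts (made-up logits for illustration only).
logits = [[2.0, 1.0, 0.1, -1.0], [0.5, 2.5, 0.0, 1.0], [1.0, 0.8, 3.0, 0.2]]
router_probs = [softmax(row) for row in logits]              # unmasked
sparse_gates = [top_k_mask(p, k=2) for p in router_probs]    # sparse routing
dense_gates = [top_k_mask(p, k=4) for p in router_probs]     # dense routing

sparse_loss = load_balance_loss(router_probs, sparse_gates)
dense_loss = load_balance_loss(router_probs, dense_gates)
```

With k equal to the number of experts nothing is masked, so `dense_gates` matches `router_probs` up to floating-point rounding. With k=2 the masked gates zero out two experts per token, which is why the unmasked probabilities have to be kept separately if the loss is meant to see the full distribution.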
