vllm.model_executor.layers.fused_moe.fused_moe_router

FusedMoERouter

Bases: ABC

FusedMoERouter is an abstract base class. It declares a 'routing_method_type' property and a 'select_experts' method; the latter routes hidden states to experts based on router logits.

Source code in vllm/model_executor/layers/fused_moe/fused_moe_router.py
class FusedMoERouter(ABC):
    """
    FusedMoERouter is an abstract class that provides a 'select_experts'
    method that is used for routing hidden states based on router logits.
    """

    @property
    @abstractmethod
    def routing_method_type(self) -> RoutingMethodType:
        raise NotImplementedError

    @abstractmethod
    def select_experts(
        self,
        hidden_states: torch.Tensor,
        router_logits: torch.Tensor,
    ) -> tuple[torch.Tensor, torch.Tensor]:
        """
        Route the input hidden states to the top-k experts based on the
        router logits.

        Returns:
            (topk_weights, topk_ids)
            (tuple[torch.Tensor, torch.Tensor]):
            The weights and expert ids computation result.

            **Compatibility**: When EPLB is not enabled, the returned ids are
            equivalent to global logical ids, so should be compatible with
            plain MoE implementations without redundant experts.
        """
        raise NotImplementedError

routing_method_type abstractmethod property

routing_method_type: RoutingMethodType
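
Because both `routing_method_type` and `select_experts` are abstract, a subclass has to override both before Python will let it be instantiated. The sketch below illustrates that contract with standard-library stand-ins only: `RouterBase` and `RoutingKind` are hypothetical names that mirror the shape of `FusedMoERouter` and `RoutingMethodType`, not the vLLM classes themselves.

```python
from abc import ABC, abstractmethod
from enum import Enum


class RoutingKind(Enum):
    """Hypothetical stand-in for RoutingMethodType (assumption)."""
    SOFTMAX_TOPK = "softmax_topk"


class RouterBase(ABC):
    """Hypothetical stand-in with the same abstract members as FusedMoERouter."""

    @property
    @abstractmethod
    def routing_method_type(self) -> RoutingKind:
        raise NotImplementedError

    @abstractmethod
    def select_experts(self, hidden_states, router_logits):
        raise NotImplementedError


class IncompleteRouter(RouterBase):
    # Overrides only the property; select_experts is still abstract.
    @property
    def routing_method_type(self) -> RoutingKind:
        return RoutingKind.SOFTMAX_TOPK


class CompleteRouter(IncompleteRouter):
    # Overrides the remaining abstract method, so the class is concrete.
    def select_experts(self, hidden_states, router_logits):
        return [], []


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if abstract members block it."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here `can_instantiate(IncompleteRouter)` is false and `can_instantiate(CompleteRouter)` is true, which is the behaviour to expect from any partial implementation of an ABC.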

select_experts abstractmethod

select_experts(
    hidden_states: Tensor, router_logits: Tensor
) -> tuple[Tensor, Tensor]

Route the input hidden states to the top-k experts based on the router logits.

Returns:

Type Description
tuple[Tensor, Tensor] (topk_weights, topk_ids): the computed routing weights and the ids of the selected experts.

Compatibility: When EPLB is not enabled, the returned ids are equivalent to global logical ids, so they should be compatible with plain MoE implementations without redundant experts.
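
To make the `(topk_weights, topk_ids)` return shape concrete, here is a minimal sketch of one common routing rule: softmax over the expert dimension followed by top-k selection. This rule is an assumption chosen for illustration; the source does not say which rule any particular subclass uses. Nested Python lists stand in for `torch.Tensor` so that the sketch runs without PyTorch or vLLM, and `select_experts_sketch` and its `top_k` parameter are hypothetical names.

```python
import math


def select_experts_sketch(router_logits, top_k):
    """Return (topk_weights, topk_ids), each with one row per token.

    router_logits: list of rows, one row of per-expert logits per token.
    """
    topk_weights, topk_ids = [], []
    for row in router_logits:
        # Numerically stable softmax over the expert dimension.
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Indices of the top_k largest probabilities, highest first.
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        ids = order[:top_k]
        topk_ids.append(ids)
        # Weights are left un-renormalized in this sketch.
        topk_weights.append([probs[i] for i in ids])
    return topk_weights, topk_ids


# Two tokens routed across four experts.
logits = [[0.1, 2.0, -1.0, 1.0], [3.0, 0.0, 0.5, -2.0]]
weights, ids = select_experts_sketch(logits, top_k=2)
```

Per the compatibility note above, when EPLB is not enabled the ids returned by a real router can be read directly as global logical expert ids, in the same way the ids in this sketch index the columns of `logits`.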

Source code in vllm/model_executor/layers/fused_moe/fused_moe_router.py
@abstractmethod
def select_experts(
    self,
    hidden_states: torch.Tensor,
    router_logits: torch.Tensor,
) -> tuple[torch.Tensor, torch.Tensor]:
    """
    Route the input hidden states to the top-k experts based on the
    router logits.

    Returns:
        (topk_weights, topk_ids)
        (tuple[torch.Tensor, torch.Tensor]):
        The weights and expert ids computation result.

        **Compatibility**: When EPLB is not enabled, the returned ids are
        equivalent to global logical ids, so should be compatible with
        plain MoE implementations without redundant experts.
    """
    raise NotImplementedError