vllm.model_executor.layers.fused_moe.fused_moe_router

FusedMoERouter

Bases: ABC

FusedMoERouter is an abstract class that provides a 'select_experts' method that is used for routing hidden states based on router logits.

Source code in vllm/model_executor/layers/fused_moe/fused_moe_router.py

class FusedMoERouter(ABC):
    """
    FusedMoERouter is an abstract class that provides a 'select_experts'
    method that is used for routing hidden states based on router logits.
    """

    @property
    @abstractmethod
    def routing_method_type(self) -> RoutingMethodType:
        raise NotImplementedError

    @abstractmethod
    def select_experts(
        self,
        hidden_states: torch.Tensor,
        router_logits: torch.Tensor,
    ) -> tuple[torch.Tensor, torch.Tensor]:
        """
        Route the input hidden states to the top-k experts based on the
        router logits.

        Returns:
            (topk_weights, topk_ids)
            (tuple[torch.Tensor, torch.Tensor]):
            The weights and expert ids computation result.

            **Compatibility**: When EPLB is not enabled, the returned ids are
            equivalent to global logical ids, so should be compatible with
            plain MoE implementations without redundant experts.
        """
        raise NotImplementedError

routing_method_type `abstractmethod` `property`

routing_method_type: RoutingMethodType

select_experts `abstractmethod`

select_experts(
    hidden_states: Tensor, router_logits: Tensor
) -> tuple[Tensor, Tensor]

Route the input hidden states to the top-k experts based on the router logits.

Returns:

Type	Description
`tuple[Tensor, Tensor]`	(topk_weights, topk_ids): The weights and expert ids computation result. Compatibility: When EPLB is not enabled, the returned ids are equivalent to global logical ids, so should be compatible with plain MoE implementations without redundant experts.

Source code in vllm/model_executor/layers/fused_moe/fused_moe_router.py

@abstractmethod
def select_experts(
    self,
    hidden_states: torch.Tensor,
    router_logits: torch.Tensor,
) -> tuple[torch.Tensor, torch.Tensor]:
    """
    Route the input hidden states to the top-k experts based on the
    router logits.

    Returns:
        (topk_weights, topk_ids)
        (tuple[torch.Tensor, torch.Tensor]):
        The weights and expert ids computation result.

        **Compatibility**: When EPLB is not enabled, the returned ids are
        equivalent to global logical ids, so should be compatible with
        plain MoE implementations without redundant experts.
    """
    raise NotImplementedError
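
The contract above (router logits in, a `(topk_weights, topk_ids)` pair out) can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: `ToyRouter` and `SoftmaxTopKRouter` are hypothetical names, plain Python lists stand in for `torch.Tensor`, and softmax followed by top-k with renormalization is only one common routing scheme, not necessarily what any vLLM router does. The `routing_method_type` property is omitted because `RoutingMethodType` is not defined in this section.

```python
import math
from abc import ABC, abstractmethod


class ToyRouter(ABC):
    """Hypothetical stand-in that mirrors only the select_experts signature."""

    @abstractmethod
    def select_experts(self, hidden_states, router_logits):
        raise NotImplementedError


class SoftmaxTopKRouter(ToyRouter):
    """Hypothetical router: softmax over experts, keep top-k, renormalize."""

    def __init__(self, top_k: int):
        self.top_k = top_k

    def select_experts(self, hidden_states, router_logits):
        topk_weights, topk_ids = [], []
        for logits in router_logits:  # one row of expert logits per token
            # Numerically stable softmax over the expert dimension.
            peak = max(logits)
            exps = [math.exp(x - peak) for x in logits]
            total = sum(exps)
            probs = [e / total for e in exps]
            # Indices of the top_k most probable experts, best first.
            ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
            ids = ids[: self.top_k]
            # Renormalize the kept probabilities so they sum to 1 per token.
            kept = sum(probs[i] for i in ids)
            topk_ids.append(ids)
            topk_weights.append([probs[i] / kept for i in ids])
        return topk_weights, topk_ids


router = SoftmaxTopKRouter(top_k=2)
hidden_states = [[0.0] * 4, [0.0] * 4]  # unused by this simple router
router_logits = [[0.1, 2.0, -1.0, 1.0], [3.0, 0.0, 0.5, 2.5]]
topk_weights, topk_ids = router.select_experts(hidden_states, router_logits)
```

Here each token gets two expert ids and two weights that sum to 1. In this toy setting there are no redundant experts, so the returned ids are simply the expert indices.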
