vllm.v1.attention.backends.short_conv_attn

ShortConvAttentionBackend

Bases: AttentionBackend

Source code in vllm/v1/attention/backends/short_conv_attn.py
class ShortConvAttentionBackend(AttentionBackend):
    @staticmethod
    def get_builder_cls() -> type["ShortConvAttentionMetadataBuilder"]:
        return ShortConvAttentionMetadataBuilder

get_builder_cls staticmethod

get_builder_cls() -> type[ShortConvAttentionMetadataBuilder]
Source code in vllm/v1/attention/backends/short_conv_attn.py
@staticmethod
def get_builder_cls() -> type["ShortConvAttentionMetadataBuilder"]:
    return ShortConvAttentionMetadataBuilder

ShortConvAttentionMetadata dataclass

Bases: BaseMambaAttentionMetadata

Source code in vllm/v1/attention/backends/short_conv_attn.py
@dataclass
class ShortConvAttentionMetadata(BaseMambaAttentionMetadata):
    pass

__init__

__init__(
    num_prefills: int,
    num_prefill_tokens: int,
    num_decodes: int,
    num_decode_tokens: int,
    num_reqs: int,
    has_initial_states_p: Tensor | None,
    query_start_loc_p: Tensor | None,
    num_computed_tokens_p: Tensor | None,
    state_indices_tensor: Tensor,
    block_idx_last_scheduled_token: Tensor | None,
    block_idx_first_scheduled_token_p: Tensor | None,
    block_idx_last_computed_token: Tensor | None,
    nums_dict: dict | None = None,
    batch_ptr: Tensor | None = None,
    token_chunk_offset_ptr: Tensor | None = None,
) -> None
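
The class body above is only `pass`, yet `__init__` lists many parameters. This is standard dataclass inheritance: a dataclass subclass that declares no fields of its own gets a generated `__init__` that mirrors its base class. The sketch below illustrates the mechanism with hypothetical stand-in classes (`BaseMetadataSketch` and `ShortConvMetadataSketch` are not vLLM names, and the three fields are a small subset chosen for illustration).

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BaseMetadataSketch:
    """Hypothetical stand-in for BaseMambaAttentionMetadata (subset of fields)."""

    num_prefills: int
    num_decodes: int
    nums_dict: Optional[dict] = None


@dataclass
class ShortConvMetadataSketch(BaseMetadataSketch):
    """Declares no new fields, so the generated __init__ matches the base."""

    pass


# The subclass accepts exactly the base-class fields, in declaration order.
meta = ShortConvMetadataSketch(num_prefills=2, num_decodes=3)
field_names = [f.name for f in fields(ShortConvMetadataSketch)]
```

Keeping a distinct subclass even when it adds nothing gives the backend its own metadata type, which is useful for type annotations and `isinstance` checks.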

ShortConvAttentionMetadataBuilder

Bases: BaseMambaAttentionMetadataBuilder[ShortConvAttentionMetadata]

Source code in vllm/v1/attention/backends/short_conv_attn.py
class ShortConvAttentionMetadataBuilder(
    BaseMambaAttentionMetadataBuilder[ShortConvAttentionMetadata]
):
    metadata_cls = ShortConvAttentionMetadata
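
The three classes on this page form a lookup chain: the backend names its builder through `get_builder_cls()`, and the builder names its metadata type through `metadata_cls`. The following is a minimal sketch of that pattern using hypothetical stand-in classes, not the vLLM implementation. In particular, the `build` method and its `num_reqs` argument are assumptions made for illustration; this page does not show how the real base builder uses `metadata_cls`.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class BaseMetadataSketch:
    """Hypothetical stand-in for the base metadata dataclass."""

    num_reqs: int


M = TypeVar("M", bound=BaseMetadataSketch)


class BaseBuilderSketch(Generic[M]):
    """Hypothetical generic builder; subclasses set metadata_cls."""

    metadata_cls: type[M]

    def build(self, num_reqs: int) -> M:
        # Assumed behaviour: construct whichever metadata type the subclass named.
        return self.metadata_cls(num_reqs=num_reqs)


@dataclass
class ShortConvMetadataSketch(BaseMetadataSketch):
    pass


class ShortConvBuilderSketch(BaseBuilderSketch[ShortConvMetadataSketch]):
    metadata_cls = ShortConvMetadataSketch


class ShortConvBackendSketch:
    @staticmethod
    def get_builder_cls() -> type[ShortConvBuilderSketch]:
        return ShortConvBuilderSketch


# Resolve backend -> builder class -> builder instance -> metadata instance.
builder = ShortConvBackendSketch.get_builder_cls()()
metadata = builder.build(num_reqs=4)
```

With this arrangement the shared construction logic can live in the generic base class, and a new backend only has to declare which classes to use.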

metadata_cls class-attribute instance-attribute