vllm.lora.layers.utils
LoRAMapping dataclass
_fully_sharded_can_replace
Decorator that adds the condition that fully sharded LoRAs are in use. Intended to wrap can_replace_layer().
Source code in vllm/lora/layers/utils.py
_get_lora_device
Returns the device on which to place the LoRA tensors.
_not_fully_sharded_can_replace
Decorator that adds the condition that fully sharded LoRAs are not in use. Intended to wrap can_replace_layer().
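The two decorators on this page each wrap a can_replace_layer()-style predicate and add one extra condition to it. The sketch below illustrates that general pattern only; it is not the code in vllm/lora/layers/utils.py. The names `ToyLoRAConfig`, `require_fully_sharded`, `require_not_fully_sharded`, and the two example predicates are hypothetical, and the sketch assumes the config is passed as a `lora_config` keyword argument exposing a boolean `fully_sharded_loras` flag.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass
class ToyLoRAConfig:
    # Hypothetical stand-in for a LoRA config; only the flag used below.
    fully_sharded_loras: bool = False


def require_fully_sharded(can_replace):
    """Wrap a predicate so it also requires fully sharded LoRAs."""

    @wraps(can_replace)
    def wrapper(*args, **kwargs):
        # Assumption: the config always arrives as a keyword argument.
        sharded = kwargs["lora_config"].fully_sharded_loras
        return can_replace(*args, **kwargs) and sharded

    return wrapper


def require_not_fully_sharded(can_replace):
    """Wrap a predicate so it also requires that LoRAs are NOT fully sharded."""

    @wraps(can_replace)
    def wrapper(*args, **kwargs):
        sharded = kwargs["lora_config"].fully_sharded_loras
        return can_replace(*args, **kwargs) and not sharded

    return wrapper


@require_fully_sharded
def sharded_can_replace(source_layer, *, lora_config):
    # Toy base condition: only "linear" layers are replaceable.
    return source_layer == "linear"


@require_not_fully_sharded
def plain_can_replace(source_layer, *, lora_config):
    return source_layer == "linear"
```

With this shape, a sharded and a non-sharded layer variant can share the same base check while staying mutually exclusive: for any given config, at most one of the two wrapped predicates returns True.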
try_get_optimal_moe_lora_config
try_get_optimal_moe_lora_config(
    op_type: str,
    w1_shape: tuple[int, ...],
    w2_shape: tuple[int, ...],
    rank: int,
    top_k: int,
    dtype: str | None,
    M: int,
    block_shape: list[int] | None = None,
) -> dict[str, int | None]