InternLM2VEDecoderLayer

Bases: Module
Source code in vllm/model_executor/models/internlm2_ve.py
`attention` instance-attribute
 attention = InternLM2Attention(
    hidden_size=hidden_size,
    num_heads=num_attention_heads,
    num_kv_heads=num_key_value_heads,
    rope_theta=rope_theta,
    rope_scaling=rope_scaling,
    max_position_embeddings=max_position_embeddings,
    cache_config=cache_config,
    quant_config=quant_config,
    prefix=f"{prefix}.attention",
)
`feed_forward` instance-attribute
 feed_forward = InternLM2MLP(
    hidden_size=hidden_size,
    intermediate_size=intermediate_size,
    hidden_act=hidden_act,
    quant_config=quant_config,
    prefix=f"{prefix}.feed_forward",
)
`feed_forward_ve` instance-attribute
 feed_forward_ve = InternLM2MLP(
    hidden_size=hidden_size,
    intermediate_size=intermediate_size,
    hidden_act=hidden_act,
    quant_config=quant_config,
    prefix=f"{prefix}.feed_forward_ve",
)
 
 __init__(
    config: PretrainedConfig,
    cache_config: CacheConfig | None = None,
    quant_config: QuantizationConfig | None = None,
    prefix: str = "",
) -> None
  
 forward(
    positions: Tensor,
    hidden_states: Tensor,
    residual: Tensor | None,
    visual_token_mask: Tensor | None = None,
) -> tuple[Tensor, Tensor]
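The `visual_token_mask` argument, together with the twin `feed_forward` and `feed_forward_ve` attributes above, suggests per-token routing: positions flagged as visual tokens pass through the vision-expert MLP, while the remaining positions use the ordinary text MLP. A minimal pure-Python sketch of that routing, assuming list-of-vectors stand-ins for tensors (`route_tokens`, `text_mlp`, and `visual_mlp` are illustrative names, not vLLM API):

```python
def route_tokens(hidden_states, visual_token_mask, text_mlp, visual_mlp):
    """Apply visual_mlp where the mask is True, text_mlp elsewhere.

    hidden_states: list of per-token vectors (stand-in for a Tensor).
    visual_token_mask: list of bools, one per token; None means all text.
    """
    if visual_token_mask is None:
        # No visual tokens in this batch: every position takes the text path.
        return [text_mlp(h) for h in hidden_states]
    return [
        visual_mlp(h) if is_visual else text_mlp(h)
        for h, is_visual in zip(hidden_states, visual_token_mask)
    ]


# Toy MLPs that just scale each element, to make the routing visible.
text_mlp = lambda h: [x * 2.0 for x in h]
visual_mlp = lambda h: [x * 3.0 for x in h]

out = route_tokens([[1.0], [1.0]], [False, True], text_mlp, visual_mlp)
# Token 0 took the text path, token 1 the vision-expert path.
```

The real layer operates on batched tensors rather than Python lists, but the dispatch decision per position is the same.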
  
InternLM2VEForCausalLM

Bases: InternLM2ForCausalLM
   
InternLM2VEModel

Bases: InternLM2Model
  
__init__(*, vllm_config: VllmConfig, prefix: str = "")
 
 forward(
    input_ids: Tensor,
    positions: Tensor,
    intermediate_tensors: IntermediateTensors | None = None,
    inputs_embeds: Tensor | None = None,
    visual_token_mask: Tensor | None = None,
) -> Tensor | IntermediateTensors
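A caller typically derives `visual_token_mask` from the input ids before invoking `forward`. A hedged sketch of that preprocessing step, using a hypothetical `IMAGE_TOKEN_ID` placeholder (the real id and helper name are assumptions, not part of vLLM's API):

```python
IMAGE_TOKEN_ID = 92544  # hypothetical image-placeholder token id


def build_visual_token_mask(input_ids):
    """Return a per-token mask: True where a token is an image placeholder.

    Returns None when no visual tokens are present, so the model can skip
    the vision-expert path entirely.
    """
    mask = [tok == IMAGE_TOKEN_ID for tok in input_ids]
    return mask if any(mask) else None


build_visual_token_mask([1, 92544, 5])  # [False, True, False]
build_visual_token_mask([1, 2])         # None
```

Passing `None` rather than an all-False mask lets `forward` fall back to the plain text path without any per-token branching.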