fix: align RotaryEmbedding and _init_weights with Qwen2Moe for transformers compat

#2
by kashif HF Staff - opened
  • RotaryEmbedding: use compute_default_rope_parameters for the default rope_type (the "default" entry was removed from ROPE_INIT_FUNCTIONS in newer transformers)
  • _init_weights: call super()._init_weights() for proper weight loading in transformers v5
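
For reviewers, a rough sketch of the two patterns described above. It deliberately does not import transformers: `ROPE_INIT_FUNCTIONS` here is a local stand-in dict, the config is a plain dict, and `_linear_rope`, `BasePreTrainedModel`, and `MoePreTrainedModel` are hypothetical names used only for illustration. The actual diff may differ in detail.

```python
# Stand-ins only: nothing here is the real transformers API.

def _inv_freq(base, dim):
    # Standard RoPE inverse frequencies: base^(-2i/dim) for i = 0 .. dim/2 - 1.
    return [base ** (-(2 * i) / dim) for i in range(dim // 2)]


def _linear_rope(config):
    # Hypothetical non-default entry: divide each frequency by a scaling factor.
    freqs = _inv_freq(config["rope_theta"], config["head_dim"])
    return [f / config["factor"] for f in freqs], 1.0


# Mimics a registry that no longer carries a "default" key.
ROPE_INIT_FUNCTIONS = {"linear": _linear_rope}


class RotaryEmbedding:
    def __init__(self, config):
        self.rope_type = config.get("rope_type", "default")
        # Start from the class's own helper; only consult the registry
        # for non-default rope types.
        rope_init_fn = self.compute_default_rope_parameters
        if self.rope_type != "default":
            rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type]
        self.inv_freq, self.attention_scaling = rope_init_fn(config)

    @staticmethod
    def compute_default_rope_parameters(config):
        # Unscaled frequencies with an attention scaling factor of 1.0.
        return _inv_freq(config["rope_theta"], config["head_dim"]), 1.0


class BasePreTrainedModel:
    def _init_weights(self, module):
        # Stand-in for the generic initialization done by the parent class.
        module["initialized_by"] = ["base"]


class MoePreTrainedModel(BasePreTrainedModel):
    def _init_weights(self, module):
        # Run the parent's initialization first, then add model-specific steps.
        super()._init_weights(module)
        if module.get("kind") == "router":
            module["initialized_by"].append("moe")
```

With this shape, a config that omits `rope_type` never touches the registry, so a missing "default" key cannot raise a `KeyError`, and the subclass keeps whatever bookkeeping the parent `_init_weights` performs.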
Ready to merge
