Accelerating MoE Model Inference with Expert Sharding