deepseek-ai/DeepGEMM
A unified CUDA kernel library for LLMs that optimizes FP8, FP4, and MoE matrix operations.
Total stars: 6,527 · Added today: +31
Based on the 2026-04-18 ranking, 1 trending CUDA repository is listed.