deepseek-ai/DeepGEMM
A unified CUDA kernel library for LLMs that optimizes FP8, FP4, and MoE matrix operations.
Total Stars: 6,527 · Today: +31
Based on the 2026-04-18 board, 1 popular CUDA repository is included.