Problems

Title                                 Difficulty
SwiGLU Activation                     medium
Flash Attention Score Computation     hard
Grouped-Query Attention               hard
Multi-Query Attention                 medium
Top-K Gating                          medium
ALiBi Position Bias                   medium
Relative Position Encoding            medium
Nucleus (Top-P) Sampling              medium
Focal Loss                            medium
Speculative Decoding Accept/Reject    hard
Mixture of Experts Routing            hard
Prefix Tuning                         hard
Temperature Scaling                   easy
GELU Activation                       easy
Contrastive Loss (InfoNCE)            medium
Depthwise Separable Convolution       hard
Rotary Position Embeddings            hard
LoRA Update                           medium
Cross Attention                       medium
RMSNorm                               easy
Label Smoothing                       easy
Sliding Window Attention              medium
KV Cache for Autoregressive Decoding  hard
Stochastic Depth                      medium
Squeeze-and-Excitation Block          medium