| Title | Difficulty | Category | Tags |
|---|---|---|---|
| Implement Softmax | easy | primitives | activation, basics, numerical-stability |
| Implement ReLU | easy | primitives | activation, basics |
| Implement Sigmoid | easy | primitives | activation, basics |
| Implement Tanh | easy | primitives | activation, basics |
| Implement Leaky ReLU | easy | primitives | activation, basics |
| Implement Mean Squared Error | easy | primitives | loss, basics, regression |
| Implement Binary Cross-Entropy Loss | medium | primitives | loss, classification, binary |
| Implement Momentum Update | medium | primitives | optimization, momentum, sgd |
| Implement Scaled Dot-Product Attention | hard | primitives | attention, transformer, self-attention |
| Element-wise Operations | easy | framework | elementwise, arithmetic, torch.pow, jnp.power |
| Apply Along an Axis | medium | framework | normalization, axis-operations, keepdim, broadcasting |
| Implement Cross-Entropy Loss | medium | primitives | loss, classification, multi-class |
| Implement Adam Optimizer Step | hard | primitives | optimization, adam, advanced |
| Create a Tensor from a List | easy | framework | tensor-creation, torch.tensor, jnp.array |
| Broadcasting Addition | easy | framework | broadcasting, addition, numpy-style |
| Vectorize with vmap | medium | framework | vmap, vectorization, jax.vmap, torch.vmap, dot-product |
| Implement Linear Layer | medium | primitives | layer, linear-algebra, neural-network |
| Implement One-Hot Encoding | easy | primitives | encoding, classification, basics |
| Implement Cosine Similarity | easy | primitives | similarity, linear-algebra, basics |
| Reshape a Tensor | easy | framework | reshape, tensor-manipulation, torch.reshape, jnp.reshape |
| Transpose a Matrix | easy | framework | transpose, linear-algebra, torch.transpose, jnp.transpose |
| JIT Compile a Function | medium | framework | jit, compilation, jax.jit, torch.compile |
| Implement Batch Normalization | medium | primitives | normalization, training, neural-network |
| Implement L2 Regularization | easy | primitives | regularization, optimization, basics |
| Implement KL Divergence | medium | primitives | information-theory, divergence, probability |
| Tensor Indexing and Slicing | easy | framework | indexing, slicing, torch.index_select, jnp.take |
| Concatenate Tensors | easy | framework | concatenate, torch.cat, jnp.concatenate |
| Implement Layer Normalization | medium | primitives | normalization, transformer, neural-network |
| Implement Dropout | medium | primitives | regularization, training, neural-network |
| Implement Max Pooling 1D | medium | primitives | pooling, cnn, basics |
| Implement Average Pooling 1D | medium | primitives | pooling, cnn, basics |
| Implement 1D Convolution | hard | primitives | convolution, cnn, signal-processing |
| Implement Embedding Lookup | easy | primitives | embedding, nlp, basics |
| Rotary Position Embeddings | hard | research | rope, rotary-embeddings, su-2021, position-encoding, transformer |
| SwiGLU Activation | medium | research | swiglu, glu, shazeer-2020, activation, llm |
| Implement Gradient Descent Step | easy | primitives | optimization, gradient-descent, basics |
| Implement Positional Encoding | hard | primitives | transformer, positional-encoding, attention |
| Matrix Multiplication | easy | framework | matmul, linear-algebra, torch.matmul, jnp.matmul |
| Stack Tensors | easy | framework | stack, torch.stack, jnp.stack |
| Masked Fill | medium | framework | masking, torch.where, jnp.where, masked_fill |
| Compute Gradient | medium | framework | autograd, gradient, torch.autograd, jax.grad |
| Top-K Values | medium | framework | topk, sorting, torch.topk, jax.lax.top_k |
| Batched Matrix Multiply | medium | framework | bmm, batched-matmul, torch.bmm, jnp.matmul |
| Mixed Precision Forward Pass | hard | framework | mixed-precision, float16, torch.half, jnp.float16, performance |
| Polynomial Regression | medium | end_to_end | polynomial, regression, feature-engineering |
| Gather Elements | medium | framework | gather, indexing, torch.gather, jnp.take_along_axis |
| Compute Jacobian | hard | framework | jacobian, autograd, torch.autograd.functional.jacobian, jax.jacobian |
| Two-Layer MLP | easy | end_to_end | mlp, feedforward, relu, neural-network |
| Training Loop | easy | end_to_end | training-loop, sgd, gradient-descent, linear-regression |
| Word Embedding Model | medium | end_to_end | embedding, nlp, average-pooling, lookup |
| Cumulative Sum | easy | framework | cumsum, reduction, torch.cumsum, jnp.cumsum |
| Scatter Add | medium | framework | scatter, scatter_add, torch.scatter_add, jnp.at |
| Custom Activation with Gradient | hard | framework | custom-activation, silu, swish, autograd, jax.grad |
| Binary Classifier | easy | end_to_end | binary-classification, sigmoid, bce-loss, logistic-regression |
| Mini-Batch Training | medium | end_to_end | mini-batch, sgd, training-loop, batching |
| Parallel Map with vmap | hard | framework | vmap, per-sample-gradient, jax.vmap, jax.grad, torch.vmap |
| Multi-Class Classifier | easy | end_to_end | multi-class, softmax, cross-entropy, classification |
| Simple CNN | medium | end_to_end | cnn, convolution, max-pooling, fully-connected |
| Einstein Summation | medium | framework | einsum, trace, torch.einsum, jnp.einsum |
| Efficient Attention with Masking | hard | framework | attention, causal-mask, torch.triu, jnp.triu, softmax |
| Linear Regression | easy | end_to_end | linear-regression, mse, regression |
| Simple RNN Cell | medium | end_to_end | rnn, recurrent, sequence-model, tanh |
| Beam Search | hard | end_to_end | beam-search, decoding, sequence-generation, search |
| Depthwise Separable Convolution | hard | research | depthwise-separable, xception, chollet-2017, efficient-conv, cnn |
| LSTM Cell | medium | end_to_end | lstm, recurrent, gates, sequence-model |
| Self-Attention Layer | hard | end_to_end | self-attention, transformer, softmax, scaled-dot-product |
| Skip Connection Block | medium | end_to_end | residual, skip-connection, resnet, deep-learning |
| Gradient Clipping | easy | end_to_end | gradient-clipping, optimization, training, stability |
| GRU Cell | medium | end_to_end | gru, recurrent, gates, sequence-model |
| Transformer Encoder Block | hard | end_to_end | transformer, encoder, self-attention, layer-norm, ffn |
| Feature Normalization Pipeline | easy | end_to_end | normalization, preprocessing, feature-engineering, pipeline |
| Multi-Query Attention | medium | research | multi-query-attention, mqa, shazeer-2019, attention, transformer |
| Autoencoder | medium | end_to_end | autoencoder, encoder-decoder, reconstruction, unsupervised |
| Transformer Decoder Block | hard | end_to_end | transformer, decoder, causal-attention, cross-attention |
| Data Augmentation | medium | end_to_end | data-augmentation, preprocessing, transforms, pipeline |
| Grouped-Query Attention | hard | research | grouped-query-attention, gqa, ainslie-2023, attention, transformer |
| Sequence Classifier | medium | end_to_end | sequence-classification, rnn, embedding, nlp |
| Simple GAN Generator | hard | end_to_end | gan, generator, generative-model, adversarial |
| Weight Initialization | easy | end_to_end | initialization, xavier, he-init, kaiming |
| Sliding Window Attention | medium | research | sliding-window, longformer, beltagy-2020, local-attention, efficiency |
| Learning Rate Scheduler | medium | end_to_end | learning-rate, scheduler, cosine-annealing, training |
| Contrastive Loss (InfoNCE) | medium | research | infonce, contrastive-loss, oord-2018, self-supervised, clip |
| Squeeze-and-Excitation Block | medium | research | squeeze-excitation, se-net, hu-2018, channel-attention, cnn |
| RMSNorm | easy | research | rmsnorm, normalization, zhang-sennrich-2019, llm |
| Top-K Gating | medium | research | top-k-gating, moe, sparse-routing, gating |
| ALiBi Position Bias | medium | research | alibi, position-bias, press-2022, attention, transformer |
| Relative Position Encoding | medium | research | relative-position, shaw-2018, position-encoding, attention |
| Flash Attention Score Computation | hard | research | flash-attention, online-softmax, dao-2022, attention, efficiency |
| KV Cache for Autoregressive Decoding | hard | research | kv-cache, autoregressive, decoding, inference, transformer |
| Mixture of Experts Routing | hard | research | mixture-of-experts, moe, shazeer-2017, sparse, routing |
| Cross Attention | medium | research | cross-attention, decoder, encoder-decoder, transformer |
| Label Smoothing | easy | research | label-smoothing, szegedy-2016, regularization, classification |
| Temperature Scaling | easy | research | temperature-scaling, calibration, guo-2017, softmax |
| Prefix Tuning | hard | research | prefix-tuning, li-liang-2021, parameter-efficient, fine-tuning |
| Focal Loss | medium | research | focal-loss, lin-2017, class-imbalance, object-detection, loss |
| Nucleus (Top-P) Sampling | medium | research | nucleus-sampling, top-p, holtzman-2020, text-generation, decoding |
| LoRA Update | medium | research | lora, hu-2021, parameter-efficient, fine-tuning, low-rank |
| Stochastic Depth | medium | research | stochastic-depth, huang-2016, regularization, residual, dropout |
| Speculative Decoding Accept/Reject | hard | research | speculative-decoding, leviathan-2023, inference, acceleration, llm |
| GELU Activation | easy | research | gelu, activation, hendrycks-gimpel-2016, transformer |
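For a taste of what the entries ask for, consider the first one: Implement Softmax carries the `numerical-stability` tag because naively computing `exp(x) / exp(x).sum()` overflows for large logits. Below is a minimal sketch of the standard max-subtraction fix, written in NumPy for neutrality (the catalog's tags reference both `torch` and `jnp` equivalents); the function name and test values are illustrative, not part of any exercise spec:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    # Subtracting the per-row max prevents overflow in exp();
    # the result is unchanged since softmax(x) == softmax(x - c) for any constant c.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

# Logits this large would overflow a naive exp(x) / exp(x).sum():
logits = np.array([[1000.0, 1001.0, 1002.0]])
print(softmax(logits))  # [[0.09003057 0.24472847 0.66524096]]
```

The invariance under a constant shift follows directly from the definition: the factor `exp(-c)` appears in both numerator and denominator and cancels.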