Vectors vs Tensors

Vectors vs Tensors: Are They the Same?

Short answer: related but not identical. A vector is a special case of a tensor.

The math hierarchy

| Term | Rank | Shape example |
|---|---|---|
| Scalar | 0 | a single number |
| Vector | 1 | [d] (a 1D array) |
| Matrix | 2 | [m, n] (a 2D array) |
| Tensor | N | [d1, d2, ..., dN] (generic N-dimensional array) |

Every vector is a tensor (specifically, a rank-1 tensor). Not every tensor is a vector. ...
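The rank and shape columns of the hierarchy above can be illustrated with plain nested Python lists. This is only a sketch: `rank_and_shape` is a hypothetical helper written for this example, and it assumes the nesting is rectangular (every sub-list at a given depth has the same length).

```python
def rank_and_shape(value):
    """Return (rank, shape) for a nested-list array; a bare number has rank 0.

    Assumes rectangular nesting, so only the first element at each depth is inspected.
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return len(shape), shape


scalar = 3.0                                                   # rank 0
vector = [1.0, 2.0, 3.0]                                       # rank 1, shape [3]
matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]                  # rank 2, shape [3, 2]
tensor3 = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]    # rank 3, shape [2, 3, 4]

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor3", tensor3)]:
    rank, shape = rank_and_shape(value)
    print(f"{name}: rank={rank}, shape={shape}")
```

The vector is handled by the same function as every other case, which is the sense in which a vector is a rank-1 tensor.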

May 21, 2026 · 2 min

Multi-Layer Perceptron (MLP)

A Multi-Layer Perceptron (MLP) is one of the foundational types of artificial neural network. It learns to map inputs to outputs by passing data through a series of layers of interconnected nodes (“neurons”), adjusting internal weights during training until its predictions improve.

Background: The Single Perceptron

To understand an MLP, start with its building block, the perceptron (a single neuron). It takes several numerical inputs $x_1, x_2, \ldots, x_n$. Each input is multiplied by a learned weight $w_i$ (how important that input is). The results are summed, a bias term $b$ is added (a constant that shifts the output), and the total is passed through an activation function $f$ to produce an output.

$$\text{output} = f\!\left(\sum_{i} w_i x_i + b\right)$$

...
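The perceptron formula above can be sketched in a few lines of Python. The excerpt does not name a specific activation function, so the sigmoid used here is an assumption chosen for illustration, and the function names and example numbers are hypothetical.

```python
import math


def sigmoid(z):
    """One possible activation function f (an assumption; the text leaves f unspecified)."""
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(inputs, weights, bias, activation=sigmoid):
    """Compute f(sum_i w_i * x_i + b) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)


# Weighted sum is 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5.
output = perceptron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(output)
```

Passing a different callable as `activation` changes only the final step; the weighted sum and bias stay the same.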

May 17, 2026 · 4 min

Attention in Machine Learning

Attention is a mechanism that lets a model dynamically decide which parts of the input matter most when producing each piece of output. Instead of compressing everything into one fixed representation, the model computes a weighted combination of inputs, where the weights are computed by learned parameters and depend on context.

Intuition

When translating “the cat sat on the mat” to French, generating the word for “cat” should mostly pay attention to “cat” in the source, not “mat” or “on.” Attention makes this routing explicit and differentiable. ...
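The idea of a context-dependent weighted combination can be sketched as follows. The excerpt does not give a formula, so this example assumes the common scaled dot-product form (softmax over query-key dot products); the function names and the toy vectors are hypothetical, and a real model would learn the projections that produce them.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (weights, output): a weighted combination of values for one query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


# Toy setup: the query points in the same direction as the first key,
# so the first input should receive most of the weight.
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights, output = attend([4.0, 0.0], keys, values)
print(weights, output)
```

Because every step is a smooth function of the inputs, the weights can be trained with gradient descent, which is what makes the routing differentiable.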

May 17, 2026 · 3 min

Diffusion Models in AI

Diffusion models are a class of generative AI models that learn to create data (images, audio, video, etc.) by learning to reverse a gradual noising process.

The Core Idea

The training process has two phases:

Forward process (destroying data): Take a real image and progressively add Gaussian noise over many steps (say, 1000 steps) until it becomes pure random noise. This is fixed and requires no learning. ...
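The forward process described above can be sketched on a flat list of pixel values. The excerpt does not specify how much noise is added per step, so this example assumes a DDPM-style update, x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise, with a linear beta schedule; the function name, schedule endpoints, and seed are hypothetical choices for illustration.

```python
import math
import random


def forward_diffuse(x0, num_steps=1000, beta_start=1e-4, beta_end=0.02, seed=0):
    """Progressively add Gaussian noise to x0 (assumed DDPM-style, linear schedule).

    Nothing here is learned: the schedule is fixed in advance.
    """
    rng = random.Random(seed)
    x = list(x0)
    for t in range(num_steps):
        # Linearly interpolate the noise level from beta_start to beta_end.
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        keep = math.sqrt(1.0 - beta)
        noise_scale = math.sqrt(beta)
        x = [keep * xi + noise_scale * rng.gauss(0.0, 1.0) for xi in x]
    return x


# A constant "image" of 500 pixels, all equal to 5.0.
image = [5.0] * 500
noised = forward_diffuse(image)
mean = sum(noised) / len(noised)
variance = sum((v - mean) ** 2 for v in noised) / len(noised)
print(f"mean={mean:.3f}, variance={variance:.3f}")
```

Under these assumptions, after 1000 steps the original value of 5.0 is almost entirely gone and the result looks like standard Gaussian noise (mean near 0, variance near 1).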

April 17, 2026 · 2 min