FLOPs (Floating Point Operations)
FLOPs (floating-point operations) measure computational work by counting the floating-point additions, subtractions, multiplications, and divisions an algorithm performs. Note the convention: lowercase "FLOPs" denotes a count of operations, while "FLOPS" (or FLOPs/second) denotes a rate.
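As a quick worked example, a dot product of two length-n vectors performs n multiplications and n-1 additions, roughly 2n FLOPs. A minimal Python sketch (the function name is illustrative):

```python
def dot_product_flops(n: int) -> int:
    """FLOPs for a dot product of two length-n vectors:
    n multiplications + (n - 1) additions."""
    return n + (n - 1)  # == 2n - 1, commonly rounded to 2n

print(dot_product_flops(1024))  # 2047, roughly 2 * 1024
```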
Significance
- Performance Metric: Hardware performance is often measured in FLOPs/second
- Workload Estimation: Used to quantify computational complexity of algorithms
- Efficiency Analysis: Helps determine how effectively hardware is being utilized
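For instance, utilization can be estimated as achieved FLOPs/s divided by the hardware's theoretical peak. A minimal sketch with made-up placeholder numbers, not real device specs:

```python
def utilization(total_flops: float, seconds: float, peak_flops_per_s: float) -> float:
    """Fraction of theoretical peak achieved by a workload."""
    achieved = total_flops / seconds
    return achieved / peak_flops_per_s

# Hypothetical numbers: 1e12 FLOPs done in 0.05 s on a 100 TFLOPS device
print(f"{utilization(1e12, 0.05, 100e12):.1%}")  # 20.0%
```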
Common Precision Formats
- FP32: 32-bit floating point (single precision)
- FP64: 64-bit floating point (double precision)
- FP16: 16-bit floating point (half precision)
- BF16: 16-bit brain floating point (used in ML)
- INT8: 8-bit integer (used in quantized operations; strictly an integer format, so its throughput is often quoted in OPS/TOPS rather than FLOPS)
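The trade-off among these formats is dynamic range versus precision. A small sketch using NumPy's finfo/iinfo to inspect the formats NumPy ships natively (bfloat16 is not in vanilla NumPy; it comes from packages such as ml_dtypes or from ML frameworks):

```python
import numpy as np

# Inspect the floating-point formats NumPy provides natively
for dtype in (np.float64, np.float32, np.float16):
    info = np.finfo(dtype)
    print(f"{info.dtype}: {info.bits} bits, "
          f"max = {info.max:.3e}, ~{info.precision} decimal digits")

# INT8 is an integer format: fixed range, no exponent
i8 = np.iinfo(np.int8)
print(f"{i8.dtype}: {i8.bits} bits, range [{i8.min}, {i8.max}]")
```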
Measurement Scales
- GFLOPS: Gigaflops (10^9 FLOPs/second)
- TFLOPS: Teraflops (10^12 FLOPs/second)
- PFLOPS: Petaflops (10^15 FLOPs/second)
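Since these scales are plain powers of ten, converting a raw rate is simple arithmetic. A small formatting sketch (the function name is illustrative):

```python
def format_flops_rate(flops_per_s: float) -> str:
    """Render a FLOPs/second rate at the largest fitting scale."""
    for scale, suffix in ((1e15, "PFLOPS"), (1e12, "TFLOPS"), (1e9, "GFLOPS")):
        if flops_per_s >= scale:
            return f"{flops_per_s / scale:.2f} {suffix}"
    return f"{flops_per_s:.0f} FLOPs/s"

print(format_flops_rate(3.5e13))  # "35.00 TFLOPS"
```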
Counting Conventions
- Matrix multiplication of an M×K matrix by a K×N matrix requires approximately 2MKN FLOPs (see the sketch after this list):
  - M×N×K multiplications
  - M×N×(K-1), approximately M×N×K, additions
- Modern hardware often executes a multiply and an add as a single FMA (Fused Multiply-Add) instruction, which is conventionally counted as 2 FLOPs
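A minimal sketch of the 2MKN counting convention, alongside an actual NumPy matmul of the same shape (function and variable names are illustrative):

```python
import numpy as np

def matmul_flops(m: int, k: int, n: int) -> int:
    """2*M*K*N convention: M*N*K multiplies + ~M*N*K adds."""
    return 2 * m * k * n

M, K, N = 512, 256, 128
a = np.random.rand(M, K).astype(np.float32)
b = np.random.rand(K, N).astype(np.float32)
c = a @ b  # performs M*N*K multiplies and M*N*(K-1) adds

print(matmul_flops(M, K, N))  # 33554432 FLOPs
```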
Notes
- Peak theoretical FLOPs/s is rarely achieved in practice because most workloads are limited by memory bandwidth rather than raw compute
- Lower-precision formats deliver higher throughput (e.g., INT8 operations are typically 2-4× faster than FP32 on the same hardware)
- Mixed precision combines formats (e.g., FP16/BF16 arithmetic with FP32 accumulation) to balance performance and numerical accuracy
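One way to see the gap between peak and achieved throughput is to time a matmul and convert its 2n^3 FLOP count into GFLOPS. A rough sketch; the measured number depends on the BLAS library, thread count, and matrix size, and the peak to compare against comes from your hardware's spec sheet:

```python
import time
import numpy as np

n = 2048
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

a @ b  # warm-up so one-time BLAS setup isn't timed
start = time.perf_counter()
a @ b
elapsed = time.perf_counter() - start

achieved = 2 * n**3 / elapsed  # 2MKN with M = K = N = n
print(f"achieved = {achieved / 1e9:.1f} GFLOPS")
# Compare against your hardware's advertised peak to estimate utilization
```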