Tuesday, March 17, 2026

Neural networks: speed explained

There is no single answer—speed depends entirely on whether you are training the network or running it (inference), and on the hardware you use.

Training speed (the slow part)

Training is the iterative loop that adjusts the network's weights. It is computationally expensive because millions of examples must pass forward and backward through the network, usually many times over (each full pass is an epoch). On a CPU (central processing unit), progress is very slow; a simple model might take hours and a complex one weeks. GPUs (graphics processing units) are the standard because they excel at the parallel matrix arithmetic that dominates training. Small models, such as a modest convolutional network for image classification, typically train in minutes to hours, while large language models such as GPT may require weeks or months on clusters of hundreds or thousands of GPUs. Google's TPUs (tensor processing units) are custom‑built for neural network math and can sometimes cut training time by a factor of two or three compared to GPUs.
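
To make the loop concrete, here is a minimal sketch in PyTorch (an assumption; the post names no framework): a toy network trained on random data, with the forward pass, backward pass, and weight update spelled out, timed on whatever device is available.

```python
import time

import torch
import torch.nn as nn

# Pick the fastest available device; the training code is otherwise identical.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy model and fake data, purely to illustrate the loop structure.
model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(10_000, 100, device=device)         # fake inputs
y = torch.randint(0, 10, (10_000,), device=device)  # fake labels

start = time.perf_counter()
for epoch in range(10):          # one epoch = one full pass over the data
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass
    loss.backward()              # backward pass: compute gradients
    optimizer.step()             # adjust the weights
if device.type == "cuda":
    torch.cuda.synchronize()     # wait for queued GPU work before timing
print(f"10 epochs on {device}: {time.perf_counter() - start:.3f} s")
```

Run the same script on a CPU and a GPU and the gap described above shows up immediately, even at this toy scale.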

Inference speed (the thinking part)

Inference is how fast the trained network produces an answer, and it is usually measured in milliseconds. On a CPU, small networks can run very efficiently: object detection on a Raspberry Pi is feasible. On a GPU, inference is extremely fast; even for a huge model like GPT‑4 or Gemini, the latency to the first generated token is typically a fraction of a second, though producing a long response takes correspondingly longer. There is a direct trade‑off: larger, more accurate networks are slower. Engineers therefore use techniques such as pruning (removing weights or neurons that contribute little) and quantization (storing weights in smaller numeric formats, such as int8 instead of float32) to make networks run two to five times faster on phones or in browsers.
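
As one illustration of the quantization idea, PyTorch ships post‑training dynamic quantization (again an assumption about tooling; any framework with an int8 path works the same way): the Linear layers' float32 weights are repacked as int8, shrinking the model and typically speeding up CPU inference.

```python
import torch
import torch.nn as nn

# A float32 toy model standing in for a real trained network.
model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Dynamic quantization: store Linear weights as int8 and quantize
# activations on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 100)
with torch.no_grad():
    out = quantized(x)  # same interface as the original model
print(out.shape)        # torch.Size([1, 10])
```

The quantized model is a drop‑in replacement; the accuracy cost is usually small, which is why this is a standard first step before deploying to a phone or browser.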

Summary

A neural network for real‑time video processing must run at 30 frames per second, which leaves a budget of about 33 ms per inference. A large language model might generate 50 to 100 words per second. Yet training that same model could have consumed thousands of GPU‑hours on a supercomputer.
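
The arithmetic behind those figures is worth making explicit; a quick back‑of‑the‑envelope check:

```python
fps = 30
frame_budget_ms = 1000 / fps  # ~33.3 ms available per frame
print(f"Per-frame inference budget: {frame_budget_ms:.1f} ms")

low, high = 50, 100           # words per second, from the figures above
print(f"Time per word: {1000 / high:.0f}-{1000 / low:.0f} ms")
```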


Speed is a spectrum: from milliseconds per inference to weeks of training, shaped by hardware choices and model design.
