Summa Intelligentiae: The Mechanics of Intelligence - How Machines Learn
Quaestio Secunda
Every child learns the hard way that a stove is hot. Touch. Recoil. Remember. That loop of mistake, feedback, and correction is the essence of learning.
Machines learn the same way. Their burns are data points, not blisters. Each wrong answer becomes a small adjustment, a step toward being less wrong next time. Over time, this refinement begins to resemble intelligence.
Most systems follow instructions. Neural networks learn from errors. That’s what makes them powerful and, at times, unpredictable. So, if machine learning isn’t about writing rules but discovering them, what exactly does it mean when we say a machine has “learned”?
Insights to Expect
How neural networks actually "learn" from error and feedback
What weights, gradients, and backpropagation mean in practice
Why deep learning isn't intelligence but stacked abstraction
Why AI sometimes looks certain but isn't
How these mechanics explain AI’s unpredictability in business contexts
Early computers were rule-followers: perfectly obedient and entirely deterministic. These “expert systems” execute long chains of if-then logic, and many still run today. Faced with unstructured data, they fail fast and wait for a human to correct them.
Machine learning flipped that script. Instead of programming every decision, we started showing systems examples, allowing them to find the relationships themselves. In neural networks, models predict, measure error (the loss), and adjust internal settings to refine and improve. This isn’t cognition, it’s calibration.
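That predict, measure, adjust loop can be sketched in a few lines. The model below is purely illustrative: a single adjustable weight `w`, fit to made-up data that follows the rule y = 2x.

```python
# A minimal sketch of the predict -> measure -> adjust loop,
# fitting a single weight w so that w * x approximates y.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # invented examples: y = 2x

w = 0.0      # the model's one adjustable setting, starting wrong
lr = 0.05    # learning rate: how large each correction is

for epoch in range(200):
    for x, y in data:
        pred = w * x         # predict
        error = pred - y     # measure the miss (the loss signal)
        w -= lr * error * x  # adjust in the direction that shrinks it

print(round(w, 3))  # prints 2.0: the weight has "learned" the rule
```

Nothing here understands multiplication. The weight simply drifts toward whatever value makes the errors smallest, which is the calibration the chapter describes.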
A three-layer network of neurons. [Source: Nielsen, M.]
Just as the burn and the lingering blister remind us of the hot stove, so does a model's comparison to truth inform and adjust its future actions. This innovation in science and mathematics was the birth of systems that learned from experience, though “experience” means billions of examples and a lot of vector math.
Image generated by ChatGPT
Learning by Mistake
To understand how learning actually works, let’s step back to a time before algorithms, when discovery depended on seamanship and patience.
Imagine leading a voyage in the Age of Exploration. You set course to the west, hoping to find a sea route to India. Your map runs out beyond the horizon. The navigator can’t see the destination, but he can measure deviation from the intended heading. He reports that deviation, the sails are trimmed, the rudder is steered, and the ship’s course edges back toward the heading most likely to deliver you to your speculative destination.
Each day repeats: measure, correct, improve. Progress comes not from knowing where land is, but from knowing which direction reduces error.
How a Model Trains
The ship is the model
The sails, rigging, ballast, and rudder are the parameters (weights)—thousands, even billions of adjustable levers.
The navigator’s reading is the loss function, measuring deviation from the goal.
The course correction is the gradient—the direction that most rapidly reduces that deviation (turning from heading 273 to 270 calls for a 3-degree turn, not a 357-degree one).
The captain’s continuous adjustment is gradient descent—the act of learning through micro-corrections.
Each adjustment affects the rest. Pull one sail line, another tightens. Adjust one parameter, and others compensate. The ship sails as a cohesive unit, a symphony of actions yielding an efficient voyage. The training process is the art of balancing these tensions so the entire model trends toward lower error.
Sometimes the crew finds land. Sailing west from Spain, you may believe you have reached India. A steady heading and dry land underfoot certainly look like success.
But not all land is India. Many explorers mistook new islands for success. Models do the same. They settle in local minima, false lands that feel final. When the corrections no longer improve the heading, the model has converged. It hasn’t found truth, only equilibrium.
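Settling on a false land can be shown with a toy descent. The loss landscape below is invented for illustration: it has two valleys, and plain gradient descent starting from x = 2.0 converges into the shallower one and stays there.

```python
def loss(x):
    # An invented landscape with two valleys: a shallow one near
    # x = 0.96 and a deeper one near x = -1.03.
    return (x**2 - 1)**2 + 0.3 * x

def grad(x):
    # The slope of that landscape (its derivative).
    return 4 * x * (x**2 - 1) + 0.3

x, lr = 2.0, 0.01
for _ in range(1000):
    x -= lr * grad(x)  # always step downhill

# Descent settles near x = 0.96, a local minimum: the updates have
# converged, but the deeper valley near x = -1.03 was never found.
print(round(x, 2))
```

The corrections stop because no small step improves things, not because the best answer was reached. That is convergence to equilibrium, not truth.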
The Architecture of Thought: Layers and Backpropagation
Continuing with the westward voyage as our metaphor, a neural network is an armada of ships with a common destination. The ships are interdependent: some dependencies are loose (the movement of one ship nudging another), others critical (any given ship’s rudder angle), and each ship is responsible for a piece of the overall motion. The coordination between them is what produces the behavior we interpret as intelligence.
A neural network is built in layers:
The input layer takes in the data—pixels, words, numbers.
The hidden layers in between transform the data step by step, each building a representation more useful to the layer after it.
The output layer makes a decision—cat or dog, fraud or not, answer or silence.
Neural network representation of input, hidden, and output layers. [Source: Nielsen, M]
When a prediction misses, backpropagation sends the error backward through these layers, calculating how much each weight contributed to the mistake (Rumelhart, Hinton & Williams). Each connection updates slightly. Billions of small corrections later, the network becomes astonishingly good at the task without ever “understanding” what it’s doing.
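Backpropagation is the chain rule applied layer by layer. Here is a hand-rolled sketch on the smallest possible network (one input, one hidden unit, one output weight); the example values are made up, and real libraries automate all of this.

```python
import math

# The smallest "deep" network: x -> tanh hidden unit -> output.
x, target = 1.5, 0.0  # a single invented training example
w1, w2 = 0.8, -0.5    # one weight per layer
lr = 0.1              # learning rate

for _ in range(100):
    # Forward pass: compute the prediction.
    h = math.tanh(w1 * x)  # hidden activation
    out = w2 * h           # network output
    err = out - target     # the miss (loss is 0.5 * err**2)

    # Backward pass: the chain rule assigns blame to each weight.
    d_w2 = err * h               # how much w2 contributed to the miss
    d_h = err * w2               # error flowing backward into h
    d_w1 = d_h * (1 - h**2) * x  # through tanh's derivative, to w1

    # Update: each weight shifts slightly against its gradient.
    w2 -= lr * d_w2
    w1 -= lr * d_w1
```

After a hundred of these backward sweeps, the output sits very close to the target, yet no part of the code "knows" what it computed; blame was simply routed backward and weights nudged.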
Deep learning is the stacking of many of these layers, each learning more abstract representations than the one before: the first layer detects letters, the next words, the last sentiment (LeCun, Bengio, & Hinton).
A conceptual example of how layers can abstract the processing of an image from pixels, to features, to 'face' in a layered approach. [Source: Nielsen, M.]
Final Thoughts
A model’s confidence isn’t truth, it’s fit. Show it a blur, and it may declare "5" with conviction. It isn’t seeing a number; it’s minimizing error.
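That confident “5” comes from the softmax function, which turns a network’s raw output scores (logits) into probabilities. The logits below are invented; the point is that one slightly dominant score becomes a confident-looking probability, no matter how meaningless the input was.

```python
import math

def softmax(logits):
    # Turn raw scores into probabilities that sum to 1.
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Invented logits for digits 0-9, as if a classifier saw a blur:
# the score for "5" merely edges out the rest.
logits = [0.1, 0.3, 0.2, 0.1, 0.0, 3.5, 0.2, 0.1, 0.3, 0.2]
probs = softmax(logits)

best = max(range(10), key=lambda i: probs[i])
print(best, round(probs[best], 2))  # class 5, with probability over 0.7
```

The model never hedges with “this looks like nothing.” Whatever score is largest becomes the answer, dressed up as confidence.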
That’s the essence of machine learning: not insight, but adjustment. A model is the result of billions of minute refinements finding balance in uncertainty. For business leaders, that means confidence is statistical, not cognitive. Reliability comes from the data that shapes those corrections and from how models are tuned in response. Learning, in the end, is adjustment, not awareness.
Next, we turn to the foundation beneath every refinement: data. The fuel of machine learning and the source of everything from performance to bias, drift, and deception.
Did you enjoy this article?
Share it with a friend and don't forget to subscribe.