This is the second post in a series inspired by Andy Clark’s book “Surfing Uncertainty”. In the previous post I mentioned that an important concept in the Predictive Processing (PP) framework is the role of confidence. Confidence (in a prediction) is inevitably linked to a similar, but distinct idea: precision. In this post I will discuss both, trying to summarise and synthesise the role that precision and confidence play in the proposed brain architecture. I am doing this for a few reasons. First and foremost, much of the appeal of PP becomes evident only after integrating these concepts into the overall interpretative framework. Second, Clark does an excellent job of linking together the vast number of phenomena where precision and confidence are thought to play a crucial role, so an overview is necessary before I can enumerate them (in a follow-up post). Finally, reading the book allowed me to pinpoint what doesn’t convince me as much as I’d like. This post will thus allow me to summarise what I plan to criticise later on.
[Note: this series of posts concentrates on Clark’s book, as it proposes a comprehensive and compelling picture of (mostly human) brains as prediction engines, from perception to action. For a general introduction to the idea, see this post and the literature cited therein. As usual, I’ll try to avoid highly abstract maths, as I’d like my writing to be as accessible as possible.]
Precision and confidence: definitions.
Precision is a common concept in contexts such as measurement, signal detection and processing. Instruments that measure something (or receive/relay some signal) can never produce exact measures: across repeated occurrences of the same quantity (whatever is being measured or transmitted), the response of the device will vary slightly. To be honest, it’s more complicated than that: in discussing precision, one should also mention accuracy and how both values are needed to characterise a measurement system. As usual, Wikipedia does a good job of describing the two, allowing me to gloss over the details for now.
The point where we first encounter precision is when dealing with perception: it goes without saying that perceptions rely on sensory stimuli, and these can be captured in ways that are more or less precise. For example, eyesight can be more or less precise in different people, but for all, the precision will drastically drop when looking underwater with our naked eyes. Our vision underwater becomes heavily blurred, and I think that we can all agree to describe this situation as a marked drop of precision in the detected visual signals.
Confidence is a more slippery concept: the term itself is loaded because it presupposes an interpreter. Someone must have a given degree of confidence in something else: “confidence” itself cannot exist without an agent. I’ll come to this thorny philosophical issue (and others) in later posts. For now, we can discuss how Clark uses the concept (which is typical of PP frameworks). The general idea is that perception is an active business. Brains don’t passively receive input and then try to interpret it. In PP, brains are constantly busy trying to predict the signals that are arriving; when a prediction is successful, it will also count as a valid interpretation of the collected stimulus (one attractive feature of this architecture is that it allows us to collapse certain powerful forms of learning and the active interpretation of sensory input into one: if PP is roughly correct, they happen within the same mechanism). In mainstream PP theories, prediction happens continuously at multiple layers within the brain architecture and is organised hierarchically: different layers will be busy predicting different aspects of incoming signals.
Within this general view, the idea of multiple layers allows us to avoid positing a central interpreter that collects predictions: at any given time, each layer will be busy producing predictions for the layer below, while also receiving predictions from above. Thus, having dispensed with the dreaded homunculus (a central, human-like interpreter), the concept of confidence becomes more tractable: a given prediction is now a bundle of nervous signals, which can come encoded with some associated confidence (indicating the estimated likelihood that the prediction is correct), without having to sneak in a fully fledged interpreter. The encoded confidence can have systematic effects on the receiving layer and exert such effects in a purely mechanistic way.
Thus, we can generally expect incoming (sensory) signals to arrive along with their evaluated precision (a mix of precision and accuracy, to be fair) while the downward predictions travel with a corresponding (but distinct!) property which looks at least analogous to what we normally call confidence.
What counts, and what is proposed to explain a fantastically diverse range of phenomena (from attention to psychosis, from imagination to action), is the interplay between precision (coming up, arriving in) and confidence (going down, from the centre towards the sensory periphery). Let’s look at a general overview, which will allow us to refine the current sketch.
Interplay and conflation between precision and confidence.
In PP, any given layer receives two inputs: one arrives from the sensory periphery, the other is the prediction issued by higher-level layer(s). The general schema posits that the two inputs are compared. If the two signals match perfectly, the layer will remain silent (a sign of a successful prediction); otherwise the difference will be sent back to the higher-level layer, signalling a prediction error. What precision and confidence do, in the PP flavour generally espoused by Clark, is change the relative importance of the two inputs (within a layer) and the importance of each signal in general, across all layers. Thus, a very precise signal will, in a sense, overpower a not-so-confident prediction; a very confident prediction will in turn be able to override a not-so-precise signal. Simple, huh? Perhaps an example can help to clarify. Our eyesight is quite precise in detecting where the source of a given signal is: we can use sight to locate objects in space with very good precision. Since we are not bats, the same does not apply to our auditory abilities. We can roughly localise where a noise comes from, but can’t pinpoint exactly where. Thus, vision has high spatial precision, hearing does not.
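The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Clark’s model or any published implementation: signals are reduced to single numbers, precision and confidence are assumed to be positive numbers, and the weighting rule (the share of precision in the sum of the two) is an assumption borrowed from standard Gaussian estimation. The names `layer_step` and `LayerOutput` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LayerOutput:
    error: float     # weighted prediction error sent to the layer above
    estimate: float  # the layer's revised guess about the incoming signal


def layer_step(signal: float, prediction: float,
               precision: float, confidence: float) -> LayerOutput:
    """One hypothetical PP layer: compare the input with the prediction."""
    if precision <= 0 or confidence <= 0:
        raise ValueError("precision and confidence must be positive")
    raw_error = signal - prediction
    # Assumed rule: the incoming signal counts in proportion to its
    # precision relative to the confidence of the prediction (0 to 1).
    weight = precision / (precision + confidence)
    return LayerOutput(error=weight * raw_error,
                       estimate=prediction + weight * raw_error)


# A perfect match keeps the layer silent (zero error).
silent = layer_step(signal=1.0, prediction=1.0, precision=2.0, confidence=3.0)

# A precise signal overpowers a weakly held prediction...
precise_signal = layer_step(signal=10.0, prediction=0.0,
                            precision=9.0, confidence=1.0)
# ...while a confident prediction largely overrides an imprecise signal.
confident_prediction = layer_step(signal=10.0, prediction=0.0,
                                  precision=1.0, confidence=9.0)

print(silent, precise_signal, confident_prediction)
```

With these made-up numbers, the revised estimate lands near the signal in the first mismatch case and near the prediction in the second, which is the “overpowering” described in the text.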
When I’m slumped on the sofa watching TV, the sounds I’ll perceive will come out from the speakers; however, I’ll perceive voices as if they were coming from the images of talking people within the screen. Why? According to PP, there will be a layer in my brain that combines auditory and visual “channels”. The visual one will be producing a prediction that a given sound comes from (the image of) a given mouth, the auditory channel will suggest otherwise (sound comes from where the speakers are). Thus, combining the two is a symmetric business: it could be that a given layer (driven by vision) produces the “source of sound” prediction and sends it to a layer which receives auditory data (from below). Otherwise the reverse could be the case, and the upcoming signal is visual, while the descending prediction is informed by the auditory channel. Either way, the visual channel (when discerning location) will have high precision (if upcoming) or high confidence (when issuing a prediction), while the auditory has low precision or confidence. When the two are combined to produce the prediction error (one that applies specifically to the combination of these two channels!), the visual signal will matter more, as it’s more precise/confident. Thus, if the prediction is visual, the error signal will be somewhat suppressed, signalling that the expectation (sound should come from where the mouth is seen) is likely to be correct. Vice-versa, if the prediction comes from the auditory channel, the error signal will be enhanced (signalling that the expectation is likely to be wrong). Either way, the end result doesn’t change: because vision is spatially more accurate than hearing, the final hypothesis produced by the brain will be that the voice is coming from where the mouth is seen, and the discrepancy across the two channels will be superseded.
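To make the TV example concrete, here is a minimal sketch, assuming that locations are positions along a single axis and that each channel’s reliability is a positive number. The positions, the reliabilities and the function name `settle` are all hypothetical, and the weighting rule is the standard reliability-weighted compromise, used here as a stand-in for whatever the brain actually does.

```python
def settle(prediction: float, confidence: float,
           signal: float, precision: float) -> float:
    """Shift the prediction towards the signal by the signal's relative weight."""
    weight = precision / (precision + confidence)
    return prediction + weight * (signal - prediction)


MOUTH, SPEAKER = 0.0, 1.0    # hypothetical positions, in metres
VISION, HEARING = 50.0, 1.0  # hypothetical spatial reliabilities

# Case 1: vision issues the prediction, hearing supplies the upcoming signal.
vision_predicts = settle(prediction=MOUTH, confidence=VISION,
                         signal=SPEAKER, precision=HEARING)

# Case 2: hearing issues the prediction, vision supplies the upcoming signal.
hearing_predicts = settle(prediction=SPEAKER, confidence=HEARING,
                          signal=MOUTH, precision=VISION)

print(vision_predicts, hearing_predicts)
```

Both cases settle on (numerically) the same location, a couple of centimetres from the mouth and far from the speaker: whichever channel plays the role of the prediction, the more reliable one dominates.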
This (oversimplified) example is interesting for a number of reasons. First of all, it allows me to introduce another fundamental concept, which I’ll state for completeness’ sake (I will not explain it in this post). In PP, what we end up perceiving at the conscious level is the most successful overall hypothesis: the combination of what all the layers produced, or the one hypothesis that is best able to suppress the error signals globally (within a single brain). There is a lot to unpack about this concept, so much so that even a full book can’t hope to explore all implications (more will follow!); for now, I will need my readers to take the statement above at face value.
The second interesting point is that the description above shows a peculiar symmetry: it doesn’t matter whether auditory information is used to produce a prediction, which is then matched to what is arriving via the visual pathway (in PP, this will be itself a residual prediction error), or vice-versa. In either case, we’ll perceive the sound as if it were coming from the viewed mouth. In turn, this means that the confidence of predictions (flowing down) and the precision of sensory signals (which are, after the very first layer, always in the form of residual errors!) are always combined, and can be modelled in terms of relative weight (a higher weight means more influence). In other words, the two values matter only relative to one another; at a given layer, the effect of precision and confidence is determined by relative importance alone. That’s quantifiable in a single number, or, if you prefer, by a single, one-dimensional variable.
The third observation is that, in view of the last point, the conflation of precision and confidence espoused by Clark and most of the PP theorists (for a paradigmatic example, see Kanai et al. 2015, where precision and confidence are described as a single variable, encoded by the strength of neural signals) is justified, at least at this level of analysis. Because of how PP is supposed to work, it seems reasonable to conflate the two and sum them up in a single measure. In practical terms, the move is sensible: to describe the effects of precision and confidence on a single PP layer, all we need is a single measure of relative weight. Conceptually, it also makes sense: after the first layer, the upcoming signal (what I’ve described so far as incoming, sensory, information) is in fact a prediction error, which is in itself heavily influenced by the predictions that shaped it along the way. Thus, upcoming (incoming) signals cannot be said to encode their own precision (as they aren’t measurements any more); they de facto encode a precision-cum-confidence signal. Overall, to fully embrace the PP hypothesis we are asked to collapse the (usually) distinct concepts of precision and confidence (at least for the upcoming signal); failing to do so would count as an a priori rejection of the whole paradigm.
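The “single number” claim can be illustrated with a tiny sketch. It assumes, purely for illustration, that precision and confidence are positive numbers and that the relative weight of the upcoming signal is its share of their sum; the function names are hypothetical and nothing here comes from Clark’s book.

```python
import math


def relative_weight(precision: float, confidence: float) -> float:
    """Share of influence given to the upcoming signal (between 0 and 1)."""
    return precision / (precision + confidence)


def weight_from_ratio(ratio: float) -> float:
    """The same quantity, computed from the precision/confidence ratio alone."""
    return ratio / (1.0 + ratio)


# Scaling both values by the same factor changes nothing: only their
# relation to one another matters.
small = relative_weight(precision=2.0, confidence=6.0)
large = relative_weight(precision=20.0, confidence=60.0)

# The ratio on its own is enough to recover the weight.
from_ratio = weight_from_ratio(2.0 / 6.0)

print(small, large, math.isclose(small, from_ratio))
```

Under this assumption, a layer never needs to know precision and confidence separately: one variable (the ratio, or equivalently the weight) captures everything that affects the outcome.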
The above might look preposterous and over-complicated; however, I would like to remind my readers that brains are the most complex objects known to humanity (How complex? Beyond our ability to comprehend!). Thus, it would be unreasonable to expect that we could make sense of how they work via a single approach that also happens to be simple. Moreover, it’s relevant to note that both the concepts of perception (intended as mere signal detection) and prediction include their respective evaluation of reliability: any system described via one of the two concepts requires treating either precision or confidence in order to be fully functional (as commonly understood). What use is a weather forecast if it doesn’t at least implicitly come with an assurance that what it predicts is more accurate than pure guesswork? Would you use a measuring instrument that returns random numbers? Thus, I’d argue that a discussion of precision and confidence is necessary for any serious PP model: it is not a secondary hypothesis (or ingredient), it is as fundamental as the idea of prediction itself.
Finally, in the next post we’ll see that the proposed interplay between precision and confidence is also the reason why PP is such an attractive proposition: the potential explanatory power of this orchestration is stunning, to the point of being, perhaps, too good to be true.
Clark A (2016). Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford Scholarship. DOI: 10.1093/acprof:oso/9780190217013.003.0011
Kanai R, Komura Y, Shipp S, & Friston K (2015). Cerebral hierarchies: predictive processing, precision and the pulvinar. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 370 (1668). PMID: 25823866