This is going to be a very unusual post: it’s an ad-hoc effort, responding directly to Sabrina Golonka and Andrew D Wilson’s call for feedback. They have recently published a pre-print on bioRxiv, entitled “Ecological Representations”. In the accompanying blog post, they explicitly ask for comments, and since I’ve found their paper extremely promising, I’ve agreed to produce my detailed commentary here. This is also the proper follow-up to a brief discussion I’ve had with Sabrina, also mentioned in my reply to Epstein’s essay.
You are about to read a form of pseudo “open peer review”: it’s not exactly open peer review because I’m not really their peer. I’m not a psychologist and not even an active academic in related fields. Never mind: I have strong opinions, and when I started blogging I decided I would make them public.
For the remainder of this post, I will address Sabrina and Andrew directly (makes for easier writing). However, note that they are interested in collecting opinions (not my opinions in particular!), so please do feel free to chip in with your own comments.
Before reading what follows, it’s probably better to read their paper in full (time well spent: you won’t regret it).
Sabrina and Andrew,
thank you for writing this paper: reading it confirmed my highest hopes. I think it’s a much-needed move in the right direction, and could help cut through an impressive amount of conceptual knots. I really hope your paper will become a cornerstone of both psychology and philosophy of mind, so I am thrilled by the opportunity you’re giving me to try contributing. However, I do think you’ll find what follows difficult to take onboard, for multiple reasons, so I guess it’s better to make the main ones explicit.
First, our backgrounds are very different: my formal preparation is in molecular biology, biophysics and neuroscience (all growing steadily out of date). My interest in psychology is personal, pursued in my private time, not formal in any way (meaning it is patchy, as I only dig into what grabs my interest) and not focussed on Gibson’s work at all. Furthermore, in the last few years I’ve been actively concentrating more on philosophy of mind, not empirical psychology per se. Thus, my language and point of view are very different from yours, making effective communication harder; it is also quite possible that I’ll be barking up the wrong tree (as far as you are concerned), and that my views/comments will simply not apply to what you’re trying to achieve.
With this important disclaimer in mind, I can describe the structure of what follows.
In general, I found your paper to be well written, (mostly) clear, easy to follow and very promising. However, I also think that you have overlooked or misplaced an important conceptual step, which in turn justifies why I disagree with one important conclusion you reach. The possibility that I am misreading or misunderstanding you however is very concrete: it’s possible that I simply don’t grasp the concept of Ecological Information (EI) well enough, and that therefore my main criticism is misplaced. If that’s so, please feel free to expose my own mistake/ignorance, without sugar-coating. As you’ll see, I won’t be pulling my punches either: in the interest of clarity, I’ll be very direct (knowing/expecting that this is what you’d appreciate more than anything).
The big reason why I’m investing a few days of my spare time is simple (also insanely ambitious): I hope that addressing the step I believe you’ve missed will make the paper even more groundbreaking. As it is, your paper tries to bridge EI with almost “traditional” representations (as used in Cognitive Psychology), but in my opinion, it currently falls just short of the mark. If you find my comments somewhat useful, you might be able to also bridge EI with Shannon’s Information (SI) and, for the same price, open the door to the field of prediction-based perception. If I succeed (unlikely, but it’s worth trying) you may eventually see how to unify all four approaches (instead of “just” two), saving the best sides of each and concurrently solving related philosophical problems. A high-stakes, high-risk game, so I’m happy to take the risk as I personally have nothing to lose :-).
As it’s unlikely that you’ll find my high-risk suggestions useable, I’ll add a few low-risk, low-gain suggestions at the bottom of this post, hoping to be useful at least a little.
What follows is rich in quotes (I hope you don’t mind), I’ll start them with the indication of the page where they appear, to aid navigation.
Take-home message so far: please be aware of the distance between our backgrounds, also do keep in mind that I’m deliberately taking a long shot, so the chances I’ll miss my self-selected distant mark are high. In the rest of this post I will try to:
- Expose what I think is a (very big) gap in the picture you are painting.
- Explain why I think the presence of this gap doesn’t allow you to justify some of your conclusions.
- Propose my own way of filling the gap.
- Briefly explain why I think the proposed addition is worth the risk.
- Finally, I’ll close up with minor suggestions and general praise.
I will start by declaring my main problem with your aim and conclusions, you write:
[P3-4] We propose that Gibsonian ecological information perfectly fits the definition of a representation.
I think this is wrong: Gibsonian ecological information (EI) does not fit the definition (on its own), but what you propose as “neural representations” do fit the definition of representation you’ve adopted. This ties-in nicely with what I think is the main gap you’ve failed to bridge.
The way I read you, you claim that EI is “out there”, it is collected by sensory organs, transformed into “neural representations” of EI and then used to both control and select actions. At worst, this picture is sketchy to the point of being wrong (my biophysics background comes to the fore).
In plain terms, what we do know is that sensory organs collect and transduce (transform into nervous signals) a vast amount of data. I’m not using the term “information” yet, for reasons that should become clear below. At any given time, touch receptors, smell-cells, photoreceptors, proprioceptors (and more) are all active doing this: they collect everything that hits them (if it’s able to influence their specific receptors) and send a corresponding signal towards the brain. Once transduced, what was before unspecific energy or molecules becomes something which can be directly interpreted as a signal (the action potentials travelling through the axons of sensory neurons). Nothing particularly new in this, but this very general and universally accepted picture is apparently hard to reconcile with the vision you are proposing.
In your view, EI is out there, it is collected by the senses and then used to control/select action. This isn’t plain wrong, but glosses over important details:
EI is indeed out there, but is bathed in a sea of unspecific stuff. All this unspecific energy and molecules will hit receptors in a disorderly fashion, and all of it has the potential of being transduced (within the boundaries of what sensory organs can collect). Thus, the first signals that are transmitted within an organism are unspecific: they potentially contain EI, but they actually “contain” a lot more. At this point the task for the organism is to discard all the information that isn’t currently useful and retain only what counts as EI. Crucially, in your paper you mention something that is close to this need only when you briefly comment on “learning to perceive”, but otherwise you ignore the whole subject. This for me is a deal breaker: if I were formally peer-reviewing your paper I would recommend rejection unless you were willing to show how organisms manage (or may manage) to extract EI from unspecific stimuli. Unfortunately, doing so potentially negates one of the things you find exciting: EI is indeed present outside, but already considering it a representation is at the very least misleading, as it is effectively hidden by the vast amount of potentially irrelevant data. At “collection time” EI is present, but useless: one needs a way to extract it, and a way which is flexible enough to accommodate the somewhat unpredictable ecological needs of the perceiving organism.
Thus, one of the main points of the paper:
[P9] we propose that ecological information simply is the representation that closes the poverty of stimulus gap, though it is external and ecological rather than internal and mental.
Is strongly undermined by what I’m proposing here (sorry!). Specifically: I’m not convinced that it is already useful to consider EI, when external, as a representation. To be considered as such, one needs to take as a given (gloss over?) the context and internal state of the perceiving organism: depending on contingent factors, including the task at hand, what counts as EI changes all the time, so I think we’d be better off by accepting that EI is such in virtue of internal factors as defined by the organism itself and, crucially, its own ecological needs.
To use the example you make of coordinated rhythmic movement: the dynamic pattern of “relative direction” used to coordinate action is external in the sense that it is potentially available to any third party observer. This is of extreme importance when it comes to empirically reverse engineering how organisms produce behaviour. However, in the world out there there are a hell of a lot more structures and dynamics, all of them co-existing in a seemingly chaotic mixture. A priori, all of them may have important ecological implications for a perceiving agent. Importantly, in your own example, what makes the “relative direction” criterion relevant to the subject is determined by something inside the subject (in this case, what the subject is trying to do).
Of the collected signals, what comes from visual (and I’d guess proprioceptive) sensors can be used to determine in what direction each limb is moving. From there, you can also derive the relative direction, and use it to coordinate. Thus, functionally, the Shannon’s kind of information (SI) that is available outside is massively filtered at the level of sensory organs (only some physical properties can have effects on the activity of a given receptor), it is then processed, and via this processing, the ecological information is “extracted”. The process is computationally analogous to compression: you have a hell of a lot of bits outside, transduce only some (the potentially relevant, as defined by evolutionary processes) and “process” them further in order to progressively reduce them, ending up with the minimal amount of bits which are enough to react appropriately (coordinate, in this case). If we change the task, what information is thrown away and what is instead retained would change accordingly. Thus, if we are trying to explain how an organism does all this, the fact that the relevant information, AKA Ecological Gibsonian information, is out there is indeed important, at least because it enables the design of solid empirical investigations. However, the crucial functionalities that can allow an organism to function are:
1. Ability to collect whatever is potentially relevant.
2. Ability to extract what is actually relevant for a given task.
Because what is actually relevant is a function of the organism, the context, and of the contingent organism’s state, saying that the representation is already external hides the fact that what counts as EI is determined by the organism itself. For this reason, I fail to see why it’s useful to declare that external EI is representational (apart from saving a signature aspect of radical embodiment). Photons bouncing around can be seen as ecological information only if someone is detecting them.
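A toy sketch may help make the compression analogy concrete. It is only an illustration under heavy simplifying assumptions: the channel names, the `transduce` and `relative_direction` helpers, and the anti-phase limb signals are all hypothetical and do not come from the paper. The first stage keeps only what the "receptors" can respond to; the second stage is task-defined and reduces what is left to a single bit (same direction or not).

```python
import math
import random

def transduce(world, channels):
    """Fixed filter: keep only the channels the (hypothetical) receptors respond to."""
    return {name: world[name] for name in channels if name in world}

def relative_direction(left, right):
    """Task-defined extraction: 1 if both limbs currently move the same way, else 0."""
    d_left = left[-1] - left[-2]
    d_right = right[-1] - right[-2]
    return 1 if d_left * d_right > 0 else 0

random.seed(0)
t = [i * 0.1 for i in range(50)]
world = {
    "left_limb": [math.sin(x) for x in t],
    "right_limb": [math.sin(x + math.pi) for x in t],  # anti-phase movement
    "air_pressure": [random.random() for _ in t],      # detectable, irrelevant here
    "magnetic_field": [random.random() for _ in t],    # no receptor for this one
}

signals = transduce(world, channels=("left_limb", "right_limb", "air_pressure"))
bit = relative_direction(signals["left_limb"], signals["right_limb"])

raw_values = sum(len(v) for v in world.values())
kept_values = sum(len(v) for v in signals.values())
print(raw_values, kept_values, bit)
```

Swapping `relative_direction` for a different extraction function (a different task) changes what is thrown away, while `transduce` stays the same, which is the point of separating the two abilities.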
Time for a little detour: one factor of extreme importance is that something like the “relative direction” of the organism’s limbs is something that can be detected, using the vast amount of (potentially) detectable stuff in the world. Crucially, this data is collectable by virtually any third-party observer: it is an objectively measurable property of the environment, and thus, directly amenable to empirical investigation. This enables doing science, justifying the successes of radical embodiment; one can hypothesise: “this particular pattern is what the organism uses to coordinate”. With such a hypothesis, you can make predictions and on this basis, verify if the hypothesis seems to hold. In this way, you get to specify what kind of signals may be collected in 1. and how they need to be transformed (filtered/compressed) to perform 2. Paired with good old boring biophysics (specifying what an organism can actually collect, in terms of SI), you’ve narrowed down the possibilities to a tremendous extent.
Yes, in a sense the EI is out there, it is external, but what makes it “ecological” or, if you prefer, what makes it possible to extract the signal, differentiate it from the irrelevant (not ecological, not relevant for the organism for the current task) is exclusively internal. Since we are interested in understanding how the organism detects the relevant information (from the messy bulk of stimuli collected by lots of sensory organs), the information needed is by definition out there, but it actually becomes proper Ecological Information because of how it is internally processed.
This leads to what you call “neural representations”. What your paper seems to suggest is that EI is directly collected and transformed into “neural representations”. What I’m suggesting is that the “directly” part is (if implied, as I think) misleading. Furthermore, how neural representations of EI are generated is exactly the interesting passage in the whole story. I appreciate you probably have consciously decided not to tackle this aspect, but I think it’s a mistake:
a. It makes your paper vulnerable to the kind of criticism I’m making.
b. It misses a tremendous opportunity, while weakening your claims.
Specifically, your paper already tries to unify traditional Cognitive Psychology and Radical Embodiment, while keeping the best sides of both views. To do so, you gloss over a major aspect of perception, opening up to criticism. Instead, you could bite the bullet, strengthen your argument, and get additional unifying powers:
I. As hinted above, the revised story I’m proposing is also mapping the relation between Shannon’s Information and Ecological Information. (See also my attempt to link structure and dynamics to SI.)
II. Showing how information is filtered/compressed in order to extract EI from raw sensory input allows you to slot in the other main hot topic in neuroscience: the predictive approach. Doing so solves a problem and reconciles apparently antagonistic views…
I’ll allow myself to briefly discuss this second benefit. We already know that all sorts of raw unspecific signals are collected by sensory organs, and we know they are processed along neural pathways at each identifiable step (at the very least, when signals pass from one cell to the other). The story I’ve been painting starting from your paper then allows us to clearly define the main function of the transformations that happen during and after the first transduction. The aim is to isolate EI and to discard the rest.
The problem is that what counts as EI is both context-dependent and internally defined (depends on the state of the organism). Thus, the system that extracts EI needs to be potentially universal (we agree on this, apparently), or at least, as versatile as possible. It’s like designing a targeting system while not knowing what kind of projectiles and targets will be used. Such a system needs to be dynamically able to identify the correct kinematic projections from the original (outside world) dynamics. At any given time, the set of possible kinematic projections is effectively infinite. How can a system optimally isolate the correct ones when it can’t make many assumptions on what will make them “correct”? [If you wish, I’m merely restating the frame problem.]
One solution comes from the prediction-based approach: if you can manage to transform input at time A in such a way that it efficaciously predicts input at time A+1, you are guaranteed that you are keeping as much potential EI as possible, while at the same time you are discarding everything else – you are distilling the potential EI while filtering out all the noise. For brevity, I’ll leave this as a hint, but do note that I have a lot more to say, so in case I’ve tickled your curiosity, feel free to ask. [Note also that, like Andrew, I still have to read Clarke’s latest book, but I do usually agree with him. See also this brief article by Tim Genewein on why Bayesian approaches can be understood in terms of lossy compression.]
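The hint can be illustrated with a deliberately crude sketch. It is not a model of any neural process, and it swaps the general idea (transforming input so that it predicts the next input) for something much simpler: scoring each incoming channel by how well its value at one time step linearly predicts its value at the next, then keeping only the channels that score well. The channel names, the `lag1_predictability` helper and the 0.5 threshold are all assumptions made for the example.

```python
import math
import random

def lag1_predictability(series):
    """Squared correlation between x[t] and x[t+1]: how well 'now' predicts 'next'."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return (cov * cov) / (var_x * var_y)

random.seed(1)
n = 500
channels = {
    "limb_motion": [math.sin(0.1 * i) for i in range(n)],           # structured dynamics
    "receptor_noise": [random.gauss(0.0, 1.0) for _ in range(n)],   # unstructured input
}

scores = {name: lag1_predictability(series) for name, series in channels.items()}
kept = [name for name, score in scores.items() if score > 0.5]  # arbitrary threshold
print(scores, kept)
```

The structured channel survives and the noise channel is dropped without the filter knowing anything about the task: predictability alone does the first cut, and what counts as relevant for a given task can then be selected from what is left.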
This concludes the highly challenging and propositional side of my comments. To close off the main commentary, I still need to address the one conclusion you make which I don’t think is appropriately justified. It will take just a little longer.
[P18] Our developing solution begins by identifying that information can not only control actions; it can also select them
Yes, no problem with this. Once the organism has isolated an applicable form of EI, it can select actions, not only control them. Interesting here: to select, one discards most of the collected SI, and is left with the number of bits necessary to discriminate across the available options, so very few bits. In controlling action, frequent and highly tuned corrections are needed, so less SI is discarded. This leads to a vision of “higher order” cognition as the most impoverished form of cognition! It is also the only cognition which we consciously experience, so putting the two things together, you end up explaining a few interesting things:
i. Traditional cognitive psychology starts from the ideas of impoverished signals and of enriching representations because, well, that’s what we experience, so makes intuitive sense. It’s also somewhat wrong. The most impoverished signals are objectively poor because they are very rich in EI. One could say they are objectively poor and subjectively rich (!).
ii. A signal rich in EI, can be used to produce high-level predictions, making it possible that such signals are indeed sometimes used to fill-in the blanks, as assumed by cognitive psychology.
iii. The enrichment/filtering process is likely gradual: if used to control movements the signal can be routed towards outputs without being impoverished to the max.
iv. This also directly explains why [P19] “there is no convincing evidence that we can instantiate a neural representation of information sufficient to support action control unless the relevant information is present in the current environment”. We only store the most enriched EI, why would we store anything else? But because of that, the information is poor (objectively) and thus, not sufficient to drive highly refined behaviours. It only suffices to effectively select behaviours. In other words, I agree with your entire “motivation 3” discussion (from P16), and think it should be extended.
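The asymmetry between selecting and controlling can be put in rough numbers. The sketch below is back-of-the-envelope arithmetic only: the function names are hypothetical, and the correction rate and resolution used for control are made-up figures chosen to show the order-of-magnitude gap, not measurements.

```python
def bits_to_select(n_options: int) -> int:
    """Minimum whole bits needed to tell n_options apart, i.e. ceil(log2(n_options))."""
    if n_options < 1:
        raise ValueError("need at least one option")
    return (n_options - 1).bit_length()

def bits_to_control(corrections_per_second: int, bits_per_correction: int, seconds: int) -> int:
    """Bits consumed by a stream of corrections at a given rate and resolution."""
    return corrections_per_second * bits_per_correction * seconds

select_bits = bits_to_select(2)            # "do this instead of that"
control_bits = bits_to_control(100, 8, 1)  # hypothetical: 100 corrections/s at 8 bits each
print(select_bits, control_bits)
```

Choosing between two actions takes one bit, while even one second of modestly fine-grained control needs hundreds under these assumed figures, which is why a stored signal can be enough to select a behaviour and still fall short of controlling it.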
This brings us to my last problem, appearing at the bottom of p20:
But these neural representations, while internal, are not the mental representations of standard cognitive theories.
Unfortunately, you have not convinced me that the two kinds of representations aren’t one and the same. You describe impoverished representations which can produce perceptions (hear my inner speech, for example), and can be used both to produce inferences and select behaviour. Without other context, I would be recognising these representations as the classic cognitive psychology ones. The one thing you’ve added is showing why they need to be intensional (because by being so they solve the problems associated with representations and therefore make representations ecologically useful!), and thus you are showing why cognitive psychology is wrong when it understands representations in a way that can only make them extensional.
In other words, you are correcting a very big and frequent mistake made in cognitive psychology, you are showing what the representations we talk about actually are, but you are not negating their existence. Perhaps it is my relentless drive towards unification that is speaking here. [Side note: the intensional/extensional distinction you make is spot on, and the main reason why I agree that your view actually goes a long way in naturalising intentionality. I would love to see your paper published for this reason alone.]
While re-reading your paper, I took a lot of notes. I will include them here with minimal editing. If the ambitious commentary above proves to be useless to you, perhaps you’ll find something useful in what follows.
Across the whole paper you sometimes refer to “information”, sometimes to “ecological information”. When I read (unqualified) “information”, I automatically understand “Shannon’s information”. The problem is, I don’t think you ever refer to SI, so the effect is confusing and (for me, a non-Gibsonian) an extra effort is required. I guess most scientists would experience a similar effect, so why not introduce the EI acronym and use it throughout?
[P3] They have yet to develop any widely accepted explanations for the ‘high-order’ cognitive activities driving Motivation 3.
Not sure I follow the grammar, here. What is the “3” for? Why “driving motivation”?
Use of “intentionality” on P3: it’s not immediately clear whether you are talking about having “intentions/plans” or intentionality as “aboutness”. P4 clarifies that it’s the latter, but:
[P4] a cognitive system must be able to behave as an (almost) arbitrary function of the environment. In other words, a cognitive system has to be able to be ‘about’ anything it encounters in the world.
I do see the link between these two sentences, but only because I already agree with it, thus I fear this passage might be confusing to others.
[P5] We take Motivation 1 (getting intentionality out of a physical system) to be the primary job of representations. Motivations 2 and 3 are constraints on exactly how Motivation 1 might be implemented given the existence of the two gaps.
Do you need the second sentence? After re-reading the rest of the paper I don’t think you need to qualify.
Note 5 on P6: I don’t understand it! This note confused me more than anything else.
[P14] Ecological Information Supports System-Detectable Error
This is the only section where you hint towards the big problem you are otherwise largely ignoring: how is the correct EI isolated? The fact that you do mention this makes me hope that my main criticism may not be too wrong.
[P16] Motivation 2 is that representations are required to bridge a poverty of stimulus gap.
I found the bit that follows a little confusing. For me the poverty of the stimulus refers to the fact that we don’t collect all possible signals from the environment, and that sometimes the signals are very indirect (i.e. a paw print isn’t the tiger, but still a worrying sign, I guess). However, as I’ve explained above, a huge issue is the one of isolating EI from the raw incoming signals: it’s a matter of reducing a huge amount of bits to much, much fewer (i.e. to specify whether to do this instead of that you end up needing only one bit!), but of course, the problem is doing it effectively. Thus, once we have a grasp of how to collect intensional information (see above: I think you can bridge this gap with the predictive approach), reducing it to its bare minimum, AKA impoverishing the signal, is precisely what needs to be done. It goes without saying that the result is necessarily symbolic/representational.
The good stuff
Before concluding, a little praise, in the form of a selection of quotes I’ve absolutely loved (there are many more!).
[P4] These informational representations solve both the symbol grounding and system-detectable error problems [yes, they do], and they constrain the form (and empirical investigation) of neural representations caused by interacting with information.
If we do fill in the blanks (see above), this hits the nail on the head.
[P4] these two ecological representations then address all three motivations for representational accounts described above, including, as we develop below, the major challenge of supporting ‘higher-order’ cognition.
Yes! If you manage to get this view into the mainstream, major problems could finally be overcome, great stuff.
[P9, on coordinating rhythmic movement] The kinematic information variable ‘relative direction’ is standing-in for the dynamical world property ‘relative phase’ and it requires no additional enrichment from a mental representation in order to do so.
[P12] understanding the function and structure of neural representations requires understanding the structure and environmental cause of ecological information, which is not how cognitive neuroscience currently guides its work.
These two quotes, along with the ones below, summarise why I’m so excited, you are fixing stuff that has been broken for much too long…
[P14] Informational representations [ecological information, once distilled inside the organism], however, are immune to the grounding problem.
Agreed, with the modification I’m proposing.
[P14-15] because, from an ecological perspective, perceiving and acting are fundamentally intensional (Gibson, 1979, Turvey et al., 1981) and because the content of informational representations is accessible to the perceiving-acting organism, organisms can be aware of when these representations are wrong and this awareness can have a consequence on future behavior.
[P16] because information specifies properties and not individuals (Turvey et al, 1981) informational representations can explain our behavioral flexibility. When we encounter a novel object or event, it will likely project at least some familiar information variables (e.g., whether it is moveable, alive, etc), giving us a basis for functional action in a novel context.
In a way, I’m paradoxically disappointed by how quickly (but, in my view, effectively) you sweep through the solution to the problems of intentionality, of grounding representations and the role of the intensional/extensional distinction. I realise this isn’t new, but it does bear repeating and re-proposing because it is exactly right. Failing to focus on intensional content is the source of so many errors and confusion in traditional cognitive approaches.
I am very aware that I’ve grabbed your ball, taken it to a different court and started playing my own game with it. I’m doing so because I have some hope that you’ll like my modified game. If you don’t, please do feel free to tell me to give the ball back and eff-off. I’ll comply and won’t be holding a grudge, it’s a promise. If, otherwise, you do like some of my lucubrations, please feel free to use them as you wish (no strings attached). What you are doing is immensely useful to what I’m trying to put together here, so I do know I’m not wasting my energy (just thinking about these things is useful to me a priori).
Finally, if, in virtue of some extraordinary stroke of luck, you think I can help you some more, please do let me know, I’ll be very happy to try.
In all cases, I’m looking forward to your feedback. Thanks for reading!
Golonka, S, & Wilson, AD (2016). Ecological Representations bioRxiv DOI: 10.1101/058925