Skill and Spatial Content

Rick Grush

0. Introduction.

[1] It is well-known that Evans laid the groundwork for a truly radical and fruitful theory of content -- a theory according to which content is a genus with at least conceptual and nonconceptual varieties as species, and in which nonconceptual content plays a very significant role. It is less well-recognized that Evans was also in the process of working out the details of a truly radical and groundbreaking theory of representation, a task he was unfortunately unable to bring to any satisfactory stage of fruition. I am here drawing the distinction between a theory of content and a theory of representation in the following way. Representations have traditionally been analyzed as a vehicle that carries some content -- the ink spots arranged on the page that mean "Eiffel Tower", the infamous brain state that represents the cow. A theory of representation is, at least in part, a theory of how it is that some vehicle or other comes to carry a content. A theory of content, by contrast, need not concern itself with how or why contents are carried by this or that vehicle -- rather, it is concerned with what contents are (things? states of affairs?), how to characterize them (propositions? presentations of the world? images?), relations between contents (inferential? association?), and whether there are different kinds of content, such that different answers to some of the above questions may be appropriate to each.

[2] A very lively area of application of theories of representation these days concerns how brain states can carry a content: in virtue of what does some physical state of the organism (typically some brain state) genuinely represent something external to that organism? Many sorts of lines have been pursued: similarity (or more abstract relations of isomorphism), causal history, nomic relations, and evolutionary pressures, to name just a few. Evans' idea in the theory of representation was that the mastery of certain skills, especially bodily, sensorimotor skills, can be that in virtue of which an organism is able to grasp a content. Or to put it in slightly different terms which make a closer parallel to the previous examples: a brain state can carry a content in virtue of the fact that that state is part of the implementation base of an appropriate skill. I will, for the sake of a handy label, call this the skill theory of representation, or simply the skill theory.1

[3] Many may believe, and perhaps Evans did, that all and only contents carried in virtue of mastery of skills are nonconceptual. The idea here would be that an organism can master skills that allow the organism to do certain things successfully or correctly, in an epistemically responsible way, without thereby mastering the concepts required to characterize those success conditions. This will include, but not be limited to, the ability to discriminate many shades of color. But such issues, as interesting as they are, are not my immediate concern. I want to leave it open for purposes of this paper that there may be no necessary connection between the skill theory and nonconceptual content, and focus rather on trying to make sense of the skill theory itself. How it would be best to understand the contents thus carried -- conceptual, nonconceptual, or whatever -- is a question I leave for another day.

[4] I will approach this topic by expanding on Evans' own treatment of it in "Molyneux's Question", henceforth MQ (Evans, 1985).2 In that paper, Evans argued that the spatiality of vision was not something intrinsic to visual qualia themselves, but rather was supplied to visual experience in virtue of the organism's mastery of a battery of sensorimotor skills. Evans chose to present this idea via a defense of an affirmative answer to the Molyneux question. In Section 1, I will briefly outline the Molyneux question and Evans' proposal. This will be mostly by way of a number of quotes from MQ. This will accomplish little more than a statement of the position, and certainly not an explanation, defense, or exploration of it.

[5] In Section 2, I will begin the explanation and exploration. Taking as a starting point a criticism of Evans made by Ruth Millikan, I will try to show how there might be a tight connection between the possession of certain skills, and the ability to entertain certain contents. In this section, the contents in question will not be spatial contents, but rather what might be called rhythmic contents -- the capacity to noninferentially apply predicates like "rhythmic" or "pulsating" to certain experienced phenomena. In Section 3, I apply the lessons of Section 2 to the issue of the spatial content of perception. The result will, I hope, be an elaboration of Evans' stated views comparable to what Evans himself would have developed.

1. The Molyneux question, and a preliminary presentation of the skill theory

[6] The Molyneux question, posed by William Molyneux in a letter to John Locke toward the end of the 17th century, is whether or not a man born blind, and taught to distinguish, by touch, a cube and a sphere, would, upon regaining his sight and being presented with a cube and sphere visually but not through touch, be able to tell which was which. One of the issues touched by this question is the nature of spatial representation, in particular, whether each sense modality has its own spatial characteristics which are merely associated with the spatial characteristics of the other modalities through experience (we might call this the empiricist view), or whether spatial content is rather something apart from sensory modalities, and which serves as the sole supplier of spatial content for all spatial perception, regardless of modality (we might call this the Kantian view). These two options may not exhaust the alternatives, but that is not relevant for the current discussion.

[7] In MQ Evans took a particular stand on this issue, which was that spatial content was not supplied directly through sensation, but rather supplied to sensation. Spatial content is made available to an organism, according to Evans, via that organism's mastery of a suite of spatially directed sensorimotor skills, such as orienting, tracking, and coordinating sensation and action. Given this, if there is any spatial content made available to an organism in the deliverances of a given modality, it will be only in virtue of that modality's involvement in the cueing and guiding of these skills. And because these skills are geared to successful negotiation of locations in the organism's environment, and successful interaction with objects in that environment, the spatial content thus supplied will be characterizable in terms of the organism's egocentric space (or better: behavioral space). Furthermore, since any other modality is in the same boat, that is, will supply experience imbued with spatial content only in virtue of being involved with those same skills, any spatial content affiliated with the deliverances of that modality will also be specified in terms of the organism's behavioral space, and hence will be specified in the same terms as the spatial content of the other modality. (A brief word on my use of the terms "poise" and "cue". I do not, nor did Evans, think that an experience actually had to lead to the execution of this or that skill in order to have spatial content. Rather, the experience needs to cue the skill, in that it is experienced by the organism as something upon which it could bring any of a number of such skills to bear. Clearly an organism must have the skill in question in order for this to occur. Similarly, a red light in my visual field appears to be located up there only if it cues some number of skills, such as orienting to the light so as to foveate on it by means of a very specific sort of saccade. It is not required that I actually employ the skill. I may not want to, or have the need to. My eyes and indeed my body may be paralyzed, and so it might be impossible for me to in this instance.)

[8] This view directly challenges the empiricist assumption, still very much alive today, that experience from each modality (especially vision and touch) comes pre-packaged with its own proprietary brand of spatial content. This view has it that through experience, an organism learns associations between the "space" of, for instance, vision and the "space" of touch, and thus learns that things that look like so-and-so, will feel like such-and-such. If this is how one thinks of spatial content, then one will of course agree with Locke and Berkeley that Molyneux's subject (let's call her Molly) will not be able to tell, through vision, which is the cube and which the sphere. If on the other hand, one agrees with Evans, then one will conclude that if in fact Molly can see, in the sense that her visual experience has a genuinely spatial element when her vision is restored, then she will be able to tell which is the cube, and which the sphere.

[9] As I have already said, Evans' view was that spatial content is made available in perception by an organism's mastery of a suite of spatially relevant skills. It is because experience of a certain sort poises an organism to employ some of these skills, and guides the organism's execution of those skills, that that experience is infused with spatial content. I will introduce this by way of some quotes from Evans' writings. Sorting out the core aspects of the theory and providing some substance will occupy the remaining sections of this paper.

The subject hears the sound as coming from such and such a position, but how is this position to be specified? We envisage specifications like this: he hears the sound up, or down, to the right or to the left, in front or behind, or over there. It is clear that these terms are egocentric terms: they involve the specification of the position of the sound in relation to the observer's own body. But these egocentric terms derive their meaning from their (complicated) connections with the actions of the subject... (MQ: 384)

Auditory input, or rather the complex property of auditory input which codes the direction of the sound, acquires a spatial content for an organism by being linked with behavioral output... (MQ: 385)

Indeed, when he uses his hand, the blind man gains information whose content is partly determined by the dispositions he has thereby exercised -- for example, that if he moves his hand forward such-and-such a distance and to the right he will encounter the top part of the chair. And when we think of a blind man synthesizing the information he receives by a sequence of haptic perceptions of a chair into a unitary representation, we can think of him ending the process by being in a complex informational state which embodies information concerning the egocentric location of each of the parts of the chair; the top over there to the right (here, he is inclined to point or reach out), the back running from here to there, and so on. Each bit of information is directly manifestible in his behavior, and is equally and immediately influential on his thoughts. One, but not the only, manifestation of this latter state of affairs is the subject's judging that there is a chair-shaped object in front of him. (MQ: 389)

...we must say that having the perceptual information at least partly consists in being disposed to do various things.... (MQ: 383)

2. Content and Vehicle: Millikan's objection3

[10] In MQ, Evans entertains the following speculation, of which he says that "[f]ew of us would have a doubt about the outcome":

...whether a man born deaf, and taught to apply the terms "continuous" and "pulsating" to stimulations made on his skin, would, on gaining his hearing and being presented two tones, one continuous and the other pulsating, be able to apply the terms correctly. (Evans 1985: 372)

In a discussion of this passage, Ruth Millikan (1991) comments on Evans' "confident and unargued" affirmative answer to this question, and makes the following rather interesting criticism:

The assumption behind Evans' confidence seems to be that continuousness and pulsatingness in whatever medium must be represented by continuousness and pulsatingness, and hence will always be recognized again. Yet Evans, and then I, have just now represented pulsatingness and continuousness to you without the pulsatingness or continuousness of anything in order to do so. (Millikan 1991: 443)

Millikan is surely right to warn us against making content/vehicle confusions of the sort she describes. And she is also quite correct that Evans' answer here is unargued. But she is quite wrong to saddle Evans with a content/vehicle confusion.4 Furthermore, a defense of Evans' conclusion concerning pulsatingness can be organized, in part from materials he has provided elsewhere, and in so doing, I think considerable light can be shed on his views on spatial content.

[11] The first step in the defense is to note that earlier in MQ, Evans has already limited his inquiry to cases in which the perceptual episode has what he calls "the appropriate perceptual representations". He says:

It is not sufficient for the possession of genuine spatial concepts that one can correctly use spatial terms of a public language, for it is possible that this could be done without the appropriate perceptual representations; in much the same way one might be able to apply color terms correctly by analysis of wavelength of light, or as someone might be able to apply "to the right" and "to the left" to sounds simply upon the basis of the difference in the time the sound waves meet the two ears, and with no spatial meaning at all. (MQ: 368)

It is not entirely clear here what Evans means by "appropriate perceptual representations". I will be addressing that in what follows, but for now we can simply mark the distinction typographically in the following way. Let us say that when the appropriate perceptual representations are employed in a perceptual episode, the subject perceives (no special typography), but if they are absent, the subject merely perceives*. It will be handy to have a neutral term to be used when no stand is taken on the presence or absence of the appropriate perceptual representations, so let us call this superordinate category PERCEPTION.

[12] So the distinction is this: one can, on the basis of auditory input, determine that since the sound arrived at the left ear n msec earlier than the right ear, it must be located over there. In such a case we can say that though the subject is perceiving* the sound as coming from over there, the subject is not perceiving the sound as coming from over there. Similarly, a high functioning autist might have some explicit theoretical knowledge to the effect that if someone's mouth is angled downwards, then she is depressed; but this would be quite unlike the way in which a normal subject might simply perceive the depression. As Millikan points out, there might be any number of ways in which one could not only REPRESENT that, but even PERCEIVE that, something is depressed, over there, or pulsating. One can write "My heartbeat is pulsating", one might see a digital readout on a heart rate monitor, or one might feel one's own pulse, and all such cases either REPRESENT or PERCEIVE the fact that one's heart is beating. Evans is distinguishing between two kinds of perceptual access (perceptual access and perceptual* access), and derivatively, two kinds of content. It is this distinction which will ultimately be the basis for Evans' confident answer to the question, and not any content/vehicle confusions.

[13] Let's reflect for a moment on some of the differences between two cases of access to my pulse: seeing the digital readout of a heart rate monitor, and feeling my pulse, say, by putting my finger on my neck. One difference is that the digital readout might provide me with more or better information of some sorts. For example, it will tell me quite accurately how many times my heart is beating per minute. This may not be something that I will be able to immediately determine by feeling my pulse. But it would be rash to conclude that in general I get more or better or more useful information from the digital readout. For example, when I can feel my pulse, I will immediately be able to wave my hand back and forth, like a symphony conductor, in phase with the pulse. The digital readout alone will not allow me to do this.

[14] To see why this is important, consider the case of a person, call him Fourier, whose PERCEPTUAL apparatus is different from ours in the following way. When (and only when) Fourier puts his hand on an electrode with a slowly pulsing electrical current (say 1 Hz), Fourier has a qualitative experience similar to our experience of constant warmth -- so he can reliably tell when a slow electrical pulse is present. Fourier can discriminate changes in the rate of the pulsation in that when the frequency increases, it feels warmer. And when there is no cycling at all, all temperature sensation ceases (call this lukewarm). Similarly, let us suppose that Fourier can auditorily discriminate pulsating sounds, like a siren. But the quale which is associated with this discrimination is what we would experience as timbre: that is, if the frequency of the siren's oscillation from lower to higher to lower pitch increases, Fourier experiences a constant pitch which changes timbre, so that very quickly oscillating sirens will cause a quale with a flute-like timbre, while slowly oscillating ones might cause a quale with the timbre of an accordion.

[15] Fourier will thus be able to come to know, by PERCEPTUAL means, not only that there are various sorts of pulsating going on around him, but he will even be able to discriminate pulsatingness which is pulsating at different frequencies. The question I want to pursue in a moment is this: Does Fourier perceive pulsatingness, or merely perceive* pulsatingness? Or better, does he feel and hear the pulsation, or merely feel* and hear* it?

[16] But before I get into that, I want to emphasize that there is nothing particularly strange about Fourier's sensory systems as described. They simply act as filters in much the same way a heart rate monitor readout does. In fact, our own hearing does exactly this with frequencies over 30 or so Hz: we do not hear the individual compression fronts, but the frequency of the vibrations has as its qualitative analogue a scalar value (called pitch) which increases monotonically with frequency, rather than oscillating in phase with the frequency. I could as easily have asked a similar question about a normal subject: Does a normal subject perceive the pulsations in the compression waves in the air, or merely perceive* them?

[17] Now given this sort of sensory apparatus, we can easily see that Fourier would have no reason to generalize from tactually PERCEIVED pulsation to auditorily PERCEIVED pulsation: not any more than we generalize from our PERCEPTION of the frequency of oscillation of photons to our PERCEPTION of the frequency of oscillation of compression waves in the air -- we don't PERCEIVE red as a much faster version of middle C, for instance, even though in a very straightforward way it is. But what is different in the case of our PERCEPTION of low frequency pulsatingness and Fourier's? There seems to be a certain plausibility to the claim that we could generalize, and a certain implausibility to the claim that Fourier could. But what is the basis of this seeming plausibility? In giving this answer, we must be careful not to run afoul of Millikan's strictures against content/vehicle confusions, and in particular, we must not appeal to what might be thought to be a pulsating vehicle.

[18] I think the difference can be cashed out in terms of the different sorts of content which are made available in our PERCEPTION and in Fourier's. I want to focus on one feature of that content: our perception of the siren puts us in a position to exercise, in an immediate and non-inferential manner, a battery of skills; for example to wave our hand or nod our head in unison with the siren. Fourier would be as unable to do this as we would be unable to make judgments about the frequencies of photons from untutored visual experience. Now here comes the crucial bit: our experience of the low frequency electrical pulse, or of a light bulb which grows and diminishes in intensity, or anything else whose pulsations we can perceive, all put us in a position to exercise some of the same skills. Clearly I can nod my head in unison with a 1 Hz siren as easily as a 1 Hz pulsating light. These will be the same sort of skills that an animal brings to bear when timing its lunge at the legs of fleeing prey, for example. But Fourier's experience, as described, does not necessarily put him in a position to immediately exercise any of these skills. No more than I would be able to nod my head in unison with my pulse merely by seeing the digital readout of the heart rate monitor.

[19] Now against this it might be objected that Fourier could very well have a slightly more complicated form of sensory apparatus, one which provided him not only with some way to distinguish frequencies, but also some means of discriminating phase information. So for example, Fourier's PERCEPTION of a given siren might be a constant tone of a certain timbre while his PERCEPTION of a different siren with the same frequency but out of phase would be of a tone with the same timbre ('coding' same frequency), but of a different volume. And we can perhaps imagine that some similar gerrymandering of the qualia might allow him to make similar phase discriminations in the tactile modality without a pulsating quale of any sort. Let us grant all this.

[20] Now I want to ask: does this frequency and phase information about the auditorily PERCEIVED siren, regardless of exactly how it is made available in experience, put Fourier in a position to non-inferentially exercise the kinds of tracking skills we can? Does his awareness of the specific timbre and volume, or whatever the qualitative character of his perceptual episode is, guide him in moving his hands or nodding his head in phase with the siren? Fortunately, we don't need to test our intuitions on this question in order to get to the nugget of gold from it. For all we need to recognize is that if Fourier's experience does allow him to engage these skills in the cases of the two modalities, then he will have prima facie ground for generalizing from the one modality to the other. That is, if his experience of the siren as being at a certain volume and timbre and his sensory access to the electrical pad allows him to (noninferentially) swing his hand or nod his head in unison with the pulsating electrical current, then he will have good reason to think of them as having some common feature, and we as theorists will have good reason to ascribe to him some sort of perception of the pulsatingness (as opposed to mere perception* of pulsatingness).

[21] To put the point as bluntly as possible, part of the normal content of pulsatingness, for us, is that it is something with which we can coordinate a number of sensorimotor skills. These skills include not only a capacity to play conductor, but to generate expectations, to compare the phase of different oscillating objects, and many others. This is simply part of that experience's content. Our experience presents the oscillating thing as something on which we could bring any of a host of such skills to bear (I seem to recall the term "affordance" being used for this sort of thing in the not-too-distant past). And since our perception of pulsatingness in various modalities licenses the exercise of many of the same skills, they share this aspect of their content, and it is this that is the basis for generalization, not any content/vehicle confusions.

[22] On the other hand, if Fourier cannot bring these skills to bear, then it is difficult to see how he could be credited with a perception of the pulsation, as opposed to mere perception* of pulsatingness. If a subject claims that she perceives a slowly pulsating light, but has no ability to raise and lower her hand in time with the pulsations, though she can with a pulsating sound, then we can most reasonably take this to be evidence that she either does not, despite her claim, perceive the pulsating of the light, or does not really understand what the term "pulsating" means.

[23] Furthermore, in the case of someone with normal sensory systems, if that person lacked all the skills which are relevant to rhythms, if the person could not play conductor, could not anticipate the next beat, etc., then I think it is fair to say that that person would not experience anything as a pulsation or a rhythm. Such a person might very well hear a number of percussive sounds in some interval of time, but would not hear them as a rhythm in the absence of these skills. I think that this is a good way to cash out what Evans was getting at with his phrase "appropriate perceptual representations". They are experiences which have the appropriate content in virtue of the fact that they poise the organism to non-inferentially engage some range of skills, and guide the organism's execution of these skills, if they are executed.

[24] What is important about this example is that, contra Millikan, we have the tools to answer the question as Evans confidently did, without making an illegitimate appeal to the nature of the representational vehicle to the effect that it is pulsating. It is not incumbent upon me, or anyone sympathetic to this view, to accept that the only way Fourier or anyone else could manifest these skills would be if there were a pulsating vehicle. Any oscillations or lack thereof in the vehicle carrying the content are quite immaterial. What is important is whether or not the subject is poised to bring the right group of sensorimotor skills to bear. As Evans put it, in a quote I've already used, and which Millikan in her critique took insufficient notice of, "...we must say that having the perceptual information at least partly consists in being disposed to do various things..." (MQ: 383).

3. Space and the "visual field"

[25] In the previous section, I tried to make a convincing case that an organism's being cued to engage skills can be that in virtue of which the organism grasps certain contents.5 The test case was our perception of pulsation. The goal was to put some flesh on the idea that mastery of skills can make certain sorts of content available. But rhythm and pulsation is at least a day's journey from the original topic, which was spatial content. I want to get into the topic of spatial content by addressing what might seem to be a problem for the skill theory. The skill theory claims that egocentric spatial content is content made available via a suite of skills that an organism is poised to bring to bear (not necessarily the same skills involved in perceiving pulsations, of course). Specifically, if a given sensory experience cues the appropriate skills, then it will, in virtue of this fact alone, be an experience carrying a spatial content. Sensory information which does not cue such skills will not be endowed with spatial content.

[26] The objection has to do with the seeming plausibility of the claim that a creature could have genuinely spatial visual experience, without that experience being in any way linked to skills of any sort. The simplest case would be a Humean or Berkeleyan sense-datum creature, which was merely the experiencer of a blooming, buzzing confusion of colored two dimensional shapes. In fact, let us stick with Berkeley's thoughts on this. For Berkeley, visual experience is sui generis in that it has nothing intrinsically in common with any other modality of experience, such as tactile. Distance, for Berkeley, is not something that was given visually. Our conviction that we can perceive depth comes from the fact that we have learned, through experience, to associate certain features of visual experience with motor commands and tactile experience. Furthermore, for Berkeley, the visual field is oriented by coordination with motor action as well, for example eye movements. If some color, which is originally at side A of the visual field, moved towards the center when one moves the eyes up, then that is evidence that side A is "up". But what Berkeley never denies, and indeed what it seems very difficult to deny, is that there is some minimal inherent spatial structure to the visual field -- something like a two-dimensional screen upon which colors and shapes appear, move about, change shape and color, and disappear. If we take this away, then it seems we take away visual qualia altogether, for what are visual qualia if not two-dimensionally extended color patches (or perhaps seeming-to-see two dimensional color patches)?

[27] If this is so, then the skill theory appears to be in a predicament. Imagine a case of someone who, for whatever reason, has normal biological visual machinery, but has never acquired any skills which incorporate that machinery. We can imagine an otherwise normal subject, call him Hugh, who from birth has had special goggles attached to his head, that have fed him visual stimulations of colors and shapes, in various motions (producing on his retinae stimulations that are similar to the stimulations normal visual experience produces in normal subjects). But it is done in such a way that the visual stimulations are never in any way dependent on or linked to the exercise of any motor skills -- no motor commands Hugh issues make any difference to the course of the visual stimulations projected on his retinae. His walking around, moving his head, etc. have none of the sorts of regular effects that they would for us. In order for this to work, it would also be necessary to preempt any eye-tracking skills from developing. On the skill theory, being able to track something with one's eyes, or skillfully orient one's eyes toward something in the visual periphery, are also skills. Such circumstances could be avoided either by moving the visual stimulation in just the right way whenever the eyes move so as to nullify the effect of the eye movement, or perhaps more easily by simply paralyzing the eye muscles.

[28] The objector to the skill theory maintains that it seems plausible to think that Hugh would have visual experience of a two dimensional region, with shapes and colors that could very well stand in two dimensional spatial relations. He might not develop the ability to use his eyes to perceive the real world in any useful way, but this doesn't seem to be enough to claim that he would not have the 2-dimensional visual sense-datum "screen".

[29] But a skill theorist, when presented with such a thought experiment, must suppose either that Hugh really does not have visual experience at all -- no visual qualia -- or that he, while having visual qualia, does not have them in a way that incorporates any spatial element in their content. Both of these seem difficult pills to swallow. Many will be tempted to give up on the first, and claim that Hugh has no visual qualia, because he has never seen the actual world. This may be an option.6 But it is not one I'll take. Rather, I will try to make a case that Hugh would have visual qualia, brightnesses, colors, and all the rest, without there being any spatial element to this visual experience. This might seem barely intelligible, and many will think that I have thrown the pin and kept hold of the grenade.

[30] And perhaps I have. But I want to try to chip away at this intuition a bit. The most gripping reason that it seems senseless to say that Hugh (or any organism) could have visual qualia without spatial content is that we cannot imagine this. We do not find it difficult to imagine visual experience limited to two dimensions, but visual experience without spatial dimensions at all is not something we can imagine. Now I grant that we cannot imagine it, but I insist that the reason we cannot is that all of our visual experience, even our imagined visual experience, is such as to immediately cue all of the skills which provide spatial content. We can no more choose to disengage these skills than we can simply choose to hear our mother tongue as meaningless noise.

[31] The case of mother versus alien languages suggests a strategy for me to pursue. In the case of language, it might seem plausible to someone who had never been exposed to people outside her language community that the noises that come out of her mouth, and the mouths of her friends, carry their meanings with them like business cards that they hand out to anyone within earshot. In order to shake this intuition, we might introduce this person to another language community, one full of people who produce vocalizations that strike her as meaningless noise, and yet who treat them as meaningful, and who react to her vocalizations as if they are devoid of content. If our subject is sufficiently reflective, such an experience should help her to understand that, contrary to what she may have thought before, the nouns she has used from her youth are no more essentially connected to the objects in her environment than any other possible vocal productions. She might come to see that even though she cannot choose but to hear her mother tongue as meaningful, someone else might have no choice but to hear the very same vocalizations as no more than curious exercises of her respiratory apparatus.

[32] This will be my strategy. I will not be able to get you to imagine visual qualia without spatial content, for reasons I have explained. But I hope to be able to get you to recognize that someone else very well might. The argument will be by analogy. I will try to describe a subject who assigns spatial content to qualia different from those to which we typically assign it, and who does so in a different manner. And I will try to make it plausible that the assignation of spatial content to this experience is done via its coordination with spatial skills. Such a subject will be unable to imagine these qualia devoid of spatial content -- the very same qualia to which we assign no spatial content at all. To the extent this makes sense, it opens the possibility that our own intuitions about the spatiality of visual qualia are similarly questionable.

[33] The most famous such example is Bach-Y-Rita's visual prosthesis device. This is an apparatus that allows blind subjects to have something similar to vision, only through the tactile modality. It works by taking a video image from a camera mounted on the subject's head, and using it to drive a grid of tactile stimulators (vibrators) worn on the stomach or back. But this example, as interesting as it is, won't work for me, because the blind subjects, like us, already have a two-dimensional "space" provided by tactile perception on their stomachs and backs. This case is entirely analogous to Berkeley's theory of distance vision: the subjects are taking a pre-existing two-dimensional manifold of experience (in Berkeley's case, the two-dimensional "visual" sense datum screen, in the present case, the subject's ability to perceive two-dimensional vibrational stimuli on his or her skin), and learning to interpret it as providing three-dimensional information about the layout of the environment. I don't want to slight this device at all. I think it works as advertised, and that in fact, after learning, the subjects can perceive (not merely perceive*) objects at a distance just as we can with our eyes. The problem is that I can't get the leverage I need from this device, because it exploits a manifold of sensory qualia that has two dimensional spatial content apart from the new skills learned.

[34] But I can get the leverage I need from another fascinating device designed for the blind, the sonic guide (see, e.g. Heil, 1987). The sonic guide is a device worn on the head, that transmits a continuous high frequency (inaudible) probe tone, and picks up echoes of that tone with a stereophonic microphone. The objects in the subject's vicinity reflect back various components of the probe sound, in complex ways that depend on the size, distance, orientation, and surface properties of those objects. The guide takes these echoes, and translates them into audible sound profiles which the subject hears through earphones. There are several aspects to this translation.

1. Echoes from a distance are translated into higher pitches than echoes from nearby. E.g. as a surface moves toward the subject, its reflected sound will be translated into a tone which gets lower in pitch.

2. Weak echoes are translated into lower volumes. Thus as an object approaches the subject, the subject will hear a tone which increases in volume (as the object gets closer, it will ceteris paribus reflect more sound energy), and gets lower in pitch (because it is getting closer, as in (1) above). An object which merely grows, but stays at the same distance, will get louder, but stay at the same pitch.

3. Echoes from soft surfaces, grass, fur, etc., are translated into fuzzier tones, while reflections from smooth surfaces, glass and concrete, are translated into purer tones. This allows subjects to distinguish grass from concrete, for example.

4. The left-right position of the reflecting surface is translated into different arrival times of the translated sound at each ear. Note that it is not required that this coding exploit the same inter-aural differences which code direction in normal subjects. In fact, if the differences are exaggerated substantially by the guide, then one would expect a better ability to judge angle than we typically have.

As the psychologist John Heil (1987) describes it, the "sonic guide taps a wealth of auditory information ordinarily unavailable to human beings, information that overlaps in interesting ways with that afforded by vision. Spatial relationships, motions, shapes, and sizes of objects at a distance from the observer are detectable, in the usual case, only visually. The sonic guide provides a systematic and reliable means of hearing such things".
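The four translation rules listed above can be made concrete with a minimal sketch. It is not a description of the actual device's signal processing: the names `Echo`, `AudibleCue`, and `translate`, the linear form of each mapping, and every numeric constant are illustrative assumptions, chosen only so that the direction of each rule (farther means higher pitch, weaker means quieter, softer means fuzzier, left-right position means inter-aural delay) is respected.

```python
from dataclasses import dataclass


@dataclass
class Echo:
    """One reflected probe signal (hypothetical representation)."""
    distance_m: float    # distance of the reflecting surface from the subject
    strength: float      # reflected energy, 0.0 (none) to 1.0 (strong)
    softness: float      # 0.0 = smooth (glass, concrete), 1.0 = soft (grass, fur)
    azimuth_deg: float   # negative = to the subject's left, positive = right


@dataclass
class AudibleCue:
    """What the subject hears through the earphones."""
    pitch_hz: float
    volume: float
    fuzziness: float
    interaural_delay_ms: float  # negative = left ear leads, positive = right ear leads


def translate(echo: Echo,
              base_pitch_hz: float = 200.0,        # assumed pitch at zero distance
              hz_per_metre: float = 300.0,         # assumed pitch gain per metre
              delay_ms_per_degree: float = 0.02    # assumed, possibly exaggerated
              ) -> AudibleCue:
    return AudibleCue(
        # Rule 1: echoes from farther away become higher pitches.
        pitch_hz=base_pitch_hz + hz_per_metre * echo.distance_m,
        # Rule 2: weaker echoes become lower volumes.
        volume=echo.strength,
        # Rule 3: softer surfaces become fuzzier tones.
        fuzziness=echo.softness,
        # Rule 4: left-right position becomes an arrival-time difference.
        interaural_delay_ms=delay_ms_per_degree * echo.azimuth_deg,
    )


# An object to the subject's left that approaches: ceteris paribus its echo
# gets stronger, so the tone gets louder while dropping in pitch.
far = translate(Echo(distance_m=4.0, strength=0.2, softness=0.1, azimuth_deg=-30.0))
near = translate(Echo(distance_m=1.0, strength=0.6, softness=0.1, azimuth_deg=-30.0))
```

Nothing in the sketch fixes which pitch goes with which distance; the constants could be changed, or the pitch mapping inverted, and the output would carry the same spatial information.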

[35] Let us imagine a blind subject who is proficient with the sonic guide because she has been using it from birth. Call her Toni. Toni is able to make use of information, such as volume and pitch, to learn about the arrangement of objects in her surroundings. For her, a sound at a certain pitch and volume carries a clear spatial content: of an object, of roughly such and such a size, at such and such a distance, in such and such a direction. Upon hearing such a sound, Toni would be poised to turn her head to face the object, to reach out for it if close enough, or to hit it with a dart, in exactly the same way that our normal visual experience puts us in a position to exercise these spatial skills. That is, through her audition, Toni hears the spatial features of her environment, rather than merely hearing* them. The crucial point is that exactly these sounds, volumes and pitches, have absolutely no spatial meaning for us at all. Middle C is not the sort of thing that is intrinsically nearer to or farther from us than the lower A. And 30dB is not at all, in itself, larger or smaller than 35dB. At least not for us -- if I put headphones on your head, and presented you with a pure tone of middle C at 35dB, and asked you to hit that with a dart, you would be at a complete loss. Having competently used the device from birth, Toni might very well wonder what it would be like to hear a certain sound pattern, say middle C at 35dB, but not to hear it as over there. Indeed, she might very well think the question unintelligible.

[36] Recall Hugh, who, I argued, has visual qualia with no spatial import. Hugh, I claim, is to us as we are to Toni -- or more accurately, Hugh's visual experience is to our visual experience as our auditory experience is to Toni's auditory experience. Just as we have never learned to allow various audible features to cue such and such a suite of spatially oriented skills, Hugh has never learned to allow various aspects of visual experience to cue spatially oriented skills. If the analogy holds, then it should be reasonable to maintain that Hugh has visual experience, in the same way we clearly have auditory experience. But for Hugh, this visual experience is devoid of spatial import, just as our auditory experience is devoid of spatial import.7 Hugh would see colors, in that he could distinguish red from blue immediately on the basis of visual input, as we can distinguish tones. He might also be able to tell, for example, how much blue there is in his visual field; perhaps it would be roughly analogous to volume, so that a small blue stimulus would differ from a large one in something like volume intensity. He might be able to distinguish circles from squares, as we can distinguish violins from trumpets. But it seems reasonable to maintain that he might be able to do all this, without assigning spatial elements to any of this experience. After all, we can experience volumes and tones without assigning them spatial import, in spite of the fact that someone who does assign them spatial import (Toni) cannot imagine how this could be possible.

[37] It will be useful to recap the failing of the anti-skill theory argument. The argument was that it seems as though taking away the spatial import of vision takes away vision itself, for what are visual qualia but two-dimensionally extended color patches? The problem with this is that it does not fully recognize the way in which skills are connected with those experiences that unreflectively cue them. For a normal subject, since visual experience immediately cues the skills that supply spatial content, the only way to "imagine away" the space is by "imagining away" all the visual experience. But this fact might just be telling us about our skills, and how they are cued, and not about anything necessarily part of visual experience. We can easily imagine away the spatial import of middle C at 35dB while keeping the experience, because no skills are cued. Toni might be in exactly the opposite boat, being simply unable to imagine away the spatial import of auditory qualia without imagining away the qualia themselves.8 If the anti-skill theory argument were good, then Toni could use it to conclude that if we in fact have auditory experience of tones and volumes, they must be infused with spatial import -- she, after all, cannot imagine these qualia lacking a spatial element to their content. But such a conclusion would be a mistake. So the argument (which really was no more than an intuition to begin with) is no good.

[38] Finally, I want to briefly suggest that the analogy may hold even further. There is no reason to suppose that Hugh would not be able to tell when there are two visual stimuli that we would call qualitatively identical (two blue patches, for instance, of the same size, shape, and color). I choose this example because many will think that there is a glaring disanalogy in the case of Hugh and Toni. The putative disanalogy results from the supposition that spatial characteristics of visual stimuli provide identity conditions, and are not affiliated with any quale. One consequence of this is that we normal subjects can visually perceive two qualitatively identical blue squares, which are numerically distinct because they are at different locations. Further, it seems that Toni would be unable to do this, since spatial location is always associated with some qualitative element, such as tone or volume. It would also seem that Hugh would be unable to do this for visual qualia for similar reasons. If this is so, then perhaps there is a disanalogy in the two cases, which might threaten to undermine the argument I gave above.

[39] The problem with this line of thought is the assumption that identity conditions for "qualitative" particulars are given exclusively by spatial location. Now I think that there is a close relation between location and identity, but the link is neither as strong nor as clear as many assume. Even a cursory treatment of this topic would be the task for another paper, but I will settle for one final thought in this section, which I nevertheless hope will take the edge off the intuition that qualitatively identical particulars can only be distinguished by appeal to spatial location.

[40] Imagine you are looking at a computer screen that is uniform white except for two qualitatively identical blue squares, one in the upper right hand corner, and the other in the lower right hand corner. Each square begins moving in the direction of the opposite corner. There will be a point at the center of the display at which the squares overlap, and the final state will have one square in the upper left hand corner, and the other in the lower left. Now there are many ways one can describe what happened. Here are three:

1. The squares crossed in the middle of the screen, and each continued in a straight line to the corner opposite the one it started in.

2. The squares bounced in the middle, and each ended up on the same vertical side on which it started, but on the opposite horizontal side.

3. The two squares met in the middle, at which point there was only one square. Immediately following this, two squares appeared, and they moved away to the upper and lower left hand corners of the screen.

If identity conditions for qualitatively identical particulars were exhausted by spatial location, then only the third description is possible, since at the midpoint, there is no spatial difference between the two squares, and hence there can't really be two squares. But crucially not only do both (1) and (2) make sense, most people would use either (1) or (2) in describing the situation, and furthermore, the visual system, apart from the description, perceives the situation as being either a (1) type situation, or a (2) type situation.

[41] The same is true of sounds. Two pure tones, one initially high and the other initially low, can cross, and this will typically be perceived as either a case of two tones crossing, or bouncing, and not as the extinguishing and re-creation of sounds. The point I want to close this section on is that, in fact, identity conditions for qualitatively identical particulars can be, and often are, supplied by elements other than spatial location (a similar point is argued for by Pete Mandik (1998)). And if this is so, then the intuition to the effect that Hugh could not have experience of two qualitatively identical but numerically distinct visual stimuli because he lacks a spatial element to his visual experience is exposed as questionable, or at least in need of further argument that I am unable to furnish.


[42] My chief goal has been to give some substance to Evans' idea that the possession of skills can make contents available to an organism. That Evans held such a view is widely known, but what the view amounts to has never been satisfactorily spelled out. There has been much appeal in certain areas of the philosophical literature to "embodied skills" making contents available, but little by way of examination of what this could mean. I hope that I have to some extent remedied this, or at least made some initial steps in that direction. But beyond the Evans exegesis, there is one benefit provided by the skill theory that I should like to draw attention to.

[43] I think that an understanding of the skill theory will help us in our attempts to understand neural representation, especially the neural basis of spatial representation. If we are limited to seeing spatial representation in terms of either causal covariation (receptive fields) or isomorphism (maps), or either of these in conjunction with a story about evolutionary pressures, then we are sentenced to failure. Regarding topographic maps, there is simply nothing in the brain that serves as a topographic map of extrapersonal space -- the best one gets are cells that appear to be acting as receptive fields for specific points in extrapersonal space in the posterior parietal cortex of the right hemisphere. But these fields are not arranged topographically; indeed, it would be very difficult to arrange a 3-dimensional space topographically on a 2-dimensional sheet.

[44] Appeal to receptive fields provides no comfort. Such fields perhaps carry spatial information, but this should not be confused with spatial content. To see the difference, notice that if you don the sonic guide, it will suddenly become the case that your auditory cortex (the area that processes features of sound) will be filled with spatial receptive fields, in that there will be neurons, or neural groups, that will fire when and only when some object is at a certain egocentric location (the neurons that fire when and only when you hear middle C at 35dB, for example). But as I have already shown, this will carry no spatial content for you. Now it is true that once you get used to the device, this receptive field will start to carry spatial content for you. But what can "getting used to" mean here, except that you learn to incorporate that aspect of your experience in a non-inferential way into cuing and guiding your sensorimotor skills? (Accounts appealing to evolutionary pressures flop for similar reasons -- Toni never, we may legitimately suppose, had any ancestors whose auditory apparatus was evolutionarily pressured to assign spatial content to middle C at 35dB.)

[45] Of course there is much that remains in need of investigation, such as the range of applicability of the skill theory to contents other than spatial contents, and the relation between skill-borne contents and nonconceptual content (if there is any connection). But these are questions for another day. For now, I hope to have shown how Evans' skill theory can be developed and defended in such a way as to show what a genuinely interesting and potentially useful theory it is.


[46] I would like to thank audiences who heard, and provided feedback on, earlier drafts of this material, at the University of Copenhagen in May 1998, and UC San Diego in October 1998. I would also like to thank Pete Mandik and Adrian Cussins for useful discussions and feedback on earlier versions of this paper.

Rick Grush
Department of Philosophy
Center for the Neural Basis of Cognition
University of Pittsburgh


Bach-Y-Rita, P. (1972) Brain Mechanisms in Sensory Substitution. New York and London: Academic Press.

Dretske, F. (1995) Naturalizing the Mind. Cambridge, MA: MIT Press.

Evans, Gareth (1982) The Varieties of Reference. John McDowell, ed. New York: Oxford University Press.

----- (1985) "Molyneux's question". In Gareth Evans (1985) The Collected Papers of Gareth Evans. London: Oxford University Press.

Heil, John (1987) "The Molyneux Question". Journal for the Theory of Social Behavior. 17:227-241.

Mandik, Pete (1998) "Objectivity without Space". Electronic Journal of Analytic Philosophy, Special Issue on the Philosophy of Gareth Evans.

----- (1999). "Qualia, space, and control". Philosophical Psychology 12 (1): 47-60.

Millikan, Ruth Garrett (1991) "Perceptual Content and Fregean Myth". Mind 100:439-459.

Tye, M. (1995) Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. Cambridge, MA: MIT Press.


1 Evans never used this term. Rather, in the places where he was most actively developing these ideas, especially Evans (1985), he talked of "behavioral dispositions". Heavy reliance on the term "skill" in explaining Evans' ideas is due to Adrian Cussins. In this, as in many other areas, I am following Cussins' lead.

2 Evans also addressed this issue in Chapter 6 of The Varieties of Reference. But his discussion there is less adequate in many ways, and in any case is almost entirely parasitized from MQ.

3 I will be hammering on Millikan (1991) a bit in what follows. Unfortunately, in our sometimes ungrateful profession, an ounce of criticism can provide better rhetorical traction than a pound of praise. I want to register the existence of the pound before I turn attention to the ounce.

4 Millikan remarks that in a footnote in Evans' paper, McDowell, who posthumously edited the paper for publication, pointed out that Evans "subsequently noted" that the example of pulsatingness he used was disanalogous to the Molyneux case, because in the case of his pulsatingness example, it is the experience itself which is pulsating, and not the content. His example involved a pulsating stimulus applied to the subject's skin. Millikan takes this to be evidence that perhaps Evans realized that he was making a content/vehicle confusion. But this cannot be right, because the example can of course be easily recast in terms which are completely analogous (and perhaps Evans' note was no more than a reminder to himself to do this in the next revision of the paper). Specifically, we imagine the subject perceiving a siren, learning to apply "pulsating" to the siren and not to a car alarm of constant tone and volume, and then exposed to a light bulb whose intensity is waxing and waning. In this case, it is in fact the content (the siren or the light bulb) which is pulsating, and we need make no assumptions about the character of the experience, or the stimulus, or any other intermediary, involved in the perceptual episodes.

5 I use the term "cued" to mean that the skill need not be executed. Like an actor waiting off stage, a skill can be cued without actually making an appearance.

6 Fred Dretske (1995) and Michael Tye (1995), for example, would no doubt go this route -- see Mandik (forthcoming) for an interesting treatment of the relationship between something similar to the skill theory I develop, and mechanisms appealed to by Tye and Dretske.

7 This point requires a bit of care. Of course our auditory experience has spatial import, but not in virtue of the same features that give spatial import to Toni's auditory experience. If we limit application of "auditory experience" to tones, volumes, and large inter-aural discrepancies, then auditory experience has no spatial import for us.

8 I suspect that this same corrupt line of reasoning is behind the thought of some (not all) philosophers who take it that all human thought and cognition (at the "personal" level) is linguistic. In order to find what is NOT linguistic, they try to create some mental images or ideas that are not verbally coded. When they engage in these imaginings, they find that not much remains when they imagine away the "inner voice". But all this tells us is that verbally coding things that we are thinking about is a skill that we cannot turn off. It does not tell us that our thoughts or other cognitive episodes are essentially verbal. If we had not learned a language -- i.e. learned the skill of verbally coding our cognitive reasonings and thoughts -- we would still have the same, or very similar cognitive episodes, just without the accompanying verbal harmonizations.

1998 Rick Grush

EJAP is a non-profit entity. EJAP may not be printed, forwarded, or otherwise distributed for any reasons other than personal use. Do not post EJAP on public bulletin boards of any kind; use pointers and links instead. EJAP retains the right to continuously print all the articles it accepts, but all other rights return to the author upon publication.

EJAP is hosted at The University of Louisiana at Lafayette.