Saturday 20 April 2013

From Information to Learning: Why Shannon is important and what it means for Educational Technology

Shannon's information theory as presented in his and Warren Weaver's "Mathematical Theory of Communication" still stands as one of the great intellectual monuments of the 20th century. Indeed, when we consider that the technologies of the internet rely fundamentally on his re-application of Boltzmann's statistical thermodynamics, his social impact is of Einsteinian proportions. When things become that familiar, it is easy to forget what they were really about, and what problems they attempted to solve. Now, when 'information' appears all around us, when our learning is becoming dependent on the information we are able to discover as much as the colleges and teachers we meet, when we have become much more aware of our own 'information literacy' (whatever that means!), when we have become more aware of the relationship between the decisions we make and the information presented to us, when 'misinformation' is a stock-in-trade of corporations and political parties, etc., etc., it's worth thinking about what Shannon was saying.

What he wasn't saying was anything about "meaning" (although Weaver had a different view). Shannon made a distinction between information and meaning - information was defined fairly tightly in terms of the statistical probabilities of message transmission and reception: essentially, information related to the uncertainty involved in predicting the value of a random item of information. The measure of this uncertainty Shannon, following Boltzmann, called 'entropy'. His equation borrowed the form of Boltzmann's equation, which described the uncertainty of predicting the state of matter at a particular point. Shannon's equation instead showed the minimum number of 'bits' that would be required to transmit a message. Sending the message with more bits was to add redundancy. His insight was to see that Boltzmann's work on physics had application to communicating signals through a medium.
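To make the measure concrete: Shannon's entropy is usually written H = -sum(p_i * log2(p_i)), where p_i is the probability of the i-th symbol. The following is only a minimal sketch in Python, on the assumption that we estimate those probabilities from how often each character occurs in a single message; the function names are my own, chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Entropy in bits per symbol, estimated from observed symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    # H = -sum(p * log2(p)) over every distinct symbol in the message
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def minimum_bits(message: str) -> float:
    """Entropy multiplied by message length: a lower bound on the average
    number of bits needed if each symbol is encoded separately."""
    return shannon_entropy(message) * len(message)
```

A message made of one repeated symbol ("aaaa") has an entropy of zero: there is nothing to predict. A message in which two symbols are equally likely ("abab") needs one bit per symbol. Anything sent beyond that minimum is, in Shannon's terms, redundancy.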

When we think about learning, however, it's "meaning" that counts. Any teacher knows that they can't just throw information at their students - if it doesn't mean anything to those students, they won't learn anything. The 'meaning' of Boltzmann's state of matter is transferred to us because we can see that such and such is 'hot': we might touch it and say "ouch!". We might then relate our experience of "ouch!" to the statistical formulation presented by Boltzmann - particularly if only part of the material makes us go "ouch!". What is the equivalent of "ouch!" in Shannon's theory? Since Shannon's equation is a measure of uncertainty, maybe the "ouch?" moment is a moment of confusion.

His point is that communication occurs through fluctuating patterns of uncertainty. Indeed, if you can identify the fluctuating patterns of uncertainty and encode them, you may be able to 'compress' the message communicated. This is the basic principle behind entropy encoding algorithms like the Huffman algorithm (see http://en.wikipedia.org/wiki/Huffman_coding) which is used in the file 'zip' process. But it is important to understand what the Huffman algorithm gives us. Communication is successful because there is sufficient redundancy to guarantee transmission over a 'noisy' medium. Huffman eliminates the redundancy by encoding the spread of probabilities that could be communicated if the medium were clear. This will lead to the communication of the message, but without the redundancy. We see that the message is the same because we can convert the Huffman code back into the message. Some redundancy is reapplied in the decoding of the message.
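A small sketch may help to show what Huffman coding does with the spread of probabilities. This is an illustrative Python version, not the code used by any real 'zip' tool; the function names and the example word are my own assumptions. Frequent symbols receive short bit strings, rare symbols receive long ones, and the original message can be recovered exactly.

```python
import heapq
from collections import Counter


def huffman_code(message: str) -> dict:
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(message)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far})
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # the two least likely groups
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(message: str, code: dict) -> str:
    """Replace each symbol with its bit string."""
    return "".join(code[s] for s in message)


def decode(bits: str, code: dict) -> str:
    """Read bits until they match a code word, emit the symbol, repeat."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

Encoding "abracadabra" this way takes 23 bits, against 88 bits at eight bits per character, and just above the entropy bound of about 22.4 bits for those letter frequencies. Note, though, that decoding only works if the receiver already holds the code table: the redundancy has not vanished so much as moved.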

No human being works at the basis of a Huffman code. Does meaning require redundancy? Does redundancy contribute to the pattern of uncertainty? Consider a noisy environment in which person X chooses to shout to person Z above the crowd. Person Y, on the other hand, beckons to Z to move into a quieter environment. What's happening here? Partly, we might say that there is a selection of a different "channel": the auditory channel is too noisy for Y so they use the visual channel (I don't like the idea of channels but it works in this example). But there is more than the selection of the channel. There are messages galore in this situation depending on the sender and receiver. The receiver of X's shouting will receive a message like "this person's a fool if they think they can shout above this!"; the receiver of Y's visual signals might equally think "yes, we need to find a way of eliminating the noise". In human communication, noise affects the selection of the message as well as its transmission. Why does this happen?

One way of explaining this is to suggest that what is fundamental is the sender's anticipation of the likelihood that a message will be successfully received. That calculation involves a selection of the message, a consideration of the medium (the noise), and some acknowledgement of the capacity of the receiver ("how are they likely to respond?"). Such a calculation requires a degree of reflexivity by the sender as they consider the options for making an utterance ("what should I say? how should I say it? how are they likely to respond in each case?").

The criterion for choosing a particular utterance over any other is the maximising of the probability of successful communication. The communication which is going to be most successful is the communication which has the maximum redundancy.
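In Shannon's narrower engineering sense, at least, the link between redundancy and successful reception can be calculated. The sketch below is an assumption-laden toy rather than a model of human communication: a single bit is sent several times over a channel which flips each copy with a fixed probability, and the receiver takes a majority vote. The function name and the noise level in the usage note are my own, chosen for illustration.

```python
from math import comb


def success_probability(repeats: int, flip_prob: float) -> float:
    """Probability that a majority vote over `repeats` noisy copies of one bit
    recovers the bit that was sent. `repeats` must be odd to avoid ties."""
    if repeats < 1 or repeats % 2 == 0:
        raise ValueError("repeats must be a positive odd number")
    needed = repeats // 2 + 1  # smallest number of correct copies that wins the vote
    return sum(
        comb(repeats, k) * (1 - flip_prob) ** k * flip_prob ** (repeats - k)
        for k in range(needed, repeats + 1)
    )
```

With a one-in-five chance of each copy being corrupted, sending the bit once succeeds with probability 0.8, sending it three times raises this to about 0.896, and five times to about 0.942. More redundancy buys a better chance of being understood, at the cost of saying the same thing again and again.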

This sounds simple. But much communication doesn't work - particularly in education. Why is that? The reason must be that it is very difficult to assess the capacity of the receiver (students), and so to guess their likely response. Indeed, much that happens in education isn't communication! Whilst in face-to-face learning communication it is possible to assess the noise, in online communication this too is impossible. Online, the receiver is also even more of a black box. But that doesn't explain things completely. The deep problem is that transmitters (teachers) work with constraints which mean that they can become blind to certain options for communication which might be more successful. Custom and practice, assessment regimes, institutional protocol, power relations, personal histories, and (more than anything) fear all feed into this.

Personal constraints can affect the process of selecting the utterance with the maximum redundancy. The communication with the maximum redundancy is the communication that isn't there - with that communication there is no knowable message, so all signals are effectively "redundant". This is the communication which is absent. Identifying the communication which isn't there means looking at the 'negative' images of the possible communications which are imaginable. That means examining the constraints upon the transmitter's communication, and considering the likely constraints bearing upon the receiver. By considering the negative image of communication, a new kind of utterance can emerge. This is a determination of the 'absences' which unite sender and receiver. In my example, Y's use of the visual channel is a good example: visual communication was absent, and Y's gestures turn the visual channel into a communication channel which is understood by both parties, considering the noise in the environment.

Person Y does more than just identify a new channel. They create new redundancies with their new language. In fact, they create lots of new redundancies, since their message is very simple, but their gestures are likely to be quite elaborate.

So there is a process of identifying the communication that isn't there (the shared absence), which creates the maximally redundant communication. Beyond that, with the emergence of a new language, new redundancies are further produced.

I think this process lies at the heart of creativity in teaching and learning. It is the off-the-wall gestures of teachers in tearing up the rule-book (that's the book which constrains the messages!) that reinvigorate the communication. Those gestures only happen if teachers can inspect their own constraints as a way of identifying the maximally redundant communication. That shared absence then catalyzes new communications whose form (with their redundancies) gradually emerges. The deep question for online educators is how this can be done online - where the protocols are so rigid and where it is difficult (but not impossible) to step outside the box.

But perhaps what is most interesting in this is that there is a direct link between Shannon's insights into information and something more deeply human and creative.
