Thursday 29 January 2015

The Limits of Algorithms

Chris Yapp, on his Future TechBlog, has posted a piece "The Limits of Algorithms" and my detailed response to the subject is given below.

Science and Mathematics is all about building abstract models which attempt to reflect various aspects of reality. As someone who did a Ph.D. in theoretical organic chemistry I am well used to the idea of multiple models of atoms and how they interact and whether, for example, it is useful to think of them as rather like miniature billiard balls, or as abstract probability functions. The problem Chris Yapp refers to arises because the computer industry has myopically concentrated on a single model based on pre-defined algorithms for the way information can be processed.
The early computers were developed to carry out highly repetitive mathematical calculations which could not be done quickly or accurately enough by specially trained human beings. It turns out that many highly repetitive tasks could be represented by a predefined set of rules (an algorithm) and hence be carried out using a computer. What could be done was limited by the speed and memory of the computer, and the ability of people of people to program the systems. However this proved to be no real barrier as every few years faster computers, with more memory, and easier to use programming languages appeared on the market, while more and more people were trained to use them. There was big money to be made and careers to be built and it seemed that everyone tried to get on the bandwagon. Worldwide hundreds of people started to develop better hardware and software and the result was a rat race where the first to get to get a successful product to the market won, and the rest fell by the wayside.

In this heated environment did anyone stop and ask whether there was a alternative information model for handling open-ended tasks involving dynamic interaction with human beings. Even the pioneering work at Xerox Parc, which led to the kinds of user interfaces we find today on personal computing systems, did not go back to first principles. It took if for granted that computers were inherently opaque black box systems and that what was needed was a front end which hid the incomprehensible internal workings from the human users. Dozens of different computer languages were devised to find different ways to write algorithms – without asking whether humans naturally thought in an algorithmic way. It was suggested that there was no point in looking for an alternative approach because theoreticians such as Turing related the stored program computer to a “universal machine” – and surely one couldn’t possibly start with anything better than a universal machine. In fact anyone who took time off to question the scientific foundations of what was an outrageously successful industry would soon find themselves at the back of the queue in the race for fame and fortune.
But is the algorithmic model really the best or only “universal machine” model for handling information – especially when incompletely understood and dynamically changing real world tasks involving incomplete and fuzzy information is concerned?

My own research suggests that there is an alternative – but to someone who is immersed in the world of formal algorithms the first steps are counter-intuitive.
In 1967 I made a mistake as far as my career was concerned as I would undoubtedly have had an easier life if I had not queried the establishment line. I was a comparative newcomer to the computer industry, but one who had entered via an unusual career path. I had experience of working in a very complex manual management information system where the key was spotting and reporting the unexpected. I then moved to a very large and complex commercial sales accounting system (Shell Mex & BP) in a completely different industry where the problem was interfacing with a wide and ever changing market. It may well have been one of the most advances computer system of its type at the time.  Finally I moved to a planning department concerned with the probable market place for next generation large computers. My mistake was to pass my boss a note which said that I thought it might be possible to reprogram the microcode of an IBM architecture computer to give it a human friendly symbolic assembly language. This language was called CODIL as it was a Context Dependent Information Language. In retrospect what I had done was to take my manual skills in processing open-ended tasks and transferred to the computer.

The note was passed to the computer pioneers David Caminer and John Pinkerton (who I understand consulted Professor Maurice Wilkes) and as a result I was quickly transferred to research with a useful sized budget and told not to talk to anyone until the patents had been taken out. What happened was that an initial tentative idea, which in retrospect needed several years interdisciplinary brainstorming, was dropped straight into the computer industry rat race. Apart from the fact that the idea clearly caused excitement I had no idea how unconventional it was, and knew nothing about research into the relevant mathematical theory or psychological studies relevant to modelling human thinking. I spent two years writing and testing a pilot simulation program which demonstrated that the idea was at least capable of processing a range of different applications. My reward was to be declared redundant because of the formation of ICL and the closure of the research division in which I worked. Despite the support of Basil de Ferranti (the new Research Director) my project was deemed irrelevant to the company policy of developing the 2900 Series  of computers- and it had to go.
So, with the benefit of nearly 50 years hindsight, what was the idea at the heart of my proposal?  
The stored program model is a rule based top-down approach which uses numbers to process numbers and assumes that there is a human “creator” who can, a priori, define the rules. If you look carefully at the “universal machine” approach you realise that the theory does not cover cases where the rules are not knowable in advance. In practice there is the additional restriction that any “knowable” rules must be identifiable and implementable at a reasonable cost and on a realistic timescale.
In contrast, the CODIL model I developed is bottom up pattern recognition approach which assumes no prior knowledge of the task to be handled. It uses sets and partitions of sets when viewed as a mathematical model but these can be considered as concepts when its human user interface is considered. (For example the CODIL item “Murderer = Macbeth” is treated by the system as defining “Macbeth” as a member of the set “Murderers”.) In set theoretic terms the model seems weak but its strength lies in the power of recursion plus the ability to morph into the stored program computer model. This can happens if you can divide the patterns into two – with one set of patterns being “the rules” and the other set being “the data”. However the system is best when handling problems when there is no clear pre-definable global model and it can become very inefficient when handling tasks which require millions of iterations through a small number of precisely defined rules working within a highly constrained global model – the very area where the stored program computer is strongest.
The two models are in fact complementary – providing different views of the world of information processing:
·        The stored program computer model can be considered a special case of the CODIL model – but at the same time the CODIL model can be considered a special case of the stored program computer model – as it represents an algorithm whose specialist tack is to provide human users with a tool to manage tasks for which the relevant algorithms are not known in advance.
·        The stored program computer model is best at formal mathematical tasks which people find difficult – while the CODIL model is more appropriate to open ended real world tasks where human interaction is essential. This means that they both excel in the areas where the other is weakest.
·        The original proposal (relating to reprogramming the processor microcode of an existing system), and some later research, suggests that it could be possible to build systems which combine both models.
·        Recent work suggests that the CODIL model will map onto a neural network and provide the basis of an evolutionary pathway to explain human intelligence.
So what happened next?