“A New Kind of Science”. Kind of New. Kind of Science.
In 2002 Stephen Wolfram published his 1197-page book “A New Kind of Science” which, generously, is completely available online here and from which we took some screenshots for this article.
It’s 2020 now, so in retrospect, let’s shed some new light on this book. Did it influence or even revolutionize science? Here is my summary.
1. Complexity From Simplicity
In the book, Wolfram mainly proposes and defends the thesis that simple programs running very simple rules can generate a wealth of visual patterns, some of them showing high complexity. To know whether this is true, it is important to define what counts as simple and what counts as complex. We’ll come to that.
But to set the context, we start with the examples from the book that Wolfram calls crucial.
1.1 Wolfram’s Computational Experiments
The first example Wolfram gives is one he ran when he was 12 years old, clearly inquisitive already, which puts it in 1959+12=1971.
The setup of this ‘computationally generative system’ is that we create a vector of a certain number of squares, where each square is either black or white. This vector represents the initial condition or first generation. Then a ‘simple program’ takes such a generation as input and, using a fixed rule, generates the next generation, a vector of the same length. The fixed rule is always of the type where the color of each square depends only on the colors of the squares in its immediate neighbourhood in the previous generation. So there is only local influence. It turns out that this is already enough to generate a variety of patterns in the results.
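To make this setup concrete, here is a minimal Python sketch of such a system (my own illustrative code, not Wolfram’s; the boundary handling, where cells outside the vector count as white, is an assumption of mine). The example rule shown is the ‘experiment 0’ rule discussed next.

```python
# Minimal sketch of the one-dimensional, two-colour setup described above.
# A generation is a list of 0s (white) and 1s (black); the next generation
# is computed cell by cell from each cell's immediate neighbourhood.

def next_generation(cells, rule):
    """Apply `rule(left, centre, right)` to every cell of one generation."""
    n = len(cells)
    padded = [0] + list(cells) + [0]          # treat the outside as white
    return [rule(padded[i], padded[i + 1], padded[i + 2]) for i in range(n)]

def run(initial, rule, steps):
    """Return the list of generations, starting from `initial`."""
    generations = [list(initial)]
    for _ in range(steps):
        generations.append(next_generation(generations[-1], rule))
    return generations

def show(generations):
    for row in generations:
        print("".join("#" if c else "." for c in row))

# Example rule ('experiment 0' below): black if the cell or either neighbour was black.
spread = lambda left, centre, right: 1 if (left or centre or right) else 0

width = 21
start = [0] * width
start[width // 2] = 1                          # a single black square in the middle
show(run(start, spread, 10))                   # prints the growing black pyramid
```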
1.1.1 Experiment 0: ‘Nothing’ Results
The zeroth experiment he tested (added retroactively, I think) was the rule “a square in generation n+1 becomes black if it or either of its neighbours was black in generation n”.
Quite obviously, even without executing the program, one can see that this leads to a spreading of black squares and that no white square is ever generated within the spread. So we get a black pyramidal structure, as in the picture below.
The result is not very exciting. What happens if we change the generative rule just a bit?
1.1.2 Experiment 1: Periodicity Results
The second rule is indicated in the below picture on the right and the result is given on the left.
This may or may not be surprising. If one reads the rule itself, “a cell becomes black if either of its neighbours was black” and “a cell becomes white if both its neighbours were white”, one sees that it already encodes a tendency for colors to spread along diagonals. Understanding that may take away ‘the magic’ for some. It will please others, I think, essentially because they grasp that a very simple, exact rule can generate a relatively complex structure: the alternation of black and white.
1.1.3 Experiment 2: Self-Similarity Results
Wolfram then shows an experiment with exactly the same setup but a slightly different fixed rule, where the modification means that two black neighbours now generate a white square in the next generation instead of a black one as in the previous experiment, all else being equal. Obviously that generates fewer black squares than the previous experiment for a given triangle depth. The resulting figure is shown below.
This picture displays a property called self-similarity, defined roughly as an object showing a similar structure from afar as when one zooms in on parts of it. (One could argue that the previous picture possesses the same property, but there it is self-similarity of smaller, less complex structures, so it is less obvious.)
The pattern of this second experiment will surprise more people than the result of the first, because it is simply harder to predict: (1) the rule is somewhat more complex and (2) the result is a lot more complex. That makes it more surprising and so also more interesting. One could also call the result more beautiful, and hence more satisfying, but that is of course subjective. ¹
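If I read the modified rule correctly, the new cell simply becomes the XOR of its two neighbours, which in Wolfram’s own numbering is elementary rule 90; under that assumption, a small self-contained sketch already produces the self-similar triangle:

```python
# Sketch of the 'two black neighbours give white' variant: assuming it amounts
# to XOR-ing the two neighbours (elementary rule 90), a self-similar,
# Sierpinski-like triangle grows from a single black cell.
width, steps = 33, 16
row = [0] * width
row[width // 2] = 1
for _ in range(steps + 1):
    print("".join("#" if c else "." for c in row))
    padded = [0] + row + [0]                    # cells outside the vector count as white
    row = [padded[i] ^ padded[i + 2] for i in range(width)]
```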
1.1.4 Experiment 3: Apparent/True PseudoRandomness Results
A further Wolfram experiment leads to a new feature in the result: apparent randomness.
It turns out that, after some 2780 steps, a fairly simple repetitive structure emerges, as one can see in the following two pictures.
Wolfram also gives examples of programs that generate structures without discernible periodicity at all.
Whether (a bit of) randomness is more beautiful than recognisable periodicity is subjective, but it certainly adds variation to regularity and so seems to add complexity.²
1.1.5 Are you Surprised?
Are the results of Wolfram’s experiments mentioned here surprising or not? The black pyramid is not surprising. The chessboard pattern is not either, since it is easy to predict from the rule itself. The self-similar pattern is harder to predict, so that is quite surprising. Even more unpredictable is the apparent randomness of the third experiment.
1.2 Non-Wolfram Computational Experiments
Wolfram does not mention experiments performed by others. So what if we ask the somewhat more general question: can one generate complex patterns from simpler ones? Can we come up with examples of our own, or from others?
1.2.1 Overlapping Equidistant Lines/Concentric Circles Generate a Surprising Moiré Pattern
When I was 16 and started programming, one of the first programs I cooked up was a graphical one that drew straight lines of varying slope over each other. In the area where these lines cross, a surprising pattern could be detected.
Similarly, as demonstrated on this wikipedia page, two sets of lines overlapping each other or two sets of concentric circles with increasing radius overlapping each other can generate Moiré patterns.
One could also consider this as two simple patterns generating a complex one. These are certainly hard to predict, unless one has seen them before.
Remarkably or not, the field lines of magnetic fields, so ‘in nature’, follow exactly the same kind of pattern.
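As a small, self-contained illustration (my own construction, not taken from the Wikipedia page), two families of concentric circles with slightly offset centres already produce such a Moiré pattern when plotted:

```python
# Two families of equidistant concentric circles with slightly offset centres.
# Where they overlap, a Moiré pattern appears (parameter values are illustrative).
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 6))
radii = [0.1 * k for k in range(1, 80)]         # equidistant radii
for cx in (-0.5, 0.5):                          # two centres, slightly apart
    for r in radii:
        ax.add_patch(plt.Circle((cx, 0.0), r, fill=False, linewidth=0.5))
ax.set_xlim(-4, 4)
ax.set_ylim(-4, 4)
ax.set_aspect("equal")
ax.axis("off")
plt.show()
```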
1.2.2 Unsurprising Self-Similarity: The Koch Curve
Self-similarity is an interesting property to explore in itself. Digging a bit into it, outside of Wolfram’s book, there is a more precise definition of it on Wikipedia. More captivating than the definitions are the examples mentioned there, because one discovers that they come in two kinds: those where the definition of the rules already contains self-similarity, and those where the self-similarity of the result is non-obvious, or in other words, surprising.
An example of the first category is the Koch curve, after the Swede Helge von Koch (1870–1924). This curve’s constructive definition is also expressed in terms of generations. The Koch curve of the initial generation is just a line segment of finite length. The next generation is constructed from the previous one by dividing each existing segment into three equal parts and replacing the middle part by the two other sides of the equilateral triangle that has the middle part as its base (the base itself is left out). Visually, this leads to the following first three generations.
The following animated GIF from Wikipedia shows an unbounded number of generations, that is, if you had infinite time to watch it.
Apart from being dizzying, this may seem like ‘magic’ to some. In fact, the self-similarity should not surprise you once you realise that it is already ‘obviously’ present in the definition of the curve. It was set up to be self-similar. It was constructed, invented, to be self-similar.
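For completeness, a minimal sketch of the constructive rule described above (my own implementation; representing points as complex numbers is just a convenience for the 60-degree rotation):

```python
# One Koch-curve generation step: each segment is split in three and the
# middle third is replaced by the two other sides of an equilateral triangle.
import cmath

def koch_step(points):
    new_points = [points[0]]
    for a, b in zip(points, points[1:]):
        p1 = a + (b - a) / 3                                   # end of first third
        p2 = a + 2 * (b - a) / 3                               # start of last third
        peak = p1 + (p2 - p1) * cmath.exp(1j * cmath.pi / 3)   # 60-degree bump
        new_points += [p1, peak, p2, b]
    return new_points

def koch_curve(generations):
    points = [0 + 0j, 1 + 0j]                                  # generation 0: one segment
    for _ in range(generations):
        points = koch_step(points)
    return points

print(len(koch_curve(3)) - 1)                                  # 64 segments after 3 generations
```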
This is, at least as far as I see, not the case in the next type of structures.
1.2.3 Surprising Self-Similarity: The Mandelbrot Set
An example of the surprising category is the family of pictures generated from the Mandelbrot set. For our argument, it is of course important to understand how the rule assigning a color to a pixel in the picture is defined at its base.
Wikipedia defines the Mandelbrot set as the collection of complex numbers c for which the iterated sequence f_c(0), f_c(f_c(0)), f_c(f_c(f_c(0))), … does not diverge to infinity.
The function f_c is not just an arbitrary one, but is defined as the ‘very simple’ f_c(z) = z²+c.
The practical way to calculate and show the Mandelbrot set, and the surrounding pixels that are not part of it, is to select a rectangle in the complex plane (say c = x + iy with -1 ≤ x ≤ +1 and -1 ≤ y ≤ +1, with each c mapped to a pixel (x, y) of the generated picture) and, for each pixel, apply the function repeatedly: z1 = f_c(0), z2 = f_c(z1), z3 = f_c(z2), … until |zi|, the distance of zi to the origin of the complex plane, can be considered diverging, i.e. larger than a certain threshold, say 100 or so. One then assigns to c a color that depends on the number i at which zi first got outside the radius-100 circle around the origin. When a chosen, fixed maximum number of iterations is reached for a certain c without divergence, the pixel is considered part of the Mandelbrot set and gets assigned the color black.
Clearly, there is no self-similarity directly imposed in this iterative procedure, or we should at least say that it is not obvious that self-similarity is directly imposed in the formulation.
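A minimal sketch of this escape-time recipe (my own code; I keep the generous escape radius of 100 used above, although a radius of 2 already suffices mathematically):

```python
# Escape-time colouring of the Mandelbrot set, following the recipe above.
# Escape radius and iteration cap are illustrative choices.
def escape_count(c, max_iter=200, radius=100.0):
    """Iterations of z -> z*z + c (starting at 0) before |z| exceeds the radius;
    returns max_iter if it never does (the pixel is then treated as 'black')."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > radius:
            return i
    return max_iter

# Crude ASCII rendering over a rectangle of the complex plane.
chars = " .:-=+*#%@"
for y in range(24):
    im = 1.2 - y * 0.1
    row = ""
    for x in range(78):
        re = -2.0 + x * 0.035
        n = escape_count(complex(re, im))
        row += "@" if n == 200 else chars[n % len(chars)]
    print(row)
```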
Yet, consider the following resulting picture, and what happens when we zoom in to explore substructures…
As a GIF, obtained from Wikimedia Commons (by Simpsons contributor at English Wikipedia, Public Domain): https://commons.wikimedia.org/w/index.php?curid=9277589
or as a youtube video
or for another rendering, including some entertaining music :)
Clearly self-similarity is omnipresent. The sub-structures occurring on all levels of magnification are also extremely complex.
Isn’t it surprising that all this complexity gets generated from a very simple rule: repeatedly applying the function f_c(z) = z² + c, starting at 0, and counting how many iterations are needed for divergence?
Unable to see exactly where the rich structure comes from, one gets the feeling that it cannot have been planned by the human who defined the generator, and so ‘is just present in nature itself’. In other words, the generated pattern is a discovery, not an invention.
Wikipedia gives some history of the Mandelbrot set:
The Mandelbrot set has its origin in complex dynamics, a field first investigated by the French mathematicians Pierre Fatou and Gaston Julia at the beginning of the 20th century. This fractal was first defined and drawn in 1978 by Robert W. Brooks and Peter Matelski as part of a study of Kleinian groups.[2] On 1 March 1980, at IBM’s Thomas J. Watson Research Center in Yorktown Heights, New York, Benoit Mandelbrot first saw a visualization of the set.[3]
It ends the introduction on the set with:
The Mandelbrot set has become popular outside mathematics both for its aesthetic appeal and as an example of a complex structure arising from the application of simple rules. It is one of the best-known examples of mathematical visualization and mathematical beauty.
That high complexity can arise from very simple rules is the number-one message the book “A New Kind of Science” wants to drive home, using hundreds of examples of Wolfram’s own. To me, the Mandelbrot set is a much better example of the generation of complexity from simple rules. It is therefore quite remarkable that Wolfram’s first mention of it is on page 899, in the (foot)note part of the book.³ ⁴
2. Nature Shows Complexity: How Can Science Best Describe and Predict It?
In chapter 7 of Wolfram’s NKS book, many examples of complexity present in nature are given. The first, and a very convincing one, is about the typical forms of growing snowflakes. It is given in the figure below.
This picture directly convinces one that structure, periodicity and self-similarity are all present. It poses some questions.
Why are snowflakes hexagonally structured?
Why are they alternating on a very local level between ice and no ice?
Why are snowflakes so symmetric? In other words, how does one arm of the 6 know to behave in the same way as the others? ⁵
Hexagonality is due to the crystal structure of ice. Wolfram answers the second question by saying “the major effect responsible for this is that whenever a piece of ice is added to the snowflake, there is some heat released, which then tends to inhibit the formation of other pieces of ice nearby.” This is a qualitative answer; traditional science would normally take a quantitative approach by, for example, presenting a (set of coupled) partial differential equation(s) to model this more accurately. Apart from giving a more accurate description of the phenomenon, the advantage of the classical analytical approach is also that computers and software libraries directly allow one to simulate the phenomenon. While differential equations are only descriptive and by themselves do not contain a recipe to simulate, nowadays a practical constructive recipe follows directly from them.
Wolfram does not mention this. That is odd, because he is the main architect of the Mathematica package, which, among many other things, is able to do exactly that: take a differential equation, solve it, and end up showing an example simulation of the phenomenon.
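To make the point that a constructive recipe follows from the descriptive model concrete, here is a hedged sketch, not of any snowflake physics, but of the simplest instance of the same workflow: the one-dimensional diffusion equation ∂u/∂t = α ∂²u/∂x², discretised with an explicit finite-difference scheme (all parameter values are illustrative):

```python
# Explicit finite-difference scheme for du/dt = alpha * d2u/dx2 on [0, 1]
# with fixed (zero) boundaries.
alpha, nx, nt = 1.0, 51, 500
dx = 1.0 / (nx - 1)
dt = 0.4 * dx * dx / alpha                # small enough for stability (<= 0.5*dx^2/alpha)

u = [0.0] * nx
u[nx // 2] = 1.0                          # initial condition: a hot spot in the middle

for _ in range(nt):
    new_u = u[:]
    for i in range(1, nx - 1):
        new_u[i] = u[i] + alpha * dt / (dx * dx) * (u[i - 1] - 2 * u[i] + u[i + 1])
    u = new_u

print(max(u))                             # the peak has diffused and flattened
```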
Wolfram does mention that with his approach of cellular automata a visual result close to what is observed in nature can be generated. He shows convincing results of that as in the picture below.
One could argue that while the result is ‘close’, showing similar but not identical structures, it does not prove that the process is the same as the one taken by nature.
With the traditional approach to science via differential equations, one only has to take the new step of accurately describing the newly studied phenomenon in an equation, based on possibly new but often already discovered physics; the constructive recipe then just rolls out via equation-solving software libraries.
As an example of a very condensed and powerful descriptive model in physics, consider Maxwell’s equations of electromagnetism below.
With this differential model, the complete field, that is, the strength and direction of the electromagnetic force at each point, can be derived. This works with standard calculus; I happily solved countless exercises in my first engineering-education years by integrating over surfaces or volumes. These equations are also the basis of numerical software libraries like Magpylib, with which computations of E and B for much larger or more complex settings can be performed automatically.
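The figure itself is not reproduced here; presumably it showed the standard differential form of Maxwell’s equations, which for reference read:

```latex
\begin{aligned}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0}, &
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t},\\
\nabla \cdot \mathbf{B} &= 0, &
\nabla \times \mathbf{B} &= \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.
\end{aligned}
```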
On the other hand, with the Wolfram cellular-automata approach, one formulates a new ‘simple rule’, tries it out, and sees if it gives similar results. If not, the ‘simple rule’ can (all too?) quickly be altered. I believe there is a danger there, corresponding to the danger of overfitting: getting results that look like what nature produces without actually capturing the essence of the process present in nature. Also, once a certain rule seems to give a nature-like result for one setting, will it also give a nature-like result for another? How should the rule be changed to adapt to a new setting, or not at all? In other words, does it have predictive power beyond the possibly overfitted first experiment?
One may say that this does not matter as long as the results resemble what nature creates. So it depends on whether one raises the deeper question of truth (does nature really behave like this?) or the more superficial question of utility (can I just get nature-like pictures?). I think the first question is the one science is supposed to answer. So to me, A New Kind of Science looks like a Lesser Kind of Science.
3. Could Simple Programs Form the Basis of the Complexity Found in Nature?
Would nature use ‘simple programs’ to come up with the complex structures that we see in plants and animals and also in inanimate objects around us?
Wolfram does not claim that, when he presents an artificial automaton-generated picture that looks like some structure in nature, nature actually performs the same computation.
He does formulate a more careful version of this claim, which he calls the ‘principle of computational equivalence’. In Wolfram’s words: “What the Principle of Computational Equivalence says is that above an extremely low threshold, all processes correspond to computations of equivalent sophistication.” Even though he spends many pages on this, I still find this a very vague claim, sounding more like a tautology.
4. My Conclusion
My conclusion about A New Kind of Science is this.
4.1 Is it New?
Yes, it was. The kind and scope of the cellular (and other) automata that Wolfram describes were quite new when published in 2002. But several earlier or simultaneous discoveries, like Conway’s Game of Life and the Mandelbrot set, were already known, were very relevant, and are quite downplayed in the book.
4.2 Is it Science?
In the traditional method of science, one first aims to describe a phenomenon by a compact, concise descriptive model, most often as (a set of possibly coupled) equation(s). Then a widely accepted process, encoded in software by others, follows to perform a simulation with these equations as input. This corresponds to the scientific method: a hypothesis is formulated first, without yet checking what the results would be, and only after careful consideration of the descriptive model is the outcome produced with the generative recipe and checked. One typically has one program with the same rules and should be able to predict results from nature by changing only the initial conditions.
In Wolfram’s proposed approach with cellular (or other) automata of simple programs, there is essentially just one step, in which, for given initial conditions, one can manipulate and tune the rule without constraints until it gives the desired results. This is a less general method than that of classical science.
In both approaches, trial-and-error iterations are not forbidden, but I think the split between a descriptive and a constructive step in the traditional approach to science gives a higher chance of getting close to the process nature follows than Wolfram’s approach does.
But well, it may just be a matter of taste or habit.
Is it science? … Kind of. It depends on what one calls science. (see: “Is it Useful” below.)
4.3 Is it Wolfram’s?
Stephen Wolfram is without a doubt a brilliant mathematician and scientist. He also clearly worked extremely hard, with the Wolfram team he led, to first construct the software package “Mathematica” and then, based on that, to produce the monumental volume that NKS is. However, at times he downplays the contributions of others to the field of cellular automata. In one footnote on page 849, called “Clarity and Modesty”, he even writes […] “Perhaps I might avoid some criticism by a greater display of modesty, but the cost would be a drastic reduction in clarity.” I don’t think there needs to be a trade-off here, and certainly not a drastic one.
4.4 Is it Useful?
Did the book ‘A New Kind of Science’ have an impact from 2002 to 2020?
It would be most informative to see if Wolfram himself has an opinion on that. And indeed he has written elaborately about this online here.
Summarised, I think we can say that in the last 10, and certainly the last 5, years, genetic algorithms and especially neural networks have been very successful. This is, for example, demonstrated by the 2016 victory of DeepMind’s AlphaGo over the best human players of the game Go, and by the DARPA autonomous-driving contests (2004–2018). The latter has by now also given rise to a whole new industry.
However, I would attribute the success of that technology to the learning capacity of those systems, which comes from training them on existing input-output sets and results in the capacity to predict outputs for new, similar inputs. Neither this process nor this capacity is present in Wolfram’s cellular automata as presented in NKS.
Wolfram claimed in 2015 “But to me the success of today’s neural nets is a spectacular endorsement of the power of the computational universe, and another validation of the ideas of A New Kind of Science. Because it shows that out in the computational universe, away from the constraints of explicitly building systems whose detailed behavior one can foresee, there are immediately all sorts of rich and useful things to be found.”.
It has become clear that we indeed do not need to restrict ourselves to classical science to obtain useful engineering applications, with the footnote that, given the inherently lower understanding of how these models work internally (coming from the training and possible overfitting), we run a higher risk of predicting outcomes that do not materialise in nature. Whether these predictive methods can be used then depends on the level of criticality of the application and, if outlier events can occur in practice, on whether the models were trained on such events.
So George Box’s aphorism comes to mind here.
All models are wrong but some are useful.
4.5 What’s Interesting about the Book?
NKS certainly convinces one that simple programs can lead to beautiful pictures containing a lot of surprising structure, so it certainly has artistic and some philosophical value. The book can also be seen as a catalogue of programs, with some parallels to a naturalist’s reference work, which is a nice idea in itself. I am less convinced that it contributes a new method that can be used in theoretical science. For engineering applications, recent history has confirmed that neural networks rather than cellular automata are the most useful, because of their ability to learn.
Footnotes
1. An interesting side question here is whether the subjective experience of beauty has more to do with the subjective surprise, or the feeling of discovering something new, than with the objective structure in the picture itself. I think it is about both, or even the connection between the two. And indeed, from this, one wonders whether nature would somehow also ‘computationally work’ like this.
2. Note that in information theory, maximum disorder, also called maximum entropy (which, according to the second law of thermodynamics, is what any closed system in nature ultimately evolves to), is defined as the case where there is the least possible information about what is going to happen next. More formally, in terms of mutually exclusive events e out of a set E of N events, where each e occurs with probability p(e), the missing information H(E) can be defined as:
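The formula itself is not reproduced in the text; in its standard Shannon form (which I assume is what is intended here) it reads:

```latex
H(E) = -\sum_{e \in E} p(e) \log p(e)
```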
It can be proven via a Lagrangian derivation (see e.g. here) that this is maximised when p(e) = p(e′) for all event pairs (e, e′) in E×E. This implies p(e) = 1/N, which shows that all events being equally likely corresponds to maximum randomness.
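Briefly sketched, that derivation maximises H subject to the probabilities summing to one:

```latex
\mathcal{L} = -\sum_{e \in E} p(e)\log p(e) + \lambda\Big(\sum_{e \in E} p(e) - 1\Big),
\qquad
\frac{\partial \mathcal{L}}{\partial p(e)} = -\log p(e) - 1 + \lambda = 0
\;\Rightarrow\; p(e) = e^{\lambda - 1}\ \text{for every } e
\;\Rightarrow\; p(e) = \frac{1}{N}.
```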
3. Unsurprising versus Surprising Structure: It’s Subjective.
The distinction between what is surprising and what is not depends on the observer’s knowledge and understanding, and so is certainly subjective. Once more knowledge is acquired (e.g. with the Koch curve), something that was surprising may become logical, in the sense that one understands what causes it (giving satisfaction through the feeling of having discovered a new relation); it remains interesting but stops being surprising.
If one looks at the animated zooms of the Mandelbrot set, at least to me, it is still incredible that so much structure keeps coming up from ‘just z²+c’. So one keeps wondering why. The animation keeps surprising me.
4. At the outset we said that Wolfram claims that simple programs can produce complex patterns. It is quite clear that all the rules presented here, and also in the book, are quite simple. The complexity found in the output pictures comes in different levels: from blandness (a black pyramid), to periodicity (a chessboard-pattern pyramid), to self-similarity (pyramids containing sub-pyramids, Koch curves and fractals). We have tried to show that this is only interesting to the degree that it remains surprising even after closer inspection.
So to me, Wolfram’s main thesis should be understood in the sense that the ratio of complexity in the output to complexity in the input is surprising and interesting when it is high. I still find this ratio highest in fractals like the Mandelbrot set. It seems to reveal more about ‘nature’ than Wolfram’s examples do. I think this is also why so many people got excited about fractals in the 1980s, when they became part of the popular-science press.
5. Why are snowflakes so highly symmetric? In other words, how does each of the 6 arms know to behave in the same way as the others? This Scientific American article gives a qualitative answer, indicating that the symmetry follows from the minimisation of energy in the structure, which balances the maximisation of attracting forces and the minimisation of repelling forces. So surely one arm does not actively know about the others, but given homogeneous temperature and humidity conditions, it seems that, via the forces, when one arm starts forming in one way, the other arms have no choice but to do the same. The sheer variety of snowflake structures is a consequence of the large variation of these conditions, from high to low altitude, across different snowflake trajectories through our atmosphere.
Appendix: Further Nature vs. Computational Experiment Similarities
To show that the scope of the book is not only about snowflakes, here are some more examples from the book. On page 385 of NKS, Wolfram gives beautiful examples of periodic or self-similar structures found in nature at many different scales.
He then produces similar structures artificially with his automata like these ones on page 400.
The structures are quite like the ones in row 3, columns 3 and 4 of the pictures from nature above, and certainly arty. But do they capture anything that actually ‘happens’ in nature, or is the similarity accidental? Not sure… Is it useful for science? I am not so convinced.
More examples from chapter 7 concern the breaking of materials, fluid flows, 2D pigment structures and 3D shapes in biology and even financial systems.
Peter Sels, August 7th 2020. Written for all people curious about the NKS book, but wanting to be efficient in spending their time. :)
Copyright © 2020 Logically Yours BV.