Information Theory



I stumbled onto this book a few weeks ago and, after a quick browse through its sections, immediately picked it up and added it to my books-to-read list. I love anything related to information theory, mainly because of its interdisciplinary applications. The principles of information theory apply in a wide range of fields; in fact, it would be hard to pinpoint a specific area where its concepts have not been applied. In this post, I will summarize the main points of the book.


Prologue : The Eternal War

The chapter is titled so because there is a conflict between entropy and information. Entropy is the incessant march towards disorder. One way I can relate to this is through my music practice. If I don’t practice for a long time, I find it difficult to retrain my fingers and get back my muscle memory. "That which you don’t use atrophies". Entropy is something similar. In the absence of any mechanism to create information, the disorder of a system increases. This raises a question about the mechanisms that allow information to battle randomness and grow. The book is mainly about describing the mechanisms by which information grows and the physical order of our world increases, the order that makes our planet unique, rich and uneven, from atoms to economies. The author focuses on planet Earth because it is a special place where information lives, grows and hides in an otherwise mostly barren universe.

In the prologue, the author says that the book would answer the following questions:

  • What is information?
  • Where does it come from?
  • Why is information concentrated on our planet?
  • Why does it grow on our planet?
  • What are the natural, social and economic mechanisms that allow it to grow?
  • How do the various mechanisms contribute to the social and economic unevenness of the global economy?
  • How does the social accumulation of information improve our capacity to accumulate even more information?

Introduction : From Atoms to People to Economies

The chapter starts with the story of Ludwig Boltzmann, the famous physicist who committed suicide. Though the exact reason is not known, the author speculates that it could have been the apparent conflict between his theory and the order prevalent in the world. His theory implied an inevitable march towards disorder, which stumped him because so many fascinating things in nature are orderly and systematic, almost giving the impression that there is a creator up there designing our world. The biggest sin that Boltzmann committed, given the scientific temper of his time, was that he had worked across spatial scales. His theory made connections between atoms and gases, which belong to different spatial scales. At that point in time, making any connection between spatial scales was considered a sin.

At the turn of the twentieth century, Boltzmann was vindicated. There was immense cross-fertilization of ideas amongst many fields. Yet not all of the cross-fertilization took place near known scientific boundaries. Amid these multidisciplinary tangos, there was one concept that was promiscuous enough to play the field. This was the idea of information. In the twentieth century, the study of information was inspired by war, as there was an urgent need to encode and decode messages effectively. The field took off after the revolutionary paper by Claude Shannon and Warren Weaver. Information as a concept found followers in almost every field for the simple reason that it could be applied to microscopic as well as macroscopic worlds. It was the first truly scale-independent concept. Even though the idea of information grew in prominence, many began to forget one crucial aspect of information:

We forget about the physicality of information that had troubled Boltzmann. The word information became a synonym for the ethereal, the unphysical, the digital, the weightless, the immaterial. But information is physical. It is as physical as Boltzmann’s atoms or the energy they carry in their motion. Information is not tangible; it is not a solid or a fluid. It does not have its own particle either, but it is as physical as movement and temperature, which also do not have particles of their own. Information is incorporeal, but it is always physically embodied. Information is not a thing; rather, it is the arrangement of physical things. It is physical order, like what distinguishes different shuffles of a deck of cards.

One of the highlights of the work of Shannon and Weaver is that they divorced the idea of information from that of meaning. Colloquially we use the two interchangeably, but they had to be separated before the field could develop further. Whatever gets transmitted between two devices, or two people, is information. It is humans who automatically interpret that information as meaning, given the various contextual factors. This clear demarcation mattered because, technically, one could now focus on sending any kind of message, whether the message meant anything or not. Shannon also came up with a formula for the minimum number of bits needed to encode an arbitrary message with maximum efficiency. This formula looked identical to Boltzmann’s formula.
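To make the resemblance concrete, here are the two formulas side by side. These are the standard textbook forms rather than anything quoted from the book, and the symbols (p_i for the probability of symbol or microstate i, k_B for Boltzmann's constant, W for the number of microstates) are my own notation.

```latex
% Shannon entropy of a source emitting symbol i with probability p_i (bits per symbol)
H = -\sum_{i} p_i \log_2 p_i

% Gibbs form of the statistical-mechanics entropy over microstates i
S = -k_B \sum_{i} p_i \ln p_i

% With W equally likely microstates (p_i = 1/W) this reduces to Boltzmann's formula
S = k_B \ln W
```

Apart from the constant k_B and the base of the logarithm, the first two expressions have the same shape, which is presumably the resemblance the author has in mind.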

The beauty of information being scale independent is that one can use the principles of information theory to describe everything from atoms to economies. In all previous attempts, the natural sciences described the connection between atoms and humans, and the social sciences described the connection between humans and economies. Using the concept of information, one can analyze across all scales. The content of the book is laid out in such a way that it describes the history of the universe, centered not on the arrow of time but on the arrow of complexity.

It is the accumulation of information and of our ability to process information that defines the arrow of growth encompassing the physical, the biological, the social, and the economic, and which extends from the origin of the universe to our modern economy. It is the growth of information that unifies the emergence of life with the growth of economies, and the emergence of complexity with the origins of wealth.

The Secret to Time Travel

This book made me look at childbirth from a completely different perspective. The author treats childbirth as an example of time travel: the baby is transferred from an environment (the mother’s womb) that has remained essentially the same for the last 1000 years into a 21st-century world that is largely alien to the species. There are a ton of machines, gadgets, technologies and objects that are realizations of human knowledge and knowhow. All the objects that we see around us embody information and imagination. The author uses two central actors, amongst many, to describe the way information grows:

  1. Physical objects: physical embodiment of information
  2. People: fundamental embodiment of knowledge and knowhow

The fundamental perspective of the author is,

Economy is the system by which people accumulate knowledge and knowhow to create packets of physical order, or products, that augment our capacity to accumulate more knowledge and knowhow and, in turn, accumulate more information.

How are humans different from other species on the planet?

The fact that objects embody information and imagination may seem obvious. Information is a fundamental aspect of nature, one that is older than life itself. It is also an aspect of nature that accelerated with life. Consider the replication of information-rich molecules, such as DNA and RNA. The replication of DNA and RNA is not the replication of matter but the replication of the information that is embodied in matter. Living organisms are highly organized structures that process and produce information. Yet, our focus here will not be on the information-generating capacity that is embodied in the intimacy of our cells but that which emerged with humans and society. Humans are special animals when it comes to information, because unlike other species, we have developed an enormous ability to encode large volumes of information outside our bodies.

Humans are able to create physical instantiations of the objects we imagine, while other species are stuck with nature’s inventory.

The Body of the Meaningless

This chapter clarifies the differences amongst various terms used in information theory. Terms such as entropy and information are often used interchangeably. They can be in some situations, but not always. Shannon’s definition of information relates to the number of bits required to encode a message with maximum efficiency. In this sense, a highly regular, correlation-rich structure has less information, and a randomized set of instructions in a message has more. He termed this measure "entropy" (von Neumann told Shannon that calling his measure entropy would guarantee Shannon’s victory in every argument, since nobody really knew what entropy was). If I consider my laptop, it contains many documents, pictures, videos etc. In Shannon’s language, if I randomly flip the bits in my computer, the information increases. But this does not match our intuitive definition of information, under which the more regular and ordered the data is, the more information it holds. So there is a need to expand on entropy as defined by Shannon so that one can use those concepts to talk about information in the sense we can relate to.
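A small sketch of my own (not from the book) shows the counterintuitive part. It estimates Shannon entropy from single-symbol frequencies for a perfectly regular string and for a randomly scrambled one of the same length. The function name, the 8-letter alphabet and the string length are all arbitrary choices for illustration, and a frequency-based estimate like this ignores correlations between neighbouring symbols, so treat it as a rough picture only.

```python
import math
import random
from collections import Counter


def entropy_bits_per_symbol(text):
    """Shannon entropy H = sum(p * log2(1/p)) over observed symbol frequencies."""
    counts = Counter(text)
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A perfectly regular "file": one symbol repeated.
ordered = "a" * 1000

# The same length, but every position drawn at random from 8 symbols.
rng = random.Random(0)  # fixed seed so the run is repeatable
scrambled = "".join(rng.choice("abcdefgh") for _ in range(1000))

h_ordered = entropy_bits_per_symbol(ordered)
h_scrambled = entropy_bits_per_symbol(scrambled)

print(f"ordered:   {h_ordered:.3f} bits per symbol")
print(f"scrambled: {h_scrambled:.3f} bits per symbol (upper bound is log2(8) = 3)")
```

The regular string needs no bits per symbol, while the scrambled one needs close to the maximum of 3. In Shannon's sense the scrambled string carries more information, even though it means nothing to us.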

The author gives a nice analogy of a half-filled stadium to show the difference between entropy as defined in statistical physics and entropy as defined by Shannon. In statistical physics, entropy depends on the "multiplicity of states". A highly disordered system tends to have a higher multiplicity of states and hence higher entropy. However, a higher-entropy system is not necessarily more disordered. In other words, disorder usually goes with higher entropy, but not always. In the physical sciences, information has always referred to something that has order. So, for physical states, information is the opposite of entropy. The ordered states, commonly referred to as information-rich states, are highly correlated structures. These information-rich structures are also uncommon and peculiar structures in nature.
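Here is a toy calculation of my own, loosely inspired by the stadium analogy, to show what "multiplicity of states" means. The seat counts and the three seating rules are made up for illustration and are not the author's numbers. Entropy is computed as the logarithm of the multiplicity, in units where Boltzmann's constant is 1.

```python
import math


def multiplicity(seats, people):
    """Number of distinct ways to choose which seats are occupied."""
    return math.comb(seats, people)


def entropy(w):
    """Boltzmann entropy S = ln(W), in units where k_B = 1."""
    return math.log(w)


SEATS, PEOPLE = 100, 50  # a half-filled toy "stadium"

# Macrostate 1: spectators may sit anywhere.
w_anywhere = multiplicity(SEATS, PEOPLE)

# Macrostate 2: spectators are confined to the 60 seats nearest the field.
w_front = multiplicity(60, PEOPLE)

# Macrostate 3: spectators must fill exactly the front 50 seats.
w_packed = multiplicity(50, PEOPLE)

for label, w in [("anywhere", w_anywhere), ("front 60", w_front), ("front 50", w_packed)]:
    print(f"{label:>9}: W = {w:.3e}, S = {entropy(w):.2f}")
```

The tighter the rule, the fewer seatings are compatible with it and the lower the entropy. The fully packed arrangement is compatible with exactly one seating, which is the sense in which ordered, information-rich states are rare.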

The author uses the example of a Rubik’s cube to illustrate the rarity of ordered states in nature. A Rubik’s cube has about 4.3 × 10^19 possible states, and the perfect state can be reached from any of them in 20 moves or fewer. However, finding the path to this ordered state is so hard that anyone who can get there in fewer than 30 moves is called a genius. This example can be extrapolated to nature. The growth of entropy is like a Rubik’s cube in the hands of a child. In nature, information is rare not only because information-rich states are uncommon but also because they are inaccessible given the way in which nature explores the possible states. The author provides a few nice examples that show the connection between the multiplicity of states and the ability to process information, i.e. to compute.
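For the curious, the number of states can be worked out in a few lines. This is the standard counting argument for the 3×3×3 cube, not something taken from the book: 8 corners can be permuted and each twisted three ways, 12 edges can be permuted and each flipped two ways, and parity and orientation constraints remove some of the combinations. The variable names are my own.

```python
import math

# 8 corner pieces: any permutation, and 7 of the 8 twists are free
# (the last corner's twist is forced by the others).
corner_arrangements = math.factorial(8) * 3**7

# 12 edge pieces: any permutation, and 11 of the 12 flips are free.
edge_arrangements = math.factorial(12) * 2**11

# Corner and edge permutations must have the same parity, which halves the total.
reachable_states = corner_arrangements * edge_arrangements // 2

print(f"{reachable_states:,} reachable states")
print(f"about {math.log2(reachable_states):.1f} bits to name one state")
print(f"chance that a uniformly random state is the solved one: {1 / reachable_states:.2e}")
```

Only one of those roughly 4.3 × 10^19 states is the solved cube, so a child twisting at random is overwhelmingly likely to stay lost among the disordered ones.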

The main idea of this chapter is to look at the word "information" as defined by Shannon, and then reconcile the concept with the colloquial meaning of the word information and the work of Boltzmann.

The Eternal Anomaly

If the natural tendency of a system is to move towards disorder, towards higher entropy, how does one explain the information explosion on our planet? If we look around the planet, it is amazing to see so many beautiful creations of nature. Why didn’t our planet disintegrate into chaos? Why does information grow on our planet? To explain this phenomenon, the author introduces the theory put forth by Ilya Prigogine. The main idea of the theory is:

Information emerges naturally in the steady states of physical systems that are out-of-equilibrium.

The author unpacks the above statement using many examples, such as a marble in a bowl, a box filled with gas and a whirlpool in a sink. Prigogine realized that although Boltzmann’s theory was correct, it did not apply to what we observe on Earth because our planet is an out-of-equilibrium pocket inside a larger system, the universe, that is moving toward equilibrium. In fact, our planet has never been close to any form of equilibrium. Prigogine did the math and showed that out-of-equilibrium systems give rise to information-rich steady states. So, that explains "Where does information come from?". In an out-of-equilibrium system, such as Earth, the emergence of information is expected. It is no longer an anomaly. The bad news, however, is that entropy is always lurking on the borders of information-rich anomalies, waiting to devour these anomalies as soon as it gets the chance. Yet information has found ways to fight back. As a result, we live on a planet where information is "sticky" enough to be recombined and created. This stickiness, which is essential for the emergence of life and economies, also hinges on additional fundamental physical properties.

The author explains three mechanisms that make information sticky. The first follows from Prigogine’s math, which shows that out-of-equilibrium systems self-organize into steady states in which order emerges spontaneously, minimizing the destruction of information. The second comes from Schrödinger’s argument that solids are essential to explain the information-rich nature of life. The third mechanism by which information grows is matter’s ability to process information, that is, the ability of matter to compute. The author explains wonderfully all three aspects that make information "sticky".

The main idea of this chapter is to view our planet as an out-of-equilibrium system. The other idea communicated by the author is that of the "entropy barrier". I love this concept as it is philosophically aligned with what I believe: "Life is a Martingale".

Time is irreversible in a statistical system because the chaotic nature of systems of many particles implies that an infinite amount of information would be needed to reverse the evolution of the system. This also means that statistical systems cannot go backward because there are an infinite number of paths that are compatible with any present. As statistical systems move forward, they quickly forget how to go back. This infiniteness is what Prigogine calls the entropy barrier, and it is what provides a perspective of time that is not spatialized like the theories of time advanced by Newton and Einstein. For Prigogine, the past is not just unreachable; it simply does not exist. There is no past, although there was a past. In our universe, there is no past, and no future, but only a present that is being calculated at every instant. This instantaneous nature of reality is deep because it helps us connect statistical physics with computation. The instantaneous universe of Prigogine implies that the past is unreachable because it is incomputable at the micro level. Prigogine’s entropy barrier forbids the present to evolve into the past, except in idealized systems

Crystallized Imagination

The author starts off by giving his perspective on life:

Life is all about moving around and processing information, helping information grow while interacting in a social context.

If you reflect on the above statement a bit, I guess you will at least concur with some part of it, if not the entire statement. Our society’s ability to accumulate information requires flows of energy, the physical storage of information in solid objects, and of course our collective ability to compute. The flow of energy that keeps our planet’s information growing is clearly that coming from the sun. Plants capture that energy and transform it into sugar, and over long periods of time they degrade into the mineral fuel we know as oil. But as a species, we have also developed an amazing capacity to make information last. We have learned to accumulate information in objects, starting from the time we built our first stone axes to the invention of the latest computer.

The easiest way to get a grasp of "accumulating information in an object" is to compare an "apple", the product of a tree, with an "Apple" product from Silicon Valley. The former is a product available in nature that we internalize in our minds, while the latter is an instantiation of the knowledge in our heads. Both products are packets of information, but only the latter is a crystal of imagination. The author cites two examples of MIT lab scientists who are working on robotic arms and optogenetics. They are trying to create objects that crystallize imagination, and by doing so, they are endowing our species with new capacities. The author gives several contexts in which thinking about products this way changes preexisting metrics and notions that we carry in our heads. For example, Chile is a major exporter of copper, and one might argue that other countries are exploiting Chile. However, much of the value of copper comes from the finished products that use it. So, who is exploiting whom? Isn’t Chile free-riding on the crystallized imagination of other people?

Thinking about products as crystals of imagination helps us understand the importance of the source of the information that is embodied in a product. Complex products are not just arrangements of atoms that perform functions; rather, they are ordered arrangements of atoms that originated as imagination.


The chapter is titled so to emphasize the amplifying nature of objects. Each object can be thought of as a crystallization of knowledge and knowhow, and these objects become important to all of us because they enhance our capacity to do other things with them. Take a laptop, for instance. It is a product of someone else’s imagination, and we get to use it to produce other objects. There is no need to know what’s under the hood of every object that we use. In the words of the author,

Products are magical because they augment our capacities

Objects are much more than merely a form of communication.

Our ability to crystallize imagination into products, although expressive, is different from our ability to verbally articulate ideas. An important difference is that products can augment our capacities in ways that narrative descriptions cannot. Talking about toothpaste does not help you clean your teeth, just as talking about the chemistry of gasoline will not fill up your car with gas. It is the toothpaste’s embodiment of the practical uses of knowledge, knowhow, and imagination, not a narrative description of them, that endows other people with those practical uses. Without this physical embodiment the practical uses of knowledge and knowhow cannot be transmitted. Crystallizing imagination is therefore essential for sharing the practical uses of the knowledge that we accumulate in our mind. Without our ability to crystallize imagination, the practical uses of knowledge would not exist, because that practicality does not reside solely in the idea but hinges on the tangibility of the implementation. Once again, the physicality of products-whether tangible or digital-augments us.

The main idea of this chapter is to describe products as physical embodiments of information, carrying the practical uses of knowledge, knowhow, and imagination. Our capacity to create products that augment us also helps define the overall complexity of our society.

This Time, It’s Personal

If we look at various products, the knowledge and knowhow for creating them are geographically biased, though the bias is lessening a bit, at least on the software front. The reason for this geographical bias is that the crystallization of any product requires a great amount of knowledge and knowhow. The learning in almost all cases is experiential and social. Bookish knowledge alone is not enough. You need an environment where you can interact, share ideas, experiment and learn from trial and error. Each geographical region has its own idiosyncrasies and hence gives rise to different codifications of knowledge and knowhow. This means that there is certainly going to be a geographical bias in the products we see, and it naturally limits the growth of information. The author introduces a term, person-byte, meaning the maximum knowledge and knowhow carrying capacity of a human. Is there a limit to human knowledge? Well, let’s talk about the knowledge that one can accumulate over one’s working life. If I take my own example, there is a limit to how much math I can do, what kind of stats I can work on, what kind of models I can build and the amount of code I can write. All these ultimately limit the growth of information. In that sense, the person-byte is a nifty idea that says that for information to grow, there needs to be a network of people whose collective knowledge and knowhow exceeds a single person-byte.

The person-byte limit implies that the accumulation of large amounts of knowledge and knowhow are limited by both the individual constraints of social and experiential learning and the collective constraints brought by our need to chop up large volumes of knowledge and knowhow and distribute them in networks of individuals.

Links Are Not Free

If one harks back to Henry Ford’s Model T factory, it was considered the poster child of the industrial economy. It stood for the efficiency gained through scale. The output of the factory, the car, was a complex product, and the rationale was that it was better to chunk the complex job of building it into 7,882 tasks. Whether there was a need for 7,882 individual tasks is another matter of debate. One takeaway could be that complex products need giant factories. Based on that takeaway, we should have innumerable giant factories, given the complexity of the products we see in today’s world. This is where the author introduces a second level of quantization of knowledge and knowhow: the firm-byte. This is a conceptual term that gives an upper limit on the amount of knowledge and knowhow a firm can possess. So, if a product requires more than one firm-byte, there is a need for a network of firms. The factors that limit the size of a firm have been studied extensively under "transaction cost theory". The author gives an overview of the theory, which says:

There are fundamental forces that limit the size of the networks we know as firms, and hence that there is a limit to the knowledge and knowhow these networks can accumulate. Moreover, it also tells us that there is a fundamental relationship between the cost of the links and the size of these networks: the cheaper the link, the larger the network.

It all comes down to links. If you take a typical Barbie doll, the various steps in the start-to-finish process happen in twenty different countries. What has made this splintering of the manufacturing process possible? It is not that the product is complicated. It is that creating links between a set of firms has become cheap. This could be attributed to falling transportation costs, the revolution in communication technologies, the standardization of parts etc. In all the cases where market links have become cheaper, we have seen vast networks of firms participating together. There are innumerable examples that fall into this category (iPad, iPhone, laptops, cell phones, …).

Does it mean that making the cost of market links cheaper will automatically give rise to increase in information via crystallization of many other products? Not necessarily. We observe links that are inherently expensive depending on the frequency and specificity of the transaction.

In Links We Trust

This chapter explores the role of "trust" in the formation of networks. Social networks and social institutions help determine the size, adaptability, and composition of the networks humans need to accumulate knowledge and knowhow. When it comes to size, the ability of societies to grow large networks is connected to the level of trust of the underlying society. When it comes to the composition of networks, social institutions and preexisting social networks affect the composition of the professional networks we form in two important ways. On the one hand, a society’s level of trust determines whether networks are more likely to piggyback on family relations. On the other hand, people find work through personal contacts, and firms tend to hire individuals from the social networks of their employees.

The takeaway from this chapter is that social networks and institutions are also known to affect the adaptability of firms and networks of firms.

The Evolution of Economic Complexity

If one’s intention is to study the geographical distribution of knowledge and knowhow, one inevitably runs into an issue: knowledge and knowhow are intangibles. How does one tease them out for various geographies? The author’s first attempt is to look at the location of various industries, from those that produce complex objects to those that produce simple ones. In this context, he uses the concept of "nestedness" from ecology and does some number crunching to show that

There is a clear trend showing that the most complex products tend to be produced in a few diverse countries, while simpler products tend to be produced in most countries, including those that produce only a handful of products. This is consistent with the idea that industries present everywhere are those that require less knowledge and knowhow.

The author ties his person-byte theory back to the observations from the data. In a sense, the inference is commonsensical. The knowledge and knowhow for specialized products is sticky and biased towards specific geographical areas, whereas for ubiquitous products the knowledge and knowhow is spread across a wide range of geographies.
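A toy example of my own shows what a nested pattern looks like. The four countries and four products are invented for illustration and have nothing to do with the author's dataset. For each country I count how many products it exports (its diversity), and for each product how many countries export it (its ubiquity).

```python
from statistics import mean

# Hypothetical export baskets, chosen so that they are perfectly nested.
exports = {
    "A": {"shirts", "wine", "cars", "aircraft"},
    "B": {"shirts", "wine", "cars"},
    "C": {"shirts", "wine"},
    "D": {"shirts"},
}

# Diversity: number of products a country exports.
diversity = {country: len(basket) for country, basket in exports.items()}

# Ubiquity: number of countries that export a product.
products = set().union(*exports.values())
ubiquity = {
    product: sum(product in basket for basket in exports.values())
    for product in products
}

# Average diversity of the countries that export each product.
avg_exporter_diversity = {
    product: mean(diversity[c] for c, basket in exports.items() if product in basket)
    for product in products
}

# Nested: every basket is contained in the basket of the next more diverse country.
ranked = sorted(exports, key=lambda country: diversity[country])
is_nested = all(exports[a] <= exports[b] for a, b in zip(ranked, ranked[1:]))

for product in sorted(products, key=lambda p: ubiquity[p]):
    print(f"{product:>8}: ubiquity={ubiquity[product]}, "
          f"avg exporter diversity={avg_exporter_diversity[product]}")
print("nested:", is_nested)
```

In this made-up data the rarest product (aircraft) is exported only by the most diverse country, while the most common one (shirts) is exported by everyone, including the country that exports nothing else. Real trade data is much noisier, but this is the shape of the trend the author reports.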

The Sixth Substance

If one looks at the models describing economic growth, the five common factors used in the literature are:

  1. Land
  2. Labor
  3. Physical Capital
  4. Human Capital
  5. Social Capital

The author connects these five factors to the principles explained in the previous chapters. For example, physical capital is the physical embodiment of information that carries the practical uses of the knowledge and knowhow used in its creation. Physical capital is made of embodied information, and it is equivalent to the crystals of imagination described in the previous chapters. The author introduces a metric, "economic complexity", that takes into consideration the diversity of the exporting country, the diversity of the countries to which the exports go, and the ubiquity of the product exported. The author tests his model for predictive accuracy and shows that it performs well.


The last section of the book highlights the main points of the book. In a sense, it makes my summary redundant, as the author provides a far more concise one. So, if you are short on time, you might just want to go over the last 10 pages of the book.



We see, hear and talk about “Information” in many contexts. In the last two decades or so, one could even make a career in the field of “Information” technology. But what is “Information”? If someone talks about a certain subject for 10 minutes in English and 10 minutes in French, is the “Information” the same in both instances? Can we quantify the two instances in some way? This book explains Claude Shannon’s remarkable achievement of measuring “Information” in terms of probabilities. In 1948, Shannon laid out a mathematical framework, and it became an open challenge for engineers to develop devices and technologies that achieve what Shannon had proved to be a “mathematical certainty”. This book distils the main ideas that go into quantifying information with very little math, which makes it accessible to a wider audience. A must-read if you are curious to know a bit about “Information”, which has become a part of everyday vocabulary.