Information is the raw material powering our modern digital world. We think and talk about it frequently, whether in reference to information technology, ‘big data’ analytics, or just complaining about our smartphone’s monthly data limit. Few of us, however, really understand what information is or how we could measure it. Or, for that matter, how anyone could charge money for an ‘amount’ of information. This post explains a little of the mathematical reasoning behind information theory and how that reasoning is used to design digital information systems.
The first thing to keep in mind is that digital data (a series of 1 or 0 signals) isn’t exactly the same thing as information. Just like an essay written by randomly choosing words from a dictionary will probably get you an F, a random string of bits (those 1s and 0s) has no meaning. Information is created by interpreting data in a particular context. In our essay analogy, the context would be the grammatical rules of the English language. In digital systems, there are a variety of ‘encodings’ with different rules for interpreting strings of bits in different contexts. ASCII, 2’s Complement, and Temporenc are ways to encode letters, numbers, and dates respectively.
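To make this concrete, here is a small Python sketch (my own illustration, not tied to any particular system) showing how the same raw bits mean different things under different encodings:

```python
byte = 0b01001011  # one byte of raw data: 0100 1011

# Interpreted as an unsigned integer:
print(byte)                           # 75
# Interpreted as ASCII text:
print(bytes([byte]).decode("ascii"))  # K

# A different byte, 1111 1111, interpreted two ways:
print(0b11111111)                                                         # 255
print(int.from_bytes(bytes([0b11111111]), byteorder="big", signed=True))  # -1 (two's complement)
```

The bits never change; only the context we interpret them in does, which is exactly why data alone isn’t information.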
Knowing this, we can mathematically define information to be data which reduces uncertainty about a particular circumstance or set of possibilities. How much information is communicated depends on how much uncertainty is resolved. For example, if I pick one card from a deck and tell you it is a king, I have given you more information than if I only told you its suit is Hearts. In the first case, your uncertainty about the identity of the card was reduced from 52 possibilities to 4, while in the second it was only reduced from 52 possibilities to 13.
The equation for the amount of information carried by digital data, in bits, is I = log₂(N/M), where I is the amount of information, N is the original number of possible states for the uncertain situation, and M is the remaining number of possible states after you receive the data.
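Applying this formula to the card example takes only a few lines of Python (the function name is my own):

```python
import math

def information_bits(before: int, after: int) -> float:
    """Bits of information gained when the number of equally
    likely possibilities drops from `before` to `after`."""
    return math.log2(before / after)

# Learning the card is a king: 52 possibilities down to 4.
print(information_bits(52, 4))   # about 3.7 bits

# Learning only that the suit is Hearts: 52 down to 13.
print(information_bits(52, 13))  # exactly 2.0 bits
```

As expected, naming the rank (about 3.7 bits) tells you more than naming the suit (2 bits).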
How is this useful, you ask? Well, we can use this definition to put a lower limit on the average number of bits needed to correctly send or store an amount of information. In mathematics this lower limit is called the entropy of a random variable: the amount of uncertainty about the state of a situation. This lower limit is why files can only be compressed so far and why internet connection speeds limit how quickly pages load.
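A quick sketch of that lower limit, estimating Shannon entropy from symbol frequencies (a simplified estimate that assumes each symbol is independent):

```python
import math
from collections import Counter

def entropy_bits(message: str) -> float:
    """Estimated Shannon entropy per symbol, in bits,
    based on symbol frequencies in `message`."""
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A fair coin flip carries 1 bit per symbol, so no lossless
# scheme can average fewer than 1 bit per flip.
print(entropy_bits("HTHTHTHT"))  # 1.0
```

A message drawn from a more lopsided distribution has lower entropy, which is exactly the slack a compressor exploits.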
When your Clash Royale game lags on a bad connection, it is because your iPhone either isn’t receiving enough bits to know exactly what your opponent is doing or it isn’t sending enough bits to tell SuperCell’s servers exactly what moves you want to make. This uncertainty usually prevents you from winning the match.
So for the sake of higher Clash Royale scores, and maybe to become a more informed digital citizen, keep in mind that information isn’t just a fuzzy concept but an exact quantity we can measure and even charge for.