ELI5: Wiener's definition of information: higher probability --> higher entropy & lower information content.

Okay, so we're going to disregard information and explain entropy. Just think of information and entropy as opposites: the higher the information, the lower the entropy, and vice versa.

Entropy, in the mathematical sense, is the uncertainty over a random variable (an event we can't 100% predict ahead of time). Say I roll a fair six-sided die and hide the result from you. You know very little about the result of my roll; another way to phrase it is to say that you are uncertain about the result. This uncertainty can be measured as approximately 2.585 bits (in truth, there are a few ways to measure entropy, differing in the base of the logarithm, but they only differ by a constant factor, so everyone tends to use the same standard: base 2, i.e. bits). Now, suppose I tell you that the die is loaded and will roll a six 90% of the time, with each other outcome having a 2% chance. You can now be pretty certain you're gonna get a 6; you're so certain that, with this knowledge, the entropy drops to about 0.701 bits. The more certain you are about the outcome of an event, the lower the entropy of that event is, with absolute certainty (e.g. whether the result of a six-sided die roll will be less than 10) having 0 entropy.
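If you want to check those numbers yourself, here's a minimal sketch (not from the original post) that computes Shannon entropy in bits, H = -sum(p * log2(p)), using only Python's standard library:

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over outcomes with p > 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Fair six-sided die: every face has probability 1/6.
print(entropy_bits([1/6] * 6))          # ~2.585 bits

# Loaded die: six comes up 90% of the time, each other face 2%.
print(entropy_bits([0.02] * 5 + [0.9])) # ~0.701 bits

# An outcome you're absolutely certain about has zero entropy.
print(entropy_bits([1.0]))              # 0.0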

There is also a neat property of entropy: the entropy of a random variable is a lower bound on the average number of yes-no questions you have to ask to find the outcome of the variable. So if I rolled the fair die and had you guess the outcome, where you can ask me any yes-or-no question, then on average you'll have to ask me at least 2.585 questions, no matter how clever your questions are. Whereas if I was rolling the loaded die, you could get the average number of questions down close to 0.701 by asking the right ones (NOTE: this limit is only approachable mathematically, by asking about many rolls at once, since for a single roll you'd always have to ask at least one question).
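One standard way to build the best single-roll question strategy (my framing, not the original post's) is a Huffman code: each yes-no question splits the remaining possibilities, and the average number of questions equals the average codeword length, which sits at or above the entropy. A sketch, assuming that framing:

```python
import heapq
import math

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def avg_questions_huffman(probs):
    """Average number of yes/no questions for the best single-roll strategy,
    i.e. the average codeword length of a Huffman code. Uses the fact that
    this equals the sum of the weights of all merges made while building the tree."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b          # one extra question for every outcome under this merge
        heapq.heappush(heap, a + b)
    return total

fair = [1/6] * 6
loaded = [0.02] * 5 + [0.9]
print(entropy_bits(fair), avg_questions_huffman(fair))      # ~2.585 vs ~2.67
print(entropy_bits(loaded), avg_questions_huffman(loaded))  # ~0.701 vs ~1.24
```

Notice the loaded die needs about 1.24 questions per roll if you guess one roll at a time, above the 0.701 bound; only by asking about many rolls in one batch can you push the average per roll toward the entropy.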
