Entropy, Cross-Entropy, Mutual Information and Kullback-Leibler Divergence
An introduction to information theory
Introduction
The universe is overflowing with information. Everything we say, hear, think, and see carries information, whatever its format. But often we want to process that information, whether to compare one signal with another or to quantify how much information a signal carries, regardless of its form. For that we need a measure, a metric.
In 1948, Claude Shannon founded what is now known as information theory. The goal was to transmit a signal from a sender to a receiver efficiently. The concept is simple yet ingenious.
Information
We will build this concept together, step by step, until we reach the famous formula.
Let’s suppose your friend has picked a city at random from a pool in which every continent contains the same number of countries and every country contains the same number of cities, so every city is equally likely. Your task is to guess the city. Each time, your friend helps you with a statement, and we will try to quantify the information contained in each statement.
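Before formalizing anything, here is a minimal sketch of the idea we are about to build: a statement carries information in proportion to how much it shrinks the set of equally likely candidates. The counts below and the helper `information_bits` are made-up assumptions for illustration, not part of the game itself.

```python
import math

# A toy universe of candidate cities (numbers are arbitrary; the argument
# only needs that every city is equally likely).
continents = 4
countries_per_continent = 8
cities_per_country = 16
total_cities = continents * countries_per_continent * cities_per_country  # 512 candidates

def information_bits(candidates_before, candidates_after):
    """Information carried by a statement, measured by how much it shrinks
    the set of equally likely candidates: -log2(after / before)."""
    return -math.log2(candidates_after / candidates_before)

# "It's a city": every candidate is still possible, so zero information.
print(information_bits(total_cities, total_cities))                 # 0.0 bits

# "It's on a particular continent": only one continent's cities remain.
print(information_bits(total_cities, total_cities // continents))   # 2.0 bits
```

This already hints at the result we are heading toward: a statement that rules nothing out is worth zero, and a statement that halves the candidates is worth one bit.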
First, he tells you, “It’s a city.” This gives us no information at all. We already know that we…