Entropy and Information Gain
Definition (Entropy)
The entropy of a dataset \(S\) with classes \(C\) is:
\[H(S) = -\sum_{c \in C} p_c \log_2(p_c)\]
where \(p_c\) is the proportion of examples belonging to class \(c\). Entropy is maximised when classes are equally distributed and zero when all examples belong to a single class.
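As a minimal sketch, the entropy formula above can be computed directly from a list of class labels (the function name `entropy` is our own choice, not from the text):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) of a sequence of class labels, in bits."""
    n = len(labels)
    counts = Counter(labels)
    # -sum over classes of p_c * log2(p_c); empty classes never appear in counts,
    # so the 0 * log2(0) = 0 convention is handled implicitly.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

With two equally frequent classes, `entropy(["a", "a", "b", "b"])` gives the maximum of 1.0 bit, while a pure set such as `entropy(["a", "a", "a"])` gives 0.0, matching the remark above.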
Definition (Information Gain)
The information gain of splitting dataset \(S\) on attribute \(A\) is:
\[\text{IG}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|} H(S_v)\]
where \(S_v\) is the subset of \(S\) in which attribute \(A\) takes value \(v\). Information gain measures the expected reduction in entropy from partitioning \(S\) on \(A\); attributes with higher gain yield purer child subsets.
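A short, self-contained sketch of this definition, assuming examples are represented as dicts keyed by attribute name (the function names and data layout are illustrative, not from the text):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) of a sequence of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """IG(S, A): entropy of S minus the weighted entropy of the
    subsets S_v induced by each value v of `attribute`."""
    n = len(labels)
    # Partition the labels by the attribute's value in each example.
    partitions = {}
    for ex, y in zip(examples, labels):
        partitions.setdefault(ex[attribute], []).append(y)
    remainder = sum((len(sub) / n) * entropy(sub) for sub in partitions.values())
    return entropy(labels) - remainder
```

For instance, if an attribute separates the classes perfectly, the weighted entropy of the subsets is 0 and the gain equals \(H(S)\); if the split leaves each subset with the same class mix as \(S\), the gain is 0.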
Hermann Ebbinghaus and the Science of Memory
