Decision Trees

Entropy and Information Gain

Definition (Entropy)

The entropy of a dataset \(S\) with classes \(C\) is:

\[H(S) = -\sum_{c \in C} p_c \log_2(p_c)\]

where \(p_c\) is the proportion of examples belonging to class \(c\). Entropy is maximised when classes are equally distributed and zero when all examples belong to a single class.
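As a quick illustration, the formula above can be computed directly from a list of class labels. This is a minimal sketch; the function name `entropy` and the use of a label list are illustrative choices, not taken from the text:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) in bits, computed from class proportions p_c."""
    n = len(labels)
    # Counter yields only non-zero counts, so the 0 * log2(0) case never arises.
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())
```

For example, `entropy(["yes", "yes", "no", "no"])` gives 1.0 bit (two equally likely classes), while `entropy(["yes", "yes", "yes", "yes"])` gives 0.0, matching the maximal and zero cases described above.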

Definition (Information Gain)

The information gain of splitting dataset \(S\) on attribute \(A\) is:

\[\text{IG}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|} H(S_v)\]
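The weighted sum in the second term is the expected entropy after the split, so information gain is the reduction in entropy achieved by splitting on \(A\). A minimal sketch of the computation, assuming examples are represented as dictionaries mapping attribute names to values (the representation and function names are illustrative, not from the text):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) in bits over class proportions."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """IG(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    n = len(labels)
    remainder = 0.0
    # Partition S into subsets S_v, one per value v of attribute A.
    for v in {ex[attribute] for ex in examples}:
        subset = [lab for ex, lab in zip(examples, labels)
                  if ex[attribute] == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder
```

For instance, if an attribute splits a perfectly mixed dataset into two pure subsets, the remainder term is zero and the gain equals the full entropy of \(S\).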

Anki Explained

Hermann Ebbinghaus and the Science of Memory
