info_theory Module

Algorithms related to information theory.

@author: drusk

pml.tools.info_theory.entropy(dataset)

Calculates the entropy of a data set.

Entropy is a measure of the impurity of a data set. For example, if all of the samples have the same classification, the entropy is 0.

Args:
dataset: model.DataSet
The data set whose entropy is to be calculated.
Returns:
The entropy of the data. Higher values indicate less uniform or more disordered data.
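
As an illustration of the computation only (not the pml implementation), the sketch below calculates Shannon entropy from a plain list of class labels, assuming base-2 logarithms:

    import math
    from collections import Counter

    def entropy_of_labels(labels):
        # Count how often each class label occurs.
        counts = Counter(labels)
        total = len(labels)
        # Shannon entropy: -sum(p * log2(p)) over the class proportions.
        return -sum((n / total) * math.log2(n / total) for n in counts.values())

    entropy_of_labels(["yes", "yes", "yes"])        # 0.0, perfectly pure
    entropy_of_labels(["yes", "yes", "no", "no"])   # 1.0, maximally impure for 2 classes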
pml.tools.info_theory.info_gain(feature, dataset)

Calculates the information gain of a feature in a data set.

The information gain of a feature is the expected reduction in entropy caused by knowing the value of that feature.

Args:
feature: string
The name of a feature in the data set.
dataset: model.DataSet
The data set that the feature is a part of.
Returns:
info_gain: float
The information gain of the feature.
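
A minimal sketch of the same idea, again outside of pml: information gain is the entropy of the whole data set minus the weighted average entropy of the subsets obtained by splitting on the feature. The feature values and labels below are illustrative, and entropy_of_labels is the helper defined above.

    def info_gain_of_feature(feature_values, labels):
        total = len(labels)
        before = entropy_of_labels(labels)
        # Weighted entropy of each subset produced by splitting on the feature.
        after = 0.0
        for value in set(feature_values):
            subset = [lab for val, lab in zip(feature_values, labels) if val == value]
            after += (len(subset) / total) * entropy_of_labels(subset)
        return before - after

    # A feature that perfectly separates the classes has a gain equal to the
    # original entropy (here 1.0).
    info_gain_of_feature(["a", "a", "b", "b"], ["yes", "yes", "no", "no"])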
