Description
Feature hashing is a computationally efficient pre-processing technique for sparse, high-dimensional features. Starting from an overview of the method, this talk covers: the impact of hash functions, hash size and collisions on statistical performance; three libraries for model training with feature hashing; hash reversibility and its implications for model interpretability.
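As a rough illustration of the method and of how hash size drives collisions, here is a minimal sketch in plain Python. It is not taken from the talk or from any of the libraries it covers: the names `hash_features`, `count_collisions` and `n_buckets` are hypothetical, and MD5 is used only because it is deterministic and ships with the standard library. The signed update is one common variant of the hashing trick, assumed here for illustration.

```python
import hashlib


def _hash_token(token):
    """Return the MD5 digest of a token (illustrative choice of hash function)."""
    return hashlib.md5(token.encode("utf-8")).digest()


def hash_features(tokens, n_buckets=8):
    """Map string tokens to a fixed-length vector with the hashing trick.

    Each token is hashed to a bucket index in [0, n_buckets) and to a sign
    in {+1, -1}; the signed count is accumulated in that bucket.
    """
    vec = [0.0] * n_buckets
    for token in tokens:
        digest = _hash_token(token)
        # First 8 bytes pick the bucket.
        index = int.from_bytes(digest[:8], "big") % n_buckets
        # A separate byte picks the sign, so colliding tokens can partly cancel.
        sign = 1.0 if digest[8] & 1 == 0 else -1.0
        vec[index] += sign
    return vec


def count_collisions(tokens, n_buckets):
    """Count distinct tokens that share a bucket with another distinct token's slot.

    Computed as (number of distinct tokens) - (number of occupied buckets).
    """
    distinct = set(tokens)
    buckets = {
        int.from_bytes(_hash_token(t)[:8], "big") % n_buckets for t in distinct
    }
    return len(distinct) - len(buckets)


if __name__ == "__main__":
    vocab = [f"word_{i}" for i in range(1000)]
    # A smaller hash size forces more tokens into the same bucket.
    for size in (64, 1024, 16384):
        print(size, count_collisions(vocab, size))
```

The output vector has the same length regardless of vocabulary size, which is where the memory saving comes from. The mapping is one-way: a bucket index alone does not say which token produced it, which is the reversibility issue the talk relates to interpretability.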