Information‐Theoretic Approaches to Neural Network Compression, Clustering, and Concept Learning