Publications / SAND Report

Entropy and its Relationship with Statistics

Lehoucq, Richard B.; Mayer, Carolyn D.; Tucker, James D.

The purpose of our report is to discuss the notion of entropy and its relationship with statistics. Our goal is to provide a way to think about entropy, its central role within information theory, and its relationship with statistics. We review various relationships between information theory and statistics; nearly all are well known but, unfortunately, often go unrecognized. Entropy quantifies the "average amount of surprise" in a random variable and lies at the heart of information theory, which studies the transmission, processing, extraction, and utilization of information. For us, data is information. What is the distinction between information theory and statistics? Information theorists work with probability distributions, whereas statisticians work with samples. In so many words, information theory using samples is the practice of statistics.

Acknowledgements. We thank Danny Dunlavy, Carlos Llosa, Oscar Lopez, Arvind Prasadan, Gary Saavedra, and Jeremy Wendt for helpful discussions along the way. Our report was supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
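As an illustration of the abstract's two central ideas, the following minimal Python sketch computes Shannon entropy as the "average amount of surprise" of a known distribution, and then estimates it from samples alone. It is not taken from the report: the function names `entropy` and `plugin_entropy`, the coin examples, the choice of bits (base-2 logarithm) as the unit, and the simple plug-in estimator are all assumptions made here for illustration.

```python
import math
import random
from collections import Counter


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution.

    Each outcome with probability p carries a surprise of -log2(p);
    entropy is the probability-weighted average of that surprise.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def plugin_entropy(samples):
    """Plug-in estimate (an illustrative choice of estimator): the
    entropy of the empirical distribution formed from observed samples.
    """
    n = len(samples)
    counts = Counter(samples)
    return entropy(c / n for c in counts.values())


# Information-theory view: the distribution is known.
fair_coin = [0.5, 0.5]      # maximally uncertain two-outcome variable
biased_coin = [0.9, 0.1]    # mostly predictable, so less surprise on average

# Statistics view: only samples are available (here, simulated flips
# of the biased coin with a fixed seed for reproducibility).
rng = random.Random(0)
samples = [rng.random() < 0.9 for _ in range(10_000)]

if __name__ == "__main__":
    print("fair coin entropy   :", entropy(fair_coin))
    print("biased coin entropy :", entropy(biased_coin))
    print("plug-in estimate    :", plugin_entropy(samples))
```

The fair coin attains one bit of entropy, the biased coin less than half a bit, and the sample-based estimate should land close to the biased coin's true value. That last step is the sense in which working with samples rather than distributions turns an information-theoretic quantity into a statistical one.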