Data science vs. statistics

Publication date

2023

Publisher
Springer
Abstract
Statistics has a strong mathematical basis, with developments presented in theorem-proof form and linked to areas of pure Mathematics such as mathematical analysis, algebra, mathematical optimization, and theoretical probability (as part of measure theory), free of subjective interpretations. However, Statistics also has an experimental side, linked to different approaches to understanding probability when estimating parameters or testing hypotheses from samples or data. Bradley Efron (1978) argued that “Statistics is a difficult subject for mathematicians” but “is also a difficult subject for statisticians”, pointing to a long-running philosophical battle between classical (or frequentist) and Bayesian statisticians over how probability should be understood when working with samples (both perspectives are thoroughly explained in DeGroot and Schervish (2012); see Chapter 7, Section 1). The term “Data Science” was introduced in 1997 by Jeff Wu, a disciple of the renowned statistician Peter J. Bickel. Wu, known for formally proving the global convergence theorem of the EM algorithm, was dissatisfied with the term “statistics” because it was often associated with “descriptive statistics” and did not capture the essence of his work (Chipman and Joseph, 2016). To establish a distinct discipline that would encompass his work, Wu coined the term “Data Science”, along with “Data Scientist” as an alternative to “Statistician” for those working in the field.
Description
Invited contribution to the International Encyclopedia of Statistical Science (Springer).