CHI 2019, Glasgow, UK, Tuesday May 7th, 9h (limited seats available, register early, preferably before March 11, 2019)
See illmo-course-chi2019 for a more detailed course description.
Suppose we analyzed current statistical practice from a user experience (UX) perspective. Most CHI scientists interested in empirical research understand WHY statistics is relevant, although most of the motivation is extrinsic (statistics is needed to get quantitative research published). However, there seems to be much confusion in the scientific community about WHAT statistics actually does (or doesn’t do), and there is virtually no understanding of HOW statistics works. This can lead to statistics anxiety, or at least to confusion and uncertainty about how to interpret statistical results.
This situation is far removed from the UX perspective on technology that drives the CHI community, i.e., that technology should support people in enjoying their activities and help them to be confident about the work that they produce. The CHI course “Insights into Experimental Data through Interactive and Intuitive Statistics” aims to dispel some of the myths surrounding statistics. It shows how many seemingly different statistical techniques can be derived from a few basic principles. It also offers an interactive environment (the ILLMO program developed by the course lecturer) in which to practice the newly proposed approach. Specifically, the course addresses how frequent statistical tasks such as hypothesis testing, linear regression and clustering can be performed in a more intuitive and interactive way.
So, WHAT does statistics actually do? Contrary to what some people seem to think, statistics doesn’t tell you anything about your data, as the data is assumed to be fixed (e.g., if I toss a coin 10 times and obtain 4 heads (H) and 6 tails (T), this observation is accepted as it is). Statistics instead establishes the likelihood of the alternative models that you propose for your data (how likely is it that the coin is fair, i.e., has equal probability of producing H and T?). This allows statistics to separate likely from unlikely models.
HOW does statistics determine the likelihood of different alternative models? Statistics uses a criterion to express the correspondence (technical term: divergence) between the actual data (e.g., 4 H and 6 T) and any model prediction (e.g., equal probability of p=0.5 for H and T in case of a fair coin). The difference in the value of this criterion between two alternative models determines the likelihood ratio of the two models. This likelihood ratio can for instance be used to establish confidence intervals for model parameters (e.g., which values for the probability p of an H, and hence the probability 1-p of a T, are within the 95% confidence interval).
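The coin example can be worked through in a few lines of Python. This is only a minimal sketch of the general idea, not the ILLMO implementation: the function names are hypothetical, the binomial log-likelihood stands in for the criterion, and the interval uses the common assumption that a drop of about 1.92 in log-likelihood (half the 95% quantile of a chi-squared distribution with one degree of freedom) marks the edge of the 95% confidence interval.

```python
import math

def log_likelihood(p, heads, tails):
    """Binomial log-likelihood of head probability p (constant term omitted)."""
    return heads * math.log(p) + tails * math.log(1.0 - p)

def likelihood_ratio(p_a, p_b, heads, tails):
    """Likelihood of model p_a relative to model p_b for the same data."""
    return math.exp(log_likelihood(p_a, heads, tails)
                    - log_likelihood(p_b, heads, tails))

def confidence_interval(heads, tails, drop=1.92, steps=9999):
    """Scan p on a grid and keep values whose log-likelihood lies within
    `drop` of the maximum. Assumes heads > 0 and tails > 0."""
    p_hat = heads / (heads + tails)            # best-fitting model
    best = log_likelihood(p_hat, heads, tails)
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    inside = [p for p in grid
              if best - log_likelihood(p, heads, tails) <= drop]
    return min(inside), max(inside)

if __name__ == "__main__":
    # 4 heads and 6 tails: compare the best-fitting coin (p=0.4) to a fair coin
    print(likelihood_ratio(0.4, 0.5, heads=4, tails=6))
    print(confidence_interval(heads=4, tails=6))
```

For 4 heads and 6 tails, the best-fitting model (p=0.4) is only about 1.22 times as likely as the fair coin, and p=0.5 falls well inside the resulting interval, so these data give no reason to reject the fair-coin model.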
Unique characteristics of ILLMO
ILLMO (Interactive Log Likelihood MOdeling) is a statistical modeling tool (for Windows and Mac OSX) that was developed by Jean-Bernard Martens from the department of Industrial Design at the Eindhoven University of Technology. For a quick impression of the ILLMO program, see the Instruction Videos.
The tool offers several advantages over existing statistical software approaches:
- ILLMO provides a graphical interface that makes it easy to modify the statistical model used to describe and analyze a dataset. This invites users to try out alternative models next to the default model, which uses Gaussian distributions (as is customary in traditional statistics). ILLMO supports multi-model comparison to assign likelihoods to each of the models being considered.
- ILLMO offers a range of visualizations that illustrate the statistical model itself, the correspondence between the observed histograms and the modeled distributions, as well as deduced information (such as the confidence intervals) that is needed to make statistical inferences.
- ILLMO allows users to compare results obtained using the general method of log-likelihood modeling to those produced by existing analysis methods such as ANOVA, regression analysis and t-tests, which rely on the specific assumption of Gaussian distributions (often with constant variance across conditions).
- ILLMO includes Empirical Likelihood (EL) as a new method for doing non-parametric statistics.
- ILLMO can be used to teach statistics in an interactive way; in particular, the many visualizations make it easy to explain statistical concepts to novice users.
- ILLMO uses a single approach for a wide range of statistical problems. The log-likelihood criterion (LLC), or in case of discrete data the more general Cressie-Read Convergence criterion (CRC), is used as the goodness-of-fit measure between observed histograms (the original data) and modeled distributions. ILLMO minimizes this criterion in order to estimate model parameters such as the distribution averages and standard deviations. The log-likelihood profile (LLP), which is the variation of the LLC (or CRC) around its minimum value as a function of any model parameter, is used to establish how accurately such a parameter is determined by the data. Intersecting this LLP at a predefined value provides the confidence interval for this parameter, which in turn can be used to make statistical inferences (i.e., to conclude whether or not this parameter differs from a hypothesized value).
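The log-likelihood profile described in the last item can be illustrated for the default Gaussian model. The sketch below is an assumption-laden illustration rather than ILLMO's own code: the function names and the sample data are hypothetical, the negative log-likelihood (constants dropped) plays the role of the LLC, and the predefined intersection value is taken to be 1.92, which corresponds to a 95% confidence interval under the usual chi-squared approximation.

```python
import math

def profile_llc(data, mean):
    """Negative Gaussian log-likelihood for a fixed mean, already minimized
    over the standard deviation (constants omitted). Assumes the data are
    not all identical."""
    n = len(data)
    variance = sum((x - mean) ** 2 for x in data) / n  # best variance for this mean
    return 0.5 * n * math.log(variance)

def profile_interval(data, threshold=1.92, steps=20001):
    """Intersect the profile at `threshold` above its minimum by scanning a
    grid of candidate means around the sample average."""
    n = len(data)
    mean_hat = sum(data) / n                            # minimizes the criterion
    sd_hat = math.sqrt(sum((x - mean_hat) ** 2 for x in data) / n)
    minimum = profile_llc(data, mean_hat)
    span = 5.0 * sd_hat                                 # search range around mean_hat
    grid = [mean_hat - span + 2.0 * span * i / (steps - 1) for i in range(steps)]
    inside = [m for m in grid if profile_llc(data, m) - minimum <= threshold]
    return mean_hat, min(inside), max(inside)

if __name__ == "__main__":
    sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]   # hypothetical measurements
    mean_hat, low, high = profile_interval(sample)
    print(f"mean = {mean_hat:.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

A hypothesized value for the mean that falls outside the printed interval would be judged unlikely given the data, which is the inference step the profile is meant to support.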
Disclaimer: the “example projects” are posts that illustrate how students at our Industrial Design department have used the ILLMO program to address statistical issues. These posts have not been edited by the author, so not all reported analyses or results may be appropriate or correct, and some of the links may no longer be maintained. Moreover, some of the examples made use of older versions of the ILLMO program (such as the one shown in the Old Videos) that are no longer supported.