# Step 4: Drawing Conclusions

After optimizing your model, you can start drawing statistical conclusions in ILLMO.

You may have questions like: Is there a significant difference between the averages in both conditions? Is the “overall power” in the Box-Cox transformation significantly different from 1 (no transformation) or 0 (logarithmic transformation)?

The standard way in ILLMO to check how accurately a model parameter is determined by the data is to construct a confidence interval for that parameter. A common choice is 95% confidence, meaning that the true (so-called population) value of the parameter is estimated to lie within the confidence interval with a probability of 95%.

To test the null hypothesis that the averages in both conditions are equal, we request the log-likelihood profile (LLP) for the difference between the two averages, denoted by

LLP for difference: d(n1,n2)=a(n2)-a(n1), where n1=1 and n2=2,

in the prob dialog window shown below.

Dialog window prob set up to request calculation of the LLP (and the confidence interval) for the difference in distribution averages in both conditions

The result is displayed in the graph on the left. The red line is the LLP and the dotted vertical lines are the boundaries of the confidence interval; the exact values are reported in the text box and shown in the graph on the right. As a difference of zero lies outside those boundaries, we can reject the null hypothesis that the averages in both conditions are the same. (Note: the reported analysis was performed assuming Student-T distributions; when adopting Gaussian distributions, the conclusion turns out to be different, as you can try out for yourself.)
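The idea behind an LLP-based confidence interval can be sketched outside ILLMO as well. The snippet below is a minimal illustration with made-up Gaussian data (the sample sizes, means, and standard deviations are hypothetical, and this is not ILLMO's algorithm): it profiles the log-likelihood over the difference in averages and finds the 95% boundaries where the profile drops by χ²(0.95, 1)/2 ≈ 1.92 below its maximum.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
# Hypothetical measurements for two conditions (not the tutorial's data)
x1 = rng.normal(10.0, 2.0, size=20)
x2 = rng.normal(12.0, 2.0, size=20)

def profile_loglik(d):
    # Maximize the Gaussian log-likelihood subject to a2 - a1 = d,
    # profiling out the baseline average a1 and the two std deviations.
    def negll(theta):
        a1, s1, s2 = theta[0], np.exp(theta[1]), np.exp(theta[2])
        return -(stats.norm.logpdf(x1, a1, s1).sum()
                 + stats.norm.logpdf(x2, a1 + d, s2).sum())
    res = optimize.minimize(negll, [x1.mean(), 0.0, 0.0], method="Nelder-Mead")
    return -res.fun

d_hat = x2.mean() - x1.mean()          # maximum-likelihood difference
ll_max = profile_loglik(d_hat)
drop = stats.chi2.ppf(0.95, df=1) / 2  # ~1.92 for a 95% interval

# Interval boundaries: where the profile log-likelihood drops by 1.92
f = lambda d: profile_loglik(d) - (ll_max - drop)
lo = optimize.brentq(f, d_hat - 10, d_hat)
hi = optimize.brentq(f, d_hat, d_hat + 10)
print(lo, d_hat, hi)  # if zero lies outside [lo, hi], reject equal averages
```

The boundaries found this way play the same role as the dotted vertical lines in ILLMO's LLP plot.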

ILLMO interface showing the LLP (on the left) and the confidence interval (on the right) for the difference in averages

The stats dialog window in ILLMO allows you to compare the confidence intervals generated by ILLMO with the outcomes of more traditional tests such as a T-test. If we select a T-test and press ‘traditional statistics/analyses’, the text field shows the outcome of such an analysis. In the example case, where we have selected ‘pair(n1,n2): T-tests (equal variance, Gaussian)’, we observe that T = -1.887, which is too small in magnitude to reject the null hypothesis that both conditions share the same average value.
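The same traditional test can be run with standard tools outside ILLMO. A minimal sketch with made-up data (the values, sample sizes, and seed are hypothetical, so the numbers will not match the tutorial's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1 = rng.normal(10.0, 2.0, size=10)  # hypothetical condition 1
x2 = rng.normal(11.0, 2.0, size=10)  # hypothetical condition 2

# Two-sample T-test under the same assumptions as ILLMO's
# 'pair(n1,n2): T-tests (equal variance, Gaussian)' option
t, p = stats.ttest_ind(x1, x2, equal_var=True)
print(f"T = {t:.3f}, p = {p:.3f}")  # if p > 0.05, the null hypothesis stands
```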

The text window shows the outcome of a traditional T-test (assuming Gaussian dsitributions), while the plot window shows the confidence intreval derived from the LLP (assuming Student T distributions); the tests come to different conclusions

The T-test in ILLMO also reports the effect size (Cohen’s d = 0.60) and power (beta = 0.45). One way of interpreting this latter value is that too few measurements are available to obtain sufficient power (by convention, the minimal required value is often agreed to be beta = 0.8). ILLMO estimates that the amount of data needed to obtain beta = 0.8 is 3.07 times higher than the number of available measurements.
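Quantities of this kind can be approximated with textbook formulas. The sketch below uses the standard normal approximation for a two-sample T-test; the per-condition sample size n is hypothetical, and this is not ILLMO's exact computation, so the resulting factor need not equal the reported 3.07.

```python
import numpy as np
from scipy import stats

d = 0.60       # Cohen's d reported in the tutorial
n = 15         # hypothetical number of measurements per condition
alpha = 0.05
z_a = stats.norm.ppf(1 - alpha / 2)

# Achieved power under the normal approximation:
# probability of detecting an effect of size d with n per group
power = stats.norm.cdf(d * np.sqrt(n / 2) - z_a)

# Sample size per group needed to reach power = 0.8
z_b = stats.norm.ppf(0.8)
n_needed = 2 * ((z_a + z_b) / d) ** 2
print(power, n_needed, n_needed / n)
```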

Selecting Thurstone models in one of the plots allows us to visualize the effect size, which is proportional to the difference between the two peak positions (i.e., the averages), divided by the width of the graphs (i.e., the standard deviation). The observed value of d = 0.60 is quite low, which is reflected in the fact that the blue and green graphs, which are the model distributions for the reference (1) and the selected (2) condition, overlap substantially. This implies that individual measurements do not distinguish very well between the two conditions under test.

Regular distributions used to model two conditions; both the difference in averages and the width of the distributions determine the effect size

There are several measures to express how well two conditions can be distinguished based on a single measured value; one of them is the overlap between the distributions
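For two equal-variance Gaussians whose means differ by d standard deviations, the overlapping area has a simple closed form: the densities cross halfway between the means, so the overlap equals 2·Φ(−d/2). A quick numerical sketch (an illustration of the concept, not ILLMO's computation):

```python
import numpy as np
from scipy import stats

d = 0.60                              # effect size from the example
overlap = 2 * stats.norm.cdf(-d / 2)  # closed-form overlapping area

# Cross-check by numerically integrating min(pdf1, pdf2)
x = np.linspace(-8.0, 8.0, 200001)
dx = x[1] - x[0]
pdf1 = stats.norm.pdf(x, loc=0.0)
pdf2 = stats.norm.pdf(x, loc=d)
overlap_num = np.minimum(pdf1, pdf2).sum() * dx
print(overlap, overlap_num)
```

For d = 0.60 the two model distributions share roughly three quarters of their area, which matches the substantial visual overlap in the plot.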

In statistics, we tend to draw conclusions not from individual measurements, but from averaged measurements. The graph representing the distributions of the averaged measurements is obtained by choosing ‘Thurstone models (trial average)’, which indeed shows substantially less overlap, and hence better discrimination between the two conditions. Note that the T-value used in the T-tests is equal to the effect size for these distributions of trial averages. As the difference between the averages is the same in both cases, the T-value is higher than the d-value because the width of the graphs is smaller in the case of the trial averages.
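The relation between the T-value and Cohen's d can be made explicit: for two groups of equal size n, T = d·√(n/2), i.e. the same mean difference measured against the much narrower spread of the trial averages. A quick check with made-up data (the sizes and values are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 25
x1 = rng.normal(10.0, 2.0, size=n)  # hypothetical single-trial data
x2 = rng.normal(11.0, 2.0, size=n)

# Pooled-SD effect size (Cohen's d) for single trials
sp = np.sqrt(((n - 1) * x1.var(ddof=1) + (n - 1) * x2.var(ddof=1)) / (2 * n - 2))
d = (x2.mean() - x1.mean()) / sp

# For equal group sizes, the T statistic equals d scaled by sqrt(n/2)
t_from_d = d * np.sqrt(n / 2)
t_scipy, _ = stats.ttest_ind(x2, x1, equal_var=True)
print(t_from_d, t_scipy)  # the two values agree
```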

Regular distributions used to model averaged observations

A numerical characterization of this discriminative power for both individual and averaged measurements can be obtained by selecting ‘modeled sensitivity (effect size and ROC)’ in the stats dialog window. The result is displayed in the text box. While many measures of effect size, such as Cohen’s d, have been derived under the assumption of Gaussian distributions, the Receiver Operator Curve (ROC) doesn’t have this limitation. The area above this curve is the probability of superiority, i.e., the probability that a value drawn from the distribution in condition 2 will be higher than a value drawn from the distribution in condition 1 (70.3% for single trials and 99.7% for trial averages in the example case).
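For Gaussian distributions with equal variance, the probability of superiority has the closed form Φ(d/√2), which is easy to cross-check by simulation. The sketch below is an illustration, not ILLMO's computation; note that the tutorial's 70.3% for single trials was obtained with Student-T distributions, so it need not match this Gaussian value.

```python
import numpy as np
from scipy import stats

# Closed-form probability of superiority for equal-variance Gaussians
d = 0.60
p_sup = stats.norm.cdf(d / np.sqrt(2))

# Monte Carlo cross-check: draw independent pairs and count X2 > X1
rng = np.random.default_rng(3)
x1 = rng.normal(0.0, 1.0, size=200000)
x2 = rng.normal(d, 1.0, size=200000)
p_mc = np.mean(x2 > x1)
print(p_sup, p_mc)
```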

Receiver operator curve (ROC) for single trials (in black) and trial averages (in blue); Student-T distributions are used to model the individual trials, while Gaussian distributions are used to model the trial averages

Note that in the stats dialog window, you can also change the ‘plot option’ to show the graph of your choice. ILLMO sometimes changes the graph in response to the statistical analysis that has just been performed. More specifically, ILLMO tries to select the graph that most clearly illustrates this statistical analysis.

After you click ‘OK’, the text produced by the statistical tests is transferred to the text window of the main screen.

By the way, in case you hadn’t yet noticed, the graphs contain a little ‘Save’ icon in the lower-left corner that allows you to save the picture as a PNG file. You can also zoom in on a part of the graph by selecting a rectangle with your mouse.

In case you have gnuplot and ghostscript installed, a second ‘gnuplot’ icon will appear next to the ‘Save’ button that allows you to save the data as gnuplot files and generate a PostScript or PNG file from them. You can (de)activate gnuplot construction and viewing through the ‘Options’ menu.

That’s all, folks!!

If you didn’t yet find what you were looking for, or need further clarification, you may be interested in the book *Insight into Experimental Data*, or have a look at some of the instructional videos for more information.