Plot Options in ILLMO

ILLMO offers about 60 options for plotting information produced by the statistical modeling. Below you will find a short description of some of these plot options:

1. Model vs data: a scatterplot of predicted model averages (on the horizontal axis) versus observed data values (on the vertical axis). In the case of discrete data, the size of the dots reflects the number of times the response was given. The blue line indicates average responses. Most of the data points are expected to lie between the two red lines, which correspond to the model average +/- two times the estimated standard deviation of the probability distributions used to model the observed histograms.
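As a rough illustration of the red band in plot option 1, the sketch below counts how many observations fall within the model average +/- two standard deviations. The function name and the numbers are hypothetical and not taken from ILLMO; for a Gaussian model one would expect roughly 95% of the points inside the band.

```python
def fraction_within_band(observed, predicted, sd, k=2.0):
    """Share of observations lying within predicted +/- k * sd."""
    inside = sum(1 for y, m in zip(observed, predicted) if abs(y - m) <= k * sd)
    return inside / len(observed)


# Hypothetical data: model averages per observation and the observed values.
predicted = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
observed = [1.2, 0.7, 2.1, 3.5, 2.8, 3.3]

fraction = fraction_within_band(observed, predicted, sd=0.5)
print(f"{fraction:.2f} of the points lie inside the band")
```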

2. Model vs data-model: a scatterplot of predicted model averages (on the horizontal axis) versus residue values, i.e., observed data values minus model averages (on the vertical axis). This is an alternative way of rendering the same information as in plot option 1.

3. Cumulative histogram of data-model: the black curve is the cumulative histogram, across all conditions, of the differences between the observed data values and the corresponding model averages (i.e., the residue values). The blue curve is the distribution used to model this histogram (in this case, a Gaussian distribution). This plot option is especially useful to visually inspect how well the constructed model agrees with the observed data across all stimuli (see plot options 10 and 14 for a similar visual inspection for individual conditions).
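The visual comparison in plot option 3 can be mimicked numerically. The sketch below builds the stepwise cumulative histogram of a set of residue values and reports the largest vertical gap to a zero-mean Gaussian distribution. The residue values, the standard deviation, and the function names are assumptions made for illustration; ILLMO itself may summarize the agreement differently.

```python
from statistics import NormalDist


def empirical_cdf(values):
    """Return the sorted values and the cumulative proportion reached at each."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def max_cdf_gap(residues, sd):
    """Largest vertical distance between the stepwise curve and the Gaussian."""
    model = NormalDist(mu=0.0, sigma=sd)
    xs, cum = empirical_cdf(residues)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = model.cdf(x)
        # Check both the upper and the lower corner of each step.
        gap = max(gap, abs(cum[i] - f), abs(i / n - f))
    return gap


# Hypothetical residue values (observed minus model average).
residues = [-0.9, -0.4, -0.1, 0.0, 0.2, 0.5, 1.1]
gap = max_cdf_gap(residues, sd=0.6)
print(f"largest gap between histogram and model: {gap:.3f}")
```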

4. Model history (LLC/CRC & AIC): the horizontal axis displays the iteration number, which is incremented by one every time a new optimization is performed within ILLMO; the black and red curves show how the LLC (Log-Likelihood Criterion) and the AIC (Akaike Information Criterion) have varied across such iterations. The most likely model is the one with the lowest value of the AIC. Moving the mouse across the curve will display some additional (textual) information about the iterations.
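To make the model selection rule in plot option 4 concrete, here is a minimal sketch that picks the iteration with the lowest AIC. It assumes that the LLC is defined as minus two times the log-likelihood, so that AIC = LLC + 2k for a model with k parameters; the 3.84 threshold mentioned under plot option 6 suggests this scaling, but it is an assumption here. The history values are invented for illustration.

```python
def aic(llc, n_params):
    """AIC, assuming LLC = -2 * ln(likelihood) (an assumption, see lead-in)."""
    return llc + 2 * n_params


# Hypothetical model history: one entry per optimization run.
history = [
    {"iteration": 1, "llc": 412.6, "n_params": 3},
    {"iteration": 2, "llc": 405.1, "n_params": 5},
    {"iteration": 3, "llc": 404.0, "n_params": 8},
]

best = min(history, key=lambda h: aic(h["llc"], h["n_params"]))
print("iteration with the lowest AIC:", best["iteration"])
```

Note that iteration 3 has the lowest LLC but not the lowest AIC, because the small gain in fit does not outweigh the penalty for three extra parameters.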

5. Group model history (LLC/CRC & AIC): similar information as in the previous case; the only difference is that the LLC and AIC are summed across all data (scaled attributes, pairwise comparisons or pairwise dissimilarities) in the same group. This is especially relevant when viewing the effect of ‘group variables’.

6. Log-Likelihood Profile & Function: The log-likelihood function (LLF, in black) shows how much the LLC differs from its optimal (minimal) value if one selected model parameter is varied around its optimal value, while all other model parameters are fixed at their optimal values. The log-likelihood profile (LLP, in red) shows how much the LLC differs from its optimal (minimal) value if one selected model parameter is varied around its optimal value, while all other model parameters are re-optimized for each new value of the selected parameter. Note that LLF=LLP=0 at the optimal parameter value. The vertical lines indicate the boundaries of the confidence interval, which are obtained by intersecting the LLP at a value that is determined by the required confidence. The default value for confidence is 95%, which implies intersection at a value of 3.84, but this percentage can be modified at will. This graph is especially useful when drawing statistical inferences from model parameters.
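The construction in plot option 6 can be sketched for the simplest possible case: a single Gaussian with unknown average and standard deviation, where the average is the selected parameter. The sketch assumes the LLC is on the scale of minus two times the log-likelihood, so that the 95% threshold equals the chi-square quantile with one degree of freedom (about 3.84). The data and function names are hypothetical, and ILLMO's own models are more general than this.

```python
import math
from statistics import NormalDist, fmean


def threshold(confidence=0.95):
    """Chi-square quantile with 1 degree of freedom (a squared normal quantile)."""
    z = NormalDist().inv_cdf((1.0 + confidence) / 2.0)
    return z * z


def make_curves(data):
    """Return (optimal average, LLF, LLP) for a Gaussian fitted to data."""
    n = len(data)
    xbar = fmean(data)
    s2 = sum((x - xbar) ** 2 for x in data) / n  # maximum-likelihood variance

    def llf(mu):
        # Standard deviation stays fixed at its optimal value.
        return n * (mu - xbar) ** 2 / s2

    def llp(mu):
        # Standard deviation is re-optimized for each value of mu.
        return n * math.log(1.0 + (mu - xbar) ** 2 / s2)

    return xbar, llf, llp


def crossing(curve, start, step, level, tol=1e-9):
    """Locate where the curve reaches level, searching from start along step."""
    lo, hi = start, start + step
    while curve(hi) < level:  # expand until the crossing is bracketed
        hi += step
    while abs(hi - lo) > tol:  # then bisect
        mid = 0.5 * (lo + hi)
        if curve(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 5.2]  # hypothetical observations
xbar, llf, llp = make_curves(data)
level = threshold(0.95)
lower = crossing(llp, xbar, -0.1, level)
upper = crossing(llp, xbar, +0.1, level)
print(f"threshold {level:.2f}, interval [{lower:.3f}, {upper:.3f}]")
```

In this example the LLP never exceeds the LLF, because re-optimizing the remaining parameter can only lower the LLC; the interval read from the LLP is therefore at least as wide as one read from the LLF.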

7. R-squared Profile & Function: This is similar to the Log-Likelihood Profile & Function, but instead of the LLF/LLP, the percentage of explained variance in the data, which is equal to the square of the correlation coefficient, is plotted. Note that, unlike for the LLF and LLP, the optimum (maximum) value of the R-squared function and profile does not necessarily occur at the optimal parameter value according to the LLC.
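For reference, the quantity plotted in option 7 can be computed as below. This is a generic sketch of the squared Pearson correlation between observed values and model predictions; the function name and the sample values are assumptions, not ILLMO code.

```python
def r_squared(observed, predicted):
    """Square of the Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical values: predictions that are perfectly linearly related
# to the observations explain all of the variance.
perfect = r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
partial = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
print(perfect, partial)
```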

8. All stimuli; Thurstone models (trial average): Shows the distributions of the trial averages in the distinct experimental conditions. These distributions are assumed to be Gaussian (according to the central-limit theorem), even in cases where the distributions of the individual observations are not. In the example case, the graphs have different widths because the number of observations that are averaged differs between the two conditions (22 in case of the reference condition, which corresponds to the blue curve, and 18 in case of the selected condition, which corresponds to the green curve).
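The difference in width described under plot option 8 follows from the standard result that the average of n independent observations has a standard deviation that is smaller by a factor of the square root of n. The sketch below assumes a shared single-trial standard deviation of 1.0, which is a made-up value; the observation counts 22 and 18 are the ones quoted above.

```python
import math


def sd_of_average(single_trial_sd, n_observations):
    """Standard deviation of the average of n independent observations."""
    return single_trial_sd / math.sqrt(n_observations)


shared_sd = 1.0  # hypothetical single-trial standard deviation
sd_reference = sd_of_average(shared_sd, 22)  # reference condition (blue)
sd_selected = sd_of_average(shared_sd, 18)   # selected condition (green)
print(f"reference: {sd_reference:.3f}, selected: {sd_selected:.3f}")
```

The condition with more observations (the reference condition) therefore yields the narrower curve.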

9. All stimuli – Thurstone models (single trial): Shows the continuous distributions that are used to model the observed histograms of the individual observations in the distinct experimental conditions. The blue curve corresponds to the reference condition, while the green curve corresponds to the selected condition. In case of discrete data, the vertical lines indicate the boundaries between the different categories and the area under a curve between two boundaries is the model prediction for the probability that an answer in that category is observed for the corresponding stimulus.
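For discrete data, the area between two category boundaries mentioned under plot option 9 can be computed from the cumulative distribution. The sketch below assumes a Gaussian single-trial distribution; the average, standard deviation, and boundary positions are hypothetical, and ILLMO may use other distribution families.

```python
from statistics import NormalDist


def category_probabilities(mean, sd, boundaries):
    """Area under the Gaussian between consecutive category boundaries.

    The boundaries must be sorted; k boundaries define k + 1 categories,
    with the first and last category extending to minus and plus infinity.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    cumulative = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
    return [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]


# Hypothetical stimulus with four boundaries, i.e. five response categories.
probs = category_probabilities(mean=0.3, sd=1.0, boundaries=[-1.0, 0.0, 1.0, 2.0])
for category, p in enumerate(probs, start=1):
    print(f"category {category}: {p:.3f}")
```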

10. All stimuli: cumulative histograms/distributions: the cumulative histograms of the observed data (in red for the selected condition and black for all other conditions) are the stepwise curves, while the cumulative distributions (in green for the selected condition and blue for all other conditions) are the continuous curves. The parameters of the distributions, most notably their averages and the (shared) standard deviation, are derived by optimizing the LLC. This plot option is especially useful to visually inspect how well the constructed distribution for an individual condition agrees with the observed histogram.

11. All stimuli: normal plots: The normal plots offer an alternative way to visually inspect how well the modeled distributions agree with the observed histograms for each of the experimental conditions. If the lower corners of the stepwise curves are exactly on the diagonal line, then there is a perfect agreement. If the corners are on a straight line that runs parallel to the diagonal, then there is agreement in the shape of the distribution, but the average is shifted. Similarly, if the corners are on a straight line that is not parallel to the diagonal, then this is evidence for a mismatch in the standard deviation. The more the corners deviate from a straight line, the bigger the difference between the shape of the observed histogram and the modeled distribution. The red line corresponds to the selected condition, while the lines for the other conditions are shown in black.
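The reading rules for plot option 11 can be illustrated with a generic normal plot. The sketch below pairs, for each sorted observation, its z-score under the modeled Gaussian with the z-score of its empirical cumulative proportion, and then fits a straight line through these points. The plotting positions (i + 0.5)/n, the function names, and the data are assumptions; ILLMO's exact construction from the corners of the stepwise curves may differ.

```python
from statistics import NormalDist


def normal_plot_points(data, mean, sd):
    """Pairs of (z-score under the model, z-score of the empirical proportion)."""
    std = NormalDist()
    xs = sorted(data)
    n = len(xs)
    return [((x - mean) / sd, std.inv_cdf((i + 0.5) / n)) for i, x in enumerate(xs)]


def fit_line(points):
    """Least-squares slope and intercept through the plotted points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: exact Gaussian quantiles whose average is shifted by 0.5
# relative to the model (mean 0, standard deviation 1).
std = NormalDist()
shifted = [0.5 + std.inv_cdf((i + 0.5) / 9) for i in range(9)]
slope, intercept = fit_line(normal_plot_points(shifted, mean=0.0, sd=1.0))
print(f"slope {slope:.3f}, intercept {intercept:.3f}")
```

A slope of one with a non-zero intercept corresponds to a line parallel to the diagonal (a shifted average), whereas a slope different from one would point to a mismatch in the standard deviation.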

12. Stim nsel & nref: Thurstone models (average trial): same as plot option 8, but only the reference and the selected condition are plotted.

13. Stim nsel & nref: Thurstone Models (single trial): same as plot option 9, but only the reference and the selected condition are plotted.

14. Stim nsel: cumulative histogram/distribution: same as plot option 10, but only the selected condition is plotted. In the case of discrete data, the values of the observed and model distributions at the category boundaries are highlighted by dots.

15. Stim nsel: normal plot: same as plot option 11, but only the selected condition is plotted. In the case of discrete data, the values at the category boundaries are highlighted by dots.

16. Condition versus data: A scatterplot with the condition (or stimulus number) along the horizontal axis and the observed data values along the vertical axis. In the case of continuous data (left), dithering along the horizontal axis is used to reduce the chance of overlapping dots. In the case of discrete data (right), the (vertical) size of the dots reflects the number of repetitions.
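The dithering used for continuous data in plot option 16 amounts to adding a small random horizontal offset to each point while leaving the vertical position untouched. A minimal sketch, in which the offset width, the seed, and the sample values are hypothetical choices:

```python
import random


def dither(conditions, width=0.2, seed=0):
    """Add a uniform random offset in [-width, width] to each condition number."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [c + rng.uniform(-width, width) for c in conditions]


conditions = [1, 1, 1, 2, 2, 3]            # stimulus number per observation
values = [4.1, 4.1, 4.3, 5.0, 5.0, 6.2]    # observed data (unchanged)
x_positions = dither(conditions)
for x, y in zip(x_positions, values):
    print(f"{x:.3f}  {y}")
```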

17. Condition versus data & model: A scatterplot with the condition (or stimulus number) along the horizontal axis and both observed and expected values along the vertical axis. The data in black are identical to those generated in case of plot option 16, although they have been offset slightly to the left of the stimulus number to make room for the model predictions. In case of continuous data (see left graph), the model expectations (in red) are represented by the average with an interval equal to +/- 2 times the standard deviation. The minimum and maximum observed values are represented by the dotted horizontal lines. In case of discrete data (see right graph), the (vertical) size of the red dots reflects the modeled probabilities in the different categories. In the latter case, categories with very low probabilities are rendered empty and the category boundaries are indicated by horizontal dotted lines.

18. LL/CR: condition versus difference (effect): The reference condition is represented by a point on the horizontal axis (with zero value). In case the confidence intervals for the differences d(nref,n) are known (their computation can be initiated from within the prob dialog window), they are indicated by the red intervals. This plot allows one to decide whether the average value for condition n is significantly different from the average value for the reference condition (nref).

19. Condition versus average (effect): The average values are plotted (as black points) along the vertical axis as a function of the stimulus number along the horizontal axis. In case the confidence intervals for the averages a(n) are known (their computation can be initiated from within the prob dialog window), they are indicated by the red intervals. This plot allows one to decide whether the average value for a condition n is significantly different from an a priori assumed value. Either the minimum/maximum observed values (in case of continuous data) or the category boundaries (in case of discrete data) are rendered as dotted horizontal lines.

Note that the graphs in ILLMO contain a little ‘Save’ icon in the lower-left corner that allows the user to save the picture as a PNG file (which is actually how the above images were generated). The user can also zoom in on a part of the graph by selecting a rectangle with the mouse.

In case gnuplot and ghostscript are installed on the user’s computer, it makes sense to activate the construction and viewing of postscript files through the “Options” menu. A second “gnuplot” icon will then appear next to the ‘Save’ icon, which allows the user to save the data as gnuplot files and to generate postscript or PNG files from them. The advantage of gnuplot is that the information in the graphs can be edited.

The ILLMO Project

Statistical Modeling Software by Jean-Bernard Martens
