In the ‘getting started’ section, we discussed the case of continuous data analysis. Let us revisit the analysis of the ‘data40.csv’ data set, this time taking the discrete nature of the data into account. There are two ways to do this. The first is to read in the data again, as specified in step 2 of the getting-started guide, and to select ‘Yes’ when asked whether the data should be considered as discrete (integers). The second is to modify the model of the data that is already available as continuous, by clicking on the C button. This brings up the dialog box below (in which the switch to discrete data has already been made). Closing this dialog box changes the label from C to D, reflecting the fact that the data is now interpreted as discrete rather than continuous.
In the case of discrete data, the input data can be interpreted as a (two-way) contingency table (or counts data), as can be observed when opening the data dialog box.
For discrete data, the Thurstone model is extended with category boundaries (by default, these boundaries are placed at half-integer values). They are represented by vertical and/or horizontal dotted lines in many of the figures. In the ‘Thurstone Models (single trials)’ rendering on the left, this means that the model probability of answering in a specific category for the selected condition (2) is equal to the area under the green curve between the corresponding category boundaries, as illustrated in the right graph.
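The computation behind this rendering can be sketched in a few lines of Python; the mean, standard deviation, and 5-point scale below are illustrative assumptions, not values from the actual model fit:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution,
    expressed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def category_probabilities(mu, sigma, boundaries):
    """Thurstone single-trial model: the probability of answering in
    category k is the area under the normal curve between consecutive
    category boundaries (the outermost intervals are open-ended)."""
    edges = [-math.inf] + list(boundaries) + [math.inf]
    return [normal_cdf(edges[k + 1], mu, sigma) - normal_cdf(edges[k], mu, sigma)
            for k in range(len(edges) - 1)]

# Hypothetical 5-point scale with the default half-integer boundaries
probs = category_probabilities(mu=3.2, sigma=1.0,
                               boundaries=[1.5, 2.5, 3.5, 4.5])
print([round(p, 3) for p in probs])  # one probability per category, summing to 1
```

The same boundaries appear as the dotted lines in the figures; moving them (or the mean and standard deviation of the condition) changes how the total probability mass is divided over the categories.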
ILLMO offers several alternative renderings for visually comparing the model probabilities (in the selected condition) to the observed fractions. The left figure, titled ‘nsel: fractions versus probabilities’, does so in the form of a scatterplot, while the right figure, titled ‘nsel: regular histogram/distribution’, compares the regular histogram (as observed) to the regular distribution (as derived from the Thurstone model).
Note that, by adopting Thurstone modeling, the stimulus conditions are still represented by continuous distributions, even though the input data itself is discrete. This implies that the continuous analyses introduced in the getting-started section can still be applied. One should realize, however, that they only produce approximate results, and that the only theoretically sound way of drawing statistical inferences in such a case is the general method of confidence intervals, as obtained from the log-likelihood profile.
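As an illustration of how a confidence interval can be read off a log-likelihood profile, here is a minimal Python sketch. The data values and the normal model with known standard deviation are made-up assumptions, and ILLMO's own implementation is more general; the sketch only shows the principle: keep the parameter values whose profile log-likelihood stays within half the chi-square criterion of the maximum.

```python
import math

def profile_ci(loglik, grid, crit=3.841):
    """Approximate 95% confidence interval from a log-likelihood
    profile: a parameter value t is inside the interval when
    2 * (max log-likelihood - loglik(t)) <= chi-square(1, 0.95)."""
    values = [loglik(t) for t in grid]
    ll_max = max(values)
    inside = [t for t, ll in zip(grid, values) if 2.0 * (ll_max - ll) <= crit]
    return min(inside), max(inside)

# Hypothetical observations; log-likelihood of the mean for a normal
# model with known sigma = 1 (up to an additive constant)
data = [4.1, 5.0, 4.6, 5.3, 4.8]

def loglik(mu):
    return -0.5 * sum((x - mu) ** 2 for x in data)

grid = [i * 0.001 for i in range(3000, 7000)]  # scan mu from 3.0 to 7.0
lo, hi = profile_ci(loglik, grid)
print(round(lo, 3), round(hi, 3))
```

For this simple model the interval coincides with the familiar mean ± 1.96·σ/√n, but the profile construction also works for parameters (such as the difference of averages below) that have no closed-form interval.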
Discrete data, such as obtained from experiments with Likert scales, are often considered to be on an ordinal scale rather than on an interval scale. Most textbooks advise analyzing such data using non-parametric methods, which can be simulated in ILLMO by selecting a rank-order mapping (using the spline dialog window).
In the case of our example data, this indeed results in lower values of the LLC and AIC, and hence in a better model description. The effect of this nonlinear mapping is that the category boundaries become non-uniformly distributed, or equivalently, that the categories themselves have unequal sizes.
In this non-parametric case as well, inferences about the difference in the averages between the two conditions are best made using the LLP. In the example case, a zero difference of the averages lies on the boundary of the confidence interval, so that this difference is marginally (in)significant.
Note how the general approach of log-likelihood modeling makes it possible to treat continuous and discrete data in a uniform way.
While the rank-order transformation did indeed result in a better model fit in this example case, there is no theoretical argument why this should always be so. So feel free, especially with your own data, to try out alternative non-linear transformations, such as spline or Box-Cox functions, alongside the ‘rank-order’ mapping and no mapping at all. Use the AIC to establish which alternative is best in your specific case.
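Such an AIC-based comparison can be sketched as follows; the log-likelihood values and parameter counts below are made up for illustration and do not come from the example data set:

```python
def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 ln L. Lower values indicate
    a better trade-off between goodness of fit and model complexity."""
    return 2 * n_params - 2 * loglik

# Hypothetical maximized log-likelihoods and parameter counts for the
# candidate mappings discussed above (all numbers are invented)
candidates = {
    "no mapping": aic(loglik=-152.3, n_params=4),
    "rank-order": aic(loglik=-148.9, n_params=4),
    "Box-Cox":    aic(loglik=-150.1, n_params=5),
}
best = min(candidates, key=candidates.get)
print(best, round(candidates[best], 1))
```

Note that the AIC penalizes extra parameters, so a more flexible mapping only wins when its gain in log-likelihood outweighs the added complexity.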