Discrete (Likert scale) data

In the ‘getting started’ section, we discussed the case of continuous data analysis. Let us revisit the analysis of the ‘data40.csv’ data set, but this time taking the discrete nature of the data into account. There are two ways to do this. The first is to read in the data all over again, as specified in step 2 of the getting-started guide, and select ‘Yes’ when asked if the data should be considered as discrete (integers). The second is to modify the model of the data that is already available as continuous, by clicking on the C button. This brings up the dialog box below (where the switch to discrete data has already been made). Closing this dialog box will change the label from C to D, hence reflecting the fact that the data is now being interpreted as discrete rather than continuous.

dialogctod

Dialog window used to control the interpretation of the data as either discrete or continuous

In the case of discrete data, the input data can be interpreted as a (two-way) contingency table (i.e., counts data), as can be observed when opening the data dialog box.

data40-table

The data dialog window can display discrete data as a contingency table
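The contingency-table view described above simply tallies how often each response category occurs in each condition. A minimal sketch of that tallying (the response values and category count are illustrative, not taken from 'data40.csv'):

```python
from collections import Counter

def contingency_table(responses_by_condition, n_categories):
    # Count how often each response category (1..n_categories) occurs
    # in each condition, producing a (conditions x categories) table of counts.
    table = []
    for responses in responses_by_condition:
        counts = Counter(responses)
        table.append([counts.get(c, 0) for c in range(1, n_categories + 1)])
    return table

# Hypothetical Likert responses for two conditions:
cond_a = [2, 3, 3, 4, 2, 3, 5, 4]
cond_b = [3, 4, 4, 5, 4, 3, 5, 5]
print(contingency_table([cond_a, cond_b], n_categories=5))
# → [[0, 2, 3, 2, 1], [0, 0, 2, 3, 3]]
```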

In the case of discrete data, the Thurstone model changes in that category boundaries are introduced (by default, these category boundaries are placed at half-integer values). They are represented by vertical and/or horizontal dotted lines in many of the figures. In the ‘Thurstone Models (single trials)’ rendering on the left, this means that the model probability of answering in a specific category, for the selected condition (condition 2), is equal to the area under the green curve between the corresponding category boundaries, as illustrated in the right graph.

illmo_thurstone_data40

Thurstone modeling, showing how continuous distributions can be used to provide probabilities for discrete data
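The computation behind this figure can be sketched as follows: each category probability is the area under a continuous distribution between two half-integer boundaries. This minimal sketch assumes a normal distribution; the mean, standard deviation, and number of categories are illustrative, not ILLMO's fitted values:

```python
import math

def norm_cdf(x, mu, sigma):
    # Normal cumulative distribution function via the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def category_probabilities(mu, sigma, n_categories):
    # Category boundaries at half-integer values: 1.5, 2.5, ..., n - 0.5,
    # with the outermost categories extending to +/- infinity.
    boundaries = [-math.inf] + [k + 0.5 for k in range(1, n_categories)] + [math.inf]
    # Probability of each category = area under the curve between its boundaries
    return [norm_cdf(boundaries[i + 1], mu, sigma) - norm_cdf(boundaries[i], mu, sigma)
            for i in range(n_categories)]

probs = category_probabilities(mu=3.2, sigma=1.0, n_categories=5)
print([round(p, 3) for p in probs])  # five probabilities that sum to 1
```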

ILLMO offers several alternative renderings to visually compare the model probabilities (in the selected condition) to the observed fractions. The left figure entitled ‘nsel: fractions versus probabilities’ does so in the form of a scatterplot, while the right figure entitled ‘nsel: regular histogram/distribution’ compares the regular histogram (as observed) to the regular distribution (derived from the Thurstone modeling).

Alternative ways for comparing measured fractions to modeled probabilities

Note that by adopting Thurstone modeling, the stimulus conditions are still represented by continuous distributions, despite the fact that the input data itself is discrete. This implies that the continuous analyses introduced in the getting-started section can still be applied. One should, however, realize that they only produce approximate results, and that the only theoretically sound way of drawing statistical inferences in such a case is the general method of confidence intervals, as obtained from the log-likelihood profile (LLP).
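The idea behind a profile-likelihood confidence interval can be illustrated with a toy example: fix the parameter of interest at a candidate value, maximize the log-likelihood over the remaining parameters, and keep the candidate if the deviance stays below the chi-squared critical value. This sketch profiles the mean of a normal model (the data, grid, and 95% critical value 3.841 are illustrative; this is not ILLMO's implementation):

```python
import math

def profile_ci_mean(data, grid, crit=3.841):
    # Profile log-likelihood for the mean of a normal model: for each
    # candidate mean, the variance is maximized out analytically.
    n = len(data)

    def max_loglik(mu):
        s2 = sum((x - mu) ** 2 for x in data) / n  # MLE of variance given mu
        return -0.5 * n * (math.log(2 * math.pi * s2) + 1.0)

    ll_hat = max_loglik(sum(data) / n)  # overall maximum is at the sample mean
    # Keep candidates whose deviance 2*(ll_hat - ll) is below the
    # chi-squared(1 df) critical value for a 95% confidence interval.
    return [mu for mu in grid if 2 * (ll_hat - max_loglik(mu)) <= crit]

data = [2.1, 3.0, 2.6, 3.4, 2.9, 3.1, 2.5, 2.8]
grid = [i / 100 for i in range(200, 400)]  # candidate means from 2.00 to 3.99
inside = profile_ci_mean(data, grid)
print(inside[0], inside[-1])  # approximate lower and upper CI endpoints
```

If a hypothesized value (e.g., a zero difference between conditions) falls outside the interval, it is rejected at the corresponding significance level.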

Discrete data, as obtained from experiments with Likert scales, are often considered to be on an ordinal scale rather than on an interval scale. Most textbooks advise analyzing such data using non-parametric methods, which can be simulated in ILLMO by selecting a rank-order mapping (using the spline dialog window).

data40_spline

The spline dialog window can be used to select a rank-order transformation on the input data
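A rank-order mapping replaces each observed value by its rank, which is what makes the subsequent analysis non-parametric in spirit. A minimal sketch (with average ranks for ties, a common convention; the input values are illustrative):

```python
def rank_order_map(values):
    # Map each observed value to its rank in the sorted data;
    # tied values receive the average of the ranks they span.
    sorted_vals = sorted(values)
    positions = {}
    for i, v in enumerate(sorted_vals):
        positions.setdefault(v, []).append(i + 1)
    avg_rank = {v: sum(r) / len(r) for v, r in positions.items()}
    return [avg_rank[v] for v in values]

print(rank_order_map([3, 1, 4, 1, 5]))  # → [3.0, 1.5, 4.0, 1.5, 5.0]
```

Because Likert data contain many ties, the averaged ranks are non-uniformly spaced, which is exactly why the category boundaries become non-uniformly distributed after the transformation.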

In the case of our example data, this indeed results in a lower value of the LLC and AIC, and hence in a better model description. The effect of this nonlinear mapping is that the category boundaries become non-uniformly distributed, or, equivalently, that the categories themselves have unequal sizes.

illmo_rank_order_data40

The ILLMO interface after applying a rank-order transformation to the input data

In this non-parametric case too, inferences about the difference in averages between the two conditions are best drawn using the LLP. In the example, a zero difference in averages lies on the boundary of the confidence interval, so that this difference is marginally (in)significant.

illmo_differencenp_data40

LLP and confidence interval for the difference in averages between both conditions, where the data have undergone a rank-order transformation

Note how the general approach of log-likelihood modeling makes it possible to treat continuous and discrete data in a uniform way.

While in this example case the rank-order transformation did indeed result in a better model fit, there is no theoretical argument why this should always be the case. So feel free, especially with your own data, to try out alternative non-linear transformations, such as spline or Box-Cox functions, alongside no mapping at all or the ‘rank-order’ mapping. Use the AIC to establish which alternative is best in your specific case.
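The AIC comparison suggested above is straightforward: AIC = 2k − 2·LL, where k is the number of model parameters and LL the maximized log-likelihood, and the lowest AIC wins. A sketch with purely hypothetical fit values (not results from 'data40.csv'):

```python
def aic(log_likelihood, n_params):
    # Akaike Information Criterion: lower values indicate a better
    # trade-off between goodness of fit and model complexity.
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits of the same data under different mappings
# (log-likelihoods and parameter counts are illustrative):
candidates = {
    "no mapping": aic(log_likelihood=-63.1, n_params=4),
    "rank-order": aic(log_likelihood=-60.4, n_params=4),
    "spline":     aic(log_likelihood=-59.8, n_params=7),
}
best = min(candidates, key=candidates.get)
print(best)  # → rank-order
```

Note how the spline mapping achieves the highest log-likelihood yet loses on AIC because of its extra parameters; this is the complexity penalty at work.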
