Objective 4: Tests of association for ordinal data
As with nominal-level data, several measures of association are appropriate for ordinal-level data. Unlike the nominal measures, however, ordinal measures range in value from -1 to 1, so they indicate the direction of a relationship as well as its strength. Use the ordinal measures of association when both of the variables in the contingency table you are analyzing are ordinal. If one variable is nominal, use the appropriate nominal measure of association instead.
Ordinal measures rely upon a comparison of paired relationships in the data. To ground our discussion, let's look at Table 14-9, which presents a hypothetical relationship between two dichotomous ordinal variables. Although the table is simplified because it depicts only four cases, it will make our explanation of pairs clearer. In any table, the number of possible pairs of observations is determined by the following formula:
Total number of pairs = N (N – 1)/2.
In the formula, N is the total number of observations. In Table 14-9, the total number of possible pairs of observations is six, as calculated by 4(4 - 1)/2 = 6.
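As a quick check of the formula, the short sketch below enumerates every pair among four cases and compares the count with N(N - 1)/2. The case labels reuse the cell names A through D, assuming one case per cell as in Table 14-9; the function name `total_pairs` is illustrative only.

```python
from itertools import combinations

def total_pairs(n):
    """Number of distinct pairs among n observations: N(N - 1)/2."""
    return n * (n - 1) // 2

# One case per cell, labeled by the cell it falls in (an assumption
# based on the four-case table described in the text).
cases = ["A", "B", "C", "D"]
pairs = list(combinations(cases, 2))

print(pairs)                    # AB, AC, AD, BC, BD, CD
print(total_pairs(len(cases)))  # 6
```

Listing the pairs explicitly makes it easier to see which of them are concordant, discordant, or tied in the discussion that follows.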
As in our previous examples in this chapter, the cells in the table are labeled A through D. We also still have the main diagonal (AD) and the off diagonal (BC). Keeping this in mind, we can examine the possible types of pairs found in a contingency table.
A concordant pair of cases is one in which one case is higher on both variables than the other case. The case in cell D has a higher level of education and income than the case in cell A. This pair lies along the main diagonal. In a 2 by 2 table, the product of the cell counts along the main diagonal yields the number of concordant pairs. Thus, if cell A has four observations and cell D has three observations, there are twelve concordant pairs in the table (4 × 3 = 12). The greater the proportion of concordant pairs in a table, the stronger the relationship in a positive direction.
A discordant pair of cases is one in which one case is higher than the other case on one variable but lower on the other variable. The case in cell B has a higher level of education than the case in cell C. On the other hand, the case in cell C has a higher level of income than the case in cell B. This pair lies along the off diagonal. In a 2 by 2 table, the product of the cell counts along the off diagonal yields the number of discordant pairs. The greater the proportion of discordant pairs in a table, the stronger the relationship in a negative direction.
A tied pair of cases is one in which the two observations have the same value on at least one of the variables. The cases in cells A and C both have low levels of education. The cases in cells B and D both have high levels of education. We call these tied pairs the X pairs (education is the independent variable, which is denoted as X). The cases in cells A and B both have low levels of income. The cases in cells C and D both have high levels of income. We call these tied pairs the Y pairs. The greater the proportion of tied pairs in a table, the weaker the relationship between the variables. In other words, the appropriate measures of association will have values closer to zero than they would in tables with higher proportions of concordant and discordant pairs. Now let's turn our attention to some specific ordinal measures of association.
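The pair types described above can be counted directly from the four cell counts of a 2 by 2 table. The following is a minimal sketch, assuming rows hold the dependent variable (income, Y) and columns hold the independent variable (education, X), both ordered from low to high. The function name is hypothetical, and the counts for cells B and C in the second call are made up for illustration (only A = 4 and D = 3 come from the text). Pairs that fall in the same cell are tied on both variables and are counted separately here.

```python
def pair_counts_2x2(a, b, c, d):
    """Classify every pair of cases in a 2 by 2 table.

    Assumed layout (both variables ordered low to high):

                 X low   X high
        Y low      a       b
        Y high     c       d
    """
    n = a + b + c + d
    counts = {
        "concordant": a * d,      # main diagonal
        "discordant": b * c,      # off diagonal
        "tied_x": a * c + b * d,  # same X category, different Y
        "tied_y": a * b + c * d,  # same Y category, different X
        # Both cases in the same cell: tied on X and Y at once.
        "tied_both": sum(k * (k - 1) // 2 for k in (a, b, c, d)),
    }
    # Every one of the N(N - 1)/2 pairs falls into exactly one class.
    assert sum(counts.values()) == n * (n - 1) // 2
    return counts

# Four cases, one per cell: 1 concordant, 1 discordant, 2 X pairs, 2 Y pairs.
print(pair_counts_2x2(1, 1, 1, 1))

# A = 4 and D = 3 as in the text; B = 2 and C = 1 are hypothetical.
print(pair_counts_2x2(4, 2, 1, 3)["concordant"])  # 12
```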
You can use the gamma measure with any size table. It is a symmetrical measure. Similar to the lambda measure, gamma is a PRE measure of association. Hence, its value can be interpreted as the percentage by which prediction errors are reduced. Unlike other ordinal measures of association, gamma does not consider tied pairs in its calculation. Therefore, its value will generally be larger in magnitude than that of the other ordinal measures calculated for the same table. In other words, gamma tends to overstate the strength of a relationship. The following formula is used to calculate the gamma statistic:
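In its standard form, gamma is the number of concordant pairs minus the number of discordant pairs, divided by their sum: Gamma = (Nc – Nd)/(Nc + Nd). The labels Nc and Nd are assumed here to avoid confusion with the cells named C and D. The sketch below counts both kinds of pairs in a table of any size and computes gamma from them. It assumes the table is given as a list of rows with both variables ordered from low to high; the function names and the example counts are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in a contingency table.

    `table` is a list of rows of cell counts. Rows and columns must
    both be ordered from the lowest category to the highest.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Compare cell (i, j) only with cells in higher rows, so
            # each pair of cells is visited once.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:    # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:  # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant

def gamma(table):
    """Gamma = (Nc - Nd) / (Nc + Nd); tied pairs are ignored."""
    nc, nd = concordant_discordant(table)
    if nc + nd == 0:
        raise ValueError("gamma is undefined without untied pairs")
    return (nc - nd) / (nc + nd)

# Hypothetical 2 by 2 counts, not the data from any table in the text.
example = [[4, 2],
           [1, 3]]
print(concordant_discordant(example))  # (12, 2)
print(round(gamma(example), 2))        # 0.71
```

Because tied pairs never enter the denominator, a table with many ties can still produce a large gamma, which is the overstatement the text describes.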
Let's look at another example to illustrate the use of the gamma statistic. Suppose you want to analyze the political socialization effect of the media. Therefore, you collect data that measures the political knowledge of a group of students and the amount of time that they read a daily newspaper. Your research hypothesis is "There is a positive relationship between political knowledge and reading the newspaper." After you collect the data, you collapse political knowledge into two categories: low and high. You also collapse time spent reading the newspaper into two categories: seldom and often. (You can see that collapsing the data loses some specificity of information. However, this is just an example to illustrate the use of the gamma statistic.) Table 14-10 is the contingency table you construct.
Analysis of the percentages shows that there is a strong relationship between the variables. As in our examples that examined the relationship between nominal-level variables, we can calculate a measure of association that will enhance our ability to interpret the relationship. For this example we want to calculate the gamma statistic. The calculation is
Gamma is a PRE measure of association. Thus we can say that our knowledge of a respondent's level of newspaper reading reduces the errors we would make when predicting the respondent's level of political knowledge by 74 percent.
Tau Statistics: Tau b and Tau c
The tau statistic is another bivariate statistic that is used with ordinal-level data. There are two tau statistics that you can use: tau b and tau c. You should use the tau b measure only for square tables (2 by 2, 3 by 3, etc.). Why? Tau b can reach its limits of -1 or 1 only when the table has the same number of rows and columns. It will not reach these limits with any other shape of table. The tau c measure corrects for this deficiency. Thus, you can use tau c with any size table. In practice, however, the difference between the values of the two measures is usually slight. These measures differ from gamma because they consider the impact of the tied pairs in their calculations. Thus, their values are usually smaller in magnitude than the gamma measure calculated for the same contingency table. The formulas for the two measures are
N = the total number of cases.
m = smaller of rows or columns in the table (in a 4 by 3 table, m = 3).
X = pairs tied on the independent variable.
Y = pairs tied on the dependent variable.
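In their standard textbook form, the two coefficients can be written as tau b = (Nc – Nd)/sqrt[(Nc + Nd + X)(Nc + Nd + Y)] and tau c = 2m(Nc – Nd)/[N²(m – 1)], where Nc and Nd are the numbers of concordant and discordant pairs (labels assumed here, not taken from the text). The following sketch applies both formulas to pair counts that have already been tallied. The function names and the example counts are hypothetical and do not reproduce Table 14-10.

```python
from math import sqrt

def tau_b(concordant, discordant, tied_x, tied_y):
    """Tau b: ties on X and on Y enlarge the denominator."""
    pairs_with_x_ties = concordant + discordant + tied_x
    pairs_with_y_ties = concordant + discordant + tied_y
    return (concordant - discordant) / sqrt(pairs_with_x_ties * pairs_with_y_ties)

def tau_c(concordant, discordant, n, m):
    """Tau c: adjusts for table shape through m = min(rows, columns)."""
    return 2 * m * (concordant - discordant) / (n ** 2 * (m - 1))

# Hypothetical 2 by 2 table with cell counts A=4, B=2, C=1, D=3:
# Nc = 12, Nd = 2, X pairs = 10, Y pairs = 11, N = 10, m = 2.
print(round(tau_b(12, 2, 10, 11), 2))  # 0.41
print(round(tau_c(12, 2, 10, 2), 2))   # 0.4
```

For the same hypothetical counts gamma would be (12 – 2)/(12 + 2), or about .71, so both tau values are noticeably smaller once the tied pairs are taken into account.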
When we calculate the tau b and tau c coefficients for our example in Table 14-10, the results are tau b = .43 and tau c = .38. The tau measures are also PRE measures, and our table is a square table. Thus, we use the tau b result to interpret our results. Based on our discussion we can say that our knowledge of a respondent’s level of newspaper reading reduces the errors we would make when predicting the respondent’s level of political knowledge by 43 percent. Note the difference in values between tau b and gamma (.74). This supports our previous discussion that the gamma measure will overestimate the relationship because the tied pairs are not part of gamma’s calculation. In other words, gamma suggests a stronger relationship. The results also support Richard Cole’s concern that we should select the appropriate measure based on the level of measurement, table size, and whether the measure is asymmetrical or symmetrical, not just the measure with the highest value.
Gamma and the tau coefficients are symmetric in their treatment of the variables, which means that their values do not depend on which variable is the independent variable and which variable is the dependent variable. Somers' D, however, is asymmetric in interpretation. Therefore, when you hypothesize a direction in a relationship, or specify causality, you should use this measure in lieu of gamma and the tau coefficients. This measure is also helpful because you can use it with any size table. In addition, the D measure can consider the pairs that are tied on the dependent variable (Y pairs) in its calculation [DYX]. The statistic can also consider the pairs that are tied on the independent variable (X pairs) in its calculation [DXY]. In this regard, the statistic is similar to the tau statistics.
You will recall that gamma does not consider the tied pairs in its calculation. As a result, the gamma measure is often larger than Somers' D. The formulas for the measures are similar to the gamma formula:
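In the standard form, DYX = (Nc – Nd)/(Nc + Nd + Y) and DXY = (Nc – Nd)/(Nc + Nd + X), where Nc and Nd are the numbers of concordant and discordant pairs (labels assumed here) and X and Y are the tied pairs defined earlier. The sketch below is a minimal illustration with a hypothetical function name and made-up pair counts, not the data behind the chapter's example.

```python
def somers_d(concordant, discordant, tied_pairs):
    """Somers' D: the gamma ratio with one set of tied pairs added
    to the denominator.

    Pass the Y pairs (tied on the dependent variable) for DYX,
    or the X pairs (tied on the independent variable) for DXY.
    """
    return (concordant - discordant) / (concordant + discordant + tied_pairs)

# Hypothetical counts: Nc = 12, Nd = 2, X pairs = 10, Y pairs = 11.
d_yx = somers_d(12, 2, tied_pairs=11)
d_xy = somers_d(12, 2, tied_pairs=10)

print(round(d_yx, 2))  # 0.4
print(round(d_xy, 2))  # 0.42
```

The two values differ because each version penalizes a different set of ties, which is why the choice of dependent variable matters for this measure.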
When we calculate Somers' DYX (the appropriate coefficient for our example), we get a value of .44. Somers' D is also a PRE measure. Again we can say that our knowledge of a respondent's level of newspaper reading reduces the errors we would make when predicting the respondent's level of political knowledge by 44 percent. Thus, based on our preliminary analysis, it looks like the hypothesis has some support.