Block 14
Objective 3: Tests of association for nominal data

So far we have examined the relationship between two variables by analyzing a contingency table. These tables can, at times, be so large that we need some simpler way to summarize the information. To do this, we use measures of association that efficiently summarize the following:

 The existence of a relationship.
 The direction of a relationship.
 The strength of a relationship.
 The statistical significance of a relationship.

Measures of association mathematically summarize the distribution of cases in the cells. As we said earlier in Section 14-2, these measures range from 0 to 1.0 for nominal-level data and from -1.0 to 1.0 for ordinal- and metric-level data. In all instances, the closer the value is to ±1, the stronger the relationship. A relationship between two ordinal variables can also be curvilinear if it changes direction. For example, the relationship may start out positive and then turn negative.

As you will see, there are several measures you can use. So which ones do you use to determine relationships in your data? The values of different correlation coefficients can vary, even for the same set of data. So, do you select the one with the greatest magnitude? While this may be tempting, it is not the basis for selection. Richard Cole stresses that we must select the appropriate measure based on the level of measurement, the size of the contingency table (the number of cells), and whether the measure is directional, rather than simply choosing the "strongest" value produced for a set of data (Cole 1996, 205-206).

Tests of Association for Nominal Data

There are several measures of association you can use to determine relationships in your data. As discussed in the introduction to this part, each measure has its own application. The level of measurement, size of the table (2 by 2, 2 by 3, and so on), and the direction of the relationship dictate the measure you should use. In addition, each measure has its own interpretation, advantages, and disadvantages.

The Phi Coefficient

Political scientists often use the phi coefficient if at least one of the variables in a particular contingency table is nominal and both variables are dichotomous, thus producing a 2 by 2 table. Because you use this statistic with data measured at the nominal level, the range of scores is 0 to 1. The closer phi is to 1, the stronger the relationship between the variables. Phi is a symmetrical statistic, which means that its value does not depend on which variable is the independent variable and which variable is the dependent variable.

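Recalling the cell labels shown in Table 14-1, the formula for the phi coefficient is

\[ \phi = \frac{AD - BC}{\sqrt{(A + B)(C + D)(A + C)(B + D)}} \]

where A, B, C, and D are the frequencies in the four labeled cells.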

The phi coefficient measures the concentration of observations on either diagonal. To illustrate, let's reexamine the relationship presented in Table 14-6. You will recall that the table showed a relationship between the political party of U.S. House members and their vote on a bill that eliminates funding for the National Endowment for the Arts. The table shows that most of the cases are concentrated on the main diagonal (AD). You could conclude from this that there is a substantial relationship between support for the bill and political party. You could also make an initial judgment about the strength of the relationship by comparing the percentage of each party that voted for the measure (92 percent of the Republicans who voted supported the legislation, while only about 15 percent of the Democrats cast supporting votes). Or you could calculate the phi coefficient to summarize the relationship. With 208 Republicans and 30 Democrats voting yes, and 18 Republicans and 173 Democrats voting no, you calculate phi as

\[ \phi = \frac{(208)(173) - (30)(18)}{\sqrt{(238)(191)(226)(203)}} = \frac{35{,}444}{45{,}667.5} \approx .78 \]
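As a rough check on the arithmetic, the phi calculation for this table can be sketched in Python. The cell counts are derived from the vote totals given in this section (226 Republicans, of whom 18 voted no; 203 Democrats, of whom 30 voted yes). The function name `phi_coefficient` and the assignment of counts to cells A through D are assumptions made for illustration, since Table 14-6 is not reproduced here.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 by 2 table, assuming cells A, B in the top row and C, D in the bottom row."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Assumed cell layout (hypothetical; chosen so that the main diagonal is AD):
#   A = Republicans voting yes, B = Democrats voting yes,
#   C = Republicans voting no,  D = Democrats voting no.
a, b, c, d = 208, 30, 18, 173  # 226 - 18 = 208 and 203 - 30 = 173

phi = phi_coefficient(a, b, c, d)
print(round(phi, 2))  # 0.78
```

If most cases fell on the other diagonal, this expression would come out negative. Because nominal variables have no direction, only the magnitude would be reported in that case.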

Based on the phi coefficient, you can conclude that a relationship exists between the political party of the representative and his or her vote. But how do you interpret the observed relationship? How do you know whether .78 is a weak, moderate, or strong relationship? What words do you use to describe this relationship? Fortunately, James Davis offered some phrases you can use to describe the various ranges of values. Davis developed the phrases for interpreting Yule's Q, another measure of association. Richard Cole, however, applies these descriptions to other measures of association (Davis 1971, 49; Cole 1996, 205-206). Using Table 14-8, we see that the relationship between the representatives' votes and their political party is very strong. Because nominal measures do not have direction, we did not say that the relationship was a "very strong positive relationship." Keep in mind that the table presents only suggested interpretations.

 Table 14-8. Suggested Interpretations of Measures of Association

 Values            Appropriate Phrases
 +.70 or higher    Very strong positive relationship.
 +.50 to +.69      Substantial positive relationship.
 +.30 to +.49      Moderate positive relationship.
 +.10 to +.29      Low positive relationship.
 +.01 to +.09      Negligible positive relationship.
 0.00              No relationship.
 -.01 to -.09      Negligible negative relationship.
 -.10 to -.29      Low negative relationship.
 -.30 to -.49      Moderate negative relationship.
 -.50 to -.69      Substantial negative relationship.
 -.70 or lower     Very strong negative relationship.

 Note: You can use this table with nominal measures of association, but do not use the direction (positive or negative) when verbally interpreting them.
 Source: Adapted from James A. Davis, Elementary Survey Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1971, 49.
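
The lookup in Table 14-8 can be sketched as a small Python function. The name `describe_association` and its `nominal` flag are hypothetical, and the handling of values that fall between the printed ranges (for example, .695) is an assumption: each phrase is applied from its lower bound up to the next one.

```python
def describe_association(value, nominal=False):
    """Return the Table 14-8 phrase for a measure of association.

    Set nominal=True to drop the direction, as the table note advises
    for nominal measures such as phi, V, and lambda.
    """
    magnitude = abs(value)
    if magnitude >= 0.70:
        strength = "Very strong"
    elif magnitude >= 0.50:
        strength = "Substantial"
    elif magnitude >= 0.30:
        strength = "Moderate"
    elif magnitude >= 0.10:
        strength = "Low"
    elif magnitude > 0:
        strength = "Negligible"
    else:
        return "No relationship."
    if nominal:
        return f"{strength} relationship."
    direction = "positive" if value > 0 else "negative"
    return f"{strength} {direction} relationship."

print(describe_association(0.78, nominal=True))  # Very strong relationship.
print(describe_association(-0.35))               # Moderate negative relationship.
```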

Cramer's V

Cramer's V (V) is a variation of the phi coefficient. However, you can use this measure with any size table if at least one of the variables in a particular contingency table is nominal. When you calculate this statistic for a 2 by 2 table, the result is the same as the phi coefficient. Thus, the V value for our example is also .78. Similar to phi, its values range between 0 and 1.
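
To see why V and phi agree for a 2 by 2 table, the calculation can be sketched in Python. This section does not show a formula for V, so the sketch assumes the common textbook definition, V = sqrt(chi-square / (N × (k − 1))), where k is the smaller of the number of rows and columns. The function names and the row and column layout of `votes` are illustrative assumptions.

```python
import math

def chi_square(table):
    """Pearson chi-square for a table given as a list of rows (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total

def cramers_v(table):
    """Cramer's V under the assumed definition sqrt(chi2 / (N * (k - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (k - 1)))

# Rows: Republicans, Democrats. Columns: yes, no (counts derived from this section).
votes = [[208, 18], [30, 173]]

v = cramers_v(votes)
print(round(v, 2))  # 0.78
```

For a 2 by 2 table, k − 1 equals 1, so V reduces to the square root of chi-square divided by N, which is the magnitude of phi.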

Pearson's Coefficient of Contingency (C)

Pearson's C is more appropriate for larger tables (4 by 4, etc.). Why? Because its upper limit depends on the number of rows and columns, and it approaches 1 only as the table grows. Therefore, the range of values is 0 to something less than 1. In fact, the upper limit for a 2 by 2 table is .71. In our example, the value of this statistic is .61. How do you interpret this value? You cannot use Table 14-8 because its phrases assume values ranging from -1.0 to 1.0, whereas the upper limit of C for a 2 by 2 table is only .71. It is therefore difficult to interpret the magnitude (.61) of the statistic, and this limitation is a distinct disadvantage of the C measure.
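
The values quoted for C can be checked with a short Python sketch. This section does not show a formula for C, so the sketch assumes the usual definition, C = sqrt(chi-square / (chi-square + N)), and the usual upper limit for a square k by k table, sqrt((k − 1) / k). The function names are illustrative.

```python
import math

def chi_square(table):
    """Pearson chi-square for a table given as a list of rows (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total

def contingency_c(table):
    """Pearson's C under the assumed definition sqrt(chi2 / (chi2 + N))."""
    n = sum(sum(row) for row in table)
    chi2 = chi_square(table)
    return math.sqrt(chi2 / (chi2 + n))

def max_c(k):
    """Assumed upper limit of C for a square k by k table."""
    return math.sqrt((k - 1) / k)

# Rows: Republicans, Democrats. Columns: yes, no (counts derived from this section).
votes = [[208, 18], [30, 173]]

c = contingency_c(votes)
print(round(c, 2))         # 0.61
print(round(max_c(2), 2))  # 0.71
print(round(max_c(4), 2))  # 0.87
```

The last two lines show the ceiling rising from .71 for a 2 by 2 table to .87 for a 4 by 4 table, which is why C is easier to read in larger tables.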

The Lambda Coefficient

While the measures we just discussed help us to interpret the strength of a particular relationship, they have several limitations. You can only use phi, for example, to examine a dichotomous relationship. In addition, the measures are not easy to verbally interpret. With a V of .70, for example, you can only say that it is stronger than a V of .69 and weaker than a V of .71. You cannot say that it is 1 percent higher or lower.

Lambda coefficients, on the other hand, are proportionate reduction of error (PRE) statistics. Values of lambda are interpreted as percentages. A lambda value of .29, for example, is interpreted as 29 percent. Lambda coefficients enable us to answer the question: How much can the error in predicting values of the dependent variable be reduced when you know the values of the independent variable? Like the phi coefficient, lambda values range from 0 to 1. Unlike phi, however, you can use the lambda measure with any size table. In addition, lambda is an asymmetrical statistic. In other words, unlike the other nominal measures we have discussed, lambda results require identification of the independent variable and the dependent variable. In addition, lambda tends to underestimate the degree of a relationship. Thus, while the measure has more intrinsic meaning than the other nominal measures, this latter characteristic can limit its use.

Remember that lambda coefficients enable us to answer the question: How much can the error in predicting values of the dependent variable be reduced when you know the values of the independent variable? Based on this premise, the formula for lambda is

\[ \lambda = \frac{L - M}{L} \]

where

L = the number of prediction mistakes made without considering an independent variable.
M = the number of prediction mistakes made when considering an independent variable.

To illustrate, let’s return to our example concerning members of the U.S. House and their vote on HR 2107. Our research question is “Can you use a representative’s political party affiliation to predict their vote for the bill?” If you were asked to individually predict the vote for each of the 429 representatives knowing no information other than the distribution just presented, you would predict yes for each representative. You know from the distribution that there were 238 yes votes versus 191 no votes. Thus, you would predict that a representative voted with the majority if for no other reason than that the odds are in your favor. In other words, your best predictor of how representatives voted is the modal value, or yes.

Yes, however, is only your best predictor. If asked to identify the voting choice of each representative who voted, you would be right 238 times if you responded yes. However, you would also be wrong 191 times because that is the number of representatives who cast no votes. In the lambda formula, the value of L is therefore 191, which is the number of mistakes you would make without any knowledge of the independent variable.

Now let’s consider the impact of the independent variable, political party affiliation. Knowing the impact of party affiliation, you would predict that each representative would vote for the modal choice in each category (Republican = yes; Democratic = no). Even with this knowledge, however, you would be wrong on occasion. How many times would you be wrong? Of the 226 Republicans who voted, 18 opposed the legislation. Of the 203 Democrats who voted, 30 supported it. Therefore, even knowing their party identification and knowing that it related to the vote, you would still be wrong 48 times. Thus, M = 48. Your number of errors without considering the effect of political party, however, was 191. You improved your predictive capability by reducing your errors from 191 to 48. When we insert these figures into the lambda formula we get the following:

\[ \lambda = \frac{L - M}{L} = \frac{191 - 48}{191} = \frac{143}{191} \approx .75 \]

When you interpret this lambda result, you could say something like "Knowledge of the representative's political party affiliation (the independent variable) reduces the errors in predicting the representative's vote on HR 2107 (the dependent variable) by 75 percent." This characteristic gives lambda a distinctive advantage over a measure of association such as phi, C, or V.
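
The error-counting steps described above can be sketched in Python. The function name `lambda_coefficient` and the table layout (one row per category of the independent variable, one column per category of the dependent variable) are assumptions made for illustration; the counts are derived from the vote totals given in this section.

```python
def lambda_coefficient(table):
    """Return (L, M, lambda) for a table whose rows are categories of the
    independent variable and whose columns are categories of the dependent variable.
    """
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    # L: mistakes made when always guessing the overall modal category.
    errors_without = n - max(col_totals)
    # M: mistakes made when guessing the modal category within each row.
    errors_with = sum(sum(row) - max(row) for row in table)
    # Note: this sketch does not guard against errors_without == 0.
    return errors_without, errors_with, (errors_without - errors_with) / errors_without

# Rows: Republicans, Democrats. Columns: yes, no.
votes_by_party = [[208, 18], [30, 173]]

L, M, lam = lambda_coefficient(votes_by_party)
print(L, M, round(lam, 2))  # 191 48 0.75
```

Because lambda is asymmetrical, transposing the table (treating the vote as the independent variable) would generally give a different value.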