Friday, November 09, 2007

Attempting to measure educational performance by looking at improvement over time

Steve Sailer suggested an interesting angle to take regarding gauging pedagogical performance by state:
Although demographics obviously are the driving force in measures of student achievement, it is possible for one state to do a better job than another relative to what it has to work with in terms of student potential. One interesting way to analyze the value added performance of a state's public schools is to compare 8th grade scores versus 4th grade scores on the National Assessment of Educational Progress. If a state improves from 4th to 8th grade relative to the rest of the country, this could be evidence that it is doing a good job of schooling (at least in the middle years).
I previously attempted to do something similar in looking at how a state actually performed compared to how its racial composition 'predicted' it would fare. The results are here.

But there are obvious problems with that approach. It's doubtful that blue-blooded, English-descended whites in Massachusetts enjoy an estimated six point IQ advantage over Scots-Irish Appalachian folk in West Virginia because of superior classroom teaching methods alone.

So I ran the mathematics numbers for fourth and eighth graders, by state, using 2005 scores. The following list shows the gap between fourth and eighth grade scores (for both grade levels, the test is scaled out of 500 possible points):

46: Nebraska
45: Illinois, Massachusetts, Montana, South Dakota (also, DoDEA)
44: Arizona, Iowa, Minnesota, North Dakota, Oregon, Virginia, Wisconsin
43: Alaska, Kentucky, South Carolina, Vermont, Washington
42: Colorado, Indiana, New York
41: Delaware, Missouri, North Carolina, Ohio
40: Maine, Maryland, Nevada, New Jersey, Pennsylvania, Utah
39: California, Connecticut, Idaho, Michigan, New Hampshire, New Mexico, Rhode Island, Tennessee, Texas, Wyoming
38: Georgia, Kansas, Louisiana, West Virginia
37: Alabama, Oklahoma
36: Arkansas, Hawaii
35: Florida, Mississippi
34: District of Columbia
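
For readers who want to reproduce a list like the one above, here is a minimal sketch of the calculation: subtract each state's fourth grade score from its eighth grade score, then group the states that share a gap. The state names and scores are hypothetical placeholders, not actual NAEP results, and the function names are my own.

```python
from collections import defaultdict

# Hypothetical scale scores on the 0-500 scale. These are placeholders,
# NOT actual NAEP results.
scores = {
    "State A": {"grade4": 240, "grade8": 285},
    "State B": {"grade4": 238, "grade8": 278},
    "State C": {"grade4": 230, "grade8": 275},
}


def score_gaps(scores):
    """Return each state's eighth grade score minus its fourth grade score."""
    return {state: s["grade8"] - s["grade4"] for state, s in scores.items()}


def group_by_gap(gaps):
    """Group states sharing a gap, largest gap first, states alphabetized."""
    grouped = defaultdict(list)
    for state, gap in gaps.items():
        grouped[gap].append(state)
    return {gap: sorted(grouped[gap]) for gap in sorted(grouped, reverse=True)}


gaps = score_gaps(scores)
for gap, states in group_by_gap(gaps).items():
    print(f"{gap}: {', '.join(states)}")
```

With real data, the `scores` dictionary would hold each state's published averages for the two grade levels.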

This assumes that an absolute increase in the number of questions on the assessment that are answered correctly is paramount. That may artificially favor states that already do well. After all, it seems reasonable enough at first that, on a test with 20 questions, moving from 10 correct to 12 correct should elicit more celebration than moving from 17 correct to 19 correct should.

Indeed, the increase between grade levels and the actual score for eighth graders (the basis of my IQ estimates) correlate at a robust .68.
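
A correlation of this kind is a Pearson coefficient computed across the states. The sketch below shows one way to calculate it with the standard library only. The two input lists are hypothetical placeholders rather than the real state figures, so the printed value will not be .68.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-state values, NOT actual NAEP results.
gap_by_state = [45, 40, 45, 36]
grade8_by_state = [285, 278, 275, 270]
print(round(pearson_r(gap_by_state, grade8_by_state), 2))
```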

But we're not looking at percentiles--we're looking at actual scores. It's not as if the above is showing that a move from the 50th percentile to the 60th percentile is equivalent to a move from the 85th percentile to the 95th percentile.

The number of questions that had been answered incorrectly before, but, through the educational process, are answered correctly later down the road, seems to me the important issue. Answering the twentieth question correctly in a series of progressively more difficult questions is as big a leap forward from ending at the nineteenth question as stopping at the tenth instead of the ninth is (assuming the step up in difficulty from one question to the next is roughly uniform). Thus, this strikes me as an accurate way of looking at it.

Instead of revealing an artificial 'boost', the strong relationship might bear out what is intuitive--more intelligent teachers tend to be more effective teachers. The higher a state's average IQ, the sharper its educators tend to be.

Demographics, inextricably related to performance, probably also offer some explanation. While the population shares of Hispanics, Asians, and Native Americans have no statistically significant relationship with the improvement gap, the percentages of whites and blacks both do. Predictably, more whites means more improvement, while more blacks means less improvement (both races correlate with the gap at .45, but in the case of blacks the relationship is an inverse one).

Since blacks undergo faster physical development than whites, perhaps they do so cognitively as well. If their cognitive development slows down relative to whites from fourth to eighth grade, the variance in improvement becomes quite difficult for educators to do much of anything about.
However, if a higher proportion of black kids (and American black culture celebrates individual self-centeredness and disruption in public more than the mainstream cultures of any of the other major racial/ethnic groups do) retards the improvement gap due to more chaotic learning environments, segregation should have a beneficial effect. The segregation would not need to be based explicitly on race or ethnicity, but based on demonstrated behaviors or aptitude (this sort of segregation would roughly proxy for race, however). Cognitive segregation seems the best way to improve the performance of the entire student body, especially the performance of those on the right side of the bell curve.

If, instead, the states are gauged on the proportional increase, that is, the percentage of questions answered correctly in eighth grade divided by the percentage answered correctly in fourth grade, it shakes out this way (because most of the resulting ratios differ slightly from one another, the states are ranked numerically rather than grouped):

1. Nebraska
2. Illinois
3. Arizona
4. Montana
5. Kentucky
6. South Dakota
7. Oregon
8. Iowa
8. Virginia
10. Wisconsin
11. Alaska
12. Massachusetts
13. North Dakota
14. South Carolina
15. Minnesota
16. Washington
17. New York
18. Vermont
19. Colorado
20. Indiana
21. Missouri
22. New Mexico
23. Nevada
24. Delaware
25. North Carolina
26. California
27. Ohio
28. Tennessee
29. Maryland
30. Rhode Island
31. Utah
32. Maine
32. Pennsylvania
34. Louisiana
35. West Virginia
36. Alabama
37. New Jersey
38. Michigan
39. Georgia
40. Connecticut
41. Idaho
42. Texas
43. District of Columbia
44. Wyoming
45. New Hampshire
46. Oklahoma
47. Hawaii
48. Kansas
49. Mississippi
50. Arkansas
51. Florida
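
The ranking above can be sketched as follows. Because both grade levels are scored out of the same 500 points, the maximum cancels and the ratio of percentages reduces to the eighth grade score divided by the fourth grade score. The scores are hypothetical placeholders, not actual NAEP results, and the rounding used to decide when two states tie is my own assumption about how ties such as the shared eighth place arise.

```python
def improvement_ratio(grade4, grade8):
    """Eighth grade score relative to fourth grade score.

    (grade8 / 500) / (grade4 / 500) simplifies to grade8 / grade4.
    """
    return grade8 / grade4


def competition_rank(values, ndigits=3):
    """Rank names by value, highest first; ties share a rank (1, 1, 3, ...).

    Rounding to `ndigits` decimals is an assumed tie threshold.
    """
    rounded = {name: round(value, ndigits) for name, value in values.items()}
    ordered = sorted(rounded.items(), key=lambda item: (-item[1], item[0]))
    ranks = {}
    previous_value = None
    current_rank = 0
    for position, (name, value) in enumerate(ordered, start=1):
        if value != previous_value:
            current_rank = position
            previous_value = value
        ranks[name] = current_rank
    return ranks


# Hypothetical (grade4, grade8) scale scores, NOT actual NAEP results.
scores = {
    "State A": (240, 288),
    "State B": (250, 300),
    "State C": (240, 276),
}
ratios = {state: improvement_ratio(g4, g8) for state, (g4, g8) in scores.items()}
ranks = competition_rank(ratios)
for state in sorted(ranks, key=lambda s: (ranks[s], s)):
    print(f"{ranks[state]}. {state}")
```

Dividing rather than subtracting rewards states that start from a lower fourth grade score, which is why the order differs from the gap-based list.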

The relative performance of the states looked at in this way also correlates positively with estimated average IQ, albeit more modestly at .33. The only demographic characteristic with a statistically significant relationship is the percentage of the state's population that is white, at .37.

Whichever of the two methods is employed, the Texas miracle doesn't appear particularly miraculous. My state's infamous curricula do not jump out as being optimal, either. States in the Midwest and Upper Midwest generally fare well by both methods. I suspect this might answer the question of what exactly the relatively intelligent people in these areas do with their brains, since they're not in the Northeast trading equities or in California creating start-up technology firms. They do standard middle class work, but perform a bit better at it than their peers in other areas of the country.


Steve Sailer said...


How about if you looked at it separately for whites and for blacks?

Steve Sailer said...

Maybe you should measure it in standard deviation terms -- I believe there is a webpage on the NAEP site that explains their standard errors. If a state moves up, say, 0.2 standard deviations compared to the mean from 4th to 8th grade, that would be pretty good.

Then it would be interesting if the high gainers in math got that way by sacrificing gains in reading, or if they were improving in both areas.

Then, to add to your work load, you should take a look at a few years ago to see if the high gainers are consistent or if this is just noise.

There are a lot of states where there aren't enough blacks for a reliable comparison, but every state has enough whites, so looking just at whites is good. Looking just at whites also prevents demographic change from influencing your findings. California's 4th graders are not demographically the same as its 8th graders, for example.

Audacious Epigone said...


I will, to try and tease out answers to a few of the questions I posed.

I don't see where the NAEP offers anything other than a standard error for each individual state's population being tested, but I'll calculate the standard deviation for state scores and look at the results that way.

Steve Sailer said...

Another wrinkle might be to look at 4th grade scores in 2003 to 8th grade scores in 2007 or whenever so you can compare roughly the same cohort.

Statsquatch said...

This is interesting. I have not looked at the data set, but is it possible to determine if there is a significant difference between any of these states in the change from 4th to 8th grade?

Audacious Epigone said...


It depends on what other variable(s) you want to look at. The table gives the change from fourth to eighth in terms of standard deviations relative to the states in their totality.
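
As a rough sketch of the standard deviation approach discussed in this thread: standardize each grade level's state scores against the mean and standard deviation across all states, then subtract the fourth grade z-score from the eighth grade one. The scores are hypothetical placeholders, not actual NAEP results, and using the population standard deviation (rather than the sample version) is an assumption on my part.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a {state: score} mapping against the across-state mean and SD."""
    data = list(values.values())
    center = mean(data)
    spread = pstdev(data)  # population SD; an assumed choice
    if spread == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return {state: (score - center) / spread for state, score in values.items()}


def change_in_sd_units(grade4, grade8):
    """Eighth grade z-score minus fourth grade z-score, per state."""
    z4 = z_scores(grade4)
    z8 = z_scores(grade8)
    return {state: z8[state] - z4[state] for state in grade4}


# Hypothetical scale scores, NOT actual NAEP results.
grade4 = {"State A": 230, "State B": 240, "State C": 250}
grade8 = {"State A": 290, "State B": 280, "State C": 270}

change = change_in_sd_units(grade4, grade8)
for state, delta in sorted(change.items(), key=lambda item: -item[1]):
    print(f"{state}: {delta:+.2f} SD")
```

A positive value means the state moved up relative to the other states between the two grade levels, and a negative value means it slipped.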