Monday, November 12, 2007

State rankings by NAEP improvement from fourth to eighth grade

++Addition++ Steve Sailer weighs in. I will put up the same improvement list for whites only this afternoon.


Continuing with the idea from a recent post attempting to get some idea of how well states are performing academically given their various demographic profiles, on sagacious advice I included math and reading results in the fourth and eighth grades from the NAEP of 2003 and 2007. Instead of simply looking at point changes (both the math and reading tests are based on a 500-point scale), I gauged average state scores in terms of standard deviations (assuming scores are normally distributed). I also looked at the same for whites exclusively.

For all students, 'improvement' (how a state's fourth graders fare in comparison to its eighth graders, with better relative performance in eighth grade seen as indicative of teaching effectiveness over time, and poorer relative performance showing a 'negative improvement', or deterioration) is pretty consistent. There is a correlation of .57 for the improvement of the '03 and '07 cohorts.
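As a rough sketch of how such an 'improvement' figure can be computed, the snippet below converts each state's average score to a z-score and subtracts the fourth grade figure from the eighth grade one. It assumes a single combined score per state and grade, and it uses the spread of the state averages as the standard deviation; the SD behind the numbers in this post could instead be the student-level one. The state names and scores are made up.

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Express each state's average score in SDs from the all-state mean.

    Assumption: the SD is taken across state averages, not across students.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    return {state: (s - mu) / sigma for state, s in scores.items()}

def improvement(grade4, grade8):
    """Eighth grade z-score minus fourth grade z-score, per state."""
    z4 = z_scores(grade4)
    z8 = z_scores(grade8)
    return {state: z8[state] - z4[state] for state in z4}

# Hypothetical states and scores, for illustration only.
grade4 = {"A": 230.0, "B": 240.0, "C": 250.0}
grade8 = {"A": 280.0, "B": 270.0, "C": 275.0}

gaps = improvement(grade4, grade8)
for state, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {gap:+.2f} SD")
```

Because both grades are standardized against the same set of states, the gaps sum to zero: one state's relative gain is necessarily another's relative loss.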

The four-year time interval also allows fourth graders in '03 to be tracked in '07 when the same kids are in eighth grade. It's not a perfect trace, as some families move across state lines and others elect to enter private school at some point during those years. But it still provides a nice proxy.

The improvement of the eighth grade class of '07 relative to how they did as fourth graders in '03 correlates with the improvement of the fourth and eighth grade classes of '03 at a solid .71. It would be optimal to have data from 1999 to see how the eighth grade class of '03 improved over time (they are not available), but using the '03 numbers of eighth graders still provides a good estimate, given the firm relationships mentioned previously. With this consistency, the class of '07 seems an appropriate measure to use in ranking improvement by state over time.
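For anyone who wants to run this kind of consistency check, here is a minimal sketch. It compares the tracked-cohort improvement ('03 fourth graders against '07 eighth graders) with the same-year gap ('03 fourth graders against '03 eighth graders) using a plain Pearson correlation. The z-scores are hypothetical placeholders, not NAEP figures, and the variable names are my own.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical state z-scores, for illustration only.
z4_2003 = {"A": -1.0, "B": 0.0, "C": 1.0, "D": 0.5}
z8_2003 = {"A": -0.5, "B": -0.2, "C": 0.6, "D": 0.1}
z8_2007 = {"A": -0.4, "B": -0.3, "C": 0.8, "D": -0.1}

states = sorted(z4_2003)
# Same-year gap: two different groups of kids, both tested in 2003.
same_year = [z8_2003[s] - z4_2003[s] for s in states]
# Tracked cohort: (roughly) the same kids, tested four years apart.
tracked = [z8_2007[s] - z4_2003[s] for s in states]

r = pearson(same_year, tracked)
print(f"correlation between the two improvement measures: {r:.2f}")
```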

There does not appear to be a trade-off in improving math scores at the cost of reading scores, or vice versa. Improvement in the two subject areas correlates positively at .64 in '07 and .61 in '03.

Curiously, the same relationships hold when only white students are considered, but they are moderately less rigorous. The correlation of the '03 and '07 cohorts is .53 (compared to .57 for all students), while the improvement of the class of '07 relative to the '03 eighth graders correlates at .67 (to .71 for all). The improvement of the all-races class of '07 doesn't correlate with a state's racial composition at any level of statistical significance (the p-value is .27).

The improvement of the '07 class of eighth graders (all races), by state, in standard deviations, follows. Keep in mind that it is a state's improvement in performance relative to other states for its eighth graders in '07 compared to its fourth graders in '03 that forms the rankings. We're essentially looking at a state's rate of self-improvement over the middle four years of schooling:

Rank  State                 Improvement (SD)
1.    District of Columbia  .66
4.    North Dakota          .50
7.    South Dakota          .41
15.   New Mexico            .21
18.   New Jersey            .16


41.   Rhode Island          (.25)
42.   South Carolina        (.26)
47.   New York              (.40)
48.   New Hampshire         (.43)
51.   North Carolina        (.89)
52.   West Virginia         (1.00)

The upper Midwest (excepting Michigan) does pretty well, although the trend is not overwhelming. The overseas schools that teach military brats also look good. Not much of a pattern emerges, though. The public school bodies of DC and Massachusetts are about as demographically and economically dissimilar as it gets, yet they occupy the top two spots.

This lends some credence to the idea that the differences are, broadly, a result of what goes on in the classroom. The differences in improvement, pretty consistent over at least the last eight years, are not meaningfully tied to demographic composition or affluence (whereas absolute performance very clearly is).

Whatever the driver, three putatively 'crucial' attributes--the student-teacher ratio, expenditures per student, and average teacher salaries (even after making cost-of-living adjustments for the latter two)--do not correlate with improvement in any meaningful way (.09, .05, .03, respectively, all without statistical significance). To the extent that a minuscule relationship does exist, the trends for all three are in the expected directions (lower student-teacher ratio and more money for students and teachers all weakly correlate with greater improvement).
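To give a sense of how far short of significance correlations that small fall, here is a back-of-the-envelope check using the standard t-test for a Pearson correlation. It assumes 52 units in the comparison (the ranks in the list above run to 52) and hardcodes the two-sided 5% critical value of Student's t for 50 degrees of freedom, roughly 2.009. If fewer units went into these particular correlations, the bar would be slightly higher.

```python
from math import sqrt

N_UNITS = 52          # assumption: one observation per ranked entry
T_CRIT_DF50 = 2.009   # two-sided 5% critical t value for 50 degrees of freedom

def t_stat(r, n):
    """t statistic for testing whether a Pearson correlation r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

def critical_r(t_crit, n):
    """Smallest |r| that reaches the given critical t with n observations."""
    return t_crit / sqrt(t_crit ** 2 + (n - 2))

for r in (0.09, 0.05, 0.03):
    print(f"r = {r:.2f}  t = {t_stat(r, N_UNITS):.2f}")

threshold = critical_r(T_CRIT_DF50, N_UNITS)
print(f"|r| needed for significance at the 5% level: {threshold:.2f}")
```

Under these assumptions a correlation needs to be around .27 in magnitude before it clears the conventional 5% bar, so .09, .05, and .03 are nowhere close.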

It would be interesting to line up the results against metrics for difficulty in obtaining a teaching certificate (something like this, but quantifiable) and the vigor with which teacher performance is evaluated. John Stossel, in his special Stupid in America, for example, singled out New York City's school system (New York is near the bottom of the list) as being particularly absurd, refusing to fire even the most incompetent or disturbed educators (if memory serves, only two had been let go over the time period he looked at). Massachusetts, in contrast, has come under fire recently for its rigorous licensing requirements, for which over half of aspiring black and Hispanic teachers fail to make the grade.

Is there a known source for that kind of information? Do other quantifiable attributes that might relate to better educational methods come to mind?


Anonymous said...

Your calling out this as bullshit ...the student-teacher ratio, expenditures per student, and average teacher salaries...and the NY public school system reminded me of my father and the way the world used to be before the left became powerful.
He was born in Brooklyn just before WW2 and would be considered low class white trash by today's liberal elites, if dago, guinea wops could even aspire to be that back then. He went to the NY public school system which back in the 1940's and 50's was probably the finest in the nation, and some might even argue among the best in the world. This school took kids of many social classes and managed to teach them even though the classrooms were hideously overcrowded by today's standards, there wasn't nearly as much "funding" and teachers were hardly making great money. Those schools managed to turn out an educated, capable and skilled student body (not necessarily PhDs either, but they had plenty of those come from that system) for all lines of work from finance to heavy industry to the military with very little in the way of "resources" one would have today.
Of course, these schools were overwhelmingly ethnic white/jewish so the material was pretty good and many parents if only of average or even below average intelligence themselves knew the value of an education (and often knowledge just for its own sake and as a way to understand culture and the world. No word of a lie, opera was popular and regular average people loved it, not just Italians) and what it could do for their kids.
But in the 1960's the left took over the NY public schools and for the most part, and for a variety of fucked-up idiotic reasons, destroyed it. The NY public school system is now much like most other urban school systems, a disaster area. To call what the left did to those NY schools a tragedy implies that there was some sort of mishap or accident of fate. But no, it was a plan that was executed, with these people knowing full well what the results would be and has now succeeded. Minorities run wild and there is an ever increasing call for funding and programs. But then again, they don't have very good material to work with either. The people that care about education and learning have left the NY school system or NY all together, like my dad did. All the studies and money in the world cannot teach or educate the savages that run riot in so many of the nation's schools. Once again, it comes down to race and enablement.
Thanks again for letting me rant.

Audacious Epigone said...

But then again, they don't have very good material to work with either. The people that care about education and learning have left the NY school system or NY all together, like my dad did.

Yet the calls for more of the same will be made--more money, more teachers, more counselors. The effects have been just about nil, as I'm certainly not the first to point out. There are a couple of conclusions to be drawn regarding those who call for more of the same failed 'solutions' while ignoring the genuine causes for variance in performance: It is either naivete or maliciousness.

Steve Sailer said...

Fascinating. Any chance of putting up the white-only score changes so we can remove most of the impact of demographic change? African-Americans have been getting driven out of DC by immigrants and by whites (although I doubt if many are putting their kids in public schools yet), so demographic change could be affecting things.

Steve Sailer said...

North Carolina's next to last position may have to do with demographic change -- Hispanics are flooding in. I wonder how NC would do on a whites-only measure.

On the other hand, nobody is immigrating to West Virginia, which is last. Maybe all the people with something on the ball are moving out?

Steve Setzer said...

I found some weak-to-moderate correlations between absolute NAEP scores and per-pupil spending. Your data showed a much lower correlation between improving NAEP scores and per-pupil spending. Interesting...

(And thanks. Great post.)

More Spending Won't Help
Data on School Spending

Steve Setzer said...

Oh, and despite the fact that our names look very similar, especially when we comment sequentially on the same blog post, I am not Steve Sailer. (I aspire to make his list of smart Steves one of these days, though...)


rbc said...

Good work -- this is certainly the right starting point for trying to measure school quality independent of demographics. I was going to say there is an important residual effect of IQ still embedded in the data which needs to be considered, but now I've convinced myself otherwise.

Most people talk about the NAEP as if it had a fixed scale; e.g., if you can correctly turn "forty-five and six hundredths" into "45.06" rather than "45.6" or "456.0" on a multiple-choice test, you are an Advanced 4th-grader. Yes, this is an actual question and ranking -- see

That list seems to say the test works exactly the way I just said it doesn't. Why am I not wrong? The missing link is a fuller explanation of "Item Response Theory" than their glossary provides. The key is that no one sat down before the test and said, "This question should be worth a 296 on our 300-point scale." Instead, the score it is apparently assigned is really just a label applied *after* the test. It is a way of *measuring* the item difficulty, through ordering the questions by the relative performance of the population taking the test (which is why the 300-point scale goes up to 330).

OK, so what? Well, the real meat of intelligence is not how much you know -- it's how fast you learn. Measuring current knowledge is just a proxy for what IQ tests really want to measure, which is the rate of knowledge acquisition. Therefore, if the test were just a static list of items and the reported score were percentage correct (and the range from easiest to hardest were large enough), then you would expect improvement over time to measure IQ even better than raw scores did! However, since the NAEP determines its item difficulty by studying the population distribution of performance, this fact is mitigated -- they have essentially already corrected for the learning rate effect of IQ, by using their data to measure the spread in raw performance (which grows substantially over time, independent of teaching) and scaling the grade levels separately to make comparison easier.

Why have I bothered saying all this? One of the things which is sometimes (but not often enough!) mentioned in discussions of demographics and education is that racial performance gaps exist the moment kids enter school and *widen* over time. This is a simple and obvious consequence of differing IQ distributions -- the "slow" kids really do learn more slowly, and the smart ones learn faster, so the same biological input has steadily larger visible output over time. What I have concluded from looking through the NAEP analytical documentation is that they have already accounted for this, leaving only the work you've just done to measure education instead of demographics. Thanks!

Also, "the same relationships hold when only white students are considered, but they are moderately less rigorous" doesn't seem at all "curious" to me. Because the width of the performance distribution of only white students is slightly smaller than that of students as a whole, this restriction of range naturally causes the small reduction in correlation which you see.
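A quick simulation illustrates the restriction-of-range point. Everything in it is made up: x and y are synthetic variables with a known relationship, and the cutoff is far harsher than the "slightly smaller" spread described above, purely so the effect is easy to see.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: y depends on x plus independent noise of equal variance.
x = [rng.gauss(0, 1) for _ in range(20000)]
y = [xi + rng.gauss(0, 1) for xi in x]

r_full = pearson(x, y)

# Keep only observations whose x falls in a narrower band.
narrow = [(xi, yi) for xi, yi in zip(x, y) if abs(xi) < 1.0]
r_narrow = pearson([p[0] for p in narrow], [p[1] for p in narrow])

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_narrow:.2f}")
```

The underlying relationship between x and y is identical in both samples; only the spread of x differs, and the correlation drops with it.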

Audacious Epigone said...


Sure. I'll put up the same list for whites exclusively later today or early tomorrow. Thanks for spurring me!


I did the same and found no statistically significant relationship between the two. If, however, cost-of-living adjustments are made, the relationship is moderate but real. However, a state's standard-of-living correlates much more vigorously with its NAEP performance, leading me to believe that more affluent places tend to have more intelligent children and the ability to spend more on them, rather than the idea that more per student expenditures is enhancing their intelligence.


Thanks for that! It is much appreciated.

Parentalcation said...

You shouldn't include Military Schools in the calculation. There is about 90% turnover over any 4-year period, so it's not the same kids being tested at all.

I would expect that if you were able to measure their improvement accurately for a group of kids who were there for a full 4 years, they would do even better than the stats currently show.

Of course the population is skewed, since almost zero percent of the kids come from broken homes or live in true poverty. The numbers show some military kids who qualify for free and reduced lunch, but this is only because the government doesn't count certain portions of military compensation.

My kids all went to overseas schools for a while, and they really are superior... not because the teachers are any better (though the teachers are better), but because the parents are all completely supportive and involved.

Audacious Epigone said...


Is the DoDEA overseas student body fungible though, for the most part? The improvement from 2003 fourth graders to 2007 eighth graders (.31 SD) is in line with the 'favorable' gap b/w 4th and 8th graders in 2003 (.44) and also in 2007 (.32). The military scores always get better.

The military schools do especially well on the reading portion of the NAEP. In 2007, for whites only, they are behind only New Jersey in absolute average score, while in math they are middling. In 2003, the DoDEA is again second in reading, behind Massachusetts. Any ideas as to why? Is reading ability a better indicator of parental involvement, since that's broadly the academic subject kids and parents are most likely to do at home?