Tuesday, September 22, 2009

What is "The Gold Standard"?


Did you hear about the big report that came out this week? You know, the one that "shows" that NYC charter schools are better than traditional non-charter public schools? It has gotten a ton of attention, probably because it uses "'the gold standard' method[ology]." The report is not subtle about this. It is right there in the very first sentence of the executive summary, "The distinctive feature of this study is that charter schools' effects on achievement are estimated by the best available, "gold standard" method: lotteries." It even uses the term "gold standard" four more times throughout the report.

Everyone wants to follow The Gold Standard -- or at least be able to say that they do. Of course! I mean, who wouldn't? But I do not think that we actually have a gold standard in education research. In fact, I am quite sure that we do not, and appropriating biomedical research's gold standard does not make it appropriate for us.

However, if we are going to borrow their standard, can we not at least get it right?

The biomedical standard uses double-blind experimental studies with random assignment. That means that some research participants get the experimental treatment and some get a placebo, and both are assigned randomly. It also means that neither the researchers nor the participants know who is getting which treatment. After all, expectations are important, and the mind can set us up for all kinds of things.

*****************

One of the latest ideas about The Gold Standard in educational research concerns charter schools.

We all want to know whether charter schools are better than traditional non-charter public schools. On one level, we certainly do want to know about individual schools. But on the policy level, we want to know about the average charter school, because we want to figure out if "charterness" helps a school be better. If it does, then we want more charter schools. If it does not, then we want fewer or none. And if we cannot be sure, we want to keep checking.

Let me say this quite clearly: Some charter schools are better than most non-charter public schools, and some are worse. And some non-charter public schools are better than most charters, and some are worse.

The Gold Standard crowd have a favorite method for comparing charter schools to non-charter public schools, one of which they are quite proud, but one that is so full of problems that I am shocked that they keep using it.

They rightly want to control for self-selection bias among charter school students. We know that children and families that apply to charter schools are different from those who do not, even if we do not know what all those differences are. This seems like the perfect time to do a randomized assignment, because that is the best method to make sure that these differences cancel out between groups. Luckily, we have some randomized assignments. Oversubscribed charter schools are virtually always supposed to accept students using a random lottery. This allows researchers to compare the outcomes of those who were randomly accepted and those who were not.
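To make the appeal of the lottery concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made up for illustration: the unobserved "motivation" trait, the rule for who applies, the 50% admission rate, and the choice to give charter schools a true effect of exactly zero. It is not the method or data of any actual study.

```python
import random
from statistics import mean

def compare_estimates(n_students=40000, seed=2):
    """Contrast a naive charter comparison with a lottery-based one.

    Hypothetical setup: the charter school has NO true effect on scores.
    """
    rng = random.Random(seed)
    winners, losers, nonapplicants = [], [], []
    for _ in range(n_students):
        motivation = rng.gauss(0.0, 1.0)              # unobserved family trait
        applies = motivation + rng.gauss(0.0, 1.0) > 0.5
        score = motivation + rng.gauss(0.0, 1.0)      # same formula in both sectors
        if not applies:
            nonapplicants.append(score)
        elif rng.random() < 0.5:                      # the admissions lottery
            winners.append(score)
        else:
            losers.append(score)
    # Naive: charter students versus everyone who is not in the charter.
    naive_gap = mean(winners) - mean(losers + nonapplicants)
    # Lottery: charter students versus applicants who lost the lottery.
    lottery_gap = mean(winners) - mean(losers)
    return naive_gap, lottery_gap

naive_gap, lottery_gap = compare_estimates()
print(f"naive gap: {naive_gap:.2f}, lottery gap: {lottery_gap:.2f}")
```

Under these assumptions the naive gap is large even though the school does nothing, while the lottery gap sits near zero. That is the one thing the lottery design genuinely fixes, and it is worth keeping in mind what the sketch leaves out: it assumes losing the lottery changes nothing about a student.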

Sounds good, right?

Well, it does sound good. But serious issues remain. Some are more obvious than others, and some are correctable by those interested in getting the correct answer, rather than the one that fits their pre-ordained conclusions.

Issue #1: No Placebo

Biomedical research does not just include randomization of treatment. It also is at least single blind. If some patients know that they are getting the new treatment, they might react differently. The mind is a powerful thing. They might be more diligent. Perhaps do their rehab exercises more often. Maybe pay more attention to diet. Who knows? And those who know that they are getting the new treatment might not lose hope.

If a student and/or his family do not get into the school of his/her/their choice, how might they react? I know from my own experience teaching that students who get their choice of schools take a bit more ownership. If they get their second choice, or last choice, or somehow do not get their choice, that's a big hurdle for their teachers and parents to overcome. If parents do not get their choice of schools for their students, are they going to be as supportive of their child's teachers? Of the assignments? Are they going to have the same kind of faith in their child's school? I think that the answer is really quite obvious.

The problem with these studies is that the students and families who "lose" these lotteries are no longer like the students and families who "win" these lotteries. There simply is no basis for thinking that their views of their schools are like those of the lottery "winners." In fact, one could quite simply argue that this method of analysis ensures that the "winning" charter school students are being compared to students who did not want to go to the schools they attend.

Obviously, that's not an even comparison.

Issue #2: Peer Effects

We all know that peer effects matter. Research, experience and common sense back this up. If you put a student in a class with a "better" group of peers, s/he will do better than s/he would have done in a class with a "worse" group of peers. The other kids all do their homework, or their parents were more likely to read to them, or they are somehow smarter, or harder working, or bring more cultural capital to school with them, or however else you think "better students" might be defined.

We also know that charter school students are not, as a group, like non-charter school students. That is how the Gold Standard crowd justifies their approach here. So: trying to control for applicant differences, but not controlling for ongoing peer effects? I don't know if that is just lazy or actually dishonest. The importance of peer effects is so well recognized that I tend to think it is the latter, especially because techniques to control for them are so well established.

Of course, there is another way to look at this. From a personal level, if you have a child, you don't care about controlling for peer effects. Actually, you want the effects to be left in so that you can take advantage of them. If charter schools have "better" students, that's a reason to send your own child to a charter school. However, if this analysis is done for policy purposes, to influence policy-makers, then peer effects do matter. If you are thinking about all students, not just the select few who can get into the "better" school, you need to control for peer effects.
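A small sketch of how peer effects can show up as a "charter effect" in a lottery comparison. The numbers are hypothetical: applicants are assumed to be better prepared on average, each student's score is assumed to get a boost proportional to the average preparation of schoolmates (`peer_weight`), and both sectors are assumed to add nothing of their own.

```python
import random
from statistics import mean

def lottery_gap_with_peers(n_applicants=20000, n_nonapplicants=20000,
                           peer_weight=0.3, seed=1):
    """Winner-minus-loser score gap when schools themselves add nothing."""
    rng = random.Random(seed)
    # Assumption: applicants average one unit higher in prior preparation.
    applicants = [rng.gauss(1.0, 1.0) for _ in range(n_applicants)]
    nonapplicants = [rng.gauss(0.0, 1.0) for _ in range(n_nonapplicants)]

    rng.shuffle(applicants)                  # the lottery
    half = n_applicants // 2
    winners, losers = applicants[:half], applicants[half:]

    # Winners sit with other applicants; losers sit with everyone else.
    charter_peer_mean = mean(winners)
    traditional_peer_mean = mean(losers + nonapplicants)

    winner_scores = [a + peer_weight * charter_peer_mean for a in winners]
    loser_scores = [a + peer_weight * traditional_peer_mean for a in losers]
    return mean(winner_scores) - mean(loser_scores)

print(f"lottery gap from peers alone: {lottery_gap_with_peers():.2f}")
```

With these made-up parameters the winners come out ahead purely because of who sits next to them. A parent may be glad of that gap; a policy-maker cannot count it as something that more charter schools would create for all students.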

Issue #3: Selection Bias on the School Level

The goal of this lottery-based study design is to avoid self-selection bias in the data. However, those who use it do not acknowledge the additional selection problems they create.

The most important problem is that not all charter schools are oversubscribed, so not all charter schools can be included in these studies. This wouldn't be a problem if we had good reason to believe that a random selection of charter schools were included, but that is obviously not the case. Clearly, the "better" charter schools are far, far, far more likely to be oversubscribed than the "worse" charter schools. This biases the sample rather severely towards better charter schools.

Unfortunately, the sample bias problem doesn't stop there.

A really strong traditional non-charter public school is not going to lose a lot of students to a simply above-average charter school. In order to be oversubscribed, a significant number of students and/or families have got to believe that the charter school option is superior to the non-charter public school option, which suggests a level of dissatisfaction with the local traditional public schools. This biases the sample towards inferior non-charter schools.
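The first of these two problems can be sketched in a few lines of Python. The setup is an assumption for illustration only: each charter school gets a hypothetical "true effect" that averages zero across the sector, and a school counts as oversubscribed when its effect plus some unrelated noise in demand clears an arbitrary threshold.

```python
import random
from statistics import mean

def simulate_oversubscription(n_schools=2000, seed=0):
    """Compare all charter schools with the oversubscribed subset."""
    rng = random.Random(seed)
    # Hypothetical true effect of each school; zero on average by construction.
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_schools)]
    # Assumption: demand tracks quality plus noise, and only schools whose
    # demand clears the threshold hold a lottery and can enter the study.
    oversubscribed = [e for e in effects if e + rng.gauss(0.0, 1.0) > 0.5]
    return mean(effects), mean(oversubscribed), len(oversubscribed)

all_mean, lottery_mean, n_lottery = simulate_oversubscription()
print(f"all schools: {all_mean:.2f}; "
      f"lottery schools: {lottery_mean:.2f} (n={n_lottery})")
```

In this sketch the schools that end up in a lottery study look clearly better than the sector as a whole, even though the average charter school was built to have no effect at all. How strong the bias is in real data depends on how closely demand tracks quality, which this toy model simply assumes.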

Issue #4: Generalizability

The hardest thing in educational research -- and perhaps research overall -- is to be able to generalize one's results to the broader population or wider world. And yet, that is usually the end goal of policy-oriented research.

These kinds of lottery-based studies only include the kinds of students and families that apply to charter schools in the first place. Even if the previous issues could be corrected, how can one know that other sorts of students and families would see the same benefits? The fact is that different populations might benefit less or more from going to a charter school. It is simply impossible to know from this kind of study. Of course, if you are only concerned about benefitting the kids of families who already opt for charter schools, then this is not a problem. But if you aim to help a broader population than that, you need a better methodology.

These generalizability concerns also apply to schools. Oversubscribed charter schools might well be better than average non-charter public schools, and I do not really question whether they are better than their local traditional alternatives. But on a policy level, we need to be concerned with charters more generally than that. If we raise or lift caps on charter schools, or approve new charter schools, we have to expect an average charter school to result, not an exceptional one. But these studies really tell us nothing about the majority of charter schools that are not oversubscribed. Nor do they tell us anything about the relative quality of non-charter public schools that lack charter school alternatives.

*****************

I understand the desire to find a Gold Standard for educational research. But simply grabbing that label because a methodology has some resemblance to biomedical research is not good enough, despite what Prof. Caroline Hoxby may claim. Moreover, the popular press really must do a better job of examining these claims critically, rather than cheerleading for them like this.

This, of course, means that researchers, journalists and the rest of us must be sure to take a more thoughtful stance than has become our habit.

14 comments:

  1. Excellent points! Esp about peer effects. Even if the experimental kids were randomized, charter school peers are going to be a self-selected group.

  2. Aside from the flaws in the methodology, it's a very big asterisk that this study was conducted not by an impartial academic researcher but by a longtime, high-profile partisan of free-market "solutions" and privatization, an open opponent of public education. It is not impartial academic research but advocacy -- propaganda.

    It's shocking that the mainstream press and many education bloggers are ignoring that fact and treating the study as though it were credible academic research. That's simply dishonest and misleading.

  3. Caroline,

    This is going to be posted over on Gotham this morning, so I'd love for you to repost this comment where others are more likely to see it.

    I think that I have to disagree with you on your first paragraph, though.

    I've made no secret of how I feel about Hoxby's work, but she IS an independent academic researcher. I don't have a problem with her views, just a problem with the consistent flaws in her methodology; she seems to always have issues with biased samples.

    I'm not sure where we'd find "impartial" academics. Experts have real opinions, built on years of experience and research. I'd be concerned if anyone who spent years in a field did not develop opinions and expectations. I would expect it of those who are prone to agree with me, and those who are prone to think otherwise.

    So, it's hardly fair to discredit her work simply because she's got a point of view.

    On the other hand, I agree that the media really needs to take that point of view into account when reporting on her work. This is not a peer-reviewed study, and her record ought to at least raise issues for journalists.

  4. OK, ceolaf, you've won me over with your comments. These are seriously thoughtful critiques. I believe we would need to see her data, knowing how many charters in the entire universe of possible entrants in this study were included, and how many didn't make the cut because they were undersubscribed, among other data points, to know whether your points are overstated or on target.

    I have a suspicion that given the diversity among charter schools, it is well nigh impossible to design a study that captures the effect of "the average charter school." I understand that the press is treating this study as if it has done so, and you are right to point out that this is not the case.

    All the same, I have to point out that one of the key innovations of the charter movement is that, in theory, all the charter schools can be "better than average," like the Chancellor's 98% A and B schools (snicker), because below average charter schools are easily shut down by authorizers. Charter laws have their own Darwinian selection that supports this aspect of Hoxby's selection bias. Over time, if authorizers did their jobs, one could expect that the charters not included in her study would close. The kids who would have attended there would go on to superior charter schools, or back to the district.

  5. Don't forget about ATTRITION. A lot of kids who receive a voucher or attend a charter don't stay at their new school. (For vouchers in particular, these rates can be high.) If this attrition is non-random, your treatment and control groups cease to be the same.

  6. Peter,

    You're right. That's a big issue in a lot of these studies, where transfers are excluded without acknowledging that the transfers are likely NOT representative of the school's general population.

    I did not mention this because it is not a problem particular to this methodology or even charter - non-charter comparisons.

    Furthermore, I'm not sure what ought to be done about this. That is, if the study's goal is to figure out the impact of the "full treatment," what else could a researcher do? So, I am not sure that it is a methodological question as much as simply the nature of the research questions.

  7. This comment has been removed by the author.

  8. KitchenSink,

    I invite you to post your comment over at Gotham, where this piece is cross posted in the community section. I could respond to it more publicly there.

  9. This comment has been removed by the author.

  10. I agree with your analysis. I asked Sean Reardon to use that to completely discredit this bogus claim from Dr. Hoxby.

    Dr. Hoxby’s study is statistically flawed because it does not (cannot) control for the crucial peer group variable:

    http://bsdbudget.blogspot.com/2010/09/flawed-statistical-methods-of.html

  11. Ceolaf, I agree with all your points and have made them myself. This does not change the fact that lottery based studies have the potential to be, and on average are, more rigorous than other ways of evaluating schools. Interpret with care? Yes. But unless you have an alternative...

  12. MichaelBishop,

    can you please explain what you mean by "rigorous"? How do you reconcile the issues that I raised with the idea of "rigor"?

    1) I think that it is pretty clear that we need to be careful when evaluating educational options that we communicate whether an outcome is better for an individual or better for a system. In other words, does the advantage rob Peter to pay Paul? When it comes to charters, clearly there is a peer effects issue that DOES rob Peter to pay Paul. From a policy perspective, that is unacceptable. That it is not clearly labeled as a policy problem is unconscionable.

    2) There is nothing rigorous about comparing different populations. When the groups have different survival rates (technical term) for different reasons, there's a problem. The issue here is that it's not merely a matter of how long students are in a treatment, but rather how long they are included in the study. Failing to acknowledge that all along, failing to adjust for that, and failing to underscore that is not rigorous. It is misleading.

    3) Too often econometrics is used in ways that assume the "treatment" is a black box. The refusal to model what is really going on in favor of simpler equations (e.g. linear regression) does not tell us much about reality. It is not actionable. It does not inform policy-makers. It does not begin to address the critical questions about mechanisms, processes and complex causation. That's not rigorous. That's a lazy dependency on simplistic methods to study complex phenomena.

    You want better? What about inspectors? What about serious qualitative research and explanations combined with high quality quantitative?

    Rigorous should NOT mean cheap, easy and simplistic. Rigorous should NOT mean "Well, the numbers make it look precise" or "Numbers make it look objective and consistent." And yet that seems to be the case.

    Understanding education requires more than that. Obviously.

  13. I agree with the point that NYC charter schools are better than non-charter schools. They follow a well-defined pattern to educate students, which they called "The Gold Standard," although it is difficult for the non-charter schools to apply. But they can try to do that.