Adaptive Design Series: Why Pruning Designs Are My Favorite
August 23, 2012
Note: This article is one of a series about adaptive design that comes from a blog written by Dr. Karen Kesler from 2010 to 2011. That blog is no longer active, but it contained some great information, so we wanted to re-post it here.
I think that pruning designs get a bad rap. Sure, you can’t control Type I error if you’re looking at the data frequently and getting rid of treatment groups in the middle of the study, but there’s a place for this type of design in finding your best dose.
Before I get too far ahead of myself, recall that in a pruning study, you start with a bunch of doses or regimens of your active compound and one “placebo” arm. The “placebo” could be an active drug, but the goal here is to show superiority, not non-inferiority. You then set up boundaries, both efficacy and futility, for use in multiple interim analyses, with the goal of eliminating (“pruning”) the less effective or less safe doses early in the study. At each interim analysis, you calculate the test statistic comparing each active arm to the placebo arm and see where it falls relative to your boundaries. Test statistic below the futility boundary? Prune that arm! Test statistic still between your boundaries? Keep randomizing subjects to that arm. Test statistic above the efficacy boundary? Think about stopping your study. Theoretically, at the end of the study, you’re down to a placebo arm and 1 or 2 active arms: the ones you now want to take to your confirmatory Phase III trial.
If you start with 4 to 6 active doses or regimens and at least 2 interim analyses, you can see that the number of hypothesis tests really starts stacking up. Suppose you have 5 active arms and 3 interim analyses: that’s potentially 15 hypothesis tests before you even get to your final analysis! Of course, if you really have 15 tests, the design isn’t working right. You should be eliminating 1 or 2 arms at each interim analysis.
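To make the interim decision rule and the test count above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a recommended analysis plan: the function names, the boundary values, the arm means, and the use of a simple z statistic with a known common standard deviation are all hypothetical choices made for this example. The last function also assumes the tests are independent, which is not true in a real pruning study (every comparison shares the placebo arm and later looks reuse earlier data), so treat its result only as a rough sense of how quickly error can accumulate.

```python
from math import sqrt


def z_statistic(mean_active, mean_placebo, sd, n_per_arm):
    """Z statistic for active vs. placebo.

    Assumes (for illustration only) a known common SD and equal arm sizes.
    """
    standard_error = sd * sqrt(2.0 / n_per_arm)
    return (mean_active - mean_placebo) / standard_error


def classify_arm(z, futility_bound, efficacy_bound):
    """Apply the three-way rule: prune, keep randomizing, or consider stopping."""
    if z < futility_bound:
        return "prune"
    if z > efficacy_bound:
        return "consider stopping"
    return "continue"


def interim_look(arm_means, placebo_mean, sd, n_per_arm,
                 futility_bound, efficacy_bound):
    """Classify every active arm still in the study at one interim analysis."""
    decisions = {}
    for arm, mean in arm_means.items():
        z = z_statistic(mean, placebo_mean, sd, n_per_arm)
        decisions[arm] = classify_arm(z, futility_bound, efficacy_bound)
    return decisions


def max_interim_tests(n_active_arms, n_interims):
    """Upper bound on interim tests if no arm is ever pruned."""
    return n_active_arms * n_interims


def familywise_error_if_independent(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests.

    Assumes independent tests, which a real pruning design does not satisfy.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Hypothetical interim data: 50 subjects per arm, SD of 1, made-up means.
    decisions = interim_look(
        arm_means={"low": 0.0, "mid": 0.3, "high": 0.7},
        placebo_mean=0.0,
        sd=1.0,
        n_per_arm=50,
        futility_bound=0.5,   # made-up boundary
        efficacy_bound=3.0,   # made-up boundary
    )
    print(decisions)
    print(max_interim_tests(5, 3))  # 15, matching the example in the text
    print(round(familywise_error_if_independent(15), 2))
```

With these made-up numbers, the low dose falls below the futility boundary, the middle dose stays in the study, and the high dose crosses the efficacy boundary. In practice the boundaries would come from the study's prespecified design, and they usually change from one interim analysis to the next.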
If it is working correctly, you end up with a little bit of information on the less effective doses and as much information on your best dose as you would have from a traditional Phase II study. You’re also a bit more protected from guessing wrong on your dose-response curve. If it’s a little off, one of the doses you expected to prune could be the big winner. If it’s way off, you’re going to see that earlier than if you had waited until the end of the study to look at your data. Plus, one of the things I keep hearing from the FDA is that we don’t spend enough time investigating doses. This study design can help make looking at a wider range of doses or regimens more palatable to your teammates monitoring the program budget.
It’s not a panacea, however. It is definitely not “adequate and well-controlled,” so you can’t use it as a pivotal study. I know that a lot of people planning development programs would like to be able to use their Phase II studies as both dose-finding and confirmatory, but as they say, you can’t always have your cake and eat it, too. I’ve just gotten to the point where I expect the study to go in a completely unexpected direction, since that seems to happen more often than not. When it does, I feel reassured by this design: I may not have the best assumptions going into the study, but I have more room for screwing up. That’s why it’s my favorite.