We consider an adaptive experiment for treatment choice and design a minimax and Bayes optimal adaptive experiment with respect to regret. Given binary treatments, the experimenter's goal is to choose the treatment with the highest expected outcome through an adaptive experiment, in order to maximize welfare. We consider adaptive experiments that consist of two phases, the treatment allocation phase and the treatment choice phase. The experiment starts with the treatment allocation phase, where the experimenter allocates treatments to experimental subjects to gather observations. During this phase, the experimenter can adaptively update the allocation probabilities using the observations obtained in the experiment. After the allocation phase, the experimenter proceeds to the treatment choice phase, where one of the treatments is selected as the best. For this adaptive experimental procedure, we propose an adaptive experiment that splits the treatment allocation phase into two stages, where we first estimate the standard deviations and then allocate each treatment proportionally to its standard deviation. We show that this experiment, often referred to as Neyman allocation, is minimax and Bayes optimal in the sense that its regret upper bounds exactly match the lower bounds that we derive. To show this optimality, we derive minimax and Bayes lower bounds for the regret using change-of-measure arguments. Then, we evaluate the corresponding upper bounds using the central limit theorem and large deviation bounds.
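The two-stage procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `two_stage_neyman`, the `pilot_frac` parameter, the uniform first stage, and the rule that second-stage assignment probabilities are proportional to the estimated standard deviations are all hypothetical choices made for the example. The abstract does not specify the stage sizes or the exact estimator used in the treatment choice phase.

```python
import random
import statistics


def two_stage_neyman(draw, n_total, pilot_frac=0.2, seed=0):
    """Two-stage allocation for two treatments (arms 0 and 1).

    draw(arm, rng) returns one outcome for the given arm.
    pilot_frac is an assumed tuning parameter, not taken from the source.
    """
    if n_total < 4:
        raise ValueError("n_total must be at least 4")
    rng = random.Random(seed)
    outcomes = {0: [], 1: []}

    # Stage 1: allocate equally to both arms to estimate standard deviations.
    n_pilot = max(2, int(pilot_frac * n_total / 2))  # subjects per arm
    for arm in (0, 1):
        for _ in range(n_pilot):
            outcomes[arm].append(draw(arm, rng))
    sd = [statistics.stdev(outcomes[arm]) for arm in (0, 1)]

    # Stage 2: assign each remaining subject to arm 1 with probability
    # proportional to its estimated standard deviation (assumed rule).
    total_sd = sd[0] + sd[1]
    p1 = 0.5 if total_sd == 0 else sd[1] / total_sd
    for _ in range(n_total - 2 * n_pilot):
        arm = 1 if rng.random() < p1 else 0
        outcomes[arm].append(draw(arm, rng))

    # Treatment choice: select the arm with the larger sample mean.
    means = [statistics.fmean(outcomes[arm]) for arm in (0, 1)]
    chosen = 0 if means[0] >= means[1] else 1
    return chosen, p1, [len(outcomes[0]), len(outcomes[1])]


def gaussian_draw(arm, rng):
    # Hypothetical outcome model: arm 1 has a higher mean and a larger variance.
    mu, sigma = [(0.0, 1.0), (0.5, 3.0)][arm]
    return rng.gauss(mu, sigma)


chosen, p1, counts = two_stage_neyman(gaussian_draw, n_total=2000)
print(chosen, round(p1, 3), counts)
```

In this toy setup the noisier arm receives more subjects in the second stage, which is the intuition behind allocating in proportion to standard deviations: more observations go where the mean is harder to estimate.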