Nonparametric Estimation and Model Combination in a Bandit Problem with Covariates

Qian, Wei

Nonparametric Estimation and Model Combination in a Bandit Problem with Covariates

Qian, Wei

2014-07

Persistent link to this item

https://hdl.handle.net/11299/175307

Statistics

View Statistics

Title

Nonparametric Estimation and Model Combination in a Bandit Problem with Covariates

Authors

Qian, Wei

Published Date

2014-07

Type

Thesis or Dissertation

Abstract

Multi-armed bandit problem is an important optimization game that requires an exploration-exploitation tradeoff to achieve optimal total reward. Motivated from industrial applications such as online advertising and clinical research, we consider a setting where the rewards of bandit machines are associated with covariates. Under a flexible problem setup, we focus on a sequential randomized allocation strategy, under which, the plug-in regression methods for the estimation of mean reward functions play an important role in the algorithm performance. In the first part of the dissertation, we study the kernel estimation based randomized allocation strategy, and establish asymptotic strong consistency and finite-time regret analysis. In addition, although many nonparametric and parametric estimation methods in supervised learning may be applied to the randomized allocation strategy in a convenient plug-in fashion, guidance on how to choose among these estimation methods is generally unavailable. In the second part of the dissertation, we study a model combining allocation strategy for adaptive performance, and establish its asymptotic strong consistency. Simulations and a real data evaluation are conducted to illustrate the performance of the proposed combining strategy. In the existing literature of nonparametric bandit problem with covariates, it is generally assumed that the smoothness parameters of a Holder condition for the reward functions are known. Also, the finite-time regret analysis in the first part of the dissertation remains minimax sub-optimal. In the third part of the dissertation, we address these two issues by proposing a multi-stage randomized allocation strategy with arm elimination. In particular, when the smoothness parameter is unknown, we equip the algorithm with a smoothness parameter selector based on Lepski's method, and show that the regret minimax rate is achieved up to a logarithmic factor.

Description

University of Minnesota Ph.D. dissertation. July 2014. Major: Statistics. Advisor: Yuhong Yang. 1 computer file (PDF); vii, 73 pages.

Collections

Dissertations

Suggested citation

Qian, Wei. (2014). Nonparametric Estimation and Model Combination in a Bandit Problem with Covariates. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/175307.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.

University Digital Conservancy

Nonparametric Estimation and Model Combination in a Bandit Problem with Covariates

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Alternative title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Other identifiers

Suggested citation

University Digital Conservancy

University of Minnesota Twin Cities

Nonparametric Estimation and Model Combination in a Bandit Problem with Covariates

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Alternative title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Other identifiers

Suggested citation