Author: Yuan, Jianjun
Title: Online Convex Optimization in Changing Environments and its Application to Resource Allocation
Issued: 2019-12
Available: 2020-02-26
URI: https://hdl.handle.net/11299/211825
Type: Thesis or Dissertation
Language: en
Keywords: changing environments; convex optimization; exp-concave; online learning; PCA; resource allocation

University of Minnesota Ph.D. dissertation. December 2019. Major: Electrical/Computer Engineering. Advisor: Andrew Lamperski. 1 computer file (PDF); 129 pages.

In the era of big data, we create and collect data from many different sources: the Internet, sensors, consumer markets, and so on. Much of this data arrives sequentially and needs to be processed and understood quickly. The classic approach to data analysis is batch processing, in which the data is stored and analyzed offline. However, when the volume of data is large, batch processing becomes far more difficult and time-consuming than sequential processing. Moreover, sequential data usually changes dynamically and must be understood on the fly in order to capture those changes. Online Convex Optimization (OCO) is a popular framework that matches this sequential data processing requirement. Applications of OCO include online routing, online auctions, online classification and regression, and online resource allocation. Owing to OCO's general applicability to sequential data and its rigorous theoretical guarantees, many researchers have developed algorithms within this framework to fulfill different needs. In this thesis, we contribute to OCO's development by designing algorithms that adapt to changing environments.

In the first part of the thesis, we propose algorithms with better adaptivity by examining the notion of dynamic regret, which compares the algorithm's cumulative loss against that incurred by a comparison sequence. Dynamic regret extends the common performance measure known as static regret. Since it may not be known in advance whether the environment is dynamic, it is desirable to take advantage of both regrets through a trade-off between them. To achieve this, we discuss recursive least-squares algorithms and show how forgetting factors can be used to develop new OCO algorithms with such a regret trade-off. More specifically, we rigorously characterize the effect of forgetting factors for a class of online Newton algorithms. For exp-concave or strongly convex objectives, an improved dynamic regret of \max\{O(\log T), O(\sqrt{TV})\} is achieved, where V is a bound on the path length of the comparison sequence. In particular, we show how classic recursive least-squares with a forgetting factor achieves this dynamic regret bound; varying V yields the regret trade-off. To obtain a more computationally efficient algorithm, we also propose a novel gradient descent step size rule for strongly convex functions that recovers the dynamic regret bounds described above. For smooth problems, we obtain static regret of O(T^{1-\beta}) and dynamic regret of O(T^{\beta} V^*), where \beta \in (0,1) and V^* is the path length of the sequence of minimizers; varying \beta yields the regret trade-off.
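As a concrete illustration of the forgetting-factor mechanism, below is a minimal Python sketch of classic recursive least-squares with a forgetting factor. The function name, the scalar-target squared-loss setting, and the initialization P = I/delta are illustrative assumptions rather than the thesis's exact formulation.

    import numpy as np

    def rls_forgetting_factor(X, y, lam=0.95, delta=1.0):
        """Recursive least-squares with forgetting factor lam in (0, 1].

        Sketch only: lam = 1 recovers ordinary RLS (suited to a static
        comparator), while lam < 1 exponentially discounts old data,
        which is what allows tracking a drifting comparison sequence.
        """
        n, d = X.shape
        theta = np.zeros(d)       # current parameter estimate
        P = np.eye(d) / delta     # inverse of the discounted covariance
        for t in range(n):
            x = X[t]
            err = y[t] - x @ theta            # prediction error at step t
            Px = P @ x
            k = Px / (lam + x @ Px)           # gain vector
            theta = theta + k * err           # parameter update
            P = (P - np.outer(k, Px)) / lam   # rank-one covariance update
        return theta

Intuitively, choosing lam close to 1 favors the static behavior, while a smaller lam adapts faster to drift at the price of a larger path-length term, mirroring the regret trade-off described above.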
The second part of the thesis describes how to design efficient algorithms that adapt to changing environments. The previous literature runs a pool of algorithms in parallel to gain adaptivity, which increases both the running time and the complexity of online implementation. Instead, we propose a new algorithm that requires only one update per time step, while matching the adaptive regret guarantee of the current state-of-the-art. We then apply the algorithm to online Principal Component Analysis (online PCA) and variance minimization under changing environments, since the previous literature on online PCA has focused on performance guarantees under a stationary environment. We demonstrate both theoretically and experimentally that the proposed algorithms adapt to changing environments.

The third part of the thesis starts from the observation that the projection operator used in constrained OCO algorithms is too time-consuming for a truly online implementation. To accelerate OCO updates, the previous literature proposes approximating the desired projection with a simpler closed-form one, at the cost of constraint violation (g(\theta_t) > 0) at some time steps. Such schemes can still guarantee sub-linearity of both the static regret and the long-term constraint \sum_{t=1}^T g(\theta_t), i.e., constraint satisfaction on average. However, a sub-linear long-term constraint does not enforce a small violation at every time step, because strictly feasible iterates can cancel out the effects of violated constraints. To resolve this, we propose algorithms for which the cumulative constraint \sum_{t=1}^T (\max\{g(\theta_t), 0\})^2 is upper bounded sub-linearly. This clipped form heavily penalizes large constraint violations, and the cancellation effect cannot occur. Furthermore, we derive useful bounds on the single-step constraint violation. For convex objectives, our results generalize existing bounds, and for strongly convex objectives we give improved regret bounds. In numerical experiments, we show that our algorithm closely follows the constraint boundary, leading to low cumulative violation. Finally, we extend the ideas behind the proposed algorithms to more general time-dependent online resource allocation problems, with performance guarantees stated in terms of a variant of dynamic regret.
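To make the cancellation effect concrete, the following short Python snippet uses a hypothetical one-dimensional constraint g(\theta) = \theta - 1 \le 0 and a hand-picked trajectory; both are illustrative assumptions, not an experiment from the thesis.

    # Hypothetical constraint g(theta) = theta - 1 <= 0 and a trajectory that
    # alternates between violating (g = 0.5) and strictly feasible (g = -0.5).
    def g(theta):
        return theta - 1.0

    thetas = [1.5, 0.5, 1.5, 0.5]

    long_term = sum(g(th) for th in thetas)               # 0.0: violations cancel out
    clipped = sum(max(g(th), 0.0) ** 2 for th in thetas)  # 0.5: every violation counts

    print(long_term, clipped)

The long-term constraint is zero even though the constraint is violated at half of the time steps, whereas the clipped cumulative constraint registers every violation and penalizes large ones quadratically, which is why bounding the latter is the stronger guarantee.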