Author: Kim, Jongeun
Date issued: 2022-05
Date available: 2023-02-03
URI: https://hdl.handle.net/11299/252329
Description: University of Minnesota Ph.D. dissertation. May 2022. Major: Industrial and Systems Engineering. Advisor: Jean-Philippe Richard. 1 computer file (PDF); ix, 151 pages.

Abstract: In this thesis, we consider optimization problems whose objective and/or constraint functions are not available in explicit form. We focus on two complementary situations: when those unknown functions are described by pre-trained tree ensembles, and when we train an explicit functional form from sample data points to substitute for the unknown functions.

We first study optimization problems in which some objective and/or constraint functions are described by pre-trained tree ensembles, a problem known as tree ensemble optimization. We establish a connection between tree ensemble optimization and multilinear optimization, and we develop new polyhedral results for each problem through the lens of the other. Computational experiments show that our formulations provide tighter relaxations and improve solution times compared to formulations based on modeling piecewise-linear functions and to existing tree ensemble formulations.

We next study piecewise polyhedral relaxations of multilinear optimization. The results we develop can be applied to optimization problems in which some objective and/or constraint functions are described by generalized decision trees. A generalized decision tree consists of a decision tree together with a nonlinear model at each leaf; given an input value, the prediction is computed by the nonlinear model associated with the leaf to which the input is mapped. We provide the first locally ideal formulation for the relaxation problem and further improve formulations by introducing linking constraints that relate model components commonly treated independently in the literature. Implementing our results inside an open-source mixed-integer nonlinear programming (MINLP) solver reduces by a factor of approximately three the number of instances that the algorithm, with its default settings, cannot solve within an hour.

Finally, we study symbolic regression, a form of regression that learns functional expressions from observational data. A solution function is typically represented by an expression tree, which is determined by assigning operators and operands to its nodes. The methodology we develop is based on MINLP and seeks global optimality under a restriction on the size of the solution expression tree. We develop tighter MINLP formulations and a heuristic that iteratively builds an expression tree by solving a restricted MINLP. In computational experiments aimed at discovering physics formulas from sample data points, our methods outperform the best known approaches from the literature.

Language: en
Keywords: Decision Trees; Discrete Optimization; Multilinear Optimization; Polyhedral Relaxation; Symbolic Regression; Tree Ensemble Models
Title: Discrete Optimization Models and Techniques for Problems in Data-Driven Prescriptive Analytics
Type: Thesis or Dissertation
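For illustration only (this sketch is not part of the record or the thesis): the abstract's description of a generalized decision tree, a decision tree whose leaves each carry a nonlinear model, can be read as the simple prediction rule below. All class and function names are hypothetical assumptions.

```python
# Minimal sketch, assuming a "generalized decision tree" is a decision tree whose
# leaves hold nonlinear models: prediction routes the input to a leaf and then
# evaluates that leaf's model. Names here are illustrative, not from the thesis.

from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Node:
    # Internal node: split on `feature` at `threshold`. Leaf node: holds `model`.
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    model: Optional[Callable[[Sequence[float]], float]] = None  # nonlinear leaf model


def predict(node: Node, x: Sequence[float]) -> float:
    """Route x to a leaf, then evaluate the nonlinear model attached to that leaf."""
    while node.model is None:  # descend until a leaf is reached
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.model(x)


# Example: one split on feature 0, quadratic model on the left leaf, linear on the right.
tree = Node(
    feature=0,
    threshold=0.5,
    left=Node(model=lambda x: x[0] ** 2 + x[1]),
    right=Node(model=lambda x: 2.0 * x[0] - 0.3 * x[1]),
)
print(predict(tree, [0.2, 1.0]))  # falls in left leaf: 0.04 + 1.0 = 1.04
print(predict(tree, [0.9, 1.0]))  # falls in right leaf: 1.8 - 0.3 = 1.5
```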