Browsing by Subject "Quantile regression"
Now showing 1 - 4 of 4
Item Distributional analyses on diet quality in the United States (2014-08) Smith, Travis Alan
This dissertation takes a distributional approach to examining dietary quality in the United States. Diet quality is a direct input to health, is often used as a proxy for well-being, and is an outcome variable for a wide variety of economic interventions. This makes diet quality a particularly important, yet understudied, outcome for program evaluation and for describing the food bundles that individuals choose. The first chapter describes the evolution of adult dietary quality in the U.S. over the last two decades. Contrary to popular wisdom, there have been statistically significant improvements at all levels of diet quality for the population as a whole. Further, we find improvements for both low-income and higher-income individuals alike. Counterfactual distributions of dietary quality are constructed to investigate the extent to which observed improvements can be attributed to changes in the nutritional content of foods and to changes in population characteristics. We find that 63% of the improvement for all adults can be attributed to changes in food formulation and demographics. Changes in food formulation account for a substantially larger percentage of the dietary improvement within the lower-income population (19.6%) as compared to their higher-income counterpart (6.4%). The sheer number of overlapping policies and public awareness initiatives during this time period makes it difficult to pin down the exact causes behind such improvements. This chapter motivates two program evaluation studies in the two chapters that follow.

The second chapter estimates distributional effects of food consumed at school and away from home on child dietary quality. Using a fixed-effects quantile estimator, two non-consecutive days of food intake are used to identify the effect of eating away from home and at school.
I find considerable heterogeneity in the estimated impacts. The study finds that food away from home, as compared to home-prepared food, has a negative impact on the distribution of dietary quality except at low quantiles. Main results suggest that school food has both positive and negative impacts across the distribution of dietary quality. I find positive impacts on dietary quality at low quantiles of the outcome distribution, whereas food from school has a negative impact at the upper end of the distribution of diet quality. While food consumed under the National School Lunch and Breakfast Programs may not benefit every child, especially the average child, it does improve the diets of many children who otherwise would have poorer dietary quality. The implication is that U.S. schools are fertile grounds to improve nutrition skill formation, especially for the most nutritionally disadvantaged.

The final chapter estimates the effect of replacing food assistance benefits, which typically come in the form of a food voucher, with an equal value of cash on the quantity and quality of food consumed in a household. We utilize an experiment in which a portion of beneficiaries were chosen at random to receive their benefits in the form of cash. We take a distributional approach because we believe it is important to analyze low-consuming households separately from high-consuming households. We find some evidence that a cash system would increase kilocalorie consumption in the portion of the distribution below recommended levels of consumption and decrease consumption in the portion of the distribution well above any reasonable threshold. This finding implies that a cash transfer system may both alleviate food insecurity and decrease overconsumption. The cash system appears to have a positive impact on the distribution of dietary quality in quantiles above 40.
Virtually all of the improvement in quality comes from a decrease in consumption of less-healthy foods by the cash-receiving group. Overall, these findings imply that beneficiaries are no worse off under a cash transfer system and, in fact, may be better off.

Item Methodologies and Algorithms on Some Non-convex Penalized Models for Ultra High Dimensional Data (2016-06) Peng, Bo
In recent years, penalized models have gained considerable importance in dealing with variable selection and estimation problems in high-dimensional settings. Of all the candidates, the l1-penalized model, or LASSO, remains popular in diverse fields, with sophisticated methodology and mature algorithms. However, as a promising alternative to the LASSO, non-convex penalized methods, such as the smoothly clipped absolute deviation (SCAD) and minimax concave penalty (MCP) methods, produce asymptotically unbiased shrinkage estimates and have attractive advantages over the LASSO. In this thesis, we propose a complete methodology and theory for multiple non-convex penalized models. The proposed theoretical framework includes the estimator’s error bounds, the oracle property and variable selection behavior. Instead of common least squares models, we focus on quantile regression and support vector machines (SVMs) for exploration of heterogeneity and binary classification. Though we demonstrate that the current local linear approximation (LLA) optimization algorithm possesses those nice theoretical properties and achieves the oracle estimator in two iterations, computation is highly challenging when p is large due to the non-smoothness of the loss function and the non-convexity of the penalty function. Hence, we also explore the potential of coordinate descent algorithms for fitting selected models, establishing convergence properties and demonstrating a significant speed increase over current approaches.
Simulated and real data analyses are carried out to examine the performance of non-convex penalized models and to illustrate the advantage of our algorithm in computational speed.

Item Quantile regression model selection (2014-05) Sherwood, Benjamin Stanley
Quantile regression models the conditional quantile of a response variable. Compared to least squares, which focuses on the conditional mean, it provides a more complete picture of the conditional distribution. Median regression, a special case of quantile regression, offers a robust alternative to least squares methods. Common regression assumptions are that there is a linear relationship between the covariates, there is no missing data and the sample size is larger than the number of covariates. In this dissertation we examine how to use quantile regression models when these assumptions do not hold. In all settings we examine the issue of variable selection and present methods that have the property of model selection consistency, that is, if the true model is one of the candidate models, then these methods select the true model with probability approaching one as the sample size increases.

We consider partial linear models to relax the assumption that there is a linear relationship between the covariates. Partial linear models assume some covariates have a linear relationship with the response while other covariates have an unknown non-linear relationship. These models provide the flexibility of non-parametric methods while having ease of interpretation for the targeted parametric components. Additive partial linear models assume an additive form between the non-linear covariates, which allows for a flexible model that avoids the "curse of dimensionality". We examine additive partial linear quantile regression models using basis splines to model the non-linear relationships.

In practice, missing data is a common problem and estimates can be biased if observations with missing data are dropped from the analysis.
Imputation is a popular approach to handle missing data, but imputation methods typically require distributional assumptions. An advantage of quantile regression is that it does not require any distributional assumptions about the response or the covariates. To remain in a distribution-free setting, a different approach is needed. We use a weighted objective function that provides more weight to observations that are representative of subjects that are likely to have missing data. This approach is analyzed for both the linear and additive partial linear setting, while considering model selection for the linear covariates.

In mean regression analysis, detecting outliers and checking for non-constant variance are standard model-checking steps. With high-dimensional data, checking these conditions becomes increasingly cumbersome. Quantile regression offers an alternative that is robust to outliers in the Y direction and directly models heteroscedastic behavior. Penalized quantile regression is considered to accommodate models where the number of covariates is larger than the sample size. The additive partial linear model is extended to the high-dimensional case. We consider the setting where the number of linear covariates increases with the sample size, but the number of non-linear covariates remains fixed. To create a sparse model we compare the LASSO and SCAD penalties for the linear components.

Item Unconventional Regression for High-Dimensional Data Analysis (2017-06) Gu, Yuwen
Massive and complex data present new challenges that conventional sparse penalized mean regressions, such as the penalized least squares, cannot fully solve. For example, in high-dimensional data, non-constant variance, or heteroscedasticity, is commonly present but often receives little attention in penalized mean regressions. Heavy-tailedness is also frequently encountered in many high-dimensional scientific data.
To resolve these issues, unconventional sparse regressions such as penalized quantile regression and penalized asymmetric least squares are the appropriate tools because they can infer the complete picture of the entire probability distribution. Asymmetric least squares regression has wide applications in statistics, econometrics and finance. It is also an important tool in analyzing heteroscedasticity and is computationally friendlier than quantile regression. The existing work on asymmetric least squares only considers the traditional low-dimensional, large-sample setting. We systematically study the Sparse Asymmetric LEast Squares (SALES) under high dimensionality and fully explore its theoretical and numerical properties. SALES may fail to tell which variables are important for the mean function and which variables are important for the scale/variance function, especially when there are variables that are important for both mean and scale. To that end, we further propose a COupled Sparse Asymmetric LEast Squares (COSALES) regression for calibrated heteroscedasticity analysis. Penalized quantile regression has been shown to enjoy very good theoretical properties in the literature. However, the computational issue of penalized quantile regression has not yet been fully resolved. We introduce fast alternating direction method of multipliers (ADMM) algorithms for computing penalized quantile regression with the lasso, adaptive lasso, and folded concave penalties. The convergence properties of the proposed algorithms are established and numerical experiments demonstrate their computational efficiency and accuracy. To efficiently estimate coefficients in high-dimensional linear models without prior knowledge of the error distributions, sparse penalized composite quantile regression (CQR) provides protection against significant efficiency decay regardless of the error distribution.
We consider both lasso and folded concave penalized CQR and establish their theoretical properties under ultrahigh dimensionality. A unified efficient numerical algorithm based on ADMM is also proposed to solve the penalized CQR. Numerical studies demonstrate the superior performance of penalized CQR over penalized least squares under many error distributions.
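
The abstracts listed here contrast quantile regression with asymmetric least squares. As a rough illustration only (not code from any of the dissertations), the sketch below writes out the two standard loss functions and finds, by grid search on toy data, the constant that minimizes each. The function names, the simulated exponential sample, and the grid are assumptions made for this example; no penalties, covariates, or ADMM steps are included.

```python
import numpy as np

def check_loss(residuals, tau):
    """Quantile (check) loss: r * (tau - 1{r < 0})."""
    r = np.asarray(residuals, dtype=float)
    return r * (tau - (r < 0).astype(float))

def asymmetric_squared_loss(residuals, tau):
    """Asymmetric least squares loss: |tau - 1{r < 0}| * r**2."""
    r = np.asarray(residuals, dtype=float)
    return np.abs(tau - (r < 0).astype(float)) * r ** 2

def best_constant(y, loss, tau, grid):
    """Grid-search the constant c that minimizes the mean loss of y - c."""
    risks = [loss(y - c, tau).mean() for c in grid]
    return grid[int(np.argmin(risks))]

# Skewed toy sample, purely illustrative (hypothetical data).
rng = np.random.default_rng(0)
y = rng.standard_exponential(2000)
grid = np.linspace(y.min(), y.max(), 2001)

tau = 0.9
q_hat = best_constant(y, check_loss, tau, grid)               # approximates the 0.9 sample quantile
e_hat = best_constant(y, asymmetric_squared_loss, tau, grid)  # approximates the 0.9 sample expectile

print("check-loss minimizer:", q_hat)
print("sample quantile:     ", np.quantile(y, tau))
print("asymmetric-LS minimizer:", e_hat)
```

The check loss is piecewise linear and non-smooth at zero, while the asymmetric squared loss is differentiable everywhere, which is consistent with the remark above that asymmetric least squares is computationally friendlier than quantile regression.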