Three Essays in Development Economics

A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY

KHOA VU

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

PAUL GLEWWE

December, 2023

© KHOA VU 2023 ALL RIGHTS RESERVED

Acknowledgements

As the saying goes, ‘It takes a village to raise a child.’ This dissertation is the fruit of years of research and analysis, which would not have been possible without the help of so many people, especially my family, my faculty members, my colleagues, and my friends from all over the world.

First, I am forever grateful to Professor Paul Glewwe. Paul has been there since I took my first steps in academic research in my junior year of college, and he has always been there every time I needed guidance or support, whether it was about econometrics, research, or job applications. Paul gave me a safe space to stop by his office at any time to ask any kind of question, which served as a foundation for my research ideas and writing. I still remember asking Paul how to merge two datasets in Stata, the very first code that I wrote. Without any hesitation, he fired up his computer and showed me how to do it, even though he likely had tons of deadlines waiting. Over 10 years of knowing Paul, I have lost count of how many times I have asked him to write a letter of support or simply to look over a manuscript or a proposal before I submitted it. But more than that, his expertise in the Vietnamese economy and educational system has been a tremendous resource for me in writing my dissertation.

I am thankful for the kind and supportive faculty whom I met and worked with during my PhD program. I am particularly grateful to Professor Marc Bellemare. Marc has had a foundational impact on how I think about an economic problem from a modern applied microeconomic perspective and how I write an applied economic paper.
More than that, he is a dear friend and a mentor whom I was unbelievably lucky to have during my Ph.D. journey. I am also thankful for Professor Pinar Karaca-Mandic and her mentorship and friendship. Through working with Pinar, I have learned how to work with large health datasets and to be meticulous about data cleaning and analysis. Pinar has given me many opportunities to develop skills that are usually not provided in a typical PhD program, including teamwork and project management. I am deeply grateful for this important knowledge, which continues to play a huge role in my career. I am thankful for my Ph.D. committee members, Professor Jason Kerwin and Dr. Aaron Sojourner, who were willing to listen to my presentation several times during the job market season and provided thoughtful feedback on my research and my career.

Second, I am thankful for the unconditional love and unwavering support of my wife and my best friend, Vu Dao. Pursuing a doctoral degree while being a parent to two children was challenging; sometimes it felt impossible to even get near the finish line. Yet every time I looked, I saw my wife doing everything in her power to keep our household together. And every time I looked, I saw my children being raised in the kindest and most nurturing environment, created by my wife. Every time I got stuck with my research (which happened more often than I can count), I would just ramble on about theories, models, and data with my wife. Vu often said that she wished she could be a better listener, but little did she know that those sessions served me far more than any academic paper that I have read.

It would have been impossible to get to where I am without the constant support of my parents. I inherited a passion for social science research from my parents even before I arrived in the US. Throughout college, my mind was constantly fueled with research ideas every time I had a conversation with my parents.
I learned from my father how to think critically and how to look at a problem from different angles. I learned from my mother that research is hard but I should never give up. These principles served as a bedrock for my creativity and perseverance throughout graduate school.

I am thankful for my sister, Tram Vu, and my children, Minh-Anh and Minh-Khoi Vu. Despite being the younger sibling, Tram always gave me level-headed advice and career inspiration that got me to where I am today. Minh-Anh and Minh-Khoi are a constant reminder of who I am outside my academic work and are my sources of happiness and pride. I still remember telling Minh-Anh about not doing very well during the job market presentation, and she said “Dad, I think your paper is amazing. Everyone will love it.” I still remember seeing Minh-Khoi trying to take his first step even after falling several times. Having them as my family is a blessing.

Dedication

To Dad, Mom. To Vu Dao, Minh-Anh, and Minh-Khoi. Thank you for everything.

Abstract

Vietnam has risen to become one of the fastest-growing economies in the world, yet the path to sustainable long-run economic growth remains elusive. This goal is further complicated by concerns about inequity and the recent global pandemic. In this dissertation, I tackle three important issues related to the quest for sustainable economic growth in Vietnam. I first examine whether expanding access to higher education has any impact on productivity at the worker level and the firm level. I find that exposed workers are more likely to work in the service sector and, thus, that the productivity of service firms rises in the long run. Second, I examine whether extending the maternity leave requirement has any implications for women’s decisions to work in the formal sector. My findings indicate that women are more likely to move from informal work into formal jobs when the government extends the required maternity leave length from four months to six months.
Third, I propose a new method to understand the impacts of the global pandemic on food security in Vietnam at a granular level.

Contents

Acknowledgements
Dedication
Abstract
Contents
List of Tables
List of Figures
1 Introduction
2 Higher Education Expansion, Labor Market, and Firm Productivity
    2.1 Introduction
    2.2 Background of the Higher Education Expansion
    2.3 Data
    2.4 Impacts on Individual-Level Outcomes
        2.4.1 Empirical Strategy to Estimate Worker-Level Effects
        2.4.2 Results for Worker-Level Effects
        2.4.3 Effects on Occupation and Industry of Work
        2.4.4 Returns to Completing Higher Education in the Presence of Equilibrium Effects
    2.5 Equilibrium Effects on the Labor Market and Firm Productivity
        2.5.1 A Model of Endogenous Technological Adoption
        2.5.2 Empirical Strategy for Market-Level Analysis
        2.5.3 Results for the Effects on the Labor Market and Firms
    2.6 Partial equilibrium returns to higher education
    2.7 Conclusion
3 Maternity Benefits Mandate and Women’s Choice of Work
    3.1 Introduction
    3.2 Background
    3.3 Data
    3.4 Empirical Strategy
        3.4.1 First Research Design: Comparison by Childbearing Age
        3.4.2 Threats to Validity and Robustness Checks
        3.4.3 Second Research Design: Comparison by Expected Birth Rates
    3.5 Results
        3.5.1 Main Findings
        3.5.2 Results from The Second Research Design
        3.5.3 Formal Employment in the Public and Private Sectors
    3.6 Conclusion
4 Income Shock and Food Insecurity during the Pandemic
    4.1 Introduction
    4.2 Impacts of COVID-19 on the Vietnamese Economy
    4.3 Empirical Strategy
        4.3.1 Data
        4.3.2 Measuring Food Insecurity
        4.3.3 Econometric Model
    4.4 Estimating the effect of household income on food insecurity
        4.4.1 Regression Results
        4.4.2 Assessing the Validity of the Shift-share Instrument
    4.5 Predicting Food Insecurity Risks in Vietnam
        4.5.1 Choosing the Optimal Classifier Threshold
        4.5.2 Out-of-Sample Prediction Validation in Vietnam
        4.5.3 Predicting Food Insecurity Changes During the Pandemic
        4.5.4 Policy implications
    4.6 Discussion and Conclusion
5 Conclusion and Discussion
6 References
Appendix A. Higher Education Expansion, Labor Market, and Firm Productivity
    A.1 Additional Tables and Figures
    A.2 Estimating total factor productivity at the firm level
Appendix B. Maternity Benefits Mandate and Women’s Choice of Work
    B.1 Additional Tables and Figures
    B.2 Overview of the Vietnamese labor market composition
Appendix C. Income Shock and Food Insecurity during the Pandemic
    C.1 Additional Tables and Figures
    C.2 Validity of food insecurity measurement
    C.3 Comparison between different targeting approaches
    C.4 Comparison between IV-based prediction and 2020 data

List of Tables

2.1 Balance table of treatment status at the province level
2.2 Summary statistics by cohort and province
2.3 Difference-in-differences estimates for the effect of exposure to the higher education expansion on individual-level outcomes
2.4 Difference-in-differences estimates for general equilibrium effects on labor market at province level
2.5 Difference-in-differences estimates for effects of the expansion on firm-level outcomes
2.6 Partial equilibrium returns to higher education in short run and long run
3.1 Summary statistics
3.2 DiD results for the effect of extending maternity leave
3.3 DiD results for the effects on public and private formal employment
4.1 Summary statistics
4.2 Estimates for income effects on food insecurity
4.3 Estimates for income effects on alternative food insecurity measures
4.4 Relationship between industry-district employment shares and district-level characteristics in 2009
4.5 Income effect estimates using 2009 district-industry employment shares × year dummy as instruments
4.6 Income effect estimates using non-agriculture Bartik IV
A.1 Summary statistics by year and district
A.2 Summary statistics by year and district
A.3 Production function estimation results
C.1 First-stage estimates for income effects on food insecurity

List of Figures

2.1 Number of Universities and Share of Adults Completing College Education
2.2 Number of Universities by Provinces in 2005 and 2014
2.3 Event Study for the Worker-Level Effects of the Higher Education Expansion
2.4 Event Study for Worker-Level Effects on Interprovincial Migration
2.5 Change-in-changes Model for Worker-Level Effects on Monthly Wage
2.6 DiD Results for the Worker-Level Effects on Industry and Occupation
2.7 Event Study for the Worker-Level Effects by Gender
2.8 Event study results for the effects on province-level outcomes - by age cohort
2.9 Event Study for Firm-Level Effects
3.1 Birth rates by age
3.2 Employment outcome for women aged 25–54 by year
3.3 Event-study estimates for effects on women’s labor market outcomes
3.4 DiD estimates for effects by age
3.5 Event Study for the Effects by District–Age-Group Birth-Rate Bin
3.6 Public and private formal employment for women aged 25–54 by year
3.7 Event study for the effects on public and private formal employment
3.8 Triple-differences results for industry and occupation effects
4.1 Correlation between province-level income and poverty with share of food-insecure households in 2018
4.2 Food insecurity measures by income group, 2010–2018
4.3 Food insecurity and poverty by province and year
4.4 Predicted and actual food insecurity of 2018 at the provincial level
4.5 Percentage of households with high predicted risk
4.6 Pre-pandemic, post-pandemic, and percentage point change in each district’s share of households with high food insecurity risk
4.7 Pre-pandemic, post-pandemic, and percentage point change in each district’s share of children ages 0–5 with high food insecurity risk
4.8 District-level percentage point change in the share of households with food insecurity risk before and after COVID-19 under annualized changes
4.9 Districts’ share of households with high food insecurity risk in 2019 and districts’ increase in the share of high-risk households due to the pandemic
A.1 Event study estimation (cohort) for other labor market outcomes
A.2 Event Study for Worker-Level Effects Using Different Control Groups
B.1 Reasons for not working, 2014–18
B.2 DiD estimates for effects by district–age-group birth-rate bin by gender
B.3 Labor force participation of men and women aged 25–54 in 2018 by college education and marital status
B.4 Labor market composition in 2010-2018 by gender and college education
B.5 Gender gap in formal employment by region and year
C.1 Predicted probability distributions by actual food insecurity status
C.2 Difference in prevalence and Cohen’s Kappa statistics to choose the optimal threshold for high-risk households for the linear probability model
C.3 Predicted and actual food insecurity of 2018 for IV probit and linear IV
C.4 Food-insecure household by household income and wealth deciles
C.5 Food insecurity and poverty by province and year
C.6 Food insecurity change targeting versus income change targeting and poverty change targeting
C.7 Differences in share of “high-risk” households based on prediction and share of “high-risk” households based on 2020 income

Chapter 1

Introduction

Vietnam has gathered significant attention from policymakers and researchers around the world for its miraculous economic growth since the late 1980s. The economic transformation came from a series of agricultural policies and trade liberalization agreements known as Doi Moi, after which Vietnam became one of the top rice exporters in the world (Glewwe, 2004). Doi Moi has ushered Vietnam into a new era. During the period between 2000 and 2020, the country experienced substantial changes to its economy, in terms of productivity, sectoral composition, and trade patterns.[1] More importantly, Vietnam was able to achieve a burst in economic growth, fueled by rapid structural transformation and rising exports. Yet as the country enjoyed considerable improvements in both household livelihoods and the business environment, it faces very different challenges than those at the beginning of its development journey.
[1] See, e.g., McCaig and Pavcnik (2013), Tarp (2017), Liu et al. (2020), and Rand and Tarp (2020).

In this dissertation, I study three broad challenges facing the country:

• One of the most important and pressing issues facing this young and promising economy is how to sustain the economic growth that it has enjoyed over the last two decades. Indeed, most of the recent economic growth in Vietnam can be attributed to structural transformation, as workers moved out of agriculture and into the manufacturing and service sectors (McCaig and Pavcnik, 2013), which, in turn, has been driven by exports and trade liberalization (McCaig and Pavcnik, 2018; Asghar and McCaig, 2023). While economic growth based on exports and structural transformation can be substantial and rapid, it cannot be sustained in the long run without long-run investment in physical and human capital or state capacity (McMillan et al., 2017).

• Part of the quest for sustainable growth also involves policies that promote socioeconomic equality, such as lowering the gender gap in employment and wages. In Vietnam and other developing countries, women are often employed in the informal sector, in which they are neither protected by labor regulations nor guaranteed any labor rights such as maternity leave. Thus, the government is faced with two closely related problems: how to lower the gender gap in the labor market and how to ensure that women receive protections for their labor rights.

• The COVID-19 global pandemic also brought several unprecedented challenges, including the aftermath of mandatory lockdowns and school closures. While these public health policies were deemed necessary to protect the health of millions of people, they also come with potentially significant trade-offs such as food insecurity and learning losses.
A critical question for policymakers is, thus, how to target vulnerable households that might be substantially affected by these public health orders, not only during the pandemic but also in future disasters.

To address the first question about sustainable growth, I examine in Chapter 2 a national policy to expand the network of universities in Vietnam during the period between 2006 and 2013, and its implications for economic development at the subnational level. The policy aimed to expand access to higher education in provinces that did not previously have a university. This national expansion of higher education is an important effort to raise human capital in order to achieve sustainable economic growth in the long run. Theoretically, an increase in access to higher education can affect growth through two channels. First, it can induce further structural transformation, as educated workers move into the manufacturing and service sectors instead of the agricultural sector. Second, higher human capital can speed up innovation or technological adoption, either of which can lead to higher productivity in the long run. However, there are also substantial concerns that a hasty expansion of higher education can produce low-quality education, so that such long-term benefits may not materialize in reality. In other words, whether the expansion policy leads to structural transformation and enhances productivity is ultimately an empirical question.

To answer this question, I assembled a new dataset on the timing and location of all university openings in Vietnam and combined it with the Labor Force Survey and the Enterprise Census data. Because new universities were established in different locations at different times, I applied a staggered timing difference-in-differences research design to identify the effects of this exposure on labor market outcomes such as employment in the manufacturing and service sectors.
Using the same design, I look at the effects on firm-level outcomes such as total factor productivity and labor productivity.

In Chapter 3, I turn to the relationship between labor rights and the gender gap in formal employment. To study this relationship, I examine the 2012 Amendment to the Vietnam Labor Law, which raised the required maternity leave from four months to six months. Since this requirement is not enforceable in the informal sector, the law represents an increase in the benefits of the formal sector and thus creates an incentive to switch to formal work, specifically for women of childbearing age. Given that the extension would mainly affect women of childbearing age but not older women, I apply a difference-in-differences approach that compares women across different age groups, as well as across different expected birth rates, to estimate the effects on choice of work.

In Chapter 4, I develop a two-step framework to help policymakers target vulnerable households during the pandemic and future natural disasters. Natural disasters and the pandemic threaten vulnerable households because household members are forced to stay at home and thus cannot generate the usual income that allows them to purchase food, raising concerns about food insecurity during hardships. The first step involves using existing household data to estimate the relationship between household income and food insecurity with a shift-share instrumental variable (IV) approach. In the second step, I combine the estimate from the first step with external sources on income shocks at the macro level to predict food insecurity shocks at a granular level. I apply this approach to predict food insecurity shocks caused by the pandemic at the district level.
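The two-step logic described for Chapter 4 can be sketched in a few lines. This is only a minimal illustration under strong assumptions, not the specification used in the chapter: the instrument is a textbook shift-share (Bartik) construction, the IV estimate is the just-identified ratio of covariances with no controls or fixed effects, and every function name and number below is hypothetical.

```python
import numpy as np


def shift_share_instrument(base_shares, national_growth):
    """Shift-share instrument: base-period industry employment shares
    (districts x industries) weighted by national industry growth rates."""
    base_shares = np.asarray(base_shares, dtype=float)
    national_growth = np.asarray(national_growth, dtype=float)
    return base_shares @ national_growth


def iv_slope(y, x, z):
    """Just-identified IV slope of y on x with instrument z: cov(z, y) / cov(z, x)."""
    y, x, z = (np.asarray(a, dtype=float) for a in (y, x, z))
    zc = z - z.mean()
    return (zc @ (y - y.mean())) / (zc @ (x - x.mean()))


def predict_change(beta, income_shock):
    """Step 2: map an external income shock into a predicted change in food insecurity."""
    return beta * np.asarray(income_shock, dtype=float)


# Hypothetical toy data: 3 districts, 2 industries.
shares = [[0.8, 0.2], [0.5, 0.5], [0.2, 0.8]]  # base-year employment shares
growth = [0.0, 0.10]                           # national industry growth rates
z = shift_share_instrument(shares, growth)

income_change = np.array([0.01, 0.04, 0.07])   # change in log household income
food_insecurity = 0.3 - 0.5 * income_change    # constructed so the true slope is -0.5

beta = iv_slope(food_insecurity, income_change, z)            # step 1
predicted = predict_change(beta, income_shock=[-0.10, -0.20])  # step 2
```

In this toy setup the outcome is an exact linear function of income, so step 1 recovers the constructed slope and step 2 simply scales the assumed income shocks by it. With real data, the first step would include controls and a check of instrument strength.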
Chapter 2

Higher Education Expansion, Labor Market, and Firm Productivity

2.1 Introduction

Over the last two decades, the demand for higher education in developing countries has risen rapidly; enrollment rates in both low-income and lower-middle-income countries have more than doubled since the early 2000s. Policymakers in these countries are thus faced with a decision of whether to expand access to higher education. On one hand, higher education can be an important contributor to economic growth. It can speed up an economy’s structural transformation by allowing workers to move into skill-intensive sectors. It can also raise firm-level productivity by inducing firms to employ skill-complementary technology or capital.[1] On the other hand, there is skepticism towards the value and quality of higher education in developing countries, raising concerns about efforts to expand access to higher education instead of focusing on basic education (Hanushek, 2016). Yet there is very little empirical evidence on how higher education affects low- and middle-income countries given these concerns, especially at the sub-national level.

[1] An increase in skilled labor can induce either skill-biased technological change (Acemoglu, 1998; Beaudry et al., 2010) or an increase in skill-complementary capital, which also raises productivity (Lewis, 2013; Khanna, 2023).

We study the effects of higher education on workers and firms by examining a national expansion of higher education in Vietnam caused by Decree 121/2007/QD-TTg. This policy led to the establishment of over 100 new universities between 2006 and 2013; the share of young adults with a college education tripled between 2004 and 2018 (see Figure 2.1). Studying such a policy would provide valuable lessons not only to Vietnamese policymakers but also to those in other fast-growing low- and middle-income countries, such as Bangladesh, India, Ghana, and Cambodia. These economies, including Vietnam, have experienced rapid growth due to export-led industrialization as workers move from less productive to more productive sectors. Expanding access to higher education is a potential policy to unlock sustainable growth as it induces firms to upgrade their production technology, raising their productivity.

Figure 2.1: Number of Universities and Share of Adults Completing College Education. Panels: (a) Number of universities by year; (b) Share of college-educated adults by age and year.
Note: Figure (a) shows the number of universities by year. Figure (b) shows the share of adults aged 22-55 who completed college education.
Source: The number of universities was collected by the authors from official approval documents. College completion rate data come from the Vietnam Household Living Standard Survey for 2004-2018.

To assess this policy shock, we collect a new dataset on the timing and locations of all university openings in Vietnam[2] and combine it with labor force survey and firm census data. In the first part of the paper, we estimate the effects on individual workers using a staggered difference-in-differences (DiD) model that compares outcomes across birth cohorts and provinces using the data from the Labor Force Survey (LFS). Being exposed to a new university at college-going age increases the probability of completing college by 57% and the average monthly wage by 8.6%. Our results are robust to combining the two-way fixed effects model with propensity score matching and to accounting for differential linear trends across the treatment and control groups.

[2] All public and private university openings require official approvals from the government, which are publicly available and contain information about the location and timing of university openings. We hand-entered this information for all universities established in Vietnam since 1975.
We further show that migration does not explain our results, nor does the expansion trigger any migration response across provinces. The effects across most outcomes are concentrated among female workers, especially for the average monthly wage. Using a change-in-changes (CiC) model, we document substantial treatment effect heterogeneity on wage; the treatment effects are particularly high at the bottom of the wage distribution, where the share of college-educated workers is very low. This result implies that our results are also driven by substantial general equilibrium effects, complicating any attempt to evaluate the returns to higher education. More importantly, we find that being exposed to the expansion lowers employment in less productive sectors and increases employment in more productive sectors, i.e., higher-paid occupations and more skill-intensive industries. Such results imply that the expansion helps speed up the structural transformation process, contributing to productivity growth.

In the second part of the paper, we explore the effects on firms through changes in the local labor market. Using a canonical model of skill differentials with different age cohorts (Card and Lemieux, 2001) and endogenous technological adoption (Acemoglu, 1998; Blundell et al., 2022; Carneiro et al., 2023), we first show that an increase in the supply of college-educated workers would decrease the college wage premium. In the long run, as there are more college-educated workers, the marginal product of educated workers under skill-biased technology also rises, inducing firms to switch to more productive technology (Acemoglu, 2007; Carneiro et al., 2023).

Using a staggered timing DiD, we find that the expansion increases the relative supply of college-educated workers among the younger cohort, but reduces the college wage premium for both age groups.
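The short-run prediction that more college-educated workers lower the college wage premium can be seen in the simplest textbook version of the canonical model. The expressions below are a single-cohort simplification with assumed notation ($C$ and $N$ for college and non-college labor, $A_C$ and $A_N$ for factor-augmenting technology, $\sigma$ for the elasticity of substitution); they are not the chapter's own two-level model with age cohorts.

```latex
% Assumed CES aggregate of college (C) and non-college (N) labor
Y = \left[ (A_N N)^{\frac{\sigma-1}{\sigma}} + (A_C C)^{\frac{\sigma-1}{\sigma}} \right]^{\frac{\sigma}{\sigma-1}}

% Competitive wages equal marginal products, so the log college premium is
\ln\frac{w_C}{w_N} = \frac{\sigma-1}{\sigma}\,\ln\frac{A_C}{A_N} - \frac{1}{\sigma}\,\ln\frac{C}{N}

% Holding technology fixed, a higher relative supply lowers the premium:
\frac{\partial \ln (w_C / w_N)}{\partial \ln (C / N)} = -\frac{1}{\sigma} < 0
```

With technology held fixed, the premium falls with relative supply at rate $1/\sigma$. If firms respond by adopting skill-biased technology, so that $A_C/A_N$ rises, the first term offsets part of this decline whenever $\sigma > 1$.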
Using the theoretical model, we back out the relevant elasticities of substitution across age and educational levels. The inferred elasticity of substitution between college and non-college workers is about 2.12, which is consistent with the empirical literature in the US (Acemoglu and Autor, 2011). Surprisingly, we find that the expansion raises non-college wages in both the short run and the long run.

Applying the same empirical strategy to firm-level data from the Vietnam Enterprise Census (VEC), we observe that firms experience an increase in productivity, measured by both total factor productivity (TFP)[3] and labor productivity, but only in the long run. We also show that firms reduce capital intensity in response to the expansion, implying a shift in production techniques as the number of college-educated workers grows.

[3] We estimate production functions using Ackerberg et al. (2015)’s approach to obtain TFP.

Lastly, we use Khanna (2023)’s model to recover the partial equilibrium returns to higher education once we account for the general equilibrium effects on college and non-college workers in both cohorts. We show that college completion has very high returns for those who live in a province without any college at all. These results highlight the value of higher education in a developing country.

Our findings contribute to three important bodies of literature. First, higher education as an important driver of economic growth has mainly been studied in developed countries. An increase in college-educated workers can affect economic growth through the R&D market or productivity spillover.[4] A relatively recent literature established the importance of colleges in developed countries such as the United States and Norway, as they drive local educational attainment (Russell et al., 2022), innovation (Hausman, 2020; Andrews, 2020), technological adoption (Carneiro et al., 2023; Blundell et al., 2022), consumption (Liu and Yang, 2021), and agglomeration (Liu, 2015). In developing countries, the generalizability of these effects is often questioned given the concerns about the low quality of higher education (Hanushek, 2016) and a lack of incentives for innovative activities (e.g., strong intellectual property rights protection) (Acemoglu et al., 2006; Vandenbussche et al., 2006). Our findings suggest that higher education can still play an instrumental role in developing countries by shaping structural transformation and firms’ productivity growth.

[4] See, e.g., Acemoglu (1998); Moretti (2004a,b); Vandenbussche et al. (2006).

Second, we contribute to the literature on the economic impacts of providing access to education in developing countries (e.g., Duflo, 2001; Akresh et al., 2018; Khanna, 2023),[5] which often overlooks post-secondary education.[6] A number of studies explore the effects of a higher education expansion in China in 1999 on labor market outcomes,[7] firm productivity (Che and Zhang, 2018), and technological adoption (Feng and Xia, 2022). Unlike the expansion in Vietnam, which established new universities, the expansion in China raised the college admission quotas (Li et al., 2017), which also raised college enrollment and attainment. The key difference between our study and these studies is that we can exploit the variation in locations and timing of new universities, while other studies rely on enrollment rates or indirect measures of exposure such as rural/urban status, which may proxy for many factors. Our approach allows us to directly capture the effect of expanding access to higher education. We also provide an estimate for spillover effects of higher education, which is rarely studied in developing countries.

Furthermore, we know very little about how firms in developing countries adjust their capital and production technology in response to an educational policy.
While there is a large literature on the pivotal role of human capital in technological change and economic development at the country level,[8] only a small number of studies examine firm-level responses to an increase in educated workers (e.g., Che and Zhang, 2018; Feng and Xia, 2022; Khanna, 2023). We provide further firm-level evidence on how firms can adjust along different dimensions such as capital (Lewis, 2013), production technology (Acemoglu, 1998; Clemens et al., 2018; Carneiro et al., 2023), and output mix (Dustmann and Glitz, 2015).

[5] This literature includes a small but growing literature on education in Vietnam. Dang and Glewwe (2018) and Dang et al. (2021) explore Vietnam’s exceptional performance in basic education relative to its past and other countries. Phan and Coxhead (2013) examine the economic forces behind changes in returns to schooling. Coxhead and Shrestha (2017) study how foreign direct investment affects schooling decisions. Several studies examine the overall changes in returns to schooling in Vietnam (e.g., Patrinos et al., 2018; Doan et al., 2018; McGuinness et al., 2021).
[6] A related literature studies whether the value of higher education in developing countries is due to human capital or signaling (Arteaga, 2018; Barrera-Osorio and Bayona-Rodríguez, 2019).
[7] See, e.g., Li et al. (2014, 2017); Xing et al. (2018); Huang et al. (2022).
[8] See Valero (2021) for an extensive review.

The rest of the paper is organized as follows. In Sections 2.2 and 2.3, we briefly describe the national expansion of higher education and the data sources that we use. In Section 2.4, we discuss the empirical strategy to evaluate the effects of individual exposure to the expansion and the results from our estimations. In Section 2.5, we present a capital-skill complementarity model to understand how the expansion may affect firms through the labor market. We then discuss how we evaluate the effects on these outcomes and the results from our estimations.
We discuss the implications and shortcomings of our study in Section 2.7.

2.2 Background of the Higher Education Expansion

Decree 121/2007, signed by Prime Minister Nguyen Tan Dung, laid out the government's overall plan to expand the network of higher education institutions across the country during the 2006-2020 period. As a result of this plan, the number of universities in Vietnam went from 150 in 2005 to over 250 in 2014 (see Figure 2.1). In 2013, the government announced that it had already reached the number of universities planned for this period, so no new universities were established after that (with a few exceptions).[9] Universities are also distributed unevenly across the 63 provinces of Vietnam, as shown in Figure 2.2. Ho Chi Minh City, Hanoi, and Hue were the three “centers” before the expansion, with more than 6 universities in each city/province. After the expansion, other provinces also saw an increase in the number of universities, such as Thai Nguyen and Nghe An in the North and Binh Duong and Dong Nai in the South. Many provinces had their first university open during this period.

To understand the impacts of these new universities, we collect data on all universities (including existing universities) from official government documents. Both public and private universities were established during this expansion. While public universities were typically “upgraded” from two-year colleges, private universities were usually newly established rather than upgraded. Public universities are established by officials at the province level. However, both types of universities have to justify to the government that they have enough staff, faculty, infrastructure, and land to operate a university. New universities tend to be opened in provinces with a large proportion of workers in skill-intensive sectors.
We present in Table 2.1 provincial characteristics (before treatment) for three groups of provinces: those that never had any university during our study period (our control group), those that opened a university for the first time (our treatment group), and provinces that already had a university, i.e., the “already-treated” group. The treatment group has a smaller share of agricultural workers and larger shares of workers in the manufacturing and service sectors. It is also unsurprising to see that the treatment group has a lower poverty rate than the control group. Besides these characteristics, the treatment and the control groups are relatively similar in terms of other economic indicators such as employment rate, self-employment rate, and firm-level performance measures such as total factor productivity, labor productivity, and capital-labor ratio.

[9] See Parajuli et al. (2020) and Vu and Nguyen (2018) for overviews of this policy.

Figure 2.2: Number of Universities by Provinces in 2005 and 2014
Note: The graph shows the number of universities in each province in 2005 and 2014.

It is worth pointing out that the already-treated provinces are substantially different from the control group. The adult population tends to have better education, which is expected given that there are already existing universities. These provinces also have a lower percentage of self-employment, a higher share of employment, higher shares of manufacturing and service workers, and better economic conditions. These results reflect the nonrandom nature of where new universities are opened. Provinces that open new universities likely respond to growing demand for higher education and a growing workforce in industries that require more education. However, overall economic conditions in the treatment provinces are more similar to those in the control group than to those in the already-treated group.
These results strongly suggest that simply comparing across provinces may yield biased estimates because of selection into establishing new universities.

2.3 Data

Our study draws from several data sources. Data on individual and market-level labor market outcomes are based on the Labor Force Survey (LFS). We use the individual-level data for 2010-2018 to examine the individual benefits of being exposed to the higher education expansion. For the market-level analysis, we aggregate these data at the district-by-year level. It is also important to note that these data have information on districts only for 2011 and 2015-2019. This limits our ability to estimate event-study specifications at the district level.

For the individual-level analysis, we focus on college education, monthly wage, and employment as the main outcomes of interest. The LFS asks respondents for their highest educational attainment, which we use to construct a binary variable for whether an individual has obtained a 4-year university degree or more. Employment is measured by whether respondents are employed, as opposed to being self-employed or not working. We also use whether individuals earn a wage as an alternative measure of employment. Monthly wages include only wage compensation, not other benefits such as bonuses.[10] We have this information for most workers, including those who are self-employed.

[10] This is because questions about bonuses are not consistent across years.
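The individual-level outcome definitions above can be illustrated with a short sketch. This is not the authors' code: the column names (`highest_degree`, `work_status`, `monthly_wage`) and the category labels are hypothetical stand-ins for the corresponding LFS fields, and treating non-positive wages as missing in the log is an assumption made here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical labels for "4-year university degree or more".
UNIVERSITY_OR_HIGHER = {"university", "postgraduate"}


def build_individual_outcomes(lfs: pd.DataFrame) -> pd.DataFrame:
    """Add the main individual-level outcomes to a copy of the survey data."""
    out = lfs.copy()
    # College: highest attainment is a 4-year university degree or more.
    out["college"] = out["highest_degree"].isin(UNIVERSITY_OR_HIGHER).astype(int)
    # Employment: employed, as opposed to self-employed or not working.
    out["employed"] = (out["work_status"] == "employee").astype(int)
    # Alternative employment measure: reports a positive wage.
    wage = out["monthly_wage"]
    out["earns_wage"] = (wage.fillna(0) > 0).astype(int)
    # Log monthly wage; missing or non-positive wages stay missing (assumption).
    out["log_wage"] = np.log(wage.where(wage > 0))
    return out
```

Keeping `employed` and `earns_wage` as separate indicators mirrors the text, where self-employed respondents can report a wage without being counted as employed.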
Table 2.1: Balance table of treatment status at the province level

                           Control     Treatment (N = 21)           Already treated (N = 28)
                           (N = 14)
                           Mean        Mean      Difference         Mean      Difference
% college graduates        0.033       0.042     0.009 (0.011)      0.052     0.019* (0.011)
% high school graduates    0.243       0.256     0.013 (0.025)      0.302     0.060** (0.024)
% college enrolment        0.190       0.295     0.105*** (0.033)   0.301     0.111*** (0.031)
% self employed            0.889       0.858     -0.031 (0.028)     0.821     -0.068** (0.026)
% employed                 0.111       0.142     0.031 (0.028)      0.179     0.068** (0.026)
% agricultural worker      0.748       0.603     -0.146*** (0.052)  0.498     -0.250*** (0.049)
% manufacturing worker     0.076       0.159     0.083*** (0.026)   0.212     0.136*** (0.025)
% service worker           0.146       0.211     0.066** (0.032)    0.264     0.118*** (0.030)
Log income per capita      8.555       8.755     0.200** (0.094)    8.881     0.326*** (0.089)
Urban                      0.154       0.213     0.059 (0.054)      0.283     0.128** (0.051)
% in poverty               0.182       0.134     -0.047** (0.021)   0.108     -0.074*** (0.020)
% age 0-5                  0.238       0.214     -0.024* (0.014)    0.201     -0.037*** (0.013)
% age 6-18                 0.176       0.180     0.005 (0.006)      0.170     -0.005 (0.006)
TFP                        11.628      11.751    0.124 (0.225)      11.702    0.075 (0.214)
Labor productivity         16.982      16.990    0.008 (0.110)      16.997    0.015 (0.104)
Capital-labor ratio        19.255      19.230    -0.025 (0.104)     19.383    0.128 (0.099)
Total N                    64

This table shows the means of pre-treatment, province-level characteristics by treatment status and the results from their balance tests. The Mean columns display the mean values of these covariates for each treatment group. The Difference columns show the results from estimating a regression with the characteristics as the dependent variable and the treatment status dummy variables as the independent variables. Data on the pre-treatment characteristics are aggregated from the 2004-2006 VHLSS data.

For the market-level analysis in the second part of the paper, we aggregate data up to the district-by-year level. Our first outcome variable is the share of working-age adults who have completed college education.
Guided by a capital-skill complementarity model that we discuss in Section 2.5.1, we examine the effects of the expansion on the relative supply of college-educated workers, $\ln(H/L)$. We measure $H$ and $L$ as the number of college-educated and non-college adults with non-zero monthly wage. The second main outcome of interest is the college wage premium, also known as the relative wage of college-educated workers, measured as the log ratio of the college-educated monthly wage to the non-college monthly wage, i.e., $\ln(w_H/w_L)$. All observations with zero wage are treated as missing and dropped before aggregation.

For both types of analyses, we restrict our samples to those who are between age 22 and 54, since they are most likely out of school and active in the labor market. Given that Hanoi and Ho Chi Minh City are the centers of both economic growth and universities in Vietnam, we also exclude them from the analyses.

Lastly, we study firms’ response to the labor supply shock along three dimensions: total factor productivity (TFP), labor productivity, and capital intensity, using firm-level data from the Vietnam Enterprise Census (VEC) for 2006-2018. These data contain detailed accounting information on firms’ annual operations, such as short- and long-term assets, total labor, total revenue, and industries. We measure labor productivity by value added per worker, where value added is defined as profit plus wages (Newman et al., 2015). To measure the TFP of each firm, we first estimate the relevant production function for all 2-digit industries using the approach of Ackerberg et al. (2015), then use the estimated parameters to obtain total factor productivity (see Appendix A.2 for a detailed discussion of the estimation process and results). We measure capital intensity as the ratio of capital to revenue.

2.4 Impacts on Individual-Level Outcomes

In this section, we explore the effects of the expansion on individual-level outcomes.
The expansion increases accessibility to higher education for younger cohorts; specifically, those who were at the college-going age when there were more new universities because of the reform. In contrast, it does not affect older cohorts who would generally be too old to benefit from the increase in the number of universities. As the younger cohorts have better access to higher education, they may be more likely to obtain a university degree, and hence, may do better than older cohorts in terms of employment and wages. We discuss this difference-in-differences design and estimation in Section 2.4.1 and the results in Section 2.4.2. We further explore the effects of being exposed to the expansion on the type of occupation and the industry of work. In Section 2.4.4, we discuss why the cohort-based DiD estimates do not identify the partial equilibrium returns to higher education given the likely general equilibrium (GE) forces in our context, and how one can estimate such returns when the GE effects of the expansion are known.

2.4.1 Empirical Strategy to Estimate Worker-Level Effects

We take advantage of the variation in the opening dates and locations of these universities to identify the effects of the expansion on workers and firms. At the worker level, we compare birth cohorts that were 25 years old or younger when the expansion took place and, thus, would benefit from having access to new universities (i.e., the exposed cohorts), and birth cohorts that were older than 25 years old and, thus, would be too old to benefit from the expansion (i.e., the unexposed cohorts). We then compare the cohort differences across provinces that established universities for the first time (treatment provinces) and those that have never established a university (control provinces). There are two reasons for choosing 25 as the cutoff age. First, those who are slightly older than 18 years old may still be eligible for college (e.g., those who repeated a grade at an earlier age).
Second, individuals may choose to attend a college (2-year or 3-year degree) before transferring to a university.

Given that the timing of university opening varies across provinces, let $g$ denote the year that a given province has its first university ever, $t$ the survey year, and $c$ the birth year of a given cohort. Let $G_g$ denote the group of provinces with the same treatment year. For each group of provinces with the same treatment year $G_g$ in the same survey year, we can compare them to the never-treated provinces as the control group across the exposed and unexposed birth cohorts. That is, we can estimate the following two-way fixed effects (TWFE) model in a subdataset with only provinces in $G_g$ (treatment group with treatment year $g$) and the never-treated provinces (control group) in the same survey year $t$:

$$ y^{G,t}_{i,p,c} = \delta^{G,t} \left( T^{G,t}_{p} \times Exposed^{G,t}_{c} \right) + \gamma_p + \eta_c + \epsilon_{i,p,c} $$

where $y^{G,t}_{i,p,c}$ denotes the outcome of individual $i$ in province $p$ of cohort $c$; $T^{G,t}_{p}$ indicates whether province $p$ had its first university during the expansion; and $Exposed^{G,t}_{c}$ indicates whether cohort $c$ was 25 years old or younger when the first university was opened in year $g$. Province and cohort fixed effects are $\gamma_p$ and $\eta_c$. Since we are comparing each treatment group to a never-treated group in a given year, $\delta^{G,t}$ captures the treatment effect of a specific group $G$ in a specific year $t$. One can then aggregate $\delta^{G,t}$ across all $G$ and $t$ to obtain the average effect for different groups of provinces and for different years. Alternatively, we can combine all subdatasets and estimate the following model:

$$ y_{i,p,c,s} = \delta \left( T_{p,s} \times Exposed_{c,p,s} \right) + \gamma_{p,s} + \eta_{c,s} + \epsilon_{i,p,c,s} \qquad (2.1) $$

where $s$ denotes the subdataset for each $G$ and $t$, while $\gamma_{p,s}$ and $\eta_{c,s}$ control for province-by-subdataset and cohort-by-subdataset fixed effects. Thus, $\delta$ captures the weighted average of all $\delta^{G,t}$.
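To make the stacking step concrete, the following is a minimal sketch rather than the authors' implementation. It builds one subdataset per treatment-year group and survey year, then estimates the pooled coefficient by least squares with province-by-subdataset and cohort-by-subdataset dummies. All column names (`province`, `survey_year`, `birth_year`, `y`), the `first_univ_year` mapping, and the default values of `min_lag` and `cutoff_age` are assumptions for illustration. The sketch returns only the point estimate and does not compute clustered standard errors.

```python
import numpy as np
import pandas as pd


def build_stacked_sample(df, first_univ_year, min_lag=4, cutoff_age=25):
    """Stack one subdataset per (treatment year g, survey year t).

    first_univ_year maps province -> year of first university,
    or None for never-treated provinces (hypothetical input format).
    """
    never = [p for p, g in first_univ_year.items() if g is None]
    treat_years = sorted({g for g in first_univ_year.values() if g is not None})
    pieces = []
    for g in treat_years:
        treated = [p for p, gy in first_univ_year.items() if gy == g]
        for t in sorted(df["survey_year"].unique()):
            if t - g < min_lag:  # leave time to finish a degree
                continue
            keep = (df["survey_year"] == t) & df["province"].isin(treated + never)
            sub = df.loc[keep].copy()
            if sub.empty:
                continue
            sub["T"] = sub["province"].isin(treated).astype(int)
            # Exposed: at or below the cutoff age in treatment year g.
            sub["exposed"] = ((g - sub["birth_year"]) <= cutoff_age).astype(int)
            sub["s"] = f"g{g}_t{t}"
            pieces.append(sub)
    return pd.concat(pieces, ignore_index=True)


def estimate_delta(stacked, outcome="y"):
    """OLS coefficient on T x Exposed with subdataset-specific fixed effects."""
    did = (stacked["T"] * stacked["exposed"]).to_numpy(dtype=float)
    prov_fe = pd.get_dummies(
        stacked["province"].astype(str) + "|" + stacked["s"], dtype=float
    )
    cohort_fe = pd.get_dummies(
        stacked["birth_year"].astype(str) + "|" + stacked["s"], dtype=float
    )
    X = np.column_stack([did, prov_fe.to_numpy(), cohort_fe.to_numpy()])
    # lstsq tolerates the collinear dummy sets; the DiD coefficient is still unique.
    beta, *_ = np.linalg.lstsq(X, stacked[outcome].to_numpy(dtype=float), rcond=None)
    return beta[0]
```

Explicit dummy matrices keep the sketch dependency-light but scale poorly; with samples the size of the LFS, a dedicated high-dimensional fixed effects routine would be the practical choice.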
To ensure that we are capturing the effect of having access to higher education via the expansion, we limit our analysis to survey years that are at least 4 years after the treatment year $g$, allowing enough time for the exposed cohorts to complete their 4-year degree before entering the labor market. This approach, proposed by Cengiz et al. (2019), allows us to compare each treatment group $G$ with a clean control group in each given year, thus avoiding the problem of negative weights in a standard TWFE model (De Chaisemartin and d’Haultfoeuille, 2020; Goodman-Bacon, 2021). There are, of course, other estimators, such as those of Callaway and Sant’Anna (2021) and Borusyak et al. (2021), among others. Yet this estimator is particularly suitable given that we only have repeated cross-sectional data and need to handle treatment effect heterogeneity across more than two dimensions.

The DiD model imposes a conditional parallel trends assumption, i.e., without the expansion, the outcome variables would have evolved similarly for the treatment and control provinces, conditional on the fixed effects and control variables. For the rest of this section, we discuss when this assumption might be violated and other concerns about estimation and inference strategy, as well as how we address these concerns.

Threats to the Parallel Trends Assumption

Three important threats to the parallel trends assumption are considered here. First, the treatment group and the control group might have followed differential pre-expansion trends, which may bias our results. Second, provinces that opened their first university and those that did not may be systematically different, and these differences might have differential effects across birth cohorts. Lastly, our results might be confounded by inter-provincial migration. Any one of these threats can lead to the parallel trends assumption being violated.
To address the concern about pre-expansion trends, we extend our stacked regression model to estimate an event study specification that allows us to visualize the parallel pre-trends (or lack thereof). Following the test for parallel pre-trends proposed by Bilinski and Hatfield (2018), we assume that there are no differential trends between the treatment and the control groups in the true model. Thus, if we allow the two groups to trend differently by adding an interaction term $T_{p,s} \times \text{Age Exposed}_c$ to the TWFE model, then the estimated treatment effects of such a specification and the base specification should be similar.

To address a related concern about non-random assignment of the treatment status, we first use propensity score matching (PSM) on the pre-expansion observables to select a more comparable control group before estimating the TWFE model. It is, however, important to note that such a matching approach is valid only when the pre-treatment covariates are strongly serially correlated. Matching on covariates that are moderately serially correlated in a difference-in-differences model can introduce bias due to regression to the mean (Daw and Hatfield, 2018). Therefore, we use propensity score matching based only on urban status and poverty rate, which are both relatively more stable over time than other covariates.

In the Appendix, we also construct an alternative control group based on the not-yet-treated provinces. Given that the already-treated provinces and the not-yet-treated provinces only differ in the timing, we can assume that they are relatively more similar than the already-treated and the never-treated provinces. For every group $G_g$ in year $t$, we construct a control group that is made up of provinces that had a new university for the first time 3 years ago or later. That is, the control group consists of provinces with a new university founded in year $g'$, where $t - g' \le 3$.

A more severe threat to our identification strategy is inter-province migration.
New universities increase the supply of college-educated workers, thus reducing their wage level at the market level. This may induce college-educated workers to migrate to other provinces. Alternatively, new universities may create more job opportunities for college-educated workers and attract even more of these workers. In other words, our results might be driven by the educational attainment of the out-migrants and in-migrants. We address this issue by running the same regression on province of origin instead of province of current residence. The results, therefore, should not be affected by changes in the composition of provinces of current residence.

For the continuous monthly wage variable, another useful check is a change-in-changes model (Athey and Imbens, 2006) that imposes a weaker assumption than a parallel trends assumption. Specifically, it allows the distribution of the outcome to vary in terms of mean and variance in the absence of treatment. The identifying assumption of this model is that in the absence of treatment, the distribution of the unobservables can vary between the treatment and control groups, but not across time within group (Athey and Imbens, 2006; Imbens and Wooldridge, 2009). If this alternative model yields a similar result to that of the standard DiD model even though it does not rely on the parallel trends assumption, we can be more confident that our results are not driven by a violation of that assumption.

2.4.2 Results for Worker-Level Effects

In Table 2.2, we show the descriptive statistics of the relevant variables for this analysis. Specifically, we present the mean and standard deviation for the unexposed cohorts (i.e., exposed between ages 26 and 36) and the exposed cohorts in the control and treatment provinces. In Table 2.3, we present the results from estimating the difference-in-differences model leveraging the variation in exposure across birth cohorts, provinces, and survey years.
In the first specification, we estimate a standard TWFE model with a stacked regression as discussed in Section 2.4.1. In specification 2, we first use propensity score matching (PSM) to select a more comparable control group based on pre-treatment observable characteristics, specifically the share of workers in different sectors as well as average income, urban status, and poverty rate. In specification 3, we estimate a model similar to the first specification, but include an interaction term between treatment status and linear trends to allow the treatment and control groups to trend differently. In all specifications, we allow treatment effects to vary between those who were exposed at age 18 or younger and those who were exposed between ages 19 and 25.

We address migration in three different ways. In specification 4, we estimate the same TWFE model on a non-migrant subsample. In specification 5, we first construct a dataset with the treatment status based on province of origin instead of province of current residence. We then estimate a similar TWFE stacked regression on this dataset. If migration is the driver of our result, these two specifications would likely produce results that are different from the main models. Lastly, we estimate the effects of the expansion on inter-provincial migration status and report them in Figure 2.4.

We find that the expansion raises exposed individuals' probability of completing college by 3.6 to 4.5 percentage points, as indicated by the results in column (1). Accounting for pre-treatment observables via PSM or differential linear trends brings the estimate slightly down, but the result is robust. Notably, those who were exposed between ages 19 and 25 also saw an increase of 1.5 to 2.2 percentage points in the college completion rate.
This is likely due to a potential spillover effect of the college expansion, as slightly older cohorts may also try to attend college as it becomes available in their provinces. The adjustments for differences in observables and differential pre-trends do not alter the result significantly, implying that our result is not driven by pre-treatment trends. Figure 2.3(a) confirms this implication. Those who were exposed to the expansion between age 36 and 27 saw very little effect on college completion, while those who were exposed at later ages saw larger effects.

In column (2), we find that being exposed to the expansion between 14 and 18 years old raises the probability of employment by 13.1 percentage points in the base specification. Adjusting for pre-treatment observables via PSM and differential trends lowers the estimates to 12.6 percentage points and 7.4 percentage points, respectively. In other words, the expansion raises employment for those who were exposed. The event study in Figure 2.3(b) yields a very similar conclusion.

Table 2.2: Summary statistics by cohort and province

                              Treatment                           Control
Age exposed                   26-36       19-25       12-18       26-36       19-25       12-18
Age                           38.602      29.756      24.555      38.017      29.236      24.404
                              (3.916)     (3.052)     (2.064)     (3.875)     (2.999)     (2.050)
Female                        0.509       0.503       0.488       0.504       0.499       0.483
                              (0.500)     (0.500)     (0.500)     (0.500)     (0.500)     (0.500)
Completed college or higher   0.102       0.123       0.108       0.120       0.115       0.077
                              (0.302)     (0.329)     (0.311)     (0.325)     (0.319)     (0.266)
Employment                    0.426       0.532       0.582       0.343       0.396       0.357
                              (0.494)     (0.499)     (0.493)     (0.475)     (0.489)     (0.479)
Log monthly wage              3.315       3.273       3.205       3.160       3.062       2.944
                              (0.706)     (0.630)     (0.592)     (0.870)     (0.827)     (0.842)
Ag. employment                0.375       0.343       0.330       0.517       0.505       0.569
                              (0.484)     (0.475)     (0.470)     (0.500)     (0.500)     (0.495)
Manufact. employment          0.137       0.180       0.210       0.077       0.089       0.102
                              (0.344)     (0.385)     (0.407)     (0.267)     (0.285)     (0.302)
Service employment            0.096       0.108       0.107       0.111       0.120       0.079
                              (0.295)     (0.311)     (0.309)     (0.314)     (0.325)     (0.270)

Given the positive impact on employment, it is also unsurprising to find that exposure to the expansion raises the monthly wage by 4 to 8.6 percent in column (3). The effects are significant in the main specification and in the one with PSM, but not in the third specification where we allow the treatment and control groups to have differential linear trends. However, Figure 2.3(c) does not indicate any significant pre-trends.

Table 2.3: Difference-in-differences estimates for the effect of exposure to the higher education expansion on individual-level outcomes

                                  Complete    Being       Log         Employment
                                  college     employed    wage        Ag.         Manufact.   Service
                                  (1)         (2)         (3)         (4)         (5)         (6)
Specification 1: TWFE
Exposed at 19 to 25 years old     0.022***    0.043***    0.035*      -0.014      0.029**     0.000
                                  (0.007)     (0.015)     (0.020)     (0.013)     (0.013)     (0.007)
Exposed at 14 to 18 years old     0.045***    0.131***    0.086**     -0.090***   0.046*      0.041***
                                  (0.011)     (0.033)     (0.037)     (0.028)     (0.024)     (0.009)
N                                 2909364     2731604     1821739     2731604     2731604     2731604

Specification 2: TWFE with PSM
Exposed at 19 to 25 years old     0.026***    0.046**     0.040*      -0.020      0.022       0.006
                                  (0.009)     (0.020)     (0.022)     (0.015)     (0.017)     (0.006)
Exposed at 14 to 18 years old     0.044***    0.129***    0.082*      -0.095***   0.035       0.043***
                                  (0.014)     (0.044)     (0.045)     (0.034)     (0.032)     (0.009)
N                                 1912122     1784737     1244384     1784737     1784737     1784737

Specification 3: TWFE with differential linear trends
Exposed at 19 to 25 years old     0.017***    0.009       0.012       -0.009      0.007*      -0.003
                                  (0.005)     (0.006)     (0.011)     (0.007)     (0.004)     (0.005)
Exposed at 14 to 18 years old     0.036***    0.074***    0.047       -0.082***   0.007       0.036***
                                  (0.012)     (0.017)     (0.028)     (0.016)     (0.008)     (0.011)
N                                 2909364     2731604     1821739     2731604     2731604     2731604

Specification 4: TWFE with non-migrant
Exposed at 19 to 25 years old     0.022***    0.048***    0.041*      -0.021      0.033**     0.002
                                  (0.008)     (0.015)     (0.022)     (0.014)     (0.014)     (0.006)
Exposed at 14 to 18 years old     0.047***    0.137***    0.102**     -0.098***   0.046*      0.043***
                                  (0.013)     (0.036)     (0.043)     (0.033)     (0.025)     (0.010)
N                                 2082423     1965391     1526743     1965391     1965391     1965391

Specification 5: TWFE with province of origin
Exposed at 19 to 25 years old     0.021***    0.043***    0.030       -0.014      0.030**     -0.002
                                  (0.007)     (0.016)     (0.022)     (0.015)     (0.013)     (0.007)
Exposed at 14 to 18 years old     0.047***    0.139***    0.089**     -0.097***   0.047**     0.040***
                                  (0.012)     (0.035)     (0.042)     (0.030)     (0.023)     (0.009)
N                                 1797157     1695068     1090311     1695068     1695068     1695068

Table reports DiD estimates for the effects on individual outcomes. For each group of provinces with the same treatment year in each survey year, we create a subdataset with those treatment provinces and provinces that were never treated as the control group. Subdatasets are stacked together and a TWFE model is estimated on this new dataset, controlling for province-by-subdataset and cohort-by-subdataset fixed effects. All models control for age, age squared, and gender. Specification 2 is a TWFE model with a matched control group based on PSM. Specification 3 is a TWFE model that allows treatment and control to have differential linear trends. Specification 4 is the same as the first but restricted to a non-migrant sample. Specification 5 is the same as the first but uses the province of origin to construct the treatment variable instead of province of current residence. All samples include individuals between age 22 and 55. All standard errors are clustered at the province level. Data are drawn from LFS 2010-2018.

Figure 2.3: Event Study for the Worker-Level Effects of the Higher Education Expansion
Panels: (a) Complete college; (b) Employment; (c) Monthly wage; (d) Agricultural employment; (e) Manufacturing employment; (f) Service employment
Note: The graphs display event study estimation for the effects of the higher education expansion on college completion and labor market outcomes.
All models control for age, age squared, and gender. Standard errors are clustered at the province level and 95% confidence intervals are displayed.

In columns (4) to (6), we examine whether the expansion has any effect on the sector of work. We find that exposed individuals are less likely to work in the agricultural sector, and more likely to work in the manufacturing and service sectors. It is, however, important to point out that the effect on manufacturing employment is reduced to almost zero when controlling for differential pre-trends, while the effects on the other sectors are robust to different specifications. The event study results in Figure 2.3(d), (e), and (f) yield very similar conclusions.

Taken together, the results indicate that exposure to the expansion increases the probability of completing college by more than 4 percentage points among those who were exposed at age 12 to 18; this is equivalent to an increase of over 57%. It also increases individuals' chances of being employed, especially in the service sector. Subsequently, we also find that exposure to the expansion raises the monthly wage by 4 to almost 9 percent.

Migration appears to play a very minor role in the effects of exposure. Specification 4, using a non-migrant sample, generally yields similar results to the main specification, suggesting that the main findings are driven by individuals born in their province of current residence. Specification 5, using a sample with treatment status based on province of origin, also confirms this conclusion, as the magnitudes of the estimates are similar to our main findings. In Figure 2.4, we estimate the effect of being exposed on individuals’ migration status in two samples. The sample using treatment status based on province of current residence gives us the effect on in-migration, while the sample using treatment status based on province of origin gives us the effect on out-migration.
The results on both types suggest that exposure to the expansion increases migration flowing both into and out of treatment provinces, but the treatment effects are small and almost always insignificant.

Figure 2.5 presents the distributional effects on log monthly wage using the change-in-changes (CiC) model (Athey and Imbens, 2006).[11] All treatment effects are positive and statistically significant; the treatment effects in most quantiles are also in line with the treatment effect estimated from the TWFE model above. Interestingly, these results indicate that those at the bottom of the wage distribution see the largest increase in wages relative to those at the top.

[11] We use the Stata command cic (Melly and Santangelo, 2015) to estimate this CiC model.

Figure 2.4: Event Study for Worker-Level Effects on Interprovincial Migration
Note: Graph shows the event study estimation for the effect of the expansion on in-migration and out-migration. The in-migration effect is measured by the effect on migrant status in a sample using province of current residence. The out-migration effect is measured by the effect on migrant status in a sample using province of origin.

Figure 2.5: Change-in-changes Model for Worker-Level Effects on Monthly Wage
Note: The graph shows the distributional treatment effects estimated from a change-in-changes model (Athey and Imbens, 2006).

These CiC results suggest that our wage estimates are partly driven by general equilibrium effects. To see this, we report the share of individuals completing college by wage quantile in the control provinces. The two lowest quantiles of the wage distribution have a college completion rate of less than 5%, while their treatment effects are higher than those in the rest of the distribution. In contrast, the two highest quantiles have a college completion rate of up to 35%, but their treatment effects are smaller. These results suggest that non-college workers likely see a (larger) increase in wages due to the expansion.
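For readers unfamiliar with the change-in-changes estimator behind Figure 2.5, the following is a minimal sketch of its quantile treatment effect for a continuous outcome, following the counterfactual construction in Athey and Imbens (2006). It is not the Stata `cic` routine used for the reported estimates: it omits covariates, inference, and the stacked structure, and the mapping of unexposed and exposed cohorts to the "pre" and "post" periods is an assumption made here for illustration.

```python
import numpy as np


def ecdf(sample, y):
    """Empirical CDF of `sample` evaluated at the points `y`."""
    s = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(s, y, side="right") / s.size


def quantile(sample, q):
    """Generalized inverse CDF: smallest observed value y with F(y) >= q."""
    s = np.sort(np.asarray(sample, dtype=float))
    # A small tolerance guards against floating-point noise in q * n.
    idx = np.ceil(np.asarray(q, dtype=float) * s.size - 1e-9).astype(int) - 1
    return s[np.clip(idx, 0, s.size - 1)]


def cic_qte(y00, y01, y10, y11, qs):
    """Change-in-changes quantile treatment effects on the treated.

    y00, y01: control group outcomes in the pre and post periods.
    y10, y11: treated group outcomes in the pre and post periods.
    The counterfactual q-quantile for the treated group in the post period is
    F01^{-1}(F00(F10^{-1}(q))); the effect is the observed minus this value.
    """
    qs = np.asarray(qs, dtype=float)
    rank_in_control = ecdf(y00, quantile(y10, qs))
    counterfactual = quantile(y01, rank_in_control)
    return quantile(y11, qs) - counterfactual
```

In a simulated case where the control distribution shifts up by a constant between periods and the treated group shifts by that constant plus 0.5, `cic_qte` returns 0.5 at every quantile, which matches the DiD estimate. The two approaches diverge when the control group's change is not a pure location shift.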
2.4.3 Effects on Occupation and Industry of Work

An important implication of our findings is that being exposed to the expansion allows workers to move out of the agricultural sector and into more productive sectors such as services and manufacturing. This is a classic story of structural transformation in economic development: as workers become more educated, they move from less productive to more productive sectors, thus contributing to productivity growth (Porzio et al., 2022). We take a closer look at the specific industries and occupations in which exposed individuals choose to work using the same cohort-based DiD model in Equation 2.4.

The industry effects are reported in Figure 2.6(a). First, exposure to the expansion reduces the likelihood of working in agriculture, fishing, and forestry, as well as construction. Second, it increases the likelihood of working in education and health as well as politics. More surprisingly, exposure has a positive and significant effect on working in IT, finance, and science, as well as entertainment and other services, but the effect sizes are relatively small.

The occupation effects are reported in Figure 2.6(b). We observe that the expansion significantly reduces the likelihood of being an elementary worker (e.g., domestic helpers or manual laborers). We also find positive effects on clerks (e.g., secretaries and bank tellers), technicians, scientists, experts, and managerial positions. The effect on being a scientist or expert is the most pronounced. These results strongly suggest that having access to higher education is valuable because it allows workers to enter occupations and industries that typically require such a level of educational attainment.12

12 In Figure A.1, we show that exposed individuals are more likely to have formal employment, have higher-paid occupations, and work in more skill-intensive industries.
Figure 2.6: DiD Results for the Worker-Level Effects on Industry and Occupation
(a) Industry (b) Occupation
Note: The graph shows the coefficient of the DiD interaction term and its 95% confidence interval. Each outcome is a binary variable indicating whether an individual works in the given industry or occupation. Standard errors are clustered at the province level.

Effects by Gender

The CiC estimation reveals interesting heterogeneity in wage effects across the wage distribution. One potential dimension along which this heterogeneity manifests itself is gender. Figure 2.7 provides the event study results estimated by gender. The effects on college education are relatively large for females and smaller for males. As a result, the positive effects on labor market outcomes such as employment and wage are also larger among females than among males. The wage effects among males are almost zero. These differences are also consistent with previous studies (Elsayed and Shirshikova, 2023).

Figure 2.7: Event Study for the Worker-Level Effects by Gender
(a) College education (b) Employment (c) Monthly wage
Note: The graphs display event study estimates for the effects of the higher education expansion on individual-level outcomes by gender. All models control for age and age squared. Standard errors are clustered at the province level and 95% confidence intervals are displayed.

There are three potential explanations for such a large gap in treatment effects by gender. First, our results indicate that the expansion has a much larger effect on service employment than on other sectors. Since service employment is also dominated by women, we can think of the expansion as an intervention that lowers the cost of obtaining a degree for a career in the service sector, so this intervention is more relevant for females than for males. An alternative explanation is that parents are less willing to let daughters travel far for college education than sons.
Opening a new university in a province thus allows girls in that area to access higher education without traveling. Sánchez and Singh (2018) offer another explanation: in Vietnam, girls' aspiration for higher education is higher than that of boys. Aspiration is also an important predictor of enrollment for girls but not for boys. Therefore, improving access unlocks the possibility that girls can enroll in higher education in Vietnam.

2.4.4 Returns to Completing Higher Education in the Presence of Equilibrium Effects

A relevant question related to the expansion of higher education is what the value of a higher education degree is to a marginal student in developing countries. That is, are there any returns to an individual who chooses to complete a college education because there is a new university in their province? We briefly discuss the methodological challenge of calculating the returns to higher education in this setup and propose a method to do so.

Conventionally, one can leverage exposure to the higher education expansion as an instrumental variable to estimate the returns to completing higher education. Scaling the wage effect in column (3) by the college completion effect in column (1) in Table 2.3 gives a Wald estimator for the returns to completing college among those who were induced to do so by the expansion (De Chaisemartin and d'Haultfoeuille, 2018).13

13 Other studies, e.g., Zimmerman (2014); Ost et al. (2018); Lovenheim and Smith (2022), and Bleemer and Mehta (2022), also use this approach to estimate the returns to college education in the US.

Like any other IV design, this approach assumes that exposure to the expansion can only affect wage through college completion. This assumption, however, is unlikely to be true in our context. The expansion is likely to have general equilibrium (GE) effects on the wage of college-educated and non-college workers in both the exposed and unexposed cohorts.
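For reference, the conventional Wald calculation described above is a simple ratio of two estimated effects. The sketch below is illustrative only: the function name `wald_estimate` and both input values are hypothetical placeholders, not the estimates reported in Table 2.3.

```python
def wald_estimate(reduced_form: float, first_stage: float) -> float:
    """Ratio of the reduced-form effect (exposure on log wage) to the
    first-stage effect (exposure on college completion)."""
    if first_stage == 0:
        raise ValueError("first-stage effect must be non-zero")
    return reduced_form / first_stage


# Hypothetical placeholder inputs, chosen only for illustration.
wage_effect = 0.06     # effect of exposure on log monthly wage
college_effect = 0.04  # effect of exposure on college completion

returns_to_college = wald_estimate(wage_effect, college_effect)
print(round(returns_to_college, 2))
```

The ratio has the usual IV interpretation only under the exclusion restriction discussed in the text, which is exactly the assumption questioned here.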
As pointed out elsewhere (Khanna, 2023), the DiD estimate reflects a return somewhere between the partial equilibrium and GE wage effects; as a result, a Wald estimator based on the DiD result would likely be confounded by these general equilibrium effects.

For simplicity, consider a young and an old cohort, denoted as $y$ and $o$, respectively, and let $T \in \{0, 1\}$ denote the treatment status of a province. The wage effect estimated by the DiD model can be rewritten as:14

$$W_{DiD} = \ln\frac{W_{y,T=1}}{W_{o,T=1}} - \ln\frac{W_{y,T=0}}{W_{o,T=0}}$$

Let $c$ and $n$ also denote college-educated and non-college workers; the expression above can be rewritten as

$$W_{DiD} = \Delta l_{cy}\,\beta_{cy,T=0} + \left(l_{cy,T=0}\,\Delta w_{cy} + l_{ny,T=0}\,\Delta w_{ny}\right) + \left(l_{co,T=1}\,\Delta w_{co} + l_{no,T=1}\,\Delta w_{no}\right) \tag{2.2}$$

where $l_{cy}$ and $l_{ny}$ are the shares of the younger cohort who are college-educated or non-college, respectively; similarly, $l_{co}$ and $l_{no}$ are the shares for the old cohort; $\Delta w_{cy}$ and $\Delta w_{ny}$ are the GE effects on college-educated and non-college workers in the younger cohort; $\Delta w_{co}$ and $\Delta w_{no}$ are the GE effects on the old cohort. $\Delta l_{cy}$ captures the share of "switchers" due to the expansion. Our parameter of interest is the partial equilibrium return, $\beta_{cy,T=0}$, which captures the return to completing college for an individual in the control provinces.

Equation 2.6 indicates that the wage effect from the DiD model captures both the partial equilibrium effect of completing college and the GE effects on college and non-college workers in both cohorts. Unless these GE effects cancel each other out, our cohort-based DiD approach is not sufficient to identify the partial equilibrium returns to higher education.

Equation 2.6, however, shows us how to recover the partial equilibrium returns if we know the GE wage effects by cohort and educational level. In the next sections, we discuss the GE effects of the expansion on the labor market and firms and our identification strategy to estimate these effects. We then come back to recover the partial equilibrium returns toward the end of the paper.
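Rearranging the decomposition above for the partial equilibrium return can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `partial_equilibrium_return` and every numeric input are hypothetical and are not taken from the estimates in this chapter.

```python
def partial_equilibrium_return(w_did, delta_l_cy, young_terms, old_terms):
    """Solve W_DiD = delta_l_cy * beta + GE_young + GE_old for beta.

    young_terms and old_terms are lists of (share, ge_wage_effect) pairs,
    one pair for college-educated and one for non-college workers.
    """
    if delta_l_cy == 0:
        raise ValueError("the share of switchers must be non-zero")
    ge_young = sum(share * effect for share, effect in young_terms)
    ge_old = sum(share * effect for share, effect in old_terms)
    return (w_did - ge_young - ge_old) / delta_l_cy


# Hypothetical inputs for illustration only.
w_did = 0.05                          # DiD wage effect
delta_l_cy = 0.02                     # share of switchers in the young cohort
young = [(0.2, -0.04), (0.8, 0.03)]   # (share, GE effect): college, non-college
old = [(0.1, -0.04), (0.9, 0.01)]     # (share, GE effect): college, non-college

beta = partial_equilibrium_return(w_did, delta_l_cy, young, old)
print(round(beta, 3))
```

Because the switcher share sits in the denominator, small estimated shares make the recovered return sensitive to noise in the GE terms.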
14 See Khanna (2023) for detailed derivation.

2.5 Equilibrium Effects on the Labor Market and Firm Productivity

In Section 2.4.2, we found that expanding access to higher education greatly benefits individuals at college-going age. Having a university in their province significantly increases the probability of completing college and, hence, improves their employment and wage outcomes. Yet, as discussed above, it remains unclear what the general equilibrium (GE) effects of the expansion are, especially in terms of the college wage premium and firm productivity.

To guide our empirical exploration, we first consider a theoretical model of endogenous technological adoption by Acemoglu (2007) and Carneiro et al. (2023). We then extend the stacked regression approach to study the GE effects of the expansion on the province-level labor market as well as firm-level productivity and capital intensity. Lastly, we revisit the partial equilibrium returns to higher education.

2.5.1 A Model of Endogenous Technological Adoption

We present a simple framework on the relationship between the skill premium and the relative supply of skilled workers (Card and Lemieux, 2001) in the presence of endogenous technological change (Acemoglu, 1998; Moretti, 2004a; Blundell et al., 2022; Carneiro et al., 2023; Khanna, 2023). First, let us denote two cohorts, $j \in \{y, o\}$, and two levels of education, $e \in \{c, n\}$, as before. Suppose the following constant elasticity of substitution (CES) production function:

$$Y_t = \left[\alpha\,(A_c L_c)^{\frac{\sigma_E - 1}{\sigma_E}} + (1 - \alpha)\,(A_n L_n)^{\frac{\sigma_E - 1}{\sigma_E}}\right]^{\frac{\sigma_E}{\sigma_E - 1}}$$

where

$$L_c = \left(\phi_{cy}\, l_{cy}^{\eta} + \phi_{co}\, l_{co}^{\eta}\right)^{\frac{1}{\eta}} \quad \text{and} \quad L_n = \left(\phi_{ny}\, l_{ny}^{\eta} + \phi_{no}\, l_{no}^{\eta}\right)^{\frac{1}{\eta}}, \qquad \eta = \frac{\sigma_A - 1}{\sigma_A},$$

are the aggregate supplies of college-educated and non-college workers across the two cohorts; $\phi_{ej}$ denotes the productivity of cohort $j$ with educational level $e$; $A_e$ denotes the productivity of educational level $e$. The elasticity of substitution between educational levels is denoted by $\sigma_E$ and the elasticity of substitution between cohorts is denoted by $\sigma_A$.
Assuming that workers are paid their marginal product, the college wage premium can be written as:15

$$\log\frac{w_{cj}}{w_{nj}} = \log\frac{A_c}{A_n} - \left(\frac{1}{\sigma_A} - \frac{1}{\sigma_E}\right)\log\frac{L_c}{L_n} - \frac{1}{\sigma_A}\log\frac{l_{cy}}{l_{ny}} \tag{2.3}$$

where the first two terms capture the general equilibrium effects on both young and old cohorts due to skill-biased technological change and a shift in the aggregate skill distribution, while the last term captures the additional effect on the younger cohort due to a cohort-specific shift in the skill distribution.

In the short run, firms cannot adjust their production technology, so the relative productivity term, $\log\frac{A_c}{A_n}$, which reflects the type of technology firms use, does not depend on the share of college-educated workers in the labor market. Therefore, the increase in the relative supply would lower the college wage premium, which is captured by the last two terms in Equation 2.3.

In the long-run equilibrium, $\log\frac{A_c}{A_n}$ depends on the relative supply of college-educated workers in the given labor market. Firms can adopt production technology that augments either college-educated or non-college workers. Technology that augments college-educated workers, also known as skill-biased technology, is one that increases the marginal product of college-educated workers more than that of non-college workers. Technology that augments non-college workers, on the other hand, raises the marginal product of non-college workers more.

To understand how the relative supply affects firm productivity, suppose that the relative productivity $\log\frac{A_c}{A_n}$ is driven by the type of technology $\theta^k$ that firms choose to adopt. Firms choose to adopt technology $\theta^k$ over $\theta^{k'}$ when $Y(L_c, L_n, \theta^k) - c(\theta^k) > Y(L_c, L_n, \theta^{k'}) - c(\theta^{k'})$, where $Y$ is the production function and $c(\theta)$ is the cost of adopting a technology.
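The adoption rule above can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the two technologies, their productivity pairs and adoption costs, and the values of $\alpha$ and $\sigma_E$ are hypothetical and are not calibrated to the data used in this chapter.

```python
def ces_output(l_c, l_n, a_c, a_n, alpha=0.5, sigma_e=2.0):
    """CES output from college (l_c) and non-college (l_n) labor.

    Assumes sigma_e != 1, so the CES exponent is well defined.
    """
    rho = (sigma_e - 1.0) / sigma_e
    inner = alpha * (a_c * l_c) ** rho + (1.0 - alpha) * (a_n * l_n) ** rho
    return inner ** (1.0 / rho)


# Hypothetical technologies: productivity terms (A_c, A_n) and adoption cost.
TECHNOLOGIES = {
    "labor_biased": {"a_c": 1.0, "a_n": 1.0, "cost": 0.0},
    "skill_biased": {"a_c": 2.0, "a_n": 1.0, "cost": 0.2},
}


def chosen_technology(l_c, l_n):
    """Return the technology with the highest output net of adoption cost."""
    def net_output(name):
        tech = TECHNOLOGIES[name]
        return ces_output(l_c, l_n, tech["a_c"], tech["a_n"]) - tech["cost"]
    return max(TECHNOLOGIES, key=net_output)


# Few college workers: the adoption cost outweighs the output gain.
print(chosen_technology(0.1, 1.0))
# Many college workers: the skill-biased technology pays for itself.
print(chosen_technology(2.0, 1.0))
```

With these illustrative numbers, the preferred technology changes as the supply of college-educated labor grows, which is the threshold behavior the model relies on.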
It can be shown that firms will switch to skill-biased technology once the quantity of college-educated workers exceeds a certain threshold (Acemoglu, 2007; Carneiro et al., 2023).16 As a result, we can write the relative productivity as a function of the relative supply of college-educated workers, $\log\frac{A_c}{A_n} = f\left(\log\frac{L_c}{L_n}\right)$. As the expansion increases the relative supply $\log\frac{L_c}{L_n}$, firms are more likely to adopt skill-biased technologies, raising the relative productivity $\log\frac{A_c}{A_n}$ of college-educated workers.

15 This is the canonical model of skill differentials (Katz and Murphy, 1992; Card and Lemieux, 2001).

16 Suppose there are two technologies that are skill-biased and labor-biased, denoted as $\theta^c$ and $\theta^n$, and all firms start with the labor-biased technology $\theta^n$. As the economy has more college-educated workers, the marginal product of college-educated workers under the skill-biased technology becomes higher than that under the labor-biased technology, i.e., $\left.\frac{\partial Y}{\partial L_c}\right|_{\theta^c} > \left.\frac{\partial Y}{\partial L_c}\right|_{\theta^n}$. Firms switch to skill-biased technology when the number of college-educated workers $C$ exceeds a certain threshold $C^*$ where firms are indifferent between the two technologies.

However, it is important to note that technological adoption is a long-term equilibrium outcome, since the share of college-educated workers has to reach a certain threshold before firms find it worthwhile to invest in a new technology. In other words, we would expect the labor market to be affected first as the supply of college-educated workers in the younger cohort increases due to the expansion. Specifically, it reduces the college wage premium via a supply effect. In the long run, as firms adopt new technology, the relative productivity would increase and offset the negative wage effect.

Given the implications of this model, we structure the rest of the analysis as follows. We first analyze the effects on the labor market at the province level. We allow the treatment effects to vary between the short run (4 to 8 years after the expansion) and the long run (9 to 12 years after the expansion). We specifically study how the share as well as the relative supply of college-educated workers evolve in the short and long run for both cohorts and, in turn, how the college wage premium changes in response.

Using Equation 2.3, we can further recover the structural parameters in this model, such as the elasticity of substitution across educational levels, $\sigma_E$, as well as the elasticity of substitution across cohorts, $\sigma_A$. Specifically, we can assume that the technological adoption process does not happen in the short run, which we confirm with firm-level data later on.17 Thus, $\left(\frac{1}{\sigma_A} - \frac{1}{\sigma_E}\right)$ captures the effect on the college wage premiums of both cohorts, while $-\frac{1}{\sigma_A}$ captures the additional effect on the college wage premium of the younger cohort only.

17 This assumption is also commonly made in this literature. For example, see Card and Lemieux (2001) and Carneiro et al. (2023).

Given these GE effects on the labor market, we then turn to examine the effects on firm-level productivity. Similarly, we allow the treatment effects to vary in the short and long run. While $\log\frac{A_c}{A_n}$ captures total factor productivity, we also examine whether the results are consistent with labor productivity, which is measured by value added per worker.

This model belongs to a class of endogenous technological adoption models such as Beaudry et al. (2010), Clemens et al. (2018), and Blundell et al. (2022). There are other frameworks to think about how firms adapt to a labor supply shock.18 First, early models of human capital externalities assume that an increase in the stock of college-educated workers would make everyone more productive (Moretti, 2004a). Therefore, an increase in college-educated workers would make the labor force more productive, offsetting some of the negative effect on relative wage.
The empirical predictions of this model and ours are essentially identical. Second, Acemoglu (1998, 2002) develops a model of directed technical change, in which an increase in college-educated workers would create an incentive for R&D firms to create innovations that are biased towards college-educated workers, raising the overall productivity level in sectors that use college-educated workers more. Given the lack of a formal R&D market, the directed technical change model is not applicable in our context.

Third, a class of capital-skill complementarity models argues that firms adopt skill-biased capital to replace unskilled labor in response to an increase in the relative supply of skilled labor because of the complementarity. As a result, productivity can also rise in such models. This alternative model is entirely possible in our context, given that other studies have found that firms adopt skill-biased capital in response to an increase in the supply of skilled labor (Che and Zhang, 2018; Khanna, 2023). There is a longstanding debate about whether firms adjust along the capital or production technique dimension in response to a change in skilled or unskilled labor (see Lewis (2011, 2013)). Our firm-level data allow us to directly test whether the capital-skill complementarity model or the endogenous technological change model is correct by examining the effects on capital per worker and capital intensity.

18 See a more thorough discussion of these related frameworks in Lewis (2013).

2.5.2 Empirical Strategy for Market-Level Analysis

We can apply the stacked regression approach to study the effects of the expansion on firm-level productivity as well as market-level supply and the college wage premium. Let $Y_{p,t}$ denote the firm-level outcome in province $p$ in year $t$, and let $Post_t$ indicate whether $t$ is after the treatment starts. Similarly, we construct a subdataset for each group of provinces with the same treatment year $g$ with a clean control group. The group-specific TWFE
model is:

$$Y^G_{p,t} = \beta^G\,(T^G_p \times Post^G_t) + \theta_p + \kappa_t + \epsilon_{p,t}$$

where $\beta^G$ captures the province-level treatment effect for group $G$, while $\theta_p$ and $\kappa_t$ capture province and year fixed effects. The analogous stacked regression at the market level is:

$$Y_{p,t,s} = \beta\,(T_{p,s} \times Post_{t,s}) + \theta_{p,s} + \kappa_{t,s} + \epsilon_{p,t,s} \tag{2.4}$$

where $\beta$ is the weighted average of all $\beta^G$. The main difference between the cohort-based DiD model and this model is that the first uses variation across birth cohorts while the second uses variation across survey years.

The advantage of the stacked regression approach, relative to other DiD estimators for staggered timing, is that we can naturally extend our setup to a triple-differences model. Given our theoretical predictions, we can estimate the elasticity of substitution between college-educated and non-college workers via the difference in treatment effects between the two cohorts. Therefore, we estimate the following triple-differences model:

$$Y_{p,t,s,j} = \delta\,(T_{p,s} \times Post_{t,s} \times Young_{j,s}) + \theta_{p,j,s} + \kappa_{t,j,s} + \gamma_{p,t,s} + \epsilon_{p,t,s} \tag{2.5}$$

where $\theta_{p,j,s}$, $\kappa_{t,j,s}$, and $\gamma_{p,t,s}$ denote province-by-cohort-by-subdataset, year-by-cohort-by-subdataset, and province-by-year-by-subdataset fixed effects.

2.5.3 Results for the Effects on the Labor Market and Firms

In Table 2.4, we report the results for the effects of the expansion on several province-level outcomes by cohort. First, we examine the effect on the share of college-educated workers at the province level in column (1). In the first four to eight years post-expansion, the share of college-educated workers among the younger cohort rose by 1.3 percentage points. After 9 years post-expansion, the increase is slightly larger, at 1.6 percentage points. Both short-term and long-term results are statistically significant. In contrast, the old cohort saw effects that are close to zero (0.4 percentage points) and statistically insignificant. Interestingly, even after 8 years, the effect remains very small.
Figure 2.8(a) suggests that these effects are not driven by pre-trends.

Table 2.4: Difference-in-differences estimates for general equilibrium effects on labor market at province level

| | College share | Employment: College | Employment: Non-college | Log wage: College | Log wage: Non-college | Relative supply | College premium |
|---|---|---|---|---|---|---|---|
| All cohorts | | | | | | | |
| 4-8 years after | 0.003 | 0.003 | 0.034*** | -0.052*** | 0.078 | 0.054 | -0.130* |
| | (0.004) | (0.006) | (0.011) | (0.018) | (0.066) | (0.050) | (0.067) |
| 9+ years after | 0.002 | 0.008 | 0.044** | -0.041** | 0.237** | 0.031 | -0.278** |
| | (0.008) | (0.010) | (0.017) | (0.020) | (0.112) | (0.093) | (0.112) |
| Young cohort | | | | | | | |
| 4-8 years after | 0.013*** | 0.035*** | 0.052*** | -0.040 | 0.077 | 0.133** | -0.117* |
| | (0.004) | (0.011) | (0.016) | (0.035) | (0.054) | (0.052) | (0.058) |
| 9+ years after | 0.016** | 0.067*** | 0.064** | -0.003 | 0.204** | 0.130 | -0.206** |
| | (0.008) | (0.019) | (0.024) | (0.038) | (0.095) | (0.097) | (0.092) |
| Old cohort | | | | | | | |
| 4-8 years after | 0.004 | 0.008 | 0.039*** | -0.044* | 0.081 | 0.059 | -0.125* |
| | (0.004) | (0.007) | (0.012) | (0.023) | (0.065) | (0.056) | (0.067) |
| 9+ years after | 0.004 | 0.016 | 0.049** | -0.026 | 0.239** | 0.042 | -0.266** |
| | (0.008) | (0.012) | (0.019) | (0.026) | (0.112) | (0.099) | (0.110) |
| Triple differences | | | | | | | |
| 4-8 years after | 0.012** | 0.041*** | 0.020** | 0.006 | -0.006 | 0.111* | 0.012 |
| | (0.005) | (0.011) | (0.010) | (0.029) | (0.029) | (0.056) | (0.041) |
| 9+ years after | 0.018*** | 0.076*** | 0.021 | 0.035 | -0.054 | 0.132 | 0.089* |
| | (0.006) | (0.017) | (0.013) | (0.036) | (0.047) | (0.078) | (0.050) |
| N | 2716 | 2716 | 2716 | 2716 | 2716 | 2716 | 2716 |

Note: This table reports DiD estimates for the effects of the expansion at the province level. For each treatment group of provinces with the same treatment year, we create a subdataset with those treatment provinces and provinces that were never treated as the control group. These subdatasets are stacked together and a stacked DiD model is estimated on this new dataset, controlling for province-by-subdataset and year-by-subdataset fixed effects. Only the DiD coefficient of the interaction term is reported. Young and Old cohorts are defined as those below and above 35 years old, respectively.
All samples include individuals between age 22 and 55. Relative supply is defined as the log of the number of college-educated workers divided by the number of non-college workers. Premium is defined as the log of the wage ratio between the two types of workers. All standard errors are clustered at the province level.

The expansion also has a positive effect on the employment rate among college-educated workers, although the effect is larger and statistically significant for the younger cohort. Interestingly, the long-run employment effect is twice as big as the short-run effect for college-educated workers. This suggests that firms adjust their skill-biased technology in the long run and, thus, hire more college-educated workers. The expansion also has positive employment effects on non-college workers in both cohorts. The younger cohort, again, appears to experience larger employment effects than the old cohort, although the short-run and long-run employment effects are relatively similar. It is, however, important to note that some of these effects might be driven by pre-treatment trends, as suggested in Figure 2.8(b) and (c).

The expansion has a negative but insignificant wage effect among college-educated workers in both cohorts. The long-run wage effects are smaller than the short-run effects, although they are mostly statistically insignificant. In contrast, the wage effects on non-college workers are both positive and statistically significant. Most surprisingly, the long-term effects are much larger and statistically significant for both cohorts. These results point towards a higher demand for non-college workers in both cohorts in the long run. The stark contrast between college-educated and non-college workers can also be observed in Figure 2.8(d) and (e).

We now turn to the effects on the relative supply and college wage premium in the last two columns.
As expected, the expansion raises the relative supply of college-educated workers among the younger cohort in both the short and long run. In the short run, the relative supply increased by 13.3 percentage points among the younger cohort. As expected, the college premium declined by 11.7 percentage points due to the supply effect. More surprisingly, the college premium of the younger cohort decreased almost twice as much in the long run as in the short run; the long-run effect of the expansion is 20.6%, while the relative supply did not appear to change.

For the old cohort, these patterns are similar. The expansion has very small effects on the relative supply of skilled labor in both the short and long run. The college premium among the old cohort also suffered negative effects in both the short and long run; the magnitudes of these effects are slightly higher than those among the younger cohort. The event study in Figure 2.8 paints a similar picture. Pre-trends do not appear to drive the effects on relative supply and premium.

Figure 2.8: Event study results for the effects on province-level outcomes - by age cohort
(a) Share of adults with college education (b) Employment rate of non-college (c) Employment rate of college-educated (d) Log wage of non-college (e) Log wage of college-educated
Note: These graphs show the event-study estimation results for the effects on province-level outcomes. Each estimate shows the coefficient of the interaction term between year dummy and treatment status. All models control for province and year fixed effects. Standard errors are clustered at the province level and 95% confidence intervals are displayed.

In summary, the short-run effects are as expected from the theoretical model. The expansion shifts the relative supply of college-educated workers in the younger cohort. As a result, it reduces the college wage premium for both cohorts, suggesting that the two cohorts are close substitutes in production.
In the long run, the effects on the college wage premium are more negative, driven by substantial increases in the monthly wage of non-college workers. These long-run results suggest that the expansion creates a larger demand for unskilled labor in the long run. We also find that the expansion has a positive employment effect on non-college workers in both the short and long run.

The larger negative effect on the college premium in the long run is unexpected given both the theoretical model and findings from the previous literature (Goldin and Katz, 2010; Blundell et al., 2022; Carneiro et al., 2023). That is, once firms can adjust their production technology in the long run, we would expect the negative effect on the college wage premium to revert as more firms adopt skill-biased technology and, thus, raise demand for college-educated workers. We come back to this after examining whether firms adopt better technology in the next part.

As discussed above, we can use these results to back out the relevant elasticities of substitution in Equation 2.3. Specifically, we first assume that technological adoption does not take place in the short run,19 so the college wage premium effect on the old cohort in the short run, scaled by the respective relative supply effect, captures the effect of the aggregate skill distribution on the college wage premium, i.e., $\left(\frac{1}{\sigma_A} - \frac{1}{\sigma_E}\right)$. The additional effect on the younger cohort (still in the short run) in the triple-differences model, scaled by the respective relative supply effect, captures the additional GE effect on the younger cohort (given that the two cohorts are imperfect substitutes), which is $\frac{1}{\sigma_A}$. Given that the triple-differences estimate for the college premium after 4 to 8 years is statistically indistinguishable from zero, we can conclude that young and old workers are perfect substitutes in production. Assuming that $\frac{1}{\sigma_A}$ is zero, the elasticity of substitution between college and non-college workers is 2.12.
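The scaling step can be checked with a short calculation. The sketch below assumes that the relevant inputs are the short-run (4-8 years) old-cohort estimates for the college premium and relative supply in Table 2.4; the text does not state this explicitly, so that pairing is an assumption. The magnitude of the resulting ratio matches the 2.12 reported above; how the sign convention in Equation 2.3 maps this ratio into $\sigma_E$ is not verified here.

```python
# Short-run (4-8 years after) old-cohort estimates copied from Table 2.4.
premium_effect = -0.125          # effect on the college wage premium
relative_supply_effect = 0.059   # effect on the relative supply

# Premium effect scaled by the relative supply effect.
ratio = premium_effect / relative_supply_effect
print(round(ratio, 2))
```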
Our result lies well within the range of estimates for this parameter from the previous literature (Acemoglu and Autor, 2011).

19 This standard assumption is also made in other studies in this literature, such as Card and Lemieux (2001), Blundell et al. (2022), and Carneiro et al. (2023).

Firm-Level Effects

We now turn to the results for the effects of the expansion on firm-level outcomes, which we report in Table 2.5. As before, we report the short-run and long-run effects separately. Our main outcome is total factor productivity (TFP), whose construction we detail in A.2. We also report the results for labor productivity, measured as log value added per worker, capital per capita, measured as $\log\frac{\text{assets}}{\text{labor}}$, and capital intensity, measured as $\log\frac{\text{assets}}{\text{revenue}}$. Since different sectors can respond differently to the labor supply shock, we split our sample into service firms and manufacturing firms. We drop agricultural firms as well as firms in the energy, water, and waste sectors, which are mostly controlled by the government. These firms tend to have a different level of access to capital (Baccini et al., 2019) and thus may respond differently than the rest of the sample.

First, we find that in the short run, service firms experience almost no effect on TFP (8 percentage points and not statistically significant). In the long run, the expansion increases their log TFP by 31.2%. This is consistent with our model, in which firms can only adjust their production technology in the long run in response to a change in the labor supply. This result is robust when we apply PSM to select the appropriate control group or when we allow for differential linear trends. We also find similar results when using labor productivity as an alternative outcome. In contrast, we find that the expansion has positive effects in both the short and long run on manufacturing firms' TFP; the short-run effect is 12.4% while the long-run effect is 31.7%.
These results are also robust to different specifications and different measures of productivity.

These results are consistent with our model of endogenous technological adoption. As the number of college-educated workers increases, the marginal product of these workers rises by more under the skill-biased technology than under the labor-biased technology. Thus, when the amount of skilled labor reaches a certain threshold, firms would find it worthwhile to adopt the skill-biased technology.

Table 2.5: Difference-in-differences estimates for effects of the expansion on firm-level outcomes

| | Service: TFP | Service: Labor productivity | Service: Capital per capita | Service: Capital intensity | Manufacturing: TFP | Manufacturing: Labor productivity | Manufacturing: Capital per capita | Manufacturing: Capital intensity |
|---|---|---|---|---|---|---|---|---|
| Specification 1: TWFE | | | | | | | | |
| 4-8 years after | 0.088 | 0.018 | 0.039 | 0.017 | 0.124*** | 0.059* | 0.017 | -0.054 |
| | (0.055) | (0.045) | (0.064) | (0.070) | (0.033) | (0.031) | (0.070) | (0.061) |
| 9+ years after | 0.312*** | 0.194*** | -0.079 | -0.195*** | 0.317*** | 0.125* | -0.195*** | -0.186 |
| | (0.103) | (0.059) | (0.103) | (0.072) | (0.057) | (0.066) | (0.072) | (0.095) |
| N | 3093185 | 3101045 | 3565527 | 3161636 | 775841 | 777424 | 3161636 | 847216 |
| Specification 2: TWFE with PSM | | | | | | | | |
| 4-8 years after | 0.084 | 0.034 | 0.046 | -0.023 | 0.113*** | 0.063* | -0.023 | -0.003 |
| | (0.057) | (0.060) | (0.075) | (0.090) | (0.035) | (0.035) | (0.090) | (0.058) |
| 9+ years after | 0.239** | 0.164*** | -0.059 | -0.234*** | 0.305*** | 0.123* | -0.234*** | -0.143 |
| | (0.104) | (0.060) | (0.126) | (0.084) | (0.059) | (0.064) | (0.084) | (0.094) |
| N | 2808882 | 2815861 | 3249452 | 2862601 | 717205 | 718601 | 2862601 | 784241 |
| Specification 3: TWFE w/ linear differential trends | | | | | | | | |
| 4-8 years after | 0.046 | -0.030 | 0.059* | 0.069 | 0.127** | 0.146*** | 0.069 | -0.034* |
| | (0.047) | (0.037) | (0.030) | (0.042) | (0.049) | (0.049) | (0.042) | (0.046) |
| 9+ years after | 0.245** | 0.117 | -0.047 | -0.113 | 0.323*** | 0.269** | -0.113 | -0.153 |
| | (0.114) | (0.116) | (0.054) | (0.086) | (0.119) | (0.115) | (0.086) | (0.057) |
| N | 3093185 | 3101045 | 3565527 | 3161636 | 775841 | 777424 | 3161636 | 847216 |

Note: This table reports DiD estimates for the effects of the expansion at the individual level.
For each treatment group of provinces with the same treatment year and for each survey year, we create a subdataset with those treatment provinces and provinces that were never treated as the control group. These subdatasets are stacked together and a stacked DiD model is estimated on this new dataset, controlling for province-by-subdataset and cohort-by-subdataset fixed effects. Only the DiD coefficient of the interaction term is reported. All models control for age, age squared, and gender. In specification 2, we alternatively use the TWFE model with a matched control group based on propensity score matching. In specification 3, we allow the treatment and control groups to have linear differential trends. All samples include individuals between age 22 and 55. All standard errors are clustered at the province level.

As discussed above, an alternative explanation for the rise in productivity is that firms adopt more capital as the number of college-educated workers goes up because capital and skill are complements; this increase in capital adoption would also lead to higher productivity (Lewis, 2013; Khanna, 2023). We can test this hypothesis by examining the effects on capital per capita and capital intensity. Interestingly, we find that not only did capital not rise, it even appears to decrease in the long run. These results suggest that firms were switching to a different model of production in response to the expansion. In other words, the capital-skill complementarity framework does not appear to apply in our context.

Two results stand out from our earlier discussion. We find that the expansion has a positive long-run effect on total factor productivity and a negative long-run effect on the college wage premium. Generally, we would expect that an increase in total factor productivity would mitigate the negative effect on the college wage premium in the long run if the technological change is biased towards skilled labor.
That is, if technological change increases the marginal productivity of college-educated workers (and thus, their wage), then in the long run, skill-biased technological change would mitigate the supply effect in Equation 2.3. There are two possible explanations for this puzzling result. First, it is possible that we are only observing the medium-term effects in this study, so firms may not have had time to adjust wages as quickly as they adjust to their new technology. Second, it is possible that the expansion has positive spillover effects on demand for non-college workers. As Mazzolari and Ragusa (2013) and Liu and Yang (2021) note, high-skilled workers may demand low-skill services such as domestic help or food services. The expansion might have raised the demand for such services, thus increasing the non-college wage even further.

Figure 2.9: Event Study for Firm-Level Effects. Panels: (a) Log TFP; (b) Labor productivity; (c) Capital per capita; (d) Capital intensity.
Note: These graphs show the event-study estimation results for the effects of the higher education expansion on firm-level outcomes. Each estimate shows the coefficient of the interaction term between year dummy and treatment status. All models control for province and year fixed effects. Standard errors are clustered at the province level and 95% confidence intervals are displayed. Capital intensity is measured by total value of capital divided by total revenue. Labor productivity is measured by value added per worker.

2.6 Partial equilibrium returns to higher education

The previous sections suggest that returns to higher education are subject to several general equilibrium effects. As discussed in Section 2.4.4, the cohort-based DiD estimates cannot be used to recover the returns to higher education because they would be confounded by these GE effects. Yet Equation 2.6 also provides a way to recover the partial equilibrium returns to higher education.
Revisiting the equation, we note that we can identify, and have identified, the GE effects on the wages of college-educated and non-college workers across both cohorts. It is also straightforward to obtain the shares of college-educated and non-college workers in both cohorts. Therefore, we can recover $\beta_{partial}$, which is the return to college completion for an individual living in a control province.

$$W_{DiD} = \Delta l_{cy}\,\beta_{partial} + \left(l_{cy,T=1}\,\Delta w_{cy} + l_{ny,T=1}\,\Delta w_{ny}\right) + \left(l_{co,T=1}\,\Delta w_{co} + l_{no,T=1}\,\Delta w_{no}\right) \tag{2.6}$$

where $l_{ea}$ denotes the share of those with educational level $e = \{c, n\}$ (college and non-college) among cohort $a = \{y, o\}$ (young and old); $\Delta w_{ea}$ are the GE effects on wage by educational level and age cohort; and $\Delta l_{cy} = l_{cy,T=1} - l_{cy,T=0}$ is the difference in the share of college-educated workers in the younger cohort between the treatment and control provinces. The partial equilibrium return is $\beta_{partial}$, which captures the return to college completion for an individual in the control province.

Given that the GE effects vary between the short run and the long run, we also estimate the partial equilibrium returns for both periods and report them in Table 2.6. The results suggest that higher education has a relatively high partial equilibrium rate of return, between 235% and almost 400%, after accounting for the GE effects. The return using the long-run estimates is considerably smaller than the return using the short-run estimates (roughly 60% of it); in other words, the return to higher education diminishes over time. It is also important to note that these estimates reflect the returns for those who “switched” to complete college because of the expansion. In other words, they may not be representative of other populations.

2.7 Conclusion

In the past three decades, Vietnam has experienced rapid economic growth due to both within-sector productivity growth and structural transformation (McCaig and Pavcnik, 2013; McMillan et al., 2017; Liu et al., 2020). As noted by McMillan et al.
(2017), structural transformation may lead to episodic growth, but sustaining economic growth requires more fundamental changes such as human capital investment and institutional changes. Decree 121/2007 is an effort to push for such fundamental changes, as it allowed Vietnam to establish over 100 new universities in a short period.

Table 2.6: Partial equilibrium returns to higher education in short run and long run

| | Short run: College-educated | Short run: Non-college | Long run: College-educated | Long run: Non-college |
|---|---|---|---|---|
| Share of old cohort | 0.086 (0.028) | 0.914 (0.028) | 0.092 (0.023) | 0.908 (0.023) |
| Share of young cohort | 0.095 (0.029) | 0.905 (0.029) | 0.097 (0.019) | 0.903 (0.019) |
| DiD wage (one estimate per period) | 0.087*** (0.024) | | 0.044* (0.024) | |
| Partial equilibrium returns (one estimate per period) | 3.960 | | 2.350 | |

At the individual level, the expansion increases the probability of completing college by over 57%. It also raises individuals’ monthly wage by over 8%. Being exposed to the expansion also allows individuals to work in skill-intensive sectors and better-paid jobs and, thus, to earn higher wages. The expansion also affects firms through changes in local labor markets. As it raises the relative supply of college-educated workers, it increases the marginal product of college-educated workers in more productive technologies, thus inducing firms to adopt such technologies and raising firm productivity. Surprisingly, the expansion also raises the non-college wage substantially in the long run, thus lowering the college wage premium even further.

These findings allow us to draw a number of policy implications for expanding access to higher education. First, expanding access to higher education does not appear to hurt college-educated workers, despite certain concerns about the quality of higher education and the lack of demand for college-educated workers in developing countries. These concerns were also apparent in Vietnam (The World Bank, 2020).
Our findings indicate that exposure to the expansion improves individuals’ job market outcomes in terms of employment and wage. At the labor market level, it does not hurt college-educated workers’ wages. The returns to higher education are, in fact, very high even after accounting for the general equilibrium effects.

Second, higher education plays a significant role in fostering productivity growth in developing countries such as Vietnam. Our results indicate that expanding access to higher education contributes to economic growth via two channels. The expansion induces workers to move out of less productive sectors and into more productive sectors, speeding up the structural transformation process. More importantly, the expansion also forces firms to adopt better technology, raising their productivity. This finding addresses the concern that less developed countries may lack a formal market for innovation (Acemoglu, 1998, 2002) or an incentive to focus on innovative activities (Acemoglu et al., 2006; Vandenbussche et al., 2006; Aghion et al., 2009) and, thus, that an expansion of high-skilled labor may not lead to more innovation activities.

Our study also indicates that the expansion benefits female workers more than male workers both in terms of college completion and wage. These results imply that the expansion also lowers the gender wage gap, which is an important goal of economic development itself.

Our study provides an important case study on the effects of expanding access to higher education in developing countries; however, generalizing its results to other developing countries would require careful consideration because Vietnam may differ from other developing countries in many dimensions. Despite being a lower-middle-income country, Vietnam has been well-recognized for its success in basic education in terms of enrollment and international assessment scores (Dang and Glewwe, 2018; Dang et al., 2021).
Such a factor may contribute to the effectiveness of the expansion. Since we focus on new universities in provinces that never had a university before, we mostly capture the effects of public universities.20 Therefore, our results cannot be generalized to the effects of private universities.

20 Private universities tend to be located in large cities where there were already universities before the reform.

Chapter 3

Maternity Benefits Mandate and Women’s Choice of Work

3.1 Introduction

Labor regulations requiring employers to provide maternity benefits are traditionally considered as protection for female workers during the childbearing and child-rearing years, as they provide job security and shield female workers from employer discrimination. In developing countries, many women engage in informal work or care work, where maternity benefits are typically unavailable (ILO, 2014). Therefore, when the government imposes requirements on employers to provide maternity benefits for female workers who are pregnant, it may create an incentive for women to switch to a formal job so they can receive the benefits, because such policies often apply only to the formal, regulated sector (Pettit and Hook, 2005; Almeida and Carneiro, 2012). At the same time, a maternity benefits requirement may impose costs on employers in the formal sector, which, in turn, can discourage the hiring of women in the formal sector (Uribe et al., 2019).

The role of maternity benefits in women’s decisions on whether to work in the informal or formal sector in developing countries remains understudied. Although there are several studies examining the impacts of general labor regulations on the informal sector of the labor market (e.g. Almeida and Carneiro, 2012; Freeman, 2010), these studies do not focus on maternity benefits regulations or on women’s labor market outcomes.
Similarly, although there is a sizable literature on how mothers’ labor market outcomes are affected by maternity benefits mandates,1 these studies tend to focus exclusively on mothers and, therefore, do not provide any empirical evidence on whether female workers respond to the incentive provided by the maternity benefits. Most of the existing studies on the employment effects of maternity benefits regulation also focus on developed countries, where the labor markets can be very different from those in developing countries. To the best of our knowledge, only a few studies examine the association between maternity leave policies and female labor force participation (Besamusca et al., 2015; Amin et al., 2016; Uribe et al., 2019; Amin and Islam, 2019), and only Uribe et al. (2019) examine changes in informal and formal employment among women due to changes in maternity leave regulation.

In this paper we assess these important questions in the context of the labor market of Viet Nam during the period between 2010 and 2018. The 2012 Amendment to the Viet Nam Labor Law raised the required maternity leave by two months compared to the original requirement in the law of 1994, providing a unique opportunity to study maternity benefits regulations and female labor market outcomes in developing countries. Viet Nam is particularly interesting to examine because it is known for having a relatively high female labor force participation rate compared to other countries at the same income level (Klasen et al., 2021), and a high share of informal sector workers (ILO, 2016). It also imposes relatively generous maternity benefits on employers. Many have argued that Viet Nam’s high share of women in work is partly due to the socialist regime’s policies to promote gender equality and to draw women into the labor force, such as maternity leave requirements (e.g. Gaddis and Klasen, 2014; Klasen, 2019; Klasen et al., 2021).
1 See Thomas (2016), Strang and Broeks (2017), and Rossin-Slater (2017) for a summary of this extensive literature.

We examine whether the maternity leave extension of the 2012 Amendment has encouraged female workers to transition from informal work to formal employment. The basic assumption that we make in this study is that female workers who are more likely to give birth would be more responsive to the incentive created by the maternity leave extension. We use two difference-in-differences (DiD) designs based on this assumption to identify the effect of the maternity leave extension. In the first approach, we compare the labor market outcomes of women of childbearing age with women beyond childbearing age both before and after the new law came into effect. In the second approach, we compare the labor market outcomes of women with different birth rates, also before and after the law, to quantify such effects. In other words, we leverage the variation in childbearing age in the first DiD approach and the variation in expected birth rates in the second DiD approach. These two different designs allow us to ensure that we are identifying the effect of the maternity leave extension, instead of any other component of the 2012 Amendment. We specifically focus on whether women choose to work in the formal sector in response to the maternity benefits they will receive when they become pregnant or when they give birth. Therefore, we do not restrict our study to mothers or those who would become mothers.

We find that the maternity leave extension of the new law increases formal employment of women of childbearing age by 2.7 percentage points, and decreases agricultural household work by 3.2 percentage points. The increase in formal employment mostly happens in the private formal sector, especially in the manufacturing industry and among middle-skilled occupations such as plant and machine operators as well as craft workers.
The contributions of this study are threefold. First, despite an extensive body of studies on the effects of maternity leave regulations on women in developed countries (e.g. Ruhm, 1998; Akgunduz and Plantenga, 2013; Rossin-Slater, 2017; Hook and Paek, 2020), few studies explore the labor market effects on women in developing countries (Besamusca et al., 2015; Amin et al., 2016; Uribe et al., 2019; Amin and Islam, 2019). Besamusca et al. (2015), Amin et al. (2016), and Amin and Islam (2019) document that countries with paid and longer maternity leave regulation are associated with more women choosing to work. Uribe et al. (2019) find that extending the required maternity leave has a negative effect on women’s formal employment in Colombia. In contrast, using the same empirical strategy as Uribe et al. (2019), we find that a similar law has a positive effect on women’s formal employment in Viet Nam. This difference suggests that maternity benefits regulations may have very different effects in different labor market settings.2

2 Our study is also broadly related to a large literature on family, fertility and women’s work in developing countries, e.g. Bloom et al. (2009), Krafft (2020), Selwaness and Krafft (2021), and Finlay (2021).

Second, our study complements a small but growing literature on female labor force participation in Viet Nam (Kreibaum and Klasen, 2015; Klasen, 2019; Dang et al., 2022; Klasen et al., 2021; Feeny et al., 2021). While the existing studies already explore what might have led to Viet Nam’s relatively high female labor force participation, only a few studies examine the causal factors of female labor force participation and especially female formal employment.3 Our study contributes to this literature by providing direct evidence for the role of maternity leave policies in female workers’ decisions on where to work.

3 Dang et al. (2022) find that childcare increases the probability of formal employment among mothers.

Our study is also related to a different literature on labor rights and labor reforms in Vietnam, which mostly comprises qualitative studies. Tran (2013) reviews and discusses extensively the history of labor organizing to fight for better working conditions, better pay, and better rights as well as the cultural ties among labor rights activists, which are among the main drivers of reforms around labor rights, including the 2012 Amendment that we study. Evans (2020) and Evans (2021) further argue that labor reforms in Vietnam are driven by domestic activism alone but are assisted by the interests of the communist government around labor rights issues.4 These labor reform studies highlight the significance of labor rights such as maternity leave from the cultural and political lens of workers and policymakers, a perspective that is not often discussed in typical economics studies.

4 The government has an interest in labor repression to maintain the low labor-cost advantage of the Vietnamese economy to promote exports. At the same time, the government recognizes that trade agreements, e.g., the Trans-Pacific Partnership (TPP) and EU Trade Agreement, can strengthen the political regime, but they also require more progressive reforms around labor rights issues (Evans, 2021). The unauthorized strikes by workers also triggered the government’s fear for the regime’s legitimacy, putting more pressure on it to pursue pro-labor reforms (Evans, 2020).

The rest of the paper is organized as follows. Section 2 provides an overview of the 2012 Amendment.5 Section 3 reports different data sources used in the study as well as how we define different labor market outcomes for our analyses. Section 4 provides a discussion of our empirical strategies. In Section 5 we report and discuss the findings, and Section 6 concludes the paper.
5 See Appendix B.2 for an overview of the recent trends in the labor market of Vietnam.

3.2 Background

The main regulatory framework for the labor market, especially for formally registered businesses, is the Viet Nam Labor Law, which was first written in 1994 (World Bank, 1995) and amended in 2002, 2012, and 2019. The original Viet Nam Labor Law of 1994 already provided substantial protections for female workers. Under the original law, female workers were entitled to four months of paid leave during prenatal and postnatal periods, and such leave would be extended by one month for each additional child. During this leave, female workers would still receive full wages from their employer and maternity benefits from the Viet Nam Social Insurance Fund. More importantly, female workers were guaranteed their previous job back, or guaranteed a new job with an equivalent wage. The law also prohibited employers from giving work that might be harmful to the mother or the child. After giving birth, female workers were able to take an additional hour-long break per working day.

The 2012 Amendment to the Viet Nam Labor Law (Law no. 10/2012/QH13) was passed in June 2012 and came into effect on 5 January 2013. The Amendment makes several changes to the original code. First, it formally defines strikes, including when and how they are allowed. This provision is considered by labor rights scholars as weakening the workers’ role in resolving strikes (Tran, 2013). Second, the new law prohibits workers from working more than eight hours per day and 48 hours per week; the law also limits overtime work to 30 hours per month and 200 hours per year. Third, the Vietnamese New Year holidays are extended by one additional day, and workers are allowed to take an unpaid day off when a family member passes away.
Fourth, the law formally states that: (1) wages shall be paid based on labor productivity and quality of work performed; and (2) the minimum wage is the lowest payment for an employee who performs the simplest work and must ensure his or her minimum living needs.6

Most importantly, the new law extends the mandated paid maternity leave for female workers to up to six months during the prenatal and postnatal periods. In other words, the new law increases the required maternity leave by two months. The salary during the maternity leave is completely paid for by social insurance, and employers do not have to pay for this expense. Employers do have to contribute 18% of monthly payroll toward social insurance, while workers contribute 8% of their monthly salary.

6 It is important to note that this provision only puts forward the formal definition of the minimum wage. The actual regulations on the minimum wage levels were established in 2011 by Decree 70/2011/ND-CP (Tran, 2013).

3.3 Data

Our primary data source is the Viet Nam Household Living Standard Survey (VHLSS), a nationally representative, biennial survey conducted by the General Statistics Office of Viet Nam. The survey sample, which was based on the Population and Housing Census in 1999 and in 2009, consists of roughly 9,000 households from all over the country. The survey collects data from all members in the household, so the sample includes over 36,000 individuals. The data provide rich information about labor force participation, as well as individual and household demographics, economics, and education. For this study, we use the five most recent waves of the data, covering the period 2010–18, to construct a repeated cross-sectional sample.7 Since we aim to investigate the effect on individual decisions to work, we want to exclude those who are still in school or in retirement.
This is because those who are in school also have to face the decision whether to continue to invest in their education, while those who are of retirement age have to consider whether to retire (or continue to work in the informal sector); therefore, their decisions to work are likely very different from those of the working age population.

7 Although the VHLSS data has been collected since 2002, only the 2010-2018 surveys ask for information about social insurance, which allows us to construct the main outcome variable for formal employment, as described below.

Given these considerations, we choose to restrict the study population to individuals aged 25–54. The official retirement ages in Viet Nam are 60 for men and 55 for women, while the official age of completing schooling, including university education, is 21. However, many individuals remain in school until later. For example, 23.5% of individuals aged 22 and 12.8% of those aged 23 are still in school; by the age of 25, however, only 5.4% remain in school (according to the VHLSS 2010-2018). This means that almost everyone in the 25-54 age range can choose between formal or informal employment, or not working at all, for reasons unrelated to schooling and retirement. These age cutoffs also allow us to split the sample into 6 age groups of 5 years: 25-29, 30-34, 35-39, 40-44, 45-49, and 50-54 for further analysis by age group. We focus our analysis on non-college-educated women because 80 per cent of this group engage in unpaid family work and, hence, are more likely to respond to the maternity leave increase.
We set the childbearing age to be between 25 and 44 because birth rates fall below 0.1 per cent for women older than 45 (Figure 3.1) and because our sample covers women aged 25–54.

Figure 3.1: Birth rates by age. [Horizontal axis: age, in five-year groups from 15-19 to 60-65; vertical axis: birth rate (%).]
Source: Authors’ calculations using the most recent live birth information from the 2009 Population and Housing Census.

In order to measure changes in formal and informal employment, we focus on four main employment outcomes, defined as follows, along with individuals’ monthly incomes:

• Not working: not holding any job. The reasons for not working include studying, housework, old or retired, disabled or chronic illness, or cannot find a job. The main reason for not working for men is disability, while the main reason for women is housework (Figure B.1).

• Formal employment: holding a job that (1) comes with a wage or salary, and (2) provides social insurance (ILO, 2016). This category mainly includes workers in the public or private formal sectors (domestic and foreign firms).

• Informal wage work: holding a job that comes with a wage or salary, but does not provide social insurance. In other words, these are wage workers who do not receive any social insurance.

• Agricultural and non-agricultural household work: workers who contribute to a family business in agriculture or non-farm sectors. These family-contributing workers are also considered as informal workers (ILO, 2016).8

Table 3.1 provides a summary of key statistics for women aged 25–44 and 45–54 before and after the new law took effect in 2013. In Figure 3.2 we also plot these labor market outcomes by age groups, gender, and year. In Figure 3.2(a) we note that the share of women with formal employment in the 25–44 age group is generally higher than the share of men in the same age group, and also higher than the shares of men and women in the older age group.
During 2010-2014, formal employment of women and men aged 25-44 follows an upward trend, while formal employment of both genders aged 45-54 remains stable. This is consistent with what McCaig and Pavcnik (2013) and Liu et al. (2020) observe: as more formal businesses open over time, demand for younger workers in the formal sector also increases, as they tend to have better education. After 2014, formal employment among women aged 25-44 rises more sharply. At the same time, formal employment among men aged 25-44 continues to rise at the same pace as before. In contrast, formal employment among women aged 45-54 remains stable throughout the study period, and that among men aged 45-54 appears to decrease after 2014. In other words, there is an increase in formal employment among women of childbearing age following the 2012 Amendment that we do not observe in men aged 25-44 or women aged 45-54. We also observe a decrease in formal employment among older men following the new law. The age- and gender-specific changes in formal employment suggest that these are more related to the maternity leave extension of the Amendment than to the other provisions, which are likely gender-neutral.

8 Since the VHLSS survey does not ask about whether an individual is a self-employed worker, we do not include this labor market outcome. If self-employed workers earn a wage, then they would be under the “Informal wage work” category, but if they work unpaid for their household, then they would be under the “Agricultural or non-agricultural household work” category.

Table 3.1: Summary statistics

| | 2010–12: Age 25–44 | 2010–12: Age 45–54 | 2014–18: Age 25–44 | 2014–18: Age 45–54 |
|---|---|---|---|---|
| Demographics and education level | | | | |
| Age | 34.60 (5.76) | 49.36 (2.83) | 35.14 (5.70) | 49.48 (2.88) |
| Urban | 0.26 (0.44) | 0.29 (0.45) | 0.26 (0.44) | 0.29 (0.45) |
| Married | 0.87 (0.33) | 0.83 (0.38) | 0.88 (0.32) | 0.84 (0.36) |
| Children in household | 1.02 (0.93) | 0.37 (0.66) | 1.04 (0.94) | 0.44 (0.72) |
| Primary education or none | 0.55 (0.50) | 0.52 (0.50) | 0.50 (0.50) | 0.50 (0.50) |
| Lower or upper secondary education | 0.45 (0.50) | 0.48 (0.50) | 0.50 (0.50) | 0.50 (0.50) |
| Labor market outcome | | | | |
| Formal employment | 0.11 (0.31) | 0.06 (0.23) | 0.16 (0.37) | 0.05 (0.23) |
| Not working | 0.08 (0.27) | 0.09 (0.29) | 0.07 (0.26) | 0.10 (0.30) |
| Unpaid work | 0.66 (0.47) | 0.74 (0.44) | 0.59 (0.49) | 0.71 (0.45) |
| Agricultural household work | 0.53 (0.50) | 0.61 (0.49) | 0.47 (0.50) | 0.56 (0.50) |
| Non-agricultural household work | 0.24 (0.43) | 0.22 (0.41) | 0.24 (0.43) | 0.25 (0.43) |
| Log earning | 9.70 (0.75) | 9.68 (0.87) | 10.09 (0.69) | 9.96 (0.78) |
| N | 19,592 | 8,972 | 25,702 | 14,041 |

Source: Authors’ estimation based on the VHLSS 2010–18.

The share of women in both age groups who have informal wage employment increases steadily during the study period. In contrast, the share of informal wage employment among men aged 25-44 stays relatively stable, and that share among men aged 45-54 fluctuates during 2010-2018. The trends in not working are similar across genders and age groups and are also relatively stable over time, suggesting that the new law does not have any effect on this outcome.

Agricultural household work, on the other hand, experiences a sharp drop in 2016 across all age groups and genders. This decrease is part of a longer-term decline in agricultural employment in Vietnam. Indeed, McCaig and Pavcnik (2013) and Liu et al. (2020) find that agricultural employment has been steadily declining since the early 2000s as trade liberalization and enterprise reforms increase the number of employment opportunities outside the agricultural sector. This structural transformation process, however, has similar effects on both genders in terms of agricultural employment (Liu et al., 2020), which is consistent with what we observe in Figure 3.2(d).
We also observe an opposite trend for non-agricultural household work across all age groups and genders: non-agricultural household work decreases from 2010 to 2014 and then reverses, increasing in 2016–18. For both agricultural and non-agricultural household work, these trends represent economy-wide changes as they apply to all age groups and genders.

The log monthly wage also increases for all groups, although it increases faster for women aged 25-44 throughout the study period. Indeed, the wage gap between women of the two age groups was very small in 2010, but has been widening since 2012, i.e., even before the new law was implemented. In contrast, the wage gap between the two groups of men remains stable until it closes in 2018.

These descriptive comparisons illustrate the increase in formal employment among women of childbearing age relative to the other age and gender groups following the implementation of the new law. They also show the overall shift away from the agricultural sector as documented in the previous literature (McCaig and Pavcnik, 2013; Liu et al., 2020). In the next section, we present our empirical strategies to identify the effect of the maternity leave extension while accounting for these structural changes of the economy.

Figure 3.2: Employment outcome for women aged 25–54 by year. Panels: (a) Formal employment; (b) Informal wage employment; (c) Not working; (d) Agricultural household work; (e) Non-agricultural household work; (f) Log monthly wage. [Each panel plots four series (Female, 25-44; Female, 45-54; Male, 25-44; Male, 45-54) against survey year, 2010 to 2018.]
Source: Authors’ calculation based on the VHLSS 2010–18.

3.4 Empirical Strategy

Identifying the effect of the maternity leave extension is not straightforward because the extension was combined with several other changes in the 2012 Amendment, which means we may pick up the effects of those changes as well. Our empirical strategies, therefore, are built on the assumption that female workers who are more likely to give birth are also more likely to respond to the incentive of the maternity leave extension. In other words, we assume that the other components of the Amendment may affect the labor market outcomes of everyone, but the maternity leave extension would differentially affect women who are more likely to give birth.

We consider two difference-in-differences research designs to identify the effect of the extension based on this premise. For the first design, we compare employment outcomes between women of childbearing age and women older than childbearing age before and after the law came into effect. For the second design, we compare employment outcomes between women with different expected birth rates, also before and after the maternity leave extension.
3.4.1 First Research Design: Comparison by Childbearing Age

First, we observe in Figure 3.1 that the fertility rates among the 25-44 age group are positive and substantially higher than those among the 45-54 age group, which are effectively zero. Therefore, the 45-54 group is unlikely to respond to the maternity leave extension, allowing us to use it as the control group. Formally, we estimate the following model for woman $i$ aged 25–54 in year $t$:

$$Y_{it} = \beta + \delta\,(\text{Age 25–44}_{it} \times \text{Post}_t) + X'_{it}\Gamma + \sum_{s} \kappa_s\,(\text{Year}_s \times \text{Age}_{it}) + \theta_{p,t} + \eta_{p,g} + \varepsilon_{it} \tag{3.1}$$

where $Y_{it}$ denotes the employment outcomes of woman $i$ in year $t$, $\text{Age 25–44}_{it}$ indicates whether the woman is aged 25 to 44, and $\text{Post}_t$ indicates whether year $t$ is after 2013. We control for province-by-year fixed effects ($\theta_{p,t}$), province-by-age-group fixed effects ($\eta_{p,g}$), and other individual characteristics, captured by $X_{it}$; $\varepsilon_{it}$ denotes the error term. The sample is split into six age groups: 25–29, 30–34, 35–39, 40–44, 45–49, and 50–54, which are used for the age group fixed effects. The coefficient of interest is $\delta$. Individual characteristics include urban, ethnicity, household size, number of children under 10 years old in the house, primary education or less, secondary education, and marital status. The terms $\text{Year}_s \times \text{Age}_{it}$ are year dummy variables interacted with age; they allow us to account for changes over time in the effect of age on labor market outcomes that are unrelated to the maternity leave change. In all models we cluster standard errors at the commune–year level to account for the survey’s sampling design.

We also extend the main DiD model by estimating the dynamic treatment effects in the following event-study specification:

$$Y_{it} = \beta + \sum_{s \neq 2012} \delta_s\,(\text{Age 25–44}_{it} \times \text{Year}_s) + X'_{it}\Gamma + \sum_{s} \kappa_s\,(\text{Year}_s \times \text{Age}_{it}) + \theta_{p,t} + \eta_{p,g} + \varepsilon_{it} \tag{3.2}$$

where $s = \{2010, 2014, 2016, 2018\}$ and 2012 is the reference year.
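To make the structure of the event-study terms concrete, the interaction dummies in Equation 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `event_study_terms` and the constants are hypothetical, and the sketch only builds the regressors for one observation rather than estimating the model.

```python
# Builds the (Age 25-44 x Year_s) interaction dummies from Equation 3.2 for a
# single observation, omitting the reference year. Names are hypothetical.

SURVEY_YEARS = [2010, 2012, 2014, 2016, 2018]  # VHLSS waves used in the study
REFERENCE_YEAR = 2012                          # omitted category

def event_study_terms(age, survey_year, years=SURVEY_YEARS, ref=REFERENCE_YEAR):
    """Return {s: 0 or 1} for each non-reference year s."""
    treated = 1 if 25 <= age <= 44 else 0  # childbearing-age indicator
    return {s: treated * int(survey_year == s) for s in years if s != ref}

# A 30-year-old observed in 2014 switches on only the 2014 interaction.
print(event_study_terms(30, 2014))
# A 30-year-old observed in the reference year has all interactions at zero,
# as does any woman in the 45-54 control group.
print(event_study_terms(30, 2012))
print(event_study_terms(50, 2014))
```

Because the 2012 interaction is dropped, each coefficient on these dummies is read relative to the treatment-control gap in 2012.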
We use the same set of controls, including province–year fixed effects, province–age group fixed effects, and year dummy variables times age. Embedded in this model is a check for parallel pre-treatment trends (or pre-trends): $\delta_{2010}$ represents the difference-of-differences of employment outcome between the childbearing group and the non-childbearing group in 2010 and in 2012. Because the law was implemented in 2013, we would expect $\delta_{2010}$ to be close to zero and/or statistically insignificant.

Based on the different birth rates across age groups, we would expect the effects to be larger among younger age groups if they respond to the incentive from the maternity leave extension. Therefore, we can assess the validity of our approach by allowing the treatment effects to vary by age group. Specifically, we estimate the treatment effects by age group: 25–29, 30–34, 35–39, and 40–44, while the control group is women aged 45–54:

$$Y_{it} = \beta_0 + \sum_g \gamma_g\,(\text{Post}_t \times \text{AgeGroup}_{gt}) + \theta_{p,t} + \eta_{p,g} + X'_{it}\Gamma + \varepsilon_{it} \tag{3.3}$$

where $\gamma_g$ denotes the treatment effect on women of age group $g$. We also control for province–year fixed effects, province–age-group fixed effects, and individual characteristics, as before. Importantly, we drop the control variables for year dummy variables times age because there would not be enough variation to differentiate the treatment effects by age groups.

3.4.2 Threats to Validity and Robustness Checks

The underlying identifying assumption of the first research design is that the employment outcomes of the treatment group and the control group would have followed the same trends if the required maternity leave length had not been changed.⁹ This parallel trends assumption would be violated if other factors unrelated to the maternity leave extension also affect employment outcomes differently for the two groups over time. In this section, we discuss different potential threats to this identifying assumption and how we attempt to address them.
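The pre-trend check embedded in the event-study specification can be illustrated with a small sketch. Assuming only group-by-year mean outcomes are available, each event-study coefficient is the treated–control gap in a given year minus the same gap in the 2012 reference year. The function name `event_study_coefficients` and all numbers below are hypothetical; the actual specification also includes covariates, fixed effects, and sampling weights.

```python
def event_study_coefficients(cell_means, reference_year=2012):
    """Return {year: delta_s} for every year other than the reference year.

    cell_means maps year -> (treated_mean, control_mean).
    delta_s is the treated-control gap in year s minus the gap in the
    reference year, so the reference-year coefficient is zero by construction.
    """
    treated_ref, control_ref = cell_means[reference_year]
    reference_gap = treated_ref - control_ref
    return {
        year: (treated - control) - reference_gap
        for year, (treated, control) in sorted(cell_means.items())
        if year != reference_year
    }


# Hypothetical formal-employment shares: year -> (women 25-44, women 45-54)
toy_means = {
    2010: (0.10, 0.08),
    2012: (0.11, 0.09),
    2014: (0.12, 0.10),
    2016: (0.16, 0.11),
    2018: (0.18, 0.12),
}
deltas = event_study_coefficients(toy_means)
for year, delta in deltas.items():
    print(year, round(delta, 3))
```

In this made-up example the 2010 coefficient is approximately zero, which is the pattern one would hope to see if the two groups followed parallel trends before the law took effect, while the post-2013 coefficients grow over time.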
The most likely confounders of the maternity leave requirement extension are the other components of the 2012 Amendment. As discussed in Section 3.2, the new law also defines strike conditions, limits the working hours and overtime work, extends a holiday, and provides a formal definition for the minimum wage. Because our DiD design identifies the effect of the maternity leave extension from variation in whether a woman is of childbearing age, our estimation would be biased if these unrelated provisions also affect the two age groups differently.

Another potential confounder is the rapid structural transformation that Vietnam has undergone since 2000, as workers moved out of the agricultural sector and into higher productivity sectors such as manufacturing and services (McCaig and Pavcnik, 2013; Liu et al., 2020). A large part of this structural transformation process was due to trade liberalization and enterprise reforms that increased the number of employment opportunities in the manufacturing sector, specifically in the clothing, food products and beverages, furniture, and footwear industries (McCaig and Pavcnik, 2013). McCaig and Pavcnik (2015) and Liu et al. (2020) further note that younger and more educated workers are more likely to be affected. Limited employment opportunities in the agricultural sector, along with mechanization and uptake of labor-saving inputs, allow younger workers who are increasingly more educated to transition out of agriculture and into non-farm sectors (Liu et al., 2020). Structural transformation, therefore, can bias our estimates upward as it can increase the number of formal jobs available in the local labor market every year.¹⁰

Lastly, our results can also be biased by the minimum wage reform in 2011 (Decree 70/2011/ND-CP), which took effect in 2012. Prior to 2011, the government maintained two systems of minimum wage: one was for all domestic employers (including the government), and the other one was for the foreign direct investment sector. As a requirement to join the World Trade Organization (WTO) in 2007, the government equalized the minimum wage levels of the two sectors in 2011 (Tran, 2013; Nguyen, 2017). It is important to note that the minimum wage levels have been increased every year and across every geographic region (and sector), so this is not a shock on the level of the minimum wage, but rather a shock on the difference in minimum wage levels between two sectors.

⁹ We also assume that the maternity leave extension did not affect our control group, which is women aged 45-54. The trends in Figure 3.2 suggest that this is not a problem in our data.
¹⁰ Relatedly, the Vietnamese government also passed the 2013 Land Law, which extends the land tenure for farmers who grow annual crops and is found to boost agricultural investment (Bellemare et al., 2020). The increase in land tenure security may also induce workers to pursue off-farm employment. However, we do not expect this law to have different effects on the childbearing and non-childbearing age groups.

Both the structural transformation and the minimum wage reform are shocks at the local labor market level. On one hand, the rapid structural transformation means there would be more job opportunities in the non-farm sectors; therefore, it is a positive labor demand shock (McCaig and Pavcnik, 2013; Liu et al., 2020). On the other hand, a higher minimum wage in a competitive labor market means that the cost of hiring would increase, so it is a negative labor demand shock (Nguyen, 2017). Our empirical strategy accounts for both of these concerns, to some extent, by controlling for province–year fixed effects in all models to account for any changes in the local labor market in any given year. We also control for province–age-group fixed effects and year fixed effects times age¹¹ to absorb any unobserved heterogeneity across age groups over time and across different provinces.
Lastly, we exclude college-educated women and control for educational attainment in Equation 3.1, since younger cohorts tend to have higher educational attainment and, hence, are more likely to switch to the non-farm sectors.

¹¹ Note that we control for year dummy times age for all specifications except for the specification for treatment effects by age group in Equation 3.3, as explained above.

It is important to note that all of the potential confounding factors discussed above would likely affect both male and female workers. Specifically, the other provisions of the 2012 Amendment to the Labor Code are gender-neutral, so we would not expect them to affect the two genders differently. Liu et al. (2020) find that the gender composition of the agricultural labor force was stable during the 1992–2016 period, which suggests that the structural transformation process may not affect men and women differently. Del Carpio et al. (2013) find that the shares of male and female workers who have wages below the minimum wage level are relatively similar (6.1% and 6.5%, respectively). Therefore, we can assess the extent to which our main findings are confounded by replacing women aged 45–54 with men aged 25–44 as the control group for the DiD estimation. Because this alternative control group is as young as the treatment group, this estimation should not be affected by any differential effects of the structural transformation process or the other provisions of the 2012 Amendment by age. In other words, if the estimates using men aged 25–44 as the control group are not different from the main findings (i.e., using women aged 45–54 as the control group), then we can be more assured that our main findings are not driven by the potential confounding factors discussed above.

Alternatively, we can estimate a triple-differences regression in which women of childbearing age are compared to men of the same age and both men and women aged 45-54.
The triple-differences approach allows us to further control for labor market trends across different age groups. Formally, we estimate the following model:

$$Y_{it} = \beta + \delta^{DDD}\,(\text{Age 25--44}_{it} \times \text{Post}_t \times \text{female}_i) + X'_{it}\Gamma + \theta_{p,t,f} + \eta_{p,g,f} + \zeta_{p,t,g} + \varepsilon_{it} \tag{3.4}$$

where $\theta_{p,t,f}$ represents the province–year–female fixed effects, $\eta_{p,g,f}$ represents the province–age group–female fixed effects, and $\zeta_{p,t,g}$ represents the province–year–age group fixed effects. The last term absorbs any differential changes in labor market outcomes across different age groups unrelated to gender. This term, therefore, will account for the confounding effects of the structural transformation as well as the other provisions of the 2012 Amendment as they likely affect both genders equally. The triple-differences specification can be easily extended to allow the effects to vary by year or age group as done for the difference-in-differences model.

3.4.3 Second Research Design: Comparison by Expected Birth Rates

The main DiD approach identifies the effect of the maternity leave extension using variation in childbearing age. These estimates would be biased if other factors unrelated to the maternity leave extension also have differential effects across age groups, as we discussed previously. In the previous section, we extended this design by switching to a different control group and employing a triple-differences model with extensive controls for unrelated labor market shocks by age, gender, and geography to address that concern. However, this approach still relies on the assumption that the effect identified by variation in demographic characteristics related to giving birth, such as childbearing age and being female, would reflect the effect of the maternity leave extension and not other factors. In this section, we consider an alternative research design to identify the effect of the extension using variation in expected birth rates across districts and age groups.
That is, we would expect that women with higher expected birth rates would respond more strongly to the incentive of the maternity leave extension than those with lower expected birth rates. Therefore, we can directly compare the groups with different birth rates over time to identify the effects of the leave extension instead of comparing between childbearing and non-childbearing age or between men and women. We restrict our analysis to the sample of women of childbearing age to ensure that the results from this design are driven entirely by differences in expected birth rates.

Specifically, we use the 2009 Population and Housing Census to calculate district-level age-specific fertility rates, defined as the number of children born to females of a specific age group in each district, for the sample of women of childbearing age. We split the birth rates into five equal-sized bins, then estimate the following model:

$$Y_{it} = \beta + \sum_b \delta_b\,(W^b_{g,d} \times \text{Post}_t) + \theta_{p,t} + \eta_{p,g} + \gamma_{t,g} + X'_{it}\Gamma + \varepsilon_{it} \tag{3.5}$$

where $W^b_{g,d}$ indicates whether the birth rate of age group $g$ in district $d$ falls into bin $b$, $\theta_{p,t}$ are province–year fixed effects, $\eta_{p,g}$ are province–age-group fixed effects, and $\gamma_{t,g}$ are year–age-group fixed effects. The reference group is the group with the lowest birth rates. This model quantifies the effects by birth rates while accounting for any unobserved confounders across age groups within provinces and any secular shocks across provinces and age groups.

Comparing different birth rate groups allows us to assess whether the main results are driven by responses to the maternity leave extension of the 2012 Amendment. If the effects are larger among groups with higher birth rates and smaller among groups with lower birth rates, it would be consistent with the interpretation that the employment outcomes are responding to the maternity leave extension and not to other unrelated components of the new law.
In contrast, if the effects’ magnitudes do not correspond to the birth rates, then it is likely that the employment outcomes are driven by other factors. A useful placebo test for the validity of this approach, therefore, is to estimate the same model on the sample of only men between age 25 and 54. That is, we merge the expected birth rates of women by district and age groups with a male sample and re-estimate Equation 3.5. Since men do not respond to the incentives from the maternity leave extension, we would expect the effects to be small and statistically insignificant.

3.5 Results

3.5.1 Main Findings

For the main findings, we present three sets of results in Table 3.2. Columns (1) to (3) report the results from estimating different variants of Equation 3.1, that is, a DiD model with women aged 25–44 as the treatment group and women aged 45–54 as the control group. Columns (4) to (6) report the results from estimating the alternative DiD model where men aged 25–44 are the control group. Lastly, column (7) reports the results from estimating the triple-differences model in Equation 3.4.

The results for different outcomes are presented in separate panels of Table 3.2. In panel A the outcome is formal employment, defined as holding a job that provides wage/salary and social insurance. In panel B the outcome is informal wage employment, defined as holding a job that provides wages/salary but not social insurance. In panel C, the outcome is not holding a job. In panels E and F, the outcome variables are whether individuals are working in an agricultural or non-agricultural activity in their own household, respectively. In panel G, the outcome is the log monthly wage from the current job among those who earn a monthly wage. In other words, those who are self-employed or do not work are not included in this regression.

Table 3.2: DiD results for the effect of extending maternity leave

Columns (1)-(3): main DiD model, women aged 45-54 as control. Columns (4)-(6): alternative DiD model, men aged 25-44 as control. Column (7): DDD model.

Specification                        (1)        (2)        (3)        (4)        (5)        (6)        (7)

Panel A: Formal employment
  Estimate                           0.039***   0.039***   0.031**    0.033***   0.034***   0.029***   0.027**
  Std. error                         (0.013)    (0.013)    (0.014)    (0.007)    (0.006)    (0.006)    (0.012)
  N                                  34701      34701      34422      45294      45294      45215      67739

Panel B: Informal wage work
  Estimate                           0.011      0.011      0.010      0.007      0.008      0.010**    -0.004
  Std. error                         (0.009)    (0.009)    (0.010)    (0.005)    (0.005)    (0.005)    (0.010)
  N                                  34600      34600      34321      45155      45155      45075      67509

Panel C: Not holding a job
  Estimate                           -0.010     -0.006     0.002      -0.004     -0.001     0.000      -0.009
  Std. error                         (0.013)    (0.013)    (0.014)    (0.005)    (0.005)    (0.005)    (0.010)
  N                                  34701      34701      34422      45294      45294      45215      67739

Panel E: Agricultural household workers (unpaid)
  Estimate                           -0.032     -0.043**   -0.046**   -0.025***  -0.032***  -0.029***  -0.032**
  Std. error                         (0.020)    (0.019)    (0.019)    (0.008)    (0.008)    (0.008)    (0.016)
  N                                  34701      34701      34422      45294      45294      45215      67739

Panel F: Independent production/business household workers (unpaid)
  Estimate                           -0.005     0.002      0.007      -0.010     -0.007     -0.008     0.019
  Std. error                         (0.020)    (0.019)    (0.021)    (0.008)    (0.008)    (0.008)    (0.017)
  N                                  34701      34701      34422      45294      45294      45215      67739

Panel G: Log monthly income
  Estimate                           0.108*     0.072      -0.024     0.084***   0.086***   0.090***   0.019
  Std. error                         (0.064)    (0.060)    (0.085)    (0.020)    (0.018)    (0.021)    (0.069)
  N                                  9129       9129       7507       17591      17591      16788      20684

Additional controls
  Age group FE                       ✓                                ✓
  Year FE                            ✓                                ✓
  Female FE                                                           ✓
  Province × Year FE                            ✓                                ✓
  Province × Age group FE                       ✓                                ✓
  Province × Female                                                              ✓
  District × Year FE                                       ✓                                ✓
  District × Age group FE                                  ✓                                ✓
  District × Female                                                                         ✓
  District × Female × Year FE                                                                          ✓
  District × Age group FE × Female                                                                     ✓
  Province × Age group FE × Year FE                                                                    ✓

Note: This table reports the DiD results for the effects of the maternity leave extension on labor market outcomes. In columns 1-3, we use women aged 45-54 as the control group, and the results are reported for the interaction term Post-2013 × Age 25-44. The sample for these estimates is non-college females aged 25-54. In columns 4-6, we use men aged 25-44 as the control group, and the results are reported for the interaction term Post-2013 × Female. The sample for these estimates includes all non-college individuals aged 25-44. Standard errors are clustered at the commune-year level and reported in parentheses and p-value is reported in brackets; sampling weights are applied. All models control for year FE, age group FE, urban, ethnicity, household size, number of children aged under 10 in household, educational attainment, marital status, and year-specific cohort linear trends. Data is drawn from the VHLSS 2010-2018.

For the main DiD model comparing women of childbearing age (as the treatment group) to older women (as the control group), we consider three different specifications that control for different levels of fixed effects, from most to least parsimonious. All specifications control for urban, ethnicity, household size, number of children under 10 years old in household, educational attainment, marital status, and year fixed effects times age. The baseline specification in column (1) controls only for age group fixed effects and year fixed effects. This accounts for unobserved heterogeneity by age group as well as labor market shocks at the national level. The specification in column (2) instead controls for province–age group and province–year fixed effects; the first fixed effects absorb unobserved heterogeneity by age group and province, while the second fixed effects absorb local labor market shocks at the province level. The specification in column (3), on the other hand, controls for district–age group and district–year fixed effects. This specification accounts for heterogeneity and shocks at the district level, which is more granular than controlling at the province level.
The specifications in columns (2) and (3) account for the fact that the local labor market conditions may induce younger workers to move out of agriculture and into formal employment as driven by structural transformation (McCaig and Pavcnik, 2013).

To check whether the results from the main DiD model are still confounded by local labor market conditions or other provisions of the 2012 Amendment that are gender-neutral, we contrast the main DiD model with an alternative DiD model comparing women of childbearing age (as the treatment group) with men of the same age (as the control group). We estimate three specifications that are analogous to those in the first three columns. Specifically, the specification in column (4) controls for a female dummy and for year and age group fixed effects. This baseline specification accounts for time-invariant heterogeneity by gender and age group, as well as shocks at the national level. For column (5), we estimate a specification that controls for province–female, province–age group, and province–year fixed effects. As in the specification in column (2), these controls account for local labor market shocks as well as gender and age differences at the province level. In column (6), we estimate a specification controlling for district–female, district–age group, and district–year fixed effects, which is analogous to that in column (3).

Lastly, in column (7), we present the results from estimating the triple-differences model in Equation 3.4. As described in Equation 3.4, we control for province–year–age group, district–female–year, and district–age group–female fixed effects. This model accounts for the gender and age group-specific shocks in any given year.

The estimates in Panel A of Table 3.2 indicate that the maternity leave extension has a positive and statistically significant effect on formal employment of women of childbearing age. In the base specification, the estimate is 0.039.
When accounting for age group heterogeneity and local labor market conditions at the province level, the estimate is still 0.039. However, the estimate is reduced to 0.031 when controlling for district–age group and district–year fixed effects. These results indicate that the baseline estimate is partly driven upward by the local labor market shocks at the district level.¹²

Turning to the estimates in columns (4) to (6), the estimate in the baseline specification of the alternative DiD model is 0.033, which is slightly lower than that of the baseline specification of the main DiD model in column (1). Since the two specifications are analogous except for the control group, the difference between the results in columns (1) and (4) suggests that the baseline estimate from the main DiD model may be biased to some extent, although the difference is quite small. When controlling for province–year, province–female, and province–age group fixed effects, the estimate changes only slightly to 0.034. However, when replacing the province-level controls with district-level controls, the estimate is again reduced to 0.029. This suggests that the local labor market conditions also influence the estimates in the alternative DiD model. However, the specifications with district-level fixed effects from the two DiD models yield very similar results, 0.031 and 0.029, which means that the extensive set of fixed effect controls successfully absorbs most of the biases in both models. Lastly, the triple-differences estimate is 0.027, which is slightly smaller than the results from both of the DiD models, but it remains positive and significant. This model further accounts for different changes of formal employment across genders and age groups, so the estimate is slightly more conservative than those of both DiD models.
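To make the relationship between the DiD and triple-differences estimates concrete, the sketch below computes a triple difference from eight cell means: the DiD among women (younger versus older, after versus before) minus the same DiD among men. This is only an illustration of the differencing logic, not the regression in Equation 3.4, which adds covariates and high-dimensional fixed effects; the function names and all cell values are hypothetical.

```python
def did_within_gender(cells, female):
    """DiD for one gender: change among ages 25-44 minus change among ages 45-54.

    cells maps (female, young, post) -> mean outcome, each flag being 0 or 1.
    """
    change_young = cells[(female, 1, 1)] - cells[(female, 1, 0)]
    change_old = cells[(female, 0, 1)] - cells[(female, 0, 0)]
    return change_young - change_old


def triple_difference(cells):
    """DDD: the women's DiD net of the men's DiD."""
    return did_within_gender(cells, 1) - did_within_gender(cells, 0)


# Hypothetical formal-employment shares keyed by (female, young, post)
toy_cells = {
    (1, 1, 0): 0.10, (1, 1, 1): 0.16,  # women aged 25-44: before, after
    (1, 0, 0): 0.08, (1, 0, 1): 0.10,  # women aged 45-54
    (0, 1, 0): 0.14, (0, 1, 1): 0.17,  # men aged 25-44
    (0, 0, 0): 0.12, (0, 0, 1): 0.14,  # men aged 45-54
}
did_women = did_within_gender(toy_cells, 1)
did_men = did_within_gender(toy_cells, 0)
ddd = triple_difference(toy_cells)
print(round(did_women, 3), round(did_men, 3), round(ddd, 3))
```

In this toy example, part of the women's DiD reflects an age-specific change that men also experience, so the triple difference is smaller than the women's DiD. This mirrors why a triple-differences estimate can be more conservative than a DiD estimate when younger workers of both genders share a common trend.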
¹² Note that the sample size is reduced slightly from 34,701 to 34,422 in the third specification. As we control for a more granular geographic level, some district–age group cells have one observation and, hence, are dropped from the regression as these are known to introduce bias in high-dimensional fixed effects models (Correia, 2015).

The fact that the estimates are considerably similar to each other and remain statistically significant when using different control groups, models, or controlling for different levels of fixed effects, is reassuring evidence that any biases from local labor market conditions or factors unrelated to the maternity leave extension are relatively small. That is, under the assumption that the other provisions of the 2012 Amendment and the structural transformation are likely to affect both genders, the alternative DiD model using men aged 25-44 would yield unbiased estimates. Given that both DiD models yield very similar results, the main DiD estimates are also likely to be unbiased. Similarly, the triple-differences model is likely to be unbiased because it accounts for both gender and age group-specific shocks. Since the estimates from both of the DiD models and the triple-differences model are very close to each other, the biases are, again, likely to be small.

Panels B, C, and F suggest that the maternity leave extension had no effect on informal wage employment, not working, or non-agricultural household work. The results are insignificant across the board, and the estimates are close to zero. Different DiD and triple-differences models yield the same conclusion. These results suggest that workers are not moving from these labor market options to formal employment.

In contrast, we find substantial evidence that workers move out of agricultural household work in response to the maternity leave extension. In the baseline estimate of the main DiD model, the treatment effect is -0.032 but is not significant.
However, when we control for province–age group and province–year fixed effects, the estimate grows in magnitude to -0.043 and becomes significant. Further controlling for local labor market conditions at the district level also increases the magnitude of the estimated treatment effect to -0.046. The alternative DiD model yields estimates from -0.025 in the base specification to -0.029 in the least parsimonious specification. These results again suggest that local labor market shocks may bias the DiD estimates toward zero since the treatment effect becomes larger in magnitude as we control for fixed effects at a more granular geographic level. Lastly, the triple-differences estimate is -0.032, which is in between the estimates of the two DiD models. These robust estimates suggest that the maternity leave extension decreases employment in the agricultural household sector. The magnitude of this negative effect is comparable to the magnitude of the positive effect on formal employment, implying that workers mainly move from agricultural household work to formal employment.

Unlike the results for employment, the evidence for the effect on log monthly income is mixed across different models. Without controlling for local labor market shocks, the estimate from the main DiD model is 0.108. When adjusting for shocks at the province level, the model yields a smaller and insignificant estimate, which is 0.072. When we adjust for shocks at the district level, the estimate becomes negative and still insignificant. These results indicate that local labor market shocks bias the main DiD model’s estimate upwards. In contrast, the alternative DiD model yields similar estimates, which range from 0.084 to 0.090 and are all statistically significant. The triple-differences model yields an estimate of 0.019, which is also insignificant.

The considerable inconsistency between the three models warrants further investigation. Figure 3.2 shows monthly wage increases throughout the study period for all genders and age groups, although women of childbearing age appear to experience a slightly larger increase relative to other groups. This difference, however, can be driven by the local labor market conditions; as more firms enter the market over time, demand for labor also increases and so do wages. The triple-differences model addresses this problem by accounting for changes for both gender and age group over time, so the estimate is smaller and insignificant. Taken together, these results provide some evidence that the extension increased monthly wages, but the evidence is not robust across different estimation methods.

The event study estimates in Figure 3.3 provide a more detailed picture of the dynamic effects of the maternity leave extension. For each outcome, we show three sets of results which correspond to the two DiD models and the triple-differences model. For the formal employment outcome, we observe that the 2010 term is statistically insignificant for all three models, allowing us to conclude that the main finding is unlikely to be driven by differential pre-treatment trends. The estimates are close to zero for the treatment effect in 2014, but become positive, larger, and significant for 2016 and 2018. In other words, the extended maternity leave increases formal employment among women of childbearing age over time.
Figure 3.3: Event-study estimates for effects on women’s labor market outcomes
[Figure not reproduced. Panels: (a) Formal employment; (b) Informal wage employment; (c) Not working; (d) Agricultural household work; (e) Non-agricultural household work; (f) Log monthly income. Each panel plots estimated coefficients and 95% confidence intervals by survey year, 2010 to 2018, for three models: Main DiD, Alternative DiD, and Triple differences.]
Note: Standard errors are clustered at the commune–year level. Source: Authors’ estimates from event-study specifications for the main DiD, the alternative DiD, and the triple-differences models. In the main DiD model, the control group is women aged 45-54; the model controls for district–year and district–age-group fixed effects. In the alternative DiD model, the control group is men aged 25-44; the model controls for district–year, district–female, and district–age group fixed effects. In the triple-differences model, we control for province–year–age group, district–age group–female, and district–female–year fixed effects. All models control for the individual covariates described in text.

The evidence for the treatment effects on agricultural household work is less clear.
The main DiD model using older women as the control group and the triple-differences model do not suffer from the pre-trends problem, while the alternative DiD model using men aged 25-44 as the control group appears to have negative pre-trends. All three models indicate no effect in 2014. The effect becomes negative and significant in 2016 in the alternative DiD model. In 2018, the effects are negative and significant across all three models. These event-study estimates do confirm that the maternity leave extension decreases agricultural household work, although the effects do not become apparent until 2018.

In short, the main DiD and triple-differences results as well as the event-study estimates indicate that women of childbearing age respond to the maternity leave extension of the 2012 Amendment by switching out of agricultural household work and into formal employment. Accounting for local labor market shocks at the district level appears to account for most biases from factors unrelated to the maternity leave extension. We rule out the possibility that our findings reflect the effects of the local labor market or the other components of the 2012 Amendment, e.g., changes in the strike conditions or minimum wage definition, which are likely to be gender-neutral. By comparing women of childbearing age with older women as well as with men of the same age group, we find that our results are driven specifically by women of childbearing age instead of by younger workers relative to older workers.

Figure 3.4 shows our estimates for treatment effects by age groups. We present two sets of results for each outcome from estimating a DiD model and a triple-differences model that allows treatment effects to vary by age group. For the formal employment outcome, the treatment effects from the first model are smallest among the 40-44 age group and largest among the 35-39 age group, and the estimates are all statistically significant.
Although the estimate for the 40-44 age group is relatively small compared to other age groups, one may still be concerned that the effect size might be unreasonably large given that this group has a very low, albeit non-zero, birth rate (see Figure 3.1). In other words, some unrelated factors might be at play that are driving these estimates. The triple-differences model appears to address some of this concern by adjusting these estimates downward. The treatment effects for the formal employment outcome estimated from this model are smaller and the results are significant only for the 25-29 and 35-39 age groups.

Figure 3.4: DiD estimates for effects by age
[Figure not reproduced. Panels: (a) Formal employment; (b) Informal wage employment; (c) Not working; (d) Agricultural household work; (e) Non-agricultural household work; (f) Log monthly income. Each panel plots estimated coefficients and 95% confidence intervals for the 25-29, 30-34, 35-39, and 40-44 age groups from two models: DiD and Triple-differences.]
Source: Authors’ estimates from the DiD model in Equation 3.3 and the triple-differences model, in which treatment effects are allowed to vary by age group. In the main DiD model, the control group is women aged 45-54; the model controls for district–year and district–age-group fixed effects. In the triple-differences model, we control for province–year–age group, district–age group–female, and district–female–year fixed effects. All models control for the individual covariates described in text.
The treatment effect estimates for agricultural household work are all negative, but are significant only in the main DiD model. The triple-differences estimates have similar magnitudes, but wider confidence intervals. In other words, the triple-differences model yields less precise estimates, likely because of the extensive set of fixed effects used as control variables. Interestingly, the main DiD model also estimates a negative and significant effect on non-agricultural household work among the 30-34 and 35-39 age groups. However, these results are likely biased by other factors, which are accounted for by the triple-differences model.

3.5.2 Results from the Second Research Design

As discussed in Section 3.1, the main difference-in-differences design relies on the assumption that the formal employment outcome of women of childbearing age and older women would have followed the same trends if the maternity leave requirement had not been extended. This parallel trends assumption would be violated if other factors also affect the labor market outcomes of younger women relative to older women (or workers in general). In Equation 3.5, we consider a different DiD design where we directly compare women aged 25-44 in low birth-rate groups with those in high birth-rate groups before and after the law was implemented. This allows us to identify the effect using the variation in birth rates conditional on age group–year fixed effects, which absorb the differential effects of any unrelated factors on different age groups over time.

The plots in Figure 3.5 summarize the estimates for the effects on the four groups with higher birth rates relative to the group with the lowest birth rates. Relative to the lowest birth-rate group, the treatment effects on formal employment for the other four higher birth-rate groups are positive and statistically significant.
The two highest birth-rate groups also have substantially higher treatment effects than the rest, indicating that the 2012 Amendment has a larger effect among women with higher expected birth rates. This pattern is consistent with our main finding that the increase in formal employment is driven by the maternity leave extension.

Figure 3.5: Event Study for the Effects by District–Age-Group Birth-Rate Bin
[Figure omitted. Panels: (a) Formal employment; (b) Informal wage employment; (c) Not working; (d) Agricultural household work; (e) Non-agricultural household work; (f) Log monthly income. Each panel plots estimated coefficients and 95% CIs by birth rate group (1 to 5).]
Note: The graph reports the results from estimating the DiD model in Equation 3.5, in which treatment effects are allowed to vary by birth-rate groups. Birth rates vary by district and age group, and are binned into five equal groups. The lowest birth-rate group is the reference group. All models control for province–age group fixed effects, year–age group fixed effects, and province–year fixed effects. The sample includes women aged 25–44 in the VHLSS 2010–18 sample. Standard errors are clustered at the commune–year level and sampling weights are applied in the regressions.

Surprisingly, we find that the treatment effects on informal wage employment are negative and larger among the groups with higher birth rates. The treatment effects are significant for the two highest birth-rate groups. The estimates for agricultural household work are negative but no longer significant across all birth-rate groups.
These are different from the main findings, where we find that the maternity leave extension decreases agricultural household work and does not affect informal wage employment. In contrast, the estimates for not working, non-agricultural household work, and log monthly income are relatively small and insignificant, which is consistent with the results from the main DiD design.

To assess whether this specification absorbs any confounders unrelated to birth rates, we re-estimate the same model for a sample of only men between ages 25 and 54 as a placebo test and present the results in Figure B.2. The estimates here represent the treatment effects on men in district–age groups with different expected birth rates. Since these birth rates should not have any effect on men (conditional on the fixed effects), we would expect all estimates to be statistically insignificant if this DiD design is valid. Indeed, we find that the treatment effects on men are mostly insignificant and relatively close to zero. The results for informal wage employment in the male sample are similar to those from the female sample, but they are imprecisely estimated. Overall, the results from this placebo test support the conclusion that the increase in formal employment among women in the higher birth-rate groups is indeed driven by the maternity leave extension and not the other provisions of the 2012 Amendment.

In summary, the results for the formal employment outcome are robust to different specifications and research designs, which allows us to confirm that the maternity leave extension provides an incentive for women of childbearing age to move into the formal sector.
The evidence for informal wage employment and agricultural household work is less conclusive; the research design using the variation in age and gender suggests that women move out of the agricultural sector, while the research design using the variation in birth rates suggests that women move out of informal wage employment.

3.5.3 Formal Employment in the Public and Private Sectors

In this section, we apply the same approaches to understand more about the sectors (i.e., public or private), industries, and occupations that women move into as a result of the maternity leave extension. First, we note from Figure 3.6(a) that public employment has been declining among women of both age groups during the entire study period, but the decrease for women of childbearing age has slowed down since 2014. Similarly, public employment also decreased among older men during 2010–18, while that of younger men only decreased in 2018.

Figure 3.6: Public and private formal employment for women aged 25–54 by year
[Figure omitted. Panels: (a) Formal employment in public sector; (b) Formal employment in private sector; (c) Log monthly income (public); (d) Log monthly income (private formal). Each panel plots yearly values for 2010–2018 for four groups: Female, 25-44; Female, 45-54; Male, 25-44; Male, 45-54.]
Source: Authors’ calculation based on the VHLSS 2010–18.

The patterns of formal employment in the private sector in Figure 3.6(b) are more similar to those of overall formal employment that we observed in Figure 3.2(a).
There is an upward trend among all age and gender groups, but the increases among younger men and women appear to be larger than those among older men and women. Within the 25-44 age group, women experienced a larger increase than men, especially after 2014. These trends suggest that formal employment in the private sector among women of childbearing age increases following the maternity leave extension, relative to the other demographic groups.

Despite these changes in employment among women aged 25-44, there is little evidence that their monthly wages are affected in either the public sector or the private formal sector. Log monthly wages follow an upward trend for all genders and age groups, although those among the older age group are more volatile for both genders.

We now turn to the regression estimates for the effects on these outcomes in Table 3.3. The main DiD model using older women as the control group yields positive and significant estimates for the effect on public sector employment. The baseline estimate is 0.029, and controlling for province–year and province–age group fixed effects yields a similar result. Controlling for district–year and district–age group fixed effects, on the other hand, yields a slightly lower estimate of 0.024. Surprisingly, the estimates from the alternative DiD model using men of the same age as the control group are all close to zero and statistically insignificant. This is consistent across different levels of fixed effects as control variables. The triple-differences estimate is -0.007 and also insignificant.

In contrast, the estimates for the effect on formal employment in the private sector are small and insignificant in the main DiD model, but are positive and significant in the alternative DiD model and also the triple-differences model. The baseline estimate of the alternative model is 0.036, while controlling for district–year and district–age group fixed effects reduces the estimate to 0.031.
The triple-differences model also indicates that the treatment effect on private formal employment is positive and significant. The magnitude of the treatment effect on formal employment in the private sector is also comparable to the effect on overall formal employment. Unlike the employment outcomes, the estimates for both wage outcomes are all small and statistically insignificant.13

13 The sample size for log monthly wages decreases substantially when we consider public and private sectors separately.

Table 3.3: DiD results for the effects on public and private formal employment

                DiD model                        DiD model                        DDD
                Women age 45-54 as control       Men age 25-44 as control         model
Specification   (1)        (2)        (3)        (4)        (5)        (6)        (7)

Panel A: Public sector
Estimate        0.029***   0.030***   0.024***   -0.004     -0.005     -0.003     -0.007
Std. error      (0.008)    (0.008)    (0.009)    (0.004)    (0.004)    (0.004)    (0.009)
N               34701      34701      34422      45294      45294      45215      67739

Panel B: Private formal sector
Estimate        0.009      0.009      0.006      0.036***   0.037***   0.031***   0.032***
Std. error      (0.011)    (0.011)    (0.012)    (0.006)    (0.005)    (0.006)    (0.009)
N               34701      34701      34422      45294      45294      45215      67739

Panel C: Log monthly income (public)
Estimate        0.061      -0.102     0.596*     -0.020     -0.018     -0.052     0.154
Std. error      (0.110)    (0.148)    (0.346)    (0.047)    (0.056)    (0.123)    (0.446)
N               1388       1265       349        2093       2075       805        621

Panel D: Log monthly income (private formal)
Estimate        -0.047     -0.062     -0.172     0.036      0.030      0.061      -0.071
Std. error      (0.089)    (0.098)    (0.184)    (0.029)    (0.033)    (0.052)    (0.281)
N               2419       2302       1432       3776       3721       2724       2130

Additional controls: Age group FE ✓ ✓; Year FE ✓ ✓; Female FE ✓; Province × Year FE ✓ ✓; Province × Age group FE ✓ ✓; Province × Female ✓; District × Year FE ✓ ✓; District × Age group FE ✓ ✓; District × Female ✓; District × Female × Year FE ✓; District × Age group FE × Female ✓; Province × Age group FE × Year FE ✓

Note: This table reports the DiD estimates for the effects of the maternity leave extension on labor market outcomes. In columns 1-3, we use women aged 45-54 as the control group, and the results are reported for the interaction term Post-2013 × Age 25-44.
The sample for these estimates is non-college females aged 25-54. In columns 4-6, we use men aged 25-44 as the control group, and the results are reported for the interaction term Post-2013 × Female. The sample for these estimates includes all non-college individuals aged 25-44. Standard errors are clustered at the commune-year level and reported in parentheses and p-value is reported in brackets; sampling weights are applied. All models control for year FE, age group FE, urban, ethnicity, household size, number of children aged under 10 in household, educational attainment, marital status, and year-specific cohort linear trends. Data is drawn from the VHLSS 2010-2018.

The event-study estimations tell a similar story to the main models (see Figures 3.7(a) and (b)). The main DiD model indicates a positive effect on public employment, while the alternative DiD and the triple-differences models suggest that the effects are close to zero. The opposite is true for private formal employment: the main DiD estimates are positive but insignificant, while the alternative DiD and triple-differences estimates are positive and significant.

The inconsistency between the main DiD model and the two other models raises a concern about whether the maternity leave extension increases formal employment in the public or the private sector. We find it unlikely that the maternity leave extension increased public employment. First, when estimating the treatment effects by age group, the estimates for the public employment outcome follow a puzzling pattern: the estimates are larger for older age groups, but the estimate for the 25-29 age group, the youngest individuals in the sample, is negative and significant (see Figure 3.7(c)). We would expect younger individuals (with higher birth rates) to experience a larger effect than older individuals (with lower birth rates) if these estimates represent the treatment effects of the maternity leave extension.
More importantly, these estimates are also not robust to the triple-differences model. When we switch to the second DiD design that uses birth rate groups as the treatment variable (see Equation 3.5), the estimated treatment effects are also small and statistically insignificant across all birth rate groups. In other words, the effect on public employment is also not robust to our second DiD design.

The results for private formal employment are similar to our main conclusion about overall formal employment. The estimated treatment effects by age, as shown in Figure 3.7(d), show that the effects are largest among women between ages 25 and 29, who have the highest birth rate among all age groups (see Figure 3.1). This is consistent with what we would expect if women switch their careers in response to the incentive of the maternity leave extension. When estimating the model using birth rate groups as the treatment variable, the estimates for public employment are small and statistically insignificant. In contrast, the treatment effects on private formal employment are positive, significant, and larger among higher birth-rate groups. Our conclusion is that it is more likely that the maternity leave extension increases formal employment in the private sector.

Figure 3.7: Event study for the effects on public and private formal employment
[Figure omitted. Panels: (a) Event-study: Public empl.; (b) Event-study: Private formal empl.; (c) Effects by age: Public empl.; (d) Effects by age: Private formal empl.; (e) Effects by birth rates: Public empl.; (f) Effects by birth rates: Private formal empl. Each panel plots estimated coefficients and 95% CIs: by year (2010–2018) for the main DiD, alternative DiD, and triple-differences models in (a) and (b); by age group for the DiD and triple-differences models in (c) and (d); and by birth rate group (1 to 5) in (e) and (f).]
Note: Figures (a) and (b) report the event-study estimations for the main DiD, the alternative DiD, and the triple-differences model. In the main DiD model, the control group is women aged 45-54; the model controls for district–year and district–age-group fixed effects. In the alternative DiD model, the control group is men aged 25-44; the model controls for district–year, district–female, and district–age group fixed effects. In the triple-differences model, we control for province–year–age group, district–age group–female, and district–female–year fixed effects. Figures (c) and (d) report the estimations for the main DiD and the triple-differences models, in which the treatment effects vary by age group. Figures (e) and (f) report the DiD estimates where the treatment effects vary by birth-rate groups. All models control for the individual covariates described in text. Standard errors are clustered at the commune–year level.

Instead of splitting by public and private sectors, we can also split formal employment by industry and occupation. That is, we consider which industries and occupations women of childbearing age are switching to as a result of the maternity leave extension.14 We present the triple-differences estimates for the treatment effect on 20 industries and 10 occupations along with the means of the outcome variables in Figure 3.8. First, we observe a relatively large, positive and significant effect on formal employment in the manufacturing sector (which also includes the textile industry), where the estimate is about 0.04.
The large magnitude of the effect on this sector is perhaps not surprising, given that it is also the largest sector in terms of formal employment. We also observe a positive and significant effect on the entertainment sector (including arts, sports, and other entertainment industries), although the estimate is smaller than 0.01. For other industries, the effects are close to zero and statistically insignificant. These results are consistent with our earlier conclusion that the effect is mainly on formal employment in the private sector, as these industries are mainly located in the private sector.15

The results on occupation outcomes yield a similar conclusion. The treatment effects are positive and largest among typical occupations in the manufacturing sector, including operators of plants and machines and textile workers. The formal employment of women is relatively large among these occupations, and they likely require lower skills from their workers (relative to technicians and professionals). Therefore, there are likely more opportunities to switch from the informal sector. There is also a positive and significant effect on the clerical occupation, which typically includes office jobs such as secretaries and customer service jobs. Consistent with what we find for the industry outcomes, these are the occupations that are mainly found in the private sector.16

14 We thank an anonymous reviewer for suggesting that we look at the effects by industry and occupation.
15 Almost 90% of manufacturing workers work for the private sector (VHLSS).
16 According to the 2010-2018 VHLSS, about 86% of the plant and machine operators/workers and 82% of the craft and related workers are hired by the private sector. The clerical occupation is more balanced, as only 53% of the occupation are employed by the private sector.
Figure 3.8: Triple-differences results for industry and occupation effects
[Figure omitted. Panel (a) Treatment effects on industry outcomes: treatment effect with 95% CI and the industry mean among women for Agriculture, Fishing, and Forestry; Mining; Manufacturing; Electricity, water, and waste; Construction; Wholesale and retail sale; Storage and transportation; Hospitality; Information and communication; Finance and banking; Real estate; Science and technology; Administration and support; Politics; Education; Health care; Entertainment; Other services; Household activities and services; International activities. Panel (b) Treatment effects on occupation outcomes: treatment effect with 95% CI and the occupation mean among women for Armed forces; Managers; Specialists; Technicians and associate professionals; Clerks; Service and sales workers; Skilled agriculture, fishing, and forestry workers; Craft and related workers; Plant and machine operators/workers; Elementary workers.]
Note: The graph reports the results from estimating the triple-differences model in Equation 3.4. All models control for urban, ethnicity, household size, number of children under age 10 in the household, educational attainment, marital status, province–age group–year fixed effects, district–age group–female fixed effects, and district–female–year fixed effects. The sample is all non-college men and women age 25-54 in the 2010–18 VHLSS sample. Standard errors are clustered at the commune–year level.

In short, these results suggest that the maternity leave extension of the 2012 Amendment provides an incentive for women of childbearing age to move into formal employment in the private sector. The switch primarily happens in manufacturing where the formal employment is more common than in other industries. The effects are also concentrated among middle-skilled occupations such as machine operators, plant workers, craft workers, as well as clerks.
This is likely because labor demand for these occupations has been rising recently as a result of more private firms entering the manufacturing industries (McCaig and Pavcnik, 2013), and the skills for these occupations are easier to acquire than those in high-skilled occupations.

3.6 Conclusion

The 2012 Amendment to the Labor Law of Viet Nam effectively extends the mandated maternity leave from four months to six months, which is likely enforceable among firms in the formal sector but not informal businesses. This provides an increase in employment benefits particularly for women of childbearing age, creating an incentive for these women to switch from informal work, such as farm or non-farm household work, to formal employment.

We assess the impacts of the maternity leave extension on women’s labor market outcomes using two different DiD designs. First, we compare the employment outcomes of women of childbearing age with those beyond childbearing age before and after the implementation of the law. We find robust evidence that the law is associated with an increase of 3.3 percentage points in the probability of formal employment. Our results are robust to using men of the same age group as the control group as well as to a triple-differences model. Second, we compare employment outcomes among women by their expected birth rates and arrive at similar conclusions. Groups with higher expected birth rates are more likely to move into formal employment in response to the new law. However, the two designs point to different sources of informal employment that women of childbearing age move out of. The first design suggests that women move out of agricultural household work, while the second design suggests that women move out of informal wage employment. We also observe that the extension encourages women to switch to the private formal sector, specifically in the manufacturing industry and middle-skilled occupations.
Interestingly, our findings are the opposite of those of Uribe et al. (2019), who use a similar research design to evaluate a similar law in Colombia and find that extending maternity leave has a negative effect on women’s formal employment. The difference might be due to the differences in the labor market and regulations between Vietnam and Colombia.

First, the private formal sector in Vietnam has undergone significant expansion in the last two decades, especially in the manufacturing industry. The 2000 Enterprise Law, for example, reduced the administrative burden of registering new businesses, contributing to the growth of private formal firms (Malesky and Taussig, 2009). Changes in trade policies such as the US-Vietnam Bilateral Trade Agreement (2001) and Vietnam’s accession to the WTO (2007) shifted the markets away from agriculture and towards manufacturing exports (McCaig and Pavcnik, 2013). This considerable economic development likely provides more opportunities for workers to move out of the informal sector and into the manufacturing sector. Along with the increase in labor demand, female workers in Vietnam also appear to value sick and maternity leave as a reason to be employed in a company (Tran and Jeppesen, 2016).

Second, in Vietnam, the salary received by female workers during a maternity leave is completely paid for by social insurance.17 In the case of Colombia, the employer has to pay 73.06% of the workers’ salary during leave, while the rest is covered by social security (Uribe et al., 2019). The critical difference between the two countries in how salary is paid during the maternity leave may help explain why our findings are different from those in Uribe et al. (2019).

These differences highlight the importance of conducting more studies on the labor market effects of maternity leave mandates in developing countries.
As countries differ in terms of labor laws and labor market conditions, the effects on female labor market outcomes may also vary.

17 Employers do have to contribute 18% of their payroll and employees have to contribute 8% of their salary towards social insurance.

To close, we note that our study comes with two major caveats. First, the VHLSS data allow us to study formal employment only during 2010 or later because the survey did not ask for information about social insurance before that, which means that we can examine the pre-treatment trends for only two periods: 2010 and 2012. Although our event study results support the assumption of parallel trends between the treatment group and two different control groups during these two years, it is possible that the trends may not be parallel before 2010. As a result, the counterfactual trends (in the absence of the maternity leave extension) may also be non-parallel.

However, we observe that the shares of formal employment of all age groups and genders follow relatively stable trends up until 2016 when the trend of women of childbearing age diverged substantially (see Figure 3.2). More importantly, the trends of the control groups (men between age 25-44 and women between age 45-54) continue to be very stable over time. These patterns in the raw data provide reassuring evidence that the changes in the labor market outcomes respond to the 2012 Amendment rather than other factors. Our second DiD design further suggests that the results are driven by birth rates among childbearing-aged women. Therefore, we can conclude that the intervention is related to fertility-related incentives, which can only be the maternity leave extension.

Second, the sample size for individuals with monthly wage information is likely too small to detect any effect of the leave extension on this outcome, especially for wages separately in the public and private formal sectors.
Although on-leave employees are paid through their social insurance, firms are still likely to incur indirect costs for such leaves by leaving the position unfilled or hiring temporary workers. Hence, one may expect the maternity leave extension to have some effect on monthly wages as the required leave is increased from 4 to 6 months. It is possible that the 2-month increase is relatively small so the effects on monthly wages are also small.

Chapter 4
Income Shock and Food Insecurity during the Pandemic

4.1 Introduction

Food insecurity has become a major economic consequence of the global pandemic as lockdown and social distancing are widely adopted to contain the COVID-19 virus.1 During a lockdown, many workers lose part of their incomes or their jobs as they withdraw from social interaction, which, in turn, severely affects their ability to afford food (Devereux et al., 2020). Although several COVID-19 vaccines were successfully developed and approved in early 2021, the world continues to experience new waves of infection with newer and more transmissible variants, especially among developing countries that lack the resources to purchase and distribute vaccines to their citizens. Thus, food insecurity will continue to be an issue until lockdown orders and/or withdrawals from work and social interaction are no longer necessary, i.e., when herd immunity is achieved through vaccine-induced or natural immunity.

1 See Devereux et al. (2020); Brown et al. (2020); Mishra and Rampal (2020); Ahn and Norwood (2020); Paslakis et al. (2020); Reardon et al. (2020); International Monetary Fund (2020).

Given the severity of the economic impacts of the pandemic, governments and international organizations have pursued various actions to provide financial support and food aid to vulnerable populations (Gentilini et al., 2020), which raises an important question: how should governments and organizations effectively target places or households that need their support most within a country? It is now clear that the economic impacts of COVID-19 vary across populations, geography, and economic sectors (The World Bank, 2020; International Monetary Fund, 2020). Prioritizing places affected more by the pandemic will allow governments or international organizations to provide more support to the people who need it most instead of extending resources to people who were relatively unaffected. This is especially true for low- and middle-income countries with limited resources.

A common approach is to send aid to households with a pre-pandemic poverty or low-income status.2 While it is simple to target poor households, not all poor households experience food insecurity during the pandemic. For example, households whose members work in unaffected sectors are not suffering, while near-poor households may still suffer a food insecurity shock if their members work in a severely affected sector.3 This issue is particularly severe in developing countries where data on poverty status may be outdated and data on income do not include informal economic activities (Aiken et al., 2021). A better method is to combine household targeting with geographic targeting: governments (or international organizations) can first identify and allocate an appropriate budget and resources to the localities that are affected most, so that local governments or field agents from international organizations can identify and distribute aid to households or individuals that require assistance. This practice is used by the United Nations World Food Programme (World Food Programme, 2015).

Identifying which places are affected more, however, would require conducting a large-scale survey to determine the percentage of food-insecure households in different areas across a country, which can be both costly and time-consuming.
Facing this constraint, most existing studies only document food insecurity shocks during the pandemic at the national or regional level.4 Several innovative studies propose predicting and targeting food insecurity at a more granular level using remote sensing data on geography and weather (e.g., Andree et al., 2020; von Carnap, 2021; Zhou et al., 2021) or combining such data with phone surveys and mobile phone usage (Blumenstock et al., 2015; Aiken et al., 2021). These studies tap into a significant amount of data that are traditionally underutilized, but they also require cutting-edge machine learning techniques to process satellite photos and make predictions based on a large number of factors. The complex nature of these advanced methods, however, can pose a significant barrier to wide adoption by policymakers. These methods rely heavily on the expertise of the researchers and modelers to make decisions on which data and methods to use as well as decisions about variables, parameters, and assumptions specific to each method.5

2 The goal of this targeting approach can be to support the most vulnerable population or to compensate for their income losses.
3 Gundersen et al. (2011) show that in the US, 65% of households close to the poverty line are food secure, while 10%–20% of households with income twice as high as the poverty line are food insecure.
4 See, e.g., Ahn and Norwood (2020); Wolfson and Leung (2020); Bitler et al. (2020); Aggarwal et al. (2020); Amare et al. (2021); Hirvonen et al. (2021); Gupta et al. (2021); Nguyen et al. (2021); Barrett et al. (2021); Gatto and Islam (2021) among many others.

We propose an alternative approach that also predicts and targets food insecurity at the granular level but relies only on a common and widely used method in the econometric toolkit. The first step is to estimate the causal effect of household income on food insecurity.
The standard OLS estimation may suffer from bias because food insecurity can simultaneously affect income by lowering productivity, so we use local exposure to national employment changes to instrument for household income. The second step is to combine this estimate with external information on aggregate income shocks at the sector level during the pandemic to predict changes in food insecurity at the locality level. Because we focus only on predicting changes in, not levels of, food insecurity, we only require information about how income changes at the sector level. Under the assumption that the pandemic mainly affects food insecurity through the income loss channel, this method can provide a reasonable and timely prediction that policymakers can quickly act on.

5 Zhou et al. (2021) document several modeling choices that can have important consequences on the prediction of these models.

We apply this approach to predict and assess potential food insecurity shocks during the pandemic in Vietnam. In 2020, Vietnam was among the most successful countries at containing new infections with strict measures of mass testing, targeted lockdowns, and quarantine policies. However, in 2021, the country entered a new and more severe wave of infection, forcing more extensive lockdowns in several parts of the country. Although food insecurity is reported as one of the major concerns of households during this period (Yang et al., 2020), little is known about which places in Vietnam are affected more in terms of food insecurity, preventing the government and international organizations from targeting and distributing aid.

We address this problem by predicting food insecurity shocks due to changes in income for 702 districts in Vietnam, allowing identification of districts that are likely affected most. We first use the 2010–2018 Vietnam Household Living Standard Survey (VHLSS) to estimate the effect of income shock on food insecurity.
We then use information from the World Bank on pandemic income shocks and the 2019 Labor Force Survey (LFS) to predict changes in food insecurity risk for 702 districts of Vietnam. The share of food-insecure households is predicted to rise by 0.82 percentage points, but a small number of districts are predicted to experience increases as large as 7.86 percentage points. We also predict an increase of 0.997 percentage points in the share of food-insecure children under 5, and a few districts are predicted to experience an increase as large as 19.33 percentage points. These predictions suggest that the average impact of the income shock during the pandemic in Vietnam may be small, but certain parts of the country are likely affected more severely than others.

Our study makes two contributions to the literature. First, it contributes to a growing number of studies on predicting food insecurity at the locality level for targeting purposes, which are part of a broader literature on interventions and disaster-relief policies to address food insecurity during emergencies.6,7 Whether an intervention is effective depends crucially on the quality of the targeting approach (Barrett, 2010; Lentz and Barrett, 2013): targeting the wrong places is costly, while missing the food-insecure places can have long-term negative consequences for food-insecure households (Brown et al., 2018). Although the literature offers several machine learning approaches to make timely and granular predictions of food insecurity for targeting purposes during crises that are unrelated to COVID-19 (e.g., Lentz et al., 2019; Andree et al., 2020; Zhou et al., 2021), only two studies focus on developing methods specific to the pandemic. Gundersen et al. (2020) employ an OLS-based approach to predict changes in food insecurity at the county level in the US. Relatedly, Aiken et al. (2021) propose a new framework to combine data from satellite images and mobile phone networks with phone survey data to predict levels of poverty in Togo. Like Gundersen et al. (2020), our study also predicts changes in, not levels of, food insecurity. We extend the methods in Gundersen et al. (2020) to address the simultaneity bias in the standard OLS-based approach. Our proposed method is more suitable for developing countries than that of Gundersen et al. (2020) because it employs datasets that are typically available in these countries, including population censuses, national household surveys, and labor force surveys.

6 Several studies find that timely interventions such as cash or food transfers can reduce food insecurity and mitigate its negative effects (e.g., Nikulkov et al., 2016; Gelli et al., 2017; Hidrobo et al., 2018; Christian et al., 2019; Savy et al., 2020, and others). Lentz and Barrett (2013) show that food assistance policies have the highest returns when targeting children under 2 years old.

7 Our study is also related to an extensive literature on factors of food insecurity such as rainfall (Niles and Brown, 2017), agricultural income (Kuma et al., 2019), farm production (Jones et al., 2014), and economic vulnerability (Chaaban et al., 2018).

Second, our study can help policymakers address the rising concerns about food insecurity in Vietnam and other countries.8 In the case of Vietnam, a national phone survey conducted by the World Bank indicates that 33% of households were concerned about not having enough food (Yang et al., 2020), yet which districts are affected more remains unknown. While Vietnam continues to combat new waves of infection with extensive lockdown orders, it becomes increasingly important for policymakers to sketch a rapid and well-targeted response to alleviate the pandemic’s economic impacts in Vietnam. Using information about income shocks in 2020, our study maps the projected food insecurity shocks at the district level, allowing policymakers to identify districts that may require more assistance than others.
Policymakers can also apply our proposed method to predict food insecurity shocks as the pandemic continues or for other shocks in Vietnam and other developing countries.

The rest of this paper is organized as follows. Section 2 summarizes the impacts of COVID-19 on the Vietnamese economy. Section 3 proposes an empirical framework to predict changes in food insecurity due to sector-specific income shocks and describes the data used. Section 4 reports the regression results. Section 5 applies the framework to predict food insecurity changes in Vietnam in 2020 and discusses policy implications. Section 6 summarizes the lessons from our predictions and discusses different caveats of the proposed approach.

8 Because we only focus on food insecurity prediction, our study is only indirectly linked to an extensive literature on measuring the impacts of the pandemic on food insecurity, especially in developing countries. See Picchioni et al. (2021) and Béné et al. (2021) for recent systematic reviews on this topic.

4.2 Impacts of COVID-19 on the Vietnamese Economy

In this section, we briefly discuss how the pandemic evolved in Vietnam and its economic consequences from January 2020 to July 2021. The first five cases of COVID-19 were detected during the last week of January 2020. Since then, several public health measures have been taken to contain the virus’s spread, which in turn has affected the economy. On February 1, 2020, flights between China and Vietnam were canceled (Tuoi Tre online, 2020). The borders between the two countries have also been tightened, disrupting agricultural exports from Vietnam to China (BBC, 2020). Moreover, this strongly affected tourism, an important industry in Vietnam. According to the Vietnam National Administration of Tourism (2021b), Vietnam received 18 million international tourists in 2019 and collected revenue of $27.5 billion in 2018, which is about 11% of 2018 GDP (Vietnam National Administration of Tourism, 2021a).
Following the travel restrictions imposed in February 2020, the Vietnamese tourism industry lost 32% of its international customers (mainly from China). Starting in March 2020, all flights to Vietnam were canceled due to the outbreak in the EU and the US. The Vietnamese government also imposed a two-week social distancing order on April 1, 2020. These events created shocks in both demand and supply in the tourism sector as well as other related industries in the service sector.

The social distancing order in April also affected manufacturing industries by requiring additional distance between workers. According to the Ministry of Industry and Trade (MOIT), the industrial production index decreased by 18% in April 2020 and only recovered to pre-distancing levels in June 2020. Meanwhile, the outbreak also created a supply chain disruption in the manufacturing sector. According to a survey conducted by the Vietnamese General Statistics Office (GSO) in April 2020, 42.8% of surveyed firms reported a supply shortage, and this number increased to more than 70% in the garment industry (GSO, 2020b). This issue was partly resolved when China, Korea, and Japan reopened their economies, and Vietnam’s imports in the first seven months of 2020 were only 2.9% lower than in the same period in 2019 (MOIT, 2020).

While the initial supply shock abated as the country successfully contained the first wave of the virus, Vietnam still faced lower demand from importing countries due to global supply chain issues. For instance, in the first seven months of 2020, the garment and footwear industries exported 12% less than in 2019. To cope with lower demand due to the pandemic, firms reportedly cut costs by terminating labor contracts, laying off workers, cutting workloads, and/or cutting wages.9 This in turn led to higher risks of food insecurity, especially for female and low-skilled workers.
The first wave in Vietnam ended in April 2020, after which the country recorded no new local transmission cases for almost three months. The second wave, however, started in July 2020 when there were several new cases and deaths across multiple cities, forcing local lockdowns. Following the second wave, economic recovery slowed down. According to The World Bank (2020), the Vietnamese economy grew at a slower rate in August 2020 compared to previous months. Retail sales grew by 5.2% in July 2020 and only 2.3% in August 2020. FDI flow also decreased substantially; in July 2020, it was US$3.1 billion but dropped to US$720 million in August 2020. Within Vietnam, vulnerable workers continue to worry about future financial outcomes (Dang and Giang, 2020).

The third wave started in late January 2021 and quickly spread across the country, forcing several cities and provinces to lock down until March 2021 (Ministry of Health, 2021). In early May 2021, state and foreign media issued warnings about a potential fourth wave as new cases were detected in Hanoi, Vinh Phuc, and Ha Nam (BBC, 2021). A fourth wave of infection started in early May 2021, and Vietnam continued to struggle with containing it as of July 2021. This wave is the most severe so far, as the number of new cases from May accounts for 84% of total cases in Vietnam up to this point. The industrial powerhouses, including Ho Chi Minh City, Bac Ninh, and Bac Giang, were the epicenters of this wave. Targeted lockdowns and contact tracing were implemented across the country, affecting 15 million workers in the second and third quarters of 2021 (GSO, 2021). The full extent of the economic impacts of the fourth wave and the pandemic on Vietnam remains unclear as new variants such as the Omicron variant continue to emerge.

Given these economic impacts, one can expect the pandemic to have severe effects on food insecurity.
As household income decreases due to the pandemic, households are less likely to afford nutritious food, which threatens their food security. We illustrate the relationship between food insecurity and income in Figure 4.1 by plotting province-level average monthly income against shares of food-insecure households in (a) as well as province-level shares of poor households against shares of food-insecure households in (b). Provinces with higher average income and a lower share of poor households tend to have a lower share of food-insecure households. In the next section, we discuss in detail how to predict food insecurity shocks due to the pandemic in Vietnam.

9 In April 2020, the GSO reported that 66.8% of firms applied at least one of these measures (GSO, 2020b), and the LFS reports that 30.8 million workers were affected in quarter 2 (GSO, 2020a).

(a) Income versus food insecurity (b) Poverty versus food insecurity
Figure 4.1: Correlation between province-level income and poverty with share of food-insecure households in 2018
Note: Figure (a) shows a scatter plot of the province-level share of food-insecure households and the average monthly income per capita. Figure (b) shows a scatter plot of the province-level share of food-insecure households and the share of poor households. A household is defined as food insecure when its share of calories from staple food exceeds 84% (see more in the next section). A household is defined as poor when its monthly income per capita falls below the national poverty line. The data are from the Vietnam Household Living Standard Survey in 2018.

4.3 Empirical Strategy

4.3.1 Data

As we will discuss in the next section, the method proposed in this study requires data for two steps: (1) estimating the causal relationship between household income and food insecurity and (2) predicting food insecurity changes at the district level by combining the estimates from the first step with external information about income shocks.
We use the 2010–2018 Vietnam Household Living Standard Survey (VHLSS) for the estimation step. The VHLSS is a biennial survey conducted by the General Statistics Office of Vietnam (GSO). Each wave of data consists of nearly 9,400 household observations and roughly 36,000 individual observations across the country. This dataset is well-suited to estimate the effect of household income on food insecurity status for two main reasons. First, it contains many variables related to household food consumption and food insecurity, allowing us to construct different measures of food insecurity based on nutritional quality.10 Second, it covers all provinces over a long time period, which allows us to control for any long-term trends in income and food insecurity at the province level.

However, there are two important caveats of the VHLSS that make it unsuitable for the prediction step: (1) the latest wave available is 2018, while the income shocks reported by the World Bank are a comparison between 2020 and 2019; and (2) the sampling of the survey is not designed to calculate aggregates at the district level; that is, it does not have enough observations per district for aggregation.

Due to these limitations, we use the Labor Force Survey (LFS) in 2019 for the prediction step. The survey is conducted annually by the GSO and includes more than 800,000 individuals from about 200,000 households across 702 districts. Households are selected using stratified random sampling that is representative at the district level. The LFS focuses on the labor market information of individuals of legal working age, which includes employment status, income, and workload, and has demographic information on all household members. The LFS allows aggregating data at the district level, and the latest data available is for 2019, which is more comparable to 2020 than the latest VHLSS wave in 2018.
However, the LFS data do not have information on food consumption, which is why we cannot use it for the estimation step. In Table 4.1, we provide summary statistics for household data from the 2010–2018 VHLSS and the 2016, 2018, and 2019 LFS.

10 In addition to containing a lot of information about household consumption, the VHLSS is a rotating panel, where 50% of the sample appears in two consecutive surveys and a household can only be tracked for three waves. We take advantage of this unique feature to construct a subsample with household-level panel data for a household fixed effects specification.

Table 4.1: Summary statistics

Panel A: 2010–2018 VHLSS
Variables                           2010            2012            2014            2016            2018
                                    Mean (S.E.)     Mean (S.E.)     Mean (S.E.)     Mean (S.E.)     Mean (S.E.)
Staple calorie share (%)            75.12 (11.40)   74.89 (10.84)   72.64 (11.60)   70.87 (11.16)   68.90 (11.76)
Constructed food insecurity (%)     21.83 (41.31)   19.78 (39.83)   14.12 (34.83)   11.04 (31.34)   7.81 (26.83)
Self-reported food insecurity (%)   4.64 (21.04)    3.74 (18.98)    2.62 (15.99)    1.85 (13.47)    1.23 (11.00)
Household dietary diversity score   10.01 (1.54)    10.07 (1.51)    10.16 (1.56)    10.25 (1.45)    10.26 (1.54)
Yearly income (’000,000, 2010 VND)  66.36 (117.25)  71.09 (78.90)   75.96 (76.34)   84.88 (83.76)   100.48 (95.18)
Household size                      3.95 (1.56)     3.91 (1.56)     3.84 (1.57)     3.81 (1.60)     3.74 (1.60)
Urban (%)                           28.26 (45.03)   28.80 (45.29)   29.76 (45.72)   30.33 (45.97)   30.30 (45.96)
No. observations                    9,267           9,278           9,297           9,302           9,299

Panel B: 2016, 2018 & 2019 LFS
Variables                           2016            2018            2019
                                    Mean (S.E.)     Mean (S.E.)     Mean (S.E.)
Yearly income (’000,000, 2010 VND)  71.98 (72.17)   78.25 (71.78)   82.20 (71.84)
Household size                      3.78 (1.55)     3.77 (1.57)     3.70 (1.56)
Urban (%)                           42.68 (49.46)   42.67 (49.46)   42.63 (49.45)
No. observations                    206,385         208,905         212,040

Our instrumental variable is the share of adult employment at the district-industry level in 2009 times the national employment growth of each industry. We construct this variable using the 2009 Population and Housing Census. The census is conducted every ten years by the GSO and contains information on the whole population such as demographics, migration, educational attainment, employment, fertility, mortality, and housing quality. We use the 15% sample provided by the Minnesota Population Center, which has about 14 million observations and is representative at the district level.

4.3.2 Measuring Food Insecurity

According to Bickel et al. (2000), food security is when access to nutritionally adequate, safe, and socially acceptable foods is not limited or uncertain. Following this definition, there are three core concepts that define a food-insecure household: availability, accessibility, and utilization (Webb et al., 2006; Barrett, 2010). Depending on the data availability and the context, researchers can use proxies for these different aspects of food insecurity. In this study, we focus on measuring the utilization aspect of food insecurity, which reflects concerns about whether households make good use of the food to which they have access (Barrett, 2010).

Following Baylis et al. (2019) and Jensen and Miller (2010), we define a food-insecure household as a household with the share of calories from staples exceeding 84%. This approach follows Bennett’s Law that people consume more nutrient-dense foods and reduce their consumption of calorie-dense staple foods as their income increases.
As households prioritize their calorie requirement for basic activities, the shift from consuming calorie-dense staple food to more protein-dense non-staple food implies that households have achieved their desired calorie intake. This approach falls under the utilization concept of food insecurity, as it is concerned with whether households consume nutritionally essential or nutritionally inferior foods (Barrett, 2010). Jensen and Miller (2010) find that an adult in China would not meet his nutritional demand if the fraction of his calories from rice, wheat, and other staple food exceeds 84%. We also use this 84% staple calorie share cutoff to define a food-insecure household, given the similarities between Vietnam and China. With the conversion table from Thi et al. (2018),11 we calculate the monthly calorie intake by food category and construct a dummy variable equal to one when the staple calorie share exceeds 84% and zero otherwise. This binary variable is our main measure of food insecurity.

11 The table is a summary of National Institute of Nutrition (2007).

There are other methods to measure food insecurity. One approach is using calorie intake to define food insecurity: researchers define the minimum requirement of calorie intake in a day. For example, Abebaw et al. (2020) define a household as food-insecure when it consumes 2,200 calories or less per adult equivalent per day. However, such measures may not be appropriate because there are no consistent ways to calculate the adult-equivalent unit. Thi et al. (2018) compare estimates of average adult-equivalent calorie intake at the household level from studies using different methods on the 2004 VHLSS, and the estimates range from 2,300 to 3,300 calories per day across methods. Given that the results are very sensitive to the choice of an adult-equivalent method, using calorie intake to measure food insecurity would inherit the same sensitivity.
Another popular approach is using self-reported information on food insecurity. Within the sustainability dimension, researchers can measure food security by asking whether a household has had two meals per day in the last 12 months or if households have not had enough food to eat in any month within the last 12 months (e.g., Kuku et al., 2011; Ratcliffe et al., 2011; Verpoorten et al., 2013; Gundersen et al., 2017; Abebaw et al., 2020). In addition, this type of approach can measure the utilization dimension by asking households whether they consume less preferred foods or unbalanced meals (Ratcliffe et al., 2011).

The self-reported approach is problematic in our setting for two main reasons. First, a household only answers the question about food insecurity when it has been recognized as a “poor household” by the government in the last five years; in other words, a non-poor household is automatically not considered food insecure. This approach would underestimate the number of food-insecure households by excluding the near-poor group and lead to selection bias when estimating the effect of household income on food insecurity. If the government changes the definition of a poor household, it can also lead to artificial changes in the food insecurity prevalence.

Second, asking whether households have enough food to measure food insecurity can be misleading given the context of Vietnam. Because Vietnam is the second-largest rice exporter in the world, the price of rice there can be as low as 22 cents per pound,12 but nutrient-dense food is more expensive. For example, the unit price of pork is ten times that of rice.13 Therefore, nutritional deficiency is likely a more significant concern than the number of meals per day. Moreover, due to the short-term nature of lockdowns during the pandemic, households may fall back on staple food and reduce consumption of more expensive food to smooth food consumption (Hirvonen et al., 2021), which means the composition of food consumption is likely more relevant than the overall consumption. Given these two reasons, we choose not to use the self-reported measure as the main measure of food insecurity.14

12 Source (in Vietnamese): Vietnambiz (2020b)

13 Source (in Vietnamese): Vietnambiz (2020a)

14 Another problem, although not as severe, is that self-reported data can be inconsistent over time. In the case of the VHLSS, in 2012–2018 a household is considered food-insecure if it does not have two meals a day at some point, whereas in 2010 a household is food insecure if it reports that it does not have enough food.

Another commonly used measure is the household dietary diversity score (HDDS) (Swindale and Bilinsky, 2016; Maxwell et al., 2014). However, the HDDS only counts the number of food categories that households consume and does not account for the depth of food insecurity in terms of food quantity. In other words, the score would fail to flag households with high HDDSs that still suffer nutritional deficiency. HDDSs are also typically calculated using daily or weekly food consumption data; however, we only have monthly data, so this might exaggerate the dietary diversity. Furthermore, dietary diversity could be explained by climate or culture; for example, tropical climates are associated with higher biodiversity and hence higher dietary diversity. For this reason, there is no universal HDDS threshold for a food-insecure household across different countries (Deitchler et al., 2010), as one threshold that works in one country may not work in another.

To illustrate the problem with the food insecurity threshold using the HDDS, in Figure 4.2a we plot average HDDS scores by income group and the thresholds being used to determine food-insecure households by other food insecurity studies (e.g., Vaitla et al., 2015, 2017; Lentz et al., 2019). The figure shows that even the lowest income group in our data has an average HDDS score that exceeds all of these thresholds. Therefore, these HDDS thresholds used in other studies or other countries cannot be applied in Vietnam’s setting because all households would be classified as not food insecure.15 Our main outcome uses the fraction of calories from staple food and the 84% cutoff from Jensen and Miller (2010); given Vietnam and China are relatively more similar, we do not observe such a problem in our approach, as seen in Figure 4.2b.

15 For this reason, we must pick a higher HDDS cutoff for food-insecure households for our analysis.

(a) HDDS by income quantile (b) Staple calorie share by income quantile
Figure 4.2: Food insecurity measures by income group, 2010–2018
Note: The figure shows the average and standard deviation of the Household Dietary Diversity Score (HDDS) and the staple calorie share for ten income quantiles. For the HDDS measure, we also show the common cutoffs to identify food-insecure households (e.g., Vaitla et al., 2015, 2017; Lentz et al., 2019). For the staple calorie share, we show the food-insecure cutoff estimated by Jensen and Miller (2010) for Wuhan, China, which we also use in this study.

In order to check the robustness of our regression estimation, we also use the food insecurity measures from self-reporting and the dietary diversity score as alternative outcomes, both of which are binary variables. The self-reported food insecurity variable is equal to one if a household does not have two meals per day for one month or more, and zero otherwise.
There is no universal cutoff for the HDDS to define a food-insecure household based on the dietary diversity score, but the official guideline of HDDS suggests that a meaningful target level of diversity can be set based on the average score of the richest 33% of households (Swindale and Bilinsky, 2006). We follow this guideline and set the cutoff value for HDDS at 10; that is, the HDDS-based food insecurity variable is one if the dietary diversity score is below 10, and zero otherwise. This cutoff, however, likely results in an overestimated share of food-insecure households because this definition implies that only the richest 33% of households are not food-insecure.

To see how different food insecurity measures actually differ in our data, we use the 2010–2018 VHLSS to map the province-level share of food-insecure households over time using the (a) self-reported measure, (b) HDDS-based measure, and (c) staple calorie share-based measure in Figure 4.3. For comparison, in panel (d) we also map each province’s share of poor households, defined as those with a monthly income per capita lower than the national poverty line. We observe that provinces with a higher share of poverty also have a higher share of food-insecure households across all three measures. Provinces in remote and mountainous areas, including the Northeast, Northwest, and Central Highlands, tend to have higher poverty and food insecurity rates.

There are stark differences between different measures of food insecurity. On one hand, the rates of self-reported food insecurity tend to be substantially lower than the poverty rates across provinces and years, confirming the selection bias problem that we discuss above. On the other hand, the rates of HDDS-based food insecurity tend to be substantially higher than the poverty rates, while the staple calorie share-based food insecurity rates track closely with the poverty rates.
We also observe that poverty declines over time;16 at the same time, we observe that the self-reported food insecurity rises (especially in 2016)17 while the HDDS-based food insecurity stays relatively stable. In contrast, changes in food insecurity based on the staple calorie share approach track the movement of the poverty rates. This suggests that measuring food insecurity using the staple calorie share approach is valid in the context of Vietnam. We further discuss the validity of this measure in Appendix C.2.

16 The rapid poverty reduction can be explained by a booming export sector and rising domestic demand (The World Bank, 2018).

17 The rise in self-reported food insecurity might be due to changes in how the government classified poor households in 2016 (Decision 59/2015/QD-TTg).

(a) Share of food-insecure households using self-reported data (b) Share of food-insecure households using the HDDS (c) Share of food-insecure households using staple calorie share (d) Share of households below the poverty line
Figure 4.3: Food insecurity and poverty by province and year

4.3.3 Econometric Model

We propose a simple approach to predict food insecurity risk caused by major income shocks using past household living standard surveys and external information about sector-specific income shocks, which is typically reported by the government or international organizations. Although we focus on income shocks during a pandemic, this method can also be applied to income shocks during other emergencies. Let $Y_{it}$ denote the binary outcome variable that indicates whether household $i$ is food insecure in period $t$, and consider the following linear probability model:

$$\Pr(Y_{it} = 1 \mid \text{income}_{it}, Z_{it}) = \beta \cdot \text{income}_{it} + Z_{it}\theta, \tag{4.1}$$

where $\Pr(Y_{it} = 1 \mid \text{income}_{it}, Z_{it})$ is the conditional probability of being food insecure; $\text{income}_{it}$ denotes the household’s real income in period $t$; $Z_{it}$ denotes other time-varying household-level factors driving food insecurity; and $\beta$ represents the causal effect of the household’s income on food insecurity.

Assume that household $i$ provides labor in sector $s$ and receives a salary of $e_{ist}$ in period $t$. Then household income is defined as the sum of the salary received in each sector $s$ in which the household provides labor: $\text{income}_{it} = \sum_{s} e_{ist}$. To understand how a major income shock can affect food insecurity, we first assume that the pandemic changes the salary of an average worker in sector $s$ by a proportion $z_s$. Let $\text{income}^{pre}_{it}$ and $\text{income}^{post}_{it}$ denote the household’s income in the pre-shock and post-shock periods, respectively. Then the post-shock income can be written as $\text{income}^{post}_{it} = \sum_{s} (1 + z_s) e_{ist}$. One can obtain $z_s$ from external information about sector-specific income shocks caused by the pandemic. For example, as previously stated, this paper uses the World Bank’s report on the economic impacts of the pandemic in Vietnam.

The process to predict changes in food insecurity at some locality level involves two steps. In the first step, we use a regression to estimate $\beta$ and $\theta$ on a household-level dataset and denote these estimates as $\hat{\beta}$ and $\hat{\theta}$; we refer to this as the “estimation” step. In the second step, we generate locality-level pre- and post-pandemic food insecurity risk and calculate the difference in order to measure changes in food insecurity; we refer to this as the “prediction” step.
Specifically, we combine $\hat{\beta}$ and $\hat{\theta}$ with $\text{income}^{pre}_{it}$, $\text{income}^{post}_{it}$, and $Z_{it}$ to generate the pre- and post-pandemic predicted probability of food insecurity at the household level:

\[ \widehat{\Pr}(Y_{it} = 1)^{pre/post} = \hat{\beta} \cdot \text{income}^{pre/post}_{it} + Z_{it}\hat{\theta}, \tag{4.2} \]

where $\widehat{\Pr}(Y_{it} = 1)^{pre/post}$ is the predicted probability of food insecurity and reflects the “risk” that household $i$ is food-insecure; that is, households with a higher value are more likely to be food insecure.

Although the predicted probability of food insecurity is a useful indicator of the food insecurity shock, it is perhaps more policy-relevant to predict the change in the share of food-insecure households that results from the income shock during the pandemic. Since the prediction step in Equation 4.2 only provides a continuous predicted probability of being food-insecure, we can only identify households with a high risk of being food-insecure. We therefore choose a threshold $c$ such that a household is classified as “high-risk” when its predicted probability exceeds the threshold. This additional step allows us to measure how the share of “high-risk” households changes due to the income shock. Choosing a threshold to map predicted probability values to a binary category, i.e. high-risk or not, is a standard step in the predictive modeling literature (James et al., 2013).

It is important to note that this threshold is applied to the predicted probability of food insecurity to classify a “high-risk” household. This classifying threshold is not the same as the cutoff applied to the staple calorie share to define a food-insecure household. The criterion for picking an optimal threshold for the predicted probability depends on the goal of the prediction.
Our goal is to obtain a share of “high-risk” households that is as close as possible to the share of households that are actually food-insecure; the former is often known as the predicted prevalence and the latter as the observed prevalence (Freeman and Moisen, 2008). Freeman and Moisen (2008) find that two approaches best satisfy this criterion. The first approach is choosing a threshold to minimize the difference between the predicted prevalence and the observed prevalence. The second approach is choosing a threshold to maximize Cohen's Kappa statistic, which measures how close the predicted status and the actual status of food insecurity are, after accounting for the fact that they might be the same due to chance.18

18 Another common approach in machine learning prediction is choosing a threshold to maximize the true positive rate minus the false positive rate (also known as Youden's J statistic). However, this approach is more appropriate for evaluating diagnostic tests because it tends to overestimate the true prevalence when the event is rare (Freeman and Moisen, 2008).

Formally, let $h^c_{it}$ denote the “high-risk” status of a household, i.e. the predicted food insecurity status, defined as $h^c_{it} = 1\{\widehat{\Pr}(Y_{it} = 1) > c\}$, while $Y_{it}$ is the true food insecurity status. The predicted prevalence of threshold $c$, $PP_c$, is the probability of cases being predicted by threshold $c$ as food-insecure, $PP_c = \Pr(h^c = 1)$, and the observed prevalence ($OP$) is the probability of a household being food-insecure, $OP = \Pr(Y = 1)$. The first approach is to select a threshold to minimize $d_c = |PP_c - OP|$, which is the difference between the predicted and observed prevalence rates.

The second approach is to choose the threshold with the largest Cohen's Kappa statistic, which is measured by

\[ \kappa_c = \frac{\Pr(h^c = Y) - \Pr(h^c = Y \mid h^c \perp Y)}{1 - \Pr(h^c = Y \mid h^c \perp Y)}, \]

where $\Pr(h^c = Y)$ is the probability that we correctly predict a case to be food-insecure or not, which measures the accuracy of the prediction. $\Pr(h^c = Y \mid h^c \perp Y)$ is the probability that a case is correctly predicted when $h^c$ and $Y$ are independent of each other, which measures the accuracy by chance. Intuitively, Cohen's Kappa measures the accuracy of the prediction after removing the part of the accuracy that happens due to chance (Ben-David, 2007, 2008).

In the predictive modeling literature, Cohen's Kappa can also be written in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Specifically, the accuracy of the prediction is measured by the proportion of cases that are either true positive or true negative, $\Pr(h^c = Y) = \frac{TP_c + TN_c}{N}$, where $N$ is the sample size. The accuracy by chance is $\Pr(h^c = Y \mid h^c \perp Y) = \Pr(h^c = 1) \cdot \Pr(Y = 1) + \Pr(h^c = 0) \cdot \Pr(Y = 0)$, which can be written as:

\[ \Pr(h^c = Y \mid h^c \perp Y) = \frac{TP_c + FP_c}{N} \cdot \frac{TP_c + FN_c}{N} + \frac{TN_c + FN_c}{N} \cdot \frac{TN_c + FP_c}{N}. \]

We now turn to the details of the estimation step. The simplest approach would be to use OLS to estimate the following linear probability model:

\[ Y_{it} = \beta \cdot \text{income}_{it} + Z_{it}\theta + \text{province}_p \times \text{year}_t + \epsilon_{it}, \tag{4.3} \]

where $Y_{it}$ is the binary variable for the household's food insecurity status, $Z_{it}$ denotes the time-varying household characteristics, and $\text{province}_p \times \text{year}_t$ denotes province-year fixed effects.19

Two potential biases may arise with the OLS estimation. The first bias comes from unobserved factors that affect both income and food insecurity, such as changes in household demographics or employment. The second bias is due to food insecurity simultaneously affecting household income by lowering productivity. Although controlling for household covariates and different levels of fixed effects may account for unobserved confounders, it is not sufficient to address the simultaneity problem.
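The two threshold-selection criteria described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used for the analysis: the function names (`prevalence_gap`, `cohen_kappa`, `select_thresholds`), the use of NumPy, and the percentile grid of candidate thresholds are assumptions introduced here. The sketch takes observed food insecurity status and predicted probabilities, and returns the threshold that minimizes the prevalence gap and the threshold that maximizes Cohen's Kappa.

```python
import numpy as np


def prevalence_gap(y_true, p_hat, c):
    """|PP_c - OP|: gap between predicted and observed prevalence at threshold c."""
    high_risk = (np.asarray(p_hat) > c).astype(int)
    return abs(high_risk.mean() - np.asarray(y_true).mean())


def cohen_kappa(y_true, p_hat, c):
    """Cohen's Kappa at threshold c, written in terms of TP, TN, FP, FN."""
    y = np.asarray(y_true)
    h = (np.asarray(p_hat) > c).astype(int)
    tp = np.sum((h == 1) & (y == 1))
    tn = np.sum((h == 0) & (y == 0))
    fp = np.sum((h == 1) & (y == 0))
    fn = np.sum((h == 0) & (y == 1))
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Accuracy expected by chance if h and Y were independent.
    chance = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    if chance == 1.0:
        # Degenerate case (a single class everywhere): Kappa is undefined, return 0.
        return 0.0
    return (accuracy - chance) / (1.0 - chance)


def select_thresholds(y_true, p_hat, percentiles=range(0, 100)):
    """Return (threshold minimizing the prevalence gap, threshold maximizing Kappa).

    Candidate thresholds are percentiles of the predicted probability
    (an assumed search grid for this illustration).
    """
    candidates = np.percentile(p_hat, list(percentiles))
    gaps = [prevalence_gap(y_true, p_hat, c) for c in candidates]
    kappas = [cohen_kappa(y_true, p_hat, c) for c in candidates]
    return candidates[int(np.argmin(gaps))], candidates[int(np.argmax(kappas))]
```

Calling `select_thresholds(y, p_hat)` on a household-level sample returns the two candidate cutoffs; when they coincide, both criteria agree on the same threshold.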
We therefore use an instrumental variables (IV) approach to address the simultaneity bias. Specifically, household income is instrumented by local exposure to national employment growth across different industries. Because exposure to employment shocks affects household income (through household employment), households change their food consumption accordingly. This is a form of shift-share (or Bartik) instrument (Goldsmith-Pinkham et al., 2020). The IV is constructed as follows:

\[ IV_{dt} = \sum_{j} w_{bdj} \, \frac{L_{tj} - L_{bj}}{L_{bj}}, \]

where $w_{bdj}$ denotes the employment share of industry $j$ in district $d$ in the baseline period $b$,20 $L_{tj}$ denotes the national employment of industry $j$ in period $t$, and $L_{bj}$ denotes the national employment of industry $j$ in the baseline period $b$. Note that $(L_{tj} - L_{bj}) / L_{bj}$ is the national employment growth rate of industry $j$ in period $t$ relative to the baseline period $b$. Our study period is between 2010 and 2018, and the baseline year is 2009.

We estimate the following equation using a two-stage least squares (2SLS) approach:

\[ \text{food insecurity}_{it} = \beta \cdot \widehat{\text{income}}_{it} + Z_{it}\theta + J_{d,2009}\eta + \text{province}_p \times \text{year}_t + \epsilon_{it}, \tag{4.4} \]

and the first-stage equation is

\[ \text{income}_{it} = \beta \cdot IV_{dt} + Z_{it}\zeta + J_{d,2009}\alpha + \text{province}_p \times \text{year}_t + \epsilon_{it}, \tag{4.5} \]

where $J_{d,2009}$ denotes district $d$'s characteristics in 2009, which include demographics (gender and marriage), economics (wealth and unemployment), and education (college educated), as control variables.

19 Controlling for province-year fixed effects is preferred for the purpose of estimation, but doing so will not generate parameter estimates for future years such as 2020. Therefore, controlling for province-specific linear trends is more suitable for the purpose of prediction (Auffhammer and Steinhauser, 2012; Newell et al., 2018). In the next section, we show that both approaches yield similar estimates.
20 For Vietnam, industries are classified using the three-digit ISIC system.
This allows us to account for baseline factors that may affect both local exposure to employment shocks and changes in food insecurity.

In a two-industry, one-period scenario, the instrument measures the variation of local exposure, measured by district-industry employment share in the baseline, to the national employment growth of one industry relative to another (Goldsmith-Pinkham et al., 2020). The research design thus captures the effect of the district-industry employment share on changes in food insecurity relative to the baseline period. In a multiple-period scenario, the effects of the baseline share on the outcome are scaled by the national growth of employment. The research design is analogous to a dose-response design, as we compare differential outcome changes in districts with different baseline shares of employment. Goldsmith-Pinkham et al. (2020) further show that in a multiple-industry, multiple-period scenario, the Bartik instrument is numerically equivalent to using multiple instruments that are baseline shares of different industries. The identifying assumptions for the estimates using a Bartik instrument to be consistent are that (1) the baseline district-industry share of employment is conditionally exogenous to changes in food insecurity, and (2) the baseline share only affects changes in food insecurity through the endogenous variable, which is household income (conditional on control variables).

We assess these assumptions in different ways. The IV exogeneity assumption is violated when there are other district-level characteristics in 2009 affecting both district-industry employment shares and changes in food insecurity. Therefore, we include control variables for various district-level demographic, economic, and educational characteristics in 2009 in Equations 4.4 and 4.5.
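As a concrete illustration of the shift-share construction defined earlier in this section, the sketch below computes the instrument for each district as the baseline-share-weighted sum of national industry employment growth rates. It is a minimal sketch under stated assumptions: the function name `bartik_iv`, the array layout (districts in rows, industries in columns), and the toy numbers are hypothetical and are not taken from the data used in this chapter.

```python
import numpy as np


def bartik_iv(base_shares, base_emp, emp_t):
    """Shift-share instrument for one period t.

    base_shares : (D, J) array, employment share of industry j in district d
                  in the baseline period (w_bdj).
    base_emp    : (J,) array, national employment by industry in the baseline (L_bj).
    emp_t       : (J,) array, national employment by industry in period t (L_tj).
    Returns a (D,) array with IV_dt = sum_j w_bdj * (L_tj - L_bj) / L_bj.
    """
    base_shares = np.asarray(base_shares, dtype=float)
    base_emp = np.asarray(base_emp, dtype=float)
    emp_t = np.asarray(emp_t, dtype=float)
    growth = (emp_t - base_emp) / base_emp  # national growth rate by industry
    return base_shares @ growth


# Toy example (hypothetical numbers): two districts, two industries.
shares = np.array([[0.8, 0.2],   # district 1 is mostly employed in industry 1
                   [0.3, 0.7]])  # district 2 is mostly employed in industry 2
iv = bartik_iv(shares, base_emp=[100.0, 200.0], emp_t=[110.0, 180.0])
```

In this toy example industry 1 grows and industry 2 shrinks nationally, so the district concentrated in industry 1 receives a positive instrument value and the other district a negative one; only the baseline shares differ across districts, which is the source of identifying variation discussed above.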
These baseline characteristics can explain up to 94% of the variation in the baseline industry-district employment share (see Table 4.4), which means controlling for these factors will account for most of the endogeneity concerns with the IV (Goldsmith-Pinkham et al., 2020). Furthermore, we follow the recommendations in Goldsmith-Pinkham et al. (2020) by estimating the IV model using baseline industry-district employment shares as multiple IVs in order to conduct an overidentification test. That is, we estimate Equation 4.3 where the first-stage equation is

\[ \text{income}_{it} = \sum_{t} \sum_{j} \gamma_{jt} \cdot (w_{bdj} \times \text{year}_t) + Z_{it}\zeta + J_{d,2009}\alpha + \text{province}_p \times \text{year}_t + \epsilon_{it}, \tag{4.6} \]

and $w_{bdj} \times \text{year}_t$ is the industry-district employment share in 2009 times the year dummy variables. Because there are multiple instruments, we use different estimators, including generalized method of moments (GMM) and limited information maximum likelihood (LIML), since they are more appropriate when dealing with many instruments. We use the Sargan-Hansen overidentification test for the 2SLS estimator and the Anderson-Rubin overidentification test for the GMM and LIML estimators.

The exclusion restriction assumption is not testable in this study, but it likely holds because employment shocks can only affect household food insecurity through changes in household income. There are two potential concerns about the exclusion restriction assumption: (1) the employment shocks may shift the market demand for food and change the market food price, which in turn affects household food consumption decisions; and (2) the employment shock in the agriculture sector may affect agricultural wages, which in turn will also affect food prices. We account for local market conditions, including food prices, by controlling for province-year fixed effects.
We also check for potential violations: we construct a “non-agriculture Bartik IV” that only considers non-agriculture sectors for the shift-share IV, and, in the next section, we compare the results using this IV with the main results.

Our approach has the following caveats. First, the estimation step uses a linear probability model, so the predicted probability is not bounded between 0 and 1; this feature is undesirable compared to standard binary dependent variable models such as logistic or probit regression. An alternative approach is to use an IV probit model, which provides a predicted probability of food insecurity that is bounded between 0 and 1. However, this control function estimator is not preferred because it requires that the first-stage model is correctly specified. In other words, if we do not include the correct set of instruments, the control function estimator is no longer consistent (Lewbel et al., 2012).21 In contrast, a true instrumental variable estimator (such as the linear IV approach that we use) only requires that the instrumental variable is correlated with the endogenous regressor and uncorrelated with the error term; it does not impose any structural assumptions on the errors in the first-stage regression. When we do not include all of the right instruments in the first stage, the estimator is less efficient but still consistent (Lewbel et al., 2012). Therefore, making predictions based on an IV probit approach will lead to the same problem as the OLS approach unless we know the correct set of instruments in the first stage. In Section 4.5.2, we show that our linear IV approach actually outperforms the IV probit approach in out-of-sample prediction despite the unbounded predicted probability problem.

21 This is true even when the IV probit is estimated using maximum likelihood.
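To make the prediction step in Equation 4.2 and the first caveat concrete, the following sketch applies hypothetical coefficients to pre- and post-shock incomes and computes the share of “high-risk” households at a given threshold. Everything numeric here is an assumption for illustration (the coefficient values, the sector shocks, the threshold, and the income units), as are the function names; none of it reproduces the estimates reported in this chapter.

```python
import numpy as np


def post_shock_income(sector_earnings, sector_shocks):
    """income_post = sum_s (1 + z_s) * e_is, for an (H, S) earnings matrix."""
    return np.asarray(sector_earnings, float) @ (1.0 + np.asarray(sector_shocks, float))


def predicted_risk(income, Z, beta_hat, theta_hat):
    """Linear probability prediction: beta_hat * income + Z @ theta_hat."""
    return beta_hat * np.asarray(income, float) + np.asarray(Z, float) @ np.asarray(theta_hat, float)


def high_risk_share(risk, c):
    """Share of households whose predicted probability exceeds threshold c."""
    return float(np.mean(np.asarray(risk) > c))


# Hypothetical inputs: three households, two sectors.
earnings = np.array([[1.0, 0.0],
                     [0.5, 0.5],
                     [0.0, 3.0]])
shocks = np.array([-0.20, -0.10])   # assumed proportional earnings changes z_s
Z = np.ones((3, 1))                 # only an intercept, for simplicity
beta_hat, theta_hat = -0.4, [1.0]   # assumed coefficients, not estimates

income_pre = earnings.sum(axis=1)
income_post = post_shock_income(earnings, shocks)
risk_pre = predicted_risk(income_pre, Z, beta_hat, theta_hat)
risk_post = predicted_risk(income_post, Z, beta_hat, theta_hat)

# Change in the share of "high-risk" households at an assumed threshold c = 0.65.
change = high_risk_share(risk_post, 0.65) - high_risk_share(risk_pre, 0.65)
```

In this toy setup the highest-income household receives a negative predicted probability, which illustrates why a linear probability model can leave the unit interval; the thresholding step is unaffected because it only compares predicted values with the cutoff.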
More importantly, since our prediction approach relies on income shocks induced by changes in the labor market for the IV estimation, it implicitly assumes that such shocks are similar to the income shocks caused by the pandemic. If the two shocks are very different from one another, then the IV-based prediction may no longer be accurate. Specifically, the income shocks from our IV estimates are broad changes in income due to national labor demand shocks, while the income shocks caused by the pandemic might be driven by changes in employment, especially among more vulnerable workers such as those in the informal sector, part-time employees, or workers with lower education or less experience. We compare the prediction from our approach with the food insecurity measure generated from the actual income data during the pandemic in Appendix C.4 and find that our prediction is highly accurate, suggesting that a violation of this assumption is not a major empirical concern.

Another caveat of our approach is that the household income function does not include non-labor income such as remittances and social transfers, which can play a major role in shielding households from food insecurity. Although the approach can easily be extended to account for such features, we choose not to include these variables because such data are not always available, especially in labor force surveys. External information about changes in remittances is also unlikely to be available. Therefore, we interpret our prediction as a lower bound of the actual food insecurity shock when remittances fall, as they did during this pandemic (The World Bank, 2020).

4.4 Estimating the effect of household income on food insecurity

In this section, we present the results from estimating the relationship between household income and food insecurity using the 2010–2018 VHLSS data for the OLS and IV approaches.
Specifically, we use OLS to estimate Equation 4.3 and 2SLS to estimate Equations 4.4 and 4.5 and report the results in Section 4.4.1. We then discuss different validity assessments in Section 4.4.2.

4.4.1 Regression Results

Table 4.2 provides the regression estimates for the effect of household income on food insecurity. Panel A reports the estimates from the OLS approach, and panel B reports the estimates from the IV approach. We consider controlling for different levels of fixed effects to account for any unobserved heterogeneity. For the models in columns 1, 2, and 3, we alternatively control for province, district, and household fixed effects. The household fixed effects model is run on a subsample that forms an unbalanced household panel, which is why there are fewer observations. The model in column 4 controls for province and year fixed effects, and the model in column 5 also controls for province-specific linear trends. In column 6, the model controls for province-by-year fixed effects.

All models control for household characteristics: urban, household size, and the fraction of households with postsecondary education. We also control for district demographic and economic characteristics in 2009 that might correlate with food insecurity and district-industry employment shares in 2009. These characteristics include gender, marital status, college degree holders, disability, immigration status, average wealth,22 and unemployment rate. In the household fixed effects model, these time-invariant control variables are not included. For the OLS estimations, we cluster the standard errors at the commune-year level to account for the survey's sampling design. In the IV estimation, we cluster the standard errors at the district level to account for district-level exposure to the labor market shocks (Abadie et al., 2017).
22 Wealth is estimated using principal component analysis on electricity, piped water, air conditioner, computer, washing machine, refrigerator, television, and radio.

Table 4.2: Estimates for income effects on food insecurity

Specification                       (1)        (2)        (3)        (4)        (5)        (6)
Panel A: OLS approach               -0.124***  -0.111***  -0.071***  -0.113***  -0.112***  -0.112***
                                    (0.003)    (0.003)    (0.006)    (0.003)    (0.003)    (0.003)
                                    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]
N                                   46443      46443      27657      46443      46443      46443
Panel B: Bartik IV approach         -0.263***  -0.222***  -0.202***  0.182      -0.349***  -0.396***
                                    (0.018)    (0.015)    (0.026)    (0.231)    (0.132)    (0.130)
                                    [0.000]    [0.000]    [0.000]    [0.430]    [0.008]    [0.002]
Cragg-Donald F-stat                 381.86     372.58     632.82     4.54       6.99       7.94
Montiel Olea and Pflueger F-stat    2.0e+05    1.9e+05               2336.57    3599.18    4087.31
N                                   46443      46443      24499      46443      46443      46443
Additional controls
Province FE                         ✓                                ✓          ✓          ✓
District FE                                    ✓
Household FE                                              ✓
Year FE                                                              ✓                     ✓
Province FE x linear trends                                                     ✓
Province x Year FE                                                                         ✓

The table reports results from OLS and IV estimations for the effect of household income on food insecurity in Equation 4.3. IV estimation instruments household income with Bartik IV (see Equation 4.5). All models control for urban, household size, and the fraction of households with postsecondary education, and 2009 district characteristics (gender, marital status, college degree, immigration status, disability, average wealth index, and unemployment rate). The household fixed effects model does not include 2009 controls. Standard errors are reported in parentheses, and p-values are reported in brackets. OLS standard errors are clustered at the commune-year level, IV standard errors are clustered at the district level, and household FE estimations are estimated on a subsample of households that form an unbalanced panel with robust standard errors. *** p < 0.01, ** p < 0.05, * p < 0.1

We first compare the estimates from the province fixed effects, district fixed effects, and household fixed effects models in columns 1 to 3.
The IV estimates are larger in magnitude than the OLS estimates in each of these specifications. In column 1, the OLS estimate is -0.124 and the IV estimate is -0.263 when controlling for province fixed effects. In column 2, the estimates are -0.111 for OLS and -0.222 for IV when controlling for district fixed effects. In column 3, the estimates are -0.071 for OLS and -0.202 for IV when controlling for household fixed effects (note that this is estimated on a subsample that forms an unbalanced household panel).

Given that the estimates do not vary considerably across different levels of fixed effects, we focus on models that control for province fixed effects and secular trends. Because the variation of the shift-share IV is at the district-year level, controlling for district fixed effects or household fixed effects and trends would absorb all of this variation. We compare the models that control for province fixed effects and year fixed effects (column 4), province fixed effects and province-specific linear trends (column 5), and province, year, and province-by-year fixed effects (column 6). In column 4, the estimates are -0.113 for the OLS approach and 0.182 (statistically insignificant) for the IV approach. In column 5, the estimates are -0.112 for OLS and -0.349 for IV. In column 6, the estimates are -0.112 for OLS and -0.396 for IV. The specifications in column 4 only control for trends at the national level, while the specifications in columns 5 and 6 control for trends at the province level. Controlling for province-specific linear trends and controlling for province-by-year fixed effects yield similar estimates. These results suggest that the main model is correctly specified.

Our findings are generally consistent across different measures of food insecurity. We estimate our models using the self-reported and HDDS-based food insecurity outcomes. The results are reported in Table 4.3.
As mentioned before, using self-reported outcomes severely underestimates the effect of income on food insecurity because only poor households are asked to report their food insecurity status; the estimates also change sign when controlling for time trends or year fixed effects. The results for the HDDS-based measure are qualitatively similar to our main findings.

4.4.2 Assessing the Validity of the Shift-share Instrument

To assess the validity of the IV approach, we first assess the exogeneity assumption of the shift-share IV. Following Goldsmith-Pinkham et al. (2020), we identify district-level factors that are correlated with the initial industry shares of employment in 2009. We consider the district-level wealth index; the population shares of those who are female, married, college educated, disabled, and immigrants; and the unemployment rate, which is calculated on the sample of 15-to-69-year-olds using the 2009 census data. Table 4.4 presents the estimates from regressions of the initial industry shares of labor in 2009 on these covariates. The R-squared values imply that the covariates explain between 35% and 94% of the variation in the 2009 industry shares of employment. We control for these variables to avoid model misspecification due to omitted variables.23

23 See Goldsmith-Pinkham et al. (2020) for a detailed discussion on this approach.
Table 4.3: Estimates for income effects on alternative food insecurity measures

Specification                       (1)        (2)        (3)        (4)        (5)        (6)
Panel A: Self-reported food insecurity outcome
OLS approach                        -0.039***  -0.034***  -0.014***  -0.037***  -0.037***  -0.037***
                                    (0.002)    (0.002)    (0.003)    (0.002)    (0.002)    (0.002)
                                    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]
N                                   46443      46443      27709      46443      46443      46443
IV approach                         -0.051***  -0.057***  -0.050***  0.204      0.081      0.053
                                    (0.008)    (0.006)    (0.014)    (0.144)    (0.083)    (0.064)
                                    [0.000]    [0.000]    [0.000]    [0.157]    [0.330]    [0.406]
Cragg-Donald F-stat                 381.86     372.58     631.13     4.54       6.99       7.94
Montiel Olea and Pflueger F-stat    2.0e+05    1.9e+05               2336.57    3599.18    4087.31
N                                   46443      46443      24570      46443      46443      46443
Panel B: HDDS-based food insecurity outcome
OLS approach                        -0.126***  -0.106***  -0.067***  -0.124***  -0.124***  -0.124***
                                    (0.003)    (0.003)    (0.008)    (0.004)    (0.004)    (0.004)
                                    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]
N                                   46443      46443      27707      46443      46443      46443
IV approach                         -0.242***  -0.134***  -0.118***  -1.271**   -1.027***  -1.012***
                                    (0.026)    (0.019)    (0.034)    (0.525)    (0.330)    (0.304)
                                    [0.000]    [0.000]    [0.000]    [0.016]    [0.002]    [0.001]
Cragg-Donald F-stat                 381.86     372.58     669.48     4.54       6.99       7.94
Montiel Olea and Pflueger F-stat    2.0e+05    1.9e+05               2336.57    3599.18    4087.31
N                                   46443      46443      24587      46443      46443      46443
Additional controls
Province FE                         ✓                                ✓          ✓          ✓
District FE                                    ✓
Household FE                                              ✓
Year FE                                                              ✓                     ✓
Province FE x linear trends                                                     ✓
Province x Year FE                                                                         ✓

The table reports results from OLS and IV estimations for the effect of household income on food insecurity in Equation 4.3. IV estimation instruments household income with Bartik IV (see Equation 4.5). All models control for urban, the household size, the fraction of households with a postsecondary education, and 2009 district characteristics (gender, marital status, college degree, immigration status, disability, average wealth index, and unemployment rate). The household fixed effects model does not include 2009 controls. Standard errors are reported in parentheses, and p-values are reported in brackets. OLS standard errors are clustered at the commune-year level, IV standard errors are clustered at the district level, and household fixed effects estimations are estimated on a subsample of households that form an unbalanced panel with robust standard errors. *** p < 0.01, ** p < 0.05, * p < 0.1

Given that the 2SLS estimator with the Bartik IV is numerically equivalent to the GMM estimator that uses industry shares of employment as instruments, we test the validity of the Bartik instrument by estimating a regression with multiple IVs, namely the initial industry-district shares of employment times year dummy variables, and conducting an overidentification test on this regression, as suggested in Goldsmith-Pinkham et al. (2020). That is, we estimate Equation 4.3 where household income is instrumented as indicated in Equation 4.6. Table 4.5 presents the results. We note that (1) our overidentified IV estimates are very similar to the main findings, and (2) the overidentification tests fail to reject the null hypothesis after including the district-level factors as controls. This evidence suggests that the main model is correctly specified.

Second, we assess the exclusion restriction assumption by re-estimating the model using the non-agriculture Bartik IV. Table 4.6 presents the results. The estimates using the alternative Bartik IV are very similar to the estimates from the main results. This suggests that the agriculture component of the Bartik IV does not independently affect our main results.

4.5 Predicting Food Insecurity Risks in Vietnam

In this section, we apply the prediction approach proposed in Section 4.3.3 to make predictions about changes in food insecurity risk in Vietnam. First, we use the 2010-2018 VHLSS to select the optimal classifier threshold in Section 4.5.1.
As discussed in Section 4.3.3, we need to select a threshold to classify the “high-risk” households based on the predicted probability of food insecurity. In Section 4.5.2, we assess the out-of-sample predictive accuracy of our IV-based approach and the standard OLS-based approach by comparing the predictions for 2016 and 2018 with the actual food insecurity data for these two years. In Section 4.5.3, we predict food insecurity changes due to income shocks during the pandemic for 702 districts in Vietnam. In Section 4.5.4, we discuss how these results can inform policymakers and international organizations about where to prioritize assistance.

Table 4.4: Relationship between industry-district employment shares and district-level characteristics in 2009

Outcome             Agriculture  Mining       Manufact.    Utility      Construction Retail       Logistics    Hospitality  Media        Finance
                    (1)          (2)          (3)          (4)          (5)          (6)          (7)          (8)          (9)          (10)
Urban               -0.20***     0.00***      0.04**       0.00***      0.01**       0.04**       0.02**       0.03**       -0.00***     0.00***
                    (0.02)       (0.01)       (0.01)       (0.00)       (0.01)       (0.01)       (0.00)       (0.00)       (0.00)       (0.00)
Wealth              -0.21***     0.02**       0.07*        0.00***      0.01**       0.06*        0.02**       0.03**       -0.00***     0.00***
                    (0.02)       (0.01)       (0.01)       (0.00)       (0.01)       (0.01)       (0.00)       (0.00)       (0.00)       (0.00)
Female share        -2.22***     -0.33***     1.38         -0.02***     0.53         0.37         0.01**       0.14         0.01***      0.04**
                    (0.40)       (0.12)       (0.24)       (0.02)       (0.12)       (0.12)       (0.05)       (0.06)       (0.01)       (0.01)
Married share       -0.29***     0.11         0.15         0.03**       0.19         -0.10***     -0.02***     -0.11***     0.01***      0.00***
                    (0.16)       (0.05)       (0.09)       (0.01)       (0.05)       (0.05)       (0.02)       (0.02)       (0.00)       (0.00)
College+            0.62         -0.12***     -0.91***     0.01**       -0.10***     -0.16***     -0.07***     -0.11***     0.12         0.10
                    (0.13)       (0.04)       (0.08)       (0.01)       (0.04)       (0.04)       (0.02)       (0.02)       (0.00)       (0.00)
Disability          -0.43***     0.05*        -0.49***     -0.05***     0.74         0.01**       0.03**       0.05**       0.04**       0.02**
                    (0.71)       (0.22)       (0.42)       (0.03)       (0.21)       (0.21)       (0.09)       (0.10)       (0.02)       (0.02)
Immigration status  -1.22***     0.05*        1.03         0.02**       0.14         0.08*        0.02**       0.01**       -0.02***     -0.01***
                    (0.10)       (0.03)       (0.06)       (0.00)       (0.03)       (0.03)       (0.01)       (0.01)       (0.00)       (0.00)
Unemployed          -4.45***     0.26         0.59         0.09*        0.67         1.27         0.60         0.50         0.00***      -0.00***
                    (0.51)       (0.16)       (0.30)       (0.02)       (0.15)       (0.15)       (0.07)       (0.08)       (0.01)       (0.01)
R-squared           0.92         0.35         0.79         0.69         0.58         0.88         0.85         0.87         0.92         0.94
Observations        703          703          703          703          703          703          703          703          703          703

Outcome             Real         Science      Admin        Govern.      Education    Health       Entertain.   Other        Household
                    Estate                                                                                     Services     workers
                    (11)         (12)         (13)         (14)         (15)         (16)         (17)         (18)         (19)
Urban               -0.00***     -0.00***     -0.00***     0.02**       0.02**       0.01***      0.01***      0.01***      0.00***
                    (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)
Wealth              0.00***      -0.00***     0.00***      -0.01***     -0.01***     0.00***      0.00***      0.01***      0.00***
                    (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)
Female share        0.01***      0.02**       0.02**       -0.11***     -0.02***     0.03**       0.07*        0.03**       0.04**
                    (0.01)       (0.01)       (0.01)       (0.05)       (0.04)       (0.02)       (0.02)       (0.01)       (0.01)
Married share       -0.01***     -0.01***     -0.01***     0.11         -0.00***     0.01***      -0.01***     -0.03***     -0.02***
                    (0.00)       (0.01)       (0.00)       (0.02)       (0.02)       (0.01)       (0.01)       (0.01)       (0.01)
College+            0.01**       0.12         0.02**       0.23         0.22         0.05**       -0.01***     -0.03***     0.02**
                    (0.00)       (0.00)       (0.00)       (0.02)       (0.01)       (0.01)       (0.01)       (0.00)       (0.00)
Disability          0.01***      0.03**       0.01**       -0.10***     -0.09***     -0.03***     0.04**       0.10*        0.06*
                    (0.01)       (0.02)       (0.01)       (0.09)       (0.07)       (0.03)       (0.03)       (0.03)       (0.02)
Immigration status  0.02**       -0.01***     0.01***      -0.04***     -0.06***     -0.02***     0.00***      0.01***      -0.01***
                    (0.00)       (0.00)       (0.00)       (0.01)       (0.01)       (0.00)       (0.00)       (0.00)       (0.00)
Unemployed          0.02**       0.02**       0.02**       0.12         0.09*        0.05*        0.01**       0.15         0.01**
                    (0.01)       (0.02)       (0.01)       (0.07)       (0.05)       (0.02)       (0.02)       (0.02)       (0.02)
R-squared           0.76         0.90         0.88         0.67         0.75         0.77         0.66         0.89         0.72
Observations        703          703          703          703          703          703          703          703          703

The table reports results from estimating a regression of industry-district shares of employment (as outcome) on district-level characteristics in 2009. Standard errors are in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1

4.5.1 Choosing the Optimal Classifier Threshold

Following the estimation step, we generate the predicted probability of food insecurity on the 2010-2018 VHLSS data and plot the distributions by actual food insecurity status in Figure C.1. We choose the optimal threshold $c^*$ between the 0th and 99th percentiles of the predicted probability, given that our regression is a linear probability model.

In Figure C.2(a), we plot the difference between the observed and predicted prevalence rates as well as the Cohen's Kappa statistics for all $c \in [0, 99]$. As discussed in Section 4.3.3, the threshold with the smallest difference in prevalence satisfies the first criterion, while the threshold at the highest point of the Cohen's Kappa curve satisfies the second criterion. We find that the 85th percentile satisfies both criteria. Therefore, we classify all households with a predicted probability of food insecurity above the 85th percentile as “high-risk” households for the rest of the paper.

Table 4.5: Income effect estimates using 2009 district-industry employment shares × year dummy as instruments

                            2SLS                              GMM                               LIML
                            (1)              (2)              (3)              (4)              (5)              (6)
Coefficient                 -0.254***        -0.365***        -0.290***        -0.386***        -0.283***        -0.397***
                            [-0.31; -0.20]   [-0.44; -0.29]   [-0.34; -0.24]   [-0.46; -0.31]   [-0.35; -0.21]   [-0.49; -0.31]
                            (0.029)          (0.040)          (0.025)          (0.039)          (0.036)          (0.047)
Overidentification test
J Statistics                53.17            22.64            53.17            22.64            48.98            21.33
χ-squared p-value           0.000            0.161            0.000            0.161            0.000            0.212
Obs                         46443            46443            46443            46443            46443            46443
Household controls          Yes              Yes              Yes              Yes              Yes              Yes
Province-year FE            Yes              Yes              Yes              Yes              Yes              Yes
Baseline district control   No               Yes              No               Yes              No               Yes

The table presents the results from estimating the effects of household income on food insecurity in Equation 4.3, where household income is instrumented with industry-district employment shares in 2009 × year dummy variables; the first-stage equation is in Equation 4.6.
Household characteristic controls include urban, household size, and share of household members with a postsecondary education. Baseline (2009) district-level characteristics are gender, marital status, college degree, immigration status, disability, average wealth index, and unemployment rate. The Sargan-Hansen test is the overidentification test for 2SLS, and the Anderson-Rubin overidentification test is for GMM and LIML. Standard errors are clustered at the district level. Standard errors are reported in parentheses, and p-values are reported in brackets. ∗ ∗ ∗p < 0.01, ∗ ∗ p < 0.05, ∗p < 0.1. 115 Table 4.6: Income effect estimates using non-agriculture Bartik IV Specification (1) (2) (3) (4) (5) (6) Panel A: Bartik IV -0.263*** -0.222*** -0.202*** 0.182 -0.349*** -0.396*** (0.018) (0.015) (0.026) (0.231) (0.132) (0.130) [0.000] [0.000] [0.000] [0.430] [0.008] [0.002] N 46443 46443 24499 46443 46443 46443 Panel B: Non-ag. Bartik IV -0.240*** -0.195*** -0.188*** 0.567 -0.238** -0.326** (0.017) (0.013) (0.018) (0.519) (0.100) (0.130) [0.000] [0.000] [0.000] [0.275] [0.017] [0.012] N 46443 46443 24499 46443 46443 46443 Additional controls Province FE ✓ ✓ ✓ ✓ District FE ✓ Household FE ✓ Year FE ✓ ✓ Province FE x linear trends ✓ Province x Year FE ✓ The table report results from IV estimations for the effect of household income on food insecurity in Equation 4.3, where household income is instrumented with Bartik IV and non-agriculture Bartik IV (see Equation 4.5). All models control for urban, household size, and the fraction of households with a postsecondary education, and 2009 district characteristics (gender, marital status, college degree, immigration status, disability, average wealth index, and unemployment rate). Standard errors are reported in parentheses, and p-values are reported in brackets. The household fixed effects model does not include 2009 controls. 
OLS standard errors are clustered at the commune-year level, IV standard errors are clustered at the district level, and household fixed effects estimations are estimated on a subsample of households that form an unbalanced panel with robust standard errors. *** p < 0.01, ** p < 0.05, * p < 0.1

4.5.2 Out-of-Sample Prediction Validation in Vietnam

To validate the process described in Section 4.3.3, we predict provinces’ share of households with high food insecurity risk in 2016 using the 2016 LFS data and compare it with provinces’ share of households that are actually food-insecure in 2016 from the 2016 VHLSS data.24 Specifically, we implement the estimation step using the 2010–2014 VHLSS data, and the prediction step on the 2016 LFS data. We also conduct another validation exercise by predicting 2018 food insecurity using the LFS data and comparing it with the actual food insecurity data from the 2018 VHLSS data. The OLS-based prediction is used as a benchmark because it is commonly used to predict food insecurity (Lentz et al., 2019; Gundersen et al., 2020; Schanzenbach and Pitts, 2020).

24 For these validation exercises, we use province-level shares instead of district-level shares because the VHLSS data are not representative at the district level. We aggregate the predicted data (from the LFS data) at the province level to be comparable with the VHLSS data.

To quantify the difference between our predictions and the actual data, we calculate the sum of squared errors between the two measures; that is, $\sum_p e_p^2 = \sum_p \left(f_p^{\text{actual}} - f_p^{\text{predicted}}\right)^2$, where $f_p$ measures the province-level share of households with actual food insecurity or with high predicted risk for province $p$. We use this sum as a formal measure of predictive accuracy (Auffhammer and Steinhauser, 2012; Athey, 2018).
Another measure of predictive performance is the R-squared from regressing the predicted share on the actual share of food-insecure households, which reflects how much of the variance of the actual share is correctly predicted by each approach. We also plot these province-level percentages along with a 45-degree line in Figure 4.4. The x-axis shows the share of households with actual food insecurity, and the y-axis shows the percentage of households with high predicted risk. The higher the dot compared to the 45-degree line, the more we overestimate food insecurity in that province; the lower the dot, the more we underestimate it. In other words, a model with dots closer to the red 45-degree line has higher predictive accuracy. We report the results for 2016 in panel (a) and for 2018 in panel (b).

We find that IV-based predictions outperform the OLS-based predictions in both 2016 and 2018. Graphically, most points from the IV models are relatively close to the 45-degree line, while most points from the OLS models are much higher than the line. The sums of squared errors are 2.55 for OLS-based and 0.69 for IV-based predictions in 2016. In 2018, the sums of squared errors are 2.61 for OLS-based and 0.79 for IV-based predictions. Therefore, the IV models are far more accurate than the OLS models. Similarly, IV models have a higher R-squared in both years relative to OLS models, indicating that the IV-based prediction can explain more of the actual variation in the share of food insecurity. These results suggest that the IV approach has higher out-of-sample accuracy in predicting food insecurity.
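The two accuracy measures used in this validation can be illustrated with a minimal sketch. This is not the code used in the study: the function names and the province-level shares below are hypothetical, and the R-squared is computed as the squared Pearson correlation, which equals the R-squared of a simple regression with an intercept.

```python
def sum_squared_errors(actual, predicted):
    """Sum over provinces of (actual share - predicted share) squared."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """R-squared from a simple OLS regression of predicted on actual shares.

    With one regressor and an intercept this equals the squared Pearson
    correlation. Assumes neither series is constant (non-zero variance).
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)


# Hypothetical province-level shares in percent; NOT data from the study.
actual_shares = [10.0, 15.0, 20.0, 25.0]
iv_predicted = [11.0, 14.0, 21.0, 24.0]
ols_predicted = [14.0, 20.0, 26.0, 31.0]

sse_iv = sum_squared_errors(actual_shares, iv_predicted)
sse_ols = sum_squared_errors(actual_shares, ols_predicted)
r2_iv = r_squared(actual_shares, iv_predicted)
```

In this toy example the second set of predictions lies systematically above the actual shares, so its sum of squared errors is much larger even though it is highly correlated with the actual shares; this is why the two measures are reported side by side.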
Figure 4.4: Predicted and actual food insecurity of 2018 at the provincial level. Panels: (a) Validation using 2016 data; (b) Validation using 2018 data.
Note: Predicted food insecurity measures each province’s share of households with high predicted food insecurity risk (above the 84th percentile); Actual food insecurity measures the share of households with actual food insecurity in each province using the VHLSS sample in year y. Predicted food insecurity risk is obtained by estimating the effect of income on food insecurity using the pre-y VHLSS sample and predicting using the LFS sample in y.

We also apply these validation steps to the IV probit approach and compare the results with the linear IV approach. In other words, we estimate the relationship between household income and food insecurity status using IV probit and generate provinces’ share of households with high food insecurity risk for 2016 and 2018.25 We then validate these predictions against the actual shares of food-insecure households for these years using the VHLSS data. We find that the IV probit approach yields more accurate predictions than the OLS approach, but less accurate predictions than our linear IV approach for both 2016 and 2018 (see Figure C.3). As explained in Section 4.3.3, the assumption of the IV probit approach is relatively strong and unlikely to be met in this study. Therefore, it is not surprising that the linear IV approach outperforms the IV probit approach.

25 We use the same criteria to find the optimal threshold for the predicted probability of food insecurity generated from the IV probit estimation. The threshold that minimizes the difference between the predicted prevalence and observed prevalence is 33%, and the threshold that maximizes the Cohen’s Kappa statistic is 35% (see Figure C.2(b)). We choose 33% as the threshold for the predicted probability; the results are similar for the 35% threshold.
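The two threshold criteria referred to in Section 4.5.1 and footnote 25 (smallest gap between predicted and observed prevalence, and highest Cohen’s Kappa) can be sketched as a simple grid search over percentiles. This is an illustrative sketch only: the function names, the nearest-rank percentile rule, and the toy data are assumptions, not the procedure or data used in the study.

```python
def cohen_kappa(actual, flagged):
    """Cohen's Kappa for two binary (0/1) classifications of the same units."""
    n = len(actual)
    tp = sum(1 for a, f in zip(actual, flagged) if a and f)
    tn = sum(1 for a, f in zip(actual, flagged) if not a and not f)
    fp = sum(1 for a, f in zip(actual, flagged) if not a and f)
    fn = sum(1 for a, f in zip(actual, flagged) if a and not f)
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:  # degenerate case: agreement is certain by chance
        return 0.0
    return (observed - expected) / (1.0 - expected)


def percentile_cutoff(values, pct):
    """Nearest-rank percentile (an assumed convention), pct in 0..99."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[max(rank - 1, 0)]


def evaluate_thresholds(probs, actual):
    """Return (percentile, prevalence gap, kappa) for each percentile 0..99."""
    n = len(probs)
    observed_prevalence = sum(actual) / n
    rows = []
    for pct in range(0, 100):
        cutoff = percentile_cutoff(probs, pct)
        flagged = [1 if p > cutoff else 0 for p in probs]
        gap = abs(sum(flagged) / n - observed_prevalence)
        rows.append((pct, gap, cohen_kappa(actual, flagged)))
    return rows


# Toy data (hypothetical): ten households, the two highest-risk ones insecure.
probs = [k / 10 for k in range(1, 11)]
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

rows = evaluate_thresholds(probs, actual)
best_by_gap = min(rows, key=lambda r: r[1])
best_by_kappa = max(rows, key=lambda r: r[2])
```

When the two criteria select different percentiles, as in footnote 25, a rule for choosing between them is still needed; the sketch only reports both.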
4.5.3 Predicting Food Insecurity Changes During the Pandemic

We use the 2019 Labor Force Survey (LFS) data to predict the share of households with high risk of food insecurity at the district level before and after the income shock using the method described in Section 4.3.3. We first obtain the pre-pandemic predicted probability of food insecurity for each household in the 2019 LFS sample using Equation 4.2, where the coefficients come from the IV estimates in Table 4.2 and the right-hand side variables, including household income, are from the LFS data.26 We then classify those with a predicted probability above the 85th percentile as “high-risk” households and obtain the pre-pandemic share of “high-risk” households at the district level.

The post-pandemic predicted probability of food insecurity is generated in the same way, except that post-pandemic income is used instead of 2019 income. The World Bank indicates that the average monthly income per person in the agricultural, manufacturing, and service sectors in quarter 2 of 2020 was lower than that of 2019 by 3%, 5.1%, and 7.3%, respectively (Morisset et al., 2020).27 We use these estimates to calculate the post-pandemic income for each household in the 2019 LFS sample,28 and generate the post-pandemic predicted probability for each household in the 2019 LFS sample following Equation 4.2. Similarly, “high-risk” households are classified using the 85th percentile threshold to obtain the post-pandemic share of high-risk households.

We find that the average share of “high-risk” households is 14.74% before the pandemic and 15.56% after the pandemic. In other words, the share of food-insecure households is predicted to increase by 0.82 percentage points (95% CI: 0.77, 0.87) due to the income shock during the pandemic. We then turn to the most vulnerable population, young children: the average share of children ages 0 to 5 with high food insecurity risk is 18% before the pandemic and 19% after the pandemic. The share of food-insecure children is predicted to increase by 0.997 percentage points (95% CI: 0.897, 1.10), as summarized in Figure 4.5.

26 Recall that the LFS data themselves do not contain information about food insecurity.
27 These figures are based on the GSO’s estimation using the LFS data for the second quarter of 2020. According to the GSO report, the average monthly income fell from 5,517 VND in Q2-2019 to 5,238 VND in Q2-2020 (5.06%). Income fell from 3,035 VND to 2,951 VND (2.76%) in the agriculture, forestry, and fishery sector; from 6,534 VND to 6,201 VND (5.01%) in the industry and construction sector; and from 6,939 VND to 6,429 VND (7.35%) in the service sector.
28 Note that the income and employment data from the LFS sample are classified at the industry level (three-digit ISIC), while the information about the income shocks is by sector. Therefore, we re-classify the income and employment data by sector before matching with the sector-specific income shocks.

Figure 4.5: Percentage of households with high predicted risk

This small increase in food insecurity, however, masks significant geographic variation in changes in food insecurity across districts. In Figure 4.6, we map the districts’ share of households with high pre- and post-pandemic risk and the difference between the two shares. The increases were relatively small across the country, but a small number of districts experienced an increase as large as 7.86 percentage points. Only 102 out of 702 districts in our sample experienced an increase larger than 2 percentage points.
Similarly, Figure 4.7 shows the changes in the percentage of children ages 0 to 5 with a high predicted food insecurity risk due to the income shock; 560 districts are predicted to have an increase between 0 and 2 percentage points, 141 districts are predicted to have an increase between 2 and 10 percentage points, and only 7 districts are predicted to have an increase larger than 10 percentage points, up to 19.33 percentage points. This finding suggests that children might be affected more severely in terms of food insecurity during the pandemic.

Figure 4.6: Pre-pandemic, post-pandemic, and percentage point change in each district’s share of households with high food insecurity risk. Panels: (a) Pre-pandemic; (b) Post-pandemic; (c) Percentage point change.
Note: The maps show the (a) pre-pandemic, (b) post-pandemic, and (c) percentage point change in the share of households with high food insecurity risk. Pre-pandemic risk is predicted food insecurity using 2019 household income data from the 2019 Labor Force Survey. Post-pandemic risk is predicted food insecurity using post-pandemic income, which is calculated using the quarterly percentage change of average income by sector from Morisset et al. (2020): average income decreased by 3% in agriculture, 5.1% in manufacturing, and 7.3% in services per quarter.

Figure 4.7: Pre-pandemic, post-pandemic, and percentage point change in each district’s share of children ages 0–5 with high food insecurity risk. Panels: (a) Pre-pandemic; (b) Post-pandemic; (c) Percentage point change.
Note: The maps show the pre-pandemic, post-pandemic, and percentage point change in the share of children ages 0–5 with a high food insecurity risk. Pre-pandemic risk is predicted food insecurity using 2019 household income data from the 2019 Labor Force Survey. Post-pandemic risk is predicted food insecurity using post-pandemic income, which is calculated using the quarterly percentage change of average income by sector from Morisset et al. (2020): average income decreased by 3% in agriculture, 5.1% in manufacturing, and 7.3% in services per quarter.

Alternatively, we can ask what the food insecurity risks would have been if Vietnam had not contained the virus successfully. To answer this question, we predict food insecurity for a scenario in which the quarterly reductions from Morisset et al. (2020) continue for all four quarters. In other words, we calculate the annualized changes in income and use those calculations to make our food insecurity prediction. Under this scenario, the average income decreases in the agricultural, manufacturing, and service sectors by 12.6%, 22%, and 32.55%, respectively. The share of “high-risk” households before the pandemic is 12.46% and the share after the pandemic is 19.28%. In other words, the share of food-insecure households is predicted to rise by 6.81 percentage points (95% CI: 6.67, 6.95). In Figure 4.8, we map the changes in the shares of “high-risk” households due to the income shock during the pandemic. A similar pattern emerges: while the average change is 7.06 percentage points, a small number of districts would have experienced an increase of up to 26.62 percentage points. This is likely an underestimate because those who are food insecure in the short run may experience a decrease in productivity in the long run, which in turn affects both long-run income and food insecurity.

4.5.4 Policy implications

These predictions provide a detailed picture of which districts may have experienced larger increases in food insecurity and hence may require more assistance than others. This information will allow the government or international organizations to quickly identify and allocate more resources toward districts experiencing a larger increase in food insecurity risk and allocate fewer resources toward districts with a smaller increase in risk.
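For reference, the income-shock inputs behind these predictions (Section 4.5.3) can be sketched as follows. The sketch is illustrative, not the study's code: the function and variable names are hypothetical, and the compounding rule (1 + q)^4 − 1 is an assumption inferred from the fact that it approximately reproduces the reported annualized rates (12.6%, 22%, and 32.55%) from the quarterly rates (3%, 5.1%, and 7.3%).

```python
# Quarterly income reductions by sector as reported from Morisset et al. (2020).
QUARTERLY_SHOCK = {"agriculture": 0.03, "manufacturing": 0.051, "services": 0.073}


def annualize(quarterly_rate, quarters=4):
    """Compound a quarterly rate over `quarters` periods.

    ASSUMPTION: this rule is inferred from the reported figures; the text
    does not state the annualization formula explicitly.
    """
    return (1.0 + quarterly_rate) ** quarters - 1.0


def post_shock_income(income, sector, shocks):
    """Scale a household's pre-pandemic income by its sector-specific shock."""
    return income * (1.0 - shocks[sector])


# Annualized scenario: the quarterly reductions persist for four quarters.
ANNUAL_SHOCK = {sector: annualize(rate) for sector, rate in QUARTERLY_SHOCK.items()}

# Hypothetical household earning 100 (arbitrary units) in the service sector.
baseline_case = post_shock_income(100.0, "services", QUARTERLY_SHOCK)
annualized_case = post_shock_income(100.0, "services", ANNUAL_SHOCK)
```

The shocked incomes would then be fed into the prediction equation in place of 2019 income, holding the estimated coefficients and the classification threshold fixed.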
As an example, consider Figure 4.9’s scatter plot of each district’s pre-pandemic share of households with high food insecurity risk against each district’s percentage point increase in that share due to the pandemic. Districts to the right of the dashed line are in the top 10% in terms of food insecurity, i.e., districts with the largest shares of “high-risk” households in 2019, and districts above the solid line are in the top 10% in terms of predicted increase in food insecurity, i.e., districts with the largest increase in the share of “high-risk” households due to the pandemic.

Figure 4.8: District-level percentage point change in the share of households with food insecurity risk before and after COVID-19 under annualized changes
Note: The map shows the percentage point changes in the share of households with high food insecurity risk. Pre-pandemic risk is predicted food insecurity using 2019 household income data from the 2019 Labor Force Survey. Post-pandemic risk is predicted food insecurity using post-pandemic income, which is calculated using the annualized version of the quarterly percentage change of average income by sector from Morisset et al. (2020): decreased by 12.6% in agriculture, 22% in manufacturing, and 32.55% in services per year.

One potential way for the government to allocate aid based on need is to prioritize districts in the first (upper right) quadrant, since they have both substantially high pre-pandemic food insecurity and the largest increase in food insecurity due to the pandemic. The second priority can be assigned to districts to the left of the dashed line and above the solid line; these are districts that experience a temporary surge in food insecurity. The third priority can be assigned to districts in the bottom right corner; although these districts have substantially high pre-pandemic food insecurity, the increase is not as high compared to other districts.
More importantly, their pre-pandemic level of food insecurity can be linked to factors unrelated to affordability, such as a traditional diet. The districts in the bottom left of the graph have the lowest risk and hence would be assigned the lowest priority. The government can also pursue a more sophisticated prioritization scheme based on these food insecurity risks, for example, employing multiple cutoffs.

It is important to note that an effective targeting approach would combine the geographical targeting approach proposed in this study with household or individual targeting (Barrett, 2010; World Food Programme, 2015). The method proposed in this study can first provide a district-level assessment of food insecurity risk due to the pandemic, allowing the government or organizations to allocate more resources or money toward districts that are more affected and allocate fewer resources toward districts that are not affected as much. After this allocation process, the district’s local government can target households with higher food insecurity risk as they tend to have more accurate information (World Food Programme, 2015).

This approach is likely better than simple household targeting, especially in the context of developing countries. The process of budgeting for emergency aid based on household poverty status is more challenging and is prone to errors, as the household’s status may not be updated annually. More importantly, not all poor households have high food insecurity risk during the pandemic, especially those who work in unaffected industries such as agriculture. In Appendix C.3, we find that 72.53% of households that were previously poor in 2019 are not in the high food insecurity risk group during the pandemic. Therefore, using poverty status alone can lead to overbudgeting toward low-risk districts, while our method allows the government and organizations to allocate resources more precisely and thus more effectively.
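The four-tier prioritization described with Figure 4.9 can be written down as a small decision rule. This is a minimal sketch under stated assumptions: the function names, the nearest-rank rule for the top-10% cutoff, and the strict inequalities are illustrative choices, not part of the study.

```python
def top_decile_cutoff(values):
    """Value at the 90th percentile (nearest-rank; an assumed convention)."""
    ordered = sorted(values)
    rank = (90 * len(ordered) + 99) // 100  # integer ceil(0.9 * n)
    return ordered[max(rank - 1, 0)]


def assign_priority(baseline_share, increase, baseline_cutoff, increase_cutoff):
    """Map a district to a priority tier (1 = highest, 4 = lowest).

    1: high pre-pandemic risk and large predicted increase (upper right)
    2: large predicted increase only (upper left)
    3: high pre-pandemic risk only (bottom right)
    4: neither (bottom left)
    """
    high_baseline = baseline_share > baseline_cutoff  # right of the dashed line
    high_increase = increase > increase_cutoff        # above the solid line
    if high_baseline and high_increase:
        return 1
    if high_increase:
        return 2
    if high_baseline:
        return 3
    return 4


# Hypothetical district data: (pre-pandemic share in %, increase in p.p.).
districts = [(10, 0.5), (12, 0.8), (14, 1.0), (15, 0.2), (16, 0.9),
             (18, 0.7), (20, 0.4), (22, 0.6), (25, 0.3), (40, 6.0)]
base_cut = top_decile_cutoff([d[0] for d in districts])
inc_cut = top_decile_cutoff([d[1] for d in districts])
tiers = [assign_priority(b, i, base_cut, inc_cut) for b, i in districts]
```

A scheme with multiple cutoffs, as mentioned in the text, would replace the two boolean flags with graded bands along each axis.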
Once the government or international organizations can identify food-insecure households, they can design a suitable food assistance program based on their budget. It is important to emphasize that our prediction is specifically for short-run food insecurity shocks caused by the pandemic or an emergency and thus should only be used to guide timely responses that provide short-term relief during a crisis (Lentz and Barrett, 2013). Such short-term relief policies are very different from social protection programs and policies that aim to address structural causes of food insecurity, as they mainly focus on (1) mitigating any negative health effects, especially for children, from short-run shocks (Lentz and Barrett, 2013) and (2) ensuring that families have adequate food during lockdowns or economic downturns so they can follow public health measures.

Figure 4.9: Districts’ share of households with high food insecurity risk in 2019 and districts’ increase in the share of high-risk households due to the pandemic
Note: The horizontal axis shows each district’s share of households with high food insecurity risk in 2019. The vertical axis shows districts’ increase in the share of high-risk households due to the pandemic. The dashed line shows the top 10% food insecurity in 2019, and the solid line shows the top 10% (predicted) increase in food insecurity due to the income shock during the pandemic.

Intervention options may vary by country, organization, and objective (Lentz et al., 2013; Barrett et al., 2009). If budgets are constrained, one option is to send staple food to food-insecure households. Unlike the case of natural disasters, food-insecure households still have access to the market; by reducing the expense of staple food, households can also spend more on non-staple food.
Staple foods are also cheaper than non-staple foods, which allows the government to send more to each food-insecure household or reach more food-insecure households.29 The government and international organizations can also send non-staple foods and micro-nutrient supplements if they have enough money in their budget and enough resources. Although cash transfers are also a popular approach (Gentilini et al., 2020), this relief policy may come with unintended consequences such as higher food prices in remote markets (Filmer et al., 2021).

29 For example, the United Nations World Food Programme provides sorghum, corn, millet and rice, beans, black-eyed peas, and vegetable oil in the eight provinces of Chad for three months as a response to temporary food insecurity shocks caused by the pandemic.

4.6 Discussion and Conclusion

This study proposes an approach to predict changes in food insecurity risk caused by income shocks at the locality level during emergencies such as the global pandemic. We apply this method to predict changes in food insecurity risk due to the pandemic for all 702 districts in Vietnam. We predict that the share of food-insecure households increases by 0.82 percentage points, on average, as most districts only experience a 0 to 2 percentage point increase, and only 102 out of 702 districts experience increases larger than 2 percentage points. A small number of districts experience an increase as large as 7.86 percentage points. The share of food-insecure children ages 0 to 5 is predicted to increase by 0.997 percentage points, although a small number of districts can experience an increase of up to 19.33 percentage points.

Several important points can be drawn from these predictions. First, Vietnam is among the most successful countries at containing the virus at an early stage, and the overall impact on the economy during the study period was relatively small (International Monetary Fund, 2020).
As a result, our prediction suggests that the average effect on food insecurity across the country might have been quite small during this period. However, we show that a small number of districts are predicted to be affected more severely than others because the pandemic has different impacts on different economic sectors. This finding highlights the importance of effective targeting because these districts may require substantially more support than others. This is especially true for districts with more young children who are affected. We also note that if Vietnam had not contained the pandemic early, food insecurity might have been substantially worse, suggesting that effective public health responses to the pandemic can mitigate its potential economic impacts.

The approach proposed in this study and its findings come with several caveats. First, we focus on predicting changes in food insecurity due to income shocks, so our ability to speak about the actual effect of the pandemic on food insecurity is limited. Specifically, our prediction for the effect on food insecurity caused by the pandemic is likely a lower bound because we do not take into account increases in food prices30 and reductions in remittances and social transfers31 because of the lack of disaggregated data on changes in food prices and these additional incomes. Higher food prices mean non-agricultural households are more likely to be affected, while lower remittances mean households’ ability to afford nutritious food decreases even more.32,33 Second, our IV-based method provides a prediction about changes in food insecurity due to income shocks, so its ability to predict the level of food insecurity is likely weaker than that of existing prediction methods that incorporate more data such as market prices, weather conditions, and geographic characteristics (e.g., Lentz et al., 2019).

30 Dietrich et al. (2021) show that stringent lockdowns and reduced mobility can increase food prices via higher trade costs, especially in integrated markets in 44 low- and middle-income countries. Ruan et al. (2021) also find that stringent lockdowns increase vegetable prices in China.

This method will nonetheless allow a better targeting approach than most alternative targeting methods given the same data constraints. For instance, a common alternative approach is targeting households that are already poor before the pandemic since they are likely more vulnerable to the pandemic income shock. However, the government would also target poor households with low food insecurity risk (errors of inclusion), especially those with household members working in unaffected industries, while missing near-poor households that actually experience food insecurity (errors of exclusion). We assess each type of error in Appendix C.3. Targeting previously poor households would include 56.38% poor households that do not have high risk and would miss 11.05% of high-risk households that are not poor. This is highly inefficient, as the amount being sent to poor but food-secure households could be sent to households that actually need it more.

Another alternative method is to target districts with the largest income reduction due to the pandemic or to target districts with the largest increase in poverty before any household or individual targeting. These methods may also mistarget districts with a smaller increase in food insecurity risk or miss districts that may be experiencing significant increases in food insecurity risk.
In Appendix C.3, we show that 75.96% of districts in the top 10% of income loss are not in the top 10% of (predicted) increase in food insecurity. Similarly, 90.35% of districts in the top 10% of increases in poverty are not in the top 10% of (predicted) increases in food insecurity. Moreover, these two targeting methods would also miss 72.31% and 77.82%, respectively, of districts in the top 10% of (predicted) increase in food insecurity. These simpler methods are associated with very high error rates, which, again, can be very costly for the government.

31 The World Bank reports that remittances may decrease by 14% in 2021.
32 In the case of Vietnam, there have not been any official statistics, to the best of our knowledge, on how remittances are affected by the pandemic; the World Bank (2020) indicates that Vietnam was among the top remittance recipients in 2020. According to the VHLSS, the average share of household income coming from remittances and social transfers is roughly 8.68% to 9.59% during the 2010–2018 period.
33 Gupta et al. (2021) find that lockdowns in a rural region in India decrease household income substantially, forcing households to reduce their food consumption.

Predictive methods with high accuracy can be used in an effective targeting approach, which in turn will allow preventive measures such as providing food aid to substantially reduce the impacts of food insecurity shocks (Barrett, 2010). Our method provides a more accurate prediction of food insecurity shocks at the locality level (relative to OLS-based prediction), which the government or international organizations can use to allocate resources and money based on the predicted impacts on food insecurity. A better predicting and targeting approach will allow communities with higher needs to receive more assistance and mitigate any negative impacts of the food insecurity shocks in a timely manner.
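The errors of inclusion and exclusion discussed in this section can be computed from two indicator variables per unit (household or district): whether the unit is targeted and whether it is actually high-risk. The sketch below is illustrative only; the function name and the choice of denominators (targeted units for inclusion errors, high-risk units for exclusion errors) are assumptions and may differ from the definitions used in Appendix C.3.

```python
def targeting_errors(targeted, in_need):
    """Return (inclusion error rate, exclusion error rate).

    inclusion error: share of targeted units that are NOT in need
    exclusion error: share of units in need that are NOT targeted
    Both denominators are assumptions made for this illustration.
    """
    n_targeted = sum(1 for t in targeted if t)
    n_in_need = sum(1 for x in in_need if x)
    wrongly_included = sum(1 for t, x in zip(targeted, in_need) if t and not x)
    missed = sum(1 for t, x in zip(targeted, in_need) if x and not t)
    inclusion = wrongly_included / n_targeted if n_targeted else 0.0
    exclusion = missed / n_in_need if n_in_need else 0.0
    return inclusion, exclusion


# Hypothetical example: five households, targeted on prior poverty status.
previously_poor = [True, True, True, False, False]
high_risk = [True, False, False, True, False]
inclusion_rate, exclusion_rate = targeting_errors(previously_poor, high_risk)
```

The same function applies to district-level comparisons, for example with "targeted" defined as being in the top 10% of income loss and "in need" as being in the top 10% of predicted increase in food insecurity.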
Chapter 5

Conclusion and Discussion

Like many countries that went through a period of export-led industrialization, Vietnam faces an increasingly important question: “What’s next?” On the one hand, export-led industrialization does not lead to sustainable growth in the long run, so a different strategy is required to achieve such a goal. On the other hand, concerns about inequity and natural disasters have also become increasingly apparent as the economy continues to evolve.

In Chapter 2, I examined a potential strategy to promote long-term growth: expanding access to higher education. Although it is well known that investment in human capital is an important way to achieve long-run economic growth, it is not always clear whether policymakers should focus on higher education or other levels of human capital, such as early childhood and primary-level education, especially in developing countries. Hanushek (2016), for example, raises a concern that higher education in developing countries does not produce enough knowledge capital to influence growth, relative to basic cognitive skills that are typically taught at the primary level. Along this line, Schoellman (2012) and Martellini et al. (2022) show that the low quality of college in less developed countries can dampen the impacts of human capital on economic development. Yet the extent to which expanding access to higher education impacts economic growth at the subnational level remains unexplored.

Chapter 2 showed that such an expansion of access to higher education can benefit the economy in two ways. First, workers exposed to the expansion are less likely to work in the agricultural sector and more likely to work in the service sector. Second, the expansion also raises productivity by inducing firms to adopt new technology. Surprisingly, the expansion also has positive impacts on the employment and earnings of non-college workers.
Furthermore, I find that the effects are concentrated among females, unintentionally closing the gender gaps in income and employment.

In Chapter 3, I explored a potential policy strategy to lower the gender gap in formal employment in developing countries: maternity leave requirements. Vietnam and many other developing countries have a substantial informal economy that co-exists with the formal economy. The informal economy is often associated with lower productivity (La Porta and Shleifer, 2014) and weaker labor rights protection. The gender gap in formal employment, thus, represents a gender gap in earnings and work benefits. In Chapter 3, I find that by extending the required maternity leave period, the government can create an incentive for women to switch from an informal setting to a formal job. Female workers are more likely to move into middle-skilled occupations that do not require substantial upskilling. Chapters 2 and 3 also highlight the intersectionality of different economic policies. The higher education expansion policy is aimed at raising human capital, yet it helps close the gender income and employment gaps. The maternity leave policy is aimed at benefiting women, yet it also reduces the informal sector and fosters the structural transformation process.

Natural disasters and infectious diseases also pose important challenges to the long-term economic growth goal of Vietnam and many other developing countries. Governments are often expected to respond in a timely manner to disastrous events, e.g., the global pandemic, with limited data. In Chapter 4, I develop a simple two-stage approach that allows policymakers to map out the impacts of natural disasters and diseases at a granular level using only existing microdata and external estimates for macroeconomic impacts.
Using the COVID-19 and food security impact as a case study, I show that although the average effect of the pandemic on food security may be small, a small number of locations are much more vulnerable than others. This result highlights the importance of mapping the potential damages at a granular level. Chapter 6 References 131 Abadie, A., S. Athey, G. W. Imbens, and J. Wooldridge (2017). When should you adjust standard errors for clustering? National Bureau of Economic Research. Abebaw, D., A. Admassie, H. Kassa, and C. Padoch (2020). Can rural outmigration improve household food security? Empirical evidence from Ethiopia. World Devel- opment 129, 104879. Acemoglu, D. (1998). Why do new technologies complement skills? directed technical change and wage inequality. The quarterly journal of economics 113 (4), 1055–1089. Acemoglu, D. (2002). Directed technical change. The review of economic studies 69 (4), 781–809. Acemoglu, D. (2007). Equilibrium Bias of Technology. Econometrica 75 (5), 1371–1409. Acemoglu, D., P. Aghion, and F. Zilibotti (2006). Distance to frontier, selection, and economic growth. Journal of the European Economic association 4 (1), 37–74. Acemoglu, D. and D. Autor (2011). Skills, tasks and technologies: Implications for employment and earnings. In Handbook of labor economics, Volume 4, pp. 1043– 1171. Elsevier. Ackerberg, D. A., K. Caves, and G. Frazer (2015). Identification properties of recent production function estimators. Econometrica 83 (6), 2411–2451. Aggarwal, S., D. Jeong, N. Kumar, D. S. Park, J. Robinson, and A. Spearot (2020). Did covid-19 market disruptions disrupt food security? evidence from households in rural liberia and malawi. Technical report, National Bureau of Economic Research. 132 133 Aghion, P., L. Boustan, C. Hoxby, and J. Vandenbussche (2009). The causal impact of education on economic growth: evidence from us. Brookings papers on economic activity 1 (1), 1–73. Ahn, S. and F. B. Norwood (2020). 
Measuring food insecurity during the covid-19 pandemic of spring 2020. Applied Economic Perspectives and Policy . Aiken, E., S. Bellue, D. Karlan, C. R. Udry, and J. Blumenstock (2021). Machine learning and mobile phone data can improve the targeting of humanitarian assistance. Technical report, National Bureau of Economic Research. Akgunduz, Y. E. and J. Plantenga (2013). Labour market effects of parental leave in europe. Cambridge Journal of Economics 37 (4), 845–862. Akresh, R., D. Halim, and M. Kleemans (2018). Long-term and intergenerational ef- fects of education: Evidence from school construction in indonesia. Technical report, National Bureau of Economic Research. Almeida, R. and P. Carneiro (2012). Enforcement of labor regulation and informality. American Economic Journal: Applied Economics 4 (3), 64–89. Amare, M., K. A. Abay, L. Tiberti, and J. Chamberlin (2021). Covid-19 and food security: Panel data evidence from nigeria. Food Policy 101, 102099. Amin, M. and A. Islam (2019). Paid maternity leave and female employment: Evidence using firm-level survey data for developing countries. World Bank Policy Research Working Paper (8715). Amin, M., A. Islam, and A. Sakhonchik (2016). Does paternity leave matter for fe- male employment in developing economies? evidence from firm-level data. Applied Economics Letters 23 (16), 1145–1148. Andree, B. P. J., A. Chamorro, A. Kraay, P. Spencer, and D. Wang (2020). Predicting food crises. Andrews, M. (2020). How do institutions of higher education affect local invention? evidence from the establishment of us colleges. Evidence from the Establishment of US Colleges (March 28, 2020). 134 Arteaga, C. (2018). The Effect of Human Capital on Earnings: Evidence from a Reform at Colombia’s Top University. Journal of Public Economics 157, 212–225. Asghar, R. and B. McCaig (2023). Trade, structural change and labour market transi- tions in vietnam. Athey, S. (2018). The impact of machine learning on economics. 
In The economics of artificial intelligence: An agenda, pp. 507–547. University of Chicago Press.
Athey, S. and G. W. Imbens (2006). Identification and Inference in Nonlinear Difference-in-Differences Models. Econometrica 74 (2), 431–497.
Auffhammer, M. and R. Steinhauser (2012). Forecasting the path of us co2 emissions using state-level information. Review of Economics and Statistics 94 (1), 172–185.
Baccini, L., G. Impullitti, and E. J. Malesky (2019). Globalization and State Capitalism: Assessing Vietnam’s Accession to the WTO. Journal of International Economics 119, 75–92.
Barrera-Osorio, F. and H. Bayona-Rodríguez (2019). Signaling or Better Human Capital: Evidence from Colombia. Economics of Education Review 70, 20–34.
Barrett, C. B. (2010). Measuring food insecurity. Science 327 (5967), 825–828.
Barrett, C. B., R. Bell, E. C. Lentz, and D. G. Maxwell (2009). Market information and food insecurity response analysis. Food Security 1 (2), 151–168.
Barrett, C. B., J. Upton, E. Tennant, and K. Florella (2021). Household resilience, and rural food systems: Evidence from southern and eastern africa. Available at SSRN 3992671.
Baylis, K., L. Fan, and L. Nogueira (2019, January). Agricultural market liberalization and household food insecurity in rural China. American Journal of Agricultural Economics 101 (1), 250–269.
BBC (2020). Coronavirus: Vietnamese farmers cry as exports stuck at border. https://www.bbc.com/vietnamese/vietnam-51353578. [Online; accessed 4th May 2021].
BBC (2021). Vietnam: Increasing risk of a new wave of covid-19 infection? https://www.bbc.com/vietnamese/vietnam-56939029. [Online; accessed 4th May 2021].
Beaudry, P., M. Doms, and E. Lewis (2010). Should the personal computer be considered a technological revolution? evidence from us metropolitan areas. Journal of political Economy 118 (5), 988–1036.
Bellemare, M. F., K. Chua, J. Santamaria, and K. Vu (2020). Tenurial security and agricultural investment: Evidence from vietnam.
Food Policy 94, 101839.
Ben-David, A. (2007). A lot of randomness is hiding in accuracy. Engineering Applications of Artificial Intelligence 20 (7), 875–885.
Ben-David, A. (2008). About the relationship between roc curves and cohen’s kappa. Engineering Applications of Artificial Intelligence 21 (6), 874–882.
Béné, C., D. Bakker, M. J. Chavarro, B. Even, J. Melo, and A. Sonneveld (2021). Global assessment of the impacts of covid-19 on food security. Global Food Security 31, 100575.
Besamusca, J., K. Tijdens, M. Keune, and S. Steinmetz (2015). Working women worldwide. age effects in female labor force participation in 117 countries. World Development 74, 123–141.
Bickel, G., M. Nord, C. Price, W. Hamilton, and J. Cook (2000, March). Guide to measuring household food security, Revised 2000. Alexandria VA: U.S. Department of Agriculture, Food and Nutrition Service.
Bilinski, A. and L. A. Hatfield (2018). Nothing to See Here? Non-Inferiority Approaches to Parallel Trends and Other Model Assumptions. arXiv preprint arXiv:1805.03273.
Bitler, M., H. W. Hoynes, and D. W. Schanzenbach (2020). The social safety net in the wake of COVID-19. National Bureau of Economic Research.
Bleemer, Z. and A. Mehta (2022). Will studying economics make you rich? a regression discontinuity analysis of the returns to college major. American Economic Journal: Applied Economics 14 (2), 1–22.
Bloom, D. E., D. Canning, G. Fink, and J. E. Finlay (2009). Fertility, female labor force participation, and the demographic dividend. Journal of Economic growth 14, 79–101.
Blumenstock, J., G. Cadamuro, and R. On (2015). Predicting poverty and wealth from mobile phone metadata. Science 350 (6264), 1073–1076.
Blundell, R., D. A. Green, and W. Jin (2022). The UK as a Technological Follower: Higher Education Expansion and the College Wage Premium. The Review of Economic Studies 89 (1), 142–180.
Borusyak, K., X. Jaravel, and J. Spiess (2021).
Revisiting event study designs: Robust and efficient estimation. arXiv preprint arXiv:2108.12419.
Brown, C., M. Ravallion, and D. van de Walle (2018). A poor means test? econometric targeting in africa. Journal of Development Economics 134 (C), 109–124.
Brown, C. S., M. Ravallion, and D. Van De Walle (2020). Can the World’s Poor Protect Themselves from the New Coronavirus? National Bureau of Economic Research.
Callaway, B. and P. H. Sant’Anna (2021). Difference-in-differences with multiple time periods. Journal of Econometrics 225 (2), 200–230.
Card, D. and T. Lemieux (2001). Can falling supply explain the rising return to college for younger men? a cohort-based analysis. The quarterly journal of economics 116 (2), 705–746.
Carneiro, P., K. Liu, and K. G. Salvanes (2023). The Supply of Skill and Endogenous Technical Change: Evidence from a College Expansion Reform. Journal of the European Economic Association 21 (1), 48–92.
Cengiz, D., A. Dube, A. Lindner, and B. Zipperer (2019). The effect of minimum wages on low-wage jobs. The Quarterly Journal of Economics 134 (3), 1405–1454.
Chaaban, J., H. Ghattas, A. Irani, and A. Thomas (2018). Targeting mechanisms for cash transfers using regional aggregates. Food security 10 (2), 457–472.
Che, Y. and L. Zhang (2018). Human capital, technology adoption and firm performance: Impacts of china’s higher education expansion in the late 1990s. The Economic Journal 128 (614), 2282–2320.
Christian, P., E. Kandpal, N. Palaniswamy, and V. Rao (2019). Safety nets and natural disaster mitigation: evidence from cyclone phailin in odisha. Climatic Change 153 (1), 141–164.
Clemens, M. A., E. G. Lewis, and H. M. Postel (2018). Immigration restrictions as active labor market policy: Evidence from the mexican bracero exclusion. American Economic Review 108 (6), 1468–87.
Correia, S. (2015). Singletons, cluster-robust standard errors and fixed effects: A bad mix. Technical Note, Duke University 7.
Coxhead, I. and R. Shrestha (2017).
Globalization and school–work choices in an emerging economy: Vietnam. Asian Economic Papers 16 (2), 28–45.
Dang, H.-A. and L. Giang (2020). Turning vietnam’s covid-19 success into economic recovery: A job-focused analysis of individual assessments on their finance and the economy.
Dang, H.-A., P. Glewwe, K. Vu, and J. Lee (2021). What explains vietnam’s exceptional performance in education relative to other countries? analysis of the 2012 and 2015 pisa data.
Dang, H.-A. H. and P. W. Glewwe (2018). Well begun, but aiming higher: A review of vietnam’s education trends in the past 20 years and emerging challenges. The journal of development studies 54 (7), 1171–1195.
Dang, H.-A. H., M. Hiraga, and C. V. Nguyen (2022). Childcare and maternal employment: evidence from vietnam. World Development 159, 106022.
Daw, J. R. and L. A. Hatfield (2018). Matching and Regression to the Mean in Difference-in-Differences Analysis. Health services research 53 (6), 4138–4156.
De Chaisemartin, C. and X. d’Haultfoeuille (2020). Two-way fixed effects estimators with heterogeneous treatment effects. American Economic Review 110 (9), 2964–96.
De Chaisemartin, C. and X. d’Haultfoeuille (2018). Fuzzy Differences-in-Differences. The Review of Economic Studies 85 (2), 999–1028.
Deitchler, M., T. Ballard, A. Swindale, and J. Coates (2010). Validation of a measure of household hunger for cross-cultural use: Food and nutrition technical assistance ii project (fanta-2). washington, dc: Fanta-2.
Del Carpio, X., C. Nguyen, H. Nguyen, and C. Wang (2013). The impact of minimum wages on employment, wages and welfare: The case of vietnam.
Devereux, S., C. Béné, and J. Hoddinott (2020). Conceptualising covid-19’s impacts on household food security. Food Security 12 (4), 769–772.
Dietrich, S., V. Giuffrida, B. Martorano, and G. Schmerzeck (2021). Covid-19 policy responses, mobility, and food prices. American Journal of Agricultural Economics.
Doan, T., Q. Le, and T. Q. Tran (2018).
Lost in transition? declining returns to education in vietnam. The European Journal of Development Research 30 (2), 195–216.
Duflo, E. (2001). Schooling and labor market consequences of school construction in indonesia: Evidence from an unusual policy experiment. American economic review 91 (4), 795–813.
Dustmann, C. and A. Glitz (2015). How Do Industries and Firms Respond to Changes in Local Labor Supply? Journal of Labor Economics 33 (3), 711–750.
Elsayed, A. and A. Shirshikova (2023). The Women-Empowering Effect of Higher Education. Journal of Development Economics, 103101.
Evans, A. (2020). The politics of pro-worker reforms. Socio-Economic Review 18 (4), 1089–1111.
Evans, A. (2021). Export incentives, domestic mobilization, & labor reforms. Review of International Political Economy 28 (5), 1332–1361.
Feeny, S., A. Mishra, T.-A. Trinh, L. Ye, and A. Zhu (2021). Early-life exposure to rainfall shocks and gender gaps in employment: Findings from vietnam. Journal of Economic Behavior & Organization 183, 533–554.
Feng, S. and X. Xia (2022). Heterogeneous Firm Responses to Increases in High-Skilled Workers: Evidence from China’s College Enrollment Expansion. China Economic Review 73, 101791.
Filmer, D., J. Friedman, E. Kandpal, and J. Onishi (2021). Cash transfers, food prices, and nutrition impacts on ineligible children. The Review of Economics and Statistics, 1–45.
Finlay, J. E. (2021). Women’s reproductive health and economic activity: A narrative review. World Development 139, 105313.
Freeman, E. A. and G. G. Moisen (2008). A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa. Ecological modelling 217 (1-2), 48–58.
Freeman, R. B. (2010). Labor regulations, unions, and social protection in developing countries: Market distortions or efficient institutions? Handbook of development economics 5, 4657–4702.
Gaddis, I. and S. Klasen (2014).
Economic development, structural change, and women’s labor force participation. Journal of Population Economics 27 (3), 639–681.
Gatto, M. and A. H. M. S. Islam (2021). Impacts of covid-19 on rural livelihoods in bangladesh: Evidence using panel data. PloS one 16 (11), e0259264.
Gelli, A., N.-L. Aberman, A. Margolies, M. Santacroce, B. Baulch, and E. Chirwa (2017). Lean-season food transfers affect children’s diets and household food security: evidence from a quasi-experiment in malawi. The Journal of nutrition 147 (5), 869–878.
Gentilini, U., M. Almenfi, I. Orton, and P. Dale (2020). Social protection and jobs responses to covid-19.
Glewwe, P. (2004). An overview of economic growth and household welfare in vietnam in the 1990s. Economic growth, poverty, and household welfare in Vietnam 1.
Goldin, C. and L. F. Katz (2010). The race between education and technology. harvard university press.
Goldsmith-Pinkham, P., I. Sorkin, and H. Swift (2020). Bartik instruments: What, when, why, and how. American Economic Review (forthcoming).
Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics 225 (2), 254–277.
GSO (2020a, July). Report on the impact of the COVID-19 on labour and employment situation in Vietnam. Accessed on Sep 1st 2020 via https://gso.gov.vn/default_en.aspx?tabid=462&idmid=2&ItemID=19678.
GSO (2020b, May). Báo cáo Kết quả Khảo sát đánh giá tác động của dịch COVID-19 đến hoạt động sản xuất, kinh doanh của doanh nghiệp [Firm survey summary report on the impact of COVID-19]. Accessed on Sep 1st 2020 via https://www.gso.gov.vn/default.aspx?tabid=382&idmid=2&ItemID=19623.
GSO (2021, June). Report on the impact of the COVID-19 Pandemic on labour and employment situation of the 2nd Quarter of 2021. Accessed on July 7th 2021 via https://www.gso.gov.vn/en/data-and-statistics/2021/07/report-on-impact-of-covid-19-pandemic-on-labour-and-employment//-of-the-second-quarter-of-2021/.
Gundersen, C., M.
Hake, A. Dewey, and E. Engelhard (2020). Food insecurity during covid-19. Applied Economic Perspectives and Policy.
Gundersen, C., B. Kreider, and J. Pepper (2011). The economics of food insecurity in the united states. Applied Economic Perspectives and Policy 33 (3), 281–303.
Gundersen, C., B. Kreider, and J. V. Pepper (2017, July). Partial identification methods for evaluating food assistance programs: A case study of the causal impact of SNAP on food insecurity. American Journal of Agricultural Economics 99 (4), 875–893.
Gupta, A., H. Zhu, M. K. Doan, A. Michuda, and B. Majumder (2021). Economic impacts of the covid-19 lockdown in a remittance-dependent region. American Journal of Agricultural Economics 103 (2), 466–485.
Hanushek, E. A. (2016). Will more higher education improve economic growth? Oxford Review of Economic Policy 32 (4), 538–552.
Hausman, N. (2020). University innovation and local economic growth. The Review of Economics and Statistics, 1–46.
Hidrobo, M., J. Hoddinott, N. Kumar, and M. Olivier (2018). Social protection, food security, and asset formation. World Development 101, 88–103.
Hirvonen, K., A. de Brauw, and G. T. Abate (2021). Food consumption and food security during the covid-19 pandemic in addis ababa. American journal of agricultural economics 103 (3), 772–789.
Hook, J. L. and E. Paek (2020). National family policies and mothers’ employment: How earnings inequality shapes policy effects across and within countries. American Sociological Review 85 (3), 381–416.
Huang, B., M. Tani, Y. Wei, and Y. Zhu (2022). Returns to education in china: Evidence from the great higher education expansion. China Economic Review, 101804.
ILO (2014). Maternity and paternity at work: Law and practice across the world. International Labour Office.
ILO (2016). 2016 Report on informal employment in Vietnam. International Labour Office.
Imbens, G. W. and J. M. Wooldridge (2009). Recent Developments in the Econometrics of Program Evaluation.
Journal of Economic Literature 47 (1), 5–86.
International Monetary Fund (2020). World economic outlook, Oct 2020: A long and difficult ascent. International Monetary Fund.
James, G., D. Witten, T. Hastie, and R. Tibshirani (2013). An introduction to statistical learning, Volume 112. Springer.
Jensen, R. T. and N. H. Miller (2010, November). A revealed preference approach to measuring hunger and undernutrition. NBER Working Paper No. 16555.
Jones, A. D., A. Shrinivas, and R. Bezner-Kerr (2014). Farm production diversity is associated with greater household dietary diversity in malawi: Findings from nationally representative data. Food Policy 46, 1–12.
Katz, L. F. and K. M. Murphy (1992). Changes in relative wages, 1963–1987: supply and demand factors. The quarterly journal of economics 107 (1), 35–78.
Khanna, G. (2023). Large-scale education reform in general equilibrium: Regression discontinuity evidence from india. Journal of Political Economy 131 (2), 000–000.
Klasen, S. (2019). What explains uneven female labor force participation levels and trends in developing countries? The World Bank Research Observer 34 (2), 161–197.
Klasen, S., T. T. N. Le, J. Pieters, and M. Santos Silva (2021). What drives female labour force participation? comparable micro-level evidence from eight developing and emerging economies. The Journal of Development Studies 57 (3), 417–442.
Krafft, C. (2020). Why is fertility on the rise in egypt? the role of women’s employment opportunities. Journal of Population Economics 33 (4), 1173–1218.
Kreibaum, M. and S. Klasen (2015). Missing men: differential effects of war and socialism on female labour force participation in vietnam. Technical report, Discussion Papers.
Kuku, O., C. Gundersen, and S. Garasky (2011). Differences in food insecurity between adults and children in zimbabwe. Food Policy 36 (2), 311–317.
Kuma, T., M. Dereje, K. Hirvonen, and B. Minten (2019).
Cash crops and food security: Evidence from ethiopian smallholder coffee producers. The Journal of Development Studies 55 (6), 1267–1284.
La Porta, R. and A. Shleifer (2014). Informality and development. Journal of economic perspectives 28 (3), 109–126.
Lentz, E., H. Michelson, K. Baylis, and Y. Zhou (2019). A data-driven approach improves food insecurity crisis prediction. World Development 122, 399–409.
Lentz, E. C. and C. B. Barrett (2013). The economics and nutritional impacts of food assistance policies and programs. Food Policy 42, 151–163.
Lentz, E. C., C. B. Barrett, M. I. Gómez, and D. G. Maxwell (2013). On the choice and impacts of innovative international food assistance instruments. World Development 49, 1–8.
Levinsohn, J. and A. Petrin (2003). Estimating production functions using inputs to control for unobservables. The review of economic studies 70 (2), 317–341.
Lewbel, A., Y. Dong, and T. T. Yang (2012). Comparing features of convenient estimators for binary choice models with endogenous regressors. Canadian Journal of Economics/Revue canadienne d’économique 45 (3), 809–829.
Lewis, E. (2011). Immigration, skill mix, and capital skill complementarity. The Quarterly Journal of Economics 126 (2), 1029–1069.
Lewis, E. (2013). Immigration and production technology. Annu. Rev. Econ. 5 (1), 165–191.
Li, H., P. Loyalka, S. Rozelle, and B. Wu (2017). Human capital and china’s future growth. Journal of Economic Perspectives 31 (1), 25–48.
Li, H., Y. Ma, L. Meng, X. Qiao, and X. Shi (2017). Skill complementarities and returns to higher education: Evidence from college enrollment expansion in china. China Economic Review 46, 10–26.
Li, S., J. Whalley, and C. Xing (2014). China’s higher education expansion and unemployment of college graduates. China Economic Review 30, 567–582.
Liu, S. (2015). Spillovers from universities: Evidence from the land-grant program. Journal of Urban Economics 87, 25–41.
Liu, S. and X. Yang (2021).
Human capital externalities or consumption spillovers? the effect of high-skill human capital across low-skill labor markets. Regional Science and Urban Economics 87, 103620.
Liu, Y., C. B. Barrett, T. Pham, and W. Violette (2020). The intertemporal evolution of agriculture and labor over a rapid structural transformation: Lessons from vietnam. Food Policy 94, 101913.
Lovenheim, M. F. and J. Smith (2022). Returns to different postsecondary investments: Institution type, academic programs, and credentials. Technical report, National Bureau of Economic Research.
Malesky, E. and M. Taussig (2009). Out of the gray: The impact of provincial institutions on business formalization in vietnam. Journal of East Asian Studies 9 (2), 249–290.
Martellini, P., T. Schoellman, and J. Sockin (2022). The global distribution of college graduate quality. Federal Reserve Bank of Minneapolis Working Paper 791.
Maxwell, D., B. Vaitla, and J. Coates (2014). How do indicators of household food insecurity measure up? an empirical comparison from ethiopia. Food Policy 47, 107–116.
Mazzolari, F. and G. Ragusa (2013). Spillovers from high-skill consumption to low-skill labor markets. Review of Economics and Statistics 95 (1), 74–86.
McCaig, B. and N. Pavcnik (2013). Moving Out of Agriculture: Structural Change in Vietnam. Technical report, National Bureau of Economic Research.
McCaig, B. and N. Pavcnik (2015). Informal employment in a growing and globalizing low-income country. American Economic Review 105 (5), 545–50.
McCaig, B. and N. Pavcnik (2018). Export markets and labor allocation in a low-income country. American Economic Review 108 (7), 1899–1941.
McGuinness, S., E. Kelly, T. T. P. Pham, T. T. T. Ha, and A. Whelan (2021). Returns to education in vietnam: A changing landscape. World Development 138, 105205.
McMillan, M., D. Rodrik, and C. Sepulveda (2017). Structural Change, Fundamentals and Growth: A Framework and Case Studies.
Technical report, National Bureau of Economic Research.
Melly, B. and G. Santangelo (2015). The Changes-in-Changes Model with Covariates. Universität Bern, Bern.
Ministry of Health (2021). New information from the ministry of health about the third wave. https://ncov.moh.gov.vn/web/guest/-/6847426-2117. [Online; accessed 4th May 2021].
Ministry of Industry and Trade (2020). Báo cáo tình hình sản xuất công nghiệp và hoạt động thương mại tháng 7 và 7 tháng đầu năm 2020, giải pháp thực hiện trong thời gian tới [Report on industrial production and trade in July and the first 7 months of 2020: Policy implication]. Accessed on Sep 1st 2020 via https://www.moit.gov.vn/web/guest/bao-cao-tong-hop1.
Minnesota Population Center (2019). Integrated public use microdata series, International: Version 7.2 [dataset]. Minneapolis, MN: IPUMS. https://doi.org/10.18128/D020.V7.2.
Mishra, K. and J. Rampal (2020). The covid-19 pandemic and food insecurity: A viewpoint on india. World Development 135, 105068.
Moretti, E. (2004a). Estimating the social return to higher education: evidence from longitudinal and repeated cross-sectional data. Journal of econometrics 121 (1-2), 175–212.
Moretti, E. (2004b). Workers’ education, spillovers, and productivity: evidence from plant-level production functions. American Economic Review 94 (3), 656–690.
Morisset, J., V. T. Dinh, Q. H. Doan, D. M. Pham, D. Mandani, O. Pimhidzai, K.-A. Kaiser, D. V. Do, A. F. Alatabani, and J. Yang (2020). Taking stock : What will be the new normal for Vietnam? - The Economic impact of COVID-19. Accessed via http://documents.worldbank.org/curated/en/101991595365511590/Taking-Stock-What-will-be-the-New-Normal-for-Vietnam-The-Economic//-Impact-of-COVID-19.
National Institute of Nutrition (2007). Vietnamese food composition table. Hanoi, Vietnam: Ministry of Health.
Newell, R. G., B. C. Prest, and S. Sexton (2018). The gdp-temperature relationship: implications for climate change damages. Resour. Future Work.
Pap.
Newman, C., J. Rand, T. Talbot, and F. Tarp (2015). Technology transfers, foreign investment and productivity spillovers. European Economic Review 76, 168–187.
Nguyen, C. V. (2017). Do minimum wages affect firms’ labor and capital? evidence from vietnam. Journal of the Asia Pacific Economy 22 (2), 291–308.
Nguyen, P. H., S. Kachwaha, A. Pant, L. M. Tran, S. Ghosh, P. K. Sharma, V. D. Shastri, J. Escobar-Alegria, R. Avula, and P. Menon (2021). Impact of covid-19 on household food insecurity and interlinkages with child feeding practices and coping strategies in uttar pradesh, india: a longitudinal community-based study. BMJ open 11 (4), e048738.
Nikulkov, A., C. B. Barrett, A. G. Mude, and L. M. Wein (2016). Assessing the impact of us food assistance delivery policies on child mortality in northern kenya. PloS one 11 (12), e0168432.
Niles, M. T. and M. E. Brown (2017). A multi-country assessment of factors related to smallholder food security in varying rainfall conditions. Scientific reports 7 (1), 1–11.
Olley, G. S. and A. Pakes (1996). The dynamics of productivity in the telecommunications equipment. Econometrica 64 (6), 1263–1297.
Ost, B., W. Pan, and D. Webber (2018). The returns to college persistence for marginal students: Regression discontinuity evidence from university dismissal policies. Journal of Labor Economics 36 (3), 779–805.
Parajuli, D., D. K. Vo, J. Salmi, and N. T. A. Tran (2020). Improving the performance of higher education in vietnam: Strategic priorities and policy options.
Paslakis, G., G. Dimitropoulos, and D. K. Katzman (2020). A call to action to address covid-19–induced global food insecurity to prevent hunger, malnutrition, and eating pathology. Nutrition reviews.
Patrinos, H. A., P. V. Thang, and N. D. Thanh (2018). The economic case for education in vietnam. World Bank Policy Research Working Paper (8679).
Pettit, B. and J. Hook (2005). The structure of women’s employment in comparative perspective.
Social Forces 84 (2), 779–801.
Phan, D. and I. Coxhead (2013). Long-run costs of piecemeal reform: wage inequality and returns to education in vietnam. Journal of Comparative Economics 41 (4), 1106–1122.
Picchioni, F., L. F. Goulao, and D. Roberfroid (2021). The impact of covid-19 on diet quality, food security and nutrition in low and middle income countries: A systematic review of the evidence. Clinical Nutrition.
Porzio, T., F. Rossi, and G. Santangelo (2022). The Human Side of Structural Transformation. American Economic Review 112 (8), 2774–2814.
Rand, J. and F. Tarp (2020). Micro, small, and medium enterprises in Vietnam. Oxford University Press.
Ratcliffe, C., S.-M. McKernan, and S. Zhang (2011, July). How much does the supplemental nutrition assistance program reduce food insecurity? American Journal of Agricultural Economics 93 (4), 1082–1098.
Reardon, T., M. F. Bellemare, D. Zilberman, et al. (2020). How covid-19 may disrupt food supply chains in developing countries. IFPRI book chapters, 78–80.
Rossin-Slater, M. (2017). Maternity and family leave policy. Technical report, National Bureau of Economic Research.
Ruan, J., Q. Cai, and S. Jin (2021). Impact of covid-19 and nationwide lockdowns on vegetable prices: Evidence from wholesale markets in china. American journal of agricultural economics.
Ruhm, C. J. (1998). The economic consequences of parental leave mandates: Lessons from europe. The quarterly journal of economics 113 (1), 285–317.
Russell, L. C., L. Yu, and M. J. Andrews (2022). Higher Education and Local Educational Attainment: Evidence from the Establishment of US Colleges. The Review of Economics and Statistics, 1–32.
Sánchez, A. and A. Singh (2018). Accessing higher education in developing countries: Panel data analysis from india, peru, and vietnam. World Development 109, 261–278.
Savy, M., S. Fortin, Y. Kameli, S. Renault, C. Couderc, A. Gamli, K. Amouzou, M. Perenze, and Y. Martin-Prével (2020).
Impact of a food voucher program in alleviating household food insecurity in two cities in senegal during a food price crisis. Food Security, 1–14.
Schanzenbach, D. and A. Pitts (2020). How much has food insecurity risen? evidence from the census household pulse survey. Institute for Policy Research Rapid Research Report.
Schoellman, T. (2012). Education quality and development accounting. The Review of Economic Studies 79 (1), 388–417.
Selwaness, I. and C. Krafft (2021). The dynamics of family formation and women’s work: what facilitates and hinders female employment in the middle east and north africa? Population Research and Policy Review 40 (3), 533–587.
Strang, L. and M. Broeks (2017). Maternity leave policies: trade-offs between labour market demands and health benefits for children. Rand Health Quarterly 6 (4).
Swindale, A. and P. Bilinsky (2006). Household dietary diversity score (hdds) for measurement of household food access: Indicator guide.
Swindale, A. and P. Bilinsky (2016). Household dietary diversity score (HDDS) for measurement of household food access: Indicator guide (v.2). Washington, D.C.: FHI 360/FANTA.
Tarp, F. (2017). Growth, Structural Transformation, and Rural Change in Viet Nam: A Rising Dragon on the Move. Oxford University Press.
The World Bank (2018). Climbing the ladder: Poverty reduction and shared prosperity in Vietnam.
The World Bank (2020). In Global outlook: Pandemic, recession: The global economy in crisis.
The World Bank (2020). Migration and development brief 33 October 2020–Phase II: COVID-19 Crisis through a Migration Lens.
The World Bank (2020, September). Vietnam macro monitoring (english). Accessed via http://documents1.worldbank.org/curated/en/213631600443823992/pdf/Vietnam-Macro-Monitoring.pdf.
The World Bank (2020). Improving the Performance of Higher Education in Vietnam: Strategic Priorities and Policy Options.
Thi, H. T., M. Simioni, and C. Thomas-Agnan (2018).
Assessing the nonlinearity of the calorie-income relationship: An estimation strategy – With new insights on nutritional transition in Vietnam. World Development 110, 192–204.
Thomas, M. (2016). The impact of mandated maternity benefits on the gender differential in promotions: Examining the role of adverse selection.
Tran, A. (2013). Ties that bind: Cultural identity, class, and law in Vietnam’s labor resistance. Southeast Asia Program Publications.
Tran, A. N. and S. Jeppesen (2016). Smes in their own right: The views of managers and workers in vietnamese textiles, garment, and footwear companies. Journal of business ethics 137, 589–608.
Tuoi Tre online (2020). Stop all flights between vietnam and china starting from this evening. [Online; accessed 4th May 2021].
Uribe, A. M. T., C. O. Vargas, and N. R. Bustamante (2019). Unintended consequences of maternity leave legislation: The case of colombia. World Development 122, 218–232.
Vaitla, B., J. Coates, L. Glaeser, C. Hillbruner, P. Biswal, D. Maxwell, et al. (2017). The measurement of household food security: Correlation and latent variable analysis of alternative indicators in a large multi-country dataset. Food Policy 68 (C), 193–205.
Vaitla, B., J. Coates, and D. Maxwell (2015). Comparing household food consumption indicators to inform acute food insecurity phase classification. Washington, DC: FHI 360.
Valero, A. (2021). Education and Economic Growth.
Vandenbussche, J., P. Aghion, and C. Meghir (2006). Growth, distance to frontier and composition of human capital. Journal of economic growth 11 (2), 97–127.
Verpoorten, M., A. Arora, N. Stoop, and J. Swinnen (2013). Self-reported food insecurity in africa during the food price crisis. Food Policy 39, 51–63.
Vietnam National Administration of Tourism (2021a). International tourism revenue. https://vietnamtourism.gov.vn/index.php/statistic/receipts.
Vietnam National Administration of Tourism (2021b). International tourism statistics.
http://www.vietnamtourism.gov.vn/index.php/statistic/international?txtkey=&year=2019&period=t12.
Vietnambiz (2020a). Pork prices in vietnam. https://vietnambiz.vn/gia-thit-heo-hom-nay-15-8-giam-10-tai-cua-hang-vissan-20200815084310709.html. [Online; accessed 16th August 2020].
Vietnambiz (2020b). Rice prices in vietnam. https://vietnambiz.vn/gia-gao.html. [Online; accessed 16th August 2020].
von Carnap, T. (2021). Remotely-sensed market activity as a short-run economic indicator in rural areas of developing countries. Available at SSRN 3980969.
Vu, L. H. and A. T. Nguyen (2018). Analysis of access and equity in the vietnamese higher education system. VNU Journal of Science: Policy and Management Studies 34 (4), 65–80.
Webb, P., J. Coates, E. A. Frongillo, B. L. Rogers, A. Swindale, and P. Bilinsky (2006). Measuring household food insecurity: why it’s so important and yet so difficult to do. The Journal of nutrition 136 (5), 1404S–1408S.
Wolfson, J. A. and C. W. Leung (2020). Food insecurity and covid-19: Disparities in early effects for us adults. Nutrients 12 (6), 1648.
World Bank (1995). Viet Nam: Economic Report on Industrialization and Industrial Policy. World Bank, DC.
World Food Programme (2015). Community-based targeting guide. Kenya: World Food Programme.
Xing, C., P. Yang, and Z. Li (2018). The medium-run effect of china’s higher education expansion on the unemployment of college graduates. China Economic Review 51, 181–193.
Yang, J., P. Panagoulias, and G. Demarchi (2020). Monitoring covid-19 impacts on households in vietnam, report no. 1.
Zhou, Y., E. Lentz, H. Michelson, C. Kim, and K. Baylis (2021). Machine learning for food security: Principles for transparency and usability. Applied Economic Perspectives and Policy.
Zimmerman, S. D. (2014). The returns to college admission for academically marginal students. Journal of Labor Economics 32 (4), 711–754.
Appendix A

Higher Education Expansion, Labor Market, and Firm Productivity

A.1 Additional Tables and Figures

Table A.1: Summary statistics by year and district

                                Control                  Treatment
                          2006-2011  2012-2018     2006-2011  2012-2018
Log TFP                     2.512      2.530         2.530      2.541
                           (0.141)    (0.141)       (0.131)    (0.143)
Labor productivity         17.686     17.962        17.812     17.986
                           (0.972)    (1.133)       (0.943)    (1.231)
Capital-labor ratio        19.839     20.014        20.007     20.197
                           (1.225)    (1.332)       (1.168)    (1.265)
Log total employment        2.431      2.364         2.330      2.222
                           (1.322)    (1.530)       (1.235)    (1.464)

Table A.2: Summary statistics by year and district

                                                      Control              Treatment
                                                  2011    2015-2019     2011    2015-2019
% of adults who complete college or higher        0.064     0.086       0.157     0.206
                                                 (0.054)   (0.073)     (0.110)   (0.127)
% of adults who are employed                      0.325     0.400       0.526     0.596
                                                 (0.136)   (0.165)     (0.125)   (0.110)
% of non-college adults who are employed          0.285     0.353       0.454     0.513
                                                 (0.129)   (0.164)     (0.106)   (0.104)
% of college-educated adults who are employed     0.924     0.891       0.903     0.893
                                                 (0.095)   (0.130)     (0.076)   (0.058)
Log monthly wage                                 10.111    10.299      10.329    10.643
                                                 (0.218)   (0.362)     (0.227)   (0.257)
Log monthly wage of non-college workers          10.034    10.245      10.199    10.568
                                                 (0.223)   (0.386)     (0.200)   (0.238)
Log monthly wage of college-educated workers     10.399    10.664      10.577    10.848
                                                 (0.226)   (0.254)     (0.235)   (0.259)
Skill premium                                     0.360     0.408       0.378     0.280
                                                 (0.252)   (0.363)     (0.193)   (0.179)

Figure A.1: Event study estimation (cohort) for other labor market outcomes
(a) Formal employment (b) Occupational wage rank (c) Industry skill intensity rank
Note: The graphs display event study estimates of the effects of the higher education expansion on other labor market outcomes. Formal employment is defined as having a job that provides social insurance. Occupational wage rank measures how well-paid a given occupation is on the wage ladder; a higher rank means a higher-paid occupation. Industry skill intensity rank measures how many college-educated workers are in a given industry relative to other industries.
All models control for age, age squared, and gender. Standard errors are clustered at the province level and 95% confidence intervals are displayed.

Figure A.2: Event Study for Worker-Level Effects Using Different Control Groups
(a) Complete college (b) Employment (c) Monthly wage (d) Agricultural employment (e) Manufacturing employment (f) Service employment

Note: Event study estimation for the effects of the higher education expansion on individual-level outcomes using different control groups. All models control for age, age squared, and gender. Standard errors are clustered at the province level and 95% confidence intervals are displayed.

A.2 Estimating total factor productivity at the firm level

Consider the following Cobb-Douglas production function for firm $i$ in year $t$:

$$VA_{it} = \beta_l L_{it} + \beta_k K_{it} + \omega_{it} + u_{it}$$

where $VA_{it}$ is the annual value-added, $L_{it}$ is total labor, $K_{it}$ is capital, measured as the value of assets at the beginning of the year (Newman et al., 2015), and $\omega_{it}$ is the unobserved productivity shock. OLS is typically biased because both $L_{it}$ and $K_{it}$ are likely affected by the unobserved productivity shock. We therefore first assume that firms’ investment decision is a function of labor, capital, and the productivity shock, i.e., $I_{it} = f_t(L_{it}, K_{it}, \omega_{it})$,¹ which makes $\omega_{it}$ observable in the production function (by inverting $f_t$):

$$VA_{it} = \beta_l L_{it} + \beta_k K_{it} + f_t^{-1}(L_{it}, K_{it}, I_{it}) + u_{it}$$

This approach by Ackerberg et al. (2015) (ACF) differs from two other conventional approaches to estimating production functions, namely Olley and Pakes (1996) (OP) and Levinsohn and Petrin (2003) (LP), which do not include labor input as part of firms’ investment decision. Not allowing labor input to enter the investment function means that $L_{it}$ is a deterministic function of capital and investment and, hence, would be functionally dependent on the inverse function of investment; in other words, the coefficient of labor would not be identified.
Assuming that the productivity shock follows a first-order Markov process, we can write

$$\omega_{it} = g(\omega_{it-1}) + \zeta_{it}$$

where $g(\omega_{it-1})$ is the predictable component and $\zeta_{it}$ is the unpredictable (innovation) component of productivity (Olley and Pakes, 1996). We also assume the following capital formation process:

$$K_{it} = (1 - \delta) K_{it-1} + I_{it-1}$$

where $\delta$ is the depreciation rate. These assumptions give us $E[\zeta_{it} \mid I_{it-1}] = 0$ and $E[\zeta_{it} \mid K_{it}] = 0$ (since $K_{it}$ is determined at $t-1$). Lastly, we assume that $E[\zeta_{it} \mid L_{it-1}] = 0$ (Ackerberg et al., 2015). Given this set of moment conditions, we can estimate $\beta_l$ and $\beta_k$.

Given that this approach requires panel data, we aggregate the firm-level variables to the district-by-year-by-industry level. We then estimate the production function by sector and present the estimation results in Table A.3. Once we obtain the estimates for $\beta_l$ and $\beta_k$, we can use them to calculate $\omega_{it}$ for each firm in each year, which is also our measure of total factor productivity (TFP).

¹ Investment is measured as annual change in value of fixed and long-term assets plus accumulated depreciation (Newman et al., 2015).
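The last step above, recovering productivity from the estimated elasticities, can be illustrated with a minimal sketch. It assumes that value-added, labor, and capital are already in logs and simply subtracts the predicted input contribution, so the result also contains the idiosyncratic error $u_{it}$. The function name `tfp_residual` and the input values are hypothetical; only the manufacturing coefficients are taken from Table A.3.

```python
import numpy as np

# Manufacturing elasticities as reported in Table A.3.
BETA_K_MANUFACTURING = 0.342
BETA_L_MANUFACTURING = 0.919


def tfp_residual(log_va, log_l, log_k, beta_l, beta_k):
    """Return log value-added minus the contribution of labor and capital.

    Assumption: all inputs are in logs, so the residual equals
    omega_it + u_it from the production function above.
    """
    log_va = np.asarray(log_va, dtype=float)
    log_l = np.asarray(log_l, dtype=float)
    log_k = np.asarray(log_k, dtype=float)
    return log_va - beta_l * log_l - beta_k * log_k


# Made-up observations for illustration only (not from the dissertation data).
log_va = [20.0, 21.0]
log_l = [3.0, 4.0]
log_k = [19.0, 20.0]

omega_hat = tfp_residual(
    log_va, log_l, log_k, BETA_L_MANUFACTURING, BETA_K_MANUFACTURING
)
print(omega_hat)
```

This sketch covers only the final residual calculation; estimating the elasticities themselves requires the moment conditions described above and is not shown.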
Table A.3: Production function estimation results

Sector                                   Capital              Labor
Agriculture                              0.271*** (0.098)     1.143*** (0.392)
Mining                                   0.222*** (0.031)     1.063*** (0.048)
Manufacturing                            0.342*** (0.031)     0.919*** (0.041)
Waste and electricity                    0.380*** (0.065)     0.980*** (0.126)
Construction                             0.198*** (0.028)     0.948*** (0.028)
Wholesale and retail                     0.201*** (0.036)     1.386*** (0.055)
Transportation                           0.057    (0.038)     1.103*** (0.055)
Hospitality                              0.222*** (0.071)     0.682*** (0.186)
Information and communication            0.314*** (0.062)     1.035*** (0.105)
Finance, banking, and real estate        0.175*** (0.024)     1.348*** (0.088)
Science and technology                   0.123*** (0.035)     1.384*** (0.050)
Administrative and support               0.156*** (0.041)     1.144*** (0.046)
Education, health, and social support   -0.116    (0.191)     1.535*** (0.296)
Entertainment                           -0.112    (0.166)     1.677*** (0.268)
Other services                           0.236    (0.250)     1.220**  (0.584)

Appendix B

Maternity Benefits Mandate and Women’s Choice of Work

B.1 Additional Tables and Figures

Figure B.1: Reasons for not working, 2014–18
[Figure: two panels, Male and Female, showing shares (%) by age group (25-44 and 45-54) for each reason: Studying, Housework, Old/retired, Disabled, Cannot find a job, Others.]

Source: Authors’ calculations based on the VHLSS 2014–18 for individuals aged 22–65.
Figure B.2: DiD estimates for effects by district–age-group birth-rate bin by gender
(a) Formal employment (b) Informal waged employment (c) Not working (d) Agricultural household work (e) Non-agricultural household work (f) Log monthly income
[Figure: each panel plots estimated coefficients and 95% CIs by birth rate group (1 to 5), separately for men and women.]

Note: The plots report the results from estimating the DiD model in Equation 3.5, in which treatment effects are allowed to vary by birth-rate groups. Birth rates vary by district and age, and are binned into five equal groups. All models control for urban, ethnicity, household size, number of children under age 10 in the household, educational attainment, and marital status. The model is estimated separately for men and women aged 25–45 in the VHLSS 2010–18 sample. Standard errors are clustered at the commune–year level and sampling weights are applied in the regressions.

Source: Authors’ calculations based on the VHLSS 2010–18.

B.2 Overview of the Vietnamese labor market composition

Compared to other low- to middle-income countries, Viet Nam has a relatively high female labor force participation rate and gender equality (Klasen et al., 2021). In this section, we provide an overview of the demographic composition of the labor market in Vietnam and its changes from 2010 to 2018 using the Vietnam Household Living Standard Survey (VHLSS).
The labor force participation rate among women aged 25–54 is slightly lower than that of men in the same age group. The female labor force participation rate of college-educated women is roughly 95 per cent and that of non-college-educated women is roughly 92 per cent; these rates were relatively stable during the 2010–18 period, as indicated in Figure B.3. The labor force participation rate among college-educated and non-college-educated men is roughly 97 per cent, but for college-educated men it increased to 98 per cent in 2018. Married individuals are more likely to work than are unmarried individuals, but participation among unmarried men increased from 88.2 per cent to 90.7 per cent during 2014–17.

Figure B.3: Labor force participation of men and women aged 25–54 in 2018 by college education and marital status
(a) By education (b) By marital status
[Figure: labor force participation by year, 2010–2018. Panel (a) plots non-college men, college-educated men, non-college women, and college-educated women; panel (b) plots unmarried men, married men, unmarried women, and married women.]

Note: The sample includes all men and women aged 25–54.

Source: Authors’ calculations based on the Viet Nam Household Living Standard Survey (VHLSS) 2010–18 (see text for description).

The labor market composition is also remarkably different across gender and college education, as illustrated in Figure B.4. Most non-college-educated men and women work for the household business, which is typically unpaid, while college-educated men and women mainly work in the formal sector (defined as wage employment that provides social insurance). Only a small share of non-college-educated women are casual wage workers (waged employment without social insurance) relative to non-college-educated men.
The labor market composition is relatively stable over time, although the share of non-college-educated men and women who work in the formal sector appears to increase over time.

Figure B.4: Labor market composition in 2010-2018 by gender and college education
(a) Non-college women (b) College-educated women (c) Non-college men (d) College-educated men
[Figure: each panel shows employment status shares by year, 2010–2018, for four categories: Not working, Family business, Casual work, Formal employment.]

Note: The sample includes all men and women aged 25–54.

Source: Authors’ calculations based on the VHLSS 2010–18.

The gender gap in formal employment, defined as the difference between the share of men with formal employment and the share of women with formal employment, appears to have declined across the country during this period, as indicated in Figure B.5. The Red River Delta has the highest and the Central Highlands has the lowest gender gap in 2010. By 2018, most regions had already reversed the gender gap, except for the Southeast region. This pattern is caused by the share of women working in the formal sector rising faster than the share of men in the formal sector.¹ One potential explanation for this shift in the gender gap is the fact that the share of women who are college-educated also rises faster than the share of men who are college-educated in these regions.
Figure B.5: Gender gap in formal employment by region and year
[Figure: six panels plotting the gender gap by year, 2010–2018, one per region: Red River Delta, Northern Midland and Mountainous, Central Coast, Central Highlands, Southeast, Mekong River Delta.]

Note: The sample includes all men and women aged 25–54.

Source: Authors’ calculations based on the VHLSS 2010–18.

¹ According to the VHLSS data, the share of women with household unpaid work decreases faster than the share of men with household unpaid work. As a result, there is an increase in the gender gap in household unpaid work during 2010–18.

Appendix C

Income Shock and Food Insecurity during the Pandemic

C.1 Additional Tables and Figures

Table C.1: First-stage estimates for income effects on food insecurity

Specification                    (1)        (2)        (3)        (4)        (5)        (6)
Panel A: Bartik IV             1.361***   1.587***   1.567***   0.246***   0.368***   0.440***
                               (0.044)    (0.047)    (0.063)    (0.068)    (0.071)    (0.075)
                               [0.000]    [0.000]    [0.000]    [0.000]    [0.000]    [0.000]
N                               46443      46443      27462      46443      46443      46443
Panel B: Non-ag. Bartik IV     1.444***   1.669***   1.902***   0.189***   0.472***   0.421***
                               (0.050)    (0.055)    (0.074)    (0.072)    (0.078)    (0.081)
                               [0.000]    [0.000]    [0.000]    [0.009]    [0.000]    [0.000]
N                               46443      46443      27462      46443      46443      46443

Province FE ✓ ✓ ✓ ✓
District FE ✓
Household FE ✓
Year FE ✓ ✓
Province FE x linear trends ✓
Province x Year FE ✓

The table reports the first-stage results from the IV estimations in Tables 4.2 and 4.6. Standard errors are clustered at the district level and are reported in parentheses. P-values are reported in brackets. The household fixed effects model does not include 2009 controls.
All models control for urban, household size, and the fraction of households with a postsecondary education. *** p < 0.01, ** p < 0.05, * p < 0.1

Figure C.1: Predicted probability distributions by actual food insecurity status

Note: The graph shows the distributions of the predicted probability of food insecurity by household’s actual food insecurity status. The predicted probability of food insecurity is obtained from estimating the IV regression on the 2010-2018 VHLSS data. See text for more details.

(a) IV (b) IV Probit
Figure C.2: Difference in prevalence and Cohen’s Kappa statistics to choose the optimal threshold for high-risk households for the linear probability model

Note: For each value c between the 0th and 99th percentiles, we use c as a threshold to classify “high-risk” households (based on the predicted probability). For each threshold, we calculate two statistics: the difference in prevalence, i.e., the share of “high-risk” households minus the share of households that are actually food insecure, and Cohen’s Kappa statistic. We plot both statistics for all thresholds in this graph. Predicted probability is generated from the IV and IV Probit models as explained in Section 4.3.3.

(a) Validation using 2016 data (b) Validation using 2018 data
Figure C.3: Predicted and actual food insecurity of 2018 for IV probit and linear IV

Note: Predicted food insecurity measures each province’s share of households with high predicted food insecurity risk; actual food insecurity measures the share of households with actual food insecurity in each province using the VHLSS sample in year y. Predicted food insecurity risk is obtained by estimating the effect of income on food insecurity using the pre-y VHLSS sample and predicting using the LFS sample in y.

C.2 Validity of food insecurity measurement

In this section, we discuss the validity of the paper’s main measure of food insecurity.
We compare the main measure of food insecurity of this study, that is, when a household’s staple calorie share is above 84%, with two other food insecurity measures that are commonly used: the measure using the HDDS and the self-reported food insecurity. Given that food insecurity is strongly related to the ability to afford everyday meals, a valid food insecurity measure would strongly correlate with household income and wealth. One may also expect that rural households and ethnic minority households would be more likely to be food insecure as they are more disadvantaged than urban and Kinh households. In Figure C.4, we plot the percentage of households classified as food insecure using different measures by household income deciles (Figure (a)) and household wealth deciles (Figure (b)) for urban and rural households separately. In Figure (c), we also plot the percentage of food-insecure households by the household head’s ethnicity.

All three measures of food insecurity decrease as household income and wealth increase, and this is true for both rural and urban households. However, self-reported food insecurity is considerably lower than the other two measures. This discrepancy is partly due to selection bias: households only answer the question about food insecurity if they have an official poverty status, so poor households and near-poor households without such a status would not be considered food insecure. Another reason is that the staple calorie share approach and the HDDS approach focus on the quality of diet besides quantity, so food-insecure households may have a similar number of meals but a very different quality than food-secure households. Our staple calorie share approach is more similar to the HDDS approach for this reason as well. We also find that all three measures of food insecurity are higher among ethnic minority households than Kinh households. These patterns are also consistent when we aggregate the data at the province level.
In Figure C.5, we graph the scatter plots of different food insecurity measures against poverty at the province level, which tells a similar story: food insecurity measured by the staple calorie share and by HDDS is strongly and positively correlated with poverty, while self-reported food insecurity is positively correlated with poverty but not as strongly.

These results suggest that the food insecurity measure using the staple calorie share is valid since it is strongly correlated with income, wealth, and poverty, similar to the other common measures using the HDDS and self-reported data. The staple calorie share approach is better than the HDDS approach because it uses a theoretically derived cutoff for food insecurity, and it is better than the self-reported measure because it does not suffer from the selection bias described above.

(a) Income (b) Wealth (c) Ethnicity
Figure C.4: Food-insecure household by household income and wealth deciles

The figure reports the share of food-insecure households by income deciles, wealth index deciles, and ethnicity. A lower decile means lower income or lower wealth. The wealth index is constructed using principal component analysis for electricity, piped water, air conditioner, computer, washing machine, refrigerator, television, and radio.

C.3 Comparison between different targeting approaches

An important question is whether the targeting method proposed in this study performs much better than simpler methods such as targeting poor households, targeting districts that suffer the largest income reductions, or targeting districts with the largest increases in poverty. In this section, we calculate the inclusion error rate (IER) and exclusion error rate (EER) for each of these three approaches to measure the extent to which these alternative methods include households or districts that should not be targeted and exclude those that should be (Brown et al., 2018).
Consider the poor household targeting approach: the government sends food aid to households below the national poverty line before the pandemic. Let $poor_i$ be a binary variable that indicates whether household $i$ is poor in 2019, and let $\widehat{insecurity}_i$ be a binary variable that indicates whether the household has high food insecurity risk using the prediction approach of this study.¹ Let $w_i$ denote the household’s sampling weight from the LFS. The IER is the proportion of targeted (poor) households that are food secure and can be calculated as

$$IER = \frac{\sum_{i}^{N} w_i \cdot \mathbb{1}(poor_i = 1, \widehat{insecurity}_i = 0)}{\sum_{i}^{N} w_i \cdot \mathbb{1}(poor_i = 1)}.$$

In contrast, the EER is the proportion of food-insecure households that are not targeted because they are not poor before the pandemic, and it is calculated as

$$EER = \frac{\sum_{i}^{N} w_i \cdot \mathbb{1}(\widehat{insecurity}_i = 1, poor_i = 0)}{\sum_{i}^{N} w_i \cdot \mathbb{1}(\widehat{insecurity}_i = 1)}.$$

The IER of the poor household targeting approach is 56.38% (95% CI: 55.84, 56.93), and the EER is 11.05% (95% CI: 10.46, 11.63). In other words, 56.38% of targeted households are actually not food insecure. Only 11.05% of food-insecure households would be missed using this method.

¹ Recall that food insecurity risk is the predicted value from Equation 4.2 using the post-pandemic income, and high-risk households are those with risk above the 85th percentile cutoff.

Figure C.5: Food insecurity and poverty by province and year

The figure shows scatter plots for shares of food-insecure households and shares of poor households at the province-year level for the 2010–2018 data. Each dot is a province-year observation. Figure (a) shows food insecurity using the staple calorie share approach, (b) shows food insecurity using the HDDS approach, and (c) shows self-reported food insecurity.

Next, we use the same approach to assess two other alternative approaches: targeting districts with the highest income losses and targeting districts with the largest poverty increases.
To do this, we estimate changes in income and poverty due to COVID-19 for each district using the same method used to predict changes in food insecurity. Specifically, we use the 2019 LFS household data and the industry-specific income shocks to calculate the new income for each household due to these shocks, and then we estimate the percentage change in income for each district. We also use the original income and new income to identify the poverty status, that is, monthly income per capita below the national poverty line, for each household before and after the income shock. We then estimate the percentage point change in poverty for each district. Figure C.6 shows the scatter plots of district-level changes in food insecurity (our main result) against changes in income and changes in poverty.

To see why using income targeting and poverty targeting may lead to ineffective targeting, suppose the government chooses to target districts in the top 10% of those that experienced an increase in food insecurity; that is, their predicted food insecurity increases are at or above the 90th percentile. In other words, the government would want to target all districts to the right of the red dashed line in the plots.

Now suppose that the government does not use food insecurity targeting, but instead targets districts in the top 10% of those that saw a reduction in income; that is, their predicted income reductions are at or above the 90th percentile. In other words, the government would target all districts above the green solid line in Figure (a). In this case, all districts to the left of the red dashed line and above the green solid line would be wrongly targeted as they experience a food insecurity increase that is below the 90th percentile. In contrast, all districts to the right of the red dashed line and below the green solid line would be missed because their food insecurity increase is at or above the 90th percentile but they would not be targeted.
The correctly targeted districts are the dots in the upper-right quadrant of the graph, that is, to the right of the red dashed line and above the green solid line. In this scenario, the government would wrongly target 68 districts, miss 68 districts, and correctly target 12 districts. We apply the same IER and EER formulas for districts instead of households and calculate each district’s weight as one over the district’s population. The IER is 75.96% (95% CI: 66.80, 85.13) and the EER is 72.31% (95% CI: 62.37, 82.25).

(a) Comparing with income targeting (b) Comparing with poverty targeting
Figure C.6: Food insecurity change targeting versus income change targeting and poverty change targeting

Note: Figure (a) shows a scatter plot of the predicted percentage reduction in income against the predicted percentage point increase in food insecurity. Figure (b) shows a scatter plot of the predicted percentage point increase in poverty against the predicted percentage point increase in food insecurity. Each dot represents a district. The red dashed line shows the 90th percentile of the increase in food insecurity. The green solid line shows the 90th percentile of the income reduction in Figure (a) and the 90th percentile of poverty increase in Figure (b).

Now suppose that the government targets districts in the top 10% of districts that experienced an increase in poverty; that is, their predicted poverty increases are at or above the 90th percentile, which corresponds to all districts above the green solid line in Figure (b). Similarly, this approach would wrongly target all districts to the left of the red dashed line and above the green solid line and would miss all districts to the right of the red dashed line and below the green solid line. In this scenario, the government would wrongly target 59 districts, miss 20 districts, and correctly target 60 districts. The IER here would be 90.35% (95% CI: 85.64, 95.05) and the EER 77.82% (95% CI: 68.64, 87.00).
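The weighted error rates used in this section can be sketched in a few lines. The sketch follows the verbal definitions in the text: the IER is the weighted share of targeted units that are not at high food insecurity risk, and the EER is the weighted share of high-risk units that are not targeted. The function name `targeting_error_rates` and the toy data are hypothetical and are not taken from the LFS.

```python
import numpy as np


def targeting_error_rates(targeted, in_need, weights):
    """Return (IER, EER) as weighted shares.

    targeted: 1 if the unit (household or district) receives aid.
    in_need:  1 if the unit is classified as high food insecurity risk.
    weights:  sampling weights (or any other unit weights).
    """
    targeted = np.asarray(targeted, dtype=bool)
    in_need = np.asarray(in_need, dtype=bool)
    w = np.asarray(weights, dtype=float)

    # Inclusion error: targeted but not in need, as a share of all targeted.
    ier = w[targeted & ~in_need].sum() / w[targeted].sum()
    # Exclusion error: in need but not targeted, as a share of all in need.
    eer = w[in_need & ~targeted].sum() / w[in_need].sum()
    return ier, eer


# Toy example with six units (values are illustrative only).
poor = [1, 1, 1, 0, 0, 0]
high_risk = [1, 0, 0, 1, 0, 0]
weights = [1.0, 2.0, 1.0, 1.0, 3.0, 2.0]

ier, eer = targeting_error_rates(poor, high_risk, weights)
print(f"IER = {ier:.2f}, EER = {eer:.2f}")
```

The same function applies to the district-level comparisons by passing district indicators and district weights instead of household ones.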
These simple exercises suggest that although targeting using pre-pandemic poverty statuses of households or using income losses is simpler, the associated error rates are very high; using these methods means the government would spend a significant amount of resources on districts it does not intend to reach and miss the districts that actually need those resources.

C.4 Comparison between IV-based prediction and 2020 data

A major concern with our predictive approach is that we exploit local exposure to industry-wide labor demand shocks to household incomes to identify the effect of income on food insecurity. We also rely on external information about sector-wide income shocks and assume that all workers within the same sector have an equal chance of receiving such shocks. The income shocks from the pandemic might be very different in nature. If the pandemic income shocks are driven by unemployment in a nonrandom way, that is, more vulnerable workers lose their jobs and income, then the IV-based prediction is no longer accurate.

We address this concern by comparing our IV-based prediction with the food insecurity measure generated from 2020 household income data (using the 2020 LFS data). If there is a large difference between our prediction and the food insecurity measure generated from the actual data, we can conclude that the income shocks used in our IV estimation are not well suited to predicting the effects of income shocks caused by a pandemic. We plot each district’s share of high-risk households based on 2020 income data and the share of high-risk households based on our prediction in Figure C.7 (a) along with a 45-degree line as the benchmark. In Figure (b), we plot the distribution of the percentage point difference (in absolute value) between the two variables.
(a) IV (b) IV Probit
Figure C.7: Differences in share of “high-risk” households based on prediction and share of “high-risk” households based on 2020 income

Note: Figure (a) shows districts’ share of high-risk households based on 2020 income data and districts’ shares of high-risk households based on our prediction. Figure (b) shows the distribution of differences between these two shares.

We observe in Figure (a) that our IV-based prediction is fairly consistent with the food insecurity measure generated from the actual 2020 income data, although we also observe a few outlier districts. In Figure (b), we find that the difference between the two variables is 10 percentage points or less for 82.18% of districts. In other words, our predicted food insecurity measure is fairly close to the food insecurity measure based on the actual income data in 2020. This finding suggests our approach is compatible with predicting food insecurity during a pandemic, even though the income shocks caused by the pandemic might be driven mainly by unemployment or furlough.
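The comparison summarized above, the share of districts whose predicted and income-based measures differ by 10 percentage points or less, can be illustrated with a minimal sketch. The function name `share_within` and the four district values are hypothetical; the shares are assumed to be expressed in percentage points.

```python
import numpy as np


def share_within(predicted, actual, tol_pp=10.0):
    """Return absolute differences and the share of units within tol_pp.

    predicted, actual: district shares of high-risk households, in
    percentage points (assumed units for this sketch).
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    abs_diff = np.abs(predicted - actual)
    return abs_diff, float(np.mean(abs_diff <= tol_pp))


# Illustrative values for four districts (not from the 2020 LFS).
predicted_share = [12.0, 30.0, 8.0, 45.0]
actual_share = [15.0, 18.0, 9.0, 40.0]

abs_diff, share_close = share_within(predicted_share, actual_share)
print(abs_diff, share_close)
```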