Panneer Selvam, Harish
2021-08-16
2021-08-16
2021-05
https://hdl.handle.net/11299/223107
University of Minnesota M.S.M.E. thesis. May 2021. Major: Mechanical Engineering. Advisor: William Northrop. 1 computer file (PDF); ix, 75 pages.
On-board diagnostics (OBD) data contain valuable information, including real-world measurements of vehicle powertrain parameters. These data can be used to gain a richer, data-driven understanding of complex physical phenomena such as emissions formation during combustion. In this thesis, a physics-based artificial intelligence framework is developed to predict and analyze trends in engine-out NOx emissions of diesel and diesel-hybrid heavy-duty vehicles. This framework differs from the black-box machine learning models presented in previous literature because it incorporates engine combustion parameters that allow physical interpretation of the results. Based on chemical kinetics and the characteristics of diffusive combustion, NOx emissions from compression ignition engines depend primarily, and non-linearly, on three parameters: adiabatic flame temperature, intake oxygen concentration, and combustion time duration. Here, these parameters were calculated from available OBD data. Non-linear regression coupled with a novel Divergent Window Co-occurrence (DWC) Pattern Detection algorithm is observed to be an effective method to predict NOx emissions and to analyze the driving patterns in the OBD data where prediction errors are high. The proposed framework is validated for generalizability with a separate vehicle OBD dataset, a sensitivity analysis is performed on the prediction model, and its predicted values are compared with those from a black-box deep neural network. The results show that NOx emissions predictions using the proposed model have around 55% better root mean square error and around 60% higher mean absolute error compared to the baseline NOx prediction model from previously published work.
This framework serves as a transparent and interpretable physics-based model to predict NOx emissions using OBD data as input. Furthermore, linearizing the physics-based NOx equation provides an opportunity to evaluate several machine learning regression techniques. The results show that an ensemble learning bagging-type model such as random forest regression (RFR) is highly effective in predicting engine-out NOx emissions. We also show that real-world OBD data have high heterogeneity with clustered co-occurrences of vehicle parameters. In terms of accuracy, the developed RFR model provides an average 53% improvement in R² value and 42% better mean absolute error (MAE) for NOx emissions predictions compared to non-linear regression models, and it offers the opportunity to interpret the results because it is linked to physical parameters. We also perform a feature importance analysis for the RFR model and compare its prediction results with those of black-box deep neural network and non-linear regression models. Based on its high accuracy and interpretability, the developed RFR model has potential for use in on-board NOx prediction in engines of varying displacement and design.
en
Driving pattern analysis
Emissions modeling
Machine learning
Neural networks
On-board diagnostics data
physics-based AI
Physics-Based Artificial Intelligence Models for Vehicle Emissions Prediction
Thesis or Dissertation
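The abstract mentions linearizing the physics-based NOx equation so that regression techniques can be applied, and it reports accuracy as root mean square error and mean absolute error. The record does not give the equation itself, so the sketch below assumes a hypothetical Arrhenius-type power law, NOx = A · exp(-E / T_ad) · [O2]^b · tau^c, purely for illustration. Taking logarithms makes this form linear in the features 1/T_ad, ln[O2], and ln(tau), which ordinary least squares can fit. The function names, coefficient values, parameter ranges, and synthetic data are all assumptions and are not taken from the thesis.

```python
import math
import random


def features(t_ad, o2, tau):
    """Log-linearized features for the ASSUMED model form.

    t_ad: adiabatic flame temperature in K (scaled to 1000/T for conditioning)
    o2:   intake oxygen fraction
    tau:  combustion time duration in s
    """
    return [1.0, 1000.0 / t_ad, math.log(o2), math.log(tau)]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_log_linear(samples):
    """Ordinary least squares on ln(NOx) via the normal equations.

    samples: list of (t_ad, o2, tau, nox) tuples.
    Returns [ln A, -E (in kK), b, c] for the assumed model form.
    """
    rows = [features(*s[:3]) for s in samples]
    targets = [math.log(s[3]) for s in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, t_ad, o2, tau):
    """Map the linear prediction of ln(NOx) back to NOx."""
    return math.exp(sum(c * f for c, f in zip(coef, features(t_ad, o2, tau))))


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic, noise-free data from arbitrary illustrative coefficients
# (hypothetical values, not results from the thesis).
TRUE_COEF = [20.0, -35.0, 0.5, 1.0]
rng = random.Random(0)
samples = []
for _ in range(200):
    t_ad = rng.uniform(2000.0, 2600.0)
    o2 = rng.uniform(0.12, 0.21)
    tau = rng.uniform(1e-3, 5e-3)
    nox = math.exp(sum(c * f for c, f in zip(TRUE_COEF, features(t_ad, o2, tau))))
    samples.append((t_ad, o2, tau, nox))

coef = fit_log_linear(samples)
actual = [s[3] for s in samples]
predicted = [predict(coef, *s[:3]) for s in samples]
print("fitted coefficients:", coef)
print("RMSE:", rmse(actual, predicted), "MAE:", mae(actual, predicted))
```

Because the linearized features are ordinary columns, the same feature table could in principle be passed to other regressors, such as a random forest, which is the kind of comparison the abstract describes. Real OBD data would be noisy, so the fit would not recover the coefficients exactly as it does for this synthetic case.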