High-precision 3D Localization and Mapping on Mobile Devices

Guo, Chao. University of Minnesota Ph.D. dissertation, September 2018. Major: Computer Science. Advisor: Stergios Roumeliotis. 1 computer file (PDF); xi, 222 pages. https://hdl.handle.net/11299/201697

Most mobile devices (e.g., cell phones, tablets, and wearable computers) are equipped with GPS receivers, which, in conjunction with city- and planet-scale maps, provide unprecedented way-finding capabilities to both humans and robots. Such systems, however, cannot operate properly in GPS-denied areas, such as indoors, underwater, and underground. This thesis presents an alternative real-time 3D localization approach based on inertial and visual sensors, which are small, low-cost, and found on most modern mobile devices. In particular, it addresses four key challenges for enabling vision-aided inertial navigation systems (VINS) on mobile devices.

First, since VINS accuracy is often limited by the low quality of the visual-inertial measurements that mobile devices provide, additional sensors, such as RGBD cameras, wheel encoders, and LIDARs, have been employed to reduce the positioning error. Their measurements, however, are expressed in their own reference frames, and fusing them with VINS requires finding the 6-degree-of-freedom (dof) transformation between each of these sensors and the inertial measurement unit (IMU) or camera. The first main contribution of this thesis is a set of algorithms for precisely determining the extrinsic calibration parameters of camera-LIDAR, camera-wheel-encoder, and IMU-RGBD-camera pairs; an illustrative form of this transformation is sketched below. Furthermore, the observability properties of the proposed calibration systems are analyzed, so as to determine whether, and under what conditions (e.g., number of measurements, required sensor motion), the 6-dof extrinsic parameters are observable.

Second, most VINS assume that the camera and IMU measurements are recorded at the same time instants. On mobile devices, however, the visual and inertial sensors provide data with a time-varying delay, because of time synchronization (TS) (i.e., the time offset between the IMU and camera clocks) and rolling-shutter (RS) (i.e., image rows are exposed successively) effects. Since RS and TS typically introduce larger image distortions than image noise does, neglecting their impact significantly reduces the VINS's accuracy and can even cause it to diverge. To address this issue, this thesis introduces a linear-complexity extended Kalman filter (EKF)-based VINS that estimates and compensates for the RS/TS effects online; an illustrative timing model is given below. Moreover, it is shown that the RS/TS delays become observable given camera observations of at least two point features.

Third, in the absence of global positioning information, the VINS's estimation errors accumulate over time. One way to provide corrections is to construct a map comprising visual features, and to localize the device by matching the currently viewed features against the mapped ones. If the mobile device is to be used mostly within the same area, a desirable solution is to create a high-accuracy map offline and then use it for online localization. To do so, and since a single device may not have sufficient resources for recording data covering a large area, this thesis introduces a cooperative mapping (CM) algorithm for creating a map from visual-inertial datasets collected by multiple users at different times.
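As a minimal illustration of the extrinsic calibration problem above (the notation is ours, not the thesis's), the unknown 6-dof transformation between, e.g., the camera frame {C} and the IMU frame {I} comprises a rotation and a translation that map a point p between the two frames:

\[
{}^{I}\mathbf{p} \;=\; {}^{I}_{C}\mathbf{R}\,{}^{C}\mathbf{p} \;+\; {}^{I}\mathbf{p}_{C},
\]

where the rotation matrix ${}^{I}_{C}\mathbf{R} \in SO(3)$ (3 dof) and the translation ${}^{I}\mathbf{p}_{C} \in \mathbb{R}^{3}$ (3 dof) are the extrinsic calibration parameters to be estimated.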
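The RS/TS effects can likewise be made concrete with a commonly used timing model (our parameterization; the thesis's may differ). The true acquisition time of a feature observed on image row $m$ of an $M$-row image is

\[
t(m) \;=\; t_{\mathrm{img}} \;+\; t_d \;+\; \frac{m}{M}\,t_r,
\]

where $t_{\mathrm{img}}$ is the reported image timestamp, $t_d$ is the unknown IMU-camera time offset (TS), and $t_r$ is the rolling-shutter readout time of the full image (RS).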
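The cooperative mapping problem just described is naturally cast as a nonlinear least-squares problem over all users' poses and the commonly observed features. As a generic sketch (the thesis's exact cost terms are not reproduced here):

\[
\min_{\mathbf{x}} \;\sum_{i} \left\| \mathbf{z}_i - h_i(\mathbf{x}) \right\|^{2}_{\mathbf{R}_i^{-1}},
\]

where $\mathbf{x}$ stacks the IMU/camera states and feature positions, and $\mathbf{z}_i$ are the visual and inertial measurements with models $h_i$ and noise covariances $\mathbf{R}_i$.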
The proposed CM algorithm is proven to be batch-least-squares (BLS) optimal, while it is parallelizable and has lower processing and memory requirements than the standard BLS. Furthermore, it is resource-aware, as it allows trading mapping accuracy for processing speed.

Lastly, when the mobile device enters an unknown area for which no previously constructed feature map is available, a map has to be created online in order to correct the drift of the VINS estimates. Due to its nonlinearity, the BLS visual-inertial mapping problem is typically solved iteratively, where each iteration first builds and then solves a linearized system. This thesis presents two algorithms, each of which speeds up one of these two operations (illustrative sketches of both follow this abstract). First, an IMU preintegration algorithm is proposed that allows the majority of the IMU Jacobians to be computed only once across all iterations. This method is shown to dramatically reduce the time for building the linear system, while losing negligible estimation accuracy. Second, since the cost of solving the linearized mapping system grows with the number of the system's states, this work provides a variety of consistent approximations for reducing the dimensionality of the problem while preserving the sparsity of the resulting, reduced system. These approximate solutions offer different trade-offs between mapping accuracy and processing cost, and thus serve as a toolbox from which the most appropriate one can be selected according to the current conditions and requirements.

In summary, this thesis aims to provide accurate, robust, and efficient solutions for the key components of a high-precision IMU-camera-based localization and mapping system, ranging from sensor calibration, to real-time visual-inertial odometry, to offline and online mapping. Beyond way-finding, the research presented is also essential for a variety of other robotics applications, such as virtual or augmented reality, autonomous warehousing, and self-driving cars.
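To illustrate the preintegration idea, below is a minimal, hypothetical Python sketch in the style of standard IMU preintegration (e.g., Lupton and Sukkarieh); the function names and the simplified, noise- and bias-free kinematics are our assumptions, not the thesis's exact algorithm:

    import numpy as np

    def skew(w):
        """Skew-symmetric matrix such that skew(a) @ b == np.cross(a, b)."""
        return np.array([[0.0, -w[2], w[1]],
                         [w[2], 0.0, -w[0]],
                         [-w[1], w[0], 0.0]])

    def so3_exp(phi):
        """Rodrigues' formula: map a rotation vector to a rotation matrix."""
        theta = np.linalg.norm(phi)
        if theta < 1e-9:
            return np.eye(3) + skew(phi)  # first-order approximation
        axis = phi / theta
        K = skew(axis)
        return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

    def preintegrate(gyro, accel, dts):
        """Integrate gyro (rad/s) and accelerometer (m/s^2) samples taken
        between two keyframes, expressed in the first keyframe's IMU frame.
        The increments (dR, dv, dp) do not depend on the global pose or on
        gravity, so they, and most of the associated Jacobians, can be
        computed once and reused across all solver iterations."""
        dR = np.eye(3)        # relative rotation
        dv = np.zeros(3)      # velocity increment
        dp = np.zeros(3)      # position increment
        for w, a, dt in zip(gyro, accel, dts):
            dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
            dv = dv + (dR @ a) * dt
            dR = dR @ so3_exp(w * dt)
        return dR, dv, dp

In a batch solver, such increments replace the per-sample IMU propagation inside each cost term, so relinearizing a keyframe's pose does not require re-integrating the raw IMU samples.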
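For the dimensionality reduction, a standard point of reference (used here purely as an illustration; the thesis's consistent approximations are its own) is marginalizing a subset of states via the Schur complement. Given the linearized normal equations

\[
\begin{bmatrix} \mathbf{A}_{rr} & \mathbf{A}_{rm} \\ \mathbf{A}_{mr} & \mathbf{A}_{mm} \end{bmatrix}
\begin{bmatrix} \delta\mathbf{x}_{r} \\ \delta\mathbf{x}_{m} \end{bmatrix}
=
\begin{bmatrix} \mathbf{b}_{r} \\ \mathbf{b}_{m} \end{bmatrix},
\]

eliminating the states $\delta\mathbf{x}_{m}$ yields the reduced system

\[
\left(\mathbf{A}_{rr} - \mathbf{A}_{rm}\mathbf{A}_{mm}^{-1}\mathbf{A}_{mr}\right)\delta\mathbf{x}_{r}
\;=\; \mathbf{b}_{r} - \mathbf{A}_{rm}\mathbf{A}_{mm}^{-1}\mathbf{b}_{m}.
\]

The exact Schur complement, however, is generally dense, which is precisely why approximations that additionally preserve the sparsity of the reduced matrix are valuable.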