Browsing by Subject "computer vision"
Now showing 1 - 8 of 8
Item: Lectures on moving frames (2009-01-21)
Olver, Peter J.
This article surveys the equivariant method of moving frames, along with a variety of applications to geometry, differential equations, computer vision, numerical analysis, the calculus of variations, and invariant curve flows.

Item: Low-Signal Passive Non-Line-of-Sight Imaging (2023-12)
Hashemi, Connor
In recent years, significant progress has been made in passive non-line-of-sight imaging, which looks around corners at hidden objects using only light scattered off a rough surface. Since passive non-line-of-sight imaging can obtain information about the surrounding environment that was previously deemed irrecoverable, it has many far-reaching applications, such as improving autonomous vehicles, aiding search-and-rescue operations, and performing military surveillance. While the bulk of progress has focused on improving the resolution and capabilities of existing non-line-of-sight imaging algorithms, most methods assume a high signal-to-noise ratio and low amounts of background signal. These conditions are found only in highly controlled laboratories and do not correspond to real-world applications. This thesis considers "low-signal" passive non-line-of-sight imaging, where the desired imaging signal is swamped by either stochastic noise or structured background signals that interfere with conventional non-line-of-sight reconstructions. These scenarios mimic what is found in real-world environments. To image in these difficult low-signal scenarios, this thesis pursues two main lines of work: unmixing the spectral content of the scattered light and denoising thermal scattered imagery. As a result of these methods, non-line-of-sight imaging can be performed in many scenarios previously deemed too difficult, and future low-signal imaging can build on the principles laid out in this thesis.

Item: Metaverse in the Wild: Modeling, Adapting, and Rendering of 3D Human Avatars from a Single Camera (2022-06)
Yoon, Jae Shin
The metaverse is poised to enter our daily lives as a new social medium. One promising application is telepresence, which allows users to interact with others through photorealistic 3D avatars using AR/VR headsets. Such telepresence requires high-fidelity 3D avatars that depict fine-grained appearance, e.g., pores, hair, and facial wrinkles, from any viewpoint. Previous works have used multiview camera systems to generate 3D avatars, which enables measuring a subject's appearance and 3D geometry. Deploying such large camera systems in our daily environment, however, is often impractical because they require camera infrastructure with precisely controlled lighting. In this dissertation, I develop a computational model that, by learning from data, can reconstruct from a single camera a 3D human avatar whose quality is equivalent to that of a multi-camera system. The main challenge in learning to reconstruct a 3D avatar from a single camera is the lack of 3D ground truth data. The distribution of human geometry and appearance is extremely diverse, depending on parameters such as identity, shape (slim vs. fat), pose, apparel style, viewpoint, and illumination. While a data-driven model must learn from data that spans such diversity, no such dataset exists to date.
I address this challenge by developing a set of self-supervised algorithms that learn a generalizable visual representation of dynamic humans to reconstruct a 3D avatar from a single camera, to adapt the 3D avatar to unconstrained environments, and to render the fine-grained appearance of the 3D avatar.

[Learning to reconstruct a 3D avatar from a single-view image.] Large amounts of 3D ground truth data are required to learn a visual representation that describes the geometry and appearance of dynamic humans. I collect a large corpus of training data from many people using a multi-camera system, which allows measuring a human with minimal occlusion. 107 synchronized HD cameras capture 772 subjects across gender, ethnicity, age, and garment style in assorted body poses. From the multiview image streams, I reconstruct 3D mesh models that represent human geometry and appearance without missing parts. By learning from the images and reconstruction results, the model can generate a complete 3D avatar from a single-view image.

[Learning to adapt the learned 3D avatar to general unconstrained scenes.] The quality of the learned 3D avatar often degrades when the visual statistics of the testing data deviate significantly from those of the training data, e.g., the lighting in the controlled lab environment (training) is very different from an unconstrained outdoor environment (testing). To mitigate such domain mismatch, I introduce a new learning algorithm that adapts the learned 3D avatars to unconstrained scenes by enforcing spatial and temporal appearance consistency, i.e., the appearance of the generated 3D avatar should be consistent with the one observed in the image of the unconstrained scene and with the one generated at the previous time step. Applying these consistency constraints to a short sequence of testing images makes it possible to refine the visual representation without any 3D ground truth data, allowing high-fidelity 3D avatars to be generated anywhere.

[Learning to render the fine-grained appearance of 3D avatars for diverse people.] High-quality geometry is the main requirement for fine-grained appearance rendering of a 3D avatar. However, due to the lack of 3D ground truth data, the learned visual representation can reconstruct such geometry only for a limited number of people (e.g., a single subject), not for subjects outside the training data. I bypass this problem by introducing a pose transfer network that learns to render fine-grained appearance without high-quality geometry. Specifically, a pose encoder encodes pose information from a 3D body model that represents the coarse surface geometry of generic undressed humans, and an appearance decoder generates the fine-grained appearance (sharp 2D silhouette and detailed local texture) reflecting the encoded body pose for a specific subject seen in a single image. I further embed a 3D motion representation into the encoder in the form of temporal derivatives of the 3D body models observed in a video, which allows the decoder to increase physical plausibility by rendering motion-dependent texture, i.e., wrinkles and shading on clothing induced by human movement. Eliminating the requirement for high-quality geometry yields strong generalization of the rendering model to any person from a single image or video.
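To make the encoder/decoder pattern described above concrete, the sketch below shows the general shape of such a pose transfer network: a pose encoder consumes a body pose together with its temporal derivative, and an appearance decoder renders an image conditioned on the resulting code. This is a minimal illustration only; the layer sizes, the 72-dimensional axis-angle pose convention, and the 64x64 output resolution are assumptions made for the example, not the dissertation's actual architecture.

```python
# Minimal sketch of a pose-transfer network (illustrative, not the
# dissertation's architecture): a pose encoder maps a coarse body pose
# and its temporal derivative to a latent code; an appearance decoder
# renders a subject-specific image conditioned on that code.
import torch
import torch.nn as nn

class PoseEncoder(nn.Module):
    def __init__(self, pose_dim=72, motion_dim=72, latent_dim=256):
        super().__init__()
        # Pose and motion (temporal derivative) are concatenated so the
        # decoder can later render motion-dependent texture.
        self.net = nn.Sequential(
            nn.Linear(pose_dim + motion_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, pose, motion):
        return self.net(torch.cat([pose, motion], dim=-1))

class AppearanceDecoder(nn.Module):
    def __init__(self, latent_dim=256, out_channels=3):
        super().__init__()
        # Upsample the latent code to a small RGB image; a fuller model
        # would use separate heads for silhouette and texture.
        self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_channels, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 128, 8, 8)
        return self.deconv(x)  # (B, 3, 64, 64) image

# Usage: encode pose + motion, then decode an appearance image.
pose = torch.randn(1, 72)    # hypothetical axis-angle body pose
motion = torch.randn(1, 72)  # temporal derivative of the pose
z = PoseEncoder()(pose, motion)
img = AppearanceDecoder()(z)
print(img.shape)  # torch.Size([1, 3, 64, 64])
```

Conditioning on the temporal derivative is what lets the decoder produce motion-dependent texture; with pose alone, the same pose reached via different movements would render identically.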
In the experiments, I demonstrate that the reconstructed 3D avatar is accurate and temporally smooth, that the learned visual representation generalizes well to diverse scenes and people, and that the rendering results of the 3D avatars are photorealistic compared to previous 3D human modeling and rendering methods. Beyond social telepresence, the learned human visual representation enables various applications: bullet-time effects, image relighting, virtual navigation of a 3D scene with people, motion transfer, and video generation from a still image.

Item: Molecular Simulation and Design of High-χ Low-N Block Oligomers for Control of Self-Assembly (2022-02)
Shen, Zhengyuan
Multi-component oligomer systems are exciting candidates for nanostructured functional materials due to the wide variety of their self-assembled morphologies with extremely small feature sizes. However, experimentally screening the vast design space of molecular architectures can be extremely laborious, so guidance from predictive modeling is essential to reduce the synthetic effort. This dissertation discusses the predictive design of self-assembling block oligomer systems using molecular simulations, and the development of computer vision models for automated morphology detection in simulation trajectories. The work presented in this thesis creates a roadmap for efficient computational screening of shape-filling molecules, thus accelerating the design and discovery of nanostructured functional materials. First, with the aid of experimentally validated force fields, molecular dynamics simulations were used to design: 1) a series of symmetric triblock oligomers that self-assemble into ordered nanostructures with sub-1 nm domains and full domain pitches as small as 1.2 nm, and 2) blends of a lamellar-forming diblock oligomer and a cylinder-forming miktoarm star triblock oligomer that lead to stable gyroid networks over a large composition window. Similarities and distinctions between the self-assembly phase behavior of these block oligomers and block polymers are discussed. Second, existing simulation data were used to train deep learning models based on three-dimensional point clouds and voxel grids. The pretrained neural networks readily detect equilibrium morphologies and also give rich insights into patterns that emerge throughout new simulations with different system sizes and molecular dimensions.

Item: Oral History Interview with Allen R. Hanson (Charles Babbage Institute, 2022-02)
Hanson, Allen
This interview was conducted by CBI for CS&E in conjunction with the 50th anniversary of the University of Minnesota Computer Science Department (now Computer Science and Engineering, CS&E). Professor Hanson briefly discusses his early education and interests through his graduate education, culminating in his doctorate at Cornell (his dissertation was on games and prediction problems). Most of the interview focuses on his career as one of the early faculty members of the newly formed Computer Science Department at the University of Minnesota. He discusses the early department, its interactions, teaching, and research. His research focused heavily on vision and computing, pattern recognition, and AI. He partnered on early research with the University of Massachusetts Amherst's Ed Riseman and later left the University of Minnesota to join the CS faculty at UMass and lead the lab in this collaboration.
Among other topics, he outlines his evolving research and its applications in medicine, autonomous vehicles, and other areas, and reflects on a range of issues concerning research funding and computing and society. Finally, he briefly discusses Applied Imaging, Dataviews, and concurrent enterprises he led or helped to lead.

Item: Small UAV Position and Attitude, Raw Sensor, and Aerial Imagery Data Collected over Farm Field with Surveyed Markers (2015-02-25)
Mokhtarzadeh, Hamid; Colten, Todd
Imagery and sensor data from a commercial small uninhabited aerial vehicle flown over an agricultural field on the morning of October 22, 2014 have been logged and documented. The field includes 16 surveyed markers laid out in a 4x4 square, serving as known ground control points. This data set serves to study both the challenges and opportunities of UAV-based remote sensing for precision agriculture applications. The raw sensor data can be used for navigation system design and analysis, while the imagery and logged aircraft state can be used for image processing as well as remote sensing analysis. It is being shared to serve as a documented data set for testing new concepts and ideas.

Item: Towards Natural Underwater Human-Robot Interaction: Pointing Gesture Recognition for Autonomous Underwater Vehicles (2021-05)
Walker, Andrea Maree
Underwater robotics is a motivating field of research with a wide variety of both industrial and scientific applications. In particular, the development of autonomous underwater vehicles to assist divers in performing difficult, dangerous, or undesirable tasks has the potential to expand our abilities in the aquatic domain while reducing the risks presented to divers. For a diver and an autonomous underwater vehicle to work in collaboration, there must be an established interaction protocol; the study of such protocols is central to the field of human-robot interaction. In the underwater domain, the attenuation of both electromagnetic signals and sound limits traditional communication protocols, leaving machine vision as the primary perception methodology; gestures thus become a natural choice for diver-robot communication. Since pointing gestures are represented and recognized in cultures around the world, they serve as a foundational, natural gesture for divers in a demanding aquatic environment. In this work, we lay the groundwork for implementing a pointing gesture recognition algorithm for use onboard autonomous underwater vehicles. Specifically, we contribute a human study of individuals performing four classes of pointing gestures, three datasets developed to study pointing gestures, and an analysis of four state-of-the-art object detection frameworks for recognizing pointing gestures in the aquatic domain.

Item: Weighted Differential Invariant Signatures and Applications to Shape Recognition (2016-10)
Senou, Jessica
The weighted differential invariant signature is developed to deliver more geometric information than the ordinary signature by combining the signature manifold with invariant measurements that capture the size of local continuous and discrete symmetries. As a consequence, the weighted signature becomes an attractive tool for distinguishing between signature-congruent submanifolds, which have the property that they are globally inequivalent yet possess identical signature manifolds. Properties of and relationships between such submanifolds are discussed, along with how they affect the weighted signature.
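For context, the classical (unweighted) differential invariant signature that the weighted signature refines can be written down explicitly for a plane curve under the Euclidean group: it is the curve traced out by the curvature and its arc-length derivative. The formulas below state this standard construction. Curves related by a rigid motion share the same signature, but, as the abstract notes, identical signatures do not guarantee global equivalence, which is precisely the gap the weighted signature addresses.

```latex
% Euclidean signature of a plane curve C(t) = (x(t), y(t)):
% curvature \kappa and its derivative \kappa_s with respect to arc length s.
\kappa(t) = \frac{x'\,y'' - y'\,x''}{\left(x'^2 + y'^2\right)^{3/2}}, \qquad
\kappa_s(t) = \frac{d\kappa}{ds}
            = \frac{1}{\sqrt{x'^2 + y'^2}}\,\frac{d\kappa}{dt}, \qquad
\Sigma = \left\{ \bigl(\kappa(t),\, \kappa_s(t)\bigr) \right\}.
```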