Browsing by Subject "Neural network"
Now showing 1 - 2 of 2
Item Distributed Edge Computing Infrastructure with Low Hardware Cost, Performance Evaluation and Reliability (2018-08) Li, Bingzhe
Edge computing is a method of pushing applications, data, and computation away from a centralized system. Each edge computing component has its own computation capability and storage capacity, and can also communicate with the central server through the cloud and the Internet to send it pre-computed data. With growing interest in the Internet of Things (IoT), more and more edge computing nodes will be attached to the cloud, forming a distributed edge computing infrastructure. In this infrastructure, the central server manages the data and stores the captured data in its storage systems. Moreover, tens or hundreds of edge computing nodes are attached to the infrastructure cloud. Each edge computing node can pre-compute some of the data captured by its back-end sensors and sends the processed data to the central server via the infrastructure cloud. In this thesis, three aspects of the distributed edge computing infrastructure are investigated: low-hardware-cost design for the edge computing nodes, performance evaluation of the central server, and the reliability of the central server. First, for each edge computing component, hardware cost is an extremely important factor because of the limited power supply and computation capability. Stochastic computing is a promising technology for achieving low-power, low-area designs. Taking neural networks as the application at the edge computing nodes, we proposed different arithmetic operations in the stochastic domain for neural networks to achieve low-hardware-cost designs for those edge computing components. Second, as more and more edge computing nodes are added, network and storage traffic on the central server side will increase tremendously.
Therefore, it is important for system designers to know how much workload (that is, how many edge computing nodes) the central server can tolerate. In our work, we proposed replayer tools that replay traces against the target system in order to measure its performance. Doing so makes clear whether the system can tolerate the workload and guides the system designer in adding resources to the central server, such as more storage devices or higher-frequency CPUs. Finally, the central server in the infrastructure receives the data sent from each edge computing node and stores it in its storage system. Therefore, it is important both to protect the data and to achieve high performance. In this thesis, we proposed new RAID-6 codes to improve write and degraded-read performance while maintaining the reliability of the central server.
Item Low Power Approximate Hardware Design For Multimedia and Neural Network Applications (2020-06) Sharmin Snigdha, Farhana
In today's data- and computation-driven society, day-to-day life depends on devices such as smartphones, laptops, smart watches, and biosensors/image sensors connected to computational engines. The computationally intensive applications that run on these devices incur high levels of chip power dissipation, and must operate under stringent power constraints due to thermal or battery life limitations. On future hardware platforms, a large fraction of computation power will be spent on error-tolerant multimedia applications such as signal processing tasks (on audio, video, or images) and artificial intelligence (AI) algorithms for recognizing voice and image data. For such error-tolerant applications, approximate computation has emerged as a new paradigm that provides a pragmatic approach for trading off energy/power for computational accuracy.
A powerful method for implementing approximate computing is to perform logic-level or architecture-level hardware modifications. The effectiveness of an approximate system depends on identifying potential modes of approximation, accurate modeling of injected error as a function of the approximation, and optimization of the system to maximize energy savings for user-defined quality constraints. However, current approaches to approximate computation involve ad hoc trial-and-error based methods that do not consider the effect of approximations on system-level quality metrics. Additionally, prior methods for approximate computation have provided little or no scope for modulating the design based on user- and application-specific error budgets. This thesis proposes adaptive frameworks for energy-efficient approximate computing, leveraging the target application characteristics, system architecture, and input information to build fast, power-efficient approximate circuits under a user-defined error budget. The work is focused on two well-established, widely-used, and computationally intensive applications: multimedia and artificial intelligence. For multimedia systems, where minor errors in audio, image, and video are imperceptible to the human senses, approximate computations can be very effective in saving energy without significant loss in the quality of results. AI applications are also good candidates for approximation as they have inherent error-resilience feedback mechanisms embedded into their computations. This thesis demonstrates methodologies for approximate computing on representative platforms from the multimedia and AI domains, namely, the widely used JPEG architecture, and various architectures for deep learning. The first part of the thesis develops a methodology for designing approximate hardware for JPEG that is input-independent, i.e., it aims to meet the specified error budgets for any inputs.
The error sensitivities of various arithmetic units within the JPEG architecture with respect to the quality of the output image are first modeled, and a novel optimization problem is then formulated, using the error sensitivity model, to maximize power savings under an error budget. The optimized solution provides 1.5x-2.0x power savings over the original accurate design, with negligible image quality degradation. However, the degree of approximation in this approach must necessarily be chosen conservatively to stay within the error budget over all possible input images. The second part of the thesis designs an image-dependent approximate computation process that uses image-specific input statistics to dynamically increase the approximation level over the image-independent approach, thereby reducing its conservatism. This approach must overcome several challenges: circuitry for real-time extraction of input image statistics must be inexpensive in terms of both power and computation time, and schemes for translating abstracted image information into dynamically chosen approximation levels in hardware must be devised. The approach devises a simplified heuristic to estimate the input data distribution. Based on this distribution, a dynamic approximate architecture is developed, altering the approximation levels for input images in real-time. Over a set of benchmarks, the input-dependent approximation provides an average of 31% additional power improvement, as compared to the input-independent approximation process. The final part of the thesis addresses the use of approximate computing for convolutional neural networks (CNNs), which have achieved unprecedented accuracy on many modern AI applications. The inherent error-resilience and large computation requirements imply that CNN hardware implementations are excellent candidates for approximate computation. A systematic framework is developed to dynamically reduce the computation in the CNN based on its inputs. 
The approach is motivated by the observation that for a specific input class, during both the training and testing phases, some features tend to be activated together while others are unlikely to be activated. A dynamic selective feature activation framework, SeFAct, is proposed for energy-efficient CNN hardware accelerators; it predicts the input class early and then performs only the necessary computations. For various state-of-the-art neural networks, the results show that energy savings of 20%-25% are achievable, after accounting for all implementation overheads, with a small loss in accuracy. Moreover, the trade-off between accuracy and energy savings can be characterized using the proposed approach.
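The observation behind selective feature activation can be illustrated with a small software sketch. This is not the SeFAct framework itself, whose details are not given in the abstract. It is a minimal illustration under stated assumptions: the function names `build_class_masks` and `masked_layer`, the 0.5 activation-frequency threshold, and the fully connected ReLU layer are all hypothetical choices made for this example. The sketch records which features are frequently active for each class and then, given an early class prediction, computes only those features and counts the multiply-accumulate (MAC) operations performed.

```python
from typing import Dict, List, Set, Tuple


def build_class_masks(
    activations: List[List[float]],
    labels: List[int],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the thesis
) -> Dict[int, Set[int]]:
    """For each class, keep the feature indices that are active (> 0)
    in at least `threshold` of that class's samples."""
    counts: Dict[int, List[int]] = {}
    totals: Dict[int, int] = {}
    for feats, label in zip(activations, labels):
        totals[label] = totals.get(label, 0) + 1
        row = counts.setdefault(label, [0] * len(feats))
        for i, value in enumerate(feats):
            if value > 0:
                row[i] += 1
    return {
        label: {i for i, n in enumerate(row) if n / totals[label] >= threshold}
        for label, row in counts.items()
    }


def masked_layer(
    inputs: List[float],
    weights: List[List[float]],
    mask: Set[int],
) -> Tuple[List[float], int]:
    """Compute ReLU(w_j . x) only for the output features listed in `mask`.
    Skipped features are left at 0. Returns the outputs and the MAC count."""
    outputs = [0.0] * len(weights)
    macs = 0
    for j in mask:
        acc = sum(w * x for w, x in zip(weights[j], inputs))
        macs += len(inputs)
        outputs[j] = max(0.0, acc)
    return outputs, macs


# Recorded activations of three features for two samples of each class.
masks = build_class_masks(
    [[1, 0, 2], [3, 0, 0], [0, 4, 0], [0, 5, 1]],
    [0, 0, 1, 1],
)
# Assume an early predictor has guessed class 0 for this input.
outputs, macs = masked_layer(
    [1.0, 2.0],
    [[1.0, 1.0], [2.0, -1.0], [-1.0, -1.0]],
    masks[0],
)
print(masks, outputs, macs)
```

In this toy case the layer performs 4 MACs instead of the 6 a full evaluation would need. Raising the threshold skips more features at the risk of dropping ones that matter, which is one way to picture the accuracy versus energy trade-off the abstract mentions.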