رکورد قبلیرکورد بعدی

" Efficient Inference using Deep Convolutional Neural Networks on Resource-Constrained Platforms "


Document Type : Latin Dissertation
Language of Document : English
Record Number : 1104799
Doc. No : TLpq2269078980
Main Entry : Ghiasi, Soheil
: Motamedi, Mohammad
Title & Author : Efficient Inference using Deep Convolutional Neural Networks on Resource-Constrained Platforms\ Motamedi, MohammadGhiasi, Soheil
College : University of California, Davis
Date : 2019
student score : 2019
Degree : Ph.D.
Page No : 123
Abstract : Deep Convolutional Neural Networks (CNNs) exhibit remarkable performance in many pattern recognition, segmentation, classification, and comprehension tasks that were widely considered open problems for most of the computing history. For example, CNNs are shown to outperform humans in certain visual object recognition tasks. Given the significant potential of CNNs in advancing autonomy and intelligence in systems, the Internet of Things (IoT) research community has witnessed a surge in demand for CNN-enabled data processing, technically referred to as inference, for critical tasks, such as visual, voice and language comprehension. Inference using modern CNNs involves billions of operations on millions of parameters, and thus their deployment requires significant compute, storage, and energy resources. However, such resources are scarce in many resource-constrained IoT applications. Designing an efficient CNN architecture is the first step in alleviating this problem. Use of asymmetric kernels, breadth control techniques, and reduce-expand structures are among the most important approaches that can effectively decrease CNNs parameter budget and their computational intensity. The architectural efficiency can be further improved by eliminating ineffective neurons using pruning algorithms, and quantizing the parameters to decrease the model size. Hardware-driven optimization is the subsequent step in addressing the computational demands of deep neural networks. Mobile System on Chips (SoCs), which usually include a mobile GPU, a DSP, and a number of CPU cores, are great candidates for CNN inference on embedded platforms. Depending on the application, it is also possible to develop customized FPGA-based and ASIC-based accelerators. ASIC-based acceleration drastically outperforms other approaches in terms of both power consumption and execution time. However, using this approach is reasonable only if designing a new chip is economically justifiable for the target application. This dissertation aims to bridge the gap between computational demands of CNNs and computational capabilities of embedded platforms. We contend that one has to strike a judicious balance between functional requirements of a CNN, and its resource requirements, for an IoT application to be able to utilize the CNN. We investigate several concrete formulations of this broad concept, and propose effective approaches for addressing the identified challenges. First, we target platforms that are equipped with reconfigurable fabric, such as Field Programmable Gate Arrays (FPGA), and offer a framework for generation of optimized FPGA-based CNN accelerators. Our solution leverages an analytical approach to characterization and exploration of the accelerator design space through which, it synthesizes an efficient accelerator for a given CNN on a specific FPGA. Second, we investigate the problem of CNN inference on mobile SoCs, propose effective approaches for CNN parallelization targeting such platforms, and explore the underlying tradeoffs. Finally, in the last part of this dissertation, we investigate utilization of an existing optimized CNN model to automatically generate a competitive CNN for an IoT application whose objects of interest are a fraction of categories that the original CNN was designed to classify, such that the resource requirement of inference using the synthesized CNN is proportionally scaled down. We use the term resource scalability to refer to this concept and propose solutions for automated synthesis of context-aware, resource-scalable CNNs that meet the functional requirements of the target IoT application at fraction of the resource requirements of the original CNN.
Subject : Computer engineering
: Electrical engineering
کپی لینک

پیشنهاد خرید
پیوستها
عنوان :
نام فایل :
نوع عام محتوا :
نوع ماده :
فرمت :
سایز :
عرض :
طول :
2269078980_9847.pdf
2269078980.pdf
پایان نامه لاتین
متن
application/pdf
5.45 MB
85
85
نظرسنجی
نظرسنجی منابع دیجیتال

1 - آیا از کیفیت منابع دیجیتال راضی هستید؟