|
"Automatic Information Extraction from Camera-Trap Images Using Deep Learning"
Norouzzadeh, Mohammad Sadegh
Clune, Jeff
Document Type
|
:
|
Latin Dissertation
|
Language of Document
|
:
|
English
|
Record Number
|
:
|
1105512
|
Doc. No
|
:
|
TLpq2312284296
|
Main Entry
|
:
|
Clune, Jeff
|
|
:
|
Norouzzadeh, Mohammad Sadegh
|
Title & Author
|
:
|
Automatic Information Extraction from Camera-Trap Images Using Deep Learning / Norouzzadeh, Mohammad Sadegh; Clune, Jeff
|
College
|
:
|
University of Wyoming
|
Date
|
:
|
2019
|
student score
|
:
|
2019
|
Degree
|
:
|
Ph.D.
|
Page No
|
:
|
122
|
Abstract
|
:
|
Our ability to study and conserve ecosystems depends directly on how much information we have about them. Motion-activated cameras, also known as camera traps, are cheap and non-intrusive tools for gathering millions of images of wildlife. However, extracting useful information such as the species, count, and behavior of animals from the collected images is often done manually, and it is so slow and expensive that much invaluable information is never extracted and thus remains untapped. This manual labor is the main roadblock to the widespread use of camera-trap arrays. I devoted my Ph.D. dissertation to reducing the manual burden of information extraction from camera-trap images using advanced machine learning methods. As a first step, I demonstrated that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. I trained deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2-million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with over 94.9% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if my system classifies only images it is confident about, it can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers. This automation saves more than 8.4 years (i.e., over 17,000 hours at 40 hours per week) of human labeling effort on this 3.2-million-image dataset. Although I achieved outstanding results on the Snapshot Serengeti dataset, the accuracy of the results depends heavily on the amount, information-richness, quality, and diversity of the data available to train the models. Many camera-trap projects do not have a large, detailed set of labeled images available and hence cannot benefit from my suggested machine learning techniques.
In the second part of my dissertation, I combined the power of advanced machine learning algorithms and human intelligence to build a scalable, fast, and accurate active learning system that maximally reduces the manual work needed to identify and count animals in camera-trap images. I showed that my proposed procedure could achieve more than 90.9% accuracy on the Snapshot Serengeti (SS) dataset with as few as 14,000 labels, which matches state-of-the-art results while saving over 99.5% of the human labor for labeling. These efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, suggesting that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild.
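The abstract's idea of classifying only the images the system is confident about can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `selective_classify` and `coverage_and_accuracy`, the threshold value, and the toy probabilities are all assumptions made for illustration. The sketch keeps a prediction when the top class probability reaches a threshold, defers the rest to human labelers, and reports the share of images automated and the accuracy on that share.

```python
from typing import List, Sequence, Tuple


def selective_classify(
    probs: Sequence[Sequence[float]], threshold: float
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Split images into automated predictions and human-deferred indices.

    probs[i] holds the (hypothetical) per-class probabilities for image i.
    """
    automated: List[Tuple[int, int]] = []  # (image index, predicted class)
    deferred: List[int] = []               # image indices left for humans
    for i, p in enumerate(probs):
        best = max(range(len(p)), key=p.__getitem__)  # argmax class
        if p[best] >= threshold:
            automated.append((i, best))
        else:
            deferred.append(i)
    return automated, deferred


def coverage_and_accuracy(
    automated: Sequence[Tuple[int, int]], labels: Sequence[int], total: int
) -> Tuple[float, float]:
    """Return (fraction of images automated, accuracy on those images)."""
    coverage = len(automated) / total if total else 0.0
    if not automated:
        return coverage, 0.0
    correct = sum(1 for i, pred in automated if labels[i] == pred)
    return coverage, correct / len(automated)


# Toy data (assumed, not from the dissertation): 4 images, 2 classes.
toy_probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.5, 0.5]]
toy_labels = [0, 1, 1, 0]
auto, deferred = selective_classify(toy_probs, threshold=0.7)
cov, acc = coverage_and_accuracy(auto, toy_labels, total=len(toy_probs))
```

Raising the threshold generally trades coverage for accuracy. The abstract reports the operating point at which 99.3% of the data is automated at 96.6% accuracy.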
|
Subject
|
:
|
Animal sciences
|
|
:
|
Computer science
|
|
:
|
Deep learning
|
|
:
|
Ecology
|
| |