
"A Data Augmentation Approach to Short Text Classification"


Document Type : Latin Dissertation
Language of Document : English
Record Number : 897940
Doc. No : TL9cn7k2xq
Main Entry : Jung, Ha Yoon
Title & Author : A Data Augmentation Approach to Short Text Classification / ROSARIO, RYAN ROBERT; Wu, Yingnian
Date : 2017
student score : 2017
Abstract : Text classification typically performs best with large training sets, but short texts are very common on the World Wide Web. Can we use resampling and data augmentation to construct larger texts using similar terms? Several current methods exist for working with short text that rely on using external data and contexts, or workarounds. Our focus is to test a new preprocessing approach that uses resampling, inspired by the bootstrap, combined with data augmentation, by treating each short text as a population and sampling similar words from a semantic space to create a longer text. We use blog post titles collected from the Technorati blog aggregator as experimental data with each title appearing in one of ten categories. We first test how well the raw short texts are classified using a variant of SVM designed specifically for short texts as well as a supervised topic model and an SVM model that uses semantic vectors as features. We then build a semantic space and augment each short text with related terms under a variety of experimental conditions. We test the classifiers on the augmented data and compare performance to the aforementioned baselines. The classifier performance on augmented test sets outperformed the baseline classifiers in most cases.
Added Entry : ROSARIO, RYAN ROBERT
Added Entry : UCLA
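
The augmentation idea in the abstract (treat each short text as a population, resample from it, and add similar words drawn from a semantic space) can be sketched roughly as below. This is only an illustration under stated assumptions, not the dissertation's implementation: the toy vectors in `SPACE`, the helper names `neighbors` and `augment`, the use of cosine similarity, and the similarity-weighted sampling are all hypothetical choices made for the example.

```python
import math
import random

# Toy semantic space: hypothetical 3-d vectors, purely illustrative.
SPACE = {
    "python": (0.9, 0.1, 0.0),
    "code": (0.8, 0.2, 0.1),
    "software": (0.7, 0.3, 0.0),
    "recipe": (0.0, 0.9, 0.2),
    "cooking": (0.1, 0.8, 0.3),
    "dinner": (0.0, 0.7, 0.4),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def neighbors(word, k):
    """Return up to k (term, similarity) pairs closest to `word`, excluding itself."""
    if word not in SPACE:
        return []
    scored = [(other, cosine(SPACE[word], vec))
              for other, vec in SPACE.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def augment(tokens, n_draws, k=2, seed=0):
    """Assumed scheme: resample tokens with replacement (bootstrap-style) and,
    for each draw, append one neighbor sampled in proportion to similarity."""
    if not tokens:
        return []
    rng = random.Random(seed)
    out = list(tokens)
    for _ in range(n_draws):
        word = rng.choice(tokens)
        cands = neighbors(word, k)
        weights = [max(sim, 0.0) for _, sim in cands]
        if not cands or sum(weights) == 0:
            continue  # out-of-vocabulary word or no usable neighbor
        terms = [term for term, _ in cands]
        out.append(rng.choices(terms, weights=weights, k=1)[0])
    return out

if __name__ == "__main__":
    print(augment(["python", "recipe"], n_draws=4))
```

The augmented token list keeps the original words first and appends the sampled neighbors, so a downstream classifier sees a longer text built from related terms.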
Attachments
Title : 9cn7k2xq_459091.pdf
File Name : 9cn7k2xq.pdf
General Content Type : Latin Dissertation
Material Type : Text
Format : application/pdf
Size : 23.55 MB
Width : 85
Length : 85