خط مشی دسترسیدرباره ماپشتیبانی آنلاین
ثبت نامثبت نام
راهنماراهنما
فارسی
ورودورود
صفحه اصلیصفحه اصلی
جستجوی مدارک
تمام متن
منابع دیجیتالی
رکورد قبلیرکورد بعدی
Document Type:Latin Dissertation
Language of Document:English
Record Number:53193
Doc. No:TL23147
Call number:‭1469108‬
Main Entry:Omar Mukhtar
Title & Author:AlphaRank: A new smoothing algorithm based on combination of link analysis techniques and frequency based methodsOmar Mukhtar
College:State University of New York at Buffalo
Date:2009
Degree:M.S.
student score:2009
Page No:81
Abstract:Smoothing a probability distribution so that it generalizes well is a hard machine learning problem. It is particularly challenging when building a statistical language model with insufficient training data. We have developed a new smoothing algorithm (called AlphaRank) to overcome the data sparseness problem by viewing language as a large graph where each word is a vertex and the probability of using another word is determined by the edge weight connecting two words (vertices). Thus, instead of using frequency based rules as is done in prior work, we propose a graph based method to smooth a statistical language model. Our method combines features of context-dependent probability estimators such as n-grams and features from context-independent probability estimators such as the steady-state distribution of a discrete time-step Markov chain. We have tested on a large collection of Arabic newswire articles and compared with previous approaches using the perplexity measure and found our method to be superior.
Subject:Applied sciences; Language, literature and linguistics; Computational linguistics; Language modeling; Smoothing; Linguistics; Artificial intelligence; Computer science; 0984:Computer science; 0290:Linguistics; 0800:Artificial intelligence
Added Entry:V. Govindaraju
Added Entry:State University of New York at Buffalo