|
"An Automatic Similarity Detection Engine Between Sacred Texts Using Text Mining and Similarity Measures"
Salha Hassan Muhammed Qahl
Fokoue, Ernest
Document Type
|
:
|
Latin Dissertation
|
Language of Document
|
:
|
English
|
Record Number
|
:
|
803264
|
Doc. No
|
:
|
TL48047
|
Call number
|
:
|
1641125320; 1570850
|
Main Entry
|
:
|
Sookdial, Vijay T.
|
Title & Author
|
:
|
An Automatic Similarity Detection Engine Between Sacred Texts Using Text Mining and Similarity Measures / Salha Hassan Muhammed Qahl; Fokoue, Ernest
|
College
|
:
|
Rochester Institute of Technology
|
Date
|
:
|
2014
|
Degree
|
:
|
M.S.
|
field of study
|
:
|
Applied Statistics
|
student score
|
:
|
2014
|
Page No
|
:
|
104
|
Note
|
:
|
Committee members: Chen, Linlin; Parody, Robert
|
Note
|
:
|
Place of publication: United States, Ann Arbor; ISBN=978-1-321-40085-4
|
Abstract
|
:
|
Is there any similarity between the contexts of the Holy Bible and the Holy Quran, and can this be proven mathematically? Using the Bible and the Quran as our corpus, we explore the performance of various feature extraction and machine learning techniques. The unstructured nature of text data adds an extra layer of complexity to the feature extraction task, and the inherently sparse nature of the corresponding data matrices makes text mining a distinctly difficult task. Among other things, we assess the difference between domain-based syntactic feature extraction and domain-free feature extraction, and then use a variety of similarity measures, including Euclidean, Hellinger, Manhattan, cosine, Bhattacharyya, symmetric Kullback-Leibler, Jensen-Shannon, probabilistic chi-square, and Clark, to identify similarities and differences between sacred texts.
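As a rough illustration of how several of the measures named in the abstract behave, the sketch below compares two term-frequency distributions. It is not the thesis's implementation: the function names, the whitespace tokenizer, the add-one smoothing, and the toy strings are all assumptions made for this example, and the exact definitions used in the thesis may differ. Only a subset of the listed measures is shown.

```python
import math
from collections import Counter


def term_distribution(text, vocab):
    """Relative term frequencies over a fixed vocabulary.

    Add-one smoothing (an assumption of this sketch) keeps every
    probability positive so the log-based measures stay finite.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return [(counts[w] + 1) / total for w in vocab]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def cosine_similarity(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def hellinger(p, q):
    # Bounded in [0, 1] for probability vectors.
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2)


def bhattacharyya(p, q):
    # Negative log of the Bhattacharyya coefficient; needs some overlap.
    return -math.log(sum(math.sqrt(a * b) for a, b in zip(p, q)))


def kl(p, q):
    # Kullback-Leibler divergence; assumes q > 0 wherever p > 0.
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def symmetric_kl(p, q):
    return kl(p, q) + kl(q, p)


def jensen_shannon(p, q):
    # Divergence from each distribution to their midpoint; at most log(2).
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Toy strings standing in for two documents (not real corpus text).
    doc_a = "light mercy light path"
    doc_b = "mercy path path truth"
    vocab = sorted(set(doc_a.split()) | set(doc_b.split()))
    p = term_distribution(doc_a, vocab)
    q = term_distribution(doc_b, vocab)
    for name, fn in [
        ("euclidean", euclidean),
        ("manhattan", manhattan),
        ("cosine similarity", cosine_similarity),
        ("hellinger", hellinger),
        ("bhattacharyya", bhattacharyya),
        ("symmetric KL", symmetric_kl),
        ("jensen-shannon", jensen_shannon),
    ]:
        print(f"{name}: {fn(p, q):.4f}")
```

Note that cosine here is a similarity (larger means more alike), while the other functions are distances or divergences (smaller means more alike), so the two kinds of score have to be read in opposite directions when comparing texts.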
|
Subject
|
:
|
Mathematics; Statistics; Computer science
|
Descriptor
|
:
|
Pure sciences; Applied sciences; Data mining; Machine learning; Sacred texts; Similarity measures
|
Added Entry
|
:
|
Fokoue, Ernest
|
Added Entry
|
:
|
Applied Statistics, Rochester Institute of Technology
|
| |