TY - GEN
T1 - Real-time effective framework for unstructured data mining
AU - Lomotey, Richard K.
AU - Deters, Ralph
PY - 2013
Y1 - 2013
N2 - Today, the enterprise landscape faces a voluminous amount of data. The information gathered from these data sources is useful for improving product and service delivery. However, it is challenging to perform knowledge discovery in databases (KDD) activities on these data sources because of their unstructured nature. Previous studies have proposed the hierarchical clustering methodology since it enhances human readability and provides a clear dependency structure through topic, term, and document organization. However, the methodology can be resource-intensive and time-consuming. To improve the term extraction process, we propose a tool called RSenter that searches through interconnected hyperlinks and a NoSQL database (specifically, CouchDB). We evaluate the tool based on search algorithms such as parallelization, random walk (or linear search), pessimistic search, and optimistic search. The tool shows high accuracy and optimality with respect to search time.
AB - Today, the enterprise landscape faces a voluminous amount of data. The information gathered from these data sources is useful for improving product and service delivery. However, it is challenging to perform knowledge discovery in databases (KDD) activities on these data sources because of their unstructured nature. Previous studies have proposed the hierarchical clustering methodology since it enhances human readability and provides a clear dependency structure through topic, term, and document organization. However, the methodology can be resource-intensive and time-consuming. To improve the term extraction process, we propose a tool called RSenter that searches through interconnected hyperlinks and a NoSQL database (specifically, CouchDB). We evaluate the tool based on search algorithms such as parallelization, random walk (or linear search), pessimistic search, and optimistic search. The tool shows high accuracy and optimality with respect to search time.
UR - http://www.scopus.com/inward/record.url?scp=84893472210&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893472210&partnerID=8YFLogxK
U2 - 10.1109/TrustCom.2013.131
DO - 10.1109/TrustCom.2013.131
M3 - Conference contribution
AN - SCOPUS:84893472210
SN - 9780769550220
T3 - Proceedings - 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2013
SP - 1081
EP - 1088
BT - Proceedings - 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2013
T2 - 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2013
Y2 - 16 July 2013 through 18 July 2013
ER -