TY - GEN
T1 - Terms extraction from unstructured data silos
AU - Lomotey, Richard K.
AU - Deters, Ralph
PY - 2013
Y1 - 2013
N2 - The major challenge that the big data era brings to the services computing landscape is the debris of unstructured data. The high-dimensional data is in heterogeneous formats, schemaless, and requires multiple storage APIs in some cases. This situation has made it almost impractical to apply existing data mining techniques, which are designed for schema-based data sources, in a knowledge discovery in databases (KDD) process. In this paper, a tool called TouchR is proposed which algorithmically relies on the Hidden Markov Model (HMM) to extract terms from data silos; specifically, distributed NoSQL databases, which we model as a network graph. Our use case graph consists of storage nodes such as CouchDB, Neo4J, DynamoDB, etc. The evaluation of TouchR shows high accuracy for terms extraction and organization.
AB - The major challenge that the big data era brings to the services computing landscape is the debris of unstructured data. The high-dimensional data is in heterogeneous formats, schemaless, and requires multiple storage APIs in some cases. This situation has made it almost impractical to apply existing data mining techniques, which are designed for schema-based data sources, in a knowledge discovery in databases (KDD) process. In this paper, a tool called TouchR is proposed which algorithmically relies on the Hidden Markov Model (HMM) to extract terms from data silos; specifically, distributed NoSQL databases, which we model as a network graph. Our use case graph consists of storage nodes such as CouchDB, Neo4J, DynamoDB, etc. The evaluation of TouchR shows high accuracy for terms extraction and organization.
UR - http://www.scopus.com/inward/record.url?scp=84883174345&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883174345&partnerID=8YFLogxK
U2 - 10.1109/SYSoSE.2013.6575236
DO - 10.1109/SYSoSE.2013.6575236
M3 - Conference contribution
AN - SCOPUS:84883174345
SN - 9781467355971
T3 - Proceedings of 2013 8th International Conference on System of Systems Engineering: SoSE in Cloud Computing and Emerging Information Technology Applications, SoSE 2013
SP - 19
EP - 24
BT - Proceedings of 2013 8th International Conference on System of Systems Engineering
T2 - 2013 8th International Conference on System of Systems Engineering: SoSE in Cloud Computing and Emerging Information Technology Applications, SoSE 2013
Y2 - 2 June 2013 through 6 June 2013
ER -