TY - GEN
T1 - Towards knowledge discovery in big data
AU - Lomotey, Richard K.
AU - Deters, Ralph
PY - 2014
Y1 - 2014
N2 - Analytics-as-a-Service (AaaS) has become indispensable because it enables stakeholders to discover knowledge in Big Data. Previously, data stored in data warehouses followed a schema and standardization, which led to efficient data mining. However, the Big Data epoch has witnessed the rise of structured, semi-structured, and unstructured data, a trend that has motivated enterprises to employ NoSQL data stores to accommodate high-dimensional data. Unfortunately, existing data mining techniques, which are designed for schema-oriented storage, are not applicable to unstructured data. Thus, AaaS, though still in its infancy, is gaining widespread attention for its ability to provide novel ways and opportunities to mine heterogeneous data. In this paper, we discuss our AaaS tool, which performs term and topic extraction and organization from unstructured data sources such as NoSQL databases and textual content (e.g., websites), as well as from structured sources (e.g., SQL). The tool is built on methodologies such as tagging, filtering, association maps, and an adaptable dictionary. The evaluation of the tool shows high accuracy in the mining process.
AB - Analytics-as-a-Service (AaaS) has become indispensable because it enables stakeholders to discover knowledge in Big Data. Previously, data stored in data warehouses followed a schema and standardization, which led to efficient data mining. However, the Big Data epoch has witnessed the rise of structured, semi-structured, and unstructured data, a trend that has motivated enterprises to employ NoSQL data stores to accommodate high-dimensional data. Unfortunately, existing data mining techniques, which are designed for schema-oriented storage, are not applicable to unstructured data. Thus, AaaS, though still in its infancy, is gaining widespread attention for its ability to provide novel ways and opportunities to mine heterogeneous data. In this paper, we discuss our AaaS tool, which performs term and topic extraction and organization from unstructured data sources such as NoSQL databases and textual content (e.g., websites), as well as from structured sources (e.g., SQL). The tool is built on methodologies such as tagging, filtering, association maps, and an adaptable dictionary. The evaluation of the tool shows high accuracy in the mining process.
UR - http://www.scopus.com/inward/record.url?scp=84903607376&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903607376&partnerID=8YFLogxK
U2 - 10.1109/SOSE.2014.25
DO - 10.1109/SOSE.2014.25
M3 - Conference contribution
AN - SCOPUS:84903607376
SN - 9781479925049
T3 - Proceedings - IEEE 8th International Symposium on Service Oriented System Engineering, SOSE 2014
SP - 181
EP - 191
BT - Proceedings - IEEE 8th International Symposium on Service Oriented System Engineering, SOSE 2014
PB - IEEE Computer Society
T2 - 8th IEEE International Symposium on Service Oriented System Engineering, SOSE 2014
Y2 - 7 April 2014 through 11 April 2014
ER -