TY - GEN
T1 - Data mining from NoSQL document-append style storages
AU - Lomotey, Richard K.
AU - Deters, Ralph
PY - 2014
Y1 - 2014
N2 - The modern data economy, often described as "Big Data", has changed the status quo of digital content creation and storage. While data storage has followed the schema-dictated approach for decades, today's digital content, which is largely unstructured, creates the need to adopt different storage techniques. Thus, NoSQL database systems have been proposed to accommodate most of the content being generated today. One class of NoSQL database that has received significant enterprise adoption is document-append style storage. The emerging concern and challenge, however, is that research and tools that can aid data mining from such NoSQL databases are generally lacking. Even though document-append style storages make data accessible as Web services and over URL/URI, building a corresponding data mining tool deviates from the underlying techniques governing web crawlers. Also, existing data mining tools that have been designed for schema-based storages (e.g., RDBMS) are misfits. Hence, our goal in this work is to design a unique data analytics tool that enables knowledge discovery through information retrieval from document-append style storage. The tool is algorithmically built on the inference-based Apriori, which helps us optimize the search duration. Preliminary test results of the proposed tool also show high accuracy in comparison to previously proposed approaches.
AB - The modern data economy, often described as "Big Data", has changed the status quo of digital content creation and storage. While data storage has followed the schema-dictated approach for decades, today's digital content, which is largely unstructured, creates the need to adopt different storage techniques. Thus, NoSQL database systems have been proposed to accommodate most of the content being generated today. One class of NoSQL database that has received significant enterprise adoption is document-append style storage. The emerging concern and challenge, however, is that research and tools that can aid data mining from such NoSQL databases are generally lacking. Even though document-append style storages make data accessible as Web services and over URL/URI, building a corresponding data mining tool deviates from the underlying techniques governing web crawlers. Also, existing data mining tools that have been designed for schema-based storages (e.g., RDBMS) are misfits. Hence, our goal in this work is to design a unique data analytics tool that enables knowledge discovery through information retrieval from document-append style storage. The tool is algorithmically built on the inference-based Apriori, which helps us optimize the search duration. Preliminary test results of the proposed tool also show high accuracy in comparison to previously proposed approaches.
UR - http://www.scopus.com/inward/record.url?scp=84926225130&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84926225130&partnerID=8YFLogxK
U2 - 10.1109/ICWS.2014.62
DO - 10.1109/ICWS.2014.62
M3 - Conference contribution
AN - SCOPUS:84926225130
T3 - Proceedings - 2014 IEEE International Conference on Web Services, ICWS 2014
SP - 385
EP - 392
BT - Proceedings - 2014 IEEE International Conference on Web Services, ICWS 2014
A2 - De Roure, David
A2 - Thuraisingham, Bhavani
A2 - Zhang, Jia
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 21st IEEE International Conference on Web Services, ICWS 2014
Y2 - 27 June 2014 through 2 July 2014
ER -