TY - JOUR
T1 - Normalizing object-centric process logs by applying database principles
AU - Kumar, Akhil
AU - Soffer, Pnina
AU - Tsoury, Arava
N1 - Funding Information:
We thank Wil M. P. van der Aalst and Alessandro Berti for making their object-centric log on retail orders available for testing. We also thank the reviewers for their valuable input. This work was supported in part by a summer research support grant to the first author from the Smeal College of Business at Penn State University.
Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2023/5
Y1 - 2023/5
AB - Much work has been done in process mining over the last two decades, with most efforts focused on unearthing process models from log traces in which each trace relates to a unique case identifier that pertains to a single instance, such as an online customer order, a production order, or a patient visit. The case identifiers in these examples are the customer order number, production order number, and patient id, respectively, and there is a one-to-one relationship between the case identifier and the log data. In so-called object-centric (OC) logs, by contrast, multiple objects are associated in one log record, giving rise to many-to-many relationships among these objects and leading to ambiguities and redundancies in the log data. Hence, these logs become very difficult to analyze in their raw form as single linear files, and it is important to convert them into database models. In this paper, we show how OC logs can be structured into a STAR schema and a fully normalized database schema. The two schemas are compared, and the benefits of our approach for log processing and ensuring log integrity are discussed.
UR - http://www.scopus.com/inward/record.url?scp=85151353113&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85151353113&partnerID=8YFLogxK
U2 - 10.1016/j.is.2023.102196
DO - 10.1016/j.is.2023.102196
M3 - Article
AN - SCOPUS:85151353113
SN - 0306-4379
VL - 115
JO - Information Systems
JF - Information Systems
M1 - 102196
ER -