As Extensible Markup Language (XML) emerges as the data format of the Internet era, there is an increasing need to store and query XML data efficiently. One path to this goal is to transform XML data into relational format so that relational database technology can be used. Although several transformation algorithms exist, they are incomplete in the sense that they focus only on structural aspects and ignore semantic aspects. In this paper, we present the semantic knowledge that needs to be captured during transformation to ensure a correct relational schema. Further, we show an algorithm that can (1) derive such semantic knowledge from a given XML Document Type Definition (DTD) and (2) preserve the knowledge by representing it as semantic constraints in relational database terms. By combining existing transformation algorithms with our constraints-preserving algorithm, one can transform an XML DTD into a relational schema where correct semantics and behaviors are guaranteed by the preserved constraints. Experimental results are also presented.
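To make the idea of deriving semantic constraints from a DTD concrete, the following is a minimal sketch in Python. It is not the algorithm of the paper. It assumes a toy DTD, handles only flat content models, and uses hypothetical names (`derive_constraints`, the constraint labels) chosen for illustration. It shows how occurrence indicators and attribute declarations could be read as cardinality, nullability, and uniqueness constraints on a relational schema.

```python
import re

# Hypothetical toy DTD for illustration; it is not taken from the paper.
DTD = """
<!ELEMENT conf (title, date?, editor+, paper*)>
<!ATTLIST conf id ID #REQUIRED>
<!ATTLIST conf venue CDATA #IMPLIED>
"""

# Only flat content models such as (a, b?, c*) are handled; nested groups
# and choices (a | b) are out of scope for this sketch.
ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\w+)\s+\(([^()]*)\)\s*>")
ATTLIST_RE = re.compile(r"<!ATTLIST\s+(\w+)\s+(\w+)\s+(\w+)\s+(#\w+)\s*>")

# Occurrence indicator -> assumed label for the constraint it implies.
OCCURRENCE = {
    "": "NOT NULL",       # exactly one occurrence
    "?": "NULLABLE",      # zero or one occurrence
    "+": "AT LEAST ONE",  # one or more occurrences
    "*": "ANY NUMBER",    # zero or more occurrences
}


def derive_constraints(dtd):
    """Return (kind, element, child_or_attribute) tuples found in the DTD."""
    constraints = []
    for parent, model in ELEMENT_RE.findall(dtd):
        for item in model.split(","):
            item = item.strip()
            if not item or item == "#PCDATA":
                continue
            indicator = item[-1] if item[-1] in "?+*" else ""
            child = item[:-1] if indicator else item
            constraints.append((OCCURRENCE[indicator], parent, child))
    for element, attr, attr_type, default in ATTLIST_RE.findall(dtd):
        # Simplification: an XML ID is unique across the whole document,
        # which is approximated here as a per-table uniqueness constraint.
        if attr_type == "ID":
            constraints.append(("UNIQUE", element, attr))
        if default == "#REQUIRED":
            constraints.append(("NOT NULL", element, attr))
    return constraints


for kind, table, column in derive_constraints(DTD):
    print(f"{table}.{column}: {kind}")
```

In this sketch each tuple could then be attached to the table produced by a structural transformation, for example as a `NOT NULL` or `UNIQUE` column constraint. The "at least one" case has no direct column-level equivalent and would need a separate mechanism such as a check or trigger.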
All Science Journal Classification (ASJC) codes
- Information Systems and Management