xsdbXML goals and motivation
The xsdb framework provides a flexible and well-defined infrastructure for publishing, retrieving, and combining tabular data over the Internet.
Goals
The xsdb framework makes all of the following assertions true.
- Database queries over web-distributed data:
- Databases may be broken up into multiple files or servers on multiple machines and queried as a single resource.
- Simple Publication:
- Publishing a queryable collection of data (a context) can be as simple as placing an XML document on a web server.
- Sophisticated Publication:
- Large and complex databases may also be published using server software which provides indexing and other optimizations.
- Heterogeneity:
- Published data collections may be built using parts of remotely defined data collections.
- External Data:
- A data context may make reference to another arbitrary web object.
- Open formats and definitions:
- Databases may be constructed and queried using standard formats and standard web protocols using any programming language in any computational environment.
- Simple formats:
- The content of a database or query may be expressed in a manner which is easy to parse and interpret (both for human readers and for computer programs). Data, queries and query responses are represented using the same language of expressions.
- Clear definition:
- The meaning of database entries and queries is defined using simple mathematical definitions.
Motivation
The following scenarios provide some examples of how xsdb might be used.
- Simple Publication:
- A small business may publish its hours of operation, contact information, products and services offered, and other business data in a structured, machine-readable form by placing a single static xsdb document on a web server. The document may then be queried directly by xsdb interpreters or combined automatically with other xsdb databases.
- Sophisticated Publication:
- A large organization can provide its business information using a web server which dynamically generates information in xsdb format upon request. This server can be linked to an inventory database or other external sources to provide up-to-the-minute information.
- Heterogeneity:
- A web site or other data publisher might merge xsdb format data provided by large and small organizations into a single combined database of business information.
- External Data:
- Company logos represented as images, or marketing documents represented as PDF or Word files, may be embedded in xsdb documents as unformatted data or referenced as externally linked data.
Background and Comparisons
The xsdb framework is similar to a number of other models and languages. The following provides a brief discussion of similarities and differences.
- HTML:
- The xsdb language was designed to be similar to HTML in order to make it easier to understand by people already familiar with HTML. Beyond the surface similarity the goals of HTML and xsdb are very different: HTML is a markup language for representing textual documents; xsdb is a language and framework for declaring data, specifying queries and delivering responses to queries.
- Relational Algebra, Boolean Algebra, and SQL:
- Boolean algebra and relational technology are the direct ancestors of the xsdb framework. The xsdb query semantics may be characterized as a relational algebra expressed and generalized within a boolean algebra. The most immediate differences between conventional relational technology and the xsdb framework are:
- xsdb permits non-first-normal form values (tuples or groups of tuples as attribute values). This permits data to be organized in a natural manner within xsdb documents.
- xsdb treats data as logical expressions rather than as collections of tables in order to allow more flexible combination of data from heterogeneous data sources.
- xsdb expresses data, queries and query responses all within the same language. Among other advantages, this makes it possible to build a data context from queries into other data contexts.
Although the xsdb framework is more general than conventional relational technology, it may be useful to provide SQL or other more conventional front ends to xsdb back ends.
- RSS and RDF:
- These formats are designed to "integrate a variety of applications from library catalogs and world-wide directories to syndication and aggregation of news, software, and content to personal collections of music, photos, and events using XML as an interchange syntax." They do not appear to be appropriate for more conventional database archiving and retrieval.
- XQuery, XPath and XSLT:
- These languages express operations over general XML documents interpreted as trees. The xsdb framework extracts information by combining xsdb documents interpreted as logical assertions (sets, mappings, and sequences). Although similar operations can be expressed under both approaches, the two styles of data interpretation suit different purposes. For example, XSLT is better for translating a simple HTML document into a more complex document, whereas xsdb is better for deriving tabular information from other tabular information (for example, determining the sales per department from a sequence of invoice records).
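To make the invoice example concrete, the following is a minimal sketch in plain Python of the kind of tabular derivation described above. It does not use xsdb syntax or any xsdb API. The record layout, the field names (invoice, items, department, amount) and the function sales_per_department are all hypothetical and chosen only for illustration. Each invoice carries a nested group of line items, which also illustrates a non-first-normal form value.

```python
from collections import defaultdict

# Hypothetical invoice records; field names are illustrative assumptions,
# not part of any xsdb format. Each invoice holds a nested group of
# line items (a non-first-normal form value).
invoices = [
    {"invoice": 1, "items": [{"department": "toys", "amount": 20.0},
                             {"department": "books", "amount": 15.0}]},
    {"invoice": 2, "items": [{"department": "toys", "amount": 5.0}]},
]


def sales_per_department(records):
    """Derive a department -> total sales table from invoice records."""
    totals = defaultdict(float)
    for record in records:
        for item in record["items"]:
            totals[item["department"]] += item["amount"]
    return dict(totals)


print(sales_per_department(invoices))
# prints {'toys': 25.0, 'books': 15.0}
```

The input is a sequence of records and the output is again a small table, which is the "tabular information from other tabular information" pattern the comparison refers to.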