return to index
xsdb project page with download links
xFeedMe xsdb resources

Guide for using the XsdbXML framework.

Introduction

This document explains the installation and use of the XsdbXML framework from an operational perspective and explains the primary features of the system.

Requirements, Installation and Self Test

There are four completely separate implementations of the xsdbXML framework at the time of this writing: a python implementation, a javascript implementation, a java implementation, and a CSharp/.NET implementation. For information on installing, testing and using the Python implementation please see the Python implementation notes. For information on installing, testing and using the javascript implementation please see the javascript implementation notes. For information on installing, testing and using the CSharp/.NET implementation please see the CSharp/.NET implementation notes. For information on installing, testing and using the java implementation please see the java implementation notes.

Declaring information.

The most simple way to declare information is to create a simple xsdb format file. For example the following contents of file "accesses.xsdb" lists information about the number of times html pages have been viewed on a web server in a given month.
<context>
<title> page access counts by month </title>
<or>
<and> <s at="page">index.html</s> <i at="month">1</i> <i at="hits">2100</i> </and>
<and> <s at="page">index.html</s> <i at="month">2</i> <i at="hits">3300</i> </and>
<and> <s at="page">index.html</s> <i at="month">3</i> <i at="hits">1950</i> </and>
<and> <s at="page">products.html</s> <i at="month">1</i> <i at="hits">15</i> </and>
<and> <s at="page">products.html</s> <i at="month">2</i> <i at="hits">650</i> </and>
<and> <s at="page">products.html</s> <i at="month">3</i> <i at="hits">98</i> </and>
<and> <s at="page">people.html</s> <i at="month">1</i> <i at="hits">439</i> </and>
<and> <s at="page">people.html</s> <i at="month">2</i> <i at="hits">12</i> </and>
<and> <s at="page">people.html</s> <i at="month">3</i> <i at="hits">665</i> </and>
</or>
</context>
Here the first line asserts that index.html was viewed 2100 times during month 1 (January). The above format is visually similar to an html table. The "entries" in the table consist of "atomic attribute restrictions" which may be of the following forms
<s at="page">index.html</s> Attribute page has the string value index.html.
<id at="ssnum">111-22-3333</id> Attribute ssnum has the string value 111-22-3333 and it's an id.
<i at="month">11</i> Attribute month has the integer value 11.
<f at="width">0.334</f> Attribute width has the floating point value 0.33.
<a at="children"> expression </a> Attribute children has the complex value represented by expression (see below).

Data files such as "accesses.xsdb" may be kept on a local file system or they may also be placed on web servers to allow xsdbXML queries to query the data files using the HTTP protocol from anywhere on the Internet. (Advanced servers may also derive xsdb format information dynamically, but that is beyond the scope of the present discussion.)

Running queries

Before delving into the details of what queries mean let's discuss the various ways a query can be evaluated using the python implementation. The following simple query extracts the accesses listed above only for the page index.html.
  <query>
    <consult href="accesses.xsdb">
      <s at="page">index.html</s>
    </consult>
  </query>
Running queries from the operating system prompt: We may run this query from the operating system prompt using the Python query.py script. The query.py script permits the literal query in the command itself
  prompt> query.py --expr "<consult href='accesses.xsdb'><s at='page'>index.html</s></consult>"
(note that the query tag is omitted above). This invocation generates the output
  <or>
    <and>
       <i at="hits">3300</i>
       <i at="month">2</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="hits">1950</i>
       <i at="month">3</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="hits">2100</i>
       <i at="month">1</i>
       <s at="page">index.html</s>
    </and>
  </or>
The query.py script also accepts a query as the contents of a file. If the contents of index_query.xsdb contains the query expression given above we may invoke the query using the file as follows
  prompt> query.py index_query.xsdb
and the output will be similar.

The query.py program is only available when the python implementation is installed. See the Python implementation notes for details.

Graphical query interface: We may also execute queries using the graphical user interface xsdbQueryGUI.exe provided by the C-sharp/.NET implementation.

xsdbQueryGUI input screen.

xsdbQueryGUI result display.
For more information on this program see the CSharp/.NET implementation notes.

Example queries

The distribution includes a large number of sample queries within the xsdbXML/examples directory. For example look at the contents of the files
     xsdbXML/examples/yahoo_query.xsdb
     xsdbXML/examples/Accesses/totalhits_query.xsdb
     xsdbXML/examples/SupplierParts/supplier_query.xsdb
     xsdbXML/examples/BooksForSale/book_query.xsdb
     ...etcetera...
The discussion below explains the usage of many of these queries.

Deriving information from a single table.

We begin the detailed discussion of queries by describing ways to get simple information out of a single table. The most simple query extracts the entire content of the table with no restrictions or other specifications, like the query below:
  <query>
    <consult href="accesses.xsdb"/>
  </query>
From the user's perspective the "consult" tag is replaced by the contents of the file accesses.xsdb when the query is executed. The consultation could extract the contents of an xsdb document served by a web server by referring to a URL such as <consult href="http://my.server.edu/data/accesses.xsdb"/>

The response to the query lists the entire contents of the table with rows listed in an arbitrary order and columns listed alphabetically by name.

Sorting rows and ordering columns

A query may specify the order for the rows and columns of the result. For example the following query lists the accesses ordered by month and within month by decreasing hits. The column order is also specified.
  <query sort="month, hits desc" names="month hits page">
    <consult href="accesses.xsdb"/>
  </query>
Above "hits desc" specifies that the hits should be ordered in descending order (instead of the default ascending order). The evaluated query gives:
<or>
    <and>
       <i at="month">1</i>
       <i at="hits">2100</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="month">1</i>
       <i at="hits">439</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <i at="month">1</i>
       <i at="hits">15</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="month">2</i>
       <i at="hits">3300</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="month">2</i>
       <i at="hits">650</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="month">2</i>
       <i at="hits">12</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <i at="month">3</i>
       <i at="hits">1950</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="month">3</i>
       <i at="hits">665</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <i at="month">3</i>
       <i at="hits">98</i>
       <s at="page">products.html</s>
    </and>
</or>

Eliminating columns.

The select operator eliminates columns. For example the query
  <query>
    <select names="month page">
       <consult href="accesses.xsdb"/>
    </select>
  </query>
generates the result
  <or>
    <and>
       <i at="month">3</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="month">2</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="month">1</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="month">2</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <i at="month">1</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <i at="month">3</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <i at="month">1</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="month">3</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="month">2</i>
       <s at="page">index.html</s>
    </and>
  </or>
The select (and all other operations) will eliminate any duplicate rows automatically. For example the query
  <query>
    <select names="month">
       <consult href="accesses.xsdb"/>
    </select>
  </query>
Only lists each month once
  <or>
       <i at="month">1</i>
       <i at="month">2</i>
       <i at="month">3</i>
  </or>
Note that the select operation requires all selected columns to have values by default. For example
  <query>
     <select names="firstName lastName">
     <or>
         <and> <s at="firstName">Joe</s> <s at="lastName">Smith</s> </and>
         <and> <s at="firstName">Jane</s> </and>
         <and> <s at="lastName">Doe</s> </and>
     </or>
     </select>
  </query>
generates
    <and>
       <s at="firstName">Joe</s>
       <s at="lastName">Smith</s>
    </and>
where the rows with missing columns are eliminated. The undefined="true" attribute for the select tag overrides this behaviour. For example
  <query>
     <select names="firstName lastName" undefined="true">
     <or>
         <and> <s at="firstName">Joe</s> <s at="lastName">Smith</s> </and>
         <and> <s at="firstName">Jane</s> </and>
         <and> <s at="lastName">Doe</s> </and>
     </or>
     </select>
  </query>
generates all rows regardless of whether columns are missing
<or>
    <s at="lastName">Doe</s>
    <s at="firstName">Jane</s>
    <and>
       <s at="firstName">Joe</s>
       <s at="lastName">Smith</s>
    </and>
</or>

Renaming columns

The rename operation changes the name of a column. For example
  <query>
    <rename from="month page" to="MonthNumber WebPage">
       <consult href="accesses.xsdb"/>
    </rename>
  </query>
Changes the names "month" and "page" to "MonthNumber" and "WebPage" respectively, leaving all other names alone, producing
  <or>
    <and>
       <i at="MonthNumber">3</i>
       <s at="WebPage">people.html</s>
       <i at="hits">665</i>
    </and>
    <and>
       <i at="MonthNumber">2</i>
       <s at="WebPage">people.html</s>
       <i at="hits">12</i>
    </and>
    ... other entries omitted ...
  </or>

Choosing rows

A query can generate only the rows of interest by providing a restriction to match against the rows. For example
  <query>
     <and>
       <i at="month">2</i>
       <consult href="accesses.xsdb"/>
     </and>
  </query>
 
generates only the rows for month 2.
  <or>
    <and>
       <i at="hits">3300</i>
       <i at="month">2</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="hits">650</i>
       <i at="month">2</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="hits">12</i>
       <i at="month">2</i>
       <s at="page">people.html</s>
    </and>
  </or>
The query condition can be complex. For example
  <query>
     <and>
       <or>
          <i at="month">2</i>
          <i at="month">3</i>
       </or>
       <exclude>
          <s at="page">people.html</s>
       </exclude>
       <consult href="accesses.xsdb"/>
     </and>
  </query>
Generates rows for either months 2 or 3, except for rows for the people page.
  <or>
    <and>
       <i at="hits">98</i>
       <i at="month">3</i>
       <s at="page">products.html</s>
    </and>
    <and>
       <i at="hits">3300</i>
       <i at="month">2</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="hits">1950</i>
       <i at="month">3</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <i at="hits">650</i>
       <i at="month">2</i>
       <s at="page">products.html</s>
    </and>
  </or>
Note that positive conditions which cannot be decided are assumed to be true and appended to each row. For example
  <query>
     <and>
       <i at="month">2</i>
       <s at="author">jim@bingo.com</s>
       <consult href="accesses.xsdb"/>
     </and>
  </query>
Adds the assumption that the author was jim because there is no evidence to the contrary.
  <or>
    <and>
       <s at="author">jim@bingo.com</s>
       <i at="hits">12</i>
       <i at="month">2</i>
       <s at="page">people.html</s>
    </and>
    <and>
       <s at="author">jim@bingo.com</s>
       <i at="hits">3300</i>
       <i at="month">2</i>
       <s at="page">index.html</s>
    </and>
    <and>
       <s at="author">jim@bingo.com</s>
       <i at="hits">650</i>
       <i at="month">2</i>
       <s at="page">products.html</s>
    </and>
  </or>

Grouping rows

The group predicate collects rows that match on specific attributes. For example the following groups rows by month
  <query>
     <group at="rows" names="month">
        <consult href="accesses.xsdb"/>
     </group>
  </query>
producing
  <or>
    <and>
       <i at="month">1</i>
        <a at="rows">
         <or>
          <and>
           <i at="hits">439</i>
           <i at="month">1</i>
           <s at="page">people.html</s>
          </and>
          <and>
           <i at="hits">15</i>
           <i at="month">1</i>
           <s at="page">products.html</s>
          </and>
          <and>
           <i at="hits">2100</i>
           <i at="month">1</i>
           <s at="page">index.html</s>
          </and>
         </or>
        </a>
    </and>

    <and>
       <i at="month">2</i>
        <a at="rows">
         <or>
          <and>
           <i at="hits">3300</i>
           <i at="month">2</i>
           <s at="page">index.html</s>
          </and>
          <and>
           <i at="hits">650</i>
           <i at="month">2</i>
           <s at="page">products.html</s>
          </and>
          <and>
           <i at="hits">12</i>
           <i at="month">2</i>
           <s at="page">people.html</s>
          </and>
         </or>
        </a>
    </and>

    <and>
       <i at="month">3</i>
        <a at="rows">
         <or>
          <and>
           <i at="hits">98</i>
           <i at="month">3</i>
           <s at="page">products.html</s>
          </and>
          <and>
           <i at="hits">1950</i>
           <i at="month">3</i>
           <s at="page">index.html</s>
          </and>
          <and>
           <i at="hits">665</i>
           <i at="month">3</i>
           <s at="page">people.html</s>
          </and>
         </or>
        </a>
    </and>
  </or>
Above, each of the three top level rows contains two columns, the month number and a rows column (tagged "a" for "aggregate") which in turn contains rows for that month. Aggregate expressions such as this are most useful for computing aggregates as demonstrated below.

Computing aggregates

The calc function allows the computation of aggregates across grouped rows. For example the following query computes the sum of all hits for each month, assigned to the column "totalhits".
  <query names="month totalhits">
  <and>
     <group at="rows" names="month">
        <consult href="accesses.xsdb"/>
     </group>
     <calc at="totalhits" expr="sum(rows.hits)"/>
  </and>
  </query>
Generating the result
  <or>
    <and>
       <i at="month">3</i>
       <i at="totalhits">2713</i>
    </and>
    <and>
       <i at="month">1</i>
       <i at="totalhits">2554</i>
    </and>
    <and>
       <i at="month">2</i>
       <i at="totalhits">3962</i>
    </and>
  </or>
The following query illustrates the use of the sum, min (minimum), max (maximum), avg (average), med (median) and count aggregate operations.
  <query names="rows totalhits minhits maxhits avghits medhits number">
  <and>
     <group at="rows" names="month">
        <consult href="accesses.xsdb"/>
     </group>
     <calc at="totalhits" expr="sum(rows.hits)"/>
     <calc at="minhits" expr="min(rows.hits)"/>
     <calc at="maxhits" expr="max(rows.hits)"/>
     <calc at="avghits" expr="avg(rows.hits)"/>
     <calc at="medhits" expr="med(rows.hits)"/>
     <calc at="number" expr="count(rows.hits)"/>
   </and>
  </query>
Generating the result
  <or>
    <and>
       <i at="month">3</i>
       <i at="totalhits">2713</i>
       <i at="minhits">98</i>
       <i at="maxhits">1950</i>
       <f at="avghits">904.33333333333337</f>
       <i at="medhits">665</i>
       <i at="number">3</i>
    </and>
    <and>
       <i at="month">1</i>
       <i at="totalhits">2554</i>
       <i at="minhits">15</i>
       <i at="maxhits">2100</i>
       <f at="avghits">851.33333333333337</f>
       <i at="medhits">439</i>
       <i at="number">3</i>
    </and>
    <and>
       <i at="month">2</i>
       <i at="totalhits">3962</i>
       <i at="minhits">12</i>
       <i at="maxhits">3300</i>
       <f at="avghits">1320.6666666666667</f>
       <i at="medhits">650</i>
       <i at="number">3</i>
    </and>
  </or>

Computing other derived values

In addition to computing aggregates the calc function can compute arbitrary arithmetic values. For example
  <query>
  <and>
    <f at="x">7.0</f>
    <f at="y">-8.0</f>
    <calc at="DistanceToOrigin" expr="sqrt(x*x+y*y)"/>
    <calc at="ThetaRadians" expr="atan(y/x)"/>
    <calc at="ThetaDegrees" expr="ThetaRadians * 180/pi()"/>
  </and>
  </query>
Which produces the result
    <and>
       <f at="DistanceToOrigin">10.63014581273465</f>
       <f at="ThetaDegrees">-48.814074834290352</f>
       <f at="ThetaRadians">-0.85196632717327203</f>
       <f at="x">7.0</f>
       <f at="y">-8.0</f>
    </and>
Note that any calc tag should always follow declarations that establish bindings for any quantities required. For example the following does not succeed:
  <query>
  <and>
    <calc at="ThetaDegrees" expr="ThetaRadians * 180/pi()"/>
    <f at="x">7.0</f>
    <f at="y">-8.0</f>
    <calc at="DistanceToOrigin" expr="sqrt(x*x+y*y)"/>
    <calc at="ThetaRadians" expr="atan(y/x)"/>
  </and>
  </query>
The value for ThetaDegrees cannot be computed because the declaration for ThetaRadians does not preceed the ThetaDegrees computation. The query evaluates to the following partial result
    <and weak="no exact value for attribute 'ThetaRadians'">
       <f at="DistanceToOrigin">10.63014581273465</f>
       <f at="ThetaRadians">-0.85196632717327203</f>
       <f at="x">7.0</f>
       <f at="y">-8.0</f>
    </and>
The "weak" attribute indicates that a computation in the derivation did not complete.

A sequence of calculations can be applied to a sequence of rows, of course, as illustrated by the following

  <query>
  <and>
    <or>
      <and> <f at="x">7.0</f> <f at="y">-8.0</f> </and> 
      <and> <f at="x">1.0</f> <f at="y">1.0</f> </and> 
      <and> <f at="x">1.0</f> <f at="y">-1.0</f> </and> 
      <and> <f at="x">-19.0</f> <f at="y">111.0</f> </and> 
      <and> <f at="x">1.0</f> <f at="y">0</f> </and> 
    </or>
    <calc at="DistanceToOrigin" expr="sqrt(x*x+y*y)"/>
    <calc at="ThetaRadians" expr="atan(y/x)"/>
    <calc at="ThetaDegrees" expr="ThetaRadians * 180/pi()"/>
  </and>
  </query>
Which generates
  <or>
    <and>
       <f at="DistanceToOrigin">10.63014581273465</f>
       <f at="ThetaDegrees">-48.814074834290352</f>
       <f at="ThetaRadians">-0.85196632717327203</f>
       <f at="x">7.0</f>
       <f at="y">-8.0</f>
    </and>
    <and>
       <f at="DistanceToOrigin">1.4142135623730951</f>
       <f at="ThetaDegrees">45.0</f>
       <f at="ThetaRadians">0.78539816339744828</f>
       <f at="x">1.0</f>
       <f at="y">1.0</f>
    </and>
    <and>
       <f at="DistanceToOrigin">1.4142135623730951</f>
       <f at="ThetaDegrees">-45.0</f>
       <f at="ThetaRadians">-0.78539816339744828</f>
       <f at="x">1.0</f>
       <f at="y">-1.0</f>
    </and>
    <and>
       <f at="DistanceToOrigin">112.61438629233834</f>
       <f at="ThetaDegrees">-80.286748975059723</f>
       <f at="ThetaRadians">-1.4012681153369748</f>
       <f at="x">-19.0</f>
       <f at="y">111.0</f>
    </and>
    <and>
       <f at="DistanceToOrigin">1.0</f>
       <f at="ThetaDegrees">0.0</f>
       <f at="ThetaRadians">0.0</f>
       <f at="x">1.0</f>
       <f at="y">0.0</f>
    </and>
  </or>

General conditions

The general conditional test is provided by the cond predicate. For example the query
   <query>
       <and>
       <or>
           <i at="N">10</i>
           <i at="N">13</i>
           <i at="N">14</i>
           <i at="N">17</i>
           <i at="N">19</i>
           <i at="N">20</i>
       </or>
       <cond expr="N%2==0"/>
       </and>
   </query>
chooses the values for N that are divisible by 2:
<or>
       <i at="N">10</i>
       <i at="N">20</i>
       <i at="N">14</i>
</or>
Like the calc function the cond predicate must follow bindings for any values required by the condition.

Expressions for cond and calc

The python implementation allows the cond predicate and the calc function to use the normal python infix operators
   + - * / & % == < > <= >=
as well as the aggregate functions mentioned above, functions from the python math and string modules such as
   sin(x) log(x) find(string, substring)
where constants are written as functions with no arguments, such as
   e() pi() letters()
Also provided are the builtin functions
   int(x) float(x) str(x) round(x) abs(x)

Special conditions

There are a number of special conditional tests that are treated by the inference engine as partial values. These tests have their own syntax and do not need to follow the bindings they require. They are also used in certain optimizations within the inference engine. For example the following selects the values for N that are less than 15
   <query>
     <and>
       <i at="N"><lt>15</lt></i>
       <or>
           <i at="N">10</i>
           <i at="N">13</i>
           <i at="N">14</i>
           <i at="N">17</i>
           <i at="N">19</i>
           <i at="N">20</i>
       </or>
     </and>
   </query>
Generating the result
  <or>
       <i at="N">10</i>
       <i at="N">13</i>
       <i at="N">14</i>
  </or>
The following special conditions may be used with strings or numbers
   <i at="N"><lt>15</lt></i>  -- N is less than 15
   <i at="N"><gt>15</gt></i>  -- N is greater than 15
   <i at="N"><le>15</le></i>  -- N is less than or equal to 15
   <i at="N"><ge>15</ge></i>  -- N is greater than or equal to 15
Strings also may be used with the prefix special condition
   <s at="City">Chi</s> -- City has "chi" as a prefix

Ungrouping

The "inverse" of the group operation is the ungroup operation which splits a row containing a grouping into several rows, one for each element of the group. This operation is most useful when a table is provided in grouped form.
  <query>
    <ungroup at="students">
      <or>
          <and>
              <s at="instructor">Fischman</s>
              <a at="students">
                 <or>
                    <s at="student">Jim</s>
                    <s at="student">Sally</s>
                    <s at="student">Bob</s>
                 </or>
              </a>
          </and>
          <and>
              <s at="instructor">Bader</s>
              <a at="students">
                 <or>
                    <s at="student">John</s>
                    <s at="student">Lisa</s>
                 </or>
              </a>
          </and>
          <and>
              <s at="instructor">Wilk</s>
              <a at="students">
                 <nothing/>
              </a>
          </and>
      </or>
    </ungroup>
  </query>
The above query ungroups the students to associate them with the appropriate instructor. The result is
<or>
    <and>
       <s at="instructor">Fischman</s>
       <s at="student">Sally</s>
    </and>
    <and>
       <s at="instructor">Bader</s>
       <s at="student">Lisa</s>
    </and>
    <and>
       <s at="instructor">Bader</s>
       <s at="student">John</s>
    </and>
    <and>
       <s at="instructor">Fischman</s>
       <s at="student">Jim</s>
    </and>
    <and>
       <s at="instructor">Fischman</s>
       <s at="student">Bob</s>
    </and>
</or>
Note that the instructor Wilk vanishes because there is no associated student.

Deriving information from multiple tables

Very many xsdb queries will combine more than one table. For the purposes of the following discussion consider the following tables. Suppose women.xsdb contains
  <context>
  <title> some women </title>
  <or>
    <s at="name">Jane</s>
    <s at="name">Wendy</s>
    <s at="name">Sandra</s>
    <s at="name">Tess</s>
    <s at="name">Lisa</s>
  </or>
  </context>
and kids.xsdb contains
  <context>
  <title> Moms and children </title>
  <or>
    <and> <s at="mom">Wendy</s> <s at="kid">Jane</s> </and>
    <and> <s at="mom">Wendy</s> <s at="kid">Bob</s> </and>
    <and> <s at="mom">Wendy</s> <s at="kid">Sandra</s> </and>
    <and> <s at="mom">Sandra</s> <s at="kid">Lisa</s> </and>
    <and> <s at="mom">Sandra</s> <s at="kid">Bill</s> </and>
    <and> <s at="mom">Tess</s> <s at="kid">Peter</s> </and>
    <and> <s at="mom">Lisa</s> <s at="kid">Waldo</s> </and>
  </or>
  </context>
Below we discuss a number of queries combining these two tables.

Subqueries

The subquery predicate derives a grouping using a query within a query. For example the follow query uses a subquery to group women with any children they have.
  <query names="name kids">
  <title>women with their kids (even if there aren't any)</title>
  <and>
    <consult href="women.xsdb"/>
    <subquery at="kids" names="name">
        <rename to="name" from="mom">
            <consult href="kids.xsdb"/>
        </rename>
    </subquery>
  </and>
  </query>
Note that the rename operation makes the name column in women.xsdb match the mom column in kids.xsdb. Furthermore the names="name" clause specifies that the "name" column from the outer query is needed for the inner query. The result of this query gives
<or>
    <and>
       <s at="name">Jane</s>
       <a at="kids">
         <nothing/>
       </a>
    </and>

    <and>
       <s at="name">Lisa</s>
       <a at="kids">
          <and>
           <s at="kid">Waldo</s>
           <s at="name">Lisa</s>
          </and>
       </a>
    </and>

    <and>
       <s at="name">Tess</s>
       <a at="kids">
          <and>
           <s at="kid">Peter</s>
           <s at="name">Tess</s>
          </and>
        </a>
    </and>

    <and>
       <s at="name">Wendy</s>
       <a at="kids">
         <or>
          <and>
           <s at="kid">Sandra</s>
           <s at="name">Wendy</s>
          </and>
          <and>
           <s at="kid">Bob</s>
           <s at="name">Wendy</s>
          </and>
          <and>
           <s at="kid">Jane</s>
           <s at="name">Wendy</s>
          </and>
         </or>
       </a>
    </and>

    <and>
       <s at="name">Sandra</s>
       <a at="kids">
         <or>
          <and>
           <s at="kid">Bill</s>
           <s at="name">Sandra</s>
          </and>
          <and>
           <s at="kid">Lisa</s>
           <s at="name">Sandra</s>
          </and>
         </or>
       </a>
    </and>
</or>
Here each woman is group with her children, including Jane who is grouped with no children. We can derive from this query aggregate reports such as women and the number of children they have
  <query names="name nkids">
  <title>women with number of children</title>
  <and>
    <consult href="women.xsdb"/>
    <subquery at="kids" names="name">
        <rename to="name" from="mom">
            <consult href="kids.xsdb"/>
        </rename>
    </subquery>
    <calc at="nkids" expr="count(kids.kid)"/>
  </and>
  </query>
With the result
  <or>
    <and>
       <s at="name">Lisa</s>
       <i at="nkids">1</i>
    </and>
    <and>
       <s at="name">Jane</s>
       <i at="nkids">0</i>
    </and>
    <and>
       <s at="name">Wendy</s>
       <i at="nkids">3</i>
    </and>
    <and>
       <s at="name">Sandra</s>
       <i at="nkids">2</i>
    </and>
    <and>
       <s at="name">Tess</s>
       <i at="nkids">1</i>
    </and>
  </or>
Notice that Jane appears in this query result with a count of zero children.

Unions

We may use the OR combinator to form the collection of names of all people listed in either table
  <query>
  <or>
    <consult href="women.xsdb"/>

    <select names="name">
      <rename from="mom" to="name">
        <consult href="kids.xsdb"/>
      </rename>
    </select>

    <select names="name">
      <rename from="kid" to="name">
        <consult href="kids.xsdb"/>
      </rename>
    </select>
  </or>
  </query>
which produces the result
  <or>
       <s at="name">Waldo</s>
       <s at="name">Bill</s>
       <s at="name">Wendy</s>
       <s at="name">Sandra</s>
       <s at="name">Jane</s>
       <s at="name">Lisa</s>
       <s at="name">Bob</s>
       <s at="name">Peter</s>
       <s at="name">Tess</s>
  </or>

Intersections

We may use the AND combinator to form the names of kids who are also moms
  <query>
  <and>
    <select names="name">
      <rename from="mom" to="name">
        <consult href="kids.xsdb"/>
      </rename>
    </select>

    <select names="name">
      <rename from="kid" to="name">
        <consult href="kids.xsdb"/>
      </rename>
    </select>
  </and>
  </query>
which produces the result
  <or>
       <s at="name">Lisa</s>
       <s at="name">Sandra</s>
  </or>

Joins

The AND combinator also can be used to join two tables, but the columns must be renamed appropriately so that the names match where they make sense. For example the following produces a list of grandmothers, mothers and kids.
   <query names="grandmother mom kid">
   <and>
      <rename from="mom kid" to="grandmother mom">
         <consult href="kids.xsdb"/>
      </rename>
      <consult href="kids.xsdb"/>
   </and>
   </query>
Here the conjunctions will match only where a kid is also a mom. This produces the result
<or>
    <and>
       <s at="grandmother">Sandra</s>
       <s at="mom">Lisa</s>
       <s at="kid">Waldo</s>
    </and>
    <and>
       <s at="grandmother">Wendy</s>
       <s at="mom">Sandra</s>
       <s at="kid">Lisa</s>
    </and>
    <and>
       <s at="grandmother">Wendy</s>
       <s at="mom">Sandra</s>
       <s at="kid">Bill</s>
    </and>
</or>

Exclusions

Exclusions in conjunctions remove sets of rows that are not wanted. The exclude operator is somewhat similar to relational algebra subtraction or "not exists" and "not in" in SQL. For example the following lists the entries from the kids table where the kid is not a woman.
   <query>
      <consult href="kids.xsdb"/>
      <exclude>
         <rename from="name" to="kid">
             <consult href="women.xsdb"/>
         </rename>
      </exclude>
   </query>
This query produces the result
  <or>
    <and>
       <s at="kid">Waldo</s>
       <s at="mom">Lisa</s>
    </and>
    <and>
       <s at="kid">Bob</s>
       <s at="mom">Wendy</s>
    </and>
    <and>
       <s at="kid">Peter</s>
       <s at="mom">Tess</s>
    </and>
    <and>
       <s at="kid">Bill</s>
       <s at="mom">Sandra</s>
    </and>
  </or>
It is important that exclusions only include relevant information, since a row is excluded only if it matches every column in an exclusion. For example the following query WILL NOT determine the women who are not moms.
   <query>
      <and>
           <consult href="women.xsdb"/>
           <exclude>
              <rename from="mom" to="name">
                  <consult href="kids.xsdb"/>
              </rename>
           </exclude>
      </and>
   </query>
This produces the result
   <or weak="unresolved exclusions omitted from query result">
       <s at="name">Jane</s>
       <s at="name">Lisa</s>
       <s at="name">Sandra</s>
       <s at="name">Wendy</s>
       <s at="name">Tess</s>
   </or>
where the "weak" mark indicates that some negative information was omitted from the query result but not logically eliminated. To fix this query we must only include the mom column from the kids.xsdb table within the exclusion as follows
   <query>
      <and>
           <consult href="women.xsdb"/>
           <exclude>
              <rename from="mom" to="name">
                  <select names="mom">
                     <consult href="kids.xsdb"/>
                  </select>
              </rename>
           </exclude>
      </and>
   </query>
This query produces the result
       <s at="name">Jane</s>

Bookmarks

Common subexpressions may be given bookmarked names either in queries or in other xsdb documents. For example the "grandmother query" given above could be rewritten
   <query names="grandmother mom kid">

   <name bookmark="k"> <consult href="kids.xsdb"/> </name>

   <and>
      <rename from="mom kid" to="grandmother mom">
         <reference bookmark="k"/>
      </rename>
      <reference bookmark="k"/>
   </and>
   </query>
Here the name tag defines the bookmark "k" to refer to the contents of kids.xsdb and the two reference tags make use of the bookmark. The bookmark feature is both a convenience to save typing and an optimization, since in many cases the content of a bookmark will only be evaluated once even if it is used many times.

Tag quick reference

The following table summarizes the tags of the xsdbXML framework. Tags such as cond do not require a body -- in which case the missing body is interpreted to be <anything/>.
<nothing/> Empty result.
<anything/> Unrestricted result.
<s at="page">index.html</s> Attribute page has the string value index.html.
<id at="ssnum">111-22-3333</id> Attribute ssnum has the string value 111-22-3333 and it's an id.
<i at="month">11</i> Attribute month has the integer value 11.
<f at="width">0.334</f> Attribute width has the floating point value 0.33.
<and> E1 E2 E3 </and> Conjunction of assertions E1 E2 E3
<or> E1 E2 E3 </or> Disjunction of assertions E1 E2 E3
<exclude> E </exclude> Exclude patterns matching E.
<rename from="A B C" to="X Y Z"> expression </rename> rename attributes A B C to X Y Z in expression.
<select names="A B C" [undefined="true"]> expression </select> Keep only names "A B C" [allow nulls] from expression
<object id="123"> TEXT </id> Associate the identity "123" with the TEXT.
<object id="123" href="URL"/> Associate the identity "123" with the contents of the URL.
<name bookmark="X"> expression </name> Associate the bookmark X with the expression.
<reference bookmark="X"/> Use the expression associated with bookmark X here.
<fn name="f" at="x" args="abc"/> Assign to attribute x the value of evaluating function f with arguments abc in the current context.
<pred name="p" args="abc"> expression </pred> Evaluate predicate p with arguments abc over the expression within the current context.
<consult href="URL"> expression </consult>
<consult href="URL"/>
Substitute the contents of URL here [conjoined with expression].
<group at="X" names="A B C"> expression </group> Group by attributes A B C and assign groupings to attribute X.
<ungroup at="X"> expression </ungroup> Ungroup attribute X.
<cond expr="boolean expression"> expression </cond> Evaluate the boolean expression for the result of expression.
<calc at="X" expr="algebraic expression"/> Evaluate the algebraic expression in the current context.
<subquery at="X" names="A B"> expression </subquery> Evaluate the subquery, importing the names A and B from the outer context. Store the result in attribute X.
End of Guide for using the XsdbXML framework.
return to index