xFeed AJAX data publication documentation

Abstract: The xFeed example application of the xsdb package provides an infrastructure which makes it easy to implement "type ahead suggest" drop down selections matching against large or small data sets. The xFeed example illustrates how the xsdb framework can simplify data management and transmission in real applications. All software described here is open source and freely available for use and modification by anyone. This document explains how to install and use xFeed.

Keywords: AJAX, Asynchronous JavaScript and XML, data publication, database, xsdb, Python, data feed, open source, web development, active pages.

Audience: This document is intended for internet professionals who would like to add AJAX type ahead suggestion functionality to documents on their web sites. Readers may use the software described here directly, or use the information provided here, supplemented with the source code, to help them develop their own AJAX functionality.

Prerequisites: To understand this document you need to have a basic familiarity with HTML and Javascript. To implement the techniques described in this document you must be able to add web pages to an internet server. To add the optional indexing methods described below you must be able to add Python CGI scripts or Apache web server mod-python components to an internet server. This software uses recent Javascript features and may not work properly in combination with older browsers (see comments below).

Overview

AJAX (Asynchronous JavaScript and XML) is a web methodology which provides advanced, highly usable interfaces in web pages. AJAX technology uses features of XML, javascript, HTML, and server side software.

One of the most common applications of AJAX is "type ahead suggestions" where an AJAX component combines some of the information typed into a web form with information provided from a web server in order to suggest likely completions for elements of the web form. This document discusses the xFeed package which provides a simple methodology for implementing type ahead suggestions. Some live demos of pages that use the methodology are the xFeedMe zipcode demo page (which uses btree indexing on the server) and the xFeedMe state information page (which uses a static xsdb data page).

For example, the page shown below provides information on books and prices. In the screen-shot the user has typed the book title prefix "apoc" and the AJAX interface has automatically determined the full names of all books that start with that prefix. (The interface shown is available live on the internet at http://www.xfeedme.com/xsdbXML/xFeed/BookMatch.html)


EXAMPLE: When the user types the prefix "apoc" the `xFeed` form watcher offers suggested completions.

In this example the user may select any one of the options in the drop down box to fill in the complete book name and all of the other data fields about that book automatically.

The xFeed package implements this kind of functionality as follows:

When the HTML page containing the form is loaded a "FormWatcher" javascript object is created which periodically looks at input elements of the form to see if entries have changed.
When the FormWatcher detects a change the watcher creates an xsdb "pattern" representing the current information in the form. For example, if the user has typed "apoc" in the BookTitle input box and "v" in the Authors List input box then the pattern would look like the following:
<and> <s at="TitleAsShown"><prefix>apoc</prefix></s> <s at="AuthorsList"><prefix>v</prefix></s> </and>
This pattern is then sent as part of a query request to an xsdb component on a server.

The xsdb server component evaluates the query and replies with the match shown below (with some of the more verbose values abbreviated)

<or>
<and>
   <s at="ISBN">9004106219</s>
   <s at="TitlePageNames">derk visser</s>
   <s at="ListPriceEuro">112</s>
   <s at="ListPriceDollar">160</s>
   <s at="MainCategory">medieval &amp; early modern studies</s>
   <s at="AuthorsList">visser, d.</s>
   <s at="PublicationDate">1996-08-01t00:00:00.000</s>
   <s at="Phase">in print</s>
   <s at="BIC1">hbbc</s>
   <s at="NumberOfPages">xxxii, 240 pp.</s>
   <s at="SubCategory">medieval history</s>
   <s at="Readership">all those interested in the history of medieval...</s>
   <s at="Cover">cloth with dustjacket</s>
   <s at="ShortAbstract">this study identifies berengaudus of...</s>
   <s at="SeriesTitle">studies in the history of christian traditions</s>
   <s at="BISAC1">his037010</s>
   <s at="Subtitle">the apocalypse commentary of berengaudus...</s>
   <s at="Abstract">this study relates the utopian ...</s>
   <s at="TitleAsShown">apocalypse as utopian expectation (800-1500)</s>
   <s at="VolumeNumber">73</s>
</and>
</or>

The xFeed component then conjoins the reply with the original query and extracts the resulting match.
In this case since there is a single match the xFeed component fills in all form elements with the corresponding values.
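
The exchange described above can be sketched in a few lines of Python. This is only an illustration of the pattern and reply formats shown in this section, not the FormWatcher code itself (which is javascript). The function names build_prefix_pattern and single_match are made up for this example, and the sketch assumes a reply is always an <or> element containing zero or more <and> records.

```python
import xml.etree.ElementTree as ET


def build_prefix_pattern(form_values):
    """Build an <and> pattern with one <prefix> test per non-empty form field.

    Hypothetical helper: mirrors the pattern text shown above.
    """
    pattern = ET.Element("and")
    for attribute, typed in form_values.items():
        if not typed:
            continue  # an empty input places no constraint on the match
        s = ET.SubElement(pattern, "s", {"at": attribute})
        prefix = ET.SubElement(s, "prefix")
        prefix.text = typed.lower()  # the data is stored in lower case
    return ET.tostring(pattern, encoding="unicode")


def single_match(reply_text):
    """Return {attribute: value} when the reply holds exactly one record, else None."""
    reply = ET.fromstring(reply_text)
    records = reply.findall("and")  # direct <and> children of the <or> root
    if len(records) != 1:
        return None
    return {s.get("at"): (s.text or "") for s in records[0].findall("s")}
```

With the form values from the example (TitleAsShown "apoc", AuthorsList "v") the first function produces the pattern text shown above without the extra spaces, and the second function turns a one-record reply into a dictionary that could be used to fill in the form.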


EXAMPLE: When the user types the prefix "apoc" for the title and "v" for the author the rest of the form values are filled with the information for the only matching book.

This document shows how to create an xFeed enabled page with auto-completion functionality, which requires the following general steps.

A) INSTALL JAVASCRIPT: Publish the xsdball.js and formWatcher.js javascript modules on your.webserver.com.
B) ENABLE SERVER: Provide some sort of server component which publishes xsdb formatted data in response to HTTP queries.
C) CREATE THE FORM: Create and publish an HTML page including a web form and a javascript onload function which initializes and starts a FormWatcher for that form, connecting it to the xsdb server data source.

Please note that because of the javascript "same origin" security restriction all components must be installed on the same server (your.webserver.com). Also, step (B) can be very simple or very complex. This document will discuss steps A and C and will describe three ways to implement step B.

STEP A: Publishing javascript components

In order to use the xFeed component you must publish the xFeed and xsdb javascript components on your server. These components run on the client browser and not on your server -- but the server must provide them to the client as text.

Get the xFeed distribution (from http://sourceforge.net/project/showfiles.php?group_id=93603, the downloads page of the xsdb sourceforge distribution) and unzip it. The two required files are

	xsdb/xsdbXML/xFeed/formWatcher.js
	xsdb/xsdbXML/xsdbjs/xsdball.js

I publish these files on my servers by placing them at

	[SERVERROOT]/xsdbXML/xFeed/formWatcher.js
	[SERVERROOT]/xsdbXML/xsdbjs/xsdball.js

(The simple way to do this is to copy the xsdbXML directory under the SERVERROOT.)

This makes the files available to clients at the URLs

	http://my.server.com/xsdbXML/xFeed/formWatcher.js
	http://my.server.com/xsdbXML/xsdbjs/xsdball.js

If you put the files in some other server relative location you must make appropriate changes to URLs mentioned in this document to reflect the location you chose.

STEP B option 1: Using a static `xsdb` source file

One of the reasons xFeed is easy to use is that it does not necessarily require any "live" server component in order to work -- for small data sets or for testing purposes you may use a static xsdb data file to provide the server information required by the form. The javascript xsdb implementation is capable of evaluating any xsdb query using static data without any help from the server.

This approach is only advisable for testing and small data sets because for larger data sets the extra processing required by the client (in the same processor thread as the graphical interface) makes the browser user interface slow and jerky.

For example, we may publish the file containing information about U.S. states, provided in the distribution as

	xsdb/xsdbXML/xFeed/testdata/states.xsdb

by placing it under the server root at

	[SERVERROOT]/test/states.xsdb

This makes it possible for clients to load the contents of the file

<context>
<title>State names, trees, flowers and capitals of U.S states</title>
<or>
 <and>
  <s at="State">alabama</s>
  <s at="Capital">montgomery</s>
  <s at="StateTree">longleaf pine</s>
  <s at="TreeScientific">pinus palustis</s>
  <s at="StateFlower">camelia</s>
  <s at="FlowerScientific">camellia japonica</s>
 </and>
 <and>
  <s at="State">alaska</s>
  <s at="Capital">juneau</s>
  ...
 </and>
...
</or>
</context>

using the URL

	http://my.server.com/test/states.xsdb

An xFeed enabled web page can extract appropriate data from this file automatically. You can test that the data is available by opening the URL for the xsdb page (the browser will probably try to interpret the page as html, so it will look strange, but if you "view source" you should see the xsdb data).
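
To make the idea of "extracting appropriate data" from a static file concrete, here is a rough Python sketch of what a client has to do: read the <and> records out of the <context> and keep those whose values start with what the user typed. The real work is done in the browser by the javascript xsdb implementation; the names load_records and match_prefixes are invented for this illustration, and the sample text is an abbreviated excerpt of the file shown above.

```python
import xml.etree.ElementTree as ET

# Abbreviated excerpt of states.xsdb (only attributes shown in this document).
STATES_XSDB = """<context>
<title>State names, trees, flowers and capitals of U.S states</title>
<or>
 <and>
  <s at="State">alabama</s>
  <s at="Capital">montgomery</s>
 </and>
 <and>
  <s at="State">alaska</s>
  <s at="Capital">juneau</s>
 </and>
</or>
</context>"""


def load_records(xsdb_text):
    """Parse a static xsdb context into a list of {attribute: value} records."""
    root = ET.fromstring(xsdb_text)
    return [
        {s.get("at"): (s.text or "").strip() for s in record.findall("s")}
        for record in root.findall("./or/and")
    ]


def match_prefixes(records, prefixes):
    """Keep the records where every named attribute starts with the given prefix."""
    return [
        record
        for record in records
        if all(record.get(a, "").startswith(p) for a, p in prefixes.items())
    ]
```

Scanning every record like this is fine for fifty states, which is why the static approach suits small data sets.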

Note that all values in the xsdb file content such as alaska are in lower case letters. For simplicity at this writing the FormWatcher component only works with lower case letters. Please complain if you need it to handle mixed case also. Until this limitation is lifted please make sure that any static xsdb data files used with a FormWatcher have only lower case values.
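
If your data starts out in mixed case, a small script can write the static file for you with every value lower cased. The following is a minimal sketch, assuming the simple <context>/<title>/<or>/<and>/<s at="..."> layout shown above is all you need; records_to_xsdb is a hypothetical helper and is not part of the xsdb distribution.

```python
import xml.etree.ElementTree as ET


def records_to_xsdb(title, records):
    """Serialize records as a static xsdb context with lower-cased values.

    `records` is a list of {attribute: value} dictionaries.
    """
    context = ET.Element("context")
    ET.SubElement(context, "title").text = title
    alternatives = ET.SubElement(context, "or")
    for record in records:
        conjunction = ET.SubElement(alternatives, "and")
        for attribute, value in record.items():
            # Values are lower cased; attribute names are left untouched.
            ET.SubElement(conjunction, "s", {"at": attribute}).text = str(value).lower()
    return ET.tostring(context, encoding="unicode")
```

Using an XML library rather than string pasting also takes care of escaping characters such as & and < inside values.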

STEP C: Creating an `xFeed` enabled form

An xFeed enabled web page will usually include three components:

A <form> with elements to be automatically completed.
A javascript page onload function that initializes the FormWatcher and starts the form watching process.
References to required javascript modules.

The StatesSimple.html example page has these components in the following general outline:

<html>
<head>
<script language="JavaScript">
// the form watcher must be bound to a global variable
var Watcher = null;

// this function is called after the page is loaded (body.onload())
function OnLoadFunction() {
	... ONLOAD FUNCTION BODY DISCUSSED BELOW ...
}
</script>
</head>

<body onload="OnLoadFunction()">

<form name="StateForm" id="StateForm">
	... FORM ELEMENTS DISCUSSED BELOW ...
</form>

<script src="/xsdbXML/xsdbjs/xsdball.js">
</script>
<script src="/xsdbXML/xFeed/formWatcher.js">
</script>

</body>
</html>

The contents of the form and the OnLoadFunction have been deferred and the required xsdb components "/xsdbXML/xsdbjs/xsdball.js" and "/xsdbXML/xFeed/formWatcher.js" are put at the end of the HTML text to allow the browser to format the HTML before the javascript files are loaded.

The form includes a number of input elements that the FormWatcher will attempt to automatically complete.

<form name="StateForm" id="StateForm">
<input type="reset" value="reset"><br>
State Name: <input type="text" name="State" size="40"> <br>
State Capital: <input type="text" name="Capital" size="40"> <br>
State Tree: <input type="text" name="StateTree" size="40"> <br>
State Tree (scientific name): <input type="text" name="TreeScientific" size="40"> <br>
State Flower: <input type="text" name="StateFlower" size="40"> <br>
State Flower (scientific name): <input type="text" name="FlowerScientific" size="40">
</form>

In this case for simplicity the names of the input elements match the attribute names for the data values in the xsdb source file. Thus the input element named Capital

  <input type="text" name="Capital" size="40"> <br>

is to be associated with the xsdb values associated with the name Capital like

  <s at="Capital">juneau</s>

The form id (with value StateForm) is also important because the FormWatcher uses the id to locate the form.

The OnLoadFunction() initializes the FormWatcher and specifies the form elements to complete

// the form watcher must be bound to a global variable
var Watcher = null;

// this function is called after the page is loaded (body.onload())
function OnLoadFunction() {
	// create a watcher and bind autofill attributes to form input elements
	Watcher = new FormWatcher("Watcher", "StateForm", "/test/states.xsdb", 55);
	// since the data is static, preload and compile it
	Watcher.preload();
	// identify the input elements to watch
	Watcher.complete("State");
	Watcher.complete("Capital");
	Watcher.complete("StateTree");
	Watcher.complete("TreeScientific");
	Watcher.complete("StateFlower");
	Watcher.complete("FlowerScientific");
	//start watching
	Watcher.watch(2000);
}

Note that we must declare a global variable (in this case named Watcher) to house the form Watcher at the global javascript scope.

The constructor

	Watcher = new FormWatcher("Watcher", "StateForm", "/test/states.xsdb", 55);

creates a FormWatcher associated with the form with the id StateForm and bound to the xsdb data source at relative URL /test/states.xsdb (which resolves to the absolute URL http://your.server.com/test/states.xsdb). The 55 specifies that no drop down completion box should have more than 55 elements.

The directive

	Watcher.preload();

should only be used with static data sources such as /test/states.xsdb. The preload directive preloads the data file text and "precompiles" it to reduce network traffic and unneeded reprocessing.

The directive

        Watcher.complete("State");

directs the Watcher object to attempt autocompletions for the State input element of the StateForm form listed above.

	<input type="text" name="State" size="40">

By default the input element name is assumed to be the same as the data attribute name. The full calling sequence for Watcher.complete

	Watcher.complete(id, noDropDown, bindAttribute)

binds the form element id to the attribute name bindAttribute and if noDropDown is true the watcher will not generate drop down completions for that element (which may be desired if it sometimes contains long values).

Finally the function starts the watcher main loop

	Watcher.watch(2000);

in this case specifying that the watcher should check the form for changes every 2000 milliseconds (every 2 seconds). This starts a "polling loop" where the watcher repeatedly

waits 2 seconds;
if the watched form elements have changed, the Watcher builds a query from the new values and evaluates it against the data in http://your.server.com/test/states.xsdb;
the Watcher fills in any data element with only one possible value;
if there are more than 1 but fewer than 55 possibilities for the focus data element (the element the user is typing in), the Watcher offers those possibilities in a drop down list;
repeats.

This loop continues so long as the page is active. (The actual implementation is more complex than the outline above.)
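
For readers who prefer code to prose, one pass of the loop outlined above might look roughly like the following Python sketch. It is a simplification written for this explanation, not a translation of formWatcher.js: the function poll_once, its arguments, and the in-memory RECORDS list are all assumptions, and the waiting and the drop down display are left out.

```python
# Sample records using only values quoted in this document.
RECORDS = [
    {"State": "alabama", "Capital": "montgomery", "StateFlower": "camelia"},
    {"State": "alaska", "Capital": "juneau"},
    {"State": "new mexico", "StateFlower": "yucca"},
]


def poll_once(form, last_seen, records, focus, max_options=55):
    """Run one polling pass; return (updated form values, drop down options)."""
    if form == last_seen:
        return form, []  # nothing was typed since the last pass
    # Treat every non-empty input as a prefix that matching records must share.
    matches = [
        record
        for record in records
        if all(record.get(name, "").startswith(typed.lower())
               for name, typed in form.items() if typed)
    ]
    filled = dict(form)
    for name in form:
        values = {record.get(name) for record in matches}
        # Fill an element only when every match agrees on a single value.
        if len(values) == 1 and None not in values:
            filled[name] = values.pop()
    focus_values = sorted({record[focus] for record in matches if focus in record})
    # Offer a drop down only for more than 1 but fewer than max_options choices.
    options = focus_values if 1 < len(focus_values) < max_options else []
    return filled, options
```

A caller would sleep for the polling interval, call poll_once with the current form values, remember the returned values as last_seen, and repeat.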

And that's all you need (for small data). No special server side software is required. You should be able to open the HTML page (http://your.server.com/test/StatesSimple.html), type yu into the State Flower input element, and in a couple of seconds all appropriate values for "new mexico" (with state flower "yucca") should be filled in automatically.

If it doesn't work you could try:

Make sure the xFeedMe state information page works for the browser you are testing with, and if it doesn't, use a more fully featured browser.
Make sure the data page states.xsdb is where you told the Watcher to find it.
Look for simple javascript errors using the javascript console or other features of your browser.
Panic and mail me for advice providing as much information as possible about the symptoms in your email including the full text of the html page.

Unfortunately you will probably find that as the static data size grows larger the query interface will become jerky and you will want to do more processing at the server, which will require installing some sort of server side component which responds to xsdb queries.

STEP B option 2: Using a CGI script in Python with BTree indexing to respond to xsdb queries

The xFeed package comes with an example server component for replying to xsdb queries posed by a client such as the FormWatcher. This component indexes table files provided as tab delimited text (as explained below) and serves xsdb formatted replies to queries using the indices. Programmers can use this approach directly or implement other approaches using this one as an example if desired. The example is configured for use either using CGI scripts or using mod-python under Apache. This section discusses the CGI script methodology. You may need to modify some or all of the absolute paths mentioned below to reflect your configuration.

Prerequisites: To use the Python CGI script you must have Python installed with a web server configured to support CGI scripts. You also must install the xsdb python implementation as discussed in the xsdb Python documentation.

In my case I have Apache installed on my Win32 workstation with CGI scripts enabled for scripts in the directory C:\Apache\Apache\cgi-bin which appears as the server relative directory http://my.server.com/cgi-bin/. My xsdb and xFeed distributions are placed under C:\xsdbSourceForge\.

Get the data. The xFeed indexer supports source data provided in tab delimited text format with a header line. An example of this format is provided in the distribution at xsdb/xsdbXML/xFeed/testdata/price_list.tab containing information about books and book prices. If you open this large file in a text editor such as emacs or WordPad it looks like this:

ISBN	TitleAsShown	Subtitle	TitlePageNames	Abstract	...
9004108157	The Missionary Lives	A Study in Canadian Missionary Biography 	...
9004079270	Medieval Islamic Symbolism and the Paintings in the ...
...

Here many rows and columns have been omitted. The first line provides attribute names for each column (each separated by a single tab character), and subsequent lines provide values for each column (separated by single tabs). Each line is terminated with a new line character. This is a very common format supported by many programs such as MySQL, Access, Excel, to name three.
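
If you want to check a tab delimited file before indexing it, Python's standard csv module can read this format directly. The sketch below is independent of the xFeed indexer (which does its own parsing); read_tab_table is a name chosen for this example.

```python
import csv


def read_tab_table(lines):
    """Read tab delimited text with a header line into a list of dictionaries.

    `lines` may be an open file (opened with newline="") or any iterable
    of text lines. QUOTE_NONE keeps quote characters in the data literal.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]
```

Each row comes back as a dictionary keyed by the attribute names from the header line, so a quick look at the first few rows will show whether the columns line up as expected.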

Prepare the index. Preprocessed indices allow the xFeed cgi script to find records of interest more quickly than it could by scanning the entire file for every access. The following simple python script distributed as xsdb/xsdbXML/xFeed/testdata/indexBooks.py builds indices using data from price_list.tab

import sys
sys.path.append("..") # make sure MatchAssertion module can be found
import MatchAssertion

M = MatchAssertion.TreeMatcher("bookprices")
f = open("price_list.tab")
indexAttributes = ["ISBN", "TitleAsShown", "Subtitle",
                   'AuthorsList', 'MainCategory', 'SubCategory']
M.CreateFromDataLinesWithHeader(f, indexAttributes, verbose=True)
M.close(verbose=True)

When run in the directory xsdb/xsdbXML/xFeed/testdata, indexBooks.py opens the price_list.tab file as f and constructs indices for each of the attribute names in the list

	indexAttributes = ["ISBN", "TitleAsShown", "Subtitle",
	                   'AuthorsList', 'MainCategory', 'SubCategory']

in the index files bookprices_0.idx through bookprices_5.idx. These B-tree based indices are now available for use by the CGI script which feeds xsdb data to the FormWatcher client.
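
To see why an index helps with prefix queries, consider the following toy Python example. It is NOT the B-tree code used by MatchAssertion.TreeMatcher; it swaps in a sorted in-memory list searched with the standard bisect module, which shows the same idea (jump straight to the first key with the prefix, then read neighbours) on a small scale. The class name PrefixIndex is invented here.

```python
import bisect


class PrefixIndex:
    """Toy sorted-list index over one attribute (a stand-in for a B-tree)."""

    def __init__(self, records, attribute):
        pairs = sorted(
            (record[attribute].lower(), position)
            for position, record in enumerate(records)
            if attribute in record
        )
        self.keys = [key for key, _ in pairs]
        self.positions = [position for _, position in pairs]

    def lookup(self, prefix):
        """Return positions of records whose indexed value starts with prefix."""
        prefix = prefix.lower()
        # All keys sharing the prefix are adjacent, starting at this point.
        start = bisect.bisect_left(self.keys, prefix)
        found = []
        for i in range(start, len(self.keys)):
            if not self.keys[i].startswith(prefix):
                break
            found.append(self.positions[i])
        return found
```

The lookup examines only the matching keys plus one, instead of every line of the data file, which is the benefit the preprocessed index files provide to the CGI script.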

The cgi script (provided in the distribution as xsdb/xsdbXML/xFeed/testdata/BookFeed.cgi) makes use of the bFeed.Feeder object which effectively hides most of the steps required to implement a CGI script.

#!c:\python23\python.exe
import sys
mydir = r"C:\xsdbSourceForge\xsdbXML\xFeed"
sys.path.append(mydir) # make sure the bFeed module can be found
import bFeed

f = bFeed.Feeder(mydir+"/testdata/bookprices")
f.doCGI()

Here the first group of statements identify the script as a Python script (to be executed by c:\python23\python.exe) and make sure that the Python system path includes the directory containing the bFeed module before importing that module. The statement

	f = bFeed.Feeder(mydir+"/testdata/bookprices")

constructs a Feeder object f associated with the index files created above and the line

	f.doCGI()

executes the CGI processing to handle a request. In particular the doCGI method

parses the CGI parameters,
looks for a required parameter named q which should contain an xsdb query text,
looks for an optional parameter named at which should name the current "focus" attribute (the attribute the user is typing),
uses the index files to find records which match the query string,
returns an xsdb format reply listing the matches found as the CGI standard output (of type text/xml).

The actual processing is more complex than this, but the above gives the general idea.
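
If you plan to write your own server component instead of using bFeed.Feeder, the steps listed above can be outlined as follows. This is a sketch under several assumptions, not the bFeed implementation: handle_query is a made-up function, the find_matches callback stands in for whatever index lookup you use, and the reply is assumed to be an <or> of <and> records as shown earlier in this document.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET


def handle_query(query_string, find_matches):
    """Turn a raw query string into (status, content type, body).

    `find_matches(query_text, focus)` is a hypothetical callback that
    returns a list of {attribute: value} records.
    """
    params = parse_qs(query_string)
    if "q" not in params:
        return "400 Bad Request", "text/plain", "missing required parameter q"
    query_text = params["q"][0]             # required: the xsdb query text
    focus = params.get("at", [None])[0]     # optional: the focus attribute
    reply = ET.Element("or")
    for record in find_matches(query_text, focus):
        conjunction = ET.SubElement(reply, "and")
        for attribute, value in record.items():
            ET.SubElement(conjunction, "s", {"at": attribute}).text = value
    return "200 OK", "text/xml", ET.tostring(reply, encoding="unicode")
```

Keeping the request handling separate from the index lookup makes it easy to test the handler without a web server, and to reuse it under CGI or any other hosting arrangement.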

To enable this CGI script move BookFeed.cgi to a cgi-bin directory for your server (in my case to C:\Apache\Apache\cgi-bin). In some cases you may need to do other things like make the cgi-script file executable (see your server documentation) and modify the absolute paths listed in the file content.

Test the cgi script. At this point if all went well you should be able to test your cgi-script by pointing a browser at

http://your.server.com/cgi-bin/BookFeed.cgi?q=<s%20at="ISBN">9004106219</s>

or a similar URL reflecting your configuration. This access requests that BookFeed.cgi find a match to the query <s at="ISBN">9004106219</s>. In my case the CGI script works and the browser presents the matching record as XML.
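
A side note on the test URL: it contains characters such as <, " and > which browsers usually tolerate when typed into the address bar, but which should be percent-encoded when a URL is built by a program. A minimal Python sketch, assuming the q and at parameter names described above (query_url is a name chosen for this example and your.server.com is a placeholder):

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def query_url(base_url, query_text, focus=None):
    """Build a request URL with the xsdb query safely percent-encoded."""
    params = {"q": query_text}
    if focus is not None:
        params["at"] = focus
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = query_url("http://your.server.com/cgi-bin/BookFeed.cgi",
                    '<s at="ISBN">9004106219</s>')
    print(url)
    # Decoding the query string recovers the original xsdb query text.
    print(parse_qs(urlsplit(url).query)["q"][0])
```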

Now the BookFeed.cgi CGI script is ready to be used by a FormWatcher on an HTML page. The distribution provides xsdb/xsdbXML/xFeed/testdata/CGISimple.html as a simplified example page which uses BookFeed.cgi to update a form. This HTML file is very similar to the StatesSimple.html example given above.

The form in CGISimple.html is similar to the form for StatesSimple.html except that it has a different id, different form elements and some of the form elements are sometimes large enough that they must be presented using a textarea input element.

<form name="BookForm" id="BookForm">
<input type="reset" value="reset"><br>

ISBN: <input type="text" name="ISBN" size="13"> <br>
Title As Shown: <input type="text" name="TitleAsShown" size="40"> <br>
Subtitle: <input type="text" name="Subtitle" size="40"> <br>
list of authors: <input type="text" name="AuthorsList" size="40"> <br>
Main Category: <input type="text" name="MainCategory" size="40"> <br>
Subcategory: <input type="text" name="SubCategory" size="40"> <br>
abstract: <textarea name="Abstract" cols="40" rows="8"></textarea> <br>
short abstract: <textarea name="ShortAbstract" cols="40" rows="3"></textarea> <br>
Readership: <textarea name="Readership" cols="40" rows="3"></textarea> <br>
Reviews: <textarea name="Reviews" cols="40" rows="3"></textarea> <br>
Author CV: <textarea name="CV" cols="40" rows="3"></textarea> <br>
phase of publication: <input type="text" name="Phase" size="20"> <br>
number of pages: <input type="text" name="NumberOfPages" size="13"> <br>
Cover info: <input type="text" name="Cover" size="13"> <br>
List price (euros): <input type="text" name="ListPriceEuro" size="13"> <br>
List price (dollars): <input type="text" name="ListPriceDollar" size="13"> <br>
Publication Date: <input type="text" name="PublicationDate" size="13"> <br>
Series Title: <input type="text" name="SeriesTitle" size="40"> <br>
Volume Number: <input type="text" name="VolumeNumber" size="13"> <br>

</form>

The onload function for CGISimple.html is similar to the onload function for StatesSimple.html except that the watcher is bound to the form id BookForm and to the data source /cgi-bin/BookFeed.cgi, and the data is not preloaded.

// the form watcher must be bound to a global variable
var Watcher = null;

// this function is called after the page is loaded (body.onload())
function onLoadFunction() {
	// create a watcher and bind autofill attributes to form input elements
	Watcher = new FormWatcher("Watcher", "BookForm", "/cgi-bin/BookFeed.cgi", 20);
	Watcher.complete("ISBN");
	Watcher.complete("TitleAsShown");
	Watcher.complete("Subtitle");
	Watcher.complete("AuthorsList");
	Watcher.complete("MainCategory");
	Watcher.complete("SubCategory");
	// also bind attributes that will not have drop down completions
	Watcher.complete("Abstract", true);
	Watcher.complete("ShortAbstract", true);
	Watcher.complete("Readership", true);
	Watcher.complete("Reviews", true);
	Watcher.complete("CV", true);
	Watcher.complete("Phase", true);
	Watcher.complete("NumberOfPages", true);
	Watcher.complete("Cover", true);
	Watcher.complete("ListPriceEuro", true);
	Watcher.complete("ListPriceDollar", true);
	Watcher.complete("PublicationDate", true);
	Watcher.complete("SeriesTitle", true);
	Watcher.complete("VolumeNumber", true);
	//start watching
	Watcher.watch(1000);
}

Also, since many of the attributes are not appropriate for drop down list box completion suggestions, the Watcher is bound to them with noDropDown set true, for example as in

	Watcher.complete("ShortAbstract", true);

Finally the Watcher polling loop is started with the polling interval set to 1000 milliseconds (1 second).

To enable CGISimple.html for your server move it to an appropriate location under your server root, perhaps under htdocs/test so it corresponds to the URL http://your.server.com/test/CGISimple.html. If everything is hooked up properly you should now be able to point a browser at http://your.server.com/test/CGISimple.html, type gods into the "list of authors" input box and see the Watcher automatically fill in all values corresponding to the book with ISBN 9004129553.

STEP B option 3: Using Apache mod_python with BTree indexing to respond to xsdb queries

If you happen to be running an Apache web server with the mod-python component available you can get a further performance enhancement by using a mod-python module in place of a CGI script. The mod-python implementation improves on the CGI implementation since it does not need to reinitialize the Python interpreter and reopen the index files for every access. This section is briefer than the others because mod-python is less common than CGI scripting and if you know how to use it you are probably pretty savvy and don't need many hints :).

The mod-python option is almost identical to the CGI option except that the mod-python module code (provided in the distribution as xsdb/xsdbXML/xFeed/testdata/BookFeed.py) looks something like this

import sys
mydir = "/home/awatters/webapps/mod_python/htdocs/xsdbXML/xFeed"
sys.path.append(mydir)
import bFeed

f = bFeed.Feeder(mydir+"/testdata/bookprices")

def BookFeed(req, q, at=None):
    result = f.doModPythonPublisher(req, q, at)
    return result

Here again the bFeed.Feeder abstraction hides most of the mechanisms needed for the implementation. Another difference is that the URL for the mod-python module differs from the URL for the CGI script. In my case the server relative URL becomes /xsdbXML/xFeed/BookFeed.py/BookFeed. Consequently the HTML page which uses the mod-python book feed must be bound to the new url as

	Watcher = new FormWatcher("Watcher", "BookForm", "/xsdbXML/xFeed/BookFeed.py/BookFeed", 20);

In all other particulars the mod-python option is similar to the CGI option.
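
The performance difference between the two options comes down to where the expensive setup happens. The following Python toy makes the point with a counter; CountingIndex is a made-up stand-in for opening the index files, and neither function is real CGI or mod-python code.

```python
class CountingIndex:
    """Stand-in for an index object that is costly to open."""

    opened = 0  # counts how many times the "files" were opened

    def __init__(self, path):
        CountingIndex.opened += 1
        self.path = path

    def lookup(self, query):
        return []  # the lookup itself is not the point of this sketch


def cgi_style_request(query):
    # A CGI script starts from scratch on every access,
    # so the index is reopened for each request.
    index = CountingIndex("bookprices")
    return index.lookup(query)


# A long-lived module opens the index once, when it is first imported...
_shared_index = CountingIndex("bookprices")


def module_style_request(query):
    # ...and every later request reuses the same object.
    return _shared_index.lookup(query)
```

After any number of calls to module_style_request the counter does not move, while each call to cgi_style_request adds one to it.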

Additional comments

This is early release software. I think I've seen situations where the Watcher polling loop quits for mysterious reasons. The suite seems to work with recent browsers including MSIE 6, Firefox 1.5, and Safari 2.0. Your mileage may vary.

A number of features of the package are not documented here. Please look to the source if you want to find more options.

For the moment all strings are translated to lower case for simplicity. A relatively simple elaboration (involving shadowing cased values with lowercased copies) would get around this issue, but it has not been implemented at this writing.
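
One way the shadowing idea could work is sketched below in Python. Since it has not been implemented, this is purely an illustration of the approach: each value is stored together with a lower-cased copy, matching is done against the copy, and the original cased value is what gets returned. The function names are hypothetical.

```python
def add_shadows(records):
    """Pair every value with a lower-cased shadow copy used for matching."""
    return [
        {attribute: (value, value.lower()) for attribute, value in record.items()}
        for record in records
    ]


def match_prefix_cased(shadowed, attribute, typed):
    """Match on the lower-cased shadow but return the original cased values."""
    prefix = typed.lower()
    return [
        {name: cased for name, (cased, _) in record.items()}
        for record in shadowed
        if attribute in record and record[attribute][1].startswith(prefix)
    ]
```

The cost is roughly a doubling of the stored text, in exchange for case-insensitive typing and properly capitalized results.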

There are a lot of possible ways xFeed could be extended. Please offer suggestions to Aaron Watters (aaronwmail-xfeedme@yahoo.com) (me).

References

xFeedMe.com provides several xFeed demos as well as links to other demos and additional recommended readings.
xsdb.sourceforge.net is the central repository for downloads and documentation about xsdb and related technologies including xFeed.
http://sourceforge.net/project/showfiles.php?group_id=93603 is a direct link to the xsdb software downloads page which includes downloads for the xFeed package.
Python.org is the first place to go for information and downloads about the Python programming language.
Apache.org is the source for Apache web server software downloads and information.
ModPython.org has information and downloads for the Apache mod-python module.
http://en.wikipedia.org/wiki/AJAX -- the Wikipedia entry on Ajax provides a good general introduction to what AJAX is.
http://serversideguy.blogspot.com/2004/12/google-suggest-dissected.html -- the "Google suggest dissected" article and code provided by Chris Justus was a helpful source of hints while developing this software.
For about 20 bucks http://www.zipcodedownload.com/ will send you a zip code database that can be used with xFeed to make zip code completions as demonstrated at the xFeedMe zipcode demo.
