SkiCal in DAML+oil
Libby Miller
Greg FitzPatrick
Abstract
DAML+oil is a language for describing ontologies, building
on RDF Schema and XML Schema. It enables you to describe objects such as
events, people, documents and their relationships with precision. It
uses references to XML Schema datatypes to describe integers, dates and
other datatypes.
The iCalendar mime-directory standard is a format for describing
events, to-do lists and journals, including associated date-formats,
timezones, and alarms. It is used in the calendaring and scheduling
applications of most major desktop and PDA personal information managers.
SkiCal is an extension to iCalendar which describes public
events such as concerts, sports competitions and conferences.
DAML+oil language constructs are capable of describing aspects of
SkiCal objects such as price, age restrictions and start and end dates and
times with precision and flexibility. SkiCal makes for a practical and
useful potential application of DAML+oil. In our presentation we describe
the DAML+oil language in general terms, and show some ways in which it
can be of practical use for creating applications which store and
retrieve SkiCal and other event-based information.
RDF and XML
[This section is mostly by danbri, from
http://www.w3.org/2002/03/whyrdf/schemarama, slightly reorganised]
The W3C's Resource Description Framework (RDF) provides a simple
graph-based information model for Web applications that need to exchange
data in a flexible yet predictable way. Since RDF data is typically
encoded and exchanged using XML documents, RDF in effect provides a set of
restrictions or design conventions for well-formed XML documents.
XML provides us with the notion of both well-formed and valid documents. A
valid document conforms to some specific DTD (or schema), whereas a merely
'well-formed' document is constrained only to have the basic elements,
attributes and content structure shared by all XML documents. RDF can be
thought of as an attempt to find a middle ground between the strict notion
of 'conformance to a named schema' and the much weaker 'unconstrained tag
soup'. A well-formed XML document that is written in conformance with the
RDF syntax is understood to encode an edge-labeled directed graph, whose
nodes and edges may be labeled with Web resource identifiers (URIs).
RDF is particularly suited for the deployment of XML in the World Wide
Web, since the distributed nature of the Web often requires us to
mix information from multiple sources and applications within a single
document. While the notion of 'well-formedness' and the XML Namespaces
mechanism provide the basic infrastructure for mixed-namespace XML
markup, RDF supplies a much needed set of constraints and conventions for
using these. RDF can be thought of as offering 'design patterns' for
creators of mixed-namespace documents. By writing mixed-namespace XML in
the style specified by the RDF Syntax recommendation, we reduce some of
the unpredictability and variation associated with the unconstrained use
of well-formed XML.
As a graph structure, RDF is useful even without knowledge of the
specific vocabularies used. While these vocabularies (as documented in RDF
schemas associated with namespaces) can provide useful meta-information to
support more sophisticated applications, RDF was designed to be useful in
the absence of schema information. This characteristic of RDF can be
thought of as a restriction of XML's notion of well-formedness, and as
embodying the notion of 'semi-structured' or schema-less data.
So if we can use RDF query, storage and API without need for RDF schemas,
what use is RDF schema? And what use might we make of more sophisticated
extensions to RDF schema such as DAML+OIL or W3C's new Web Ontology
language?
The RDF information model essentially consists of nodes and arcs
(resources and properties). RDF additionally makes the stronger claim that
these graph data structures are not arbitrary computational constructs,
but encodings of claims about the world. An RDF document is the kind of
thing that (in some context) can be true or false. When dealing with RDF,
as a consequence of this, we often deal with two parallel sets of
terminology. Considered as a graph data structure, we talk of 'nodes and
arcs' (or edges); considered as a representational formalism that makes
claims about the world, we talk of 'classes' of 'resource' and their
'properties'. When we hear that RDF is supposed to be 'semantic', or
'meaningful', it is related to this second set of terminology, and to the
goal that RDF documents should be considered as 'saying things about the
world'.
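The 'nodes and arcs' view can be made concrete with a few lines of code. The following is a minimal sketch in Python, not part of any RDF toolkit: the two namespace URIs are hypothetical placeholders, and the data is borrowed from the iCalendar example used later in this paper.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One arc of the graph: subject node, property (arc label), object node."""
    subject: str
    predicate: str
    obj: str

# Hypothetical namespace URIs, used only for illustration.
EX = "http://example.org/events#"
ICAL = "http://example.org/ical#"

# A graph is just a set of arcs; order carries no meaning.
graph = {
    Triple(EX + "party", ICAL + "summary", "Bastille Day Party"),
    Triple(EX + "party", ICAL + "dtstart", "19970714T170000Z"),
}

def properties_of(graph, subject):
    """Collect every arc leaving `subject` as {property: set of values}."""
    found = {}
    for triple in graph:
        if triple.subject == subject:
            found.setdefault(triple.predicate, set()).add(triple.obj)
    return found
```

Read as a data structure this is a set of labelled arcs; read as a claim about the world it says that the resource identified by the first URI has a summary and a start time.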
RDF Schema provides a rather minimalistic basis for RDF vocabulary
description, allowing one to describe nodes as being instances
of classes which can be hierarchically organised. It also allows the
description of properties hierarchically, and enables one to restrict the
types of nodes which can be used at either end of arcs. DAML+oil adds to
these restrictions, making for a more powerful and descriptive language
for describing the way in which objects in the world relate to one
another.
DAML+oil
DAML+oil is a language describing a number of constraints you can impose on
a graph structure to describe:
- class hierarchies and relationships
- property hierarchies and relationships
- the cardinality of properties
- restrictions on which property can be used where
- datatype objects
- instance data
It is based on and uses the RDF Model and Syntax [RDFMS], and extends
RDF Schema [RDFS].
A major difference between XML Schema and DAML+oil (which may appear to
have overlapping functionality) is that an XML Schema defines a class of
XML documents by describing syntactic constraints on those documents,
while DAML+oil (like RDF Schema) is data-orientated, describing
constraints on objects, not on documents. DAML+oil and RDF Schema are
therefore particularly useful for describing relationships between objects
which are described over several different documents, including
relationships between schemas or ontologies.
SkiCal and iCalendar
The iCalendar mime-directory standard is a format for describing
events, to-do lists and journals, including associated date-formats,
timezones, and alarms. It is used in most major desktop and PDA
Calendaring and Scheduling applications.
Here is an example event described in iCalendar, taken from RFC 2445
[ICAL]:
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//hacksw/handcal//NONSGML v1.0//EN
BEGIN:VEVENT
DTSTART:19970714T170000Z
DTEND:19970715T035959Z
SUMMARY:Bastille Day Party
END:VEVENT
END:VCALENDAR
This fragment describes an event - a Bastille Day party - starting at 5pm
UTC on 14th July 1997 and ending the following day at 3.59am UTC.
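To show how little is needed to read such a fragment, here is a minimal sketch in Python. It is not a conforming RFC 2445 parser: it assumes one property per line, ignores line folding and property parameters, and handles only UTC date-times ending in 'Z'. The function names are our own.

```python
from datetime import datetime, timezone

ICAL_TEXT = """BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//hacksw/handcal//NONSGML v1.0//EN
BEGIN:VEVENT
DTSTART:19970714T170000Z
DTEND:19970715T035959Z
SUMMARY:Bastille Day Party
END:VEVENT
END:VCALENDAR"""

def parse_vevents(text):
    """Return one {property: value} dict per VEVENT block in `text`."""
    events, current = [], None
    for line in text.splitlines():
        name, _, value = line.partition(":")
        if name == "BEGIN" and value == "VEVENT":
            current = {}
        elif name == "END" and value == "VEVENT":
            events.append(current)
            current = None
        elif current is not None:
            current[name] = value
    return events

def parse_utc(value):
    """Turn an iCalendar UTC date-time such as 19970714T170000Z into a datetime."""
    parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)

events = parse_vevents(ICAL_TEXT)
start = parse_utc(events[0]["DTSTART"])
```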
SkiCal is an extension to iCalendar for describing public events,
like concerts, sports events and conferences. The core technology of
SkiCal is the 'WHA-machine': the structuring of information based on the
six common interrogatives: WHAT, WHEN, WHERE, WHOW, WHY, and WHO.
Here is a SkiCal event taken from the SkiCal Internet Draft [SKICAL]
(shortened):
BEGIN:VCALENDAR
VERSION:2.0
SKICALVER:1.0
PRODID:-//HandGenerated/SkiCal//NONSGML v1.0//EN
BEGIN:VEVENT
SKUID:kj08988b@nationalchamberorch.org
SUMMARY:Handel's "Messiah" featuring the National Chamber Orchestra
TITLE:Messiah
DTSTART:19991217T200000
DTEND:19991217T220000
VENUE:Indoors
PERSONS;SKiROLE="conductor":Takao Kanayama
PERSONS;SKiROLE="orchestra":National Chamber Orchestra
PERSONS;SKiROLE="creator":G.F.Handel
PRICE;PRXITEM="SFT:Far side";CURRENCY=USD:17
END:VEVENT
END:VCALENDAR
This describes an event with an identifier, a title, and a start and an
end time, but also information about the price of the event and the people
involved and their roles.
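The roles and price details are carried as parameters on the property name. As a rough sketch (our own helper, not taken from the SkiCal draft), a content line can be split into name, parameters and value as follows. The one subtlety is that a quoted parameter value may itself contain a colon or semicolon, as in PRXITEM above; the sketch assumes quotes are always balanced and does no unescaping.

```python
def split_content_line(line):
    """Split 'NAME;PARAM=VALUE;...:VALUE' into (name, params, value).

    Colons and semicolons inside double-quoted parameter values are
    treated as ordinary characters.
    """
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            head, value = line[:i], line[i + 1:]
            break
    else:
        raise ValueError("no ':' separating property name from value")

    # Split the head on semicolons that are outside quotes.
    parts, buf, in_quotes = [], [], False
    for ch in head:
        if ch == '"':
            in_quotes = not in_quotes
            buf.append(ch)
        elif ch == ";" and not in_quotes:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))

    params = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        params[key] = val.strip('"')
    return parts[0], params, value

price = split_content_line('PRICE;PRXITEM="SFT:Far side";CURRENCY=USD:17')
```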
Both SkiCal and iCalendar are designed for the machine interchange of
event data, including storage and scheduling applications. They are both
modeled using objects, properties and datatypes. They use a syntactic
structure called mime-directory, as defined in RFC 2425 [MIME], which is
also used for VCARD [VCARD]. Ordering of properties and values within the
main container constructs is not important. There are some cardinality
constraints over certain properties and their values, and in some places,
implicit class hierarchies between the objects.
Namespace mixing
Both iCalendar and SkiCal are good candidates to mark up in RDF, for
several reasons, the most important of which is the utility of being
able to combine event information with other kinds of information about
people, documents, webpages, geographical locations, and so on.
For example, iCalendar has a business meeting orientation, and is
therefore only concerned with people in their role as calendar users,
namely as the initiators and attendees for meetings.
SkiCal goes further by defining an explicit Persons property, which
enables one to talk about the roles the person might take within an
event, such as conductor, guide, manager. SkiCal is modular: you may have
little concern about the people involved in an event, just as you do not
need to know the driver's name of every bus you ride on. But sometimes
this information is crucial, and SkiCal allows references to be made
to URLs where such information exists for those who need it.
SkiCal and iCalendar data described in RDF could go further than this,
enabling information from several namespaces to be combined, instead of
constraining a person to be an organiser or providing information at the
end of a URL. For example, event information could be combined with a
vCard [VCARD] or 'friend-of-a-friend' [FOAF] vocabulary to provide more
detailed contact information for the person who might be an organiser of a
meeting or a tour guide for a walking tour. One could also describe web
documents in a similar way. Connecting a well-known vocabulary such as
Dublin Core [DC] to event data would enable the organiser of a meeting to
specify reading material for a meeting. Or one could say, for example,
'find me images which depict all the individuals coming to this meeting,
and also show me what documents they have authored'.
Many connections to other vocabularies could be described by
encoding iCalendar and SkiCal data in RDF using RDF Schema, or simply by
mixing the namespaces within RDF instance data. However, DAML+oil can be
used to describe these connections with more flexibility and also more
precision, for example by specifying how many of a certain property may be
used with a particular type of object, or by creating constraints on the
type of object linked to by a particular property from a certain object
(see below). The relationships between objects can be very complex and
highly structured, and aim to represent (some of) the complexity of
relationships in the world, for example, a father is a parent who is also
a male person; a manager_678 is a person working for company X with Y
years of experience.
Datatypes
Distinguishing between different forms of data (integers, text,
date-times, floats) is very important for indexing and querying, especially
with respect to dates and times. RDF does not at the time of writing
provide a model or syntax for datatyping, whereas DAML+oil can be used
with XML Schema datatypes.
Practical uses of DAML+oil in SkiCal for iCalendar
This section describes some useful aspects of DAML+oil with examples
drawn from the iCalendar and SkiCal draft DAML+oil ontologies
[ICAL-DAML], [SKICAL-DAML]. SkiCal and iCalendar are lengthy and complex
and so here we pick out three of the more interesting and functional uses
of DAML+oil for these calendar ontologies.
1. Assigning identifiers to objects
In RDF, objects have names and types, and where the names are global, they
represent unique identifiers for the objects. Where they are not global,
it is often helpful to be able to say that if an object has a certain
property, then that property and value together uniquely identify the
object. This assigns an identifier to an object which does not
already have one, or which does not naturally fall into the category of
things that have identifiers, such as people.
DAML+oil has a way of saying this. If we state that SKUID is a
daml:UnambiguousProperty, then this means that two objects with the same
value of a SKUID property are in fact the same object. For example:
kj08988b@nationalchamberorch.org
Handel's "Messiah" featuring the National Chamber Orchestra
kj08988b@nationalchamberorch.org
The first performance of Handel's "Messiah" by the National
Chamber Orchestra for 20 years.
kj08988b@nationalchamberorch.org
19991217T200000
all these property-value pairs are properties of the same VEVENT
object.
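The merging step can be sketched in a few lines of Python. This is an illustration of the idea rather than a DAML+oil processor: each partial description is a plain dictionary, SKUID is assumed to have been declared unambiguous, and the property name DESCRIPTION for the second fragment is our assumption.

```python
def smush(descriptions, key):
    """Merge partial descriptions that share a value for the unambiguous `key`.

    Returns {key value: {property: set of values}}.
    """
    merged = {}
    for description in descriptions:
        target = merged.setdefault(description[key], {})
        for prop, value in description.items():
            target.setdefault(prop, set()).add(value)
    return merged

SKUID = "kj08988b@nationalchamberorch.org"

fragments = [
    {"SKUID": SKUID,
     "SUMMARY": 'Handel\'s "Messiah" featuring the National Chamber Orchestra'},
    # DESCRIPTION is an assumed property name for this fragment.
    {"SKUID": SKUID,
     "DESCRIPTION": 'The first performance of Handel\'s "Messiah" by the '
                    "National Chamber Orchestra for 20 years."},
    {"SKUID": SKUID, "DTSTART": "19991217T200000"},
]

events = smush(fragments, "SKUID")
```

Three separately stated fragments collapse into a single event description keyed by its SKUID.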
A nice example of this is for people, as described in [SMUSH]. People
do not have identifiers like webpages do, but they do have personal email
boxes, which at any given time are in the ownership of one person. Email
addresses also have the advantage that they are fairly well-known
strings. If we designate CAL-ADDRESS in iCalendar as a
daml:UnambiguousProperty which points to an email address, then we can
say that if we find the following information:
Libby Miller
Elizabeth Miller
we can tell that the person with email address libby.miller@bristol.ac.uk
has two names, Libby Miller and Elizabeth Miller. daml:UnambiguousProperty
is much like the subject indicator construct in topic maps [TOPIC], which
allows for aggregation of topics.
Where there is no global identifier for an object, the use of
daml:UnambiguousProperty creates a key which identifies the object.
Often it does not make sense to assign a URI to an event or a person - and
in fact it would be a modeling error to use the webpage of an event or the
email address of a person as a surrogate for such an identifier, but
daml:UnambiguousProperty can be used as such a surrogate. This still allows
that the same event could have two identifiers, but it does mean that if
we discover two instances of the same value with this property, then
they refer to the same subject.
From a practical point of view, a piece of software that has had
information about daml:UnambiguousProperty programmed into it can use the
values of properties defined in this way in a schema as keys into the
database for instance data. This makes for effective indexing, storage and
retrieval of instance data in a schema- and daml:UnambiguousProperty-aware
database. In particular, it can aid with query processing for fast
retrieval of information. A property which is known to be a
daml:UnambiguousProperty can only retrieve one instance of its subject,
and can therefore prune the search space substantially.
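As an illustration of the indexing point, here is a small Python sketch under our own assumptions: instance data is held as (subject, property, value) tuples, '_:p1' is an arbitrary local node label, and 'name' is a hypothetical property. Only properties known to be unambiguous are placed in the key index.

```python
def build_key_index(triples, unambiguous):
    """Map (property, value) to its single subject for unambiguous properties."""
    index = {}
    for subject, prop, value in triples:
        if prop in unambiguous:
            # An unambiguous property value identifies at most one subject.
            index.setdefault((prop, value), subject)
    return index

triples = [
    ("_:p1", "CAL-ADDRESS", "libby.miller@bristol.ac.uk"),
    ("_:p1", "name", "Libby Miller"),
    ("_:p1", "name", "Elizabeth Miller"),
]

index = build_key_index(triples, unambiguous={"CAL-ADDRESS"})

# A direct dictionary lookup replaces a scan over all subjects.
person = index[("CAL-ADDRESS", "libby.miller@bristol.ac.uk")]
names = {value for subject, prop, value in triples
         if subject == person and prop == "name"}
```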
2. Combining namespaces
RDF Schema allows you to combine objects from the same or different
namespaces using the rdfs:subClassOf and rdfs:subPropertyOf constraints,
and also domain and range constraints on properties.
DAML has a number of useful constructs that allow the linking of classes
and properties together directly, by stating that they are different
names for the same thing. Examples of these are
daml:samePropertyAs/equivalentTo, daml:inverseOf and daml:sameClassAs.
These can be useful for creating hooks into existing ontologies or
schemas without rewriting all the instance data.
Suppose we have three ontologies, one describing organisational structure,
one publications and another company meetings.
Instead of rewriting all the instance data conforming to these ontologies
so that it all uses the same definition of a Person from a single
ontology, one could simply define the different definitions of Person as
the same class as each other using daml:sameClassAs. Depending on how the
data is stored this could be a very useful shortcut.
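One way an application might honour such statements without touching the instance data is to keep a table of equivalent class names. The sketch below uses a standard union-find structure in Python; the prefixed class names (org:, pub:, meet:) are hypothetical stand-ins for the three ontologies.

```python
class ClassEquivalence:
    """Track sets of class names declared equivalent (union-find)."""

    def __init__(self):
        self.parent = {}

    def find(self, cls):
        """Return the representative name for the set containing `cls`."""
        self.parent.setdefault(cls, cls)
        while self.parent[cls] != cls:
            # Path halving keeps later lookups short.
            self.parent[cls] = self.parent[self.parent[cls]]
            cls = self.parent[cls]
        return cls

    def same_class_as(self, a, b):
        """Record that `a` and `b` name the same class."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def equivalent(self, a, b):
        return self.find(a) == self.find(b)

classes = ClassEquivalence()
classes.same_class_as("org:Person", "pub:Person")
classes.same_class_as("pub:Person", "meet:Person")
```

A query for instances of org:Person can then also accept instances typed as meet:Person, since equivalence is transitive.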
DAML also allows the expression of domain, range and subclass and
subproperty relationships with more subtlety and in more detail than in
RDF Schema. In DAML+oil it is possible to say more about the sort of
calendar users expected in iCalendar, which might affect the sorts of
transactions expected by a scheduling program:
every calendar user is a person or a
robot
Another example: instead of saying that the range of a property COMPONENT
must _always_ have value CALCOMPONENT, we can state this restriction just
for a particular class of objects, e.g. for VCALENDAR objects:
A container for calendar components
- in this case also adding the restriction that there be at least one of
the property COMPONENT.
This restriction to a local range means that we can reuse COMPONENT
elsewhere with different restrictions: if we liked we could create an
ical:COMPONENT for CALCOMPONENTS like VEVENTS themselves, perhaps referring
to subevents of a main containing event.
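To make the idea of a local restriction concrete, here is a hedged Python sketch of how a checker might apply it. The table layout and function names are our own, and we assume VEVENT is a subclass of CALCOMPONENT. The rule applies only to VCALENDAR objects, so COMPONENT stays unconstrained elsewhere.

```python
# Local restrictions, keyed by the class they apply to.
RESTRICTIONS = {
    "VCALENDAR": {
        "COMPONENT": {"to_class": "CALCOMPONENT", "min_cardinality": 1},
    },
}

# Assumed class hierarchy: VEVENT is one kind of CALCOMPONENT.
SUBCLASS_OF = {"VEVENT": "CALCOMPONENT"}

def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or is one of its subclasses."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def violations(subject_class, properties, class_of):
    """List the ways `properties` break the restrictions on `subject_class`.

    `properties` maps property names to lists of object ids;
    `class_of` maps object ids to class names.
    """
    problems = []
    for prop, rule in RESTRICTIONS.get(subject_class, {}).items():
        values = properties.get(prop, [])
        if len(values) < rule["min_cardinality"]:
            problems.append(
                f"{prop}: expected at least {rule['min_cardinality']} value(s)")
        for value in values:
            if not is_a(class_of.get(value), rule["to_class"]):
                problems.append(f"{prop}: {value} is not a {rule['to_class']}")
    return problems

class_of = {"party": "VEVENT", "libby": "PERSON"}
```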
One can take existing ontologies and connect them in minute and precise
detail using daml properties such as daml:disjointWith,
daml:intersectionOf, daml:unionOf, daml:complementOf. iCalendar and SkiCal
are a good example of a pair of ontologies where one depends on terms
defined in another. The SkiCal TITLE property has an iCalendar TEXT value;
SkiCal OPTIME has most of the same properties as ical:VEVENT, and so we can
define VEVENT and OPTIME as subclasses of a more general class. In principle
DAML+oil can make relationships such as these more precise, but in
practice, at least in this case, finding examples of this is hard. It is
relatively straightforward to say that a person class is
disjointWith a plant class (no plant is a person) or that a father
class represents the intersection of parent class and male things class
(all fathers are both males and parents). But for iCalendar and SkiCal
more vaguely defined relationships seem to be sufficient, and any more
detail is not easily definable.
In addition, attempting to connect two ontologies with such precision
assumes a degree of information about the intentions of the creator of the
ontologies that is probably unwarranted. Person, father, plant are
comparatively easy to define and understand; time, location, role are less
straightforward.
However, if a degree of imprecision is acceptable and still informative,
then simple relationships like daml:sameClassAs can be time-saving and
useful.
3. Using XML Schema datatypes with RDF
Calendaring and scheduling applications need to be able to do operations
on dates and times, such as:
'find all the events that the calendar user Libby is attending
within the next week'
'find all the events that calendar users Greg and Dan are both attending
between June 21st 2002 and August 16th 2003'
'find every time period that Damian is free between 8.30am and 7.00pm GMT
today'
To display calendar information or schedule events, applications need to
be able to make these types of queries of a source of data, and that
requires knowing that certain objects are the sorts of things that one can
perform a 'greater than', 'less than' or 'between' operation on.
Datatyping can also enable or increase the speed of indexing in databases
(for example in several relational databases, datatyping is the only way
that these types of queries about dates can be performed).
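Once values are known to be date-times, queries like these reduce to ordinary comparisons. The following Python sketch assumes the values have already been converted to typed datetime objects and ignores time zones; the busy slots and the date used for 'today' are invented for illustration.

```python
from datetime import datetime

def starting_between(events, earliest, latest):
    """Summaries of the events whose start falls within [earliest, latest]."""
    return [e["summary"] for e in events
            if earliest <= e["dtstart"] <= latest]

def free_periods(busy, window_start, window_end):
    """Gaps inside the window that are not covered by any (start, end) slot."""
    free, cursor = [], window_start
    for start, end in sorted(busy):
        if end <= cursor or start >= window_end:
            continue  # slot lies wholly outside what remains of the window
        if start > cursor:
            free.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < window_end:
        free.append((cursor, window_end))
    return free

events = [
    {"summary": "Bastille Day Party", "dtstart": datetime(1997, 7, 14, 17, 0)},
    {"summary": "Messiah", "dtstart": datetime(1999, 12, 17, 20, 0)},
]

# Invented busy slots for one day.
day = (2002, 6, 21)
busy = [
    (datetime(*day, 13, 0), datetime(*day, 14, 30)),
    (datetime(*day, 10, 0), datetime(*day, 11, 0)),
]
gaps = free_periods(busy, datetime(*day, 8, 30), datetime(*day, 19, 0))
```

Neither function would be meaningful over untyped strings in arbitrary formats, which is why the datatype information in the schema matters.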
The RDFCore group is currently finalizing a model for expressing
datatypes. DAML+oil has its own methods of using datatypes. It has certain
syntactic structures which point to XML Schema datatypes, both at the
schema and the instance level.
For example, defining iCalendar in DAML+oil, one could say:
In which case sample instance data could look like this:
2002-03-15T15:00Z
and a DAML+oil processor could determine using the schema that the string
2002-03-15T15:00Z represented an XML Schema datatype, and index it
accordingly.
Unfortunately the syntactic description of a date-time in iCalendar is not
an identical subset of ISO 8601 to that used by XML Schema datatypes, and
so translating iCalendar correctly to DAML+oil would involve creating a
new datatype. This can be achieved by creating an XML Schema file with the
correct restrictions and pointing to the datatypes in that.
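The mismatch is easy to see in code. The Python sketch below first checks the basic iCalendar lexical form with a regular expression, much as a pattern restriction in a custom datatype might. It then shows a different route from the one described above: converting the value into the form XML Schema's dateTime expects, rather than defining a new datatype. It covers only the plain and UTC ('Z') forms and ignores TZID parameters.

```python
import re
from datetime import datetime

# Basic iCalendar date-time: YYYYMMDD 'T' HHMMSS, optionally followed by 'Z'.
ICAL_DATETIME = re.compile(r"[0-9]{8}T[0-9]{6}Z?")

def is_ical_datetime(text):
    """True if `text` has the basic iCalendar date-time shape."""
    return ICAL_DATETIME.fullmatch(text) is not None

def ical_to_xsd(text):
    """Rewrite an iCalendar date-time in XML Schema dateTime lexical form."""
    if not is_ical_datetime(text):
        raise ValueError(f"not an iCalendar date-time: {text!r}")
    is_utc = text.endswith("Z")
    parsed = datetime.strptime(text.rstrip("Z"), "%Y%m%dT%H%M%S")
    return parsed.isoformat() + ("Z" if is_utc else "")

converted = ical_to_xsd("19970714T170000Z")
```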
One distinction that we found difficult to draw clearly at times was
between what might be termed the 'default' datatypes of RDF and the XML
Schema datatypes. Although RDF does not yet have its own datatypes, nodes
must either be Literals or Resources. In some cases it is not clear
whether to say something is an XML Schema datatype or an RDF resource or
literal: for example do you say
ical:DESCRIPTION is a daml:DatatypeProperty with an xsd:string value
or
ical:DESCRIPTION is a daml:ObjectProperty with an rdfs:Literal value
similarly, is it:
ical:URI is a daml:DatatypeProperty with an xsd:anyURI value
or
ical:URI is a daml:ObjectProperty with an rdfs:Resource value?
DAML+oil makes a distinction between properties pointing to DAML+oil
objects and properties pointing to datatypes (which are always XML Schema
datatypes). These value spaces are disjoint, but there is an apparent
overlap, as described above.
In many cases however, XML Schema datatypes are a useful and also
extensible mechanism for datatypes. In SkiCal, information about age
restrictions is currently described in structured text, but
we could define an extension to SkiCal specifically for movies
and define our own XML Schema datatypes for the categories of age group
used:
(example adapted from the DAML+oil walkthrough [DAML-WALK])
and then point at this in the SkiCal extension like this:
(schema)
(instance data)
Texas Chain Saw Massacre, The
18
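What an application gains from such a datatype is a value it can validate and compare. As a rough Python sketch: the set of age-limit categories below is hypothetical, since the real categories would come from the XML Schema datatype defined for the extension, and the function names are our own.

```python
# Hypothetical age-limit categories; the real set would be fixed by the
# XML Schema datatype defined for the SkiCal movie extension.
AGE_LIMITS = (0, 7, 11, 15, 18)

def parse_age_limit(lexical):
    """Convert the text of an age restriction into a validated integer."""
    value = int(lexical)
    if value not in AGE_LIMITS:
        raise ValueError(f"{lexical!r} is not one of the allowed age limits")
    return value

def may_attend(visitor_age, age_limit):
    """True if a visitor of the given age satisfies the restriction."""
    return visitor_age >= age_limit

limit = parse_age_limit("18")
```

With structured text alone, the comparison in may_attend would first require guessing how the restriction had been written.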
Problems with DAML+oil
The authors found the process of creating a DAML+oil schema for
iCalendar and SkiCal difficult and slow. Part of this was because of the
difficulty of translating between schema specification languages, but a
major problem was with the awkward syntax and limited expressivity of the
DAML+oil language. The syntactic difficulties are mainly due to the
problem inherent in expressing complex constraints within the binary RDF
model. The problems with expressivity are because DAML+oil is a language
designed to harness the tractability of a certain class of Description
Logics (themselves a subset of first-order logic). This makes for a very
constrained functionality that can be puzzling and frustrating to use. A
short position paper by Pat Hayes [HAYES] explains this difficulty very
clearly.
We strongly suspect that the benefits of putting a great deal of energy
into defining the precise relationships between objects, between or within
ontologies, will be slight for most users. This is particularly true of
relations between ontologies by different authors, where connecting the
ontologies involves hypothesising what the author meant by the ontology. A
machine readable DAML+oil ontology does not capture all the aspects of the
world in the area which it claims to represent, and so there is immense
potential for error, still more so when the ontologies are defined in RDF
Schema or MIME directories or other means less precise than DAML+oil.
However a _small_ subset of DAML+oil can be very useful indeed, with
practical benefits for indexing RDF information for storage and querying,
for querying and storing datatypes and for creating approximate links
between schemas or ontologies.
References
[ICAL] Internet Calendaring and Scheduling Core Object Specification
(iCalendar)
Network Working Group
November 1998
Request for Comments: 2445
Category: Standards Track
F. Dawson, D. Stenerson
http://www.imc.org/rfc2445
[SKICAL] SkiCal Internet Draft
Network Working Group
Internet-Draft
Expires: July 8, 2002
G. FitzPatrick, P. Lannerö, N. Hjelm
http://www.ietf.org/internet-drafts/draft-many-ical-ski-05.txt
[DAML-WALK] Annotated DAML+OIL (March 2001) Ontology Markup (DAML+oil
walkthru)
Frank van Harmelen, Peter F. Patel-Schneider and Ian Horrocks, editors
(also Lynn Andrea Stein, Dan Connolly, and Deborah McGuinness, editors of
previous versions)
http://www.daml.org/2001/03/daml+oil-walkthru.html
[DAML-REF] Reference description of the DAML+OIL (March 2001) ontology
markup language
Frank van Harmelen, Peter F. Patel-Schneider and Ian Horrocks, editors.
Contributors: Tim Berners-Lee, Dan Brickley, Dan Connolly, Mike Dean,
Stefan Decker, Pat Hayes, Jeff Heflin, Jim Hendler, Ora Lassila, Deb
McGuinness, Lynn Andrea Stein, and others
http://www.daml.org/2001/03/reference.html
[SKI-DAML-NOTES] Notes on defining SkiCal in DAML+oil
2002-03-22
Libby Miller
http://swordfish.rdfweb.org/people/libby/rdfweb/skical/skical-daml-diary.txt
[ICAL-DAML]
Draft iCalendar DAML+oil schema
2002-03-22
Libby Miller
http://swordfish.rdfweb.org/people/libby/rdfweb/skical/ical-daml.daml
[SKICAL-DAML]
Draft SkiCal DAML+oil schema
2002-03-22
Libby Miller
http://swordfish.rdfweb.org/people/libby/rdfweb/skical/skical-daml.daml
[XSD-2] XML Schema Part 2: Datatypes
W3C Recommendation 02 May 2001
Paul V. Biron, Ashok Malhotra
http://www.w3.org/TR/xmlschema-2/
[XSD-0] XML Schema Part 0: Primer
W3C Recommendation, 2 May 2001
David C. Fallside (editor)
http://www.w3.org/TR/xmlschema-0/
[HYBRID] Draft Hybrid RDF Calendar Schema
Michael Arick and Libby Miller
2001-06-18
http://ilrt.org/discovery/2001/06/schemas/ical-full/hybrid.rdf
[RDFS] RDF Schema Specification 1.0
Dan Brickley and R.V. Guha
W3C Candidate Recommendation 27 March 2000
http://www.w3.org/TR/rdf-schema
[RDFMS] RDF Model and Syntax Specification
Ora Lassila and Ralph R. Swick
W3C Recommendation 22 February 1999
http://www.w3.org/TR/REC-rdf-syntax
[DC]
http://dublincore.org/documents/dces/
Dublin Core Metadata Element Set, Version 1.1: Reference Description
1999-07-02
[FOAF]
Friend of a friend RDF vocabulary
http://xmlns.com/foaf/0.1/
Dan Brickley and others
[MIME]
Network Working Group
Request for Comments: 2425
Category: Standards Track
T. Howes, M. Smith, F. Dawson
September 1998
A MIME Content-Type for Directory Information
http://www.imc.org/rfc2425
[VCARD]
Network Working Group
Request for Comments: 2426
Category: Standards Track
F. Dawson, T. Howes
September 1998
vCard MIME Directory Profile
[TOPIC]
XML Topic Maps (XTM) 1.0
Editors: Steve Pepper, Graham Moore
http://www.topicmaps.org/xtm/
[HAYES]
Catching the Dreams
by Pat Hayes
http://www.aifb.uni-karlsruhe.de/~sst/is/WebOntologyLanguage/hayes.htm
[SMUSH]
http://rdfweb.org/2001/01/design/smush.html
RDFWeb notebook: aggregation strategies
Dan Brickley