CTM³

A Strawman Proposal for Compact Topic Maps Syntax


Author: Steve Pepper, Ontopia
Date: 2006-03-17
Version: 0.3

1. Introduction
2. Example
3. Discussion
4. Formal syntax definition
5. Use Case Solutions
6. CTM³ and N3
7. References


1. Introduction

This document describes a strawman proposal called CTM³ for the Compact Topic Maps syntax (CTM) that is to be defined as Part 6 of ISO 13250.

The purpose of CTM is to complement the existing XML interchange syntax (XTM) with a simple, compact notation for:

  1. manually authoring topic maps,
  2. providing human-readable examples, and
  3. providing a common lightweight syntax for use in TMCL and TMQL.

Two text-based notations already exist that have achieved widespread use in the Topic Maps community: the Linear Topic Map Notation [LTM] and AsTMa= [AsTMa]. This CTM³ proposal attempts to combine the best of each and adapt them to the requirements defined for CTM.

CTM³ is also somewhat inspired by the N3 notation, a widely used text-based syntax for RDF (hence, in part, the choice of name for this proposal). Apart from the fact that borrowing (or stealing) elegant features from other languages always makes sense, the similarity between CTM³ and N3 also serves to highlight some of the similarities and differences between Topic Maps and RDF.

This document starts with a brief example that demonstrates the principal features of CTM³, and this is followed by a discussion of those features. Section 4 provides an EBNF for the language and section 5 contains solutions to the CTM Use Cases in [Use Cases]. Section 6 relates CTM³ to N3 using the example used in [Survey].


2. Example

@ENCODING "UTF-8" .

@PREFIX  music: <http://psi.ontopia.net/music/#> .
@PREFIX  wikip: <http://en.wikipedia.org/wiki/> .
@PREFIX  music2: <http://www.kanzaki.com/ns/music#/> .

@NAMES   foaf:name dc:title .

@ASSOC   music:work music:composed-by music:composer .
@ASSOC   music:work music:written-by music:writer .
@ASSOC   ex:object music:influenced-by ex:agent .
@ASSOC   ex:victim music:killed-by ex:perpetrator ex:method .


wikip:Puccini > music:composer
  foaf:name "Giacomo Puccini" ( "puccini, giacomo" /tm:sort/ ) ;
  bio:dateOfBirth  "1858-12-22"^xsd:date .

:butterfly > music:opera
  "Madama Butterfly" ;
  "Madame Butterfly" /:en :fr/ ;
  music:synopsis  <http://www.metopera.org/synopses/madama.html> ;
  music:composed-by  :puccini ;
  music:written-by :illica, :giacosa ;
  music:influenced-by :iris .

:cio-cio-san > music:character "Cio-cio-san" ;
  music:appears-in :butterfly ;
  music:killed-by  :cio-cio-san  :stabbing .

music:composer >> foaf:Person "Composer" ;
  ~music2:Composer .
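
The @PREFIX declarations in the example bind a short prefix to an IRI, so that a QName such as wikip:Puccini stands for a full subject identifier. The following Python sketch illustrates that expansion. The function name expand and the dictionary layout are hypothetical, and plain concatenation of prefix IRI and local name is an assumption borrowed from N3 rather than something this proposal states.

```python
# Hypothetical helper: expand CTM³-style QNames against @PREFIX bindings.
# The bindings below are copied from the example; everything else is illustrative.
PREFIXES = {
    "music": "http://psi.ontopia.net/music/#",
    "wikip": "http://en.wikipedia.org/wiki/",
}


def expand(ref: str, prefixes: dict[str, str]) -> str:
    """Return the full IRI for a delimited URI (<...>) or a prefix:local QName."""
    if ref.startswith("<") and ref.endswith(">"):
        return ref[1:-1]  # already a full IRI, just strip the delimiters
    prefix, sep, local = ref.partition(":")
    if not sep or not prefix:
        # ':butterfly' has an empty prefix: it is an item identifier, not a QName
        raise ValueError(f"not a QName: {ref!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    # Assumption: prefix IRI and local name are simply concatenated
    return prefixes[prefix] + local


print(expand("wikip:Puccini", PREFIXES))
print(expand("music:composer", PREFIXES))
```

Under these assumptions a reference whose prefix has not been declared is rejected, which is one way a processor could report a missing @PREFIX directive.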

3. Discussion

Some things to note about this syntax:

  1. Use of cryptic delimiters and keywords is kept to a minimum.

  2. The syntax is greatly simplified through the use of declarations, including templates for associations.

  3. Topics may be referenced using either item identifiers or subject identifiers:

    • Item identifiers are preceded by a colon (e.g., ':butterfly')
    • Subject identifiers are simply given as URIrefs (e.g. 'wikip:Puccini')

    Note: We need a syntax for subject addresses.

  4. Almost everything can be expressed as a block of statements about a topic. A block is terminated by a period.

  5. Blocks consist mostly of semicolon-delimited pairs of parameters as follows:

    • delimiter/topic (e.g., '> music:composer' and '>> foaf:Person'),
    • topic/string (e.g., 'bio:dateOfBirth "1858-12-22"'),
    • topic/IRI (e.g., 'music:synopsis <http://www.metopera.org/synopses/madama.html>'), or
    • topic/topic (e.g., 'music:composed-by :puccini').

    Each pair of parameters constitutes a statement about the block's topic, in which the first parameter gives the "property" (name type, occurrence type, association type, identifier "type"), and the second parameter the "value" of the property.

  6. The second parameter of a parameter pair may consist of a comma-separated list of topics, strings, or IRIs in order to indicate multiple statements of the same type. Thus

    music:written-by :illica , :giacosa ;

    is a shorthand for

    music:written-by :illica ;
    music:written-by :giacosa ;

  7. Non-binary associations are handled by statements consisting of one parameter (unary) or of three or more parameters (n-ary), e.g. 'music:killed-by :cio-cio-san :stabbing'.

  8. Statements are assumed to be occurrences unless the type is defined as a name or association by an @NAMES or @ASSOC declaration.

  9. Association templates allow role types to be omitted.

  10. String values are always delimited by quotes (single or double): "string"

  11. Full URIs (as opposed to QNames) are always delimited by less-than and greater-than signs: <http://...>

  12. Type-instance and supertype-subtype have special delimiters ('>' and '>>' respectively).

  13. Subject identifiers and subject locators have special delimiters ('~' and '=', respectively).

  14. Two alternative syntaxes are available that permit associations to be grouped by type (rather than appearing as part of a block belonging to a certain topic). The full syntax caters for multiple signatures; the abbreviated syntax uses the templating mechanism (which promotes the use of single signatures), thus:

    Full:
    music:killed-by( :scarpia  > ex:victim
                     :tosca    > ex:perpetrator
                     :stabbing > ex:method )

    Abbreviated:
    music:killed-by( :scarpia :tosca :stabbing )
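
Points 6 to 9 above can be illustrated with a short Python sketch that classifies a statement, expands the comma shorthand, and fills in role types from an association template. It is only one possible reading of the proposal: the names parse_assoc_template and expand_statement and the tuple-based output are hypothetical, and the sketch assumes that the block's topic plays the first role of the template and that the remaining parameters play the remaining roles in declaration order.

```python
# Declarations taken from the example in section 2 (already split into tokens).
NAMES = {"foaf:name", "dc:title"}
ASSOC_DECLS = [
    ["music:work", "music:composed-by", "music:composer"],
    ["music:work", "music:written-by", "music:writer"],
    ["ex:victim", "music:killed-by", "ex:perpetrator", "ex:method"],
]


def parse_assoc_template(decl: list[str]) -> tuple[str, list[str]]:
    """Split '@ASSOC role1 type role2 ...' into (type, [role1, role2, ...])."""
    first_role, assoc_type, *other_roles = decl
    return assoc_type, [first_role, *other_roles]


TEMPLATES = dict(parse_assoc_template(d) for d in ASSOC_DECLS)


def expand_statement(topic, prop, groups, names, templates):
    """Expand one statement of a topic block into individual assertions.

    'groups' holds the comma-separated alternatives; each alternative is the
    list of parameters that follow the property.
    """
    result = []
    for values in groups:  # the comma shorthand yields one assertion per group
        if prop in templates:
            players = [topic, *values]  # assumption: block topic plays role 1
            roles = templates[prop]
            if len(players) != len(roles):
                raise ValueError(f"{prop} expects {len(roles)} players")
            result.append(("association", prop, list(zip(players, roles))))
        elif prop in names:
            result.append(("name", prop, values[0]))
        else:
            # default: anything not declared by @NAMES or @ASSOC is an occurrence
            result.append(("occurrence", prop, values[0]))
    return result


for assertion in expand_statement(
    ":butterfly", "music:written-by", [[":illica"], [":giacosa"]], NAMES, TEMPLATES
):
    print(assertion)
print(expand_statement(
    ":cio-cio-san", "music:killed-by", [[":cio-cio-san", ":stabbing"]], NAMES, TEMPLATES
))
```

In this reading the first call yields two binary associations, one per writer, and the second yields a single ternary association in which :cio-cio-san plays both the victim and the perpetrator role.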
    

A number of issues have yet to be resolved, including:

  1. How to handle reification of topic maps and association roles.

  2. Whether to allow multiple encodings, or just UTF-8?

  3. Whether to include baseuri or not?

  4. Whether the order of directives should be fixed or variable.

  5. Whether it is necessary and/or desirable to prefix local IDs with a colon.


4. Formal syntax definition

The following is a slightly incomplete and not-quite correct EBNF for CTM³.

topicMap    ::= encoding? version? prefix* directive* topic* association*
encoding    ::= '@' 'ENCODING' WS STRING '.'
version     ::= '@' 'VERSION' WS STRING '.'

directive   ::= topicMapID | mergeMap | include | names | template

topicMapID  ::= '@' 'TOPICMAP' WS reifier '.'
mergeMap    ::= '@' 'MERGEMAP' WS URI (WS topicID)? '.'
include     ::= '@' 'INCLUDE' WS URI '.'

prefix      ::= '@' 'PREFIX' WS NAME ':' WS delimURI '.'
names       ::= '@' 'NAMES' WS (topicID WS)* '.'
template    ::= '@' 'ASSOC' WS (topicID WS)+ '.'

topicID     ::= itemID | subjectID
itemID      ::= ':' NAME
subjectID   ::= URIref
URIref      ::= delimURI | qname
delimURI    ::= '<' URI '>'
URI         ::= 'http://' STRING
qname       ::= NAME ':' NAME

topic       ::= topicID type? supertype? identifier* locator* statements? '.'

type        ::= '>' topicID+
supertype   ::= '>>' topicID+
identifier  ::= '~' URIref
locator     ::= '=' URIref

statements  ::= statement (';' statement)*
statement   ::= (name | occurrence | assoc) scope? reifier?

name        ::= topicID? STRING variant* (',' STRING variant*)*
variant     ::= '(' resource scope reifier? ')'
occurrence  ::= topicID resource (',' resource)*
assoc       ::= topicID+ (',' topicID+)*

resource    ::= URIref | (STRING datatype?)
datatype    ::= '^' URIref
scope       ::= '/' topicID+ '/'
reifier     ::= '@' topicID

association ::= topicID '(' assocrole (',' assocrole)*  ')' scope? reifier?
assocrole   ::= topicID '>' topicID reifier?

@terminals

NAME      ::= [A-Za-z_][A-Za-z_0-9.-]*
STRING    ::= '"' [^"]* '"'
WS        ::= #x20 | #x9 | #xD | #xA

; Name                       ::= NameStartChar (NameChar)*
; NameChar                   ::= NameStartChar | '-' | '.' | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
; NameStartChar              ::= ':' | [A-Z] | '_' | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
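
As a rough illustration of how the terminals above might be used, the following Python sketch tokenizes CTM³ text with regular expressions derived from the NAME, STRING and WS productions. The token categories URI, PUNCT and COMMENT, and the function name tokenize, are assumptions made for this sketch; comments are not covered by the EBNF and are modelled on the use cases in section 5.

```python
import re

# One alternative per token category; NAME, STRING and WS follow the EBNF
# terminals, the remaining categories are assumptions made for this sketch.
TOKEN = re.compile(r"""
    (?P<WS>      [\x20\x09\x0D\x0A]+ )
  | (?P<COMMENT> /\*.*?\*/ )
  | (?P<STRING>  "[^"]*" )
  | (?P<URI>     <[^<>\s]+> )
  | (?P<NAME>    [A-Za-z_][A-Za-z_0-9.-]* )
  | (?P<PUNCT>   >> | [@:;,.()/^~=>] )
""", re.VERBOSE | re.DOTALL)


def tokenize(text: str) -> list[tuple[str, str]]:
    """Return (category, lexeme) pairs, skipping whitespace and comments."""
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup not in ("WS", "COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize('music:composer >> foaf:Person "Composer" ; /* type */'):
    print(token)
```

One consequence of the NAME production as written is that it admits '.', so under this sketch a terminating period has to be separated from a preceding name by whitespace (':john.' is read as the single name 'john.').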

5. Use Case Solutions

This section lists CTM³ solutions to the use cases given in [Use Cases].

Topic with an Item Identifier (3.2.1.)

:John .

Typed Topic - Using Item Identifiers (3.2.2.)

:john > :person .

Typed Topic - Using Subject Identifiers (3.2.3.)

@PREFIX foo: <http://psi.example.com/> .
foo:john > foo:person .

Multi Typed Topic - Using Item Identifiers (3.2.4.)

:john > :singer , :guitarist .

Multi Typed Topic - Using Subject Identifiers (3.2.5.)

@PREFIX foo: <http://psi.example.com/> .
:john > foo:singer , foo:guitarist .

Topic with an Item Identifier and a Topic Name (3.3.1.)

:John "John Lennon" .

Topic with a Subject Identifier and a Topic Name (3.3.2.)

@PREFIX b: <http://psi.beatles.example.org/> .
b:The_Beatles "The Beatles" .

Topic with a Subject Locator and a Topic Name (3.3.3.)

:beatles-website = <http://beatles.com> ;
  "Official website of The Beatles" .

Typed Topic Name - Using Item Identifiers (3.3.4.)

@NAMES :fullname .
:john :fullname "John Ono Lennon" .

Typed Topic Name - Using Subject Identifiers (3.3.5.)

@PREFIX ex: <http://psi.example.org/> .
@NAMES ex:fullname .
:john ex:fullname "John Ono Lennon" .

Scoped Topic Name - Using Item Identifiers (3.3.6.)

:john "John Winston Lennon" / :fullname / .

Scoped Topic Name - Using Subject Identifiers (3.3.7.)

@PREFIX ex: <http://psi.example.org/> .
@NAMES ex:fullname .
:john "John Ono Lennon" / ex:fullname / .

Multi Scoped Topic Name (3.3.8.)

:beatles "The Beatles" ;
         "Fab Four" / :nickname :short / .

Typed and Scoped Topic Name (3.3.9.)

:john
  :fullname "John Ono Lennon" / :yoko / .

Topic Name with a Variant of datatype String (3.3.10.)

:john
  "John Lennon" ( "lennon, john" / tm:sort / ) .
  /* string datatype is default */

Topic Name with a Variant of datatype XML (3.3.11.)

:john "John Lennon" ("John Lennon"^xsd:xml /:markup/) .
  /* Assume xsd:xml is datatype for XML encoded strings. Not sure if it is. */

Topic Name with a Variant of datatype IRI (3.3.12.)

:john "John Lennon"
  (<http://en.wikipedia.org/wiki/Image/Jk_beatles_john.jpg>^xsd:anyURI /:image/ ) .

Topic Name with a Variant of a non-TMDM datatype (3.3.13.)

:revolution-nine  "Revolution 9"  ( "9"^xsd:integer /:number/ )  .

Typed Occurrence of datatype String - Using Item Identifiers (3.4.2.)

:adayinthelife
  :lyrics  "I read the news today, oh boy" .
  /* string datatype is default */

Typed Occurrence of datatype String - Using Subject Identifiers (3.4.3.)

:adayinthelife
  ex:lyrics  "I read the news today, oh boy" .

Scoped Occurrence of datatype String - Using Item Identifiers (3.4.4.)

:adayinthelife
  :lyrics  "I read the news today, oh boy"  /:en/ .

Scoped Occurrence of datatype String - Using Subject Identifiers (3.4.5.)

:adayinthelife
  ex:lyrics  "I read the news today, oh boy"  / lang:en /  .

Occurrence of datatype XML (3.4.6.)

:adayinthelife  :lyrics
  "<html>
   <head>
     <title>A day in the life</title>
   </head>
   <body>
     <p>I read the news today, oh boy</p>
     </body>
   </html>"^xsd:xml  .

Occurrence of datatype IRI (3.4.7.)

:beatles
   :website  <http://beatles.com/>  .

Occurrence of a non-TMDM datatype (3.4.8.)

:pennylane
   :tracknum  "2"^xsd:integer  .

Creating an Association (3.5.1.)

@ASSOC  :work  :created-by  :creator  .
:yesterday  :created-by  :mccartney  .

Scoped Association (3.5.2.)

@ASSOC  :work  :created-by  :creator  .
:yesterday  :created-by  :mccartney  / :unofficial /  .

Supertype-Subtype relationship - Using Item Identifiers (3.5.5.)

:song >> :musical-work .

Supertype-Subtype relationship - Using Subject Identifiers (3.5.6.)

ex:song >> ex:musical-work .

Reification of a Topic Map (3.6.1.)



Reification of a Topic Name (3.6.2.)

:john "John Ono Lennon" @:name-of-john-lennon .

Reification of a Variant (3.6.3.)

:john "John Lennon"
      ("lennon, john" /:sortname/ @:johns-sortname) .

Reification of an Occurrence (3.6.4.)

:john  :website  <http://johnlennon.com/>  @:lennons-website .
:lennons-website  "Official website of John Lennon" .

Reification of an Association (3.6.5.)

:lennon  :partnership  :mccartney  @:lennon-mccartney .
:lennon-mccartney  "Lennon / McCartney" .

Reification of an Association Role (3.6.6.)

???

Singleline Comment (3.7.1.)

/* this is a comment */

Multiline Comment (3.7.2.)

/*
this is a
multiline
comment
*/

6. CTM³ and N3

This section shows the test case used in [Survey] in LTM and N3 (as published) and in CTM³.

LTM version

[puccini : person   = "Giacomo Puccini"]
[tosca   : opera    = "Tosca"]

{tosca, premiere-date, [[1900-01-14]]}
{tosca, synopsis,      "http://www.azopera.com/learn/synopsis/tosca.shtml"}

composed-by( tosca : work, puccini : composer )

               /* ------------------------------------- */

[person        = "Person"        @"http://psi.ontopia.net/music/#person"]
[composer      = "Composer"      @"http://psi.ontopia.net/music/#composer"]
[opera         = "Opera"         @"http://psi.ontopia.net/music/#opera"]
[work          = "Work"          @"http://psi.ontopia.net/music/#work"]

[premiere-date = "Première date" @"http://psi.ontopia.net/music/#premiere-date"]
[synopsis      = "Synopsis"      @"http://psi.ontopia.net/music/#synopsis"]
[composed-by   = "Composed by"   @"http://psi.ontopia.net/music/#composed-by"]

CTM³ version

@PREFIX music: <http://psi.ontopia.net/music/#> .
@ASSOC  music:work  music:composed-by  music:composer  .

:puccini > music:person ;
   "Giacomo Puccini" .

:tosca > music:opera ;
   "Tosca" ;
   music:premiere-date  "1900-01-14" ;
   music:synopsis       <http://www.azopera.com/learn/synopsis/tosca.shtml> ;
   music:composed-by    :puccini .

               /* ------------------------------------- */

music:person    "Person" .
music:composer  "Composer" .
music:opera     "Opera" .
music:work      "Work" .

music:premiere-date  "Première date" .
music:synopsis       "Synopsis" .
music:composed-by    "Composed by" .

N3 version

@prefix music: <http://psi.ontopia.net/music/#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

[ rdf:type music:opera;
  rdfs:label "Tosca";
  music:premiere-date "1900-01-14";
  music:synopsis <http://www.azopera.com/learn/synopsis/tosca.shtml>;
  music:composed-by [
    rdf:type music:person;
    rdfs:label "Giacomo Puccini" ]
] .

               # ---------------------------------------

music:person        rdfs:label "Person" .
music:opera         rdfs:label "Opera" .

music:composed-by   rdfs:label "Composed by" .
music:premiere-date rdfs:label "Première date" .
music:synopsis      rdfs:label "Synopsis" .

7. References

[AsTMa]
AsTMa= Language Definition, http://astma.it.bond.edu.au/astma=-spec-xtm.dbk
[LTM]
The Linear Topic Map Notation: Definition and introduction, http://www.ontopia.net/download/ltm.html
[Survey]
A Survey of RDF/Topic Maps Interoperability Proposals, http://www.w3.org/TR/rdftm-survey/
[Use Cases]
CTM Use Cases, http://www.jtc1sc34.org/repository/0701.pdf