N0615rev - Summary of Voting on JTC 1/SC 34 N 590 - Information Technology - Topic Maps - Canonical Syntax

ISO/IEC JTC 1/SC 34 N0615rev

ISO/IEC JTC 1/SC 34

Information Technology --
Document Description and Processing Languages

TITLE:	Summary of Voting on JTC 1/SC 34 N 590 - Information Technology - Topic Maps - Canonical Syntax
SOURCE:	SC34 Secretariat
PROJECT:	FCD 13250-4: Information Technology - Topic Maps - Canonical Syntax
PROJECT EDITOR:	Mr. Eric Freese; Mr. Jaeho Lee
STATUS:	Summary of voting
ACTION:	Based on the ballot responses, this FCD is APPROVED and the project status changes to 40.60. Project Editors are requested to review comments and advise the Secretariat regarding (1) the change to status 40.92, 40.93 or 40.99, and (2) the next project status and anticipated date that project status will change.
DATE:	2005-05-20
DISTRIBUTION:	SC34 and Liaisons
REFER TO:	N0590b - 2005-01-14 - Ballot due 2005-05-14 - Information Technology - Topic Maps - Canonical Syntax; N0590 - 2005-01-14 - Information Technology - Topic Maps - Canonical Syntax
REPLY TO:	Dr. James David Mason (ISO/IEC JTC 1/SC 34 Chairman) Y-12 National Security Complex Bldg. 9113, M.S. 8208 Oak Ridge, TN 37831-8208 U.S.A. Telephone: +1 865 574-6973 Facsimile: +1 865 574-1896 Network: masonjd@y12.doe.gov http://www.y12.doe.gov/sgml/sc34/ ftp://ftp.y12.doe.gov/pub/sgml/sc34/ Mr. G. Ken Holman (ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada) Crane Softwrights Ltd. Box 266, Kars, ON K0A-2E0 CANADA Telephone: +1 613 489-0999 Facsimile: +1 613 489-0995 Network: jtc1sc34@scc.ca http://www.jtc1sc34.org

P-Member	APPROVAL OF THE DRAFT AS PRESENTED	APPROVAL OF THE DRAFT WITH COMMENTS AS GIVEN ON THE ATTACHED	DISAPPROVAL OF THE DRAFT FOR REASONS ON THE ATTACHED	DISAPPROVAL (appropriate changes in the text will change vote to APPROVAL)	NO RESPONSE
Canada	X
China					X
Italy	X
Japan				X
Korea	X
Netherlands	X
Norway		X
United Kingdom			X
United States		X

Japan

(1) General

Information item types and their properties should be consistent between TMDM and Canonicalization. The "locator item" and its properties should be deleted or changed.

(2) Technical

(2.1) All parts of the specification
"URI" should be changed to "IRI", because an IRI is a sequence of characters from the Universal Character Set (Unicode/ISO 10646) and allows the use of non-Latin scripts.

(2.2) Clause 3 Normalisation of Locator References
The definition of the normalisation process (items 1 - 4) is hard to understand. The following examples should be added.

Let us say that the locator to be normalised is 
http://www.isotopicmaps.org/sam/cxtm/sample.xtm#S02?q01.
Let us also assume that the base locator of the topic map is 
http://www.isotopicmaps.org/sam/cxtm/sample.xtm. 
Let us use the value L as the current normalised value of the input locator.

The result of step 1 should be:
P = http://www.isotopicmaps.org/sam/cxtm/sample.xtm
L = http://www.isotopicmaps.org/sam/cxtm/sample.xtm#S02?q01

After step 2 (L starts with P, so remove the entire substring from L)
P = http://www.isotopicmaps.org/sam/cxtm/sample.xtm
L = #S02?q01

Let us now consider the case where the input locator is 
http://www.isotopicmaps.org/sam/sam-model/#d0e592

After step 1
P = http://www.isotopicmaps.org/sam/cxtm/sample.xtm
L = http://www.isotopicmaps.org/sam/sam-model/#d0e592

After step 2 (there is no match found)
P = http://www.isotopicmaps.org/sam/cxtm/sample.xtm
L = http://www.isotopicmaps.org/sam/sam-model/#d0e592

After step 3
P = http://www.isotopicmaps.org/sam/cxtm
L = http://www.isotopicmaps.org/sam/sam-model/#d0e592

After step 2 (again no match found)
P = http://www.isotopicmaps.org/sam/cxtm
L = http://www.isotopicmaps.org/sam/sam-model/#d0e592

After step 3
P = http://www.isotopicmaps.org/sam
L = http://www.isotopicmaps.org/sam/sam-model/#d0e592

After step 2 (L starts with P, so remove the entire substring and the '/' separator from L)
P = http://www.isotopicmaps.org/sam
L = sam-model/#d0e592
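
The two traces above can be reproduced with a short Python sketch. It illustrates steps 1 to 3 only as the examples show them and is not the normative text of Clause 3. The function name normalise_locator and the rule for stopping when P cannot be shortened any further are assumptions made for illustration.

```python
def normalise_locator(locator: str, base: str) -> str:
    """Sketch of steps 1-3 as traced in the examples (not normative)."""
    p = base       # step 1: P starts as the base locator
    l = locator    # step 1: L starts as the input locator
    while True:
        # Step 2: if L starts with P, remove that substring from L,
        # together with a '/' separator if one follows it.
        if l.startswith(p):
            rest = l[len(p):]
            if rest.startswith("/"):
                rest = rest[1:]
            return rest
        # Assumption: once P has no '/' left it cannot be shortened,
        # so the locator is returned unchanged.
        if "/" not in p:
            return l
        # Step 3: drop the last path segment from P and try again.
        p = p.rpartition("/")[0]


BASE = "http://www.isotopicmaps.org/sam/cxtm/sample.xtm"
print(normalise_locator(BASE + "#S02?q01", BASE))
print(normalise_locator("http://www.isotopicmaps.org/sam/sam-model/#d0e592", BASE))
```

The first call corresponds to the case where L starts with P immediately; the second needs two passes through step 3 before a match is found.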

(2.3) Clause 4.2 Information Type and Basic Type Sort Order
The "Basic Type" should be the "Fundamental Type", because the term used in TMDM is "Fundamental Type", not "Basic Type".

(2.4) Clause 4.8 Canonical Sort Order For Variant Items
The "[resource]" should be the "[datatype]", in order to be consistent with the Equality rule of Variant Items in TMDM.

(2.5) Clause 4.9 Canonical Sort Order For Occurrence Items
The "[resource]" should be the "[datatype]", in order to be consistent with the Equality rule of Occurrence Items in TMDM.

(2.6) Clause 4.11 Canonical Sort Order For Association Role Items
The "[role playing topic]" should be the "[player]", in order to be consistent with the Equality rule of Association Role Items in TMDM.

(2.7) Clause 5.7 Constructing a representation of a topic map information item
The "topic map information item" in the title should be the "topic map item", in order to be consistent with other items.

(2.8) Clause 5.15 - 5.20 Constructing a representation of the [....] property
Only 6 properties (reifier, reified, scope, source locators, type and value) are defined at the moment. All other properties should be defined as well.

(3) Editorial

(3.1) All parts of the specification
Some words are spelled in two ways, for example "normalize" and "normalise". The spelling should be unified.

(3.2) Clause 2 Normative references
"ISO10646-1, ISO 10646-1:2000: ... " and "ISO10646-2, ISO 10646-2:2001:" should be
"ISO/IEC 10646: 2003: ... ".

(3.3) Clause 2 Normative references
Version of Unicode should be 4.0.

(3.4) Clause 2 Normative references
"IETF RFC 2396, ... " should be changed to
"IETF RFC 3986, Uniform Resource Identifier (URI): Generic Syntax ... , January 2005, available at <http://www.ietf.org/rfc/rfc3986.txt>".

(3.5) Clause 2 Normative references
"IETF RFC 3987, Internationalized Resource Identifiers (IRIs), Internet Standards Track Specification, January 2005, available at <http://www.ietf.org/rfc/rfc3987.txt>" should be added.

(3.6) Clause 2 Normative references
"XML Infoset, XML Information Set, ... " should be
"XML Information Set (Second Edition), World Wide Web Consortium, 4 February 2004, available at <http://www.w3.org/TR/2004/REC-xml-infoset-20040204/>".

Norway

1	2	(3)	4	5	(6)	(7)
MB¹	Clause No./ Subclause No./ Annex (e.g. 3.1)	Paragraph/ Figure/Table/Note (e.g. Table 1)	Type of comment²	Comment (justification for change) by the MB	Proposed change by the MB	Secretariat observations on each comment submitted

NO	1, 2, 3, 4, 5		ge	This part of ISO 13250 must be aligned with part 2.
NO	3, 4, 5		ed	The order of exposition makes the document hard to understand.	Reverse the order of sections 3, 4, and 5.
NO	3		te	The specified algorithm sometimes leads to locators that are different being canonicalized as though they were the same. For example, if the [base locator] property is "file:/home/myhome/mytopicmap.xtm", the locators "file:/home/myhome/mytopicmap.xtm#foo", "file:/home/myhome/#foo", "file:/home/#foo" and "file:/#foo" are all canonicalized to "#foo".	Change the algorithm, for example to produce the following canonical output: "#foo", "../#foo", "../../#foo", "../../../#foo".
NO	5		te	There is no convention for inserting whitespace into CXTM output. This will lead to output that is difficult to debug.	Define conventions for inserting whitespace, e.g. a line break after every end-tag in element content.
NO	5.3		te	It is unnecessarily difficult to debug CXTM output because references to elements are by positional value (e.g., <type topicref="1370">) but the positional values are not stated explicitly on the elements themselves.	XML elements which are referred to by positional value should have an attribute that specifies the positional value, e.g. <topic num="1370">, etc.
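
The collision described in the comment on Clause 3, and the kind of output proposed in its place, can be illustrated with the following Python sketch. It assumes the prefix-stripping procedure works by repeatedly shortening the base locator by one path segment. The function name normalise and the mark_levels flag are hypothetical, and emitting one "../" per stripped segment is only one possible reading of the proposed change.

```python
def normalise(locator: str, base: str, mark_levels: bool = False) -> str:
    """Strip the longest matching prefix of `base` from `locator`.

    With mark_levels=True, one "../" is emitted for every path segment
    that had to be removed from the base before a match was found.
    """
    prefix = base
    levels = 0
    while True:
        if locator.startswith(prefix):
            rest = locator[len(prefix):]
            if rest.startswith("/"):
                rest = rest[1:]
            return "../" * levels + rest if mark_levels else rest
        if "/" not in prefix:
            return locator  # assumption: no shorter prefix to try
        prefix = prefix.rpartition("/")[0]
        levels += 1


BASE = "file:/home/myhome/mytopicmap.xtm"
LOCATORS = [
    "file:/home/myhome/mytopicmap.xtm#foo",
    "file:/home/myhome/#foo",
    "file:/home/#foo",
    "file:/#foo",
]

# Without level markers, all four locators collapse to the same string.
collapsed = [normalise(loc, BASE) for loc in LOCATORS]
# With level markers, the four locators stay distinguishable.
distinct = [normalise(loc, BASE, mark_levels=True) for loc in LOCATORS]
print(collapsed)
print(distinct)
```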

United Kingdom

The UK cannot approve this draft as it is not aligned with the current text of Part 2.

United States

The title as balloted is "Canonical Syntax." The FCD draft, however, bears the title "Canonicalization."

The earliest committee draft bore the title "Canonical XTM: A canonical serialization format for topic maps" (see N395). The CD ballot document, N484, billed itself as "Canonicalization" (see N454, which is a PDF version of the draft). The current ballot, however, refers to "Canonical Syntax" (see N590b).

I think balloting "canonical syntax" in this case is inconsistent with the document itself and, in view of its contents, simply wrong. I am not sure that canonicalization is any better, but at least the naming should be consistent.

Introduction

While non-normative, the introduction should limit itself to the subject of the standard and not make claims about topic maps in general. This part is a standard for canonical representation of a particular data model and should confine the scope of its remarks accordingly.

Or, as better stated in the section on Scope: "This part of ISO/IEC 13250 specifies an algorithm for the canonicalization of an instance of the Topic Maps Data Model. It defines a canonical ordering for every information item defined by the Topic Maps Data Model and an XML serialisation of the information items and all of their properties. When the XML is serialized in accordance with [XML-C14N], the serialized file is the canonical representation of the Topic Maps Data Model instance."

3 Normalization of Locator References (Note, typo)

In the note to this section, see "CXTM process described by *This* part of", which is taken to be a typo.

4.3 Comparison of Set Property Values (1)

Reads: "Sets sort in order of the number of elements in the collection. A set with fewer elements sorts lower than a set with more elements."

Have not seen "collection" defined anywhere yet. Assume what is meant is that a set of sets is sorted by the number of elements in each set. That is not, however, what is said by the quoted material.

4.3 Comparison of Set Property Values (2)

Reads in part: "The collections then sort in the order of the non-equal items in each collection."

Does collection = set? Is the sort order that of 4.2 or does it vary by the nature of the items in the set? (Assume the latter, but it should be stated.)
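
One possible reading of the two quoted rules is sketched below in Python. It assumes that "collection" means "set", that smaller sets sort first, and that sets of equal size are compared item by item after each set's members have been put in their own sort order. The names set_sort_key and item_key are hypothetical, and the default item ordering (plain string comparison) stands in for whatever per-item order the draft intends.

```python
def set_sort_key(items, item_key=lambda item: item):
    """Return a key that sorts sets by size, then by their ordered members."""
    ordered = sorted(items, key=item_key)
    # Tuples compare element by element: size first, then the member keys,
    # so the first non-equal member decides between sets of equal size.
    return (len(ordered), [item_key(item) for item in ordered])


sets = [{"b", "c"}, {"a"}, {"a", "d"}, {"a", "c"}]
ranked = sorted(sets, key=set_sort_key)
print(ranked)
```

Under these assumptions the single-element set sorts lowest, and the three two-element sets are then ordered by their first differing member.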

5.3 Encoding of positional values

Reads: "When the position of an item in a list is to be encoded, the encoded value is the index of that item in the list counting from 1 as the index of the first list item."

Question: Is there a reason to depart from the practice of the first item in a list having position 0? Thinking that departing from customary behavior is more likely to cause confusion among implementers.
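
For illustration, the rule quoted from 5.3 amounts to the following Python sketch, in which the first list item receives index 1 rather than 0. The function name positional_values and the sample item names are hypothetical.

```python
def positional_values(items):
    """Map each list item to its position, counting the first item as 1."""
    return {item: index for index, item in enumerate(items, start=1)}


positions = positional_values(["topic-a", "topic-b", "topic-c"])
print(positions)
```

An implementation that indexes its lists from 0 internally would have to add 1 when writing each encoded value, which is the sort of adjustment the question above is concerned with.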