From rinehuls@access.digex.net  Tue May  6 23:03:29 1997
Date: Tue, 6 May 1997 16:59:37 -0400 (EDT)
From: "william c. rinehuls" <rinehuls@access.digex.net>
Reply-To: "william c. rinehuls" <rinehuls@access.digex.net>
To: sc22docs@dkuug.dk
Subject: SC22 N2466 - Vote Summary of LB N2364 - CD 14651
Message-ID: <Pine.SUN.3.96.970501172020.11365A-100000@access4.digex.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

____________________beginning of title page ___________________________
ISO/IEC JTC 1/SC22
Programming languages, their environments and system software interfaces
Secretariat: U.S.A.  (ANSI)


ISO/IEC JTC 1/SC22
N2466


May 1997


TITLE:             Summary of Voting on CD Approval for CD 14651 -
                   Information technology - International String
                   Ordering - Method for Comparing Character Strings
                   and Description of a Default Tailorable Ordering


SOURCE:            Secretariat, ISO/IEC JTC 1/SC22


WORK ITEM:         JTC 1.22.30.02.02


STATUS:            N/A


CROSS REFERENCE:   SC22 N2364


DOCUMENT TYPE:     Summary of Voting


ACTION:            To SC22 Member Bodies for information.

                   To WG20 for preparation of a Disposition of Comments
                   Report and a recommendation on the further processing
                   of the CD.


Address reply to:
ISO/IEC JTC 1/SC22 Secretariat
William C. Rinehuls
8457 Rushing Creek Court
Springfield, VA 22153  USA
Tel:  +1 (703) 912-9680
Fax:  +1 (703) 912-2973
email:  rinehuls@access.digex.net


________________end of title page; beginning of overall summary ________

                             SUMMARY OF VOTING ON

Letter Ballot Reference No:  SC22 N2364
Circulated by:               JTC 1/SC22
Circulation Date:            01-20-1997
Closing Date:                04-24-1997 


SUBJECT:  CD Approval for CD 14651 - Information technology -
          International String Ordering - Method for Comparing Character
          Strings and Description of a Default Tailorable Ordering


The following responses have been received on the subject of approval:


"P" Members supporting approval
      without comment                           10


"P" Members supporting approval
      with comment                               1


"P" Members not supporting approval              4


"P" Members abstaining                           2


"P" Members not voting                           7


"O" Members supporting approval
      without comment                            1


"O" Members not supporting approval              1


"O" Members abstaining                           1


Secretariat Action:

The comment accompanying the abstention vote from Germany was:  "There is
no national WG11 (sic) rapporteur."  The comments accompanying the
affirmative vote from Denmark; the comments accompanying the abstention
vote from the United Kingdom; and the comments accompanying the negative
votes from Austria, Israel, Japan, Netherlands, and USA are attached.

WG20 is requested to prepare a Disposition of Comments report and make a
recommendation on the further processing of the CD.

_______________end of overall summary; beginning of detail summary ___
                 ISO/IEC JTC1/SC22  LETTER BALLOT SUMMARY


PROJECT NO:    JTC 1.22.30.02.02

SUBJECT:  CD approval for CD 14651 - Information technology - International
          String Ordering - Method for Comparing Character Strings and
          Description of a Default Tailorable Ordering
          
Reference Document No:  N2364           Ballot Document No:  N2364
Circulation Date:   01-20-1997          Closing Date:  04-24-1997 
                                                              
Circulated To: SC22 P, O, L             Circulated By: Secretariat


                  SUMMARY OF VOTING AND COMMENTS RECEIVED

                      Approve  Disapprove  Abstain Comments   Not Voting
'P' Members

Australia               (X)      ( )         ( )       ( )       ( )
Austria                 ( )      (X)         ( )       (X)       ( )
Belgium                 ( )      ( )         ( )       ( )       (X)
Brazil                  ( )      ( )         ( )       ( )       (X)    
Canada                  ( )      ( )         ( )       ( )       (X)
China                   ( )      ( )         ( )       ( )       (X)
Czech Republic          (X)      ( )         ( )       ( )       ( )
Denmark                 (X)      ( )         ( )       (X)       ( )
Egypt                   ( )      ( )         ( )       ( )       (X)
Finland                 (X)      ( )         ( )       ( )       ( )
France                  (X)      ( )         ( )       ( )       ( )
Germany                 ( )      ( )         (X)       (X)       ( )
Ireland                 ( )      ( )         ( )       ( )       (X)
Japan                   ( )      (X)         ( )       (X)       ( )
Netherlands             ( )      (X)         ( )       (X)       ( )
Norway                  (X)      ( )         ( )       ( )       ( )
Romania                 (X)      ( )         ( )       ( )       ( )
Russian Federation      (X)      ( )         ( )       ( )       ( )
Slovenia                (X)      ( )         ( )       ( )       ( )
Sweden                  ( )      ( )         ( )       ( )       (X)
Switzerland             (X)      ( )         ( )       ( )       ( )
UK                      ( )      ( )         (X)       (X)       ( )
Ukraine                 (X)      ( )         ( )       ( )       ( )
USA                     ( )      (X)         ( )       (X)       ( )

'O' Members

Argentina               ( )      ( )         ( )       ( )       ( )
Bulgaria                ( )      ( )         ( )       ( )       ( )
Cuba                    ( )      ( )         ( )       ( )       ( )
Greece                  ( )      ( )         ( )       ( )       ( )
Hungary                 ( )      ( )         ( )       ( )       ( )
Iceland                 ( )      ( )         ( )       ( )       ( )
India                   ( )      ( )         ( )       ( )       ( )
Indonesia               ( )      ( )         ( )       ( )       ( )
Israel                  ( )      (X)         ( )       (X)       ( )
Italy                   ( )      ( )         ( )       ( )       ( )
Korea Republic          (X)      ( )         ( )       ( )       ( )
New Zealand             ( )      ( )         ( )       ( )       ( )
Poland                  ( )      ( )         ( )       ( )       ( )
Portugal                ( )      ( )         (X)       ( )       ( )
Singapore               ( )      ( )         ( )       ( )       ( )
Thailand                ( )      ( )         ( )       ( )       ( )
Turkey                  ( )      ( )         ( )       ( )       ( )
Yugoslavia              ( )      ( )         ( )       ( )       ( )

____end of detailed summary; beginning of Danish Comments Accompanying
Affirmative Vote__________________________

>From keld@dkuug.dk Tue Apr 29 15:57:22 1997

Here is the Danish ballot on CD 14651:

Title: Comments on CD 14651 - International String Ordering

Source: Danish Standards Association

Date: 1997-04-29

Reference: SC22 N2364

The Danish ballot is: Yes, with general and technical comments

The comments are directed towards the English version of the text,
although the same comments can be made with respect to the French text.

1. The overall technical content of CD 14651 is sound, and as agreed
by the working group, and thus we can accept the document as a CD.

General comments:

2. There is too much emphasis on the "binary sorting string" concept.
The concept of just comparing two strings should be catered for
throughout the document. In some places the functionality can only be
reached by sorting on binary prepared strings. Also, there should be ample
warnings in a number of places about the binary sorting string concept, as it
is culturally dependent; that is, it depends on the sorting specification
used to produce the binary representation. Storing data in the precompiled
binary string representation should thus be recommended only for monocultural
environments, and those are actually environments that we should advise against,
having internationalization as our goal.

3. A formal description language, such as ISO 11404 or IDL of ISO 13788 (PCTE),
should be used in the specification of the APIs. The description of
the APIs lacks a number of specifications now, including description of the
types of the parameters, and specifications of how to bind to programming
languages, that are inherent in the 11404 and 13788 specification languages.
We are willing to help rewrite the API specifications in light of this
comment.

4. We recommend that a thin binding method be used, as demonstrated
in other API papers of WG20. We can provide text for this, in conjunction
with text to address the problems mentioned in comment 3.

5. The APIs have 3 parameters, that should not occur in the API, because
all localisation should be done via the locale. These are the parameters
order_accents, order_case and sign_espace of the COMPCAR and CARABIN functions.

6. The LC_COLLATE specification in 14652 format should be readily useable
and referenceable, without need for retailoring. The different options,
as expressed by the values of the 3 parameters in our comment 5, should
be available as different LC_COLLATE specifications, each with a well-defined
name.

7. The definitions in section 3 should be numbered and not ordered
alphabetically (in either English or French).

8. The definitions are too centered about a precompiled sorting string
concept. Terminology should also be applicable to comparisons on the
string encoding. Terms that should be useable with plain string comparisons
include: equivalence, ordering key, ordering subkey.

9. The technical specifications should be aligned with 14652, especially
hexadecimal symbolic ellipses "..".

10. The names of the APIs should be less French-oriented.

11. The tables should use names established from the POSIX locale
work, such as ISO/IEC 9945-2 annex G names or 14652 names from the
repertoiremap, especially when not using <Uxxxx> names.

12. A number of scripts have not been ordered properly, such as hiragana,
katakana and Thai.

13. A reversibility function from binary sort strings to character strings
seems to be missing.

14. There are some spelling errors, and we suggest a spell-checker be used
for production of further documents.

Technical comments:

15. page 5: first paragraph: It is not always required to transform, for example
"4" into a number of strings, sometimes it is only necessary to transform
it into one string. Thus change "requires" to "may require" and "is hence"
to "may thus be".

16. Page 5, last paragraph and following paragraphs: Too much emphasis on the
precompiled sorted character string data type. This is not a general type,
as noted in our comment 2.

17. Page 8, Add after "Scandinavian" "and several other". This includes languages
like Polish, Finnish, Hungarian, Turkish, and many others.

18. Page 14: "subprogramme" - rather use the word "function". All APIs in this
standard are functions. All references to "subprogrammes" should be
changed to "functions" in the standard.

19. page 15, first paragraph: we recommend that only uppercase characters be
used in hexadecimal numbers, and this is also the specification in CD 14652.

20. Page 15, last paragraph: it seems that it is a requirement that an LC_COLLATE
specification, like the default, can be tailored on the fly. This is not
recommendable, as it would take quite some processing time, and thus delay
the processing considerably. On the fly tailoring should thus not be a
requirement.

21. Page 16, 5.1.1 last paragraph: use the name of the API (COMPCAR)
instead of the number "API 1".

22. Page 17: last paragraph: the names of the functions should be used for
the binding. Of course the names of the functions may vary for the
different programming languages, but the names are more than "only
indicative".

23. Page 20: The COMPCAR function seems to miss a result value on
whether the first string was lexicographically less than, equal to or greater
than the second string. We propose the values -1, 0 and 1 for the three
possibilities, in line with current C practice. Also return values seem to
be missing for the other functions.

24. Page 21: It should not be normatively required that COMPCAR be equivalent
to CARABIN and COMPBIN. CARABIN produces output that is not necessary for
some use of COMPCAR.

25. Page 21, last paragraph: It should not be prescribed that there be
binary strings used for comparisons, in the COMPCAR function. Also the
"default" table mentioned here is the global locale, and not the 14651 default.
This should be clarified, maybe using "global" instead of "default".

26. Page 22: all parameters should be spelled out, and references to other
APIs when defining the parameters should be avoided.

27. Page 25 second paragraph: the default table cannot be used per se, as it
needs tailoring. See our comment 6 on how to solve this.

28. Page 27, first paragraph: this description is very oriented towards
the binary sort string. Descriptions also valid for the COMPCAR method
without binary sort strings should be present. We would request a separate
description of how COMPCAR can be implemented, especially pointing out that only
comparison of the first (few) characters is necessary in many cases, and
that generating binary sort strings is typically not necessary.

29. Page 27: level 1: Some non-letters, for example Kana, may have more than
one character at the first level.

30. Page 27: note of 5.3.2.1: Combining accents may have ignore at level 1,
and then values at level 2. Should that not lead to full predictability?

31. Page 29: level one: Use the API names instead of "SUBPROGRAMME"

32. Page 29: what is the difference between level 2 and 4? In  traditional
locale invocation there is not that difference, but some other
difference. Maybe level 4 should always be required.

33. Page 31: COLL_WEIGHT_MAX is not a directive of 14652.

34. Page 31: Some scripts are not (yet) in IS 10646, for example the Yi and 
Canadian syllable scripts.

35. Page 31: We should ensure that comments are allowed in all the places used
here according to 14652, and possibly change 14652 to allow them.

36. Page 41-51: a number of the symbols defined here are also defined later.
Example <a8> defined on page 46 and page 79. This is not allowed according
to 14652 (giving a symbol two weights).

37. Page 111: (4) There needs to be a strong warning  that binary strings stored
cannot be used internationally for culturally correct sorting,
as they are stored in a localized form. Or we should simply advise against it.

38. Page 112: the text seems obsolete, as these concepts have been proven.

39. Page 115: Also list ISO/IEC 9945-2 POSIX shell and utilities, especially
annex G, as a source.

40. Page 118, paragraph 7: There is only a need for 4 levels, not 5. 

41. Page 118, paragraph 7:
Is it necessary to have an extra level for 10646 conformance level 3? Maybe
in some cases but not generally. When sorting the combining characters
per se, there is no need for a further level.

42. Page 119: paragraph 9: We thought this was proven not to be true. Or is this
some implementation guideline (which then should be noted as such).

43. Page 120: Annex I should be explained further, especially how it fits into
the internationalization model.
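
As an illustration of comments 23 and 28 above, the following is a minimal
sketch in Python of a comparison that returns -1, 0 or 1 and works level by
level without first building a binary sort string. The weight table, the
level layout and the name compare_strings are assumptions made for this
example; they are not taken from the CD or its tables.

```python
from itertools import zip_longest

# Hypothetical per-character weights: (base, accent, case, special).
# A weight of 0 means "ignored at this level". Values are invented.
WEIGHTS = {
    "a": (1, 1, 1, 0),
    "A": (1, 1, 2, 0),
    "\u00e1": (1, 2, 1, 0),   # a with acute accent
    "b": (2, 1, 1, 0),
    "B": (2, 1, 2, 0),
    "-": (0, 0, 0, 1),        # ignored at the first three levels
}

def level_weights(text, level):
    """Lazily yield the non-ignored weights of text at one level."""
    for ch in text:
        weight = WEIGHTS[ch][level]
        if weight:
            yield weight

def compare_strings(s1, s2):
    """Return -1, 0 or 1, stopping at the first differing weight."""
    for level in range(4):
        pairs = zip_longest(level_weights(s1, level),
                            level_weights(s2, level), fillvalue=0)
        for w1, w2 in pairs:
            if w1 != w2:
                return -1 if w1 < w2 else 1
    return 0
```

Because the weights are produced lazily, a difference in the first few base
letters ends the comparison before any later level is examined. The handling
of the fourth level here ignores position, which is a simplification.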

_________________________end of Danish Comments _______________________
____________beginning of UK comments accompanying abstention vote ______ 

> N 2364 	ISO/IEC CD 14651
> 
> The UK ABSTAINS on this ballot, due to lack of participation in this area.
> The UK would however like to bring the following issues to the attention of
> SC 22 : 
> 
> - a tutorial on problems solved is inappropriate for an IS; either the
> document should be a TR or the tutorial moved to an appendix. 
> 
> - the statement on page 10 about information being obtainable from
> Alain LaBonte' is also inappropriate for a formal document.
> 
> 
> There are also a number of minor points:
> 
> - there are a disturbingly high number of elementary typographical
> errors (e.g.  p 18 'starings' (strings); 'compariosn', 'aat'; also mixed
> languages in chbin1, chbin2 heading).  On page 19 there are French
> quotation marks rather than English ones. 
> 
> - p 25 there is a reference to section 5.8, which does not exist.
> 
> - subprogramme is consistently spelled thus, although `subprogram' is
> the correct form in both US and UK (don't know about Canada, Australia
> etc).

______________________________end of UK comments ______________________
_____________beginning of Austrian comments accompanying negative _____

ON (the Austrian NB) votes NO on CD Ballot SC22 N2364
(CD 14651 - Information technology - International String
Ordering - Method for Comparing Character Strings and
Description of a Default Tailorable Ordering) with the
following comments:

(1) It seems doubtful (to say the least) that a reasonable
Default Ordering for all -- or even most -- of the languages
of the world can be found.  Consequently, there is reason to
doubt the usefulness of the proposed International Standard.

(2) The "Tutorial" contained in the Introduction should be
moved to an informative annex; it should not remain in the
main part of the document which would have to be considered
normative.

(3) Even though there is a "Tutorial", the proposed methods
do not seem to be well explained.  It could at least be
expected that one should be able to read and understand the
tables in Annex 1 without having to consult other sources.
For an example, see page 51 where a rather poor comment, in
itself encoded, supposedly explains the structure of the
following tables by cryptically stating:
    "% <Uxxxx> <Base>;<Accent>;<Case>;<Special>"
The sudden change of typeface on the same page seems equally
confusing and unmotivated (except possibly by line length).
Also, it seems that a more detailed description of a
possible practical implementation could prove helpful.

(4) The "Benchmark" in Annex 2 adds to the general confusion
by showing the "sorted" version to be (in excerpt):
    "vice-president's"
    "offices"
    "vice-presidents'"
    "offices"
The problem obviously lies in automatic line breaks and can
easily be corrected, but seems to raise the question whether
similar errors have been introduced in areas which are very
difficult -- if not impossible -- to check.  To mention the
most prominent example, some errors in Annex 1 might never
be found because this part of the document can hardly be
checked exhaustively.

(5) It is rather difficult to determine the necessity of
text that is not present.  ON does therefore not feel able
to decide on Annexes F, G, and H.

(6) The document has obviously been translated from French
to English, which would not be a problem if the process had
been completed.  For a counterexample see the description
of procedures chbin1 and chbin2 on page 18.  Also, the name
of procedure sign_espace (on page 19) seems to be partially
French.

(7) The document does not appear to have been spell-checked.
Some examples:
    p. 19:  "precedenceof" should be "precedence of"
    p.109:  "deafult" should be "default"
    p.114:  "standaredized" should be "standardized"

(8) Anticipating the answer that ON experts should actively
participate in the process of correction and development of
the document in question, ON states that expert resources
in this area are too limited at this time.  However, this
does not imply that any document can be accepted.  Sorry.
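
As an illustration of the table comment quoted in comment (3) above, one
possible reading is that each entry consists of a character symbol followed
by four semicolon-separated weight fields. The following minimal Python
sketch parses lines under that assumption. The exact entry syntax, the
function name parse_entry and the sample weight symbols used with it are
hypothetical and are not taken from the CD.

```python
import re

# Assumed entry shape: <symbol> field;field;field;field  [% comment]
ENTRY = re.compile(
    r"^(<[^>]+>)\s+([^;\s]+);([^;\s]+);([^;\s]+);([^;\s%]+)"
)

def parse_entry(line):
    """Return (symbol, weights by level name), or None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("%"):
        return None
    match = ENTRY.match(line)
    if match is None:
        raise ValueError("unrecognized entry: " + line)
    symbol, base, accent, case, special = match.groups()
    weights = {"base": base, "accent": accent,
               "case": case, "special": special}
    return symbol, weights
```

For example, an invented line such as "<U0061> <a8>;<none>;<lower>;<U0061>"
would yield the symbol <U0061> with one weight symbol for each of the four
levels named in the quoted comment.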

___________________ end of Austrian comments _______________________
________beginning of Japanese comments accompanying negative _______

Japan disapproves CD 14651 proposed in SC22 N2364.

The CD is not mature enough to proceed to DIS from the viewpoint of
completeness as a JTC1 standard, for the following reasons:
- it is not yet tuned precisely enough from a technical viewpoint,
- consensus has not yet been reached on the expected ordering result,
- it depends heavily on ISO/IEC 14652, which is not yet at CD stage, and
- the style of the document does not meet the JTC1 requirements.

Therefore, because of high dependency of this CD on ISO/IEC 14652, Japan  
requests to wait and synchronize the review and ballot of CD 14651 until  
CD 14652 is registered, or to change the scope of the standard to  
"ordering result" only and move API part to i18n API project.

Thus, Japan sees absolutely no reason why we need to proceed to DIS now.


Comment detail.

1. Style  (major editorial)
The CD is very different from what the ISO/JTC1 directives require (and
also different from the template provided by ITTF and from many JTC1
standards).  For example, there is a very high dependency on font selection
(usage of bold, slant, point size variation and/or unnecessary typeface
mixture is prohibited).  The Definitions clause needs to have a sub-clause
for each term, and there need to be two groups of annexes, one normative and
one informative.  Review and rewrite all text according to the ISO/JTC1
directives and the template supplied by ITTF.

2. Relation with ISO/IEC 14652. (General process)
The syntax and semantics of Annex 1 are not defined in this draft and
depend on ISO/IEC 14652, which is not available yet.  Synchronize the
project with ISO/IEC 14652 development: wait for a decision at least until
CD 14652 is available, or, if that is not accepted, move the related part of
ISO/IEC 14652 into this CD.

3. Tutorial (major editorial)
A heavy tutorial clause at the beginning is not appropriate; move it to an
appropriate place and rewrite it to fit the new place.  In addition,
there is much "information only" text in the main clauses (such as clause
5.3).  Remove it from the main (and mostly normative) part of the
standard, and place it (if really necessary) in an appropriate related
place.

4. Scope (major technical)
Describe what this standard defines clearly and in a straightforward way.
For example, change the word "a method" to a much clearer, specific word (which
is API).  Once the above change is made, it may affect the title of the
standard.  Also the phrase "Default Tailorable Ordering" does not have a
logical meaning.  One possibility for the new title would be "API with
default order for International string ordering".

The last part of the 2nd bullet (on an order which is culturally---of that script)
should be removed because "order which is acceptable culturally" is not in the
scope of this standard.  This part should be rewritten as something like
"The default order is aiming for easy understanding of non-casual user of
the script, cultural correctness/acceptance is not a purpose of the
default order.  The correctness/acceptance by the casual (or native)  user
to be provided by tailoring by the user or as a country profile".

Rationale: The above has been the agreement on the project scope from the
beginning.  There were many discussions of the impracticality of having a
single default order which would satisfy all cultures.  The conclusion
has been that it is not practical to have such an ideal default order, and it
was said that "this is why tailoring is needed".  Japan, then, did not
request cultural correctness for ordering.  The same goes for French: since
French ordering is so sophisticated that no outsider understands it easily,
it is not practical to use true French order as the international
default order; it may cause misunderstanding among people of other
cultures.  Such sophisticated ordering (such as French) can be
satisfactorily supported by tailoring anyway.  (See clause 4.2.7 of DTR
11017.  This IS is not i18n per 4.2.6 nor 4.2.4.  This IS is aiming at 4.2.7.)

5.  Definitions (major technical)
5-1, Each definition should have a separate sub-clause number.
5-2. API:  The initial text of "for purpose of..... standard" is not
necessary.
5-3. equivalence:  Too long; reduce it to about 1/3 by eliminating
"informative" text within this definition (for example: the last 4 lines).
5-4. field, first order token, fourth order token level, level, second
order token, transformation, third order token:  Eliminate "informative"
explanations.
5-5. posthandling, prehandling : These definitions should be moved to the
related clause.
5-6  telephone-book-type transformation:  This term need not be defined
in Definitions because it appears only once, in the Introduction (5th para.,
Page 5).  Although Japan considers that the paragraph is understandable
in itself, we propose to change the first sentence to:
   More generally, specific requirements exist for a kind of complex
   transformation
   -- e.g. phonetic transformation adopted in some telephone-book systems
because the meaning of telephone-book ordering differs from culture to
culture, so the current wording may confuse the user.

6.  Conformance (major technical)
6-1.  The conformance clause(s) should come after the scope clause; it should
not come after the requirements clause.  The current location of the
conformance clause makes it difficult to understand each conformance level
clearly.

Reason (rationale) why the conformance clause should be clause 2:
If the requirement is simple and no leveling is employed, the conformance
clause can be in any place in theory.  Note that ISO/IEC Directives Part 3
does not even require a "conformance clause".  However, in the case of ISO/IEC
CD 14651 the condition is different: it should be clause 2.
Since 14651 is a very complicated multilevel standard, the scope clause
cannot cover everything that a "scope" clause should say.  The conformance
clause, in particular the clean and clear "levels" descriptions, acts in
reality as a sub-scope clause as well as a real conformance description.
If it does not come after the "scope" clause, it is almost impossible for the
user of the standard to understand "what is defined in this standard and
how to read the standard efficiently and accurately".
6-2,  The conformance clause should have exact pointer(s) to the conformance
requirements (clause and sub-clause numbers).  Umbrella conformance for
requirements buried within main clauses (as in this CD) should not be
used.   (The current CD is too unkind to the reader.)
6-3.  In the case of leveled conformance, provide a sub-clause to explain
what those levels are in a much more straightforward way.  (There is too much
indirect explanation now.)
6-3-1.  Conformance level-1 should be defined as "Generic API only", and
should not make some of the parameters "options".  The options cause
incompatibility problems between conforming level-1 APIs.  Further,
define two options (not parameter options), one for COMPCAR and another
for COMPBIN + CARABIN.
6-3-2  Conformance level-2 should be defined and stated as "Generic API  
and table format"
6-3-3  Conformance level-3:  Change prehandling to a normative requirement
on string input.  Thus prehandling is out of scope of this standard
(remove 5.1.2 at least).   Then, change the description of this
conformance level accordingly.  By the way, in the current text, a normative
clause (5.1.2) refers to an informative annex. This is a prohibited practice.
6-3-4  Conformance level-4.    Remove the word "possibility"; the
result might then be "Add API an access method for specific table".
6-4.  Add a concept of conformance for "ordering result only"
6-5  Add a method to specify partial conformance of the ordering result.  For
example, a method to state "everything but the Japanese repertoire conforms
to this default order, and the Japanese repertoire is per JIS" would
be a real-life use of this standard (as one subset of the ordering-result-only
conformance).
6-6,  Add a method to swap the order of the scripts, while the orders within
each script still conform to the default order.
6-7,  Add a method to state that only selected scripts in comment 6-6
conform to the default order.
6-8,  Maintain compatibility with POSIX and C.  Providing an independent
conformance level may be one of the choices for responding to this comment.
6-9,  Remove all "best guess" dependency.  Write exactly what is needed.
For example, there is no description of what "default order" is.  There is a
default table and APIs (and conformance levels), so the best guess may be to
use the "default table" with the APIs.


7. Requirements (major technical)
7-1.  There are many options in one conformance level; those should be
separate levels of conformance if they are really necessary.
7-2.  The "Toggle" mechanism, which is realized by parameters  
"order_accent", "order_case" and "sign_escape", should be removed  
because:
1) it contradicts the concept of the locale mechanism -- it allows an
ordering regardless of the ordering table defined as a locale,
2) the concepts of "case" and "accents" are specific to some scripts and
they are not defined in this draft, where these script-dependent concepts
have been resolved into universal rules in tables.
Instead of the current "Toggle" mechanism, Japan proposes to reconsider
the specification of ordering tables, which will be defined in ISO/IEC
14652, so as to enable variants of the default table to be defined more
flexibly -- for example, by introducing some preprocessing elements
        #define ...
        #ifdef ...
        #include ...
etc.

7-3.  table
Specifying the name of an ordering table as a parameter "table" in COMPCAR
and CARABIN will put a heavy burden on implementations.  At runtime, on
every call, COMPCAR and CARABIN would have to check whether the table has
changed from that of the previous call and/or whether the table needs to
be compiled.
There are two alternative solutions to this problem:
1) to remove the parameter "table" from the two processes and define a
new process "set_collating_table" which has a parameter "table",
2) to define a new process "open_table" which has an input parameter
"table" and returns a pointer to a protected structure derived from that
"table", while the parameter "table" in the two processes is changed to
"table_pointer".
7-4  "chbin1" and "chbin2" in COMPCAR are not necessary. Furthermore,
options within an API specification do not make any sense at all.
7-5. The whole contents of 5.3 should be removed or put into an
informative annex because those contents are to be defined in ISO/IEC  
14652 in the current framework.
7-6. Add text for the case where characters are not encoded in ISO/IEC
10646. Some character sets, e.g. ISO 6937, are not in ISO/IEC 10646, and
some do not (yet) have a conversion table to (or the same character names
as) ISO/IEC 10646.
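
As an illustration of alternative 2 in comment 7-3 above, the following
minimal Python sketch compiles a table once in open_table and passes the
resulting handle to the comparison as table_pointer. The names open_table
and table_pointer come from the comment; the class TableHandle, the "demo"
table, the single-level weights and the -1/0/1 return value are assumptions
made only for this example.

```python
# Hypothetical table sources; a real table would come from an
# LC_COLLATE-style definition rather than a string of characters.
_TABLE_SOURCES = {"demo": "aAbBcC"}

class TableHandle:
    """Stand-in for the 'protected structure' returned by open_table."""
    def __init__(self, name, weights):
        self.name = name
        self._weights = weights  # compiled once, reused by every call

def open_table(table):
    """Compile the named table once and return a handle to the result."""
    source = _TABLE_SOURCES[table]
    weights = {ch: rank for rank, ch in enumerate(source)}
    return TableHandle(table, weights)

def compcar(table_pointer, s1, s2):
    """Compare two strings using an already opened table."""
    k1 = [table_pointer._weights[ch] for ch in s1]
    k2 = [table_pointer._weights[ch] for ch in s2]
    return (k1 > k2) - (k1 < k2)
```

With this shape the comparison never has to check whether the table changed
since the previous call, since the caller holds the compiled handle.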

8. Data table (such as Annex A) (major technical)
8-1.  Japan confirms a principle of default order table as:
- The default order is non-native user friendly (easy to understand,  
simple rule, less exceptions)
- Cultural correctness for the native user of the script should be done  
by tailoring.  APIs and data format should have enough room for the  
necessary tailoring.
- Therefore, cultural correctness of the default order is not a goal of
this standard.  Based on the principle above, the Japanese proposal on
Japanese scripts is not correct from a Japanese point of view; however, it
is easy for people who are not familiar with Japanese scripts.
8-2 Collation for HIRAGANA and KATAKANA
Japan proposes to add a set of collating rules for HIRAGANA and KATAKANA,
attached.

The order defined in Attachment is different from the one defined in JIS X
4061, which was published in February 1997.  The main difference is in the
handling of the prolonged sound mark <U30FC>. Roughly speaking, JIS X 4061
replaces the prolonged sound mark with the vowel of the most recent
letter, while Attachment neglects the prolonged sound mark at first, in
the same way as a hyphen.
The second difference is handling of the iteration marks <U309D>, <U309E>,  
<U30FD>, <U30FE>.  Roughly speaking, JIS X 4061 replaces the iteration  
marks with the most recent KANA letter, while Attachment handles the  
iteration marks as they are.
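
The two differences can be illustrated with a small sketch. It is not
taken from JIS X 4061 or from the Attachment: the vowel table covers only
the few katakana used in the example, the function names are hypothetical,
and only the first comparison pass is modelled.

```python
PROLONG = "\u30FC"   # KATAKANA-HIRAGANA PROLONGED SOUND MARK
ITER = "\u30FD"      # KATAKANA ITERATION MARK

# Vowel kana for a handful of katakana (illustrative subset only).
VOWEL_OF = {"カ": "ア", "キ": "イ", "ク": "ウ", "ケ": "エ", "コ": "オ",
            "ア": "ア", "イ": "イ", "ウ": "ウ", "エ": "エ", "オ": "オ"}

def jis_like(s):
    """Rough JIS X 4061 style: substitute marks from the previous letter."""
    out = []
    for ch in s:
        if ch == PROLONG and out:
            out.append(VOWEL_OF.get(out[-1], ch))   # vowel of last letter
        elif ch == ITER and out:
            out.append(out[-1])                      # repeat last kana
        else:
            out.append(ch)
    return "".join(out)

def attachment_like(s):
    """Attachment style, first pass: drop the prolonged sound mark like a
    hyphen and keep iteration marks as they are."""
    return s.replace(PROLONG, "")

for word in ("コーク", "コ" + ITER):
    print(word, jis_like(word), attachment_like(word))
```

The first function needs to look at the preceding letter, which is the
context dependence that a plain per-character weight table cannot express.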

The reasons for proposing the Attachment are as follows:
  1) JIS X 4061 cannot be realized in the LC_COLLATE representation
  unless rules using regular expressions, which would put a heavy
  burden on implementations, are introduced;

  2) the ordering results of JIS X 4061 are hard to understand for
  foreigners without knowledge of how letter sequences are
  pronounced -- it is not cross-culture friendly;

  3) the ordering results of the Attachment are easy to understand for
  foreigners without knowledge of the pronunciation of letter sequences,
  and even in Japan a number of encyclopedias order their items in
  the same way as the Attachment does -- it is cross-culture friendly.

8-3.  Consideration of compatibility characters of ISO/IEC 10646.
Consideration of the compatibility characters is missing.  At least the
following are needed.
8-3-1  FF00-FF9F, FFE0-FFE8
Handle those characters the same as the equivalent characters in the
A-zone.
8-3-2  F900-FA0D, FA10, FA12, FA15-FA1E, FA20, FA22, FA25, FA26,
FA2A-FA2D of ISO/IEC 10646-1
Handle those characters the same as the equivalent characters in the
I-zone.
8-4.  FA0E, FA0F, FA11, FA13, FA14, FA1F, FA21, FA23, FA24, FA27-FA29 of
ISO/IEC 10646-1 and future additions of CJK ideographs (ext-A and B).
Merge them with the I-zone characters according to a defined rule.
Provide an informative annex which describes the rule (radical, number
of strokes and so on).
8-5.  Character combination type symbols.
Handle characters which are made up of a combination of two or more
Japanese characters, such as 3300-336F, as if they were a string of
independent characters.
8-6. Symbols made up of character(s) and symbol(s)
Symbols containing character(s) should be handled by one of the following
methods.
a) Character(s) and symbol(s) that are like a "short form" of normal
writing, such as 2480, which looks like "( 13 )".  Split the symbol as if
it were a normal string.
b) Character(s) and symbol(s) that cannot be split into one unambiguous
sequence, such as 2470, where the circle can be either before or after
the characters "17".  Handle it as if it were a special form of the
character(s) part of the symbol.
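
One way to approximate the handling requested in 8-3, 8-5 and 8-6 is the
compatibility folding available in Python's unicodedata module. This is a
substitute technique shown for illustration, not something the CD or this
comment specifies; the function name fold_compat is hypothetical. A real
table would also keep a lower-level weight so that the folded forms stay
distinguishable, as case b) of 8-6 implies.

```python
import unicodedata

def fold_compat(s):
    """Replace compatibility characters by their NFKC equivalents."""
    return unicodedata.normalize("NFKC", s)

samples = {
    "\uFF71": "HALFWIDTH KATAKANA LETTER A (8-3-1)",
    "\uF900": "CJK compatibility ideograph (8-3-2)",
    "\u3300": "SQUARE APAATO (8-5)",
    "\u2480": "PARENTHESIZED NUMBER THIRTEEN (8-6 a)",
    "\u2470": "CIRCLED NUMBER SEVENTEEN (8-6 b)",
}

# Show what each compatibility character folds to.
for ch, label in samples.items():
    print("U+%04X %s -> %r" % (ord(ch), label, fold_compat(ch)))
```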
8-7.  Symbols for making combining sequences, such as 20E0.
Follow the rule proposed in 8-6 above; the process might be different
from the method for combining sequences.
8-8. Japan expects that many countries have the same kinds of comments
as above.  Japan therefore requests that a request for confirmation
specific to the data table be circulated to all JTC1 member countries
(not only SC22 P-members) for review.

9. Other comments
Japan recognizes many editorial issues, as well as technical issues, which
are not in this ballot comment; there are too many major technical
comments (and maybe more to be expected) to give us time to scan all of
them.  Japan thinks that minor editorial comments are unnecessary
components of this ballot comment because of the immaturity of CD 14651.
In any case, the text should be totally rewritten for full acceptance of
the technical comments.


-------  ATTACHMENT ---------

%level 1
<kn-dot>
<kn-prolong>
<kn-a>
<kn-i>
<kn-u>
<kn-e>
<kn-o>
<kn-ka>
<kn-ki>
<kn-ku>
<kn-ke>
<kn-ko>
<kn-sa>
<kn-si>
<kn-su>
<kn-se>
<kn-so>
<kn-ta>
<kn-ti>
<kn-tu>
<kn-te>
<kn-to>
<kn-na>
<kn-ni>
<kn-nu>
<kn-ne>
<kn-no>
<kn-ha>
<kn-hi>
<kn-hu>
<kn-he>
<kn-ho>
<kn-ma>
<kn-mi>
<kn-mu>
<kn-me>
<kn-mo>
<kn-ya>
<kn-yu>
<kn-yo>
<kn-ra>
<kn-ri>
<kn-ru>
<kn-re>
<kn-ro>
<kn-wa>
<kn-wi>
<kn-we>
<kn-wo>
<kn-n>
<kn-cmb-voice>
<kn-cmb-semivoice>
<kn-voice>
<kn-semivoice>
<kn-iter>

%level 2
<kn-HIRA>
<kn-KATA>

%level 3
<kn-SMALL>
<kn-SMALL-HF>           % for Table 122
<kn-NORMAL>
<kn-NORMAL-HF>          % for Table 122
<kn-VOICED>
<kn-SEMIVOICED>

% abbreviations in comments -- UCS Name:
%       HIRAkn -- HIRAGANA LETTER
%       KATAkn -- KATAKANA LETTER
%       hfwd   -- HALFWIDTH
%       voice  -- KATAKANA-HIRAGANA VOICED SOUND MARK
%       semi-voice -- KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
%       iter   -- ITERATION
%       comb    -- COMBINING
%       prolong -- KATAKANA-HIRAGANA PROLONGED SOUND MARK
%
<U3041> <kn-a>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL A
<U3042> <kn-a>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  A
<U30A1> <kn-a>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL A
<UFF67> <kn-a>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL A
<U30A2> <kn-a>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  A
<UFF71> <kn-a>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  A
%
<U3043> <kn-i>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL I
<U3044> <kn-i>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  I
<U30A3> <kn-i>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL I
<UFF68> <kn-i>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL I
<U30A4> <kn-i>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  I
<UFF72> <kn-i>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  I
%
<U3045> <kn-u>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL U
<U3046> <kn-u>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  U
<U3094> <kn-u>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  VU
<U30A5> <kn-u>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL U
<UFF69> <kn-u>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL U
<U30A6> <kn-u>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  U
<UFF73> <kn-u>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  U
<U30F4> <kn-u>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  VU
%
<U3047> <kn-e>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL E
<U3048> <kn-e>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  E
<U30A7> <kn-e>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL E
<UFF6A> <kn-e>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL E
<U30A8> <kn-e>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  E
<UFF74> <kn-e>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  E
%
<U3049> <kn-o>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL O
<U304A> <kn-o>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  O
<U30A9> <kn-o>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL O
<UFF6B> <kn-o>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL O
<U30AA> <kn-o>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  O
<UFF75> <kn-o>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  O
%
<U304B> <kn-ka>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  KA
<U304C> <kn-ka>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  GA
<U30F5> <kn-ka>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL KA
<U30AB> <kn-ka>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  KA
<UFF76> <kn-ka>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  KA
<U30AC> <kn-ka>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  GA
%
<U304D> <kn-ki>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  KI
<U304E> <kn-ki>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  GI
<U30AD> <kn-ki>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  KI
<UFF77> <kn-ki>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  KI
<U30AE> <kn-ki>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  GI
%
<U304F> <kn-ku>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  KU
<U3050> <kn-ku>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  GU
<U30AF> <kn-ku>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  KU
<UFF78> <kn-ku>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  KU
<U30B0> <kn-ku>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  GU
%
<U3051> <kn-ke>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  KE
<U3052> <kn-ke>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  GE
<U30F6> <kn-ke>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL KE
<U30B1> <kn-ke>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  KE
<UFF79> <kn-ke>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  KE
<U30B2> <kn-ke>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  GE
%
<U3053> <kn-ko>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  KO
<U3054> <kn-ko>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  GO
<U30B3> <kn-ko>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  KO
<UFF7A> <kn-ko>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  KO
<U30B4> <kn-ko>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  GO
%
<U3055> <kn-sa>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  SA
<U3056> <kn-sa>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  ZA
<U30B5> <kn-sa>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  SA
<UFF7B> <kn-sa>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  SA
<U30B6> <kn-sa>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  ZA
%
<U3057> <kn-si>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  SI
<U3058> <kn-si>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  ZI
<U30B7> <kn-si>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  SI
<UFF7C> <kn-si>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  SI
<U30B8> <kn-si>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  ZI
%
<U3059> <kn-su>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  SU
<U305A> <kn-su>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  ZU
<U30B9> <kn-su>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  SU
<UFF7D> <kn-su>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  SU
<U30BA> <kn-su>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  ZU
%
<U305B> <kn-se>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  SE
<U305C> <kn-se>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  ZE
<U30BB> <kn-se>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  SE
<UFF7E> <kn-se>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  SE
<U30BC> <kn-se>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  ZE
%
<U305D> <kn-so>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  SO
<U305E> <kn-so>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  ZO
<U30BD> <kn-so>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  SO
<UFF7F> <kn-so>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  SO
<U30BE> <kn-so>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  ZO
%
<U305F> <kn-ta>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  TA
<U3060> <kn-ta>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  DA
<U30BF> <kn-ta>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  TA
<UFF80> <kn-ta>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  TA
<U30C0> <kn-ta>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  DA
%
<U3061> <kn-ti>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  TI
<U3062> <kn-ti>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  DI
<U30C1> <kn-ti>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  TI
<UFF81> <kn-ti>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  TI
<U30C2> <kn-ti>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  DI
%
<U3063> <kn-tu>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL TU
<U3064> <kn-tu>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  TU
<U3065> <kn-tu>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  DU
<U30C3> <kn-tu>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL TU
<UFF6F> <kn-tu>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL TU
<U30C4> <kn-tu>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  TU
<UFF82> <kn-tu>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  TU
<U30C5> <kn-tu>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  DU
%
<U3066> <kn-te>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  TE
<U3067> <kn-te>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  DE
<U30C6> <kn-te>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  TE
<UFF83> <kn-te>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  TE
<U30C7> <kn-te>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  DE
%
<U3068> <kn-to>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  TO
<U3069> <kn-to>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  DO
<U30C8> <kn-to>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  TO
<UFF84> <kn-to>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  TO
<U30C9> <kn-to>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  DO
%
<U306A> <kn-na>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  NA
<U30CA> <kn-na>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  NA
<UFF85> <kn-na>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  NA
%
<U306B> <kn-ni>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  NI
<U30CB> <kn-ni>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  NI
<UFF86> <kn-ni>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  NI
%
<U306C> <kn-nu>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  NU
<U30CC> <kn-nu>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  NU
<UFF87> <kn-nu>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  NU
%
<U306D> <kn-ne>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  NE
<U30CD> <kn-ne>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  NE
<UFF88> <kn-ne>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  NE
%
<U306E> <kn-no>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  NO
<U30CE> <kn-no>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  NO
<UFF89> <kn-no>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  NO
%
<U306F> <kn-ha>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  HA
<U3070> <kn-ha>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  BA
<U3071> <kn-ha>;<kn-HIRA>;<kn-SEMIVOICED>;<IGNORE> % HIRAkn  PA
<U30CF> <kn-ha>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  HA
<UFF8A> <kn-ha>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  HA
<U30D0> <kn-ha>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  BA
<U30D1> <kn-ha>;<kn-KATA>;<kn-SEMIVOICED>;<IGNORE> % KATAkn  PA
%
<U3072> <kn-hi>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  HI
<U3073> <kn-hi>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  BI
<U3074> <kn-hi>;<kn-HIRA>;<kn-SEMIVOICED>;<IGNORE> % HIRAkn  PI
<U30D2> <kn-hi>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  HI
<UFF8B> <kn-hi>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  HI
<U30D3> <kn-hi>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  BI
<U30D4> <kn-hi>;<kn-KATA>;<kn-SEMIVOICED>;<IGNORE> % KATAkn  PI
%
<U3075> <kn-hu>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  HU
<U3076> <kn-hu>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  BU
<U3077> <kn-hu>;<kn-HIRA>;<kn-SEMIVOICED>;<IGNORE> % HIRAkn  PU
<U30D5> <kn-hu>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  HU
<UFF8C> <kn-hu>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  HU
<U30D6> <kn-hu>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  BU
<U30D7> <kn-hu>;<kn-KATA>;<kn-SEMIVOICED>;<IGNORE> % KATAkn  PU
%
<U3078> <kn-he>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  HE
<U3079> <kn-he>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  BE
<U307A> <kn-he>;<kn-HIRA>;<kn-SEMIVOICED>;<IGNORE> % HIRAkn  PE
<U30D8> <kn-he>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  HE
<UFF8D> <kn-he>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  HE
<U30D9> <kn-he>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  BE
<U30DA> <kn-he>;<kn-KATA>;<kn-SEMIVOICED>;<IGNORE> % KATAkn  PE
%
<U307B> <kn-ho>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  HO
<U307C> <kn-ho>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAkn  BO
<U307D> <kn-ho>;<kn-HIRA>;<kn-SEMIVOICED>;<IGNORE> % HIRAkn  PO
<U30DB> <kn-ho>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  HO
<UFF8E> <kn-ho>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  HO
<U30DC> <kn-ho>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  BO
<U30DD> <kn-ho>;<kn-KATA>;<kn-SEMIVOICED>;<IGNORE> % KATAkn  PO
%
<U307E> <kn-ma>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  MA
<U30DE> <kn-ma>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  MA
<UFF8F> <kn-ma>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  MA
%
<U307F> <kn-mi>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  MI
<U30DF> <kn-mi>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  MI
<UFF90> <kn-mi>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  MI
%
<U3080> <kn-mu>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  MU
<U30E0> <kn-mu>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  MU
<UFF91> <kn-mu>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  MU
%
<U3081> <kn-me>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  ME
<U30E1> <kn-me>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  ME
<UFF92> <kn-me>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  ME
%
<U3082> <kn-mo>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  MO
<U30E2> <kn-mo>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  MO
<UFF93> <kn-mo>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  MO
%
<U3083> <kn-ya>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL YA
<U3084> <kn-ya>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  YA
<UFF6C> <kn-ya>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL YA
<U30E3> <kn-ya>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL YA
<U30E4> <kn-ya>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  YA
<UFF94> <kn-ya>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  YA
%
<U3085> <kn-yu>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL YU
<U3086> <kn-yu>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  YU
<U30E5> <kn-yu>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL YU
<UFF6D> <kn-yu>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL YU
<U30E6> <kn-yu>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  YU
<UFF95> <kn-yu>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  YU
%
<U3087> <kn-yo>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL YO
<U3088> <kn-yo>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  YO
<U30E7> <kn-yo>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL YO
<UFF6E> <kn-yo>;<kn-KATA>;<kn-SMALL-HF>;<IGNORE> % hfwd KATAkn  SMALL YO
<U30E8> <kn-yo>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  YO
<UFF96> <kn-yo>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  YO
%
<U3089> <kn-ra>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  RA
<U30E9> <kn-ra>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  RA
<UFF97> <kn-ra>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  RA
%
<U308A> <kn-ri>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  RI
<U30EA> <kn-ri>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  RI
<UFF98> <kn-ri>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  RI
%
<U308B> <kn-ru>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  RU
<U30EB> <kn-ru>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  RU
<UFF99> <kn-ru>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  RU
%
<U308C> <kn-re>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  RE
<U30EC> <kn-re>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  RE
<UFF9A> <kn-re>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  RE
%
<U308D> <kn-ro>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  RO
<U30ED> <kn-ro>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  RO
<UFF9B> <kn-ro>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  RO
%
<U308E> <kn-wa>;<kn-HIRA>;<kn-SMALL>;<IGNORE> % HIRAkn  SMALL WA
<U308F> <kn-wa>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  WA
<U30EE> <kn-wa>;<kn-KATA>;<kn-SMALL>;<IGNORE> % KATAkn  SMALL WA
<U30EF> <kn-wa>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  WA
<UFF9C> <kn-wa>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  WA
<U30F7> <kn-wa>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  VA
%
<U3090> <kn-wi>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  WI
<U30F0> <kn-wi>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  WI
<U30F8> <kn-wi>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  VI
%
<U3091> <kn-we>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  WE
<U30F1> <kn-we>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  WE
<U30F9> <kn-we>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  VE
%
<U3092> <kn-wo>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  WO
<U30F2> <kn-wo>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  WO
<UFF66> <kn-wo>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  WO
<U30FA> <kn-wo>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAkn  VO
%
<U3093> <kn-n>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAkn  N
<U30F3> <kn-n>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAkn  N
<UFF9D> <kn-n>;<kn-KATA>;<kn-NORMAL-HF>;<IGNORE> % hfwd KATAkn  N
%
<U3099> <kn-cmb-voice>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % comb voice
<U309A> <kn-cmb-semivoice>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % comb semi-voice
<U309B> <kn-voice>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % voice
<UFF9E> <kn-voice>;<kn-HIRA>;<kn-NORMAL-HF>;<IGNORE> % hfwd voice
<U309C> <kn-semivoice>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % semi-voice
<UFF9F> <kn-semivoice>;<kn-HIRA>;<kn-NORMAL-HF>;<IGNORE> % hfwd semi-voice
<U309D> <kn-iter>;<kn-HIRA>;<kn-NORMAL>;<IGNORE> % HIRAGANA iter MARK
<U309E> <kn-iter>;<kn-HIRA>;<kn-VOICED>;<IGNORE> % HIRAGANA VOICED iter MARK
<U30FD> <kn-iter>;<kn-KATA>;<kn-NORMAL>;<IGNORE> % KATAKANA iter MARK
<U30FE> <kn-iter>;<kn-KATA>;<kn-VOICED>;<IGNORE> % KATAKANA VOICED iter MARK
%
<U30FB> <IGNORE>;<IGNORE>;<IGNORE>;<U30FB> % KATAKANA MIDDLE DOT
<UFF65> <IGNORE>;<IGNORE>;<IGNORE>;<UFF65> % hfwd KATAKANA MIDDLE DOT
<U30FC> <IGNORE>;<IGNORE>;<IGNORE>;<U30FC> % prolong
<UFF70> <IGNORE>;<IGNORE>;<IGNORE>;<UFF70> % hfwd prolong
<UFF61> % hfwd IDEOGRAPHIC FULL STOP -- to be handled with <U3002>
<UFF62> % hfwd LEFT CORNER BRACKET -- to be handled with <U300C>
<UFF63> % hfwd RIGHT CORNER BRACKET -- to be handled with <U300D>
<UFF64> % hfwd IDEOGRAPHIC COMMA -- to be handled with <U3001>
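
How a four-level table of this kind drives a comparison can be shown with
a short sketch. It is an illustration under simplifying assumptions, not
the comparison method of the CD: only five entries of the table are used,
the key is a plain list of per-level weight lists, and level 4 is ordered
by code point.

```python
IGNORE = None

# Symbol order per level, following the declarations above (subset).
LEVEL1 = ["<kn-a>", "<kn-i>", "<kn-ka>"]
LEVEL2 = ["<kn-HIRA>", "<kn-KATA>"]
LEVEL3 = ["<kn-SMALL>", "<kn-SMALL-HF>", "<kn-NORMAL>", "<kn-NORMAL-HF>",
          "<kn-VOICED>", "<kn-SEMIVOICED>"]
ORDERS = [LEVEL1, LEVEL2, LEVEL3]

# A few table entries: character -> (level 1, level 2, level 3, level 4).
TABLE = {
    "\u3042": ("<kn-a>", "<kn-HIRA>", "<kn-NORMAL>", IGNORE),   # HIRAGANA A
    "\u30A2": ("<kn-a>", "<kn-KATA>", "<kn-NORMAL>", IGNORE),   # KATAKANA A
    "\u304B": ("<kn-ka>", "<kn-HIRA>", "<kn-NORMAL>", IGNORE),  # HIRAGANA KA
    "\u304C": ("<kn-ka>", "<kn-HIRA>", "<kn-VOICED>", IGNORE),  # HIRAGANA GA
    "\u30FC": (IGNORE, IGNORE, IGNORE, "\u30FC"),               # prolong
}

def sort_key(s):
    """Build one weight list per level; IGNORE contributes nothing."""
    key = []
    for level in range(4):
        weights = []
        for ch in s:
            w = TABLE[ch][level]
            if w is IGNORE:
                continue
            # Levels 1-3 use the declared symbol order; level 4 uses the
            # code point itself, as in the <U30FC> entry.
            weights.append(ORDERS[level].index(w) if level < 3 else ord(w))
        key.append(weights)
    return key

words = ["\u304C", "\u30A2", "\u304B\u30FC", "\u3042", "\u304B"]
print(sorted(words, key=sort_key))
```

In this sketch HIRAGANA A and KATAKANA A tie at level 1 and are separated
at level 2, while the prolonged sound mark only matters at level 4.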

________________________end of Japan comments ___________________________
__________beginning of Netherlands comments accompanying negative _____

From: John Bijlsma <John.Bijlsma@nni.nl>


JTC1 SC22 N2364, ISO/IEC CD 14651
IT - International String Ordering - Method for
Comparing Character Strings and Description of a
Default Tailorable Ordering
97-04-24, DISAPPROVAL WITH COMMENT
......................................................

The Netherlands votes negative on CD 14651. To turn our vote into a
positive one, modifications shall be made in accordance with our comments.
We reserve our final position regarding the CD until we have seen the
Final CD.

Technical comments:

1.  Remove Annex 1 and all references to an International Default Order.

-- SC22 has no expertise in this field, and cannot check for correctness.
    Most NBs in SC22 are not able to check whether a proposed ordering
    for a certain unfamiliar script is in agreement with actual practice
    far from home. Those NBs that are familiar are not represented in
    SC22, nor have they been asked for comment.

-- Default order is an instrument of cultural imperialism.
    In several countries more than one ordering rule is in use without
    any agreed preference. Calling one of these the "default" is
    imposing an extraneous pressure, and will involve interference with
    national habits.

-- No need for a default.
    No country always uses all characters from 10646. Countries should
    not be burdened with unwanted features. A method for supplying
    ordering information for a given restricted character set to an API
    should be contained in 14651 itself, without reference to 14652.

2.  Remove all references to 14652.

-- Needless complexity should be avoided.
    An ISO standard should be as independent as possible of other ISO
    standards. If ordering information can only be supplied by way
    of a complete set of cultural conventions, as specified in 14652,
    an enormous overhead is involved, along with an obligation on NBs
    to also specify non-ordering information which is irrelevant
    to 14651, but nevertheless required in this CD.

Editorial comments:

The text of this document leaves much to be desired regarding
precision of definition, clarity of presentation and conformance to
ISO Directives Part 3.
The NNI cannot give detailed comments here, nor offer replacement text,
as doing so would require rewriting more than half of the document, for
which we have no resources available. The NNI already gave some directions
with its vote on CD registration, but found almost no improvement in this
CD.

__________________________end of Netherlands comments _______________
_________beginning of USA comments accompanying negative _____________


The US National Body votes to Disapprove ISO/IEC CD 14651 with the following
comments:

These are the U.S. comments for the first CD ballot for ISO/IEC CD 14651,
International String Ordering (SC22 N2364).

No alternative text is supplied as part of this response because a lot of it
would have to be written.  Here are the concerns:


AF-1

The specification of the sorting algorithm must be made independently of a
programming model.

Sorting is a process that is used in an incredible variety of circumstances
and on widely different systems, including object-oriented systems.  Care
should be taken in preparing the normative specifications for CD 14651 that
they are usable independent of a particular programming model, programming
language, or environment.

In particular, the descriptions of the sorting operations should be
expressed in an abstract form, specifying IN, OUT and RETURN parameters but
without a language binding. Also, no parameters needed for the sorting
operation may be presumed to hide in some semi-opaque state; rather, they
should always be specified explicitly in the description of the operation.

If it is desired to show how the standard might be implemented in a POSIX
environment, that could be the subject of an informative annex. Function
bindings for POSIX could assume transparent access to locale data from the
POSIX locale model, if that is desired. The annex would specify how the
proposed POSIX functions make use of the abstract operations defined in the
normative part of the standard, and how their parameters are set either
explicitly or implicitly.


RLG 1:

The body of the standard includes material which belongs in an informative
annex, specifically the "Tutorial on problems solved by this standard."

RLG 2:

The order specified for two Cyrillic characters (p. 95-100 of the CD)
conflicts with the order in Table 2 of ISO/R9 and other sources (cited
below).  The characters in question are these two case pairs:  CYRILLIC
CAPITAL LETTER TSHE/CYRILLIC SMALL LETTER TSHE and CYRILLIC CAPITAL LETTER
DZE/CYRILLIC SMALL LETTER DZE.

Cyrillic letter TSHE:
In the CD, TSHE follows KA WITH HOOK and precedes EL.
In ISO/R9 and other sources, TSHE follows TE and precedes U.

Cyrillic letter DZE:
In the CD, DZE follows KOPPA and precedes CHE.
In ISO/R9 and other sources, DZE follows ZE and precedes I.

Other differences in the order of Cyrillic characters between the CD and
Table 2 of ISO/R9 are either not supported by the other sources or are
arbitrary.

RLG 3:

The order of scripts on p. 31 differs slightly from the order in ISO/IEC
10646.  Specifically:
 - Georgian follows Cyrillic; in ISO/IEC 10646, it follows Tibetan (pDAM-6).
 - Hebrew follows Arabic; in ISO/IEC 10646, it follows Armenian (and
   precedes Arabic).
These differences are not explained.

RLG 4:

Hangul is positioned between Tibetan and Cherokee (i.e., consistent with the
location of Hangul Jamo in ISO/IEC 10646).  There is no explanation as to 
why this position was chosen, rather than that of Hangul Syllables.  Since 
Korean may be written with a mixture of ideographs and Hangul syllables,
the Hangul Syllables position established by pDAM-5, immediately after the
CJK Unified Ideographs, might be preferable.


HP 1

The outline of the document does not follow the well-defined and established
method already used in other JTC1 standards.  For example, the Introduction
is too big, and the reader gets lost and might decide not to continue
reading the document. Usually such information belongs in an informative
annex; otherwise it becomes normative.

HP 2

The structure of the document has the "Scope" clause on page 11.  This
clause should come immediately after a newly written short Introduction
clause.  In addition, this clause needs clarification.  For example, does
it describe the APIs needed by applications to specify character string
ordering?  It is also not clear what is meant by the phrase "full
repertoire of ISO/IEC 10646 (independently of coding)".  The part that is
not clear in that phrase is the one in parentheses.  In addition, the
"Scope" clause talks about a specific default ordering, but it is not
clear where in the CD it is explained how that ordering was derived or
how it is related to the APIs.

HP 3

The "Conformance" clause should immediately follow the "Scope" clause.  It
should be combined with the "Requirements" clause.  It should be rewritten
to make it easy to understand how to conform without having to go through
the syntax and content complexity of the "Requirements" clause.

Conformance is difficult to determine from the document; the document
requires a table of precisely which features are required. Moreover, the
function levels are, in general, independent of the previous level; there
is little reason to force all features of one level before the next higher
is reached.  Post handling is informative, and has no place in
conformance.

HP 4

In the clause "Tailoring Mechanism", it is not at all clear what an
application developer needs to do to override the default ordering that is
specified in Annex 1.

HP 5

Maybe it would be better to have this CD become a Technical Report rather
than a standard, since it allows users to override the proposed default
ordering, and there might be more users overriding the default, with an
undefined and nowhere described mechanism, than users of what the CD
proposes.

HP 6

The dependency on the unpublished standard 14652, Cultural Conventions
Specification, is too high.  Currently, 14652 is still in the CD stage, as
mentioned in clause 2, Normative References, of this CD (14651).

In summary, there is a lot of structural and technical fine tuning that is
necessary to make this document complete.  If such an effort takes too much
time, maybe the industry could be served better if the proposal is modified
for publication as a TR rather than an ISO standard.  This work can later
be converted to an ISO publication when CD 14652, Cultural Conventions
Specification, is accepted and published as an ISO standard.


TG 1

The organization and nomenclature (e.g. COMPCAR) is unnecessarily obscure.
Names should be spelled out completely for clarity.

TG 2

The requirement that the original string be recoverable is unnecessary; many
applications, such as databases, will have a sort key be an alternate field 
in the record. They may only need to have a level 1 sort for their
application.  In that case, storing the original string twice or requiring
internal structure that enables reconstruction is unnecessary and only
increases storage to no purpose.
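
The database scenario can be sketched as follows. The weight table and the
function name level1_key are hypothetical, chosen only to show that a
level 1 key stored as an alternate field need not allow the original
string to be reconstructed.

```python
# Hypothetical level 1 weights: case and accents are deliberately ignored.
LEVEL1 = {"a": 1, "A": 1, "\u00E4": 1, "b": 2, "B": 2}

def level1_key(s):
    """Return a compact level 1 sort key; the input cannot be rebuilt
    from it, because several characters share one weight."""
    return bytes(LEVEL1[ch] for ch in s)

# Each record stores the original string once, plus the key as a
# separate field used only for ordering.
names = ["Ba", "\u00E4b", "AB", "b"]
records = [{"name": n, "sort_key": level1_key(n)} for n in names]
records.sort(key=lambda r: r["sort_key"])
print([r["name"] for r in records])
```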

TG 3

Use of NBSP is in practice an unacceptable overload of its primary function.
Being able to functionally tailor just SPACE and NBSP is in practice not
useful; in general a whole host of similar characters, punctuation and
symbols, behave the same way.

TG 4

The algorithm for comparison must be stated in terms of results, NOT a
specific mechanism.

TG 5

The format in Annex 1 is unnecessarily complex. It is impossible to assess 
and recommend this standard where we cannot clearly determine the result
of the default sorting order rules in this annex.  It forces use of a
whole separate notation for characters. To correct this, characters must
always be referred to by their full 10646 name for clarity, rather than
arbitrary notations such as AYEHS, AIGUT, POINN, QARNP, or many other
examples. Script names should always be the 10646 block name.

TG 6

The equivalences of composed characters vs. composite character sequences,
e.g. a + umlaut and a-umlaut, can be stated much more succinctly.

TG 7

The relative ordering of characters cannot be determined from the character
lists, since they are not even remotely in the resulting order.  To correct
this, the ordering of characters within a script must be presented in the
resulting order as much as possible. Example:

<U0000> IGNORE;IGNORE;IGNORE;<U0000> % NULL
<U2400> IGNORE;IGNORE;IGNORE;<U2400> % SYMBOL FOR NULL
<U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING
<U2401> IGNORE;IGNORE;IGNORE;<U2401> % SYMBOL FOR START OF HEADING
<U0002> IGNORE;IGNORE;IGNORE;<U0002> % START OF TEXT
<U2402> IGNORE;IGNORE;IGNORE;<U2402> % SYMBOL FOR START OF TEXT
<U0003> IGNORE;IGNORE;IGNORE;<U0003> % END OF TEXT
<U2403> IGNORE;IGNORE;IGNORE;<U2403> % SYMBOL FOR END OF TEXT
...
The fourth column (in this case) determines the final ordering of the
characters, which is NOT the order presented. It must be presented as:

<U0000> IGNORE;IGNORE;IGNORE;<U0000> % NULL
<U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING
<U0002> IGNORE;IGNORE;IGNORE;<U0002> % START OF TEXT
<U0003> IGNORE;IGNORE;IGNORE;<U0003> % END OF TEXT
...
<U2400> IGNORE;IGNORE;IGNORE;<U2400> % SYMBOL FOR NULL
<U2401> IGNORE;IGNORE;IGNORE;<U2401> % SYMBOL FOR START OF HEADING
<U2402> IGNORE;IGNORE;IGNORE;<U2402> % SYMBOL FOR START OF TEXT
<U2403> IGNORE;IGNORE;IGNORE;<U2403> % SYMBOL FOR END OF TEXT
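
The reordering asked for here is mechanical. The following sketch parses
lines in the format of the example and sorts them by their fourth-level
weight; it assumes, as in the example, that the fourth-level weights order
by code point value, and the names ENTRY and fourth_level are hypothetical.

```python
import re

LINES = """\
<U0000> IGNORE;IGNORE;IGNORE;<U0000> % NULL
<U2400> IGNORE;IGNORE;IGNORE;<U2400> % SYMBOL FOR NULL
<U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING
<U2401> IGNORE;IGNORE;IGNORE;<U2401> % SYMBOL FOR START OF HEADING
""".splitlines()

# One entry: character, four semicolon-separated weights, comment.
ENTRY = re.compile(r"<U([0-9A-F]{4})> (\S+) % .*")

def fourth_level(line):
    """Return the numeric level 4 weight of one table line."""
    weights = ENTRY.fullmatch(line).group(2).split(";")
    return int(weights[3].strip("<U>"), 16)

# Present the entries in the order that sorting would produce.
ordered = sorted(LINES, key=fourth_level)
print("\n".join(ordered))
```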

TG 8

The Annex also does not make clear that the vast majority of its characters
are sorted in character code order. This requires the reader to visually
inspect every line to no purpose. These should be replaced by one statement:
"Except where otherwise noted, all symbols are sorted as:

<Uxxxx> IGNORE;IGNORE;IGNORE;<Uxxxx>"
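
As a rough sketch of how such a default rule could work in a table
generator, the Python below produces the default entry for any code point
and consults an explicit entry only where one has been noted. The function
names and the shape of the exceptions table are assumptions made for this
illustration; the CD defines nothing of the kind.

```python
def default_entry(code_point):
    """Build the default entry: ignored at levels 1-3, itself at level 4."""
    symbol = f"<U{code_point:04X}>"
    return f"{symbol} IGNORE;IGNORE;IGNORE;{symbol}"


def entry_for(code_point, exceptions):
    """Return the explicitly noted entry if there is one, else the default.

    `exceptions` is a hypothetical mapping from code point to entry text.
    """
    return exceptions.get(code_point, default_entry(code_point))


# With no exceptions noted, every symbol falls back to the single rule.
for cp in (0x0000, 0x0001, 0x2400):
    print(entry_for(cp, {}))
```

Only the entries that depart from the rule would then need to be printed
in the annex.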

TG 9

Annex 2
List #1 is superfluous. The statement should be that the words in List #2,
in any initial order, will result in List #2 when sorted.

______________________ end of USA comments __________________________
_________beginning of Israel comments accompanying negative vote _________

THE STANDARDS INSTITUTE OF ISRAEL (SII)

Comments on ISO/IEC CD 14651 (ISO/IEC JTC 1/SC22/WG20 N471en)

The SII votes NO on CD 14651.  If items 1, 2 and 3 were to be accepted,
our vote would become YES.

1.  Hebrew Accents

The Hebrew accents (U0591 to U05AF), Meteg (U05BD) and Upper Dot (U05C4)
do not participate in the string ordering process.  They relate, in fact,
to the whole word, rather than to the letter to which they are attached,
and are never used in the lexicographic order or in any other ordering of
Hebrew texts.

-  The Hebrew accents should be removed from the list of collating
symbols, page 35, and from page 45.

-  On page 56 they should all be defined as:

         -  IGNORE; IGNORE; IGNORE; IGNORE;
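
The effect of defining these characters as IGNORE at all four levels can
be sketched in Python as follows. This is a simplified illustration, not
the method of the CD: the names are invented here, and dropping the
characters before comparison stands in for a full multi-level comparison.

```python
# Code points named in item 1: accents U0591 to U05AF, Meteg, Upper Dot.
IGNORED = set(range(0x0591, 0x05B0)) | {0x05BD, 0x05C4}


def without_ignored(text):
    """Drop the characters that take no part in string ordering."""
    return "".join(ch for ch in text if ord(ch) not in IGNORED)


def same_for_ordering(left, right):
    """True if the two strings differ only in ignored characters."""
    return without_ignored(left) == without_ignored(right)


# ALEF + accent U0591 + BET compared with plain ALEF + BET.
accented = "\u05D0\u0591\u05D1"
plain = "\u05D0\u05D1"
print(same_for_ordering(accented, plain))
```

The Hebrew points (for example U05B0) are left untouched, since item 1
asks only for the accents, Meteg and Upper Dot to be ignored.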


2.  Composite characters and combining characters.

It seems that combining characters do not sort and compare as equivalent
to their precomposed encoding.  For instance, the two strings "Gu:nther"
and "Gu:nther", the first coded with U00FC, the second with U0075 followed
by U0308, are equivalent and should not be distinguished but are not
equivalent in the CD.  The particular coding used is an artifact, possibly
not under the control of the user, and is normally meaningless.
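
One way to make the two codings compare as equivalent is to bring both
strings to a common canonical form before comparing them. The Python
sketch below uses Unicode normalization from the standard unicodedata
module; this technique is brought in here for illustration and is not
something the CD or this comment prescribes. The function name is
likewise an assumption.

```python
import unicodedata


def canonical_form(text):
    """Decompose precomposed characters into base + combining sequences."""
    return unicodedata.normalize("NFD", text)


precomposed = "G\u00FCnther"   # u-umlaut coded as U00FC
combining = "Gu\u0308nther"    # u coded as U0075 followed by U0308

# The raw codings differ, but their canonical forms are identical.
print(precomposed == combining)
print(canonical_form(precomposed) == canonical_form(combining))
```

A comparison method that works on the canonical form no longer depends on
which coding the user's software happened to produce.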

3.  Introduction, page 6, last paragraph:  "If two equivalent strings are
not absolutely identical, then the tie must be broken."

This sentence is not acceptable.  If two strings are equivalent they
should be treated as such.  An example is Hebrew strings that are
equivalent but carry different accents.

4.  Introduction, page 4 (Editorial):

The introduction begins with a negative statement and continues with a
criticism of past practices.  The SII suggests that it would be preferable
to begin with a positive statement describing what the standard is and what
its benefits are.

5.  Tutorial, page 7 (Editorial).

The tutorial would be better placed in an informative appendix.

6.  Page 35 (Editorial).

The comment should read qubuts (the s is missing).

________________end of Israel comments; end of document SC22 N2466 ____