WG15 Defect Report Ref: 9945-2-25
Topic: tr


This is an approved interpretation of 9945-2:1993.

Last update: 1997-05-20


								9945-2-25

	Class: Defect situation

The standard states what it states, and conforming implementations
must conform to this. However, concerns have been raised about this
which are being referred to the Sponsors of the standard for consideration as
a future amendment.



	Topic:			tr
	Relevant Sections:	4.64.3, 4.64.7


Defect Report:
--------------
 
          In Section 4.64.3 - Options {of  tr},  the  standard  states 
          that the -c option means ``complement the set of  characters 
          specified by string1.  See 4.64.7.''  [Draft 12 of ISO/IEC 
          9945-2:1993 (July 1992), p. 482, line 10472], and in Section 
          4.64.7 - Extended Description {of tr}, the  standard  states 
          that  [:class:]   ``[r]epresents  the  range  of   collating 
          elements between the range endpoints, inclusive, as  defined 
          by the current setting of the LC_COLLATE locale  category.'' 
          [Ibid., p. 484, lines 10544-10546] 
 
          In Section  4.64.7  -  Extended  Description  {of  tr},  the 
          standard states that 
 
               [i]f the -c option is specified, the complement of 
               the characters specified by string1-the set of all 
               characters  in  the  current  character  set,   as 
               defined by the setting  of  LC_CTYPE,  except  for 
               those actually specified in the  string1  operand- 
               shall  be  placed  in  the  array   in   ascending 
               collation sequence,  as  defined  by  the  current 
               setting of LC_COLLATE. 
 
          [Ibid., p. 485, lines 10590-10594] 
 
          However, if the character set is ISO 646, for example,  then 
          the command 
 
               tr -c '[:print:]' '?' 
 
          which  should  translate  all  unprintable   characters   to 
          question-mark characters, will pass bytes with the high  bit 
          set  through  unchanged.   This  is   clearly   wrong,   not 
          historical practice, and violates  the  principle  of  least 
          astonishment. 
 
          The tr  utility  is  a  binary  file  manipulator.   May  we 
          interpret the wording of lines 10590-10594 as 
 
               If the -c option is specified, the  complement  of 
               the characters specified by string1-the set of all 
               possible machine byte patterns, except  for  those 
               actually specified in the string1 operand-shall be 
               placed  in  the  array  in   ascending   collation 
               sequence, as defined by  the  current  setting  of 
               LC_COLLATE. 
 
          to fix this problem? 
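
The difference between the two readings of "complement" can be made
concrete with a small model. The sketch below is illustrative only and is
not taken from the standard or from any tr implementation. It assumes that
the current character set is ISO 646 with codes 0 through 127, that
[:print:] covers codes 0x20 through 0x7E, and that ascending collation
order equals ascending code order. The names CHARSET, ALL_BYTES, PRINT,
complement and tr_c are hypothetical.

```python
# Illustrative model only; not how any real tr utility is implemented.
# Assumptions: the current character set is ISO 646 (codes 0..127),
# [:print:] is 0x20..0x7E, and collation order equals code order.

CHARSET = range(0x00, 0x80)      # characters defined by the character set
ALL_BYTES = range(0x00, 0x100)   # every possible 8-bit byte pattern
PRINT = set(range(0x20, 0x7F))   # assumed members of [:print:]


def complement(universe, string1):
    """Return the members of universe that are not in string1, ascending."""
    return [c for c in universe if c not in string1]


def tr_c(data, universe, string1, replacement):
    """Model `tr -c string1 replacement` for a single replacement byte."""
    targets = set(complement(universe, string1))
    return bytes(replacement if b in targets else b for b in data)


# Two printable bytes, BEL, a byte with the high bit set, and a newline.
sample = b"ok\x07\xe9\n"

# Reading required by the standard: complement within the character set.
as_specified = tr_c(sample, CHARSET, PRINT, ord("?"))

# Reading proposed in this report: complement within all byte patterns.
as_proposed = tr_c(sample, ALL_BYTES, PRINT, ord("?"))

print(as_specified)  # b'ok?\xe9?'
print(as_proposed)   # b'ok???'
```

Under these assumptions the byte 0xE9 is not a member of the character
set, so it is absent from the complement and passes through unchanged
under the first reading. Only the second reading translates it.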

WG15 response for 9945-2:1993
-----------------------------


The standard is clear in its requirement that the -c option in this case
will place the complement of the characters specified in string1 in the
array.  

It is also clear that the definition of the complement is "the
set of all current characters in the current character set, except for
those actually specified in string1".  This precludes an interpretation
allowing the set of all possible machine byte patterns to be added to
the complement set. 

The implementation must follow these requirements.  Concern over the
wording of this area of this standard has been forwarded to the
sponsors.

Rationale for Interpretation:
-----------------------------
None.

 _____________________________________________________________________________