Document number: N2352
Submitter: Martin Sebor
Submission Date: March 18, 2019
Subject: Add stpcpy and stpncpy to C2X

Summary

As discussed in N2349 - Toward more efficient string copying and concatenation, the string concatenation and copying functions specified in the <string.h> header, namely strcat, strncat, strcpy and strncpy, are difficult and sometimes impossible to use with optimal efficiency. Optimally efficient string concatenation is a linear operation that reads each string exactly once. However, because the functions return a pointer to the first destination character rather than to the end of the copied string, chains of calls that concatenate multiple strings have quadratic complexity.

Specifically, the optimal complexity of the concatenation into the array d of N strings, S1 through SN with lengths L1 through LN is:

O (L1 + L2 + L3 + ... + LN)

but the complexity of a chain of calls

	strcat (... (strcat (strcpy (d, S1), S2), ... ), SN)

approaches quadratic because each subsequent strcat call must first traverse all the characters copied by the calls before it:

O (N × L1 + (N − 1) × L2 + (N − 2) × L3 + ... + LN)
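
To make the two expressions concrete, the following minimal sketch counts the characters each approach touches, ignoring terminating nulls. The function names chained_strcat_cost, linear_cost and cost_example are illustrative and not part of the proposal; the model assumes that each strcat call scans the whole destination before copying.

```c
#include <assert.h>
#include <stddef.h>

/* Characters touched by strcat (... (strcat (strcpy (d, S1), S2), ... ), SN),
   where len[i] is the length of the (i + 1)-th string. */
size_t chained_strcat_cost(const size_t len[], size_t n)
{
    size_t cost = 0;
    size_t in_dest = 0;      /* characters already stored in d */
    for (size_t i = 0; i < n; ++i) {
        cost += in_dest;     /* strcat rescans d to find its end (0 for the initial strcpy) */
        cost += len[i];      /* then copies the new string */
        in_dest += len[i];
    }
    return cost;
}

/* Characters touched when each call resumes at the end of the previous copy. */
size_t linear_cost(const size_t len[], size_t n)
{
    size_t cost = 0;
    for (size_t i = 0; i < n; ++i)
        cost += len[i];
    return cost;
}

/* Usage: three strings of lengths 3, 5 and 2. */
void cost_example(void)
{
    const size_t len[] = { 3, 5, 2 };
    assert(linear_cost(len, 3) == 10);           /* 3 + 5 + 2 */
    assert(chained_strcat_cost(len, 3) == 21);   /* 3*3 + 2*5 + 1*2 */
}
```

For N strings of equal length L the chained cost is L × N × (N + 1) / 2, which grows quadratically in N, while the linear cost is L × N.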

As N2349 mentions, a number of superior solutions have emerged since the introduction of the functions into C in 1989. Among those are the POSIX functions stpcpy and stpncpy. The POSIX functions were first introduced in The Open Group Technical Standard, 2006, Extended API Set Part 1. They are no more difficult to implement efficiently than the historical C standard functions. In fact, an efficient implementation of the historical functions is trivial in terms of the proposed ones, so implementations with highly tuned forms of strcpy and strncpy may simply rename them to the new names and have the traditional functions delegate to them, as shown below.

	char* stpcpy (char *dst, const char *src)
	{
	  char *end = dst;
	  // highly tuned code leaves end pointing to the terminating null
	  …
	  return end;
	}

	char* strcpy (char *dst, const char *src)
	{
	  stpcpy (dst, src);
	  return dst;
	}

This is a proposal to add these functions to C2X. The functions already see moderate use in existing code alongside their traditional equivalents, strcat, strncat, strcpy and strncpy. The latest GCC source tree contains 86 calls to the proposed functions vs 1254 to the traditional ones, the shared Binutils/GDB tree 112 vs 2093 calls, and GNU Elfutils 88 vs 31 calls.

Why No stpcat and stpncat?

Because stpcpy and stpncpy return a pointer to the end of the copy, there is no need for corresponding concatenation functions: they would be completely superseded by stpcpy and stpncpy. Appending one string to another to form the concatenation of the two is accomplished by chaining calls to the proposed functions. The following

	strcat (strcpy (d, s1), s2);
is equivalent, except for the value returned, to the more efficient
	stpcpy (stpcpy (d, s1), s2);
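
The same chaining extends to any number of strings. The following is a minimal sketch rather than part of the proposal: sketch_stpcpy is a hypothetical stand-in with the proposed semantics, used so that the example does not depend on the host library, and join_strings and join_example are illustrative names. It assumes the destination is large enough and does not overlap any of the sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the proposed stpcpy: copies src, including the
   terminating null, and returns a pointer to that null in dst. */
char *sketch_stpcpy(char *restrict dst, const char *restrict src)
{
    while ((*dst = *src) != '\0') {
        ++dst;
        ++src;
    }
    return dst;
}

/* Concatenates the n strings in strs into dst and returns a pointer to the
   terminating null. Each call resumes where the previous one stopped, so no
   character already stored in dst is scanned again. */
char *join_strings(char *dst, const char *const strs[], size_t n)
{
    char *end = dst;
    *end = '\0';                 /* yields an empty string when n is 0 */
    for (size_t i = 0; i < n; ++i)
        end = sketch_stpcpy(end, strs[i]);
    return end;
}

/* Usage: join three strings and check the result and the end pointer. */
void join_example(void)
{
    const char *const parts[] = { "foo", "bar", "baz" };
    char d[16];
    char *end = join_strings(d, parts, 3);
    assert(strcmp(d, "foobarbaz") == 0);
    assert(end == d + 9);
}
```

The returned pointer also gives the length of the result as end - d, without a further call to strlen.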
      


Suggested Change

Add the following subsection just after §7.24.2.3 The strcpy function.

7.24.2.? The stpcpy function

Synopsis
	#include <string.h>

	char* stpcpy(char * restrict s1, const char * restrict s2);
Description

The stpcpy function copies the string pointed to by s2 into the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

Returns

The stpcpy function returns a pointer to the terminating null character copied into the array pointed to by s1.
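
As an illustration only, and not as part of the suggested wording, the sketch below shows one way to read the Returns clause: the result equals s1 + strlen(s2), so the length of the copy is available by pointer subtraction. The names ref_stpcpy, copy_and_measure and stpcpy_return_example are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference version of the behavior described above. */
char *ref_stpcpy(char *restrict s1, const char *restrict s2)
{
    while ((*s1 = *s2) != '\0') {
        ++s1;
        ++s2;
    }
    return s1;   /* points to the terminating null copied into the destination */
}

/* Copies src into dst and reports the length of the copy, which follows
   from the returned pointer without a separate strlen call. */
size_t copy_and_measure(char *restrict dst, const char *restrict src)
{
    return (size_t)(ref_stpcpy(dst, src) - dst);
}

/* Usage: the returned length indexes the terminating null. */
void stpcpy_return_example(void)
{
    char buf[16];
    size_t n = copy_and_measure(buf, "hello");
    assert(n == 5);
    assert(buf[n] == '\0');
    assert(strcmp(buf, "hello") == 0);
}
```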

Furthermore, add the following subsection just after §7.24.2.4 The strncpy function.

7.24.2.? The stpncpy function

Synopsis
	#include <string.h>

	char* stpncpy(char * restrict s1, const char * restrict s2, size_t n);
Description

The stpncpy function copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

If the array pointed to by s2 is a string that is shorter than n characters, null characters are appended to the copy in the array pointed to by s1, until n characters in all have been written.

Returns

If a null character is written to the destination, the stpncpy function returns the address of the first such null character. Otherwise, it returns &s1[n].
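
As an illustration only, and not as part of the suggested wording, the sketch below exercises both cases of the Returns clause. ref_stpncpy is a hypothetical reference version written directly from the description above, and stpncpy_example is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference version of the behavior described above. */
char *ref_stpncpy(char *restrict s1, const char *restrict s2, size_t n)
{
    size_t i = 0;
    /* Copy at most n characters, stopping at the null character in s2. */
    while (i < n && s2[i] != '\0') {
        s1[i] = s2[i];
        ++i;
    }
    char *end = s1 + i;      /* first null written, or &s1[n] if none is */
    /* Pad with null characters until n characters have been written. */
    for (size_t j = i; j < n; ++j)
        s1[j] = '\0';
    return end;
}

/* Usage: a source shorter than n, then a source at least n characters long. */
void stpncpy_example(void)
{
    char a[8];
    memset(a, 'x', sizeof a);
    char *end = ref_stpncpy(a, "ab", 5);
    assert(end == a + 2);                  /* address of the first null written */
    assert(a[2] == '\0' && a[3] == '\0' && a[4] == '\0');
    assert(a[5] == 'x');                   /* nothing written past n characters */

    char b[4];
    memset(b, 'x', sizeof b);
    end = ref_stpncpy(b, "abcdef", 3);
    assert(end == b + 3);                  /* &s1[n]: no null was written */
    assert(memcmp(b, "abc", 3) == 0);
    assert(b[3] == 'x');
}
```

In the second case the destination is not null-terminated, so comparing the result with s1 + n is how a caller can detect truncation.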