Fundamental C - Simple Strings
Written by Harry Fairhead   
Sunday, 08 December 2019

String Handling Functions

C, the core language, has minimal string support but the standard library has extensive string functions. It is important to know that the majority of the standard functions, all those with names starting with str, work with null-terminated strings. If the string or strings that they accept are not null-terminated then you will almost certainly encounter an array overflow.

Before getting on to the string functions it is worth pointing out that as a C string is just a char array you can directly access any character by indexing. That is:

char myString[10]="abc";
myString[2]='X';

changes c to X.

You can also add to a string if there is enough space, but remember to fix up the null terminator:

char myString[10]="abc";
myString[3]='X';
myString[4]='\0';

To make use of the string functions you have to add:

#include <string.h>

One of the first functions to discover is the strlen function which returns the length of a string.

For example:

char myString[10]="abc";
printf("%zu",strlen(myString));

displays 3 as the length of the string even though the length of the char array is 10. If you are used to other languages and know a little about how they work, you might expect strlen to retrieve a stored value that gives the string length, but as already explained C strings are null-terminated. What strlen does is scan the string to find the first null, counting the non-nulls on the way.
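The difference between the array size and the string length can be put to use. The following is a minimal sketch, not a standard function: free_chars is a hypothetical helper that assumes the array already holds a null-terminated string and reports how many more characters would fit while still leaving room for the terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of characters that can still be added to
   the string in buf, whose total array size is buf_size, while leaving
   room for the null terminator. Assumes buf is null-terminated. */
size_t free_chars(const char *buf, size_t buf_size)
{
    size_t len = strlen(buf);       /* counts up to the first '\0' */
    if (len + 1 >= buf_size) {
        return 0;                   /* array is already full */
    }
    return buf_size - len - 1;      /* keep one element for '\0' */
}
```

For char myString[10]="abc", sizeof myString is 10, strlen(myString) is 3 and free_chars(myString, sizeof myString) is 6.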

You can write your own strlen function quite easily:

/* Returns size_t, the same type that strlen returns */
size_t myStrLen(const char string[]){
    size_t i=0;
    while(string[i]!='\0'){
        i++;
    }
    return i;
}

This isn’t the most compact way to write strlen but it does make it easy to understand.

What happens if you pass strlen a string that isn’t null-terminated?

The same as happens for all str functions: the loop keeps going until it hits a memory location that contains a zero by chance. If you are lucky this will just give you an incorrect result. If you are less lucky it will try to read memory that doesn't belong to your program and cause a crash.
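One way to guard against a missing terminator is to stop scanning at the end of the array. This is only a sketch: bounded_len is a hypothetical name, and it assumes you know the size of the array that holds the characters. POSIX systems provide strnlen for the same job, but it is not part of the C standard library.

```c
#include <stddef.h>

/* Hypothetical helper: like strlen, but never reads more than max_len
   elements. Returns max_len if no null terminator is found. */
size_t bounded_len(const char *s, size_t max_len)
{
    size_t i = 0;
    while (i < max_len && s[i] != '\0') {
        i++;
    }
    return i;
}
```

If the result equals the array size then no terminator was found, and the array should not be passed to any of the str functions.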

Other commonly used string functions are:

  • strcat - concatenate two strings

  • strchr - find the first occurrence of a character in a string

  • strcmp - compare two strings

  • strcpy - copy a string

Using these is generally obvious.

Two examples should help.

To concatenate two strings you need to make sure that the first string has enough storage available to store both strings:

char myString1[10] = "abc";
char myString2[] = "def";
strcat(myString1, myString2);

You can see that myString1 has storage for 9 characters plus a null terminator and so it has space for the extra three letters. The three characters in myString2 are copied into myString1 starting at the null and myString2’s own null finishes the string. Notice that there is a lot that can go wrong with this: no check is made that there is sufficient space or that the strings are null-terminated.
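Since strcat does no checking, you can do the space check yourself before calling it. A minimal sketch, assuming both strings are null-terminated and that you know the size of the destination array; checked_concat is a hypothetical helper, not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dest only if the combined string
   and its null terminator fit in dest_size elements.
   Returns 1 on success, 0 if there is not enough space. */
int checked_concat(char *dest, size_t dest_size, const char *src)
{
    size_t dest_len = strlen(dest);
    size_t src_len = strlen(src);
    if (dest_len + src_len + 1 > dest_size) {
        return 0;                  /* would overflow, dest is unchanged */
    }
    strcat(dest, src);
    return 1;
}
```

With myString1 and myString2 as above, checked_concat(myString1, sizeof myString1, myString2) returns 1 and leaves "abcdef" in myString1. Trying to append a further four characters returns 0 because the result would need 11 elements.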

The strchr function will find the position of a single character in a string:

char myString1[10] = "abc";
char* position=strchr(myString1,'c');

This sets position to point at the first occurrence of c in myString1 or a null pointer if c isn’t in the string. Notice that this is not the array index of the element but a pointer to it. That is, position can be regarded as the substring of myString1 starting at the first occurrence of c including all the characters to the end and the null. Also notice that this is not a copy of the string. Again, if the string isn’t null-terminated this will overrun the char array.
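If you do want the array index rather than a pointer, you can subtract the start of the string from the pointer that strchr returns. The following sketch uses a hypothetical helper, index_of, and chooses -1 to signal that the character was not found.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of c in s,
   or -1 if c does not occur. Assumes s is null-terminated. */
long index_of(const char *s, char c)
{
    const char *position = strchr(s, c);
    if (position == NULL) {
        return -1;                 /* always test before using the pointer */
    }
    return (long)(position - s);   /* pointer difference gives the index */
}
```

For the string "abc", index_of(myString1, 'c') is 2 and index_of(myString1, 'z') is -1.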

A very common use of string functions is to assign to a string. As mentioned earlier, you cannot use an idiom like:

char myString[10]="abc";
myString="def";

to assign to a string variable. You have to use the strcpy function:

char myString[10]="abc";
strcpy(myString,"def");

Notice that this only works if the destination has enough space to store the new string and its null terminator. If the source is a general string and not a string literal you also have the potential for an array overrun if it isn’t null-terminated.
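The space requirement can be tested before the copy is made. This is a sketch under the assumption that the source is null-terminated and the destination size is known; checked_copy is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only if src and its null
   terminator fit in dest_size elements.
   Returns 1 on success, 0 if src is too long. */
int checked_copy(char *dest, size_t dest_size, const char *src)
{
    if (strlen(src) + 1 > dest_size) {
        return 0;                  /* too long, dest is unchanged */
    }
    strcpy(dest, src);
    return 1;
}
```

Notice that this still relies on strlen, so it doesn't help if the source lacks a null terminator.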

You can see the general pattern.

C strings are scanned or transferred from one to the other using loops that halt on finding either the target or the null terminator. As long as you are sure that the strings are null-terminated, i.e. they are strings your program created, then this is usually not a problem. The problems start when your program accepts input from another source: a user, say, or data via an input port or network.

In such cases it is better to use the related functions, the safe string functions, which start with the prefix strn, where the n signifies an additional parameter giving an upper bound on the number of chars to process.

For example:

char myString1[10] = "abc";
char myString2[] = "def";
strncat(myString1, myString2,3);

This is the same as the previous strcat example and the result is the same, i.e. “abcdef” followed by a null terminator. The important difference is that at most 3 chars will be copied from myString2 even if it isn’t null-terminated. Notice that the destination, myString1 in this case, is always null-terminated and so it needs to have space for n+1 characters from myString2. The only thing that can go wrong with this function call is if myString1 doesn’t have enough space, and as you know it has to have at least n+1 free elements you should be able to ensure this.
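Rather than hard-coding the limit, you can work it out from the free space in the destination. A minimal sketch, assuming the destination already holds a null-terminated string and its array size is known; bounded_concat is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends as much of src as fits in dest, whose
   total array size is dest_size. The result is always null-terminated. */
void bounded_concat(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    if (used + 1 >= dest_size) {
        return;                    /* no free element besides the '\0' */
    }
    /* strncat appends at most n chars and then adds a terminator,
       so n must leave one element free for it. */
    strncat(dest, src, dest_size - used - 1);
}
```

For example, if the destination is a six-element array holding "abc" and the source is "defgh", the result is "abcde": the source is silently truncated rather than overflowing the array.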

If you are trying to assign one string to another, don’t use strcpy, which is fine for assigning a string literal. Instead use strncpy:

char myString1[10]="abc";
char myString2[10]="def";
strncpy(myString1,myString2,10);

While there is no danger of array overrun, it is possible that myString1 will not have a null terminator. This happens if myString2 doesn’t have one, or if its null lies beyond the 10 char limit.
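A common way to deal with the missing terminator is to copy one char fewer than the destination holds and then write the null yourself. This is a sketch of that idiom with a hypothetical helper name, bounded_copy, assuming the destination size is known.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dest_size - 1 chars from src and
   always null-terminates dest. Long sources are truncated. */
void bounded_copy(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return;                        /* nowhere to put a terminator */
    }
    strncpy(dest, src, dest_size - 1); /* may leave dest unterminated */
    dest[dest_size - 1] = '\0';        /* so terminate it explicitly */
}
```

Copying "abcdef" into a four-element array this way leaves "abc" followed by a null terminator.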

The other strn functions work in the same way and you can use them to avoid string overflow.



Last Updated ( Monday, 09 December 2019 )