Fundamental C - Random Access Files
Written by Harry Fairhead   
Monday, 21 February 2022

This extract, from my book on programming C in an IoT context, explains how easy random access files are in C and how simple it is to implement a basic database.

Fundamental C: Getting Closer To The Machine

Now available as a paperback and ebook from Amazon.

  1. About C
      Extract Dependent v Independent & Undefined Behavior
  2. Getting Started With C Using NetBeans
  3. Control Structures and Data
  4. Variables
      Extract Variables
  5. Arithmetic  and Representation
      Extract Arithmetic and Representation
  6. Operators and Expression
      Extract: Expressions
      Extract Side Effects, Sequence Points And Lazy Evaluation
      First Draft of Chapter: Low Down Data
  7. Functions Scope and Lifetime
  8. Arrays
      Extract  Simple Arrays
      Extract  Enumerations
  9. Strings
      Extract  Simple Strings
      Extract  String I/O
  10. Pointers
      Extract  Starting Pointers
      Extract  Pointers, Cast & Type Punning
  11. Structs
      Extract Basic Structs
      Extract Typedef
  12. Bit Manipulation
      Extract Basic Bits
      Extract Shifts And Rotates 
  13. Files
      Extract Files
      Extract Random Access Files
  14. Compiling C – Preprocessor, Compiler, Linker
     Extract Compilation & Preprocessor

Also see the companion volume: Applying C


File handling is a difficult area because providing files and file access is the responsibility of the operating system, so it cannot be truly platform-independent. However, C does provide some file handling functions in its standard library and these are where you start, no matter what the machine or operating system.

The whole subject of file handling and file I/O is a very big one and there are many functions that are involved in format conversion and so on that we haven’t the space to cover. Once you have seen the basic and most commonly used functions, then the remainder are fairly easy to understand.

There is also a second set of file handling functions found on Linux and Unix systems, based on the file descriptor, which is covered in Applying C For The IoT With Linux.

In Chapter But Not In This Extract

  • The File Idea
  • Basic Files
  • Text Mode
  • Binary Files
  • Structs as Records
  • Buffering
  • Character I/O

Positioning Functions

When you open a file the reading or writing position is at the start of the file. As you read or write, the file position moves on to the next byte to be read or written. This is what a sequential file is all about – reading and writing one byte after another. Originally files were stored on magnetic tape and you worked from the start to the end of the tape. If you wanted to reread the tape, or to read what you had just written, you had to rewind it. You can also rewind a C file using the function:

rewind(fptr);

this moves the file position pointer to the start of the file.

Not all streams can be rewound or positioned. Usually files that are stored on disk or similar allow positioning and these are generally referred to as random access files.

More generally you can move to any position in the file using:

fseek(fptr,offset,whence);

where offset is the number of bytes you want to move the pointer by, given as a long, and whence is the location that the offset is measured from. If whence is SEEK_SET then offset is from the start of the file, if it is SEEK_CUR it is from the current location and if it is SEEK_END it is from the end of the file. The function returns 0 if the seek worked and a non-zero value if it failed.

You can find out where the position pointer is using:

ftell(fptr);

which returns the position as a long, or -1L if the stream doesn’t have a position.
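As a minimal sketch of how fseek and ftell work together, the following function finds the size of an open binary file and then puts the position back where it was. The name stream_size is made up for this example, and it assumes a stream on which SEEK_END is meaningful, which is the usual case for disk files but is not guaranteed by the C standard.

```c
#include <stdio.h>

/* Hypothetical helper, not a standard library function.
   Returns the size in bytes of an open binary stream, or -1L on
   failure, and restores the position the stream had on entry. */
long stream_size(FILE *fptr) {
    long start = ftell(fptr);              /* remember where we are */
    if (start == -1L) return -1L;          /* stream has no position */
    if (fseek(fptr, 0L, SEEK_END) != 0) return -1L;
    long size = ftell(fptr);               /* offset of the end == size */
    if (fseek(fptr, start, SEEK_SET) != 0) return -1L;
    return size;
}
```

If ten bytes have been written to the file then stream_size(fptr) should return 10, and a call to ftell(fptr) afterwards should report the same position as before the call.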

On POSIX systems ftell and fseek work for both binary and text files. On non-POSIX systems they may only be trustworthy for binary files. For text files the most that you can rely on is using ftell to retrieve a position and then passing that value to fseek, with SEEK_SET, to move back to that position.
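The portable pattern for text files, saving a position with ftell and returning to it with fseek, might look something like this. The function name reread_line and the 64-character line buffers are assumptions made for the sake of the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the line at the current position, goes
   back to where it started using the value ftell returned, and reads
   the line again. Returns 1 if both reads gave the same text. */
int reread_line(FILE *fptr) {
    char first[64], second[64];
    long mark = ftell(fptr);                      /* save the position */
    if (mark == -1L) return 0;
    if (fgets(first, sizeof first, fptr) == NULL) return 0;
    if (fseek(fptr, mark, SEEK_SET) != 0) return 0;   /* go back */
    if (fgets(second, sizeof second, fptr) == NULL) return 0;
    return strcmp(first, second) == 0;
}
```

Notice that the only offset passed to fseek is one that ftell supplied, which is the combination that should work even where text file positions are not simple byte counts.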

End Of File Errors

If you try to read beyond the current end of a file then the file functions return an EOF value, which is usually -1. The problem is that they also return EOF for file errors. To test for a true EOF you need to use:

feof(fptr);

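which returns a non-zero value if the end-of-file indicator is set. The indicator is only set when a read has already tried to go beyond the end of the file. Simply being positioned at the end of the file, for example after reading the final byte, is not enough.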

You can also use:

ferror(fptr); 

which returns a non-zero value for a general file error.

Note that seeking beyond the end of the file with fseek does not trigger an EOF. In fact a successful seek clears the end-of-file indicator if it has been set.
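To see how feof and ferror are used to work out why a read stopped, consider this sketch which counts the bytes remaining in a file. The function count_to_end is invented for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes from the current position to
   the end of the stream. Returns the count, or -1 on a read error. */
long count_to_end(FILE *fptr) {
    long n = 0;
    while (fgetc(fptr) != EOF) {   /* EOF means end of file OR error */
        n++;
    }
    if (ferror(fptr)) return -1;   /* the EOF was caused by an error */
    /* at this point feof(fptr) is non-zero: a genuine end of file */
    return n;
}
```

After the function returns a count, feof(fptr) is non-zero because the final fgetc tried to read past the end. A subsequent successful fseek(fptr, 0L, SEEK_SET) resets it to zero.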

Random Access

What you can do with a file when you are using fseek depends on how the file was opened.

If the file was opened exclusively for reading then you can move around the file, successfully reading whatever bytes you care to and interpreting them in whatever way you want to. You can also re-read any part of the file as often as you need to. It is also obvious that the set position has to be between the start and the end of the file.

If the file was opened exclusively for writing then you can move to a position that has already been written or to the end of the file where new data can be added to extend the file. On POSIX-compliant systems you can also move beyond the EOF and write data there. The data between the end of file and the newly written data will be read as zeros. This is not part of any C standard which instead specifies that you can rewrite data as many times as you like, but that you can only add new data to the end of the file.
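The POSIX behavior can be tried out with a sketch like the one below, which writes one byte, seeks four bytes beyond the end of the file and writes another. The function name write_with_gap is made up, and the result relies on POSIX semantics rather than anything the C standard promises.

```c
#include <stdio.h>

/* Sketch only: depends on POSIX behaviour, not on the C standard.
   Writes 'A', skips 4 bytes past the end of file, then writes 'B'.
   Returns 0 on success and a non-zero value on failure. */
int write_with_gap(FILE *fptr) {
    if (fputc('A', fptr) == EOF) return -1;
    if (fseek(fptr, 4L, SEEK_END) != 0) return -1;   /* beyond EOF */
    if (fputc('B', fptr) == EOF) return -1;
    return fflush(fptr);
}
```

On a POSIX system, starting from an empty file, this should leave a six-byte file in which the four bytes between the A and the B read back as zeros.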

Things get a tiny bit more complicated if the file is open for both reading and writing. In this case you can still rewrite existing data and add new data to the end of the file. If the system is POSIX-compliant then you can also fseek beyond the end of the file. However, if you switch between reading and writing you have to take account of the effects of buffering. Essentially, if you move from writing to reading you have to make sure that the buffer is flushed. The C99 standard says that if you follow a write by a read then there has to be an intervening call to fflush, fseek, fsetpos or rewind. Switching from reading to writing is easier, but you still need a call to fseek, fsetpos or rewind first unless the read encountered the end-of-file. If you do immediately follow a write by a read, or a read by a write, then you trigger undefined behavior.
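The rules for switching between reading and writing can be illustrated by a small read-then-write sketch. The helper swap_byte is hypothetical and assumes the stream was opened for update, for example with "rb+" or "wb+".

```c
#include <stdio.h>

/* Hypothetical helper: replaces the byte at offset pos with ch and
   returns the byte that was there before, or EOF on failure.
   The stream must be open for both reading and writing. */
int swap_byte(FILE *fptr, long pos, int ch) {
    if (fseek(fptr, pos, SEEK_SET) != 0) return EOF;
    int old = fgetc(fptr);                             /* read */
    if (old == EOF) return EOF;
    if (fseek(fptr, pos, SEEK_SET) != 0) return EOF;   /* reposition */
    if (fputc(ch, fptr) == EOF) return EOF;            /* then write */
    if (fflush(fptr) != 0) return EOF;     /* a read may now follow */
    return old;
}
```

The second fseek is what makes the read-to-write switch legal, and it also moves back over the byte that fgetc consumed. The final fflush means the caller can read from the stream again without first having to reposition.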

Opening a file for read and write and using positioning functions makes it easy to create a record oriented random access file – a simple database.

The idea is that you write a file of records based on a struct. For example, to write out ten name and age records you might use:

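// struct person is not shown in this extract; a layout such as
// struct person { char name[25]; int age; }; is assumed here
struct person me = {0};   // zeroed so no indeterminate bytes are written
strcpy(me.name, "Harry");
me.age = 18;
FILE *f = fopen("myFile.bin", "wb+");
if (f == NULL) {
    perror("myFile.bin");   // the file could not be created
    return 1;               // assumes this code is inside main
}
for (int i = 0; i < 10; i++) {
    fwrite(&me, sizeof (struct person), 1, f);
    me.age++;
}
fflush(f);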

As we are writing consecutive records, no file positioning is needed. Notice that each record is a year older than the previous one so that you can check which record has been read.

Now suppose you want to read record 5. Records are numbered from zero, so this is the sixth record written. It is stored at an offset from the start of the file of 5*sizeof(struct person) and this is how you can move the file position to the correct location before reading the struct:

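int record = 5;   // record number, counting from zero
if (fseek(f, record * (long) sizeof (struct person), SEEK_SET) != 0) {
    perror("fseek");
}
struct person me2;
if (fread(&me2, sizeof (struct person), 1, f) == 1) {
    printf("%s  %d\n", me2.name, me2.age);   // record 5 holds age 18 + 5 = 23
}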
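Putting the pieces together, a pair of helper functions is enough to treat the file as a simple database of numbered records. This is only a sketch: the definition of struct person is not part of this extract, so the field sizes below are assumptions, and the names read_record and write_record are invented for the example.

```c
#include <stdio.h>
#include <string.h>

/* Assumed record layout; the real definition is not in this extract. */
struct person {
    char name[25];
    int age;
};

/* Reads record number n, counting from zero. Returns 1 on success
   and 0 on failure, for example if n is beyond the last record. */
int read_record(FILE *f, long n, struct person *p) {
    if (fseek(f, n * (long) sizeof *p, SEEK_SET) != 0) return 0;
    return fread(p, sizeof *p, 1, f) == 1;
}

/* Writes record number n, overwriting any record already there.
   Returns 1 on success and 0 on failure. */
int write_record(FILE *f, long n, const struct person *p) {
    if (fseek(f, n * (long) sizeof *p, SEEK_SET) != 0) return 0;
    if (fwrite(p, sizeof *p, 1, f) != 1) return 0;
    return fflush(f) == 0;     /* flush so a read can safely follow */
}
```

Because each function starts with an fseek and write_record ends with an fflush, calls can be mixed in any order on a file opened with "wb+" or "rb+" without breaking the rules about switching between reading and writing. To update a record you read it with read_record, change the struct and pass it back to write_record with the same record number.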


Last Updated ( Tuesday, 22 February 2022 )