Files

On This Page: Functions, FILE*, Paths

Being able to access, read, and write files is a handy feature of any programming language. Because all of the variables, data, and memory that we create in our program are 'deleted' after the program finishes, we often need some way of storing data that the program creates or will need the next time it runs.

Files are broadly stored in two different formats: text and binary. Strictly speaking, every file is made of 1s and 0s; a binary file is one where those bits are not meant to be read as characters. Binary is nice and easy for our computers to read but is basically impossible for humans to read without assistance from the computer. The .out files that we've been producing with gcc are in a binary format. Text files are generally much easier for humans to read but can take quite a lot of work for a computer to understand. Our .c source files are in a text format, which is obvious because we're the ones typing them.

Download the files.c source file to see how we might write a simple program to access files.

Files.c

Side Note

In order for the gcc compiler to understand our .c files, the code first goes through a step called Lexical Analysis (Lexing), which breaks the text into tokens that represent parts of our code (the part of the compiler that does this is sometimes called a Lexer or a Tokenizer). After the Lexical Analysis, the newly generated tokens go through a Parser, which works out the structure of the code. Once the code has gone through both of these steps, the compiler is able to start turning our program from text into a binary file. It's pretty complicated!

There are a few ways in which C can interact with files on your computer, and they come from our trusty stdio.h header file.


Functions

  • fopen

    FILE *fopen(const char *filename, const char *mode)
    fopen() returns a FILE pointer (this will be covered in the FILE* section below), or NULL if the file could not be opened. It takes two const char * arguments. When we come across const char * we can often assume that it's looking for a string.

    Parameters

    • filename: Path to a file (relative or absolute), e.g. "C:\\User\\file.txt". Inside a C string literal each backslash has to be written twice.
    • mode: This parameter is expecting one of a few special strings:
      • "r": open a file for reading
      • "w": create a file for writing, discarding any previous content
      • "a": open a file for appending, starting at the end of the file (similar to "w" but doesn't delete the contents of the file)
      • "r+": open an existing file for both reading and writing
      • "b": added to one of the modes above to indicate the file is binary

    These modes can be combined in various ways to achieve different results, e.g. "rb" opens a binary file for reading.
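
Since fopen() hands back NULL when it can't open a file, it's worth checking the result before using it. Here is a minimal sketch of that check. The helper name can_open_for_reading is made up for this example and isn't part of stdio.h.

```c
#include <stdio.h>

/* Hypothetical helper (not part of stdio.h): returns 1 if the file
   could be opened for reading and 0 if fopen() failed. */
int can_open_for_reading(const char *filename)
{
    FILE *fp = fopen(filename, "r");

    if (fp == NULL) {
        /* fopen() returns NULL when the file can't be opened,
           for example because it doesn't exist. */
        return 0;
    }

    fclose(fp);
    return 1;
}
```

Calling can_open_for_reading("notes.txt") would tell us whether a file called notes.txt in the current directory can be read (notes.txt is just a placeholder name).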


  • fclose

    int fclose(FILE *stream)
    fclose() closes a file, writing out any buffered output that has not yet been written and discarding any buffered input that has not been read. Returns 0 if successful, otherwise returns EOF.

    Parameters

    • stream: The FILE pointer previously returned by the fopen() function.
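
Because output is usually buffered, fclose() is often the moment our data actually lands in the file, so its return value is worth a look. A small sketch, assuming a made-up helper called write_greeting:

```c
#include <stdio.h>

/* Hypothetical helper: creates (or overwrites) a file, writes one line
   and closes it. Returns 0 on success and -1 on any failure. */
int write_greeting(const char *filename)
{
    FILE *fp = fopen(filename, "w");

    if (fp == NULL) {
        return -1;
    }

    fprintf(fp, "hello\n");

    /* fclose() returns 0 on success. Anything else (EOF) means the
       buffered data may not have been written out properly. */
    if (fclose(fp) != 0) {
        return -1;
    }
    return 0;
}
```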

  • fscanf

    int fscanf(FILE *stream, const char *format, ...)
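    This works very much like the scanf() function we're already familiar with, but you are required to indicate which file you plan to read from using the stream argument. It returns the number of items successfully stored, or EOF if the end of the file or an error is hit before anything is read.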

    Parameters

    • stream: The FILE pointer previously returned by the fopen() function.
    • format: This is the %d, %lf part, just like scanf().
    • ...: This is where you put the &variables to store the information that fscanf() has gathered, just like scanf().
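
To see fscanf() in action, here is a rough sketch that adds up every whole number in a text file. The helper name sum_numbers and its total parameter are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: adds up every whole number found in a text file
   and stores the result in *total.
   Returns 0 on success, or -1 if the file could not be opened. */
int sum_numbers(const char *filename, int *total)
{
    FILE *fp = fopen(filename, "r");
    int value;

    if (fp == NULL) {
        return -1;
    }

    *total = 0;

    /* fscanf() returns how many items it stored, so a result of 1
       means one int was read successfully. The loop stops at the end
       of the file or at the first thing that isn't a number. */
    while (fscanf(fp, "%d", &value) == 1) {
        *total += value;
    }

    fclose(fp);
    return 0;
}
```

For a file containing "10 20" on one line and "30" on the next, *total would end up as 60.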

  • fprintf

    int fprintf(FILE *stream, const char *format, ...)
    Just like printf(), but you need to specify the stream (in this case, probably a file) that you'd like to write to. The return value is the number of characters written, or a negative value if it fails.

    Parameters

    • stream: The FILE pointer previously returned by the fopen() function.
    • format: This is the %d, %lf part, just like printf().
    • ...: This is where you put the variables to write into the file, just like printf().
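
As a quick sketch of fprintf(), the made-up helper below (save_score is not a real library function) writes a name and a score to a file as one line of text.

```c
#include <stdio.h>

/* Hypothetical helper: writes a "name score" line to a file,
   replacing whatever the file held before.
   Returns the number of characters written, or -1 on failure. */
int save_score(const char *filename, const char *name, int score)
{
    FILE *fp = fopen(filename, "w");
    int written;

    if (fp == NULL) {
        return -1;
    }

    /* Same format string rules as printf(), just with the stream first. */
    written = fprintf(fp, "%s %d\n", name, score);

    if (fclose(fp) != 0 || written < 0) {
        return -1;
    }
    return written;
}
```

The line that ends up in the file can later be read back with fscanf() using a matching format such as "%s %d".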

  • fgetc

    int fgetc(FILE *stream)
    Returns the next character in the stream. Returns EOF if the end of the file is reached or an error occurs.

    Parameters

    • stream: The FILE pointer previously returned by the fopen() function.
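
A common way to use fgetc() is in a loop that keeps going until EOF shows up. The sketch below counts the characters in a file. The name count_characters is an assumption for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters in a text file.
   Returns the count, or -1 if the file could not be opened. */
long count_characters(const char *filename)
{
    FILE *fp = fopen(filename, "r");
    long count = 0;
    int c; /* int rather than char, so EOF can be told apart from real characters */

    if (fp == NULL) {
        return -1;
    }

    while ((c = fgetc(fp)) != EOF) {
        count++;
    }

    fclose(fp);
    return count;
}
```

Note that fgetc() returns an int, which is why c is declared as an int and not a char.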

  • fputc

    int fputc(int c, FILE *stream)
    Writes the character c (converted to an unsigned char) into the file. Returns the character written, or EOF if an error occurs.

    Parameters

    • c: Character to write
    • stream: The FILE pointer previously returned by the fopen() function.
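
fputc() pairs up nicely with fgetc(). Here is a minimal sketch that copies one text file into another, one character at a time. The helper name copy_file is made up for this example, and a real program would probably want more detailed error reporting.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dst one character at a time.
   Returns 0 on success and -1 if anything goes wrong. */
int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "r");
    FILE *out;
    int c;

    if (in == NULL) {
        return -1;
    }

    out = fopen(dst, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    while ((c = fgetc(in)) != EOF) {
        /* fputc() returns EOF if the character could not be written. */
        if (fputc(c, out) == EOF) {
            fclose(in);
            fclose(out);
            return -1;
        }
    }

    fclose(in);
    if (fclose(out) != 0) {
        return -1;
    }
    return 0;
}
```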

  • fread

    size_t fread(void *ptr, size_t size, size_t nobj, FILE *stream)
    There's a lot going on in this function, so it's probably best to just ignore it for now, but I'll explain it a little.
    fread() reads from a stream until it reaches the end of the file or has read nobj objects, and places the data into the ptr array. It returns the number of objects actually read.

    Parameters

    • ptr: An array represented as a void pointer. See Pointers for more info.
    • size: The size, in bytes, of each object that is expected to be read.
    • nobj: The number of objects of size size to be read before finishing.
    • stream: The FILE pointer previously returned by the fopen() function.
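
If you're curious, here is a rough sketch of fread() pulling a handful of ints out of a binary file. The helper name read_ints is an assumption, and the sketch also assumes the file was written as raw ints on the same kind of machine that is reading it.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to max_count ints from a binary file
   into the values array. Returns how many ints were actually read
   (0 if the file could not be opened). */
size_t read_ints(const char *filename, int *values, size_t max_count)
{
    FILE *fp = fopen(filename, "rb"); /* "rb": read, binary */
    size_t got;

    if (fp == NULL) {
        return 0;
    }

    /* ptr = values, size = sizeof(int), nobj = max_count.
       The return value can be smaller than max_count if the file
       runs out first. */
    got = fread(values, sizeof(int), max_count, fp);

    fclose(fp);
    return got;
}
```

Asking for 5 ints from a file that only holds 3 would return 3 and fill in the first three slots of the array.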


FILE*

A common theme with my notes so far is that I bring up pointers frequently. That is because pointers are integral to how C works, and we have already been using them in some cases without realising it.
Pretty much all of the file functions outlined above make use of or return something called a FILE pointer, represented in C as FILE* or FILE * (they are the same).
A FILE pointer is essentially a memory address: it points to a FILE object that keeps track of the file we're working on, such as our current position in it. The usual way to get a FILE pointer is to use the fopen() function, which returns one. We can store this pointer in our own variable and then pass it around to other functions, such as fgetc() or fprintf(), like any other variable.
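
To show what passing a FILE pointer around looks like, here is a small sketch with two made-up functions, write_header and make_report. Neither is part of stdio.h.

```c
#include <stdio.h>

/* Hypothetical helper: writes a title line to whatever stream it is
   given. It never calls fopen() itself, it just uses the pointer. */
int write_header(FILE *fp, const char *title)
{
    return fprintf(fp, "== %s ==\n", title);
}

/* Hypothetical helper: opens a file, passes the FILE pointer on to
   write_header(), adds one more line and closes the file.
   Returns 0 on success and -1 on failure. */
int make_report(const char *filename)
{
    FILE *fp = fopen(filename, "w");

    if (fp == NULL) {
        return -1;
    }

    write_header(fp, "Report");
    fprintf(fp, "Nothing to report.\n");

    return fclose(fp) == 0 ? 0 : -1;
}
```

The function that opened the file is also the one that closes it here, which is one simple way to keep track of who is responsible for calling fclose().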


Relative and Absolute Paths

There are two ways to represent the locations of files on computers: a relative path or an absolute path. A relative path starts from your current directory, which is normally the directory your C program is being run from.
An absolute path starts at the root directory of your system and works its way to the desired file. The root directory depends on your OS: on Windows it is generally C:\ (or whichever drive you are currently on), and on Linux and macOS the root is /.

When we're using relative paths we also are able to use special characters to indicate our current directory and our parent directory:

  • Current Directory: "."
  • Parent Directory: ".."
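
As a small sketch of how these paths end up in code, the made-up helper below (first_char_at is not a library function) opens whatever path it is given and returns the first character of the file.

```c
#include <stdio.h>

/* Hypothetical helper: returns the first character of the file at
   the given path, or EOF if the file can't be opened or is empty. */
int first_char_at(const char *path)
{
    FILE *fp = fopen(path, "r");
    int c;

    if (fp == NULL) {
        return EOF;
    }

    c = fgetc(fp);
    fclose(fp);
    return c;
}
```

With a placeholder file called notes.txt in the current directory, first_char_at("notes.txt") and first_char_at("./notes.txt") refer to the same file, while first_char_at("../notes.txt") would look for it in the parent directory instead.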