File Input and Output
Kenneth Leroy Busbee
Overview
A computer file is a computer resource for recording data discretely in a computer storage device. Just as words can be written to paper, so can information be written to a computer file.
There are different types of computer files, designed for different purposes. A file may be designed to store a picture, a written message, a video, a computer program, or a wide variety of other kinds of data. Some types of files can store several types of information at once.
By using computer programs, a person can open, read, change, and close a computer file. Computer files may be reopened, modified, and copied an arbitrary number of times.[1]
Discussion
In computer programming, standard streams are pre-connected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin – keyboard), standard output (stdout – originally a printer) and standard error (stderr – monitor). Streams may be redirected to other devices and/or files. In current environments, stdout is usually redirected to the monitor.[2]
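As a minimal sketch of the standard output and standard error streams and of redirection, the following Python example writes one message to each stream, then sends the same messages to in-memory streams instead. The function name report and the message text are hypothetical, chosen only for illustration.

```python
import io
import sys


def report(message, out=sys.stdout, err=sys.stderr):
    """Write a normal message to `out` and a diagnostic copy to `err`."""
    print(message, file=out)
    print("warning: " + message, file=err)


# By default the messages go to the standard output and standard error streams.
report("disk almost full")

# Redirection: the same function writes to in-memory text streams instead.
captured_out = io.StringIO()
captured_err = io.StringIO()
report("disk almost full", out=captured_out, err=captured_err)
```

Because the two streams are separate, normal output and error messages can be redirected to different places independently.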
Computer files are stored on secondary storage devices and used to maintain program data over time. Most programming languages have built-in functions or libraries to support processing files as text streams. We need to understand how to open, read, write and close text files. The following File Input/Output terms are explained:
Text File – A file consisting of characters from the ASCII character code set. Text files (also known as ASCII text files) contain character data. When we create a text file we usually think of it consisting of a series of lines. On each line are several characters (including spaces, punctuation, etc.) and we generally end the line with a return (a character within the ASCII character code set). The return is also known as the new line character. You are most likely already familiar with the escape code of \n which is used within many programming languages to indicate a return character when used within a literal string.
A typical text file consisting of lines can be created by text editors (Notepad) or word processing programs (Microsoft Word). When using a word processor you must usually specify the output file as text (.txt) when saving it. Most source code files are ASCII text files with a unique file extension; such as C++ using .cpp, C# using .cs, Python using .py, etc. Thus, most compiler/Integrated Development Environment software packages can be used to create ASCII text files.
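To illustrate how a text file is a series of lines separated by the new line character, here is a small Python sketch. The sample text is made up for the example. In ASCII, the character written as \n has code 10.

```python
# Three lines of text, each ended by the new line character.
text = "first line\nsecond line\nthird line\n"

# Split the text into lines; the new line characters are removed.
lines = text.splitlines()

# "\n" is a single character; ord() gives its numeric character code.
newline_code = ord("\n")
newline_count = text.count("\n")
```

Here lines holds the three lines without their line endings, and newline_count is 3 because each line ends with one new line character.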
Filename – The name of a file together with its extension. Most operating systems have restrictions on which characters can be used in filenames. Example: Lab_05.txt
Because some operating systems do not allow spaces, we suggest that you use the underscore where needed for spacing in a filename.
Path (Filespec) – The location of a file along with its filename. Filespec is short for file specification. Most operating systems have a set of rules on how to specify the drive and directory (or path through several directory levels) along with the filename. Example: C:\myfiles\cosc_1436\Lab_05.txt
Because some operating systems do not allow spaces, we suggest that you use the underscore where needed when creating folders or sub-directories.
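The parts of a filespec can be pulled apart in code. The sketch below uses Python's pathlib module with the example path from above; PureWindowsPath only parses the text of the path, so it does not require the file to exist or the program to run on Windows.

```python
from pathlib import PureWindowsPath

# The example filespec from the text, written as a raw string so that
# the backslashes are not treated as escape codes.
filespec = PureWindowsPath(r"C:\myfiles\cosc_1436\Lab_05.txt")

drive = filespec.drive          # the drive, "C:"
directory = filespec.parent     # the path through the directory levels
filename = filespec.name        # the name and its extension
extension = filespec.suffix     # the extension alone, including the dot
```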
Open – Your program requesting the operating system to let it have access to an existing file or to open a new file. In most current programming languages, a file data type exists and is used for file processing. A file variable will be used to store the device token that the operating system assigns to the file being opened. An open function or method is used to retrieve the device token, and typically requires at least two parameters: the path and the mode (read, write, append, or a combination thereof). Corresponding pseudocode would be:
Declare File datafile
datafile = open(filespec, mode)
The open function returns a device token from the operating system, which is stored in the variable named datafile.
It is considered good programming practice to determine if the file was opened properly. When the operating system can’t open a file, the usual reason is that the filespec is wrong (misspelled, or typed with the wrong letter case on operating systems that are case sensitive) or the file is not stored in the location specified. Accessing files stored on a network or the Internet may also fail due to a network error.
Verifying that a file was opened properly is handled with a condition control structure. That structure may be either an if-then-else statement or a try-catch / try-except error handler, depending on the programming language used.
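In Python, a failed open raises an exception, so the check is typically written with try-except. The following is a minimal sketch under that assumption; the helper name open_for_reading is hypothetical, and the example deliberately asks for a file that does not exist in a fresh temporary folder.

```python
import os
import tempfile


def open_for_reading(filespec):
    """Return an open file object, or None if the file could not be opened."""
    try:
        return open(filespec, "r")
    except OSError as error:
        # Covers a missing file, a wrong path, and permission problems.
        print(f"Could not open {filespec}: {error}")
        return None


# A new, empty temporary folder cannot contain Lab_05.txt, so open fails.
with tempfile.TemporaryDirectory() as folder:
    missing = os.path.join(folder, "Lab_05.txt")
    datafile = open_for_reading(missing)
```

Returning None lets the calling code use a simple if-then-else test before attempting to read.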
Read – Moving data from a device that has been opened into a memory location defined in your program. For example:
text = read(datafile)
or
text = datafile.read()
Write – Moving data from a memory location defined in your program to a device that has been opened. For example:
write(datafile, text)
or
datafile.write(text)
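Putting write and read together, this Python sketch writes two lines to a text file and then reads them back. The file is created in a temporary folder so the example cleans up after itself; the filename Lab_05.txt is reused from the earlier example only for illustration.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
filespec = os.path.join(folder, "Lab_05.txt")

# Write: move data from the program's memory to the opened file.
datafile = open(filespec, "w")
datafile.write("line one\n")
datafile.write("line two\n")
datafile.close()

# Read: move data from the opened file into a variable in memory.
datafile = open(filespec, "r")
text = datafile.read()
datafile.close()

# Remove the temporary file and folder.
os.remove(filespec)
os.rmdir(folder)
```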
Close – Your program requesting the operating system to release a file that was previously opened. There are two reasons to close a file. First, it releases the file and frees up the associated operating system resources. Second, if the file was opened for output, closing it clears out the operating system’s buffer and ensures that all of the data is physically stored in the output file. For example:
close(datafile)
or
datafile.close()
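The effect of close can be observed from inside a program. In this Python sketch, which again assumes a temporary file created just for the example, the file object reports whether it is closed, a write attempted after close is rejected, and the data written before close is found in the file afterward.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
filespec = os.path.join(folder, "Lab_05.txt")

datafile = open(filespec, "w")
datafile.write("saved on close\n")
open_before_close = not datafile.closed   # the file is still open here
datafile.close()                          # flush buffered data, release the file
closed_after_close = datafile.closed

# A closed file can no longer be written to; Python raises ValueError.
try:
    datafile.write("too late\n")
    write_after_close_failed = False
except ValueError:
    write_after_close_failed = True

# Reopen the file to confirm the data was stored.
datafile = open(filespec, "r")
stored_text = datafile.read()
datafile.close()

os.remove(filespec)
os.rmdir(folder)
```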
Using / With – A wrapper around a processing block that will automatically close opened resources, available in some programming languages. For example:
// C#
using (datafile = open(filespec, mode))
{
    // ...
}
or
# Python3
with open(filespec, mode) as datafile:
    # ...
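A runnable version of the Python form might look like the sketch below. It assumes a temporary folder for the example file. No explicit close call appears, because each with block closes its file when the block ends.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    filespec = os.path.join(folder, "Lab_05.txt")

    # The file is closed automatically when this block ends.
    with open(filespec, "w") as datafile:
        datafile.write("hello\n")
    closed_after_write = datafile.closed

    # Reopen for reading; again, no explicit close is needed.
    with open(filespec, "r") as datafile:
        text = datafile.read()
    closed_after_read = datafile.closed
```

The file is also closed if an error occurs inside the block, which is the main reason this form is generally preferred over calling close by hand.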
Key Terms
- close
- Your program requesting the operating system to release a file that was previously opened.
- device token
- A key value provided by the operating system to associate a device to your program.
- filename
- The name and its extension.
- filespec
- The location of a file along with its filename.
- open
- Your program requesting the operating system to let it have access to an existing file or to open a new file.
- read
- Moving data from a device that has been opened into a memory location defined in your program.
- stream
- A sequence of data elements made available over time.[3]
- stdin
- Standard input stream, typically the keyboard.[4]
- stderr
- Standard error stream, an output stream for error messages, typically the monitor.[5]
- stdout
- Standard output stream, originally a printer, but now typically the monitor.[6]
- text file
- A file consisting of characters from the ASCII character code set.
- using / with
- A wrapper around a processing block that will automatically close opened resources.
- write
- Moving data from a memory location defined in your program to a device that has been opened.