Python: How to open and read text documents

This article details some basic Python techniques for opening and reading text files. They apply to any plain text file, not only files with a .txt extension. I will cover three different techniques.

 

Example 1: Open text files and read one line at a time with .readline()

This is probably the most basic way to open a text file and read it one line at a time. This code opens a text file called “Number.txt”, reads in the first line, strips the trailing newline character, prints the line to the screen, and closes the text file:

f = open(r'C:\Users\Chris\Desktop\python-blog\Number.txt', 'r')
line = f.readline()
print(line.rstrip('\n'))
f.close()

which yields:

The JavaScript Number object is a wrapper object allowing you to work with numerical values. A Number object is created using the Number() constructor.

Here are the contents of Number.txt:

The JavaScript Number object is a wrapper object allowing you to work with numerical values. A Number object is created using the Number() constructor.

The primary uses of the Number object are:

If the argument cannot be converted into a number, it returns NaN.
In a non-constructor context (i.e., without the new operator), Number can be used to perform a type conversion.

The Number() function converts the object argument to a number that represents the object's value.

If the value cannot be converted to a legal number, NaN is returned.

Note: If the parameter is a Date object, the Number() function returns the number of milliseconds since midnight January 1, 1970 UTC.

As the above code shows, the full path to the text file can be supplied as an argument to the open() function. This could just as easily be a variable that contains the full path. For example:

readpath = r'C:\Users\Chris\Desktop\python-blog\Number.txt'
f = open(readpath, 'r')
line = f.readline()
print(line.rstrip('\n'))
f.close()

The “r” in front of the path tells Python to treat this string as “raw” text so as not to interpret any of the backslashes in the path as special characters.
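
To see why the raw prefix matters, here is a small sketch using a made-up path (not one from this article) whose folder and file names happen to start with "n" and "t". Without the prefix, Python reads \n as a newline and \t as a tab:

```python
# Hypothetical path, chosen because \n and \t are escape sequences
plain = 'C:\new\table.txt'    # \n becomes a newline, \t becomes a tab
raw = r'C:\new\table.txt'     # backslashes are kept exactly as written

print(len(plain))   # 14
print(len(raw))     # 16
print(repr(plain))
print(repr(raw))
```

The path in the examples above would probably work without the prefix by luck, since \U, \C, \D, \p and \N are not all valid escapes in every Python version, but using a raw string removes the guesswork.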

The ‘r’ used as the second argument in the open() function means “read”, as in open the text file in read mode.
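
As a quick illustration of read mode, the sketch below assumes Python 3 and uses a temporary file in place of Number.txt so it can run anywhere. It shows that a file opened with 'r' rejects writes, and that leaving the mode out gives the same read-only behavior:

```python
import io
import os
import tempfile

# Create a small throwaway file (a stand-in for Number.txt)
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('first line\n')

f = open(path, 'r')
mode_explicit = f.mode
try:
    f.write('more text')          # not allowed in read mode
    write_failed = False
except io.UnsupportedOperation:
    write_failed = True
f.close()

g = open(path)                    # no mode given: defaults to 'r'
mode_default = g.mode
g.close()

os.remove(path)
print(mode_explicit, mode_default, write_failed)
```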

Here is the code again, this time with comments:

# This example uses .readline() which will read only a single line at a time
readpath = r'C:\Users\Chris\Desktop\python-blog\Number.txt'

f = open(readpath, 'r')

# Read the first line
line = f.readline()

# Strip the newline char from the end of the line and print the line
print(line.rstrip('\n'))

# Read the second line
line = f.readline()
print(line.rstrip('\n'))

# Read the third line
line = f.readline()
print(line.rstrip('\n'))

# Read the fourth line
line = f.readline()
print(line.rstrip('\n'))

# etc..

# close the document
f.close()

This yields:

The JavaScript Number object is a wrapper object allowing you to work with numerical values. A Number object is created using the Number() constructor.

The primary uses of the Number object are:

Notice that lines 2 and 4 were printed as blank lines, which matches the contents of Number.txt.
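
Rather than repeating .readline() once per line, the calls can go in a loop. The sketch below is one possible way to do it, using a temporary three-line file as a stand-in for Number.txt. It relies on the fact that .readline() returns an empty string at the end of the file, while a blank line in the file comes back as '\n':

```python
import os
import tempfile

# Throwaway stand-in for Number.txt: two text lines with a blank line between
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('line one\n\nline three\n')

collected = []
f = open(path, 'r')
while True:
    line = f.readline()
    if line == '':                        # empty string means end of file
        break
    collected.append(line.rstrip('\n'))   # a blank line arrives as '\n', not ''
f.close()
os.remove(path)

print(collected)
```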

 

Example 2: Open and read text file into a list using .readlines()

In this example, I will use .readlines() which will place all lines of the text document into a list so that they can be accessed by index. Here is the code with comments:

# This example uses readlines(), which reads all lines in the file and returns them as a list
# lines in this case is a list containing all lines in the text file
readpath = r'C:\Users\Chris\Desktop\python-blog\Number.txt'

f = open(readpath, 'r')
lines = f.readlines()

# Let's see what lines contains, and what type of data container it is
print("Lines is a:", type(lines))
print(lines)

# close the file
f.close()

Output:

Lines is a: <class 'list'>
['The JavaScript Number object is a wrapper object allowing you to work with numerical values. A Number object is created using the Number() constructor.\n', '\n', 'The primary uses of the Number object are:\n', '\n', 'If the argument cannot be converted into a number, it returns NaN.\n', 'In a non-constructor context (i.e., without the new operator), Number can be used to perform a type conversion.\n', '\n', "The Number() function converts the object argument to a number that represents the object's value.\n", '\n', 'If the value cannot be converted to a legal number, NaN is returned.\n', '\n', 'Note: If the parameter is a Date object, the Number() function returns the number of milliseconds since midnight January 1, 1970 UTC.\n']

Now that I have the entire text file in a list, I can access any line I want by index. For example, if I want the last line in the document:

# Get the last line in the document
last_line = lines[-1].rstrip()
print(last_line)

Which yields:

Note: If the parameter is a Date object, the Number() function returns the number of milliseconds since midnight January 1, 1970 UTC.

A for loop is all you need to print the entire text file to the screen:

for line in lines:
    print(line.rstrip())

Note: .readlines() works well on small text files, but it loads the entire file into memory at once. If you are dealing with a file too large to fit in memory, a different strategy is needed.
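
One such strategy is to loop over the file object itself, which hands back one line at a time instead of building a list of every line. The sketch below generates a 1,000-line temporary file as a stand-in for a large file (the file contents and variable names are made up for illustration), counts its lines, and tracks the longest one:

```python
import os
import tempfile

# Build a throwaway file standing in for a large text file
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    for i in range(1000):
        tmp.write('row %d\n' % i)

line_count = 0
longest = ''
f = open(path, 'r')
for line in f:                    # the file object yields one line per iteration
    line_count += 1
    text = line.rstrip('\n')
    if len(text) > len(longest):  # keep only what is needed, not every line
        longest = text
f.close()
os.remove(path)

print(line_count, longest)
```

Only the current line and the running results are held in memory, so the same loop should behave the same way on a much larger file.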

 

Example 3: Open and read text file using with

The first two examples above call open() and then explicitly close the text document with f.close(). This third example uses the Python keyword ‘with’, which eliminates the need to close the document yourself. This is the preferred method for opening and reading text files in Python: the file is closed automatically when the with block ends, even if an error occurs inside it, so you don’t have to remember to close the file you opened.

# This example uses with open() as var
readpath = r'C:\Users\Chris\Desktop\python-blog\Number.txt'

# f is the file object (file handle)
# with closes the file automatically when the indented block ends
with open(readpath, 'r') as f:
    lines = f.readlines()

# Get the second to last non-blank line in the document
target_line = lines[-3].rstrip()
print(target_line)

Which yields:

If the value cannot be converted to a legal number, NaN is returned.
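
The automatic close can be checked through the file object's closed attribute. The sketch below uses a small temporary file rather than Number.txt, and its variable names are illustrative. It confirms that the file is closed after a normal with block and after one that raises an error. It also shows one possible way to pick the second to last non-blank line without counting blank lines by hand, which is an alternative to the fixed index used above, not a replacement the example requires:

```python
import os
import tempfile

# Throwaway stand-in for Number.txt, with blank lines between text lines
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('alpha\n\nbeta\n\ngamma\n')

with open(path, 'r') as f:
    lines = f.readlines()
closed_after_block = f.closed      # True: closed when the block ended

try:
    with open(path, 'r') as g:
        first = g.readline()
        raise ValueError('simulated failure while processing')
except ValueError:
    pass
closed_after_error = g.closed      # True: closed even though an error occurred

# Drop blank lines first, so the index does not depend on where they fall
non_blank = [line.rstrip('\n') for line in lines if line.strip()]
second_to_last = non_blank[-2]

os.remove(path)
print(closed_after_block, closed_after_error, second_to_last)
```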

 

For more information about opening and reading text files with Python, see: