Unlike plain text documents, which can be read with Python's built-in open() function alone, .csv files are read by importing the csv module and wrapping the opened file in a csv.reader(). For example, this code opens a .csv file and prints its first 10 lines:

import csv

csv_path = r'C:\Users\Chris\Desktop\Python Scripts\Python Class\Week_2\code examples\EEU_071114.csv'
reader = csv.reader(open(csv_path, 'r'), dialect='excel-tab')

# Print only the first 10 lines
x = 0
for line in reader:
    print(line)
    x += 1
    if x >= 10:  # stop once 10 lines have been printed
        break

Output:
['Czech Republic,Czech (CZE),Street Names,"43,585"']
[',,Zones,160']
[',,Admins,"10,349"']
[',,Signs,"5,520"']
[',,POI Names,"72,297"']
['Greece,Greek (GRE),Street Names,"41,138"']
[',,Zones,"1,199"']
[',,Admins,"3,916"']
[',,Signs,"7,009"']
[',,POI Names,"60,878"']
Notice this key line:
reader = csv.reader(open(csv_path, 'r'), dialect='excel-tab')
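The first argument to csv.reader() is open(csv_path, 'r'). Just like opening text documents, use open() with an 'r' argument for 'read'. The second argument is dialect='excel-tab', which tells the reader to treat tabs as the field delimiter. Because this file is delimited by commas rather than tabs, each line comes back as a single unsplit string, as the output above shows.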
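To see what the dialect argument changes, the short sketch below parses the same two lines with 'excel-tab' and with the default 'excel' dialect. The lines are copied from the output above and passed as a plain list of strings, which csv.reader() accepts in place of an open file. The names sample_lines, tab_rows and comma_rows are made up for this illustration.

```python
import csv

# Two lines copied from the example output; a list of strings stands in
# for the open file (illustrative data, not read from disk).
sample_lines = [
    'Czech Republic,Czech (CZE),Street Names,"43,585"',
    ',,Zones,160',
]

# 'excel-tab' splits on tabs, so comma-delimited lines stay in one piece.
tab_rows = list(csv.reader(sample_lines, dialect='excel-tab'))

# 'excel' (the default dialect) splits on commas and honours the quotes.
comma_rows = list(csv.reader(sample_lines, dialect='excel'))

print(tab_rows[0])
print(comma_rows[0])
```

With 'excel-tab' each row is a one-item list holding the whole line. With 'excel' each row is a list of separate fields, and the quoted value 43,585 stays together as one field.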
reader is a csv reader object that is iterable but cannot be accessed by index, so to capture all lines of the csv file in a list we must first iterate over the reader object. For example:

import csv

csv_path = r'C:\Users\Chris\Desktop\Python Scripts\Python Class\Week_2\code examples\EEU_071114.csv'
reader = csv.reader(open(csv_path, 'r'), dialect='excel-tab')

# Capture all lines in a list
csv_doc = []
for line in reader:
    csv_doc.append(line)
Then if we wanted to print the last line of the csv file we could do so by index:
# Print the last line of the csv document
print(csv_doc[-1])
which yields:
[',,POI Names,"22,450"']
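As a shorter alternative to the append loop, the reader can be passed straight to list(). The sketch below assumes a few sample lines in place of the real file (sample_lines is a hypothetical stand-in) and also shows that a reader can only be walked once: after its rows have been consumed, iterating it again yields nothing.

```python
import csv

# Stand-in for the opened file: csv.reader() accepts any iterable of strings.
sample_lines = [
    'Czech Republic,Czech (CZE),Street Names,"43,585"',
    ',,Zones,160',
    ',,POI Names,"22,450"',
]

reader = csv.reader(sample_lines, dialect='excel-tab')

# list() iterates over the reader and collects every row in one step.
csv_doc = list(reader)

# The list supports indexing, unlike the reader itself.
print(csv_doc[-1])

# The reader is now exhausted, so a second pass returns no rows.
leftover = list(reader)
print(leftover)
```

If the rows are needed more than once, keep working from the captured list rather than from the reader.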
Notice that the line is returned as a list containing a single comma-separated string. We can now parse this data by splitting that string on commas. For example:

# For the first 5 lines, split each line by comma
x = 0
for line in csv_doc:  # the reader is already exhausted, so loop over the captured list
    line = line[0]
    line = line.split(',')
    print(line)
    x += 1
    if x >= 5:  # stop once 5 lines have been printed
        break

which yields:
['Czech Republic', 'Czech (CZE)', 'Street Names', '"43', '585"']
['', '', 'Zones', '160']
['', '', 'Admins', '"10', '349"']
['', '', 'Signs', '"5', '520"']
['', '', 'POI Names', '"72', '297"']

Note that a plain split also breaks apart quoted values that contain a comma, such as "43,585".
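Quoted values can be kept whole by letting the csv module do the splitting instead of str.split(). The following is a minimal sketch, assuming csv_doc holds one-item rows like those captured earlier (three are reproduced here so the example runs on its own). The names raw_lines, parsed and counts are made up for this illustration.

```python
import csv

# Rows as captured with dialect='excel-tab': one unsplit string per row.
csv_doc = [
    ['Czech Republic,Czech (CZE),Street Names,"43,585"'],
    [',,Zones,160'],
    [',,Admins,"10,349"'],
]

# Pull the single string out of each row.
raw_lines = [row[0] for row in csv_doc]

# Re-parse with the default comma dialect, which respects the quotes.
parsed = list(csv.reader(raw_lines))
for fields in parsed:
    print(fields)

# With the number in one field, the thousands separator can be removed
# and the value converted to an integer.
counts = [int(fields[3].replace(',', '')) for fields in parsed]
print(counts)
```

Each row now has exactly four fields, so the count is always found at index 3.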
For more information about the csv module, see:
https://docs.python.org/2/library/csv.html