{"id":2203,"date":"2018-07-28T10:42:06","date_gmt":"2018-07-28T15:42:06","guid":{"rendered":"http:\/\/bluegalaxy.info\/codewalk\/?p=2203"},"modified":"2018-07-28T10:42:06","modified_gmt":"2018-07-28T15:42:06","slug":"python-open-read-text-documents","status":"publish","type":"post","link":"https:\/\/bluegalaxy.info\/codewalk\/2018\/07\/28\/python-open-read-text-documents\/","title":{"rendered":"Python: How to open and read text documents"},"content":{"rendered":"<p>This article details some basic Python techniques for opening and reading text files. This applies to any plain text file; it doesn&#8217;t need to have a .txt file extension. I will detail three different techniques.<\/p>\n<p>&nbsp;<\/p>\n<h4><strong>Example 1: Open text files and read one line at a time with .readline()<\/strong><\/h4>\n<p>This is probably the most basic way to open a text file and read it one line at a time. This code opens a text file called &#8220;Number.txt&#8221;, reads the first line, strips the trailing newline character, prints the line to the screen, and closes the text file:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">f = open(r'C:\\Users\\Chris\\Desktop\\python-blog\\Number.txt', 'r')\r\nline = f.readline()\r\nprint(line.rstrip('\\n'))\r\nf.close()<\/pre>\n<p>Which yields:<\/p>\n<p><code class=\"EnlighterJSRAW\" data-enlighter-language=\"no-highlight\">The JavaScript Number object is a wrapper object allowing you to work with numerical values. A Number object is created using the Number() constructor.<\/code><\/p>\n<p>Here are the contents of Number.txt:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"raw\">The JavaScript Number object is a wrapper object allowing you to work with numerical values. 
A Number object is created using the Number() constructor.\r\n\r\nThe primary uses of the Number object are:\r\n\r\nIf the argument cannot be converted into a number, it returns NaN.\r\nIn a non-constructor context (i.e., without the new operator), Number can be used to perform a type conversion.\r\n\r\nThe Number() function converts the object argument to a number that represents the object's value.\r\n\r\nIf the value cannot be converted to a legal number, NaN is returned.\r\n\r\nNote: If the parameter is a Date object, the Number() function returns the number of milliseconds since midnight January 1, 1970 UTC.<\/pre>\n<p>As the code above shows, the full path to the text file can be supplied as an argument to the open() function. This could just as easily be a variable that contains the full path. For example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">readpath = r'C:\\Users\\Chris\\Desktop\\python-blog\\Number.txt'\r\nf = open(readpath, 'r')\r\nline = f.readline()\r\nprint(line.rstrip('\\n'))\r\nf.close()<\/pre>\n<p>The &#8220;r&#8221; in front of the path tells Python to treat this string as a &#8220;raw&#8221; string, so that the backslashes in the path are not interpreted as escape characters.<\/p>\n<p>The &#8216;r&#8217; used as the second argument in the open() function means &#8220;read&#8221;: the text file is opened in read mode.<\/p>\n<p>Here is the code again, this time with comments:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\"># This example uses .readline(), which reads only a single line at a time\r\nreadpath = r'C:\\Users\\Chris\\Desktop\\python-blog\\Number.txt'\r\n\r\nf = open(readpath, 'r')\r\n\r\n# Read the first line\r\nline = f.readline()\r\n\r\n# Strip the newline char from the end of the line and print the line\r\nprint(line.rstrip('\\n'))\r\n\r\n# Read the second line\r\nline = f.readline()\r\nprint(line.rstrip('\\n'))\r\n\r\n# Read the third line\r\nline = f.readline()\r\n
print(line.rstrip('\\n'))\r\n\r\n# Read the fourth line\r\nline = f.readline()\r\nprint(line.rstrip('\\n'))\r\n\r\n# etc.\r\n\r\n# Close the document\r\nf.close()<\/pre>\n<p>This yields:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"raw\">The JavaScript Number object is a wrapper object allowing you to work with numerical values. A Number object is created using the Number() constructor.\r\n\r\nThe primary uses of the Number object are:\r\n\r\n<\/pre>\n<p>Notice that lines 2 and 4 were printed as blank lines, which matches the contents of Number.txt.<\/p>\n<p>&nbsp;<\/p>\n<h4><strong>Example 2: Open and read text file into a list using .readlines()<\/strong><\/h4>\n<p>In this example, I will use .readlines(), which places all lines of the text document into a list so that they can be accessed by index. Here is the code with comments:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\"># This example uses readlines(), which reads all lines in the file and returns them as a list\r\n# lines in this case is a list containing all lines in the text file\r\nreadpath = r'C:\\Users\\Chris\\Desktop\\python-blog\\Number.txt'\r\n\r\nf = open(readpath, 'r')\r\nlines = f.readlines()\r\n\r\n# Let's see what lines contains, and what type of data container it is\r\nprint(\"Lines is a:\", type(lines))\r\nprint(lines)\r\n\r\n# Close the file\r\nf.close()<\/pre>\n<p>Output:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"raw\">Lines is a: &lt;class 'list'&gt;\r\n['The JavaScript Number object is a wrapper object allowing you to work with numerical values. 
A Number object is created using the Number() constructor.\\n', '\\n', 'The primary uses of the Number object are:\\n', '\\n', 'If the argument cannot be converted into a number, it returns NaN.\\n', 'In a non-constructor context (i.e., without the new operator), Number can be used to perform a type conversion.\\n', '\\n', \"The Number() function converts the object argument to a number that represents the object's value.\\n\", '\\n', 'If the value cannot be converted to a legal number, NaN is returned.\\n', '\\n', 'Note: If the parameter is a Date object, the Number() function returns the number of milliseconds since midnight January 1, 1970 UTC.\\n']<\/pre>\n<p>Now that I have the entire text file in a list, I can access any line I want by index. For example, if I want the last line in the document:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\"># Get the last line in the document\r\nlast_line = lines[-1].rstrip()\r\nprint(last_line)<\/pre>\n<p>Which yields:<\/p>\n<p><code class=\"EnlighterJSRAW\" data-enlighter-language=\"no-highlight\">Note: If the parameter is a Date object, the Number() function returns the number of milliseconds since midnight January 1, 1970 UTC.<\/code><\/p>\n<p>A for loop is all you need to print the entire text file to the screen:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for line in lines:\r\n    print(line.rstrip())<\/pre>\n<p>Note: Using .readlines() works great on small text files, but it loads the whole file into memory at once. For a file that is too large to fit in memory, a different strategy is needed, such as iterating over the file object one line at a time.<\/p>\n<p>&nbsp;<\/p>\n<h4><strong>Example 3: Open and read text file using with<\/strong><\/h4>\n<p>The first two examples above use the open() function followed by an explicit call to <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">f.close()<\/code> to close the text document. This third example uses the Python keyword 
&#8216;with&#8217;, which eliminates the need to explicitly close the document using .close(). This is the preferred method for opening and reading text files in Python because the file is closed automatically when the block exits, even if an exception is raised inside it, so you don&#8217;t have to remember to close the file you opened.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\"># This example uses with open() as var\r\nreadpath = r'C:\\Users\\Chris\\Desktop\\python-blog\\Number.txt'\r\n\r\n# f is the file handle\r\n# with closes the file automatically when the indented block exits\r\nwith open(readpath, 'r') as f:\r\n    lines = f.readlines()\r\n\r\n# Get the second to last non-blank line in the document\r\ntarget_line = lines[-3].rstrip()\r\nprint(target_line)\r\n<\/pre>\n<p>Which yields:<\/p>\n<p><code class=\"EnlighterJSRAW\" data-enlighter-language=\"no-highlight\">If the value cannot be converted to a legal number, NaN is returned.<\/code><\/p>\n<p>&nbsp;<\/p>\n<p>For more information about opening and reading text files with Python, see:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.tutorialspoint.com\/python\/file_readline.htm\">readline() method<\/a><\/li>\n<li><a href=\"https:\/\/www.tutorialspoint.com\/python\/file_readlines.htm\">readlines() method<\/a><\/li>\n<li><a href=\"https:\/\/docs.python.org\/3\/whatsnew\/2.6.html#pep-343-the-with-statement\">the &#8216;with&#8217; statement<\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This article details some basic Python techniques for opening and reading text files. This applies to any plain text file; it doesn&#8217;t need to have a .txt file extension. I will detail three different techniques. 
&nbsp; Example 1: Open text files and read one line at a time with .readline() This technique demonstrates probably &hellip; <a href=\"https:\/\/bluegalaxy.info\/codewalk\/2018\/07\/28\/python-open-read-text-documents\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Python: How to open and read text documents<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[22],"tags":[160,158,4,159],"class_list":["post-2203","post","type-post","status-publish","format-standard","hentry","category-python-language","tag-close","tag-open","tag-python","tag-readline"],"_links":{"self":[{"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/posts\/2203","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/comments?post=2203"}],"version-history":[{"count":11,"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/posts\/2203\/revisions"}],"predecessor-version":[{"id":2214,"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/posts\/2203\/revisions\/2214"}],"wp:attachment":[{"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/media?parent=2203"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/categories?post=2203"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bluegalaxy.info\/codewalk\/wp-json\/wp\/v2\/tags?post=2203"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}