In Python, local data that needs long-term storage exists in the form of files. So, acquiring local data really means accessing and reading data from files. How are files read? Certainly, we need to open the file first and, after that, read data from it. No doubt you may also write data to it. Once a file has been completely processed, we should close it. Why do we need to close the file? It's because Python may buffer the written data, and if the program crashes on an exception, the buffered data may never be written into the file. So, to be safe, we should develop the habit of actively closing a file after using it.

Let's look at how to open a file. To open a file, use the "open()" function in Python. The "open()" function has multiple arguments. Let's look at the three common ones. The first argument is the filename, which may also include a path. The second argument is the mode, "r" by default, meaning "to read". The third argument controls buffering, -1 by default, meaning to use the system's default buffer size. Of the three, the first argument is mandatory; the other two may be omitted. If not given, the default values will be adopted. Apart from these three arguments, there are many others; we may use "help(open)" to view the details. For example, another common argument is "encoding", which specifies the text encoding.

Let's look at the three instances here. The three lines of code have one, two and three arguments, respectively. The first line of code reads data from the text file "infile.txt" on Disk D, using the system's default buffer size. The third line of code writes a binary file. It’s worth noticing that, in Python, a binary file may choose not to use buffering, but a text file must use buffering. There are a variety of mode arguments for the "open()" function, like "read", "write" and "append", and a file may be opened in the mode of a text file or in the mode
of a binary file. Let's watch a flash. The "r" mode is the mode for reading a file; the file must exist. The "w" mode is for writing; it empties an existing file or creates a new one. The "a" mode is for appending, that is, adding contents to the tail of the file. "r+", "w+" and "a+" have additional capabilities: they allow both reading and writing. With a "b", it means reading, writing and appending a binary file. When using the modes of the "open()" function, we should know their differences and characteristics. Careless mistakes may lead to problems. Say you intended to read a file but carelessly opened it with the "w" mode. As we know, the "w" mode empties the original contents, and that causes trouble.

The "open()" function returns a file object. It is iterable, so each item inside (each line) can be traversed. The file object has many functions. To be exact, they are methods. The concept of a method is an important idea in object-oriented programming, which we'll deal with later. For now, let’s simply understand methods as functions tied to objects. Let's see how they are used. The form can be understood as "object name.method name", followed by parentheses. Sure, the parentheses may also include arguments. For example, "f.close()" means to close the file object "f".

In this part, we'll talk about local data acquisition, which can be realized with some of these methods. Let's look at the details. The "write()" function writes a character string into a file. Let's create a new file, say, firstpro, and then write this string into the file with the "write()" function. Finally, close the file. Quite easy, right? Actually, this way of writing is no longer recommended in Python. We recommend you use the "with" statement we talked about last week to write such programs. It handles file closing automatically, and is more succinct and more effective. OK, let's have a try. Create a new file. Write a string into it with the "write()" function. Let's check whether this file has really been created. Yeah, there's a text file named "firstpro". And what's inside is the newly
typed-in string "Hello World". The "write()" function writes data into a file. You must have also guessed that, for reading data from a file, there must be a corresponding function, "read()". Without any argument, the "read()" function reads all data from the current location to the end of the file and returns it as a string. It may also take an argument, "size", which represents the number of bytes to read (for a file opened in text mode, the number of characters). It also returns a string. We may read the data we've just written in such a way, or read it with a "read()" call without any argument. Let's think: what do the two mean exactly? Have a try. Well, let's read data from this file. Read 5 bytes first and assign them to the variable "p1". Then, read the rest of them and assign them to "p2". Let's output the value of p1. As we see, it's "Hello". The value of p2 is the remaining punctuation marks and "World". We've got the correct result. I'd like to remind you here that, since the "with" statement closes the file automatically after its execution, there's no need to write any additional "close()" statement in the program.

Apart from "read()" and "write()", there are other common functions like readlines(), readline() and writelines() in Python. readlines() reads all the remaining lines of data; readline() reads one line of data; writelines() writes several lines of data. Among them, readlines() and writelines() are more common. Let's look at an example. In this example, we read data from a file. Suppose there are several lines of data in the original file. OK, let's have a try. Suppose a file has 4 lines of strings and we're going to read them with the "readlines()" function. Alright, as we see, readlines() can do it, and its return result is a list. Here, we can see the newline character "\n" in the result. That's because, when Python reads lines from a file, it does not delete the newline character. To delete these newline characters, programmers have to do it themselves. We might also use the string "strip()"
method we're going to mention later. It's worth mentioning that there is no "writeline()" function in Python, since it would be the same as calling the "write()" function with a single line of string.

Next, let's use a simple instance to see how a file is read and written. The requirements are: add the serial numbers 1, 2, 3, 4 … to the beginning of the strings in the previous file, and then write them into another file. This task is not difficult. We may read data with the "readlines()" function and then use a loop to add serial numbers to the beginning. Here, the "str()" function must be used to convert integers into strings. Then we use the writelines() function to write the data into the file. The final effect is like this.

Let's look at another example. Suppose we want to add a string to the tail of a file and then read all data from the new file. Let's see whether this way of writing works: first write, and then read. Well, let's carefully consider it for 10 seconds. If you execute this program, you'll see the result is not what we expected: we do not read the new data we need. Why? That's because there is a file pointer during reading and writing. Reading or writing starts at the location of the file pointer. For example, we used "writelines()" to write data just now. At that time, the file pointer had reached the tail of the file. Reading with "readlines()" then starts from the tail, so it produces incorrect results. Then, how can we get the file pointer to point to the desired location? Python provides the "seek()" method. The "seek()" method has two arguments. The first argument is "offset", meaning the offset amount; the second argument represents the starting location, 0 by default, meaning the head of the file (1 means the current location and 2 means the tail). For example, f.seek(0, 0) or f.seek(0) moves the file pointer to the head of the file. How about f.seek(50, 1)? It moves the file pointer forward 50 bytes from the current location; note that in Python 3 such a relative move only works on a file opened in binary mode. Let's look at the program. How do we modify it? You might have already guessed: we only need to move the file pointer to the head of the file before reading. So,
let's add such a statement here. The result will be correct.

At the end of local data acquisition, let's look at something related to standard files. Look at this program, please. We know the "input()" function can read data from the keyboard, while the "print()" function can output data to the display terminal. Like in many other high-level programming languages, the keyboard and the display terminal in Python are both files. They are standard files: "stdin" is the standard input, "stdout" is the standard output, and "stderr" is the standard error. However, since they are so frequently used, we normally use the "input()" function and the "print()" function for such dedicated purposes. They are actually realized on top of the standard file objects provided in the "sys" module; the "print()" function, for example, writes to "sys.stdout".

At the end, let's look at a case of comprehensive file processing, that is, counting the number of data lines in a file. Let's start with the simplest one. For example, suppose that, under the current directory, which is the directory where the py file is located, there's a text file "data1.txt". If we're to count the number of data lines in this file, we only need to read its contents and then count them. Our code can be written like this. Here, with the "with" statement, we open this file and then, with the file's "readlines()" method, read the file data. With "len()", we count and print the number of data lines in the file. Well, let's execute this program. As we see, the result is that this file has 6 data lines.

Suppose that, under the current directory, there are, say, 4 text files, data1, data2, data3 and data4, and we need to count the lines of data in the 4 files. We can save these filenames in a list and then process it with a loop. We might as well modify this program. Define the counting and printing of file lines as a new function. Define a list in the main module and store in it the filenames to be processed. Repeatedly call this function and get the desired
result. Run this program. As we can see, the line counts of the 4 files are like this.

Have you noticed that we directly used the filenames in the program just now, without any path added before the filename? Think about it. Are you clear now? That's because our py file and those text files are stored under the same directory. Therefore, it's OK to use a relative path. Then, if the text file is in the directory one level higher than the current py file, how can we read it? So easy. We only need to write "../" in the argument of "open()"; ".." means going up to the directory one level higher. Using open('../data1.txt') will achieve the effect.

In actual file processing, the files are often under one single directory and there are many of them. Then, we need to resort to some functions in the "os" module. For example, suppose some files are located in the "testdata" directory under the "test" directory on Disk C, or in any other directory, and we need to count the number of lines of all the text files. We might as well modify this program again and change it like this. In it, the listdir() function in the "os" module lists the names of all the files (and subdirectories) in the directory that its argument indicates. The argument "path" here means the text files we're counting are in the "testdata" directory under the directory where the py file is located; "./" indicates the current directory. This is a recommended way of writing. Of course, it's also OK to designate an absolute path, like defining "path" as C:/test or any directory on Disk D, and then, in the os.listdir() call, link "path" to the actual directory, say, testdata, by way of path+'/testdata'. Now, let's move to endswith() here. It checks whether the filename ends with .txt, indicating that we are looking for all the text files. Then, with the join() function, we join the file path and each traversed filename one by one to form a complete file path. This step is
necessary and deserves special attention: if the actual argument of the function contains no file path, as described above, what the formal argument of the countLines() function receives is still just the bare filenames, which are looked up in the current directory where the py file is located. Thus, it requires special attention. Well, let's execute this program. The result is still the same, even though files of other types actually exist under the testdata directory as well.

Besides, we often need to write results into one or more files. We may rely on the mkdir() function in the "os" module to create a directory for storing the output results. If this directory already exists, then to be safe, i.e., to make sure no stale data remains inside, we may delete the directory first. The detailed writing may be like this. Suppose that, in this directory, there is an "output" directory; then, we may use the rmtree() function in the "shutil" module to delete it first. The effect of this function is to recursively delete a nonempty directory. Then, we use mkdir() to create the directory we need. It's OK now.

That is an example of comprehensive file processing in daily life, from simple to complex. Apart from the fact that the data in actual practice may be bigger than that in this example, our processing approach and programming style are both quite mainstream. You may create your own file and attempt to process it completely. I believe you will then better understand the steps and the writing of file reading, writing and processing.
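
As a recap of the file pointer discussion, here is a minimal sketch of appending to a file and then reading everything back. It is not the program shown in the lecture: the function name append_and_read_all, the "a+" mode, the file name and the sample strings are all assumptions made for illustration, and the demo file lives in a temporary directory so nothing on your disk is touched.

```python
import os
import tempfile


def append_and_read_all(path, extra_lines):
    """Append lines to a text file, then rewind and read the whole file.

    Hypothetical helper for illustration only.
    """
    # "a+" opens the file for appending and reading (an assumed mode choice).
    with open(path, "a+", encoding="utf-8") as f:
        f.writelines(extra_lines)
        # The file pointer is now at the tail, so this read finds nothing.
        at_tail = f.readlines()
        # Move the file pointer back to the head of the file before reading.
        f.seek(0)
        everything = f.readlines()
    return at_tail, everything


# Create a small demo file in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "firstpro.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("Hello, World!\n")

at_tail, everything = append_and_read_all(
    demo_path, ["second line\n", "third line\n"]
)
print(at_tail)      # what readlines() sees without seek()
print(everything)   # what readlines() sees after f.seek(0)
```

Comparing the two printed lists shows why the seek(0) call is needed: the first read starts at the tail and the second starts at the head.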
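
The directory-wide line count described above can be sketched roughly as follows. This is one possible version under stated assumptions, not the lecture's exact program: it uses the names count_lines and count_txt_lines (the lecture's function is called countLines()), it assumes the join step is os.path.join, and it builds its own sample files in a temporary directory instead of using testdata.

```python
import os
import tempfile


def count_lines(file_path):
    """Return the number of lines in one text file."""
    with open(file_path, encoding="utf-8") as f:
        return len(f.readlines())


def count_txt_lines(directory):
    """Map each .txt filename in `directory` to its line count."""
    counts = {}
    for name in sorted(os.listdir(directory)):
        # Skip files of other types.
        if name.endswith(".txt"):
            # Join directory and filename; a bare filename would be
            # looked up in the current working directory instead.
            full_path = os.path.join(directory, name)
            counts[name] = count_lines(full_path)
    return counts


# Build a few hypothetical sample files in a temporary directory.
demo_dir = tempfile.mkdtemp()
samples = {
    "data1.txt": "a\nb\nc\n",
    "data2.txt": "x\ny\n",
    "notes.csv": "1,2\n",  # not a .txt file, so it is ignored
}
for name, text in samples.items():
    with open(os.path.join(demo_dir, name), "w", encoding="utf-8") as f:
        f.write(text)

line_counts = count_txt_lines(demo_dir)
for name, n in line_counts.items():
    print(name, n)
```

Returning the counts in a dictionary rather than printing inside the function is a design choice of this sketch; it makes the result easy to check or to write to a file later.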
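
The "delete first, then create" step for the output directory might look like the sketch below. The helper name reset_output_dir and the sample file name are hypothetical, and the demo runs inside a temporary directory. Be careful when adapting it: rmtree() deletes everything under the directory you give it.

```python
import os
import shutil
import tempfile


def reset_output_dir(parent, name="output"):
    """Return a fresh, empty directory `name` under `parent`.

    Hypothetical helper for illustration only.
    """
    out_dir = os.path.join(parent, name)
    if os.path.exists(out_dir):
        # Recursively delete the existing directory and its contents.
        shutil.rmtree(out_dir)
    os.mkdir(out_dir)
    return out_dir


# Demo: create the directory, leave a stale file in it, then reset it.
work_dir = tempfile.mkdtemp()
first = reset_output_dir(work_dir)
with open(os.path.join(first, "old_result.txt"), "w", encoding="utf-8") as f:
    f.write("stale\n")

second = reset_output_dir(work_dir)
remaining = os.listdir(second)
print(remaining)  # contents of the freshly created directory
```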