1 00:00:00,330 --> 00:00:01,130 Welcome. 2 00:00:01,380 --> 00:00:07,020 So once again we have these three text files, and each of them has a column with some data values 3 00:00:07,080 --> 00:00:15,240 and a header called value, and I'll now write a program that will access these three files, read their data 4 00:00:15,240 --> 00:00:21,140 in Python, and calculate three average values, one for each of the three columns. 5 00:00:21,180 --> 00:00:27,300 This can probably be done in different ways using different libraries, but one easy way to do this is 6 00:00:27,300 --> 00:00:29,310 to use the pandas library. 7 00:00:29,310 --> 00:00:32,480 So the pandas library is a data analysis library, and 8 00:00:32,560 --> 00:00:40,950 it's like an interface that creates a data frame object inside Python, a data frame object, like a structure 9 00:00:40,980 --> 00:00:42,780 that holds the data. 10 00:00:42,780 --> 00:00:47,050 So you've got to think of it as yet another Python data type. 11 00:00:47,070 --> 00:00:52,140 So as you work with other libraries you will discover other complex data types. 12 00:00:52,140 --> 00:00:59,540 So first I'll go ahead and calculate the average value of all the values inside the first .txt file. 13 00:00:59,640 --> 00:01:05,770 First of all, I'll import the pandas library, and what we want to do first is generate a data frame 14 00:01:05,910 --> 00:01:09,580 object out of the first text file. 15 00:01:09,720 --> 00:01:13,340 So we will read the data of the first text file. 16 00:01:13,530 --> 00:01:19,520 We will store those data in a data frame object, and we want to store the object in a variable. 17 00:01:19,620 --> 00:01:25,830 So remember, variables are like containers where we can store everything, and I'll call my variable df, 18 00:01:27,460 --> 00:01:32,970 then I'll generate the data frame using the pandas read_csv method.
19 00:01:33,150 --> 00:01:40,540 So read_csv stands for read character-separated values, and our text file is such a file. 20 00:01:40,740 --> 00:01:43,530 So it has that data structure. 21 00:01:43,530 --> 00:01:49,660 The file could also be a CSV file, but either way things work the same. 22 00:01:49,710 --> 00:01:55,730 So pandas.read_csv, and we need to pass a parameter to the read_csv method. 23 00:01:55,740 --> 00:02:02,940 That parameter is the path of the file that we want to convert to a data frame, so we want to convert that 24 00:02:02,940 --> 00:02:09,580 file, and I'll pass my path here, which includes the name of the file. 25 00:02:09,610 --> 00:02:12,560 And notice here that the path is a string data type. 26 00:02:12,690 --> 00:02:19,700 So Python recognizes string objects. When you pass the string to the read_csv method, 27 00:02:19,710 --> 00:02:26,100 the read_csv method parses the string, so it understands string objects, and locates the file in the computer 28 00:02:26,850 --> 00:02:28,670 through the given string. 29 00:02:29,100 --> 00:02:35,150 So let's hit Enter, and we have a data frame object generated. To display the data frame, 30 00:02:35,160 --> 00:02:42,610 you can do that by just calling the variable that holds the data frame, so this is the data frame. 31 00:02:42,800 --> 00:02:51,190 So this is constructed of a header, which in this case is the value string, and the data frame also has 32 00:02:51,190 --> 00:02:55,680 an index column, which was assigned some default values as you see here. 33 00:02:55,740 --> 00:02:59,100 And of course the data frame also has the data values. 34 00:02:59,110 --> 00:03:05,290 So to calculate the mean value of all the different values, we just go ahead and apply the mean method 35 00:03:05,500 --> 00:03:06,710 to the data frame object. 36 00:03:06,760 --> 00:03:10,000 So df.mean(). 37 00:03:10,120 --> 00:03:12,010 This is the mean value. 38 00:03:12,010 --> 00:03:13,580 So it was quite easy.
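The single-file steps described so far (import pandas, read the text file into a data frame, apply the mean method) can be sketched roughly as follows. This is only an illustration under stated assumptions: the file name file1.txt, the temporary folder, and the sample numbers are hypothetical stand-ins, since the lecture's actual path and data are not given.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the first text file: a single column with
# the header "value". The real file and its location are not known.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "file1.txt")
with open(path, "w") as f:
    f.write("value\n10\n20\n30\n")

# read_csv receives the file path as a string and returns a DataFrame.
df = pd.read_csv(path)

# Typing df in an interactive console displays the header, the default
# integer index, and the values; print does the same in a script.
print(df)

# mean() on a DataFrame returns a Series with one entry per numeric column.
means = df.mean()
print(means)
```

Because the result of mean() is indexed by column name, the single number can be picked out with means["value"].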
39 00:03:13,600 --> 00:03:18,100 Now we can calculate the average values of the other files the same way. 40 00:03:18,520 --> 00:03:24,780 So we can repeat the same actions, but here we have three files only. We can do that, but 41 00:03:24,820 --> 00:03:32,690 imagine if we had lots and lots of files, and in such a case we would need to use a for loop. 42 00:03:33,070 --> 00:03:35,050 That's exactly what I'll do now. 43 00:03:35,140 --> 00:03:41,950 But in order to iterate through all the file paths, I first need to have all these file path strings 44 00:03:42,420 --> 00:03:47,550 in a list. A smart way to get that list easily is to use the glob2 library. 45 00:03:47,560 --> 00:03:53,130 So first you should go ahead and install the library in your command line using pip 46 00:03:53,320 --> 00:03:58,740 install glob2. Good. Once the library has been installed, 47 00:03:58,780 --> 00:04:00,800 you don't have to restart anything. 48 00:04:00,820 --> 00:04:06,280 Just go ahead and import the library in your console. What you want to use is a method of the library that 49 00:04:06,280 --> 00:04:14,050 generates a list of file paths. That method is called glob, and I want to store the list of the file 50 00:04:14,050 --> 00:04:17,090 path strings inside a variable, of course. 51 00:04:17,320 --> 00:04:20,490 So let's say files equals glob2.glob. 52 00:04:20,680 --> 00:04:27,320 So, the glob method of the glob2 library, and here what we need to pass is the full path where the files are 53 00:04:27,340 --> 00:04:28,340 located. 54 00:04:28,460 --> 00:04:30,370 So that is C.. 55 00:04:30,520 --> 00:04:39,470 Now this is, and here we can play around by using an asterisk and a .txt after that. 56 00:04:39,540 --> 00:04:47,230 That means we want all the files with the .txt extension. Let's execute that and see what we got 57 00:04:47,230 --> 00:04:48,740 in the files variable. 58 00:04:49,030 --> 00:04:53,090 So it stored all the .txt file paths.
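Collecting the file paths with a wildcard pattern might look like the sketch below. Note the swap: it uses Python's standard library glob module rather than the third-party library installed in the lecture, because the standard module needs no installation and its glob function accepts the same kind of pattern. The folder and file names are hypothetical, created only so the example runs anywhere.

```python
import glob
import os
import tempfile

# Hypothetical folder with three .txt files and one unrelated file.
folder = tempfile.mkdtemp()
for name in ("file1.txt", "file2.txt", "file3.txt", "notes.csv"):
    with open(os.path.join(folder, name), "w") as f:
        f.write("value\n1\n")

# The asterisk matches any file name, so "*.txt" selects every file
# in the folder whose name ends with .txt.
pattern = os.path.join(folder, "*.txt")

# glob.glob returns a list of matching path strings; the order is not
# guaranteed, so the list is sorted here to make it predictable.
files = sorted(glob.glob(pattern))
print(files)
```

The result is an ordinary Python list of strings, which is what a for loop needs in order to visit each file in turn.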
59 00:04:53,110 --> 00:04:56,550 Now we have a list of file paths and we can iterate through 60 00:04:56,570 --> 00:05:01,270 that list and apply exactly the same action to all the items of the list. 61 00:05:01,360 --> 00:05:11,560 Let's say for file in files, we create a data frame object and call the read_csv method of the pandas library, 62 00:05:12,130 --> 00:05:15,010 then pass the file path to that method. 63 00:05:15,550 --> 00:05:18,660 So that would be the file variable. 64 00:05:18,670 --> 00:05:26,190 Think of the file variable as a temporary variable, so it will have a different value for each iteration. 65 00:05:26,500 --> 00:05:34,500 Next we can calculate the mean value of the current data frame for the iteration, and then we print out 66 00:05:34,770 --> 00:05:36,330 the value on the screen. 67 00:05:36,430 --> 00:05:37,430 That's it. 68 00:05:37,540 --> 00:05:40,830 Execute, and we get three values here. 69 00:05:41,100 --> 00:05:48,270 So each of the values represents the average value for each text file. These values are actually of a specific 70 00:05:48,270 --> 00:05:50,870 pandas data type called Series. 71 00:05:50,880 --> 00:05:58,890 So if you'd like to actually return a plain float Python data type, you could apply the float function to 72 00:05:58,890 --> 00:06:00,100 these numbers. 73 00:06:00,390 --> 00:06:01,320 So let's do just that. 74 00:06:01,350 --> 00:06:07,700 Let's modify our code, and just press the up arrow key to recall the previously executed statement. 75 00:06:09,020 --> 00:06:16,280 What you want to do is add a line that converts the Series data to a float data type. We can use the float 76 00:06:16,280 --> 00:06:17,090 function for that. 77 00:06:17,210 --> 00:06:24,500 So as soon as the variable is created, we will grab the value of that variable and pass it to the float 78 00:06:24,500 --> 00:06:27,600 function, which will convert it to float data. 79 00:06:28,000 --> 00:06:29,120 That's it. 80 00:06:29,120 --> 00:06:30,280 Execute.
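Putting the pieces together, the loop with the float conversion could be sketched as below. The file names, the sample numbers, and the averages list are hypothetical additions for illustration. One deliberate difference from the spoken steps: the sketch selects the single "value" entry from the Series before calling float, because recent pandas releases may warn when float is applied directly to a one-element Series.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical sample files, each with one column headed "value".
folder = tempfile.mkdtemp()
samples = {
    "file1.txt": [1, 2, 3],
    "file2.txt": [10, 20],
    "file3.txt": [5],
}
for name, values in samples.items():
    with open(os.path.join(folder, name), "w") as f:
        f.write("value\n" + "\n".join(str(v) for v in values) + "\n")

# Collect the .txt paths (standard library glob, sorted for a stable order).
files = sorted(glob.glob(os.path.join(folder, "*.txt")))

averages = []
for file in files:
    # "file" holds a different path string on each iteration.
    df = pd.read_csv(file)
    # mean() returns a Series; pick the "value" entry and convert it
    # to a plain Python float.
    average = float(df.mean()["value"])
    averages.append(average)
    print(average)
```

Keeping the results in a list as well as printing them makes the three averages available for later use, which printing alone would not.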
81 00:06:30,590 --> 00:06:33,310 And here we have three mean values. 82 00:06:33,650 --> 00:06:37,670 So I hope this was fun and full of useful information for you. 83 00:06:37,670 --> 00:06:42,560 Please feel free to post a question if there is anything you need to talk about, and see you in the 84 00:06:42,560 --> 00:06:43,150 next lecture.