1 00:00:01,260 --> 00:00:04,550 Hello and welcome back to the course on Augmented Random Search. 2 00:00:04,650 --> 00:00:09,230 Today we're going to have an overview of the ARS algorithm. 3 00:00:09,240 --> 00:00:16,680 In fact, we're going to have an overview of what the goal is for the ARS algorithm, and this tutorial 4 00:00:16,680 --> 00:00:27,270 is designed to be an intro for those who don't really know much about artificial intelligence or how 5 00:00:27,270 --> 00:00:31,960 these problems work, how these benchmarks work, and how all this is structured. 6 00:00:31,950 --> 00:00:37,830 So if you are already familiar with AI, if you've taken the Artificial Intelligence A-Z course, then 7 00:00:37,830 --> 00:00:44,030 bear with us, as the next tutorial will dive straight into the important and fun stuff of ARS. 8 00:00:44,340 --> 00:00:48,930 But nevertheless you will probably still find some useful information too. 9 00:00:49,200 --> 00:00:55,230 And if you haven't done any courses before, or you're not aware of AI and how it all works, then this 10 00:00:55,230 --> 00:00:59,370 tutorial is going to be very, very beneficial to lay the foundations of what we're going to be talking 11 00:00:59,370 --> 00:01:00,180 about. 12 00:01:00,180 --> 00:01:01,750 All right, so let's get started. 13 00:01:01,880 --> 00:01:09,090 ARS stands for Augmented Random Search, and it is an algorithm that was developed just recently, earlier 14 00:01:09,090 --> 00:01:10,100 in 2018. 15 00:01:10,110 --> 00:01:23,070 I believe it was around March 2018, by Horia Mania and Aurelia Guy at the University of California, Berkeley. 16 00:01:23,080 --> 00:01:26,560 And what is this algorithm all about? 17 00:01:26,790 --> 00:01:29,540 Well, here's a MuJoCo figure. 18 00:01:29,550 --> 00:01:34,560 MuJoCo stands for Multi-Joint dynamics with Contact. 
19 00:01:34,560 --> 00:01:44,340 You may have seen the Google video, or the video of the Google algorithm that controls these MuJoCo 20 00:01:44,340 --> 00:01:48,830 figures to successfully train them how to walk and things like that. 21 00:01:48,840 --> 00:01:50,710 That's probably where they are famous from. 22 00:01:50,730 --> 00:01:53,370 So ARS also solves the same problem. 23 00:01:54,360 --> 00:02:01,410 And so MuJoCo is like an engine, like a physics engine, where you have these figures that have different 24 00:02:01,710 --> 00:02:06,630 kinds of landscapes that they need to get across, and they have certain degrees of freedom; they have 25 00:02:06,630 --> 00:02:13,670 muscles that they can contract, and that will allow them to move. For instance, this humanoid figure has 26 00:02:13,680 --> 00:02:15,320 22 degrees of freedom. 27 00:02:15,360 --> 00:02:21,060 And so the artificial intelligence, or the algorithm that you apply to this environment, will have control 28 00:02:21,060 --> 00:02:22,520 of 22 degrees of freedom. 29 00:02:22,710 --> 00:02:26,450 And at the same time it will have inputs from the environment, and we'll talk about that in just a second. 30 00:02:26,520 --> 00:02:32,520 So for instance, in this case the MuJoCo humanoid needs to get from where he is now to the end, somewhere 31 00:02:32,520 --> 00:02:36,270 there, and overcome certain obstacles, not just walk. 32 00:02:36,290 --> 00:02:42,710 Walking is quite a difficult challenge to master, let alone overcoming obstacles 33 00:02:42,710 --> 00:02:45,230 and getting to the end of that. 34 00:02:45,390 --> 00:02:51,150 And so, a quick note that ARS is not the only algorithm for MuJoCo. 35 00:02:51,170 --> 00:02:57,030 MuJoCo is an engine; you can apply any algorithm. You can apply deep learning, you can apply reinforcement 36 00:02:57,030 --> 00:03:02,200 learning, you can apply A3C, you can apply ARS, you can apply any kind of algorithm to MuJoCo. 
37 00:03:02,340 --> 00:03:05,190 You can apply machine learning if you like. 38 00:03:05,570 --> 00:03:06,570 That's the point. 39 00:03:06,570 --> 00:03:13,690 This is a benchmark test designed to facilitate the training of artificial intelligence. 40 00:03:13,800 --> 00:03:15,450 But we will be looking at 41 00:03:15,450 --> 00:03:20,910 MuJoCo throughout the theoretical and practical tutorials, because we need some kind of benchmark to 42 00:03:21,120 --> 00:03:27,210 look into things, and that's exactly what Horia Mania and Aurelia Guy looked at as well in their research 43 00:03:27,210 --> 00:03:27,980 paper. 44 00:03:28,470 --> 00:03:37,050 So then, moving on: for instance, here we have another MuJoCo example of a figure, and it's called a half 45 00:03:37,150 --> 00:03:43,230 cheetah. You can tell why it's a half cheetah, because it's kind of like two-dimensional. 46 00:03:43,350 --> 00:03:46,380 It's not really two-dimensional, but it only has two legs. 47 00:03:46,390 --> 00:03:47,940 That's the point. 48 00:03:48,060 --> 00:03:55,800 And so with this half cheetah, you give it a landscape, and then it's the AI that's actually telling the 49 00:03:55,800 --> 00:04:02,850 figure which muscles to contract, how to move its muscles in order to move its 50 00:04:02,850 --> 00:04:06,340 joints in order to progress forward. 51 00:04:06,360 --> 00:04:14,010 And so if we look at any point in time here, you'll see that the environment is constantly giving feedback 52 00:04:14,100 --> 00:04:17,270 to the stick figure, or to the model. 53 00:04:17,400 --> 00:04:21,390 And that's what is going in as an input for the AI to make the decision. 
54 00:04:21,390 --> 00:04:26,430 So for instance, you can see that the back leg of the half cheetah is on the ground, and therefore there 55 00:04:26,430 --> 00:04:32,340 is a certain pressure that it applies to the ground, and therefore the ground is applying the same pressure; 56 00:04:32,340 --> 00:04:38,010 as we know from high school physics, the ground will be applying the same force back on the 57 00:04:38,010 --> 00:04:38,240 cheetah. 58 00:04:38,250 --> 00:04:44,260 And that's exactly what is being input into the algorithm that's controlling the cheetah right now. 59 00:04:44,490 --> 00:04:49,050 At the same time, you can see, based on the reflection here, that there 60 00:04:49,050 --> 00:04:52,680 the reflection touches the foot, but here the reflection ends over here. 61 00:04:52,680 --> 00:04:53,810 So that's where it is. 62 00:04:53,880 --> 00:04:57,800 You see that this foot is in the air, and therefore there is no force applied to it. 63 00:04:57,870 --> 00:05:03,240 So it's different in that case, and therefore the cheetah also has feedback that there is no force applied there, 64 00:05:03,290 --> 00:05:09,930 and it will be using this feedback to make a decision. Then in the next shot you can see the forces are 65 00:05:09,930 --> 00:05:10,340 different. 66 00:05:10,340 --> 00:05:16,170 Again, I just drew them here approximately; they may not be exactly correct, but the point stands: at different 67 00:05:16,710 --> 00:05:18,180 points in time 68 00:05:18,240 --> 00:05:23,460 the environment is giving different feedback, and the cheetah is using that to learn how to walk, as we 69 00:05:23,460 --> 00:05:24,450 humans do as well. 70 00:05:24,450 --> 00:05:30,900 If you think about how we walk, we're actually falling forward, and then we're putting our back foot 71 00:05:30,900 --> 00:05:33,000 in front to prevent us from falling. 72 00:05:33,060 --> 00:05:37,730 And that gives us that forward perpetual movement that we get. 
73 00:05:37,770 --> 00:05:40,100 So it's a similar thing that the cheetah is going to figure out. 74 00:05:40,100 --> 00:05:47,960 And the point here is that the cheetah or the humanoid or any other model is not preprogrammed. 75 00:05:47,970 --> 00:05:51,390 So the algorithm doesn't know how to walk in advance. 76 00:05:51,390 --> 00:05:53,190 We're not teaching them how. 77 00:05:53,220 --> 00:05:57,580 We're not saying you have to put this leg forward, then this leg, and at this point this leg, and so on. 78 00:05:57,750 --> 00:06:01,650 It has to figure all this out on its own, kind of like a baby when a baby learns to walk. 79 00:06:01,830 --> 00:06:07,710 You don't give it an instruction manual, or you don't sit down there and teach it how to walk 80 00:06:07,710 --> 00:06:11,410 and explain; you can't explain anything to it because it doesn't talk yet 81 00:06:11,430 --> 00:06:13,890 or understand things, so you just can't. 82 00:06:13,890 --> 00:06:15,990 It can only do it through trial and error. 83 00:06:16,140 --> 00:06:16,730 Same thing here: 84 00:06:16,740 --> 00:06:20,280 through trial and error, over multiple generations of this cheetah. 85 00:06:20,310 --> 00:06:22,170 So what we are seeing here, 86 00:06:23,330 --> 00:06:28,400 this running, this is after a lot of training. At the start it's going to fall, it's going to experiment, 87 00:06:28,410 --> 00:06:30,950 it's going to try to get around a little bit, then for longer, and so on. 88 00:06:31,040 --> 00:06:32,470 But eventually it'll get there. 89 00:06:32,510 --> 00:06:37,730 Same thing as with a baby that will keep falling, falling, falling, and then eventually learns it all. 90 00:06:37,730 --> 00:06:40,610 So this result is actually a trained result. 91 00:06:40,970 --> 00:06:44,780 So that's the point: these models don't get preprogrammed. 92 00:06:44,780 --> 00:06:46,750 They have to learn through trial and error. 
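Editor's sketch: the trial-and-error idea described above can be illustrated with a very small Python example. This is not the ARS update from the paper and it does not use MuJoCo; it is plain hill-climbing random search on a made-up toy task. The names `run_episode` and `train`, the hidden target rule, and all the numbers are assumptions chosen only for illustration.

```python
import random


def run_episode(weights, steps=50):
    """Toy stand-in for a simulator (hypothetical, not MuJoCo).

    The learner is rewarded when its action is close to a hidden
    'right move' that it is never told about directly.
    """
    rng = random.Random(0)  # fixed seed: every candidate faces the same episode
    total_reward = 0.0
    for _ in range(steps):
        observation = [rng.uniform(-1, 1) for _ in range(len(weights))]
        # The policy: a weighted sum of the inputs.
        action = sum(w * o for w, o in zip(weights, observation))
        # Hidden rule, unknown to the learner (an assumption of this toy task).
        target = 0.5 * observation[0] - 0.3 * observation[1]
        total_reward -= (action - target) ** 2
    return total_reward


def train(iterations=200, noise=0.1, seed=1):
    """Trial and error: try a random tweak, keep it only if it scores better."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]          # nothing is preprogrammed
    best = run_episode(weights)
    history = [best]
    for _ in range(iterations):
        candidate = [w + rng.gauss(0, noise) for w in weights]
        score = run_episode(candidate)
        if score > best:          # the "fall" is discarded, the improvement is kept
            weights, best = candidate, score
        history.append(best)
    return weights, history


weights, history = train()
print("reward before training:", history[0])
print("reward after training: ", history[-1])
```

The learner is never told the hidden rule; it only sees a score after each attempt, which is the same situation the cheetah is in. The best score can never get worse from one attempt to the next, because a worse tweak is simply thrown away.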
93 00:06:48,320 --> 00:06:53,300 And so we'll talk about the whole trial-and-error situation 94 00:06:53,460 --> 00:06:58,560 further down in this course. For now, what is important for us to understand is that there are certain 95 00:06:58,620 --> 00:07:04,610 inputs that come from the environment and they go into the AI model. For now, 96 00:07:04,620 --> 00:07:06,120 for us it's like a black box. 97 00:07:06,120 --> 00:07:11,580 We don't know what happens inside, but we do know that what comes out is our outputs: outputs on how to contract 98 00:07:12,690 --> 00:07:18,780 or how to move certain muscles and joints and whatever degrees of freedom are available to 99 00:07:18,780 --> 00:07:25,610 the model, how to control the model in order for it to achieve the goal that we want. 100 00:07:25,830 --> 00:07:33,950 So that's an overview of how the ARS system or any other artificial intelligence works. 101 00:07:33,960 --> 00:07:39,180 In general, you have inputs, something happens in this black box, the AI, you have outputs, and then the model 102 00:07:39,450 --> 00:07:43,090 actually behaves, and we'll explore this further in the course. 103 00:07:43,090 --> 00:07:47,960 For now, if you'd like to get a quick head start into the world of ARS, 104 00:07:48,360 --> 00:07:57,710 there's a great blog post by Ben Recht, who is the supervisor of Aurelia Guy and Horia Mania. 105 00:07:58,170 --> 00:08:02,100 So we're not going to jump into their paper right away, but you can start with his blog post, which will 106 00:08:02,100 --> 00:08:08,130 give you a head start, an overview of what we're going to be talking about. The blog post is called "Clues 107 00:08:08,130 --> 00:08:14,340 for Which I Search and Choose", and it's available on argmin, which I believe is Ben's website. 108 00:08:14,340 --> 00:08:19,120 This link will be available in the course notes. 109 00:08:19,320 --> 00:08:20,070 So there we go. 
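Editor's sketch: the inputs, black box, outputs picture described above can be written as a short loop. This is a minimal illustration under stated assumptions, not the MuJoCo API: `ToyEnvironment`, its `reset` and `step` methods, the `black_box` function, and the sizes (3 inputs, 2 outputs) are all hypothetical.

```python
import random


class ToyEnvironment:
    """Hypothetical stand-in for a physics simulator (not the MuJoCo API)."""

    def __init__(self, n_inputs=3, n_outputs=2, seed=0):
        self.n_inputs = n_inputs      # how many feedback values the model receives
        self.n_outputs = n_outputs    # how many "muscles" the model controls
        self.rng = random.Random(seed)

    def reset(self):
        """Start an episode and return the first inputs."""
        return [self.rng.uniform(-1, 1) for _ in range(self.n_inputs)]

    def step(self, action):
        """Apply the outputs, then return new inputs and a reward."""
        assert len(action) == self.n_outputs
        reward = -sum(a * a for a in action)  # toy reward: penalise large actions
        observation = [self.rng.uniform(-1, 1) for _ in range(self.n_inputs)]
        return observation, reward


def black_box(observation, weights):
    """The 'AI model': here just one weighted sum of the inputs per output."""
    return [sum(w * o for w, o in zip(row, observation)) for row in weights]


env = ToyEnvironment()
weights = [[0.1, -0.2, 0.3],   # one row of weights per output
           [0.0, 0.5, -0.1]]

observation = env.reset()
total_reward = 0.0
for t in range(10):
    action = black_box(observation, weights)    # inputs go in, outputs come out
    observation, reward = env.step(action)      # the environment gives new feedback
    total_reward += reward

print("total reward over 10 steps:", total_reward)
```

Training, whichever algorithm is used, amounts to changing what is inside `black_box` (here, the numbers in `weights`) so that the total reward goes up.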
110 00:08:20,070 --> 00:08:26,280 That's how we're going to end today's tutorial, with this wonderful blog post. 111 00:08:26,280 --> 00:08:27,910 I personally enjoyed it. 112 00:08:27,930 --> 00:08:31,860 Hope you enjoyed today's tutorial. I hope you now have a bit of a better overview of these benchmarks and 113 00:08:32,160 --> 00:08:39,990 how AI in general is structured, or the purpose of AI in this case with the MuJoCo benchmark, and 114 00:08:40,140 --> 00:08:45,810 starting from the next tutorial we're going to dive deeper and deeper into the world of ARS, and I look 115 00:08:45,810 --> 00:08:47,210 forward to seeing you back here next time. 116 00:08:47,220 --> 00:08:49,130 Until then, enjoy.