1
00:00:01,260 --> 00:00:04,550
Hello and welcome back to the course on augmented random search.

2
00:00:04,650 --> 00:00:09,230
And today we're going to have an overview of the A.S. algorithm.

3
00:00:09,240 --> 00:00:16,680
In fact we're going to have an overview of what the goal is for that a r s algorithm and this tutorial

4
00:00:16,680 --> 00:00:27,270
is designed to be an intro for those who don't really know much about artificial intelligence or how

5
00:00:27,270 --> 00:00:31,960
these problems work how these benchmarks work how all this is structured.

6
00:00:31,950 --> 00:00:37,830
So if you already are familiar with AI if you've taken artificial intelligence it is that Course then

7
00:00:37,830 --> 00:00:44,030
bear with us as the turtle will dive straight into the important and fun stuff of arrest.

8
00:00:44,340 --> 00:00:48,930
But nevertheless you will probably still find some useful information to.

9
00:00:49,200 --> 00:00:55,230
And if you haven't done any courses before or you're not aware of AI and how it all works then this

10
00:00:55,230 --> 00:00:59,370
tutorial is going to be very very beneficial to lay the foundations of what we're going to be talking

11
00:00:59,370 --> 00:01:00,180
about.

12
00:01:00,180 --> 00:01:01,750
All right so let's get started.

13
00:01:01,880 --> 00:01:09,090
A.S. stands for mentor and research and it is an algorithm that was developed just recently earlier

14
00:01:09,090 --> 00:01:10,100
in 2018.

15
00:01:10,110 --> 00:01:23,070
I believe it was around March 2018 by Horia money and Ereli a guy at the University of Berkeley University.

16
00:01:23,080 --> 00:01:26,560
And what is this algorithm all about.

17
00:01:26,790 --> 00:01:29,540
Well here's a magical figure.

18
00:01:29,550 --> 00:01:34,560
Magical stands for multi joint dynamics with contact.

19
00:01:34,560 --> 00:01:44,340
You may have seen the Google Video or the video of the Google algorithm that controls these musical

20
00:01:44,340 --> 00:01:48,830
figures to successfully train them how to walk and things like that.

21
00:01:48,840 --> 00:01:50,710
That's that's probably where they are famous from.

22
00:01:50,730 --> 00:01:53,370
So Eris also solves the same problem.

23
00:01:54,360 --> 00:02:01,410
And so midget goes like an engine like a physics engine where you have these figures that have different

24
00:02:01,710 --> 00:02:06,630
kind of like landscapes that they need to get across and they have certain degrees of freedom they have

25
00:02:06,630 --> 00:02:13,670
muscles that they can contract and that will allow them to move for instance this humanoid figure has

26
00:02:13,680 --> 00:02:15,320
22 degrees of freedom.

27
00:02:15,360 --> 00:02:21,060
And so the artificial intelligence or the algorithm that you apply to this environment will have control

28
00:02:21,060 --> 00:02:22,520
of 22 degrees of freedom.

29
00:02:22,710 --> 00:02:26,450
And at the same time it will have inputs from the environment and we'll talk about that just in a second.

30
00:02:26,520 --> 00:02:32,520
So for instance in this case the Mujer core humanoid needs to get from where he is now to the end somewhere

31
00:02:32,520 --> 00:02:36,270
there and you know overcomes certain obstacles not just walk in a.

32
00:02:36,290 --> 00:02:42,710
Walking is quite a difficult challenge to master let alone overcoming obstacles.

33
00:02:42,710 --> 00:02:45,230
And just to get to the end of that.

34
00:02:45,390 --> 00:02:51,150
And so a quick note that arrest is not the only algorithm for Mujica.

35
00:02:51,170 --> 00:02:57,030
So when you go is an engine you can apply any algorithm you can apply deep learning you can apply reinforcement

36
00:02:57,030 --> 00:03:02,200
learning and apply A-3 C you can fly a arrays you can apply any kind of algorithm to go.

37
00:03:02,340 --> 00:03:05,190
I can apply machine learning if you like.

38
00:03:05,570 --> 00:03:06,570
That's that's the point.

39
00:03:06,570 --> 00:03:13,690
This is a benchmark test designed to train to facilitate the training of artificial intelligence.

40
00:03:13,800 --> 00:03:15,450
But we will be looking at.

41
00:03:15,450 --> 00:03:20,910
Would you go throughout the theoretical and practical terms because we need some kind of benchmark to

42
00:03:21,120 --> 00:03:27,210
look into things and that's exactly what Israeli money and really a guy looked at as well in their research

43
00:03:27,210 --> 00:03:27,980
paper.

44
00:03:28,470 --> 00:03:37,050
So then moving on for instance here we have another Yuriko example of a figure and its goal is a half

45
00:03:37,150 --> 00:03:43,230
cheetah so you can tell what to half to because it's kind of like two dimensional.

46
00:03:43,350 --> 00:03:46,380
It's not really two image the dimensional but only has two legs.

47
00:03:46,390 --> 00:03:47,940
That's the point.

48
00:03:48,060 --> 00:03:55,800
And so with this hot cheetos you get it like Lance and then it's it's the eyes actually telling the

49
00:03:55,800 --> 00:04:02,850
finger which muscles to contract which ones do you know how to move its muscles in order to move its

50
00:04:02,850 --> 00:04:06,340
joints in order to proceed to progress forward.

51
00:04:06,360 --> 00:04:14,010
And so if we look at any point in time here you'll see that the environment is constantly giving feedback

52
00:04:14,100 --> 00:04:17,270
to the stick figure or to the model.

53
00:04:17,400 --> 00:04:21,390
And that's what is going in as an input for the AI to make the decision.

54
00:04:21,390 --> 00:04:26,430
So for instance you can see that the back leg of the half cheetahs on the ground and therefore there

55
00:04:26,430 --> 00:04:32,340
is certain pressure that it is applied to the ground and therefore the ground is applying the same pressure

56
00:04:32,340 --> 00:04:38,010
as we know from physics from a high school that the ground will be applying the same force back on the

57
00:04:38,010 --> 00:04:38,240
sheet.

58
00:04:38,250 --> 00:04:44,260
And that's exactly what is being inputted into the algorithm that's controlling the cheater right now.

59
00:04:44,490 --> 00:04:49,050
And the same time you can see based on the reflection you can see the reflection of it here that there

60
00:04:49,050 --> 00:04:52,680
is a reflection that it touches the foot but here the reflection ends over here.

61
00:04:52,680 --> 00:04:53,810
So that's where it is.

62
00:04:53,880 --> 00:04:57,800
You see that this foot is in the air and therefore there is no force applied to it.

63
00:04:57,870 --> 00:05:03,240
So it's different in that case and therefore the cheetah also has feedback that there is a forceable

64
00:05:03,290 --> 00:05:09,930
and you will be using this feedback to make a decision then the next job you can see the forces are

65
00:05:09,930 --> 00:05:10,340
different.

66
00:05:10,340 --> 00:05:16,170
Again I just drew them here approximately may not be exactly correct but the point stands at different

67
00:05:16,710 --> 00:05:18,180
points in time.

68
00:05:18,240 --> 00:05:23,460
The environment is giving different feedback and that shit is using that to learn how to walk as we

69
00:05:23,460 --> 00:05:24,450
humans do as well.

70
00:05:24,450 --> 00:05:30,900
Like if you think about how we walk we're actually falling forward and then we're putting our back foot

71
00:05:30,900 --> 00:05:33,000
in front to prevent us from falling.

72
00:05:33,060 --> 00:05:37,730
And that gives us that forward perpetual movement that we get.

73
00:05:37,770 --> 00:05:40,100
So similar thing that cheater's is going to figure it out.

74
00:05:40,100 --> 00:05:47,960
And the point here is that the cheetah or the humanoid or any other mortal is not preprogrammed.

75
00:05:47,970 --> 00:05:51,390
So the algorithm doesn't know how to walk in advance.

76
00:05:51,390 --> 00:05:53,190
We're not teaching them how.

77
00:05:53,220 --> 00:05:57,580
We're not saying you have to put this leg forward in this leg and at this point this leg and so on.

78
00:05:57,750 --> 00:06:01,650
It has to figure out all this on its own kind of like a baby when a baby learns to walk.

79
00:06:01,830 --> 00:06:07,710
You don't give it an instruction manual or you don't you know sit down there and teach you how to walk

80
00:06:07,710 --> 00:06:11,410
and explain you can explain anything to me because it doesn't are Tokyo.

81
00:06:11,430 --> 00:06:13,890
So and things so it just can't.

82
00:06:13,890 --> 00:06:15,990
It can only do it through trial and error.

83
00:06:16,140 --> 00:06:16,730
Same thing here.

84
00:06:16,740 --> 00:06:20,280
Through trial and error from multiple generations of this cheater.

85
00:06:20,310 --> 00:06:22,170
So what are we seeing here.

86
00:06:23,330 --> 00:06:28,400
This running This is after a lot of training at the start is going to follow is going to experiment

87
00:06:28,410 --> 00:06:30,950
is going to try to get around a little bit then for longer and so on.

88
00:06:31,040 --> 00:06:32,470
But eventually they'll get there.

89
00:06:32,510 --> 00:06:37,730
Same thing as with a baby that will keep falling falling falling and then eventually you learn at all.

90
00:06:37,730 --> 00:06:40,610
So this result is actually a trained result.

91
00:06:40,970 --> 00:06:44,780
So that's the point that these models don't get preprogramed.

92
00:06:44,780 --> 00:06:46,750
They have to learn through trial and error.

93
00:06:48,320 --> 00:06:53,300
And so what we get is for now show we'll talk about the whole trial and error situation.

94
00:06:53,460 --> 00:06:58,560
But further down in this course for now what is important for us to understand is that there are certain

95
00:06:58,620 --> 00:07:04,610
inputs that come from the environment and they go into the AI model for now.

96
00:07:04,620 --> 00:07:06,120
For us it's like a black box.

97
00:07:06,120 --> 00:07:11,580
We don't know what happened but we do know that what comes out is our outputs outputs on how to contract

98
00:07:12,690 --> 00:07:18,780
or how to move certain muscles and joints and you know whatever degrees of freedom are available to

99
00:07:18,780 --> 00:07:25,610
the model how to control the model in order for it to achieve the goal that we want.

100
00:07:25,830 --> 00:07:33,950
So that's an overview of how the our air system or any other artificial intelligence works.

101
00:07:33,960 --> 00:07:39,180
In general you have input something happens in his black box he has I have outputs and then the model

102
00:07:39,450 --> 00:07:43,090
actually behaves and we'll explore this further in the course.

103
00:07:43,090 --> 00:07:47,960
For now if you'd like to get a quick head start into the world of Erris.

104
00:07:48,360 --> 00:07:57,710
There's a great blog post by Bednar or act who is the supervisor of our aliar guy and Horia mania.

105
00:07:58,170 --> 00:08:02,100
So we're not going to jump into their paper right away but you can start with his blog post which will

106
00:08:02,100 --> 00:08:08,130
give you a head start an overview of what we're going to be talking about the blog post is called clues

107
00:08:08,130 --> 00:08:14,340
for which I search and choose and it's available on Armyn which I believe is Ben's Web site.

108
00:08:14,340 --> 00:08:19,120
This link will be available in the course notes.

109
00:08:19,320 --> 00:08:20,070
So there we go.

110
00:08:20,070 --> 00:08:26,280
That's how we're going to end today's tutorial with this wonderful blog post.

111
00:08:26,280 --> 00:08:27,910
I personally enjoyed it.

112
00:08:27,930 --> 00:08:31,860
Hope you enjoyed today's Tauriel I know you have a bit of a better overview of these benchmarks and

113
00:08:32,160 --> 00:08:39,990
how AI in general is structured or like the purpose of AI in this case with the Mujica algorithm and

114
00:08:40,140 --> 00:08:45,810
starting from literally going to dive deeper and deeper into the world of Erris and of that to look

115
00:08:45,810 --> 00:08:47,210
for it see you back here next time.

116
00:08:47,220 --> 00:08:49,130
Until then enjoy.