
The name RoboCup is a contraction of the competition's full name, “Robot Soccer World Cup”. RoboCup is an annual international robotics competition proposed and founded in 1996 (Pre-RoboCup) by a group of university professors (among them Hiroaki Kitano, Manuela M. Veloso, and Minoru Asada). The aim of the competition is to promote robotics and AI research by offering a publicly appealing but formidable challenge.

“Competition pushes advances in technologies. What we learn from robots playing soccer or navigating a maze can be applied to industry and help us solve difficult real-world problems,” according to Professor Maurice Pagnucco, Head of the School of Computer Science and Engineering at UNSW.


RoboCup provides an insight into how the world will look in the coming years as advances are made in this field, and artificial intelligence and machine learning are what make it possible. Machine Learning (ML) and Knowledge Discovery (KD) are research areas with many different applications that share a common objective: acquiring new information from data. Several ML techniques are applied to identifying the opponent team and to classifying robotic soccer formations in the context of the RoboCup international robotic soccer competition.

Artificial Intelligence (AI) and Machine Learning (ML) form the basis of the robots competing in RoboCup. Many AI-based algorithms let robots learn from their actions, gather information, and convert it into knowledge for future scenarios, so that they can make rational decisions based on past facts and the knowledge gained up to that point. Some of the algorithms in use are as follows.

1. Reinforcement Learning

2. Q-Learning

3. Hill-Climbing

Reinforcement Learning

The basic concept of reinforcement learning is to let the agent know when it is performing well or badly. The reinforcement is a kind of reward (or punishment) given to the agent as a performance measure after a finished learning phase. Since reinforcements are given to the agent only after a finished learning phase, and not after each decision, the agent is only aware of its overall performance in that phase. The agent therefore cannot directly know which of its individual actions were good or bad.
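
To make this credit-assignment problem concrete, here is a minimal Python sketch of such an episodic scheme. The actions, the toy reward rule, and the learning rate are assumptions made purely for the example, not part of any RoboCup software: every action taken in the phase receives the same share of the single end-of-phase reward, which is precisely why the agent cannot tell the good actions from the bad ones.

import random

ACTIONS = ["dribble", "pass", "shoot"]   # hypothetical soccer actions
weights = {a: 1.0 for a in ACTIONS}      # action preferences

def choose_action():
    """Sample an action with probability proportional to its weight."""
    r = random.uniform(0.0, sum(weights.values()))
    for action, w in weights.items():
        r -= w
        if r <= 0.0:
            return action
    return ACTIONS[-1]

def phase_reward(actions):
    """Toy stand-in for the environment: judge the whole phase at once,
    +1 if the agent ever shot at goal, -1 otherwise (an assumption)."""
    return 1.0 if "shoot" in actions else -1.0

def run_phase(steps=10, lr=0.1):
    taken = [choose_action() for _ in range(steps)]
    reward = phase_reward(taken)   # arrives only after the phase ends,
    for action in taken:           # so every action gets equal credit
        weights[action] = max(0.01, weights[action] + lr * reward)

for _ in range(100):
    run_phase()
print(weights)   # "shoot" tends to accumulate the largest weight

Because "shoot" only ever appears in rewarded phases, it is the only action that is never punished, so its weight grows over many phases even though no single decision was ever evaluated on its own.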

Q-Learning

Q-learning is a form of
reinforcement learning where not only the states, but also the actions, are
associated with a utility.

U(s) = max_a Q(s, a)

where U(s) is the utility of state s and Q(s, a) is the Q-value associated with taking action a in state s. An advantage of Q-learning over ordinary reinforcement learning is that no model of the environment is needed in order to select the action leading to the preferred state. All the agent has to know is which actions are legal in the current state, so that it can compare the utilities of each action in that state. This leads to efficient implementations, since the state-action utility mapping can be stored in a lookup table, usually referred to as a Q-table.
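
The sketch below shows what such a Q-table looks like in practice: a minimal Python example on an assumed toy corridor world. The environment, the epsilon-greedy action choice, and the parameters alpha and gamma are illustrative assumptions, not taken from this text; the update rule is the standard one-step Q-learning update.

import random

# Toy corridor: states 0..4, actions 0 = left, 1 = right;
# reaching state 4 yields reward 1 and ends the episode (an assumption).
N_STATES, ACTIONS = 5, (0, 1)
alpha, gamma, epsilon = 0.5, 0.9, 0.1

# The Q-table: one stored utility per (state, action) pair.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    """Deterministic toy environment dynamics."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

for _ in range(200):   # training episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: usually pick the action with the highest Q-value.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[s][act])
        s2, r = step(s, a)
        # One-step Q-learning: move Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# U(s) = max_a Q(s, a): the learned utility of each state.
print([round(max(q), 3) for q in Q])

Note that no model of the corridor is stored anywhere: action selection and learning both consult only the lookup table, which is the model-free advantage described above.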

Hill-Climbing

Hill climbing is an iterative search heuristic that tries to find peaks on a surface of states where the height is defined by the evaluation function. The search starts at a selected state in the multidimensional search space, where each dimension corresponds to one of the state-parameters. The algorithm then evaluates all neighboring states and proceeds to the state with the highest value. When no progress is made, a local optimum is found and the algorithm stops.
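
A minimal Python sketch of this loop follows; the two-parameter state space and the concrete evaluation function are assumptions made purely for illustration.

def evaluate(state):
    """Toy evaluation function with a single peak at (3, 7) (an assumption)."""
    x, y = state
    return -((x - 3) ** 2 + (y - 7) ** 2)

def neighbours(state):
    """All states one step away along each dimension."""
    x, y = state
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def hill_climb(start):
    current = start
    while True:
        best = max(neighbours(current), key=evaluate)
        if evaluate(best) <= evaluate(current):
            return current   # no neighbour is higher: local optimum
        current = best

print(hill_climb((0, 0)))    # climbs step by step to (3, 7)

On this single-peaked toy surface the local optimum found is also the global one, but that is not true in general, as the drawbacks below show.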

Hill climbing has the following three well-known drawbacks.

1. Local optimum

The algorithm stops when it finds a local optimum. There is no way to know whether the local optimum is in fact the global optimum. This is a serious problem, since the local optimum found may not be nearly as high as the global optimum.

2. Plateaux

A plateau is an area of the search space where the evaluation function is essentially flat. In such a case the algorithm does not know which path to choose, and the result is a random walk until an edge of the plateau is found.

3. Ridges

A ridge may have steeply sloping sides, leading to fast progress to the top of the ridge; but if the ridge itself rises only slowly towards the optimum, little further progress will be made.
