CMPT 310 Project Proposal

RoboCup Team

UZUL

0. Contents

Evaluation Criteria

Basic Problem

Passing Avoidance Strategy

Motionless Blockers

Simple Path Blockers

Enemy Follows Ball

Motion + Formation Strategy

Passing in Moving Formation

Generating Formations

Switching Formations

AI Tools and Design

References

1. Introduction

Our group will be entering a RoboCup team called UZUL (U snooZe U Lose).

The two coaches for the team are Blair Leggett (bleggett@sfu.ca) and Stephen Moore (smmoore@sfu.ca). Our approach will be to solve a series of problems involving passing among a collection of RoboCup client-programs. These problems start with a simple case and become progressively more complex. By applying machine learning to the results of these simulations, we hope to teach the client programs better teamwork and formation passing skills, which should increase our chances of winning a RoboCup match. The primary focus of our team is to solve these AI problems first, and to worry about winning matches second. This way we will have concrete results that we can show for our research.

2. A need for team based passing skills, and a simplified scenario in which to teach them to client programs.

Obviously, the challenge of RoboCup is creating the program for soccer team members that will allow a team to score on an opponent while defending their own goal. However, the challenge of scoring and defending does little to describe the algorithmic challenge faced by programmers. In the paper Two Fielded Teams and Two Experts (2FT2E) by Tambe, Kaminka, Marsella, Muslea and Raines, the challenge of RoboCup is divided into three areas: learning, teamwork, and agent modelling. A RoboCup team must balance its efforts across all three of these areas; however, for the purposes of this initial discussion let us focus on the challenge of teamwork as it relates to soccer. In 2FT2E the authors also show the difference in performance for a RoboCup team that is able to communicate strategy effectively. In this case, the individual team members have complex monitoring capabilities, making them situationally well-aware and hence independent of others (for monitoring); however, the team is made stronger because when one member recognizes a strategic threat or advantage, that member alerts the others so that they may defend against or exploit the situation. When this was done, teams could improve their mean score by one or even up to three, depending on the degree of communication noise for the client RoboCup programs. With this conclusion in mind, we will further analyze the problem of group communication and teamwork in the situational problems that arise in a game of soccer.

In the paper Learning of Cooperative actions in multi-agent systems: a case study of pass play in Soccer by Matsubara, Noda, and Hiraki, the simplest problem of soccer passing is studied and subjected to machine-learning. Two instances are studied, and they are shown in the diagrams below.

Figure 1: Situation where Player 2 should pass

Figure 2: Situation where Player 2 should shoot on goal

In the first example, player 2 is better off passing to fellow team-member player 3, who is clear to take the shot. In the second example, passing to player 3 is not advisable since player 2 has a clear shot on the goal. Through the use of machine-learning, a formula based on the direction to the opponent player is determined, which specifies whether to take a shot on goal or to pass to a team-member. The machine-learning in this case is the generation of a set of rules and probability functions that give the client-player code a better ability to perform simple soccer tasks. Matsubara, Noda and Hiraki’s paper concluded that after learning the player program did not make fruitless shots on goal, and that the success rate was greatly improved.

Following from the conclusions and analysis of the previously discussed papers, the challenge of the UZUL RoboCup team is to develop an effective team-based passing strategy. Similar to the shooting strategy in the Learning of Cooperative actions paper, the challenge here is to develop a broadly based team strategy, which requires much more communication than the simplified case of shooting on a goal between two players. This is a very complex problem, but it is better solved by starting from a very simple problem and progressively building complexity. There are four ‘standard’ soccer formations that are used in typical human games: each formation describes the number of team-members devoted to each of three field regions: forwards, mid-fielders, and defence (also assuming one player for goalie). Figures 3-6 show the layout for these four formations: 2-4-4, 3-3-4, 4-2-4, and 2-5-3.

Figure 3: 2-4-4 Soccer Formation

Figure 4: 3-3-4 Soccer Formation

Figure 5: 4-2-4 Soccer Formation

Figure 6: 2-5-3 Soccer Formation

The significance of these formations is that RoboCup teams have started to use them to create “locker-room roles” for the client programs. A team will assign a program to play the role of a “forward”, for example, to define its relative position on the field and help establish its behaviour in terms of agent modelling. Further, RoboCup teams like the champion team CMUnited have begun using such formations in strategic situations. The CMUnited99 team initially employs a less risky 2-4-4 formation, which puts fewer members in forward positions to give more protection in the mid-field. But once ahead in score, the team will switch to a more aggressive 3-3-4 formation.

Returning to the UZUL programming problem, from now on we will assume that strategy development and machine-learning will be done when the team is in a 4-2-4 formation. We start with the problem of having the client programs take up the correct formation and then pass the ball between members. The challenge for this simplified problem is having the programs know what distance apart they should be, how much power to put in their kicks, and how to adjust for field noise when the ball goes off-course or stops short of the intended receiver of a pass. The members will also have to know who the possible candidates are for receiving a pass from their particular position. The need to account for errors is best represented by the four player problem. Consider the situation in Figure 7:

Figure 7: Four Player Problem

In this situation, which player should go for the ball? Judging by distance, player 6 should dash and pass the ball to one of the other players. But if every player runs for the ball, then resources and stamina will be wasted in this group effort. Thus it seems more logical to devise a method where each player is meant to control a specific area of the field. When the ball is resting within that area, then that player, and only that player, runs for the ball. This saves the other players from wasting energy, since they would never reach the ball in time.

However, the challenge in RoboCup is the stress factor involved with playing a game. Time is a strict requirement: the ball must be reached in the shortest possible time. Also, the player clients have a limited communications capability for performing this task. And finally, there has been no introduction of the opposing team with which the four players must contend in going for the ball. While these concerns must be addressed, they are not considered for an initial test-case scenario. Thus the initial problem simply concerns how to pass the ball in a team formation, with an effective communications model, minimizing the noise and error factor from wind and field, and solving the four player problem so that when errors do occur the client programs can recover as quickly as possible and continue passing the ball. These, then, are the evaluation factors to be used in the machine learning of this simplified RoboCup scenario.
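
The area-control idea can be pictured with a short Python sketch. This is a minimal illustration rather than the team's implementation: the rectangular regions, the field coordinates, and the player numbers other than 6 are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle of the field controlled by one player.

    The rectangle shape and the coordinates are hypothetical.
    """
    owner: int
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        # Half-open intervals, so a ball on a shared border has one owner.
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


def ball_owner(regions: List[Region], ball_x: float, ball_y: float) -> Optional[int]:
    """Return the number of the only player who should run for the ball."""
    for region in regions:
        if region.contains(ball_x, ball_y):
            return region.owner
    return None  # Ball is outside every assigned area.


# Hypothetical 2 x 2 layout for the four players of the problem.
REGIONS = [
    Region(owner=5, x_min=0.0, x_max=10.0, y_min=0.0, y_max=10.0),
    Region(owner=6, x_min=10.0, x_max=20.0, y_min=0.0, y_max=10.0),
    Region(owner=7, x_min=0.0, x_max=10.0, y_min=10.0, y_max=20.0),
    Region(owner=8, x_min=10.0, x_max=20.0, y_min=10.0, y_max=20.0),
]
```

With this layout a ball resting at (12, 3) belongs to player 6 alone, so the other three players hold their positions and save stamina.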

3. Adding Progressive Complexity to the Learning Scenario

Before proceeding into a discussion about the machine-learning process, more should be said about how the simple scenario will be developed so that more complex team-based behaviour will be achieved.

Once the basic passing formation behaviour has been achieved, two development paths will be possible. Each of these paths will be handled by different members of the programming team. The first path involves increasing the degree of enemy presence in the passing scenario, which underlines the importance of passing avoidance strategies (i.e. being able to pass around an enemy presence). The second path develops the handling of complex motion and formation problems in the scenario. The development stages of these two paths are shown in Figure 8.

Figure 8: UZUL Machine Learning Problem Development Strategy

The remainder of this section will describe each of the scenarios, problems and evaluation factors in each of these six stages, starting with the passing avoidance path.

3.1 Motionless Blockers

In this case, the formation must be able to pass around motionless blockers who attempt to kick the ball away if it passes too close (see Figure 9).

Figure 9: Motionless Blockers Problem

In this situation, the team continues to pass the ball in their formation; however, certain passing combinations are no longer possible due to enemy presence. The client-programs must be able to determine which team-members are available for passing. An interesting problem presents itself in this scenario. Looking at Figure 9, it is not possible for team-member 5 to receive passes due to an overwhelming enemy presence. The client-programs should be able to recognize this difficulty and adjust their passing possibilities accordingly. The evaluation factors of this scenario will be the length of time required to decide on a pass, the number of messages required to handle the scenario, and the ability of the client-programs to avoid enemy interception.
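
Deciding which team-members are available for a pass can be sketched as a geometric test. The Python below is an illustration under assumptions: each blocker is treated as a fixed point that can reach the ball within a hypothetical `kick_radius`, and a pass is treated as a straight line segment from passer to receiver.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line, then clamp to the ends of the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def open_receivers(passer: Point,
                   teammates: Dict[int, Point],
                   blockers: List[Point],
                   kick_radius: float) -> List[int]:
    """Return teammate numbers whose pass lane clears every blocker."""
    result = []
    for number, position in sorted(teammates.items()):
        lane_is_clear = all(
            distance_to_segment(blocker, passer, position) > kick_radius
            for blocker in blockers
        )
        if lane_is_clear:
            result.append(number)
    return result
```

A team-member surrounded by blockers, like team-member 5 in Figure 9, simply never appears in the returned list, which is one way the passing possibilities could be adjusted.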

3.2 Simple Path Blockers

For this situation the blockers are allowed to move back-and-forth in a simple path. In this way their ability to intercept is variable and considerably harder to calculate (due to the intermittent sensory information provided by the soccer server program).

Figure 10: Enemy Motion Based on Simple Paths

As can be seen in Figure 10, the enemy players move in such a way that they cross the gaps of the player formation in every possible way: left-to-right, up-and-down, and both kinds of diagonal movement. Again, the challenge for the client-programs is to learn how to pass in a way that avoids enemy interception of the ball. The evaluation factors of the previous case remain the same for this example; however, time and time-based observations will become more significant.
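
Because sensory information arrives only intermittently, a client may need to predict where a blocker is between observations. The Python sketch below assumes that the two endpoints of the blocker's path and its speed have already been estimated from earlier observations, and that the blocker is at `start` when `t` is 0; none of these values come from the soccer server itself.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def blocker_position(start: Point, end: Point, speed: float, t: float) -> Point:
    """Predict the position of a blocker patrolling between start and end.

    Assumes constant speed and an instant turn at each endpoint.
    """
    length = math.dist(start, end)
    if length == 0.0 or speed == 0.0:
        return start
    # Distance covered within the current round trip (there and back).
    phase = (speed * t) % (2.0 * length)
    if phase <= length:
        fraction = phase / length                   # heading towards end
    else:
        fraction = (2.0 * length - phase) / length  # heading back to start
    return (start[0] + fraction * (end[0] - start[0]),
            start[1] + fraction * (end[1] - start[1]))
```

A predicted position of this kind could then be used in place of an observed one when judging whether a pass lane will be clear at the moment the ball crosses it.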

3.3 Enemy Follows Ball

This scenario represents the most difficult form of enemy presence, and rivals the challenge that would be faced during an actual RoboCup game. In this situation the rival players actively pursue the ball and try to overwhelm a single player by surrounding it, thereby eliminating all passing possibilities.

Figure 11: Enemy Follows Ball

Understandably, this represents the hardest possible challenge for a RoboCup passing strategy. In practice our team may find that such a scenario cannot be overcome, and may only be able to measure the effectiveness of player-client algorithms in terms of how long they can keep the ball away from an enemy team. It may be possible to make the scenario easier or harder by adjusting the enemy team’s speed and dashing capabilities. However, if it is possible to succeed, through machine-learning, at developing an effective keep-away strategy, then this would represent a major breakthrough in designing a global strategy for beating any RoboCup team. Theoretically such a solution should be possible. If the ball can be kicked faster than a robot can run, then a team holding its formation can conserve its resources while an unorganised team that simply chases the ball exhausts its own. It might be possible to fatigue a team using a simple “monkey-in-the-middle” strategy where enemy players are kept running after the ball in a formation designed to avoid interception. We are not aware of any team using these concepts in an official RoboCup tournament. Such strategies may prove fruitless, but the purpose of our research is to test the viability of such approaches.

And now let’s remove the enemy presence from our learning scenarios, and go back and look at steadily increasing the complexity of same-team player motion.
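
The claim that a kicked ball can outrun a chasing robot can be checked with a simple race. The Python sketch below is an illustration under assumptions: the ball is assumed to lose a fixed fraction of its speed each simulation cycle (`decay`), the chaser is assumed to run at a constant speed, and all numeric values passed in are placeholders rather than values read from the soccer server configuration.

```python
import math
from typing import Optional


def ball_cycles_to_travel(distance: float, kick_speed: float, decay: float) -> Optional[int]:
    """Cycles the ball needs to cover distance, or None if it stops short."""
    travelled = 0.0
    speed = kick_speed
    cycles = 0
    while travelled < distance:
        if speed < 1e-6:
            return None  # Ball has effectively stopped before arriving.
        travelled += speed
        speed *= decay   # Assumed per-cycle slowdown of the ball.
        cycles += 1
    return cycles


def runner_cycles_to_travel(distance: float, run_speed: float) -> int:
    """Cycles a chaser at constant speed needs to cover distance."""
    return math.ceil(distance / run_speed)


def pass_beats_chaser(pass_distance: float, chaser_distance: float,
                      kick_speed: float, decay: float, run_speed: float) -> bool:
    """True if the ball reaches the receiver before the chaser does."""
    ball_cycles = ball_cycles_to_travel(pass_distance, kick_speed, decay)
    if ball_cycles is None:
        return False
    return ball_cycles < runner_cycles_to_travel(chaser_distance, run_speed)
```

A keep-away formation would want every planned pass to satisfy `pass_beats_chaser` for the nearest enemy, which also shows why passes that are too long fail: the ball stops short and the function returns False.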

3.4 Passing in Moving Formation

The goal in this scenario is to have the entire formation of players move back-and-forth in a simple path:

Figure 12: Passing in Moving Formation

The challenge in this scenario is being able to pass, not to the current location of a player, but to the expected location of a team-player. In this way we are teaching our client-programs a standard soccer skill of having the players “run to the ball” to achieve more efficient motion and ball-handling skills. The evaluation factors will center on how well the clients estimate a team member’s position when limited sensory information about the current position forces them to rely on “ghosts”, or older position information. Further complexity could perhaps be added to this problem by having the players move in increasingly complex patterns to help them evade the enemy.
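
Passing to an expected location from a “ghost” can be sketched as a small fixed-point calculation. The Python below is an illustration under assumptions: the receiver is assumed to keep the velocity it had when last seen, the ball is assumed to travel at a constant positive speed, and the parameter names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def lead_pass_target(passer: Point, ghost_pos: Point, ghost_vel: Point,
                     ghost_age: float, ball_speed: float,
                     iterations: int = 20) -> Point:
    """Estimate where to aim so ball and receiver arrive together.

    ghost_pos and ghost_vel are the receiver's last observed position and
    velocity; ghost_age is how many cycles old that observation is.
    """
    travel = 0.0
    target = ghost_pos
    for _ in range(iterations):
        # Receiver moves during the age of the ghost plus the ball's flight.
        elapsed = ghost_age + travel
        target = (ghost_pos[0] + ghost_vel[0] * elapsed,
                  ghost_pos[1] + ghost_vel[1] * elapsed)
        # Re-estimate the flight time to the updated target.
        travel = math.dist(passer, target) / ball_speed
    return target
```

The loop settles quickly when the ball is faster than the receiver, which is the situation the scenario assumes. The older the ghost, the further the aim point drifts from the last observed position, so stale observations translate directly into larger passing errors.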

3.5 Generating Formations

Much of what has been developed in “locker-room” roles for player-clients is having the clients assume a role at the beginning of the game, and then maintain that role until the half. The problem of generating formations looks at having a group of unorganized clients communicate and take up a given formation. In this situation the clients themselves assign roles to each other and move into their designated field positions. This is essentially the problem of assigning group roles with limited communication abilities. While solving this problem will not directly help a RoboCup team in their play, it is an important step towards solving the next progressive step (see 3.6). This simulation begins with the robots randomly distributed on their side of the field, and ends once they have formed a prearranged formation: 2-4-4, 3-3-4, 2-5-3, etc. The evaluation factor in this problem will be the time the robots take to form up, which should be as short as possible.
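
One simple way to assign roles with very little communication is a deterministic rule that every client can evaluate on its own. The Python sketch below uses a greedy nearest-pair rule; it assumes every client already knows all player positions and the slot positions of the target formation, and the slot names are hypothetical. A greedy rule is not guaranteed to minimize the total distance travelled.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def assign_roles(players: Dict[int, Point],
                 slots: Dict[str, Point]) -> Dict[int, str]:
    """Greedily match players to formation slots, closest pairs first."""
    # Every (distance, player, slot) combination, nearest first. Ties are
    # broken by player number and slot name, so all clients agree.
    pairs = sorted(
        (math.dist(player_pos, slot_pos), number, name)
        for number, player_pos in players.items()
        for name, slot_pos in slots.items()
    )
    assignment: Dict[int, str] = {}
    taken = set()
    for _, number, name in pairs:
        if number in assignment or name in taken:
            continue
        assignment[number] = name
        taken.add(name)
    return assignment
```

Because the rule is deterministic, clients that share the same position information reach the same assignment without negotiating, which keeps the message count low.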

3.6 Switching Formations

The ultimate realization of autonomous machine learning for a RoboCup team is when it can recognize an opposing team’s weaknesses and autonomously form and use a formation that exploits such a weakness. In this way the RoboCup team has learned how to develop in-game strategy and to assign its own “locker-room” roles. Such a realization would be an ultimate victory for the teamwork and agent modelling aspects of the RoboCup problem. While we do not foresee being able to develop the data to solve such a problem, we will make headway on developing such strategies. We will probably be unable to recognize the “best” formation to use against an opponent, but might try simple heuristics such as attempting several formations during a game and finding whether one is more effective against that team’s program. If such a formation can be found, then that formation would be much more likely to be assigned during a kick-off. Another hope from developing a formation-assigning team would be to change formations when the ball is near the team’s goal so that more defence would be possible, and then switch to a more offensive formation when about to score on the opposing goal. The evaluation factor in such a problem would be to maximize goals against an opponent team.
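
The heuristic of attempting several formations and keeping the most effective one can be sketched as a running tally. The Python below is an illustration under assumptions: effectiveness is measured as average goal difference per trial period, and the class and method names are hypothetical.

```python
from typing import Dict, Iterable, List


class FormationTracker:
    """Tally how each formation has fared against the current opponent."""

    def __init__(self, formations: Iterable[str]) -> None:
        self.formations: List[str] = list(formations)
        self.goal_diff: Dict[str, int] = {f: 0 for f in self.formations}
        self.trials: Dict[str, int] = {f: 0 for f in self.formations}

    def record(self, formation: str, goals_for: int, goals_against: int) -> None:
        """Store the result of one period played in the given formation."""
        self.goal_diff[formation] += goals_for - goals_against
        self.trials[formation] += 1

    def next_formation(self) -> str:
        """Try each formation once, then prefer the best average so far."""
        for formation in self.formations:
            if self.trials[formation] == 0:
                return formation
        return max(self.formations,
                   key=lambda f: self.goal_diff[f] / self.trials[f])
```

A tracker of this kind would be consulted at each kick-off. It does not identify the “best” formation in any deeper sense; it only favours whichever formation has scored better so far.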

4. Machine Learning Process for Solving RoboCup Scenarios

Having discussed the types of problems our team wishes to solve, we should now discuss the approach, and the sorts of AI tools that will be used in solving these problems.

Figure 13 provides a graphical representation of the machine learning process.

Figure 13: Machine Learning Process

From the diagram, the process begins with trying to solve one of the RoboCup passing problems from sections 2 and 3 of this proposal.

From this a simple client program is written that is able to handle four basic tasks. First, the program has to be able to place a designated robot (either friendly or enemy) in a specific field position. Second, the program requires a segment of code that enables it to handle the basic sensory requirements of the problem: these requirements can include ball, field, friend, and enemy positions, movement and possibly even memory of previous team messages. This information will form the bulk of the sensory input for the client program. Third, a segment of code evaluates the sensory input to form the actions that the program will perform to try to solve the RoboCup problem. This segment will be heavily revised as the learning process goes through several iterations (which will hopefully improve the program). The actions are grouped into three major categories: motion, kicking (i.e. passing), and communication. Finally, there is a segment of code to adjust when passing is disrupted due to errors or noise: a good example would be when a pass misses, or when a passed ball falls short of its intended target.

With this basic program written, the RoboCup simulation is run, which will probably involve only one side of the field (to avoid offside calls from the referee). The time duration will be set to a large number to allow for a greater range of responses. Running the simulation produces a logfile, which is necessary for the next part of the machine learning process.

Now for the AI portion of the problem. The RoboCup logfile is taken and analysed. First the file must be parsed and broken down into understandable values for machine analysis. One of the online RoboCup logfile parser tools will be used to streamline this process. The data produced by this process is then subjected to data mining (using DBMiner) to try to find patterns or rules that would improve the client program’s behaviour, as measured by the evaluation factors listed in section 3. The data mining could take several forms, but we have listed two probable methods for improving client-code performance. The first mining method is the extraction of association rules to provide algorithmic guidelines for passing. Second, since there are many factors involved with passing, it might be necessary to note the factors that were present when a pass was successful. By plotting these factors in a multi-dimensional graph it might be possible to find regions of successful passing behaviour, which would again provide a rule-based system for improving client-program performance. It should also be noted that it may be possible to “eyeball” the analysed data and generate a set of rules and heuristics from human analysis.

Once a set of rules has been generated (either by data mining or by human interpretation), the rules can be programmed into a second version of the client-program. Much of the problem of RoboCup involves the challenge of being able to generate effective responses in a short duration of time. Therefore, many of these generated rules would be organized into a decision tree for the client-programs. This would give the programs a divide-and-conquer algorithm for generating a response to a situation by following a simple set of questions. We are hoping that such an analysis will work given the time constraints.

Once the new decision tree has been constructed, it is programmed into the clients, which will hopefully allow them to solve the RoboCup problem in a better and more efficient fashion. The RoboCup simulation is then run again with this new code, which results in a new iteration of learning for the client programs.
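
To make the association-rule idea concrete, the Python sketch below computes the two usual measures of a rule, support and confidence, over a list of pass records. It is a hand-written illustration and not DBMiner's interface, and the item names used to describe a pass (such as "lane_clear") are invented for the example.

```python
from typing import FrozenSet, Iterable, List, Tuple


def rule_stats(records: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: str) -> Tuple[float, float]:
    """Return (support, confidence) for the rule antecedent -> consequent.

    Each record is the set of items observed for one pass attempt.
    """
    condition = frozenset(antecedent)
    both = condition | {consequent}
    matches_condition = sum(1 for record in records if condition <= record)
    matches_both = sum(1 for record in records if both <= record)
    # Support: share of all passes where condition and outcome both hold.
    support = matches_both / len(records) if records else 0.0
    # Confidence: share of passes meeting the condition that had the outcome.
    confidence = matches_both / matches_condition if matches_condition else 0.0
    return support, confidence
```

A rule such as "lane_clear implies success" would only be turned into client behaviour if both measures are high enough; the thresholds would have to be chosen from the actual logfile data.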
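
A decision tree of this kind can be represented as a small data structure that the client walks one question at a time. The Python sketch below is illustrative only: the questions, the action names, and the fields of the `situation` dictionary are hypothetical and are not rules produced by the mining step.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    """End of a branch: the action the client should perform."""
    action: str


@dataclass
class Node:
    """A yes/no question about the current situation."""
    question: Callable[[dict], bool]
    yes: "Tree"
    no: "Tree"


Tree = Union[Leaf, Node]


def decide(tree: Tree, situation: dict) -> str:
    """Follow the questions from the root until an action is reached."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.question(situation) else tree.no
    return tree.action


# Hypothetical two-question tree for a player in the passing scenario.
PASSING_TREE = Node(
    question=lambda s: s["has_ball"],
    yes=Node(
        question=lambda s: len(s["open_receivers"]) > 0,
        yes=Leaf("pass"),
        no=Leaf("hold_ball"),
    ),
    no=Leaf("move_to_position"),
)
```

Each decision costs only as many checks as the tree is deep, which is the property that matters given the short time available to respond in each cycle.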

5. References

Aberg, Helena. Agent Roles in RoboCup Teams. Department of Computer and Systems Science, Royal Institute of Technology, 1998.

http://www.d.kth.se/~d93-hab/Exjobb/thesis98.html

Matsubara Hitoshi, Noda Itsuki, & Hiraki Kazuo. Learning of Cooperative actions in multi-agent systems: a case study of pass play in Soccer. AAAI-96 Spring Symposium on Adaptation, Coevolution and Learning in Multi-agent Systems, SS-96-01, pp. 63-67, Mar. 1996.

aaai96-sss.ps.gz

Noda Itsuki, Matsubara Hitoshi, & Hiraki Kazuo. Learning Cooperative Behaviour in Multi-Agent Environment: a case study of choice of play-plans in soccer. PRICAI'96: Topics in Artificial Intelligence (Proc. of 4th Pacific Rim International Conference on Artificial Intelligence, Cairns, Australia), pp. 570-579, Aug. 1996.

pricai96.ps.gz

Tambe Milind, Kaminka Gal A., Marsella Stacy, Muslea Ion, & Raines Taylor. Two Fielded Teams and Two Experts: A RoboCup Challenge Response from the Trenches. Information Sciences Institute and Computer Science Department, University of Southern California, 1997.

ijcai99.zip