Fast moving of a population of robots through a complex scenario

Abstract. Swarm robotics consists of using a large number of coordinated autonomous robots, or agents, to accomplish one or more tasks, using local and/or global rules. Individual and collective objectives can be designed for each robot of the swarm. Generally, the agents' interactions exhibit a high degree of complexity that makes it impossible to neglect nonlinearities in the model. In this paper, both a collective interaction, using a modified Vicsek model in which each agent follows a local group velocity, and an individual interaction, concerning internal and external obstacle avoidance, are implemented. The proposed strategies are tested on the migration of a unicycle robot swarm in an unknown environment, where the effectiveness and the migration time are analyzed. To this aim, a new optimal control method for nonlinear dynamical systems and cost functions, named Feedback Local Optimality Principle (FLOP), is applied.


Introduction
Swarm robotics aims at coordinating many robots. It is generally inspired by the observation of the natural world, such as flocks of birds, ant colonies, and schools of fish. The study of collective animal behavior is still a source of inspiration for scientists and engineers, who, by imitating biological processes, seek solutions to complex problems. Among many topics, the study and analysis of the migration and transport of swarms of robots are of interest. Through the study of stigmergy [1], it is possible to identify the interaction processes that give rise to intelligent cooperative systems, capable of performing complicated collective operations.
In nature, agents follow very simple rules, and even without the need for centralized control, a global behavior emerges, unknown to the individual agents, who can find efficient methods of transport and migration. One of the first models of efficient collective transport is the Vicsek particle model, in which each agent follows a collective group velocity [2,3]. This model is widely used to imitate the movement of shoals of fish and flocks of birds that manage to move in a coordinated way, following environmental stimuli. Based on this model, many studies have been developed concerning the coordinated collective transport of robots [3][4][5][6]. In particular, the generalization of the Vicsek model to robot movement concerns two types of models: (i) a first class does not involve anti-collision rules, allowing for collisions between robots [5], and (ii) a second class uses sophisticated sensors and communication hardware that make the swarm collision-free [6].
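For reference, one alignment step of the standard Vicsek model can be sketched as follows (a minimal illustration; parameter values and function names are illustrative, and periodic boundaries are handled only by wrapping positions):

```python
import numpy as np

def vicsek_step(pos, theta, v0=0.03, r=1.0, eta=0.1, box=10.0, rng=None):
    """One update of the standard Vicsek model: every agent adopts the
    mean heading of its neighbours within radius r, plus angular noise,
    then moves with constant speed v0."""
    rng = rng if rng is not None else np.random.default_rng()
    n = len(theta)
    new_theta = np.empty(n)
    for i in range(n):
        # neighbours within radius r (the agent itself is always included)
        d = np.linalg.norm(pos - pos[i], axis=1)
        nbr = d < r
        # circular mean of the neighbour headings
        new_theta[i] = np.arctan2(np.sin(theta[nbr]).mean(),
                                  np.cos(theta[nbr]).mean())
    # uniform angular noise of amplitude eta
    new_theta += eta * (rng.random(n) - 0.5) * 2.0 * np.pi
    pos = (pos + v0 * np.column_stack((np.cos(new_theta),
                                       np.sin(new_theta)))) % box
    return pos, new_theta
```

With zero noise, two co-located agents converge to their common mean heading in a single step, which is the alignment mechanism the robot strategies below build upon.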
In this paper, the communication scheme used is simplified with respect to a fully all-to-all connected system, balancing short-range and long-range transmission of information within the swarm. This makes it possible to equip the robots with exteroceptive sensors available on the market, so as to analyze the state of the surrounding agents and implement the actions provided by the control strategy. Furthermore, appropriate control logics introduce effective anti-collision rules between agents. For this reason, a method of analyzing information from neighbouring agents is proposed, which combines the most significant aspects of the simplified analysis of the first neighbours only and of the complete global analysis. The proposed navigation system of a swarm of robots is divided into two main categories: collective exploration and coordinated motion. Here, unicycle robots [7] move in an unknown environment and navigate without internal collisions with other agents, trying to migrate from a start to a target zone using the information provided by neighbouring agents to reduce the migration time. These different tasks are achieved by using innovative feedback controls developed by the authors, named Feedback Local Optimality Principle (FLOP) and Variational Feedback Controls (VFC) [8][9][10]. The FLOP method controls linear and nonlinear dynamical systems through the introduction of a nonlinear penalty in the cost function. This permits applying the collective exploration and the coordinated motion strategies simultaneously. The environment is made of lowlands and hills, as in the case, for example, of sand dunes. Robots are subjected to attractive and repulsive forces dependent on the terrain orography. The distributed control uses only local velocity information to drive the members of the swarm within a small region where the velocity signal is captured.
The agents follow a nonlinear control strategy in which each of them tracks a target velocity resulting from a directional averaging operation. This process mimics the behavior of ant colonies, in which information travels via pheromones, permitting the swarm to move around obstacles of various types while maintaining a high migration speed.
This paper intends to show the ability of the FLOP method to control a large population of cooperative agents to complete the exploration within an unknown scenario.
The FLOP logic has the advantage of operating in pure feedback, ensuring a local minimum result. The present method, although it does not reach the global minimum of the cost function, exhibits large computational advantages when compared to predictive control strategies. This has already been tested on complex systems such as autonomous terrestrial and marine vehicles [8,11].

Summary of FLOP theory
Feedback Local Optimality Principle, or FLOP [8,9], is based on the variational approach aimed at the minimization or maximization of a given functional $J$. The Lagrangian multiplier technique is used to include a differential constraint in the optimization process. In fact, the two pillars of the variational approach are the cost function $L(x,u)$, the base used to build the cost functional $J$, and the dynamical evolution of the system, represented by the nonlinear differential equation $\dot{x} = f(x,u)$, with $x$, $u$ the state and the control vectors, respectively. The constrained optimization is introduced through the Lagrangian multiplier $\lambda$ as follows:

$$J = \int_0^T \left[ L(x,u) + \lambda^T \left( f(x,u) - \dot{x} \right) \right] dt \qquad (1)$$

where the optimization is performed over the entire time interval $[0,T]$. The FLOP method introduces a different optimality criterion, switching from a global to a local principle. With this aim, the original functional is split into $N$ sub-integrals:

$$J = \sum_{k=1}^{N} J_k, \qquad J_k = \int_{t_k}^{t_{k+1}} \left[ L(x,u) + \lambda^T \left( f(x,u) - \dot{x} \right) \right] dt \qquad (2)$$

The general optimization problem expressed by eq. (1) requires finding the minimum cost $J^*$. FLOP, splitting the general problem into sub-intervals, finds a local minimum result $\bar{J}^*$ for which the following inequality holds:

$$\bar{J}^* \geq J^* \qquad (3)$$

Equation (1), when subjected to the local optimality criterion and discretized with the first-order Euler technique, produces a set of three equations (the discrete state update, the costate update, and the control stationarity condition), here collectively referred to as eq. (4). The continuous counterpart of equation (4) leads to an augmented form of Pontryagin's formulation, eq. (5), where no special assumption is required about the function $f(x,u)$.
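As a rough numerical illustration of the switch from a global to a local optimality criterion (this is not the authors' Lagrange-multiplier derivation, only a greedy sketch under simplifying assumptions), one can minimize the discretized cost over each sub-interval separately, yielding a pure feedback law:

```python
import numpy as np

def flop_like_step(x, f, L, u_grid, dt=0.01):
    """Greedy local step: pick the control that minimises the one-step
    cost L(x+, u) subject to the Euler-discretised dynamics
    x+ = x + f(x, u) * dt. This illustrates the local optimality idea:
    each sub-interval is optimised on its own, in feedback form."""
    costs = [L(x + f(x, u) * dt, u) for u in u_grid]
    u_star = u_grid[int(np.argmin(costs))]
    return x + f(x, u_star) * dt, u_star

# toy scalar nonlinear system and quadratic cost (illustrative choices)
f = lambda x, u: -x**3 + u
L = lambda x, u: x**2 + 0.1 * u**2
x, u_grid = 2.0, np.linspace(-5.0, 5.0, 201)
for _ in range(2000):
    x, u = flop_like_step(x, f, L, u_grid)
```

The state is driven toward the origin step by step without ever solving the global problem over $[0,T]$, which is the computational advantage claimed for the local principle.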

System dynamics and cost function
The single agent, represented on the left of Fig. 1, is intended as a unicycle model [7], and its dynamics are expressed as:

$$\dot{x} = v\cos\psi, \quad \dot{y} = v\sin\psi, \quad \dot{\psi} = \omega$$
$$m\dot{v} = F - c_v v - \nabla h(x,y) \cdot [\cos\psi, \sin\psi]^T, \quad J\dot{\omega} = M - c_\omega \omega \qquad (6)$$

with state vector $X = [x; y; \psi; v; \omega]$. The $x$, $y$, $\psi$, $v$, $\omega$, $m$, $J$, $c_v$, $c_\omega$, and $h(x,y)$ are the spatial coordinates of each agent, the heading orientation, the longitudinal speed, the rotational speed, the mass, the rotational inertia, the two speed-resistance coefficients (longitudinal and rotational), and the potential function representing the unknown environment, respectively. In this example, the robots are controlled by the thrust force $F$ and the yaw moment $M$; with the control vector $u = [F; M]$, it is simple to organize the full nonlinear dynamic system as $\dot{X} = f(X,u)$.

The cost function is expressed in terms of a state penalty $P(X) = P_{ce}(X) + P_{cm}(X)$, where the two terms represent the collective exploration and the cooperative motion tasks, respectively. The collective exploration task regards every single agent: each agent has the information to migrate from one zone to another by having an assigned target location, and must avoid all other agents through an internal avoidance rule. The coordinated motion strategy aims to improve the performance of the migration of the swarm by giving each agent some information about the velocities of its surrounding agents, as illustrated later.

The collective exploration provides two effects:

Rendezvous: all agents must reach the assigned target location. This task is often also referred to here as Go to Target.

Internal Avoidance: each agent must not collide with any of the other agents; the corresponding penalty is here written as $P_{ia}(X)$. The internal avoidance penalty function is written as the sum of two terms, one for the relative positions between agents $r = s_i - s_j$ and one for their respective velocities $\dot{r} = \dot{s}_i - \dot{s}_j$, with $s_i = [x_i, y_i]$ the $i$-th agent coordinates. The first addend of $P_{ia}$ is a Gaussian function, and its gradient is depicted on the right of Fig. 1, where the repulsive elastic force between agents is represented.

The Gaussian parameters $\Sigma$, $|\Sigma|$, $k$, i.e., the variance-covariance matrix, its determinant, and a gain factor, respectively, are tuned so that the maximum of the penalty is high enough to avoid any kind of crash between agents. The second addend is a quadratic potential function of the relative speed $\dot{r}$.
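A minimal sketch of this two-term penalty follows (parameter names `k`, `c` and the exact gating of the velocity term are illustrative assumptions; the paper's own gains and switching function may differ):

```python
import numpy as np

def internal_avoidance(r, r_dot, Sigma=np.eye(2), k=1.0, c=0.5):
    """Sketch of the two-term internal avoidance penalty: a Gaussian of
    the relative position r (repulsive term), plus a quadratic term in
    the relative velocity r_dot that is active only while the agents
    are approaching each other (r . r_dot < 0)."""
    Sinv = np.linalg.inv(Sigma)
    det = np.linalg.det(Sigma)
    # Gaussian repulsion: its gradient acts as an elastic pushing force
    gauss = k / (2.0 * np.pi * np.sqrt(det)) * np.exp(-0.5 * r @ Sinv @ r)
    # dissipation term switched on only for closing relative motion
    approaching = (r @ r_dot) < 0.0
    dissipation = c * (r_dot @ r_dot) if approaching else 0.0
    return gauss + dissipation
```

The sign test on the scalar product reproduces the activation rule described in the text: two receding agents pay only the Gaussian cost, while two approaching agents pay the additional velocity-dependent cost.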

Fig. 1. Schematic representation of the unicycle model and internal avoidance strategy
Its gradient represents a dissipation force that is activated and deactivated as a function of a switching term $\gamma$. In particular (see the example in Fig. 1), the dissipation force is turned on when two agents find themselves closer than a threshold distance and have a relative speed that identifies a collision, given by the sign of the scalar product $r \cdot \dot{r}$. Tuning the positive parameters $k_1$, $k_2$ permits obtaining a smooth slope for the penalty.

The cooperative motion introduced here provides an additional term, which expresses the ability of every single agent to move toward the nearby area with the highest average speed in the direction of the target; its cost function is written in terms of the triplet $x_d = [\theta, v, \omega]$ as:

$$P_{cm} = (x_d - x_d^*)^T Q (x_d - x_d^*) \qquad (12)$$

with $Q$ a weighting matrix, where $x_d^*$ is determined by the strategy proposed below. The $i$-th agent can observe a portion of the surrounding environment, called $S_i$. $S_i$ is assumed to be a sector of the circle of radius $R$, centered at the agent position, and delimited by the two lines associated with the angles $\theta_0$ and $\theta_1$, measured with respect to the x-axis of the $i$-th agent. $S_i$ is further divided into $N_z$ sub-sectors or zones, named $Z_k$ ($k = 1, \dots, N_z$), so that each zone is a sector of angle $\hat{\theta} = (\theta_1 - \theta_0)/N_z$. In $Z_k$, the agent searches for all the agents currently within the zone, the number of which is denoted by $n_k$. For each agent $j$ therein, its velocity component in the direction of the target, $v_{y,j}$, is observed. Here, the chosen direction of the target is along the $y$ direction. If $d_j$ is the distance of the $j$-th agent within $Z_k$ from the $i$-th observer agent, the weighting numbers are defined as:

$$w_j = R - d_j \qquad (13)$$

Fig. 2. Coordinated motion strategy for the -ℎ agent
The weighted velocity $\bar{v}_k$ is estimated by the $i$-th observer for the zone $Z_k$ as:

$$\bar{v}_k = \frac{\sum_{j=1}^{n_k} w_j \, v_{y,j}}{\sum_{j=1}^{n_k} w_j} \qquad (14)$$

and the highest value within the sector is selected as:

$$k^* = \arg\max_k \bar{v}_k \qquad (15)$$

Once the zone with the highest velocity is found, the agent steers its velocity in the direction of the bisector of the zone $Z_{k^*}$.
In Fig. 2, the coordinated motion strategy is represented for a target positioned along the vertical axis. In the case depicted in Fig. 2, the highest weighted velocity is in zone $Z_4$, so $k^* = 4$. The angle of the desired maximum velocity is $\theta^* = \theta_0 + (k^* - \tfrac{1}{2})\hat{\theta}$, so $x_d^*$ in (12) becomes:

$$x_d^* = [\theta^*; \bar{v}_{k^*}; 0] \qquad (16)$$
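The zone-selection rule of eqs. (13)-(15) can be sketched as follows (a minimal illustration assuming a uniform zone partition; function and parameter names are illustrative):

```python
import numpy as np

def best_zone_heading(rel_pos, vel_y, R=5.0, th0=0.0, th1=np.pi, n_zones=4):
    """The observation sector [th0, th1] of radius R is split into
    n_zones zones; each observed agent's velocity component toward the
    target (vel_y) is weighted by closeness w = R - d, and the bisector
    angle of the zone with the highest weighted mean velocity is
    returned (None if the sector is empty)."""
    dth = (th1 - th0) / n_zones
    best_v, best_k = -np.inf, None
    for k in range(n_zones):
        lo, hi = th0 + k * dth, th0 + (k + 1) * dth
        w_sum = v_sum = 0.0
        for p, vy in zip(rel_pos, vel_y):
            d, ang = np.hypot(*p), np.arctan2(p[1], p[0])
            if d < R and lo <= ang < hi:
                w = R - d          # closer agents weigh more, eq. (13)
                w_sum += w
                v_sum += w * vy
        if w_sum > 0.0 and v_sum / w_sum > best_v:   # eqs. (14)-(15)
            best_v, best_k = v_sum / w_sum, k
    if best_k is None:
        return None
    return th0 + (best_k + 0.5) * dth   # bisector of the winning zone
```

The returned bisector angle plays the role of $\theta^*$ above; here zones are indexed from 0, so the bisector $\theta_0 + (k + \tfrac{1}{2})\hat{\theta}$ matches the 1-indexed formula in the text.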

FLOP application for coordinated motion strategy
In this section, the benefits of the discussed strategy for the migration of a robot swarm are presented. Simulations are performed with and without the velocity-based strategy. Different simulations in the same environment (Fig. 3) are performed through the FLOP control: first, the number of robots is assigned; then, many simulations are produced by varying the initial conditions of the swarm. The mean of the arrival time in the target area of the last entering agent is recorded with and without the velocity-based strategy. The simulations are then repeated for different numbers of agents, from 1 to 40, as shown on the left of Fig. 4. Finally, the Probability Density Function (PDF) of the arrival time for N = 85 over 60 simulations is shown on the right of Fig. 4. The arrival time and the success of the strategy are strongly dependent on the number of obstacles: the collective motion strategy is not expected to outperform the individual strategy when the number of obstacles is low. In Fig. 3, some screenshots of one simulation at different time windows are shown.
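The Monte Carlo procedure described above can be sketched as follows (the simulator interface and all names are illustrative assumptions; a real run would call the FLOP-controlled swarm simulation in place of the stand-in):

```python
import numpy as np

def arrival_time_stats(simulate, n_agents, n_runs=60, seed=0):
    """Monte Carlo harness: `simulate(n_agents, rng)` is assumed to run
    one migration with random initial conditions and return the arrival
    time of the last agent entering the target area. The mean and
    standard deviation over n_runs repetitions are returned."""
    rng = np.random.default_rng(seed)
    times = np.array([simulate(n_agents, rng) for _ in range(n_runs)])
    return times.mean(), times.std()

# stand-in simulator for illustration only (arrival time grows with swarm
# size plus random scatter); the real experiment uses the FLOP controller
fake_sim = lambda n, rng: 10.0 + 0.1 * n + rng.normal(0.0, 1.0)
mean_t, std_t = arrival_time_stats(fake_sim, n_agents=20)
```

Comparing `mean_t` (and the spread of `times`) for runs with and without the velocity-based strategy reproduces the comparison shown in Fig. 4.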

Stigmergy coordinated motion
As can be seen in Fig. 4, the proposed strategy provides a remarkable decrease in the arrival time of the last agent, together with a lower variance.

Conclusions
In this paper, the application of an innovative feedback control, named Feedback Local Optimality Principle, to the coordinated motion strategy of a robot swarm is presented. The strategy is based on the Vicsek model, but it changes some paradigms to add internal avoidance and to localize the interaction between agents of the swarm. The FLOP application to the proposed strategy gives promising results in terms of the total migration time. Further developments, such as the correlation between the migration time and the number of obstacles in the environment, will be the subject of future investigations.