Feedback Local Optimality Principle applied to rocket vertical landing VTVL

Vertical landing has become popular over the last fifteen years; the technology is known by the acronym VTVL, Vertical Takeoff and Vertical Landing [1,2]. Interest in this landing technology is driven by possible cost reductions [3,4], which require the spacecraft to be reused. Rockets are generally not designed to perform landing operations; rather, their design is aimed at takeoff, guaranteeing a very high forward acceleration to gain the velocity needed to escape the gravitational force. In this paper a new control method based on the Feedback Local Optimality Principle, named FLOP, is applied to the rocket landing problem. FLOP belongs to a special class of optimal controllers, developed by the Mechatronic and Vehicle Dynamics Lab of Sapienza and named Variational Feedback Controllers (VFC), which are part of ongoing research and have recently been applied in different fields: nonlinear systems, marine and terrestrial autonomous vehicles [5-7], multi-agent interactions and vibration control [8, 9]. The paper is devoted to showing the robustness of the nonlinear controlled system, comparing its performance with that of the LQR, one of the most acknowledged methods in optimal control.


Introduction
In missions such as Apollo 11 to the Moon, landing was delegated to a dedicated lander module of the rocket body, the LEM (Lunar Excursion Module). As a new frontier of space discovery, space vehicles are today required to land reliably on different surfaces. Among the many complexities implied by vertical landing, the control strategy plays a decisive role in obtaining reliability and robustness. While take-off operations are more predictable and can be specifically designed by using suitable launch infrastructures, the landing phase is affected by higher uncertainties due to weather disturbances and ground surface imperfections. The launch umbilical tower, evacuation vanes, shockwave dissipation, vibration insulation and an accurately designed attitude during the first phases of the launch greatly facilitate the takeoff operations. The return trajectory is instead weakly stable due to the presence of random disturbances. Hence, to improve the probability of a successful landing, a feedback optimal trajectory has to be identified. The "Moon landing problem" is one of the prototype problems included in many control books and is an excellent example of a two-point boundary optimization problem, which is difficult to approach with a feedback control strategy. Moreover, vertical landing is a nonlinear problem with instabilities, analogous to the challenging control of the inverted pendulum. The Feedback Local Optimality Principle (FLOP) approach represents an interesting alternative to more classical solutions such as the LQR. The aim of this paper is to define a robust and reliable control to land the vehicle safely. The quality of the control law is investigated considering the landing approach manoeuvre, starting from an assigned altitude and varying the initial conditions, namely attitude and speed.
The control actions involve the magnitude and the direction of the thrust, and the steerable grid fins mounted on the top of the vehicle, which control the aerodynamic forces. The model of the system also includes actuator saturation effects.

FLOP: a new local optimality principle
A new control strategy based on the classical variational approach has recently been developed by the authors and named the Feedback Local Optimality Principle, or FLOP [5,8,9]. The method relies on a local optimality criterion, replacing the global one used in optimal control theory based on the Pontryagin approach. This idea makes it possible to obtain a feedback control law for nonlinear dynamic systems. In the classical variational problem, the performance index $\bar{J}$, given by the integral of the cost function $E(x,u)$ subject to the differential constraint $\dot{x} = f(x,u)$ (representing the system dynamics), has to be minimized (or maximized) over the entire time interval $[0, T]$:

$$\bar{J} = \int_0^T \left[ E(x,u) + \lambda^T \left( \dot{x} - f(x,u) \right) \right] dt, \qquad x(0) = x_0 \tag{1}$$

In (1), $x$ is the system state, $u$ the control vector and $x(0) = x_0$ the initial condition. The differential constraint is introduced through Hamilton's formulation using the Lagrange multiplier $\lambda(t)$. The solution of (1) provides both the optimal control $u^*(t)$ and the corresponding optimal trajectory $x^*(t)$.

The Feedback Local Optimality Principle, or FLOP approach, starts by splitting the original integral (1) into $N = T/\Delta\tau$ integrals, where $\Delta\tau$ is the time horizon of each of them:

$$\bar{J} = \sum_{i=1}^{N} \int_{LB_i}^{UB_i} \mathcal{L}(\dot{x}, x, u, \lambda)\, dt \tag{2}$$

where $\mathcal{L}(\dot{x}, x, u, \lambda) = E(x,u) + \lambda^T \left( \dot{x} - f(x(t), u(t)) \right)$, and $UB_i$ and $LB_i$ denote the upper and lower bounds of the $i$-th integral. The FLOP method requires a weaker minimization concept, based on the extremal value of each individual integral within its own time horizon $\Delta\tau$. Each integral solution satisfies the boundary conditions (3).

This approach provides three main advantages with respect to the classical variational approach:
• The dynamic system can be described by a nonlinear model, namely one belonging to the class of affine systems $\dot{x} = \phi(x) + Bu$.
• A more general class of nonlinear penalty functions of the state $x$, beyond the classical quadratic forms, can be included in the cost function, represented by $g(x)$.
• The FLOP approach provides a feedback solution for the control vector $u$.
This overcomes the main engineering drawback of the Pontryagin and Bellman approaches. Both provide a feed-forward control law that takes into account only the initial state of the system and does not use the information that becomes available, as time advances, from the sensor measurements of the state evolution. The FLOP approach, in general, provides a solution whose performance depends on the choice of $\Delta\tau$, which acts as a tuning parameter.
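The splitting of the time interval into sub-intervals of length $\Delta\tau$ can be sketched as follows. This is only a minimal illustration of the bookkeeping, not the authors' implementation: the function name `split_horizon` and the numerical values of `T` and `dtau` are hypothetical, and the sketch assumes that $\Delta\tau$ divides $T$ exactly.

```python
import numpy as np

def split_horizon(T, dtau):
    """Return the (lower bound, upper bound) pairs of the N = T / dtau windows."""
    n = int(round(T / dtau))
    # Assumption of this sketch: dtau divides T into a whole number of windows.
    if n < 1 or not np.isclose(n * dtau, T):
        raise ValueError("dtau must divide T into a whole number of windows")
    edges = np.linspace(0.0, T, n + 1)
    # The upper bound of each window coincides with the lower bound of the next.
    return list(zip(edges[:-1], edges[1:]))

# Hypothetical values: a 10 s manoeuvre split into windows of 0.5 s.
windows = split_horizon(T=10.0, dtau=0.5)
print(len(windows), windows[0], windows[-1])
```

A smaller `dtau` gives more, shorter windows, which is the sense in which $\Delta\tau$ acts as a tuning parameter.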

Summary of FLOP technicalities
The continuous counterpart of equations (2) and (3), as shown in [5,8,9], leads to an augmented form of the variational set of equations. In fact, solving each integral of (2) with its boundary conditions (3) is equivalent to solving the integral (1) over the entire time interval $[0, T]$, where the final constraint on $\lambda(T)$ is replaced with a first-order differential equation for $\lambda$, as in (4). Equations (4) can, in general, be solved explicitly for a penalty function $E(x,u)$ that is quadratic in the control $u$ and has any degree of nonlinearity in $x$, with $g(x)$ differentiable. For affine systems, equations (4) yield, after some manipulation, an explicit expression for the control. The cost function $\frac{1}{2} u^T R u + g(x)$ is more general than a fully quadratic cost function: it remains quadratic in the control variable, while only very weak hypotheses are required on $g(x)$, since it is sufficient that its first-order derivative exists.
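Why a cost that is quadratic in the control leads to an explicit control expression can be illustrated with a small numerical check. The sketch assumes an affine system $\dot{x} = \phi(x) + Bu$ and a Lagrangian $\mathcal{L} = \frac{1}{2} u^T R u + g(x) + \lambda^T(\dot{x} - \phi(x) - Bu)$; the symbols $R$, $B$ and the numerical values are assumptions made for illustration. It covers only the stationarity condition with respect to $u$, which gives $Ru = B^T\lambda$; it is not the complete FLOP law, which also requires the multiplier equation.

```python
import numpy as np

def stationary_control(R, B, lam):
    """Solve R u = B^T lam, i.e. the condition dL/du = 0 for the assumed Lagrangian."""
    return np.linalg.solve(R, B.T @ lam)

def control_part(u, R, B, lam):
    """Part of the assumed Lagrangian that depends on u: 0.5 u^T R u - lam^T B u."""
    return 0.5 * u @ R @ u - lam @ B @ u

# Hypothetical data: three states, two controls, positive definite R.
R = np.array([[2.0, 0.0], [0.0, 0.5]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
lam = np.array([0.3, -1.0, 0.4])

u_star = stationary_control(R, B, lam)
gradient = R @ u_star - B.T @ lam  # vanishes at the stationary point
print(u_star, gradient)
```

Because $R$ is positive definite, the stationary point is a minimum of the control-dependent part, so any perturbation of `u_star` increases `control_part`.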

Dynamic model
This section presents the rocket dynamic model depicted in Fig. 1. The dynamics of the system are described by a 6-DOF rigid-body motion with an additional equation describing the fuel mass consumption. The origin of the mobile frame is placed at the geometric center of the vehicle body, since the longitudinal position of the center of mass changes during the flight due to the mass variation of the system. As usual for aerial vehicles, the $x$ axis is aligned with the longitudinal axis, the $y$ axis lies in the wing plane, and the $z$ axis is orthogonal to the previous two (see Fig. 1).
In the equations of motion, the inertia matrix is time-varying and is accompanied by the generalized Coriolis matrix. The mass variation due to fuel consumption is described by an additional equation, in which the total thrust $T_{tot}$, the sum of the individual forces provided by the main engines, appears together with a suitable engine constant. The external forces collect the gravity action, the aerodynamic forces acting on the vehicle body, the forces of the main thrusters, the actions of the cold-gas thrusters and the forces generated by the trimmable grid fins. In the aerodynamic coefficients, the dependences on the two rate terms are neglected. These coefficients permit the evaluation of the aerodynamic forces due to the airflow around the rocket body, using the squared speed moduli in two coordinate planes of the body frame, each given by the sum of the squares of the two velocity components lying in that plane.
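The mass variation described above can be sketched numerically. The form used here, $\dot{m} = -T_{tot}/k$ with $k = I_{sp}\, g_0$, is an assumption: it is the standard rocket relation and only one possible reading of the "suitable engine constant" mentioned in the text. All function names and numerical values are hypothetical.

```python
G0 = 9.80665  # standard gravity in m/s^2

def engine_constant(isp_seconds):
    """Hypothetical choice of the engine constant: k = Isp * g0, in m/s."""
    return isp_seconds * G0

def mass_rate(thrusts, k):
    """Fuel mass rate in kg/s for a list of engine thrusts in newtons."""
    return -sum(thrusts) / k

def burn(mass0, thrusts, k, dt, steps):
    """Integrate the mass with explicit Euler steps of length dt at constant thrust."""
    mass = mass0
    for _ in range(steps):
        mass += mass_rate(thrusts, k) * dt
    return mass

# Hypothetical example: three engines at 400 kN each, Isp of 300 s, 10 s burn at 100 Hz.
k = engine_constant(300.0)
final_mass = burn(mass0=30000.0, thrusts=[400e3, 400e3, 400e3], k=k, dt=0.01, steps=1000)
print(final_mass)
```

The decreasing mass is what makes the inertia matrix time-varying and moves the center of mass along the longitudinal axis.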
When the vehicle approaches the atmosphere during the descent phase, the cold-gas thrusters do not have enough power to control the vehicle attitude. Hence, the grid-fin actions become predominant, and suitable variations of the fin angles of attack stabilize the vehicle's flight.
The trimmable fins generate both forces and torques. For each fin, these depend on the wing section area; on the drag coefficient, which is a function of the fin angle of attack and of a further angle that depends on the considered fin; on the component of the velocity along the axis; on the shadowing coefficient, which varies between 0 and 1 depending on the fin configuration; and on the position vector of the fin.
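The ingredients listed for the fin actions can be combined in a short sketch. The expressions used are assumptions, since the original formulas are not available here: the force follows the standard drag equation $\frac{1}{2}\rho S C_d v|v|$, the shadowing coefficient is treated as a multiplicative factor (1 meaning fully exposed), the force is taken along a given axis opposing the velocity component, and the torque is the cross product of the fin position vector with the force. All names and numbers are hypothetical.

```python
import numpy as np

def fin_force(rho, area, cd, v_axis, shadowing, axis):
    """Drag-like force of one fin, opposing the velocity component v_axis."""
    magnitude = 0.5 * rho * area * cd * v_axis * abs(v_axis) * shadowing
    return -magnitude * np.asarray(axis, dtype=float)

def fin_torque(position, force):
    """Torque about the frame origin produced by a force applied at the fin position."""
    return np.cross(position, force)

# Hypothetical fin: 2 m^2 section, Cd = 0.5, descending at 100 m/s along -x.
force = fin_force(rho=1.2, area=2.0, cd=0.5, v_axis=-100.0,
                  shadowing=1.0, axis=[1.0, 0.0, 0.0])
torque = fin_torque(position=np.array([20.0, 1.5, 0.0]), force=force)
print(force, torque)
```

In this sketch the drag coefficient is passed as a number; in the model it would be evaluated from the fin angles at each control step.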

Control
The simulations were carried out with a control loop frequency of 100 Hz. The control action is obtained by introducing a specific penalty function for each phase of the flight, namely a quadratic penalty function of the state and of the target. The weighting matrix is suitably varied during the flight. The vehicle flight is composed of three main phases, as shown in Fig. 2. The first is the attitude correction in LEO: the vehicle performs the FLIP manoeuvre to reach the desired pitch. The state target refers to a specific attitude and null angular rates.
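A 100 Hz loop with a phase-dependent quadratic penalty and actuator saturation can be sketched as follows. Everything beyond the loop frequency and the 180° to 80° pitch change quoted in the paper is an assumption: the pitch dynamics are reduced to a double integrator, the weights and the actuator limit are hypothetical, and the feedback is a plain proportional-derivative placeholder, not the FLOP law.

```python
import numpy as np

DT = 1.0 / 100.0   # control period of the 100 Hz loop
U_MAX = 0.5        # hypothetical actuator limit, rad/s^2

# Hypothetical per-phase weights on [pitch error, pitch-rate error].
PHASE_WEIGHTS = {
    "flip": np.diag([10.0, 1.0]),
    "descent": np.diag([50.0, 5.0]),
}

def quadratic_penalty(x, x_target, Q):
    """Quadratic form (x - x_target)^T Q (x - x_target)."""
    e = np.asarray(x, dtype=float) - np.asarray(x_target, dtype=float)
    return float(e @ Q @ e)

def saturate(u, limit):
    """Clip the command to the symmetric actuator range [-limit, limit]."""
    return float(np.clip(u, -limit, limit))

def simulate_flip(x0, x_target, gains, seconds):
    """Explicit Euler simulation of a double integrator with saturated feedback."""
    x = np.array(x0, dtype=float)
    commands = []
    for _ in range(int(round(seconds / DT))):
        u = saturate(-gains @ (x - x_target), U_MAX)  # placeholder feedback
        commands.append(u)
        x = x + DT * np.array([x[1], u])
    return x, np.array(commands)

x_target = np.array([np.deg2rad(80.0), 0.0])
x_final, commands = simulate_flip([np.deg2rad(180.0), 0.0], x_target,
                                  np.array([1.0, 2.0]), seconds=30.0)
print(np.rad2deg(x_final[0]), quadratic_penalty(x_final, x_target, PHASE_WEIGHTS["flip"]))
```

Switching the entry of `PHASE_WEIGHTS` between phases mirrors the idea of varying the weighting matrix during the flight.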

Results
The simulations consider a rocket with the following characteristics: . The lateral thrusters provide the force required for the rotation that sets the vehicle at the desired pitch. Fig. 3 shows the time evolution of the pitch, its rate and the thrust provided by the lateral thruster responsible for the pitch control. In the second phase, the rocket is still in LEO, flying 100 km above the Earth's surface and travelling at 10000 km/h, with an initial pitch of 180°. The pitch value required to approach the atmosphere safely is 80°. Moreover, the vehicle approaches the atmosphere while reducing the effect of gravity by using the main engines.

Conclusions
The FLOP control shows good performance in all the phases that characterize the vehicle flight and landing, which together form a complex compound control operation. These results are possible thanks to the FLOP formulation, which makes it possible to take into account the nonlinearities typical of the rocket model. Further tests will be performed, introducing random disturbances together with a state estimation algorithm, in order to validate the good performance shown in the present paper.