A Data-driven Model of the COVID-19 Spread among Interconnected Populations: Epidemiological and Mobility Aspects Following the Lock-down in Italy

An epidemic multi-group model formed by interconnected SEIR-like structures is formulated and used for data ﬁtting to gain insight into the COVID-19 dynamics and into the role of non-pharmaceutical control actions implemented to limit the infection spread since its outbreak in Italy. The single submodels provide a rather accurate description of the COVID-19 evolution in each subpopulation by an extended SEIR model including the class of asymptomatic infectives, which is recognized as a determinant for disease diﬀusion. The multi-group structure is speciﬁcally designed to investigate the eﬀects of the interregional mobility restored at the end of the ﬁrst strong lock-down in Italy (June 3, 2020). In its time-invariant version, the model is shown to enjoy some analytical stability properties which provide signiﬁcant insights on the eﬃcacy of the implemented control measurements. In order to highlight the impact of human mobility on the disease evolution in Italy between the ﬁrst and second wave onset, the model is applied to ﬁt real epidemiological data of three geographical macro-areas in the period March-October 2020, including the mass departure for summer holidays. The simulation results are in good agreement with the data, so that the model can represent a useful tool for predicting the eﬀects of the combination of containment measures in triggering future pandemic scenarios. Particularly, the simulation shows that, although the unrestricted mobility alone appears to be insuﬃcient to trigger the second wave, the human transfers were crucial to make uniform the spatial distribution of the infection throughout the country and, combined with the restart of (production, trade and education) activities, determined a time advance of the contagion increase (autumn 2020).


Introduction
The "coronavirus disease 2019" (COVID- 19), caused by SARS-CoV-2, posed novel challenges to all world countries, often evidencing their vulnerability in the practical management of emergency states, particularly concerning the health effects, but also the implications for economic growth and social development. Understanding and quantifying the dominant variables that govern the current pandemic evolution, especially in order to limit the impact of future outbreaks, imposes the need of framing the determinants of the epidemic dynamics [43]. Many literature studies take into account several types of possible interventions, mainly identifying two domains of variables: pathogen-associated variables and society-based variables. This latter variable domain appears to be particularly relevant for COVID-19, since individuals are actually the vectors of SARS-CoV-2 virus.
Thus, although the availability of vaccines, at present social distancing and personal protective measures are still important mechanisms for controlling the COVID-19 spread. The effectiveness of social distancing is studied by many authors, whose papers propose projections where the impact of containment measures in reducing the infection spread is shown [15,7,39,25,10].
In the early stage of pandemic, only qualitative data analysis was performed, as in [28] where data regarding how the human mobility changed in the United States at the beginning of the pandemic course are studied; in that paper, it is also stressed the importance of quantifying the social distancing practices, emphasizing the opportunity of determining relationships between confirmed cases and the social distancing plateau. The concept of "social distancing" can also be associated to travel restrictions, that is to the attempt of limiting the virus transmission by reducing the amount of travels among regions. One of the first study about the effects of human mobility on COVID-19 spread is proposed in [26]. Through the analysis of time mobility data from Wuhan, the study shows the importance of applying travel restrictions in the early stage of the outbreak, evidencing their lack of effectiveness in late stages.
Among the growing number of literature papers on the topic, we focus on the quantitative studies concerning the effects of limiting social distancing and human mobility. These studies are generally based on mathematical modelling, still exploiting rather different approaches, and most often they use public or volunteered datasets to assess the impact of different nonpharmaceutical countermeasures, [11,12,21,34,35]. As an example, we mention the works of correlational analysis based on (generalized) regression models of cityclusters; in particular, applications are mainly proposed for single countries, e.g. China [47], USA countries [1], and some European countries (namely France, Spain, and Italy) [23], or for cross-city analysis in many worldwide countries [42]. In [6], a framework that employs an epidemic Renormalisation Group (eRG) approach to model the effects of inter and extra European border control and of social distancing for single countries is proposed; the model describes the time-evolution of the infected cases in a specific isolated region, while including the interactions among multiple regions of the World.
In many recent works, both deterministic/stochastic and discrete/continuous models have been applied for the description, forecast and control of the COVID-19 epidemic spread. In the framework of compartmental models, the classes of Susceptible (S), Exposed (E), Infected (I) and Removed (R) subjects are generally introduced, yielding SEIR models. For the COVID-19 pandemic, because of the specificity of the disease, other categories are generally introduced, referring to the condition of infected patients, such as asymptomatic, or hospitalised, or in the quarantine condition, [15,16,20,22,36]. A particularly critical epidemiological charac-teristic has been recognized in the fraction of infectious cases remaining undocumented owing to mild or very limited symptoms; indeed, estimating the extent of undiagnosed infections is crucial for evaluating the overall prevalence and contagiousness and then the pandemic potential of the disease, [16,29].
Some variants of the original SIR/SEIR framework, modeling peculiar aspects of epidemic transmission dynamics, have been proposed in the past to generate insights into the evolution of specific infectious diseases and assess the potential impact of different intervention strategies; so, these studies can usefully support the research on the present COVID-19 pandemic. Among numerous modeling works, we mention the book [8] and the paper [41], presenting in-depth overviews of theoretical and applicative results on measles, ebola, and other viral infections. Examples of more specific works are [38,14] on the measles disease, and [32,33,13] on HIV/AIDS. Also, it is worth mentioning that very recent works deal with the impact of the COVID-19 co-infection in patient with pre-existent morbidity [10]; for example, [3] reports a statistical populationbased study to estimate risk and possible outcome of the association between HIV infection and COVID- 19.
The epidemic models of the kind described concern the transmission of the virus within population. Concerning aspects of the virus transmission between populations, multi-group epidemic models, which are suitable extensions of SIR/SEIR frameworks, can be used to represent the COVID-19 spread among different (heterogeneous) populations, and to study the effect of interactions and travel restrictions on the pandemic evolution, [19,9,30,37,2,31,5]. The heterogeneity of the subpopulations is intended with respect to the epidemic properties and can depend on their different geographical allocation or other structural variables (e.g. economy, age, mobility).
The model presented in this paper is based on a previous epidemic model reported in [16], which represented the COVID-19 evolution by a SEIR-type model including two subpopulations of infected subjects, undiagnosed and diagnosed, and explicitly accounting for a fraction of asymptomatic infective subjects. The present model structure, depicted in Section 2, incorporates N interconnected epidemic models of that kind, particularly with the aim of representing the effects of individual interactions and geographical exchanges among groups. In Section 3 analytical results on the dynamics of both for the isolated subsystem and for the interconnected model are given. In Section 4 the general model is specialized for N = 3 and it is applied to simulate the disease evolution in Italy, and precisely in three macro-regions, in order to evidence some inter-esting aspects related to the increased human mobility following the first pandemic wave. The numerical results show that the model is apt in describing the summer period 2020 of the COVID-19 epidemic in Italy, by explicitly accounting for different scenarios characterizing geographically distinct macro-regions of the Italian territory (northern, central and southern).
2 A multi-group epidemic model for the spread of COVID-19 among N groups The model proposed here has a multi-group structure that incorporates different subunits, each one describing the dynamic evolution of the COVID-19 within a homogeneous population, whose epidemic evolution differs from that of all other units. For instance, different groups can represent different geographical areas or structurally different populations. The N groups (namely subunits) are interconnected by a mobility network that accounts for the transfers of individuals who are allowed to travel from a group to another. Most typically, the model can describe a geographical system composed by N regions with people of each region moving for work, study or simply personal/holiday reasons.
Each of the N submodels is a simplified version of the model previously proposed in [16] for the description of the first phase of the epidemic spread in our country (thereby modeled as a whole homogeneouslymixing group). In addition to a compartment of exposed individuals, which is proper of SEIR models, our model explicitly distinguishes between diagnosed and undiagnosed infective patients. As shown in [16], the proposed structure appears appropriate to mimic the Italian case, by incorporating also control actions reproducing government restrictions and emergency actions implemented to detect the infected cases, especially asymptomatic or mildly symptomatic ones. The block diagram reported in Fig. 1 shows the general structure and the state variables of one regional submodel. Precisely, each submodel takes into account the following five state variables:  complications). It is assumed that they cannot transmit the virus since they are isolated (at home or at hospital); R i (t), number of healed patients (spontaneously or after therapy).
The complete model formulation includes N structurally identical groups or subsystems, with each group i, {i = 1, 2, . . . , N }, described by a SEIR-type model with undiagnosed and diagnosed infected subjects, as reported above. For sake of compactness, when the whole system is considered, a vectorial notation is introduced for the state space variables, defining the vector and, with the same procedure, E(t), I u (t), I d (t) and R(t). The N subsystems are connected by means of a mobility network allowing people to move among groups. In the following, we refer to a specific epidemic group by its identifier i, also using the same subscript to denote the related state variables and parameters. In general, however, we assume that the only individuals allowed to move are the ones having no evidence and/or diagnosis of infection, i.e. the ones belonging to compartments S i , E i , I ui , R i , i = 1, 2, . . . , N . Fig. 2 shows how the mobility network works for N = 3 subsystems, also illustrating the epidemic core model of each subsystem. So, the spread of COVID-19 among N epidemiologically distinct groups can be formally described by means of N systems of time-varying ODE models of the following kind: Fig. 2 Block diagram of a mobility scheme among N = 3 interconnected groups of epidemic diffusion. Controls for group i: u i ∈ [0, 1], social contact limitations; v i ∈ [0, 1], test campaign intensity; w i ∈ [0, 1], efficacy of therapies against COVID-19 complications; f i ≥ 0, efficacy of therapies against COVID-19; z i,j ∈ [0, 1], limitation of movements between groups i and j. Coefficients c i,j : transition probability of a subject from compartment i to compartment j.
where i = 1, 2, . . . , N . Now, we briefly explain the meaning of all the quantities included in the ODE system (2)- (6). B i is the net input rate in compartment S i , which accounts for both the newborn (susceptible) individuals and the balance between immigration and emigration; µ i is the per capita death rate owing to causes not related to the infection (natural death of the population) and it represents the loss rate from any compartment of the model except for I di ; µ I d i is the per capita death rate of diagnosed patients I di ; β i is the relative contagiousness of individuals in compartment I ui and it accounts for two main factors, which are the contagion probability from one infected-susceptible contact (related to the aggressiveness of the virus) and the frequency of contacts; φ i represents the fraction of the infective population I ui that will show recognisable symptoms and that will consequently be diagnosed and isolated (possibly receiving therapies); k i describes the transition from E i to I ui , and it is set to k i = 1/τ i , where τ i is the mean length of the incubation period (see Fig. 3); h i refers to the transition from I ui to I di , taken as h i = 1/τ si , where τ si is the average time from infection until the occurrence of the first recognisable symptoms; γ Iu i models the outflow from the infective compartment I ui associated to recovery from infection and, then, it is assumed γ Iu i = 1/τ ri , with τ ri the mean recovery period without any medical assistance; similarly γ I d i models the outflow from the infective compartments I di due to recovery from the infection and, then, it is γ I d i = 1/τ ri , withτ ri denoting the mean recovery period of monitored patients; c ij is a weight accounting for the transition probability of a subject moving from the i -th compartment S i or E i or I ui to the corresponding j -th compartment. Note that the coefficients c ij can also be time-varying in order to represent the variation of transfer probabilities possibly occurring over time. This variability is especially required for long-term analysis when "ordinary" mobility regimens alternate with highly "intense" transfer periods, like summer or Christmas or Easter seasons. Note also that, for the sake of generality, the recovery rates from compartments I ui and I di are taken into account by the separate rate constants γ Iu i and γ I d i respectively, even though to a first approximation, and in the absence of experimental evidences, they are assumed equal in the following simulation. As far as the control actions are concerned, the timevarying quantities u i (t), v i (t), w i (t), z i,j (t), taking values in [0, 1], and f i (t) ≥ 0 are introduced to represent the intervention measures adopted by the authorities to contain the disease outbreak. More precisely, u i (t) quantifies possible actions locally implemented by authorities to reduce the contact rate, and then the relative infectivity β i , of population i. It accounts for all the government decrees introduced to limit the physical interactions among people, but also for the informative campaign about hygienic measures, TV/radio announcements, and so on. The quantity v i (t) represents the intensity of the swab test campaign performed on subpopulation i, which changes daily depending on the number of swab tests actually performed. For the sake of simplicity, and in the absence of other indications, we assume that the amount of performed tests is equally distributed among people of compartments S i , E i and I ui , so that the same per capita test rate can be assumed for all these compartments. This implies that the exit fluxes of tested (positive) individuals leaving E i and I ui are proportional to the number of individuals within the same compartment, i.e. v i (t)E i (t) and v i (t)I ui (t), respectively. We notice that the flux v i (t)S i (t) of (negative) test results exiting S i does not explicitly appear in the model equation (see also Fig. 1) since it does not contribute to the dynamical evolution (actually an identical flux amount comes back to compartment S i ). The control actions w i (t) and f i (t) refer to the efficacy of the therapies adopted by the i−th health system, either to reduce side effects of COVID-19 and, respectively, to cure the infection. Furthermore, the time-varying controls z i,j (t), i, j = 1, 2, . . . , N , represent the interventions and mobility restrictions implemented by the central government or local authorities to limit people transfers between groups i and j.
Finally, the pair of input fluxes Λ Ei (t) and Λ Iu i (t) is introduced (see Eqs. (3), (4)) to model the cumulative entry of infected people coming from outer groups/areas whose epidemic dynamics is not incorporated in the N group model. Since documented infective people are not allowed to travel, it is reasonable to account for such cumulative outer inputs only in the equations of E i and I ui , i = 1, . . . , N .
We notice that the proposed model does not incorporate the possibility of re-infection. Indeed in our model, once recovered, a patient (R) is no longer susceptible of infection and cannot re-enter the S class. This simplifying hypothesis, which is actually the object of clinical studies and debates on the persistence and actual length of the immunity period, can be a valid assumption if a short-term analysis, like the one presented in the next section, is performed.
As a remark on the asymptomatic undiagnosed subjects, we observe that a susceptible subject (of any group i) can become infected if a non-safe contact with an infected undiagnosed subject occurs. During the infectious period that follows incubation, the newly infected subject can at some time develop recognisable symptoms being easily diagnosed, and possibly recovering after medication and assistance without serious consequences. However, in a number of cases, the infectious individual can be asymptomatic or mildly symptomatic until full recovery, remaining hidden and undocumented as a positive case in I ui . So, in fact, the class of asymptomatic undiagnosed subjects represents the most dangerous class responsible for the possible epidemic spread since the individuals are allowed to move, thus transmitting the contagion and increasing the number of infections also to other groups. Exhaus-tive swab test campaigns performed on the entire population can improve the capability of diagnosis contributing to mitigate the infection diffusion as reported in [27].

Equilibria and stability analysis
The first step of the analysis, in Subsection 3.1, takes into account each i-th subsystem as an isolated dynamical model, neglecting all the mobility contributions, i.e. setting z i,j = 1, i, j = 1, . . . , N . In order to take into account the actual situation in which control actions are always present, the other controls are not set to zero but they are considered as constants. So, the following setting is introduced in all the analysis The resulting study can be also associated to the conditions over a limited time interval in which the controls do not vary sensibly. In addition, as usual for equilibria and stability analysis, the external inputs are set to zero, so that Λ the system addressed in the present section iṡ 3.1 Analysis of the i-th isolated subsystem Equilibrium points for system (9), and their stability analysis are here addressed from the point of view of each i-th submodel, assuming the absence of incoming and outgoing fluxes (z i,j = 1, i, j = 1, . . . , N ). Under these positions, the equilibrium points can be computed solving the nonlinear system is obtained and can be used in (11), giving from which the two solutions and can be computed. Denoting by the superscript 1 the equilibrium point associated to (17) and by 2 the one associated to (18), using (17) in (10)- (14), the solution is obtained. Equilibrium points like (19), characterised by the absence of any kind of infected people, are usually addressed as epidemic free.
The second equilibrium can be obtained starting from (18); setting S e2 i = S e i , the other components of can be computed as follows An important note is that equilibrium (20) is admissible if and only if all its components are non negative. Since for B i − µ i S e2 i = 0 the equilibrium (20) coincides with the epidemic free solution (19), the admissibility of an independent solution is assured only for In this case the equilibrium defines a condition in which a certain number of infected individuals is always present and then the epidemic is active. This kind of equilibrium define the so called endemic conditions. It can be noted that in the epidemic free condition, the equilibrium point is not dependent on the control actions, since no epidemic is present. On the contrary, the endemic equilibrium point is strongly dependent on the controls, if present.
Once the equilibrium points are known, it is possible to study their stability characteristics. Local conditions can be easily given on the basis of the linear approximation in a neighbourhood of each of them.
The first step is the computation of the Jacobian matrix for the system (9) in the isolated conditions considered above, without incoming and outgoing people fluxes. The result is a matrix of the form with Being the eigenvalues of J 2,2 always real negative ( , stability of local approximations depends on the different forms assumed by J 1,1 once computed in P e1 i and P e2 i . Setting, for sake of compactness, the matrix obtained for P e1 i can be written as along with From the triangular structure of the matrix and being −µ i < 0, stability is assured once the matrix has eigenvalues, i.e. the roots of the characteristic polynomial equation with negative real part. Thanks to the positiveness of the model parameters and making use of the Descartes' rule of signs, the required condition holds if and only if Easy computations, along with positions in (26), lead to When the stability of the second equilibrium point P e2 i is investigated, the same procedure as above brings to the computation of the matrix Its eigenvalues are given by the roots of the characteristic polynomial For the Routh-Hurwitz stability criterion, necessary and sufficient conditions to be fulfilled are It is possible to verify that condition (33) is always satisfied since, after some manipulations, the equivalence with can be proved. Condition (34) is equivalent to A preliminary observation is that condition (36) is the same as condition (21) and then when the endemic equilibrium exists, it is also locally asymptotically stable. At the same time, being (30) not satisfied, the epidemic free equilibrium is unstable. The opposite situation arises when (36), and then (21), are not satisfied: the only admissible equilibrium point is the epidemic free one and it is locally asymptotically stable.
For the analysis of the epidemic dynamics, it can be useful to derive the model-based expression of the reproduction number R, which gives the average number of secondary cases produced by a single infected patient in an entire susceptible population [41]. This parameter can be approximatively computed on the basis of the time history of the number of infected individuals to characterise the speed of the epidemic spread [4,45,46]: if R < 1 the epidemic does not spread, reducing autonomously until vanishing; R > 1 denotes a spreading epidemic with the number of new infected cases growing with a rate related to the magnitude of R. An analytical expression, in terms of a model parameters, can be also provided by means of the next generation matrix [41]. This procedure is here followed to derive the reproduction number R i for the isolated i-th sub group considered in this Subsection, showing the fulfilment of the above mentioned relationships w.r.t. the stability properties just provided.
According to [41], from system (9) only the equations regarding the infected part of the population E i , I ui and I di are considered; the same assumptions about individual fluxes and controls introduced for the equilibrium and stability analysis are here performed. Denoting by Z i (t) = (E i (t) I ui (t) I di (t)) T , the following subsystem is defined: , the reproduction number R i is given by the eigenvalue with the maximum modulus of the next gen- The computation yields in which the greatest eigenvalue can be directly found because of the particular structure. The result is that the basic reproduction number is expressed as According to its definition, if R i > 1 the epidemic spreads within the i-th population. This corresponds to that is the same condition as (36) providing the stability of the endemic equilibrium (as well as the instability of the disease free equilibrium). On the other hand, when R i < 1 the epidemics reduces and tends to vanish. From (41), this latter condition can be expressed as that guarantees the stability of the disease free equilibrium too, as required by (30).

The whole multi-group model
If all the N groups are considered together instead of each single one separately as in the previous Subsection 3.1, making reference to system (9), the equilibrium points can be computed solving the N systems (i = 1, . . . , N ) besides E e i = 0. Being all the coefficients non negative, and not all equal to zero from the connected hypothesis on the network, the only admissible solution is The connection property assures that this result can be propagated through the network, from subsystem i to subsystems j and so on, giving I e uj = 0, ∀j ∈ [1, N ].
Concerning the equilibrium points computation, from (46), for i = 1, . . . , N , it is possible to write the compact expression once the following matrices are introduced They are all non-negative diagonal matrices except C, which is a positive definite and then non singular, matrix: due to its structure, it is strictly diagonally dominant (w.r.t. its columns). In fact, from (8) andc i,j ≥ 0, is obtained. Computing the sum of (44) and (45), for each i = 1, 2, . . . , N , and collecting them in a vector equation, the resulting expression is given by where from which we get From Eqs. (45) we get Then, exploiting solutions (56) and (53) and introducing the matrix we get the equation Eq. (60) has one solution given by and then the epidemic free equilibrium Additional equilibrium points can be found computing the non-zero solutions I e u of (60) (for which matrix (58) is non singular) and obtaining all the other components by substituting I e u in Eqs. (53), (47), (48), (56). For such solutions, a closed expression is not easy to be found. A numerical approach can be fruitful for each specific case.
The availability of the expression (66) for P e1 allows to study its stability characteristics in analytic closed form. Then, the computation of the Jacobian is performed, getting the block-wise form where and Thanks to the block structure and the characteristics of C and (Γ +∆), the local stability of P e1 is proved once matrix has all its eigenvalues with negative real part. Some considerations about the position of the eigenvalues in the complex plane can be performed making use of the Gershgorin circle theorem on matrix eigenvalues 1 .
In fact, according to the column formulation, it is possible to write where and where Recalling the domains of definition of each parameter, it is possible to give the expressions of the upper bound for the real parts of the eigenvalues. Being the centres real, the minimum ℜ(z) min and the maximum ℜ(z) M ax real parts of possible values of the eigenvalues are the 1 The eigenvalues of a n × n matrix A with entries a i,j belong to at least one of the n discs D i , i = 1, . . . , n, where D i = {z ∈ C : |z − a i,i | ≤ n j=1,j =i |a i,j |} for a row formulation, or D i = {z ∈ C : |z − a i,i | ≤ n j=1,j =i |a j,i |} for a column based version.
intersections of the circles with the real axis. For the first N circles D i one has for z i ∈ B i , i = 1, . . . , N . It is possible to conclude that, being the N circles D i all contained in the real negative part of the complex plane, the epidemic free equilibrium point is locally asymptotically stable if all the ℜ(z i ) M ax of (76) are negative, which is true if, ∀i ∈ [1, N ], the following conditions hold Like for each isolated subsystems, it is possible to introduce the reproduction number for the entire network making use of the next generation matrix. The same procedure adopted for the single case is here performed.
According to [41], the same restriction of the full dynamics (9), considered in Subsection 3.1, is taken. It iṡ and with and Then, the derivative of F and of V with respect to the vector Z must be computed and evaluated in the disease free equilibrium: The reproduction number R is the eigenvalue with the maximum modulus of the maximum eigenvalue of F ′ (V ′ ) −1 can be found as the maximum eigenvalue of the matrix The analytical computation is not easy and it is preferable to compute R numerically for each specific case.
However, for the special case of N isolated groups, for which C is diagonal, it is possible to simplify Υ so that yielding Eq. (89) expresses the reproduction number of the whole system as the greatest one among all the subgroups of the isolated case. Therefore, the general stability condition (77), in the absence of mobility, implies R < 1 which is the well known condition guaranteeing the local stability of the disease free.
4 Numerical analysis of the mobility in Italy from March 9 to October 21, 2020 In this section, the proposed model (2)-(6) is particularised to analyse the COVID-19 evolution in Italy, considering N = 3 groups (geographical areas) and using real demographic and epidemic data for parameter estimation. Choosing N = 3 coupled to a summer observation period is motivated to highlight the influence of human movements on the epidemic spread. In the period of interest, which followed the first epidemic wave, a considerable number of movements could be counted across Italy, with non uniform flow, but with predominant flow orientation from North towards South. We finally note that, in view of the model modularity, the group number N could be differently chosen depending on the level of detail required by the analysis. For instance, with the aim of analysing the epidemic situation in each one of the 20 Italian regions and focusing on the peculiar policies adopted by the local authorities in each area, we could set N = 20 specializing the parameters of the submodels on the basis of the local epidemic and social characteristics.
Numerical simulations are performed in order to assess the impact of human mobility and of people transfer on the diffusion of COVID-19 considering three macroareas of the Italian territory. Each macro-area gathers different Italian regions, as reported by Table 1. Our analysis starts on March 9th 2020, beginning date of the hard lock-down implemented by the Italian government, and covers the summer period with the aim of evaluating the effects of restoring the mobility among regions, evidencing the role played by the holiday exodus and by the reopening of many activities in the late summer in triggering the second pandemic wave. The time interval selected for parameter fitting from epidemiological data allows to simplify the identification procedure, treating separately the three chosen macro-areas. Indeed, since in the chosen period the mobility was basically forbidden (except for work or necessity reasons) across Italy, we can perform the numerical procedure for the separate identification of the following parameter sets: Preliminarily to parameter fitting, the parameters B i , µ i , i = 1, 2, 3, have been evaluated on the basis of the demographic data provided by ISTAT [24]. The per capita death rates µ i , i = 1, 2, 3, have been assumed equal to each other and computed making reference to the mean value, over the period 2011-2018, of the ratio µ between the number of deaths in Italy in a year and the number of Italians at the end of the same year, i.e. µ 1 = µ 2 = µ 3 = µ = 2.81 · 10 −5 days −1 . Since the Italian population can be considered constant over the relatively "short" period of interest, we assume that birth (plus the net balance between immigration and emigration) approximately compensates for death, and that the average per capita rate of birth and death are comparable, and we compute the net input rate B i as the product between µ and the number of individuals P i in the i-th group at a given time (1-1-2020), i.e. B i ≈ P i · µ, i = 1, 2, 3. Then, according to the sizes (number of persons) of the three populations reported by ISTAT on the 1st of January 2020, P 1 ≈ 27.746·10 6 , P 2 ≈ 12.016·10 6 , P 3 ≈ 20.597·10 6 , we get B 1 = 779.67, B 2 = 337.65, B 3 = 578.79 (persons · day −1 ).
The parameters k i , h i , i = 1, 2, 3, have been fixed on the basis of time constants related to disease progression provided by the World Health Organization and confirmed by the scientific literature [44,40]. In particular, as previously done in [16], we take k i = 1/τ i = 1/6 days −1 and h i = 1/τ si = 1/5 days −1 , i = 1, 2, 3, assuming that an exposed individual becomes infective after nearly 3-7 days and that about 5 days are required for the appearance of the first symptoms after the end of the incubation period.
Concerning the relative infectivity of the three populations, the fractions of symptomatic infective people within I ui , i = 1, 2, 3, and the per capita rates of death and recovery (for both I ui and I di ), i.e. the parameters β i , φ i , µ I d i , γ Iu i , γ I d i , i = 1, 2, 3, have been estimated by means of a ordinary least square fitting procedure. For the sake of simplicity, we assumed γ Iu i = γ I d i for any i. The epidemiological data exploited for the fitting are: i) daily number of diagnosed individuals that are currently positive, ii) total number of recoveries among all diagnosed positives, iii) total number of notified deaths. For each group i, such data are respectively reproduced by computing the quantities: a) where δ 0 is March 9, beginning date of the national lock-down in Italy, while the j-th notification day δ j can run until June 3, which is the initial date of restored mobility among regions (about one month after the end of the "strong" lockdown on May 4).
Suitable time-varying controls, accounting for changes in the social behaviour (owing to government restrictions and to increased health risk knowledge), in the swab testing modalities, and in the efficiency of the health system have been exploited for the identification of the parameter sets {β i ,φ i ,µ I d i ,γ Iu i ,γ I d i } from data. More in details, the controls u i (t), i = 1, 2, 3, are assumed rapidly increasing as of March 9 as a consequence of the government lock-down decree [16], and reaching the saturation values 0.96, for u 1 , and 0.92, for u 2 , u 3 , about two weeks after the mentioned government act. Such high saturation values are maintained by all the controls u i (t), i = 1, 2, 3, for the entire identification period, so as to represent the achievement of high awareness levels among the Italians at the reopening. With regard to v i (t), i = 1, 2, 3, these controls are directly inferred from the data. In particular, denoting by ρ i (δ j ) the swab test ratio of group i measured at day δ j , it is and, for any t ∈ [δ j , δ j+1 ), we take v i (t) as the linear interpolation between M 3 (ρ i (δ j )) and M 3 (ρ i (δ j+1 )), where M 3 (·) is the function performing the moving average on a 3 week interval, i.e. a 21 sample moving window, centered in δ j , which allows us to filter out the oscillations in ρ i (δ j ) getting a smoother time course for v i (t), i = 1, 2, 3 (right bottom panels of Figs. [4][5][6]. Passing to the choice of w i (t), f i (t), i = 1, 2, 3, a rapid change in their time behaviour was required in order to fit adequately the total number of deaths and recoveries of the three groups. The mentioned time variations for w i (t), i = 1, 2, 3, can be explained by assuming an increased medical experience (reached almost at the end of the first wave) in curing the side effects of COVID-19 and in assisting mildly infective people, so avoiding the acute phase of the infection. On the other hand, the change in time of f i (t), i = 1, 2, 3, can be motivated by a likely initial defect in the notification process reporting the daily number of recoveries, especially due to the difficulty of monitoring the infection course of positive people at the beginning of the epidemic in our country.
As far as z i,j (t) is concerned, since uniform mobility restrictions have been implemented in Italy until mid October, we assume in this analysis z i,j (t) = z(t) for any pair i, j and for any time. So, the function z(t) (quantifying the level of mobility between regions) was set to 1.0 throughout the entire estimation interval, since no movement of people was allowed from March 9 to June 3. Obviously, this is a simplifying assumption as some people did actually move for serious or necessity reasons, though observing strict precautionary and distancing measures. However, such a modelling choice allowed to separate the parameter identification of the three ODE submodels, since each dynamical system depends only on the state variables of the related group as long as the condition z(t) = 1 is satisfied.
We finally assume a null infective input in each area from abroad during the whole identification period, i.e. Λ Ei (t) = Λ Iu i (t) = 0, i = 1, 2, 3, since strong restrictions and controls for people coming from foreign countries were carried out in Italy during the lock-down.
Concerning the initial conditions of the state variables at δ 0 = 9 March 2020, we fixed I di (δ 0 ) equal to the ISS data on the number of current diagnosed positives of group i, while we computed E i (δ 0 ), I ui (δ 0 ), R i (δ 0 ), i = 1, 2, 3, according to the ratios E i (δ 0 )/I di (δ 0 ) = I ui (δ 0 )/I di (δ 0 ) = 11 and R i (δ 0 )/I di (δ 0 ) = 1, i = 1, 2, 3, that we inferred from preliminary simulations on the whole national population [16]. Finally the number of susceptibles of group i was computed using the measured P i by means of the relation Numerical solutions of the ODE system (2)-(6) were obtained by a MATLAB procedure implementing a classical Runge-Kutta method, exploiting the ode45 solver. The best fitting procedure is based on fminsearch function of MATLAB, suitably modified to guarantee parameter estimates falling in the physical (positive) range. The estimates of the model parameters in each group are reported in Table 2 while the related optimal fitting curves of data i)-iii) are shown in panels A-C of Figs. 4 -6. The control actions exploited for each fitting are reported in panel D of the same figures.
As a comment to the obtained parameter estimates, we note that the difference of the relative infectivity β i in the three subpopulations is basically attributable to the bilinear form of the incidence rate, β i S i I ui , and to the different population sizes P i , i = 1, 2, 3. Indeed, as the bilinear incidence approximates the standard incidence,β i S i I ui /P i , which is normalized to the total population size, the coefficients β i , i = 1, 2, 3, of our formulation are actually of the order of the inverse of the population sizes P i , i = 1, 2, 3 (β i ∼ O(1/P i )). So, it is straightforward to verify that the productsβ i = β i P i , i = 1, 2, 3 differ only slightly from each other, ranging about in 0.2-0.3 day −1 .
Another comment about the estimates of Table 2 concerns the fraction of subjects showing recognisable symptoms within the infective population I ui . It can be noticed that the estimated values of φ i , i = 2, 3 (Centre, South) is rather higher than φ 1 (North). This could be explained by the different (smaller) number of swab tests initially implemented in Centre-South with respect to the amount of tests performed in North Italy. Indeed, as the virus was prevalent in the Northern area at the beginning of the epidemic outbreak, a very intense test campaign was performed in that area, focusing not only on suspected cases with symptoms (as done in Centre-South) but including also the general population even in the absence of recognisable symptoms. Based on the parameter estimates given in Table  2 and on the expressions of the reproduction number of Section 3, we can now provide some predictions on the initial value of such an indicator for the three isolated subsystems and for the interconnected model. As the three areas were actually isolated during the estimation interval (z i,j = 1 from 09/03 to 03/06) we can provide an initial estimation of their reproduction numbers, exploiting Eq. (41). In particular, taking into account the control actions on March 9 (first day of lock-down) and the estimated parameters, we obtain R 1 = 5.45, R 2 = 3.49, R 3 = 4.94.
Moreover, we can provide also an evaluation of the basic reproduction number of the whole country by computing the maximal eigenvalue of matrix Υ given by Eq. (87). In particular, setting the onset time of the epidemic spread in Italy at t 0 << March 9 when the "first" infected appeared in Italy and the mobility was completely active, while no other control actions were implemented (u c i = v c i = z c i,j = 0 in Eq. (87)), we obtain a basic reproduction number R = σ(Υ ) = 5.03. This value is located near the higher literature estimates of the basic reproduction number reported for instance in the study [20].
The identified model parameters, along with the mathematical expression of the reproduction number R given in Section 3, allow also to highlight some interesting aspects on the control action intensity which is necessary to contain the disease spread in Italy.  shows how the reproduction number of the whole country changes when the intensities of the swab test campaign and of the social contact limitations change. In particular the plot depicts the behaviour of R = σ(Υ ) (Υ given by Eq. (87)) with respect to v c i = v and u c i = u, i = 1, 2, 3, when the mobility is completely active (z c i,j = 0, i, j = 1, 2, 3). The plot highlights the following aspects: in the absence of social contact limitations the containment of the disease spread could be achieved provided that a very intense swab test campaign is performed: indeed, when u = 0, we get R < 1 for v < 0.12; this means that at least the 10% of the whole population should be tested every day, so requiring about 6 million of swabs per day. This scenario of intense infection tracing is very hard to be actualized, suggesting that some contact limitations are mandatory; -performing a strong limitation of the social and economic activities which reduces the human contacts more than 80%, could be enough to contain alone the outbreak; in fact, the condition u > 0.8 provides R < 1 independently of the intensity of v; -it is possible to infer a feasible trade off between social contact limitation and infection tracing, which is not very heavy from both aspects; implementing a reduction of social contacts of 70-80%, i.e. keeping u ∈ (0.7, 0.8) (note that u is at least 0.96 in the lock-down period), allows to contain the outbreak if v ∈ (0.0002, 0.02). This means that from about 12 thousand (for u = 0.8) to about 1.2 million (for u = 0.7) of swabs per day are required.
We notice also that Figure 7 does not significantly change when the mobility level change, even increasing 100 times the coefficients c ij (not shown), suggesting that the stability properties are not sensibly affected by the mobility level. However, as shown by the case study reported in the next section, the human mobility plays an important role in making uniform the spatial distribution of the infection throughout the country, also contributing to a time advance of the second wave.
4.2 Evaluation of the effect of switching on the mobility among areas after June 3rd, 2020 This section is devoted to the numerical simulation of the complete model to investigate the impact that restoring mobility and connections among the groups of regions had on the epidemic evolution of each area. Unrestricted mobility officially started as of June 3 in our country and we aim to reproduce the dynamics of the period following the lock-down phase, quantifying some policies and actions adopted with particular attention to the mobility aspects. Therefore, on the basis of the fitting of the separated models previously obtained, we fix values for the remaining model parameters (representing inter group exchanges) and we start tuning a tentative mobility framework so as to satisfactorily reproduce the epidemiological data updated with respect to time. Once the model is suitably calibrated with respect to the mobility and control variables, we hypothesize and simulate different evolutionary scenarios that could be envisaged as a consequence of adopting control actions that combine or exclude some restrictions actually adopted in Italy.
In the following, we propose a procedure to derive realistic time courses for the control actions translating into model quantities the social behaviours and sanitary actions. The unknown model variables are adjusted so that the model adequately reproduces the epidemiological data over the chosen period ending on October 21, which is an observation interval that allows to investigate the role played by the increased number of movements in summer on the trigger of the second pandemic wave. Epidemiological data used to adjust the model behaviour are the total number of reported positive cases and the daily number of new cases. The reason for considering these data is that they carry "cumulative" information on the total number of subjects notified since the beginning of the observation period and they depend on a reduced number of model parameters. The model quantities representing the mentioned measured data are for the total cases of area i at δ j and the related increment C i (δ j )−C i (δ j−1 ) for the daily new cases. Suitable values of u i (t) and z(t), as well as of Λ Ei (t), Λ Iu i (t), i = 1, 2, 3 have been accurately searched by trial and error until satisfactory data reproduction. For simplicity sake, we assume that a single cumulative external flux Λ i (t) enters the infected communities and that it is equally split up between E i and I ui , i.e. Λ Ei (t) = Λ Iu i (t) = Λ i (t)/2, i = 1, 2, 3. Next, ad-hoc time behaviours have to be chosen, according to the available data and official sources, for the coefficients c i,j (t) in order to represent both the "ordinary mobility" (occurring all year round) and the unbalanced transfer of people characterizing the holiday exodus. Concerning the other control actions, v i (t) is deduced by data on the number of swab tests (as done in the previous section), while w i (t) and f i (t) are kept fixed to the last value obtained during the identification period. Note that, as it can be seen from the model equations, the total number of positive cases does not depend on w i (t) and f i (t) (as well as on the actual value of the model parameters µ I d i and γ I d i ), so that an accurate tuning of these functions is not required to represent this quantity.
A first comment that can be made about the total number of positive cases is that their extents in the three geographical areas considered were highly heterogeneous since the beginning of the pandemic outbreak and the same difference was basically maintained until October, as it shown by Fig. 8. The total number of cases reported at the beginning of August (around day 150) in the North was about 200000, while a sensible smaller (about one order of magnitude) number of positive cases were reported by Centre (∼ 28000) and South (20000).
Let us now go into the quantitative details of the control tuning. Figure 9 depicts the time course finally adopted for the control actions, which has been established on the basis of the observations reported in the following. The relatively low number of the daily new cases in the central and southern regions during the mentioned period suggests that neither substantial changes in the social behaviour, like the relaxation of precautionary measures or of the social distancing, nor a sensible importation of new infections from outside occurred during the first phase of summer. Therefore, for i = 2, 3, the controls u i (t) should be taken as high as it was during the lock-down, i.e. 0.92, while taking the inputs Λ i (t) equal to zero. On the contrary, the number of cases in the North keeps on increasing after the end of the lock-down, although with a slope smaller than before May 4. For this reason, in order to reproduce the data, it is reasonable to hypothesize a decrease of u 1 (t) after reopening, or a non-negligible Λ 1 (t), or both these contributions. We assumed both, which means a decrease in u 1 (t) (until mid September) and a concomitant increase of Λ 1 (t) just after the reopening. As it can be seen from panels B, C of Fig. 9, the new values set for the inputs Λ i (t), i = 1, 2, 3, after the 3 June reopening are maintained up to 21 October. Moreover, we assume a sensible change occurring in u i (t), i = 1, 2, 3, (the second reduction for u 1 (t)) around mid September. On this date the restart of production activities and school re-opening increased the "intra-group" mobility and the consequent social contacts among people. As explained in detail below, this last change of u i (t), i = 1, 2, 3, in addition to the "inter-groups" mobility, is actually a necessary assumption in the simulation to explain the sharp increment of cases starting from the end of September in all the three areas (see Fig. 8) .
As far as the mobility controls z(t) and c i,j (t), i = j, i, j = 1, 2, 3 are concerned, they are chosen on the basis of the following considerations. First, according to the government decrees allowing people to travel as of June 3, we assume that the mobility is restored gradually along about forty days, so that z(t) starts decreasing from 1 (no mobility) on June 3 going to 0 (complete mobility) on July 15. In addition, we need to quantify the "regular mobility", i.e. the amount of people transfers for study and work, or just for visiting under "ordinary" conditions. So, in order to model the "regular mobility" we keep the coefficients c i,j (t), i = j, i, j = 1, 2, 3, constant, say c i,j (t) =c i,j , i = j, i, j = 1, 2, 3, before and after the holidays (which means out of the interval 15 July -30 September). Such constant coefficients are tuned in order to provide approximately 100000 persons travelling each day and between each pair of areas, paying attention in balancing the macro-areas demographies, i.e. assuming a constant number of people in each of them. In particular, we choosec 1,2 = 4 · 10 −3 , c 1,3 = 4 · 10 −3 ,c 2,1 = 8 · 10 −3 ,c 2,3 = 8 · 10 −3 ,c 3,1 = 6 · 10 −3 ,c 3,2 = 4 · 10 −3 (see Fig. 9, panel A).
From the beginning of August till the end of September, a remarkable mass departure of Italians from North to Centre and South took place, with a sensible delay with respect to previous years and with a remarkable increment of internal travelling towards the Southern seas or mountains. Indeed, a statistical analysis reported in September 2020 by ENIT (the Italian National Agency for Tourism) shows that almost 24 millions of Italian went to Centre-South for their holidays [18]. For this reason, as of July 20 we impose changes to the coefficients c i,j (t), i = j, i, j = 1, 2, 3, to reproduce an unbalanced demographic movement towards the Central and Southern populations whose size increases of some millions of persons (the northern population was reduced accordingly). Conversely, from about August 31 the mobility across Italy progressively changed again up to mid September, producing the opposite migration of people from Centre-South to North. The summer exodus is thought to terminate at the end of September. A temporal scheme of the model quantities so assumed is illustrated in Fig. 9, while their simulated effect on the susceptible populations (that basically coincide with the entire resident populations of the areas) is shown by the upper panels of Fig. 10.
The time course of the control actions assumed as described (see also Fig. 9) allows to reproduce well the data behaviour by means of the model. This is shown in Fig. 10 for the total number of cases (middle panels) and for the daily number of new cases (lower panels) reported from June 3 to October 21. The main comment that we can give is that the combination of the sum-mer exodus from North to Centre-South along with the actually enhanced virus circulation in the North (u 1 (t) lowered and Λ 1 (t) increased in the model) produces the joint spread and increment of positive cases throughout the three areas, and especially in Centre-South. Indeed, in the interval 150-200 days the total cases present a net slope change (Fig. 10, middle panels) while the daily number of new cases take a more bell-shaped course (lower panels of Fig. 10). Overall, such increment of positive cases was rather limited, which in our opinion proves that the general mobility increment during the summer holidays was not sufficient to directly induce the feared second wave. However, they did probably contribute to the second wave since the fast increase of cases after September 15 can be actually reproduced only by combining the further sensible reduction of containment measures (i.e. u i (t)) after that date with the assumptions on mobility and social behaviour made for the summertime. This combination of effects can be demonstrated also by the predictions reported below, which we construct by simulating the removal of one control variable at a time. Middle panels: total number of cases. Lower panels: new daily cases. Circles: ISS data [17]. Solid lines: model predictions.
So, let us now evaluate the impact of the single control actions, evidencing how the combination of the different controls assumed above is necessary to obtain a sufficiently accurate reconstruction of the data. In particular, our aim is to evaluate how different the situation could have been by adopting different control strategies. The following two scenarios are schematized by model simulation: (1) epidemic evolution keeping the restricted interregional mobility even after June 3; (2) epidemic evolution with restored mobility, but supposing that stronger precautionary measures and restrictions on social contacts are applied. Scenario (1) can be simulated setting z(t) = 1 for any time (until October 21). Scenario (2) can be modelled by fixing Λ 1 (t) = 0 for any time and keeping u 1 (t) = 0.96 at least until the middle of September (instead of allowing u 1 (t) to go down to 0.67 after the reopening, and before mid September, as in the reference simulation). Figures 11  and 12 report the model predictions obtained for the two scenarios (1) and (2) supposing valid the variations mentioned above w.r.t. the reference situation of Fig. 10.  Fig. 9. Upper panels: susceptible population S(t) northern, central, and southern from left to right. Middle panels: total number of cases. Lower panels: new daily cases. Circles: ISS data [17]. Solid lines: model predictions.
From the results of Fig. 11 it is evident that, if the interregional mobility had been prevented (i.e. keeping closed the borders of the three areas), a second wave would have occurred sooner in the North, where the virus circulation was higher than elsewhere, but it would have spared the other areas from a sharp increment of infections (at least until October 21). So the simulation evidences the role of the holiday mobility in the diffusion of the epidemic along the Italian territory, making the number of infections more uniformly distributed in the country and delaying the second wave.
Conversely, from Fig. 12 we can conclude that, if stricter rules for the social contacts and stronger precautionary measures had been imposed at the reopening (after June 3), the second wave would have been avoided in all the three groups until October 21. Note that the reduction of the controls u i (t), i = 1, 2, 3, af-  Fig. 9. Upper panels: susceptible population S(t) northern, central, and southern from left to right. Middle panels: total number of cases. Lower panels: new daily cases. Circles: ISS data [17]. Solid lines: model predictions.
ter the middle of September is not sufficient to produce the rapid increase that we actually observe in the real data over a month (from 15/09 to 21/10). Indeed, from the middle and lower panels of Fig. 12 we can notice a barely perceptible increase of the simulated cases (both total and new daily cases) at the end of the considered time interval (indeed, the same number of cases are basically obtained until October 21 without reducing u i (t) from the middle of September). This means that the second wave is slowly starting from the final days of the considered time interval, because of the reduction of u i (t), i = 1, 2, 3, at the middle of September, but if this had been the case the disease progression could have been slower giving more time to the government for effective interventions to limit the impact of the second wave.

Concluding remarks
In this paper, an enriched version of the classical SEIR model, including the effects of asymptomatic individuals on the COVID-19 outbreak, is proposed to describe the spread of a virus among different geographically defined populations, with the aim of analysing the consequences of interregional people mobility on the virus transmission.
After an in-depth model analysis, we present a case study, which is interesting by itself and also as a possible future scenario, considering the effects of the people fluxes among Italian macro-areas in summer 2020. During that period, the mobility restrictions were relaxed and a considerable number of movements with predominant flow orientation from North towards Centre-South (in early summer, and backward, in late summer) occurred.
By using official Italian epidemiological data, it is evidenced the capability of the proposed modeling of capturing the effects of individual mobility among regions characterized by different local evolutions and effects of government decrees to regulate and limit such interregional mobility.
The study of the expression of the reproduction number showed its dependency from the model parameters; in particular, it is stressed the influence of the swab test campaign and its relation with the changes in social contacts limitations, showing the possibility of an acceptable trade off between social contacts limitation and infection tracing once an high number of swab test is processed.
The numerical results obtained by ad hoc model simulations evidenced the capability of capturing the effects of the individual mobility among regions, characterized by different local evolutions, and the consequences of government decrees to regulate and limit such interregional mobility. It is proved that the mobility increment during summer was not sufficient to immediately induce the second wave, but it was probably decisive to its onset when the mass departure effect was combined with the restart of the productive activities and school reopening. Preventing the mobility among the macro-areas considered would have probably saved the Centre-South from a sharp contagion increase after the summer, while producing a time advance of the second wave in the North.
Future developments of the present study can be represented by the use of the identified models and the quantified contribution of mobility, social and economic activities, and schools, to analyse and predict the effects of the combination of containment actions in different possible scenarios.

Funding
The work was partially supported by Sapienza University of Rome, grants No. 806/2019 and 729 009 19