Applications of combinatorial optimization arising from large scale surveys / Reale, Alessandra. - (2015 Nov 20).

Applications of combinatorial optimization arising from large scale surveys

REALE, ALESSANDRA
20/11/2015

Abstract

Many difficult statistical problems arising in censuses and other large-scale surveys have an underlying Combinatorial Optimization structure and can be solved with Combinatorial Optimization techniques. These techniques are often more efficient than the ad hoc solution techniques already developed in the field of Statistics. This thesis considers in detail two relevant cases of such problems and proposes solution approaches based on Combinatorial Optimization and Graph Theory. The first problem is the delineation of Functional Regions; the second is the selection of the scope of a large survey. Both are briefly described below. The purpose of this work is therefore the innovative application of known techniques to important and economically relevant practical problems that the "Censuses, Administrative and Statistical Registers Department" (DICA) of the Italian National Institute of Statistics (Istat), where I am a senior researcher, has been dealing with.

In several economic, statistical and geographical applications, a territory must be partitioned into Functional Regions. This operation is called Functional Regionalization. Functional Regions are areas that typically extend beyond administrative boundaries, and they are of interest for evaluating the social and economic phenomena under analysis. They are not fixed or politically delimited, but are determined only by the interactions among all the localities of a territory. In this thesis, we focus on interactions represented by the daily journey-to-work flows between the localities in which people live and/or work. Functional Regionalization of a territory often turns out to be computationally difficult because of its size (that is, the number of localities constituting the territory under study) and the nature of the journey-to-work matrix (that is, its sparsity).

We propose an innovative approach to Functional Regionalization based on the solution of graph partitioning problems over an undirected graph, called the transition graph, which is generated from the journey-to-work data. The problem is solved by recursively partitioning the transition graph using the minimum cut algorithms proposed by Stoer and Wagner and by Brinkmeier. This approach is applied to the determination of the Functional Regions for the Italian administrative regions.

The target population of a statistical survey, also called its scope, is the set of statistical units that should be surveyed. In some large surveys or censuses, the scope cannot be the set of all available units but must be selected from this set. Surveying each unit has a cost and contributes a different portion of the total information. In this thesis, we focus on the case of the Agricultural Census. Here the units are farms, and we want to determine a subset of units that has minimum total cost while safeguarding at least a certain portion of the total information, according to the coverage levels assigned by the European regulations. Uncertainty also arises, because the portion of information corresponding to each unit is not perfectly known before surveying it. The basic decision is to establish the inclusion criteria before surveying each unit. We propose to solve this problem using multidimensional binary knapsack models.
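
The recursive minimum cut idea mentioned in the abstract can be illustrated with a small, self-contained sketch. It is not the thesis's implementation: the way directed flows are symmetrized into edge weights, the stopping rule (`threshold`), the function names and the toy flows are all assumptions made for illustration, and only the Stoer and Wagner algorithm is shown, not Brinkmeier's.

```python
def build_transition_graph(flows):
    """Assumed construction: sum the flows in both directions into one undirected weight."""
    graph = {}
    for (origin, dest), commuters in flows.items():
        graph.setdefault(origin, {})
        graph.setdefault(dest, {})
        if origin != dest:  # people living and working in the same locality add no edge
            graph[origin][dest] = graph[origin].get(dest, 0) + commuters
            graph[dest][origin] = graph[dest].get(origin, 0) + commuters
    return graph


def stoer_wagner(nodes, graph):
    """Global minimum cut of the subgraph induced by `nodes` (at least two nodes).

    Returns (cut_weight, side), where `side` is one shore of the cut.
    """
    nodes = sorted(nodes)
    node_set = set(nodes)
    # Working copy of the weights, restricted to the induced subgraph.
    w = {u: {v: wt for v, wt in graph[u].items() if v in node_set} for u in nodes}
    merged = {u: [u] for u in nodes}  # original localities inside each super-node
    best_weight, best_side = float("inf"), None
    while len(w) > 1:
        # Minimum cut phase: repeatedly add the most tightly attached node.
        start = next(iter(w))
        order = [start]
        attach = {v: w[start].get(v, 0) for v in w if v != start}
        while attach:
            nxt = max(attach, key=attach.get)
            cut_of_phase = attach.pop(nxt)
            order.append(nxt)
            for v, wt in w[nxt].items():
                if v in attach:
                    attach[v] += wt
        s, t = order[-2], order[-1]
        if cut_of_phase < best_weight:
            best_weight, best_side = cut_of_phase, list(merged[t])
        # Merge the last node t into s before the next phase.
        merged[s].extend(merged.pop(t))
        for v, wt in w.pop(t).items():
            del w[v][t]
            if v != s:
                w[s][v] = w[s].get(v, 0) + wt
                w[v][s] = w[s][v]
    return best_weight, set(best_side)


def regionalize(nodes, graph, threshold):
    """Split recursively while the minimum cut is lighter than `threshold` (assumed rule)."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return [nodes]
    weight, side = stoer_wagner(nodes, graph)
    if weight >= threshold:
        return [nodes]  # localities are tightly linked: keep them as one region
    return regionalize(side, graph, threshold) + regionalize(nodes - side, graph, threshold)


# Toy journey-to-work flows (invented numbers): (residence, workplace) -> commuters.
flows = {
    ("A", "B"): 50, ("B", "A"): 40, ("A", "C"): 30, ("B", "C"): 20,
    ("C", "D"): 2, ("D", "C"): 1, ("D", "E"): 60, ("E", "D"): 45,
    ("A", "A"): 100,
}
graph = build_transition_graph(flows)
regions = regionalize(graph, graph, threshold=10)
print(sorted(sorted(region) for region in regions))
```

In this toy example the only weak link is the one between C and D (total weight 3), so the first cut separates {D, E} from {A, B, C}, and neither part is split further because its internal minimum cut exceeds the threshold.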
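
One way to read the scope selection problem as a multidimensional binary knapsack model is sketched below. The notation is hypothetical and not taken from the thesis: the symbols, the number of constraints and the exact form of the coverage requirement are assumptions based only on the description above.

```latex
% Hypothetical notation (assumed for illustration):
%   x_i      = 1 if farm i is included in the scope, 0 otherwise
%   c_i      = cost of surveying farm i
%   a_{ij}   = estimated contribution of farm i to variable j
%   \alpha_j = coverage level required for variable j, with 0 < \alpha_j \le 1
\begin{align*}
\min_{x} \quad & \sum_{i=1}^{n} c_i x_i \\
\text{s.t.} \quad & \sum_{i=1}^{n} a_{ij} x_i \ge \alpha_j \sum_{i=1}^{n} a_{ij},
    && j = 1, \dots, m, \\
& x_i \in \{0, 1\}, && i = 1, \dots, n.
\end{align*}
```

Each of the m constraints requires the selected farms to cover at least a fraction of the total for one variable, which is what makes the model multidimensional. Because the contributions are only estimates before the survey, a model of this kind would have to account for that uncertainty in some way.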
Files attached to this record

File: Tesi di Dottorato di Alessandra Reale_VERSIONE_FINALE_PER_PADIS.pdf
Access: open access
Note: Dissertation submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in OPERATIONS RESEARCH
Type: Doctoral thesis
License: Creative Commons
Size: 3.64 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/917556