The growing number of cybersecurity threats is driving the development of new detection and countermeasure techniques based on the analysis of Big Data. In this setting, the MapReduce paradigm has quickly become the de facto standard for carrying out such processing. This has led to a surge in job offerings that require this skill, and to a significant increase in the number of computer science courses covering the paradigm and its most popular implementations, Apache Spark and Apache Hadoop. This paper presents a solution for supporting the teaching of MapReduce through software visualization. The proposed solution has two main goals. The first is to help students understand how the MapReduce paradigm solves a complex problem by decomposing it into simpler subproblems, each of which is solved by means of a map and/or a reduce operation. The second is to show how an input dataset is partitioned into blocks and processed in parallel by the different computing units of a distributed computing system. In both cases, software visualization techniques with appropriate graphical metaphors help students understand what is going on by providing a graphical representation that, on the one hand, describes how the algorithm under consideration works on a real dataset and, on the other, illustrates the speed-up achieved thanks to the distributed approach. Our solution is based on the Spark implementation of the MapReduce paradigm. It allows a user (either a teacher or a student) to assemble a distributed MapReduce computation by interacting with a selection of supported distributed operations. Once an operation is selected, it is executed and visualized by means of an animation designed to reflect the distributed nature of the operation. The sequence of animations obtained by executing a flow of operations illustrates their behavior while showing how the input dataset is transformed over time.
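The two ideas named in the abstract, decomposing a problem into map and reduce operations and partitioning the input into blocks, can be illustrated with a minimal sketch in plain Python. This is not the tool described in the paper and it does not use Spark. The word-count task, the sample input, and the function names (`partition`, `map_block`, `reduce_pairs`, `word_count`) are all assumptions chosen for illustration, and the blocks are processed sequentially rather than on separate computing units.

```python
from collections import Counter
from functools import reduce


def partition(records, num_blocks):
    """Split the input into num_blocks blocks (round-robin assignment)."""
    return [records[i::num_blocks] for i in range(num_blocks)]


def map_block(block):
    """Map step: turn every line of one block into (word, 1) pairs."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def reduce_pairs(pairs):
    """Reduce step: sum the counts that share the same key."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def word_count(lines, num_blocks=2):
    blocks = partition(lines, num_blocks)
    # In a distributed system each block could go to a different computing
    # unit; in this sketch the blocks are handled one after another.
    partial = [reduce_pairs(map_block(block)) for block in blocks]
    # Merge the per-block partial counts into the final result.
    return reduce(lambda a, b: a + b, partial, Counter())


# Hypothetical sample input, made up for this illustration.
lines = ["spark and hadoop", "map and reduce", "spark map"]
result = word_count(lines, num_blocks=2)
print(dict(result))
```

Changing `num_blocks` alters how the lines are distributed across blocks but not the final counts, which is the property that lets a real framework process the blocks in parallel.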
Using Software Visualization for Supporting the Teaching of Map Reduce / Ferraro-Petrillo, Umberto. - (2018). (Paper presented at the 12th International Conference on Network and System Security, held in Hong Kong).
Using Software Visualization for Supporting the Teaching of Map Reduce
Ferraro-Petrillo Umberto
2018
File | Size | Format | |
---|---|---|---|
Ferraro Petrillo_Software-Visualization_2018.pdf (archive managers only). Type: Post-print document (version following peer review and accepted for publication). License: All rights reserved | 326.34 kB | Adobe PDF | Contact the author |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.