Big Data Analysis: Methodologies, Frameworks and Real-World Applications
Date
2023-06-28
Publisher
Università della Calabria
Abstract
In recent years, the capacity to produce and collect data has increased exponentially. The huge amount of data generated, commonly referred to as Big Data, the speed at which it is produced, and its heterogeneity in terms of format represent a challenge to current storage, processing, and analysis capabilities. This scenario requires the design and implementation of new architectures and analytical platform solutions that must process Big Data to extract complex predictive and descriptive models. Today, high-performance computing (HPC) infrastructures such as highly parallel clusters, supercomputers, and clouds can be used for processing and analyzing massive sources of real-world data in various fields, including genomic sequencing and medical research, fraud detection, and weather forecasting. Following these preliminary observations, the goal of this thesis is twofold. First, the main challenges to be solved for implementing innovative data analysis applications on HPC systems are investigated. In particular, the key research topics addressed include: (i) studies of software systems for Big Data storage, processing, and analysis; (ii) methods, techniques, and prototypes designed and used to implement Big Data solutions on massive data sources requiring the use of high-performance computing systems; and (iii) design and programming issues for Big Data analysis in Exascale systems, which will represent the next computing step. Second, several innovative applications and use cases of Big Data analytics that can be implemented in large-scale parallel systems are proposed. These research contributions provide new insights and solutions for extracting useful knowledge from large volumes of data, describing methods and mechanisms to support users, practitioners, and scientists working in the area of Big Data in the design and execution of data analysis techniques in different application domains.
Description
Università della Calabria. Degree course in Ingegneria Informatica, Modellistica, Elettronica e Sistemistica (DIMES). PhD program in Information and Communication Technologies (ICT). Cycle XXXV
Keywords
big data analysis, high-performance computing (HPC), social data mining, machine learning, infectious diseases modelling