Dipartimento di Ingegneria Informatica, Modellistica, Elettronica e Sistemistica - Doctoral Theses

Permanent URI for this collection: https://lisa.unical.it/handle/10955/31

This collection gathers the doctoral theses of the Department of Computer Engineering, Modelling, Electronics and Systems Engineering (Dipartimento di Ingegneria Informatica, Modellistica, Elettronica e Sistemistica) of the Università della Calabria.

Search Results

Now showing 1 - 10 of 219

Big Data Analysis: Methodologies, Frameworks and Real-World Applications
(Università della Calabria, 2023-06-28) Branda, Francesco; Fortino, Giancarlo; Talia, Domenico
In the last years, the capacity to produce and collect data has increased exponentially. The huge amount of data generated, commonly referred to as Big Data, the speed at which it is produced, and its heterogeneity in terms of format represent a challenge to current storage, processing, and analysis capabilities. This scenario requires the design and implementation of new architectures and analytical platform solutions that must process Big Data to extract complex predictive and descriptive models. Today, high-performance computing (HPC) infrastructures such as highly parallel clusters, supercomputers, and clouds can be used for processing and analyzing massive sources of real-world data in various fields, including genomic sequencing and medical research, fraud detection, and weather forecasting. Following these preliminary observations, the goal of this thesis is twofold. First, the main challenges to be solved for implementing innovative data analysis applications on HPC systems are investigated. In particular, the main key research topics addressed include: (i) studies of software systems for Big Data storing, processing, and analysis; (ii) methods, techniques, and prototypes designed and used to implement Big Data solutions on massive data sources requiring the use of high-performance computing systems; and (iii) design and programming issues for Big Data analysis in Exascale systems, which will represent the next computing step. Second, several innovative applications and use cases of Big Data analytics that can be implemented in large-scale parallel systems are proposed. These research contributions provide new insights and solutions for extracting useful knowledge from large volumes of data, describing methods and mechanisms to support users, practitioners, and scientists working in the area of Big Data in the design and execution of data analysis techniques in different application domains.
Semantic control for the Cybersecurity domain: investigation on the representativeness of a domain-specific terminology referring to lexical variation
(Università della Calabria, 2021-05-12) Lanza, Claudia; Guarasci, Roberto; Crupi, Felice
The underlying idea of this PhD research project is to develop a model meant to guarantee the terminological coverage of a semantic resource, such as a thesaurus, and its representativeness threshold with reference to semantic variation over time within a highly specialized domain, such as Cybersecurity. By building an Italian thesaurus related to the Cybersecurity domain, this project aims to offer organizations a knowledge representation of the field of study in Information and Communications Technology (ICT) security that is as complete as possible. The development of an Italian thesaurus for the Cybersecurity knowledge domain is part of the activities included in the main project “Cybersecurity Observatory” held by the Institution of Informatics and Telematics (IIT) at the National Research Council (CNR) sited in Pisa (Italy). The thesis describes the steps followed for the construction of the Italian Cybersecurity thesaurus and for the assessment of a multi-domain methodology to fix a semantic representativeness threshold with reference to the qualitative richness of terms within a specialized domain and the variation in information related to the latter over time.
The main phases henceforth described are related to (1) a presentation of the principal reasons for building a semantic tool, such as a thesaurus, as a means of semantic control for a specific domain; (2) a description of the steps which characterize the corpus creation and the terminological extraction through the use of specific Natural Language Processing (NLP) tasks and linguistic pattern configuration within the employed software; (3) the way a bilingual thesaurus and a bilingual ontology have been realized by creating parallel and comparable corpora; (4) a presentation of a model of mapping existing standards on Cybersecurity in English to all the head terms contained in the source corpus in Italian through Python scripts in order to evaluate which candidate terms should be chosen for inclusion in the thesaurus; (5) a descriptive section on the work done in migrating the terms and their relationships from the Italian thesaurus on Cybersecurity to an ontology system; (6) the phase related to keyphrases extraction, with the help of document oriented algorithms, i.e., Multipartite Rank or TopicRank, from the source documents. This was carried out to obtain a targeted clustering of the domain and as an aide in the process of semantic abstraction, needed to better systematize the structure of thesaurus’ main entry categories; (7) the exploration of new methodologies, i.e., distributional semantics, term variation, pattern-based detection schemes or inference from the Web Ontology Language (OWL) properties, to deduce the technical information included in the source corpus with the goal of automatically generating the semantic network of connections between the representative terms of the Cybersecurity domain in a thesaurus system; (8) a future perspective, accompanied by evolving examples in practice, of creating an additional database to populate the Cybersecurity source corpus through the use of the social media world. 
Twitter is one of the preferred web portals from which to retrieve information about the domain: this new information flow should give the semantic resources set up for Cybersecurity knowledge organization a higher level of terminological density to analyze, in order to improve their semantic coverage.
DEM-CFD simulation of fluid-particle flow in carrier-based Dry Powder Inhalers for pharmaceutical applications
(Università della Calabria, 2023-01-05) Alfano, Francesca Orsola; Conte, Enrico; Di Maio, Francesco Paolo; Di Renzo, Alberto
Dry powder inhalers (DPI) are medical devices specifically engineered to ensure maximum and effective delivery of active pharmaceutical ingredients (API) in powder form upon inhalation by a patient. In this work, highly challenging CFD-DEM simulations are utilized to deterministically track the motion of both carrier and API particles in dry powder formulations along their flow from the dose cup through the exit of a swirl-flow-based dry powder inhaler. To achieve this purpose, a combination of different solutions is adopted: a sufficiently small time-step is coupled to scaled contact/adhesive interaction parameters; grid-based contact detection and fluid-to-particle and particle-to-fluid interpolation of the gas-solid interaction variables, i.e. gas velocity and voidage and drag force; a rolling friction model to allow for appropriate adhesion behaviour of the particles. Single phase air-flow, coupled air-carrier particle flow and coupled air-carrier-API particles are characterized in the device for different typical inhalation conditions. The aim is to investigate and gain detailed insight on all stages of the particles' lift-up, aero-dispersion, de-aggregation, interparticle and particle-wall collisions across the scales from few micron sized API powders to a commercial sized device. Thanks to a 4-way coupled CFD-DEM model, inertial, collisional, rotational and inter-particle adhesion effects can be taken into account in modelling the coupled air and particle dynamics.
Ensemble of deep learning prediction models for data analytics
(Università della Calabria, 2021-06-21) Zicari, Paolo; Fortino, Giancarlo; Folino, Gianluigi
The abundance of available unstructured or raw text requires the automatic extraction of information for different tasks. One of the most relevant, Text Classification, extracts this information by assigning informative labels to raw texts from a pre-defined set. Deep Learning (DL) offers challenging solutions to the automatic text classification problem. Despite the great potentialities of DL-based text classifiers, current solutions are exposed to a number of challenging issues that frequently occur in scenarios where text categorization is used in real-life applications. First of all, a large number of labelled data are usually necessary to train a deep model adequately, while labelling texts is time-consuming, expensive, and very often requires specific knowledge. Moreover, configuring the structure and hyper-parameters of a Deep Neural Network (DNN) architecture is a difficult task, which entails long and careful design and tuning activities to make the DNN perform well. Typical scenarios are characterized by the fact that classes are often imbalanced. These issues entail a high risk of eventually obtaining a DNN-based classifier that overfits the training data and relies on non-general, biased and unreliable classification patterns. On the other hand, the black-box nature of a DNN model does not allow for easy reasoning on which features of a data instance drove the model to its classification decision. The work in this thesis, starting from the general problem of text classification, focuses on some challenging aspects associated with using an ensemble of deep learning methods to classify raw texts.
More in detail, this work focuses on the analysis, exploration, study and test of algorithms and learning models to be employed in the proposal of novel techniques of Ensemble Deep Learning (EDL) aimed at performing classification and explanation tasks and on the research of semi-supervised strategies based on pseudo-labelling for improving classifier prediction performances in case of scarcity of labelled data. To this aim, this thesis proposes a complete framework based on the paradigm of ensembles of deep learning algorithms. The proposed framework is designed to furnish a valid instrument for exploring, validating and testing the proposed novel deep ensemble techniques contextualised in real-life applications, covering the entire classification process, including preprocessing, learning model building, explanation of the results, self-training for scarce labelled data, human-in-the-loop validating and model refining. Even though the methods proposed in this work could be used in any field of interest, the problem of extracting information from the raw text was specialised for two specific application contexts: automatic customer support ticket classification and the problem of fake detection. The first application scenario deals with the necessity of the Customer Care Department of most companies to answer their customer requests applied as tickets through several common channels like email, short message texts, social posts, etc. Ticket classification is necessary for automatic answer generation and routing to the specific human operator. Limiting the spread of misinformation, related to the high growth of social media dissemination and sharing of information, has raised the issue of distinguishing true news from fakes, with the challenging problem of processing long texts like news for fake detection. For this reason, the second scenario deals with the critical problem of discerning fake news from the vast amount of information circulating on the Web.
In these research areas, the ensemble paradigm has been adopted only recently; thus, discovering the possible advantages when applying this technique is challenging. Experimental tests conducted on real data collected by two Customer Relationship Management (CRM) systems have proven the framework’s effectiveness in different ticket categorisation tasks and the practical value of their associated explanations. In addition, experiments conducted on two fake news datasets have proven the effectiveness of the proposed semi-supervised self-training ensemble-based strategy for improving performances when a few labelled data are available.
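As an illustration of the pseudo-labelling idea mentioned in the abstract above, the following minimal sketch shows one common way an ensemble can pick pseudo-labels: an unlabelled text is accepted only when enough base models agree on its label. This is not the thesis's framework; the function names, the ticket identifiers, the labels and the 0.8 agreement threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def ensemble_vote(predictions: List[str]) -> Tuple[str, float]:
    """Return the majority label and the fraction of models that chose it."""
    label, count = Counter(predictions).most_common(1)[0]
    return label, count / len(predictions)


def select_pseudo_labels(
    unlabeled_preds: Dict[str, List[str]], min_agreement: float = 0.8
) -> Dict[str, str]:
    """Keep only the items whose majority label reaches the agreement threshold.

    unlabeled_preds maps an item id to the labels predicted by each base model
    (hypothetical structure, chosen for this example).
    """
    selected = {}
    for item_id, preds in unlabeled_preds.items():
        label, agreement = ensemble_vote(preds)
        if agreement >= min_agreement:
            selected[item_id] = label
    return selected


# Hypothetical predictions of five base models on three unlabelled tickets.
preds = {
    "t1": ["billing"] * 5,
    "t2": ["billing", "tech", "tech", "billing", "other"],
    "t3": ["tech", "tech", "tech", "tech", "billing"],
}
pseudo = select_pseudo_labels(preds)
```

In a self-training loop, the selected items would be added to the labelled set and the ensemble retrained; the threshold trades the number of pseudo-labels against their expected noise.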
Feature Selection in Classification by means of Optimization and Multi-Objective Optimization
(Università della Calabria, 2023-05-10) Pirouz, Behzad; Fortino, Giancarlo; Gaudioso, Manlio
The thesis is in the area of mathematical optimization with application to Machine Learning. The focus is on Feature Selection (FS) in the framework of binary classification via the Support Vector Machine (SVM) paradigm. We concentrate on the use of sparse optimization techniques, which are widely considered the tool of choice for tackling FS. We study the problem both in terms of single and multi-objective optimization. We propose first a novel Mixed-Integer Nonlinear Programming (MINLP) model for sparse optimization based on the polyhedral k-norm. We introduce a new way to take into account the k-norm for sparse optimization by setting a model based on fractional programming (FP). Then we address the continuous relaxation of the problem, which is reformulated via a DC (Difference of Convex) decomposition. On the other hand, designing supervised learning systems, in general, is a multi-objective problem. It requires finding appropriate trade-offs between several objectives, for example, between the number of misclassified training data (minimizing the squared error) and the number of nonzero elements of the normal vector of the separating hyperplane (minimizing the number of nonzero elements). In multi-objective optimization problems, there is in general no single solution that is best for all objectives simultaneously. Consequently, there is not a single solution but a set of solutions, known as the Pareto-optimal solutions. We overview the SVM models and the related Feature Selection in terms of multi-objective optimization. Our multi-objective approach considers two simultaneous objectives: minimizing the squared error and minimizing the number of nonzero elements of the normal vector of the separator hyperplane. In this thesis, we propose a multi-objective model for sparse optimization. Our primary purpose is to demonstrate the advantages of considering SVM models as multi-objective optimization problems.
In multi-objective cases, we can obtain a set of Pareto optimal solutions instead of the single solution obtained in single-objective cases. Therefore, our main contribution in this thesis is twofold: first, we propose a new model for sparse optimization based on the polyhedral k-norm for SVM classification, and second, we use multi-objective optimization to consider this new model. The results of several numerical experiments on some classification datasets are reported. We used all the datasets for both single-objective and multi-objective models.
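Two notions from the abstract above can be made concrete with a short sketch. The polyhedral k-norm of a vector is the sum of its k largest absolute components, and a vector has at most k nonzero entries exactly when its 1-norm equals its k-norm. A Pareto front, for two objectives to be minimized, is the set of points not dominated by any other point. The code below is only an illustration of these definitions, not the thesis's MINLP or DC models; the function names and the sample (error, nonzeros) pairs are hypothetical.

```python
from typing import List, Sequence, Tuple


def k_norm(x: Sequence[float], k: int) -> float:
    """Polyhedral k-norm: sum of the k largest absolute values of x."""
    return sum(sorted((abs(v) for v in x), reverse=True)[:k])


def has_at_most_k_nonzeros(x: Sequence[float], k: int, tol: float = 1e-12) -> bool:
    """x has at most k nonzeros exactly when its 1-norm equals its k-norm."""
    l1 = sum(abs(v) for v in x)
    return l1 - k_norm(x, k) <= tol


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep the points not dominated by another point (both objectives minimized)."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (training error, number of selected features) pairs.
candidates = [(0.10, 5), (0.20, 3), (0.25, 3), (0.40, 1), (0.15, 6)]
front = pareto_front(candidates)
```

Here `front` keeps the three trade-offs in which lowering the error requires more features; the other two candidates are worse on one objective and no better on the other.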
Design of physically unclonable functions in CMOS and emerging technologies for hardware security applications
(Università della Calabria, 2023-02-23) Vatalaro, Massimo; Fortino, Giancarlo; Crupi, Felice
The advent of the IoT scenario has heavily pushed the demand for preserving information down to the chip level, due to the increasing demand for interconnected devices. Novel algorithms and hardware architectures are developed every year with the aim of making these systems more and more secure. However, IoT devices operate with constrained area, energy and budget, thus making the hardware implementation of these architectures not always feasible. Moreover, these algorithms require truly random keys to guarantee a certain degree of security. Typically, these secret keys are generated off chip and stored in a non-volatile manner. Unfortunately, this approach requires additional costs and suffers from reverse engineering attacks. Physically unclonable functions (PUFs) are emerging cryptographic primitives which exploit random phenomena, such as random process variations in CMOS manufacturing processes, for generating unique, repeatable, random, and secure keys in a volatile manner, like a digital fingerprint. PUFs represent a secure and low-cost solution for implementing lightweight cryptographic algorithms. Ideally, PUF data should be unique and repeatable even under noisy or different environmental conditions. Unfortunately, guaranteeing proper stability is still challenging, especially under PVT variations, thus requiring stability enhancement techniques which overtake the PUF itself in terms of required area and energy. Nowadays, different PUF solutions have been proposed with the aim of achieving ever more stable responses while keeping the area overhead low. This thesis presents a novel class of static monostable PUFs based on a voltage divider between two nominally identical sub-circuits. The fully static behavior, along with the use of nominally identical sub-circuits, ensures that the correct output is always delivered even when on-chip noise occasionally flips the bit, and that randomness is always guaranteed regardless of the PVT conditions.
Measurement results in 180-nm CMOS technology demonstrate the effectiveness of the proposed solution, with a native instability (BER) of only 0.61% (0.13%) along with a low sensitivity to both temperature and voltage variations. However, these results were achieved at the cost of a more area-hungry design (i.e., 7,222 F²) compared to other relevant works. The proposed solution was also implemented with emerging paper-based MoS2 nFETs by exploiting a LUT-based Verilog-A model, calibrated with experimental I vs V curves at different V, whose variability was extracted from different I vs V curves of 27 devices from the same manufacturing lot. Simulation results demonstrate that these devices can potentially be used as building blocks for next-generation electronics targeting hardware security applications. Finally, this thesis also provides an application scenario, in which the proposed PUF solution is employed as a TRNG module for implementing a smart tag targeting anti-counterfeiting applications.
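The abstract above reports two stability figures, instability and BER. As a rough sketch of how such figures are often computed from repeated read-outs, the code below assumes these definitions: BER is the fraction of all read bits that differ from a reference ("golden") key, and instability is the fraction of bit positions that differ from the reference in at least one read. The thesis may define them differently; the function names and the sample data are hypothetical.

```python
from typing import List


def bit_error_rate(golden: List[int], reads: List[List[int]]) -> float:
    """Fraction of all read bits that differ from the golden key (assumed definition)."""
    errors = sum(g != b for r in reads for g, b in zip(golden, r))
    return errors / (len(golden) * len(reads))


def instability(golden: List[int], reads: List[List[int]]) -> float:
    """Fraction of bit positions that flip in at least one read (assumed definition)."""
    unstable = sum(
        any(r[i] != g for r in reads) for i, g in enumerate(golden)
    )
    return unstable / len(golden)


# Hypothetical 4-bit key read four times.
golden_key = [1, 0, 1, 1]
repeated_reads = [
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
]
ber = bit_error_rate(golden_key, repeated_reads)
unstable_fraction = instability(golden_key, repeated_reads)
```

Under these definitions instability is never smaller than BER, since a bit that flips only once still counts as fully unstable, which is consistent with the ordering of the two figures quoted in the abstract.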
Distributed Big Social Data Analysis: Advanced Techniques and Execution Strategies
(Università della Calabria, 2023-05-16) Cantini, Riccardo; Fortino, Giancarlo; Trunfio, Paolo; Marozzo, Fabrizio
A logical and ontological framework for metadata extraction and modelling from heterogeneous document sources
(Università della Calabria, 2023-06-24) Cuconato, Simone; Fortino, Giancarlo; Folino, Antonietta; Cardillo, Elena
Design Methodologies for FPGA-based Deep Learning Accelerators and Their Characterization
(Università della Calabria, 2023) Sestito, Cristian; Fortino, Giancarlo; Perri, Stefania; Corsonello, Pasquale
Deep Neural Networks (DNNs) are widespread in many applications, including computer vision, speech recognition and robotics, thanks to the ability of such models to extract information by building a hierarchical representation of knowledge. Image processing benefits from the latter behavior by using Convolutional Neural Networks (CNNs), which consist of several Convolutional (CONV) layers to extract features from inputs at different levels of abstraction. However, CNNs usually require billions of computations to reach high accuracy levels. In order to sustain such computational load, proper hardware acceleration is needed. Field Programmable Gate Arrays (FPGAs) have been shown as promising candidates, because they are able to achieve high throughput at limited power dissipation. In addition, FPGAs are flexible architectures to accommodate several CNNs’ workloads. While the hardware acceleration of conventional CNN models has been widely investigated, the interest about more sophisticated tasks is still emerging. The latter includes CNNs based on Dilated Convolutions (DCONVs) and Transposed Convolutions (TCONVs), which deal with filter and image dilations, respectively. Accordingly, higher computational complexity is exhibited by these architectures, thus requiring careful hardware management. This PhD dissertation deals with the FPGA acceleration of CNNs for Image Processing based on DCONVs and TCONVs. Specifically, several designs using both the Very High-Speed Integrated Circuits Hardware Description Language (VHDL) and the High-Level Synthesis (HLS) are presented. Detailed characterization is discussed, based on the evaluation of resources occupation, throughput, power dissipation, as well as the impact of data quantization. Overall, the proposed circuits show noticeable energy efficiency when compared to several state-of-the-art counterparts.
For instance, hardware acceleration of run-time reconfigurable CONVs and TCONVs for super-resolution imaging has shown an energy efficiency of up to 518.5 GOPS/W, outperforming state-of-the-art competitors by up to 2.3 times.
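The statement in the abstract above that DCONVs and TCONVs deal with filter and image dilations, respectively, can be illustrated in one dimension. In the sketch below, a dilated convolution inserts zeros between filter taps, while a transposed convolution inserts zeros between input samples, pads the result, and applies the flipped filter. This is a software illustration of the arithmetic only, not the hardware designs of the thesis; all function names are hypothetical.

```python
from typing import List


def dilate(seq: List[float], rate: int) -> List[float]:
    """Insert rate-1 zeros between consecutive elements of seq."""
    out: List[float] = []
    for i, v in enumerate(seq):
        if i:
            out.extend([0.0] * (rate - 1))
        out.append(v)
    return out


def conv1d_valid(x: List[float], w: List[float]) -> List[float]:
    """Plain sliding-window product sum (no padding, stride 1)."""
    n = len(x) - len(w) + 1
    return [sum(w[j] * x[i + j] for j in range(len(w))) for i in range(n)]


def dilated_conv1d(x: List[float], w: List[float], rate: int) -> List[float]:
    """DCONV: dilate the filter, then convolve."""
    return conv1d_valid(x, dilate(w, rate))


def transposed_conv1d(x: List[float], w: List[float], stride: int) -> List[float]:
    """TCONV: dilate the input, pad it, then convolve with the flipped filter."""
    pad = [0.0] * (len(w) - 1)
    padded = pad + dilate(x, stride) + pad
    return conv1d_valid(padded, w[::-1])


# A 3-tap filter with dilation 2 spans 5 input samples.
d_out = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], rate=2)
# Two input samples upsampled with stride 2 give 5 output samples.
t_out = transposed_conv1d([1, 2], [1, 10, 100], stride=2)
```

The inserted zeros contribute nothing to the result, which is one reason why a naive implementation wastes multiplications and why such layers call for careful hardware management.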
An active learning Approach based on learning models' parameters exploitation
(Università della Calabria, 2023-07-06) Scala, Francesco; Flesca, Sergio
Artificial Intelligence (AI) techniques, and in particular Machine and Deep Learning (ML and DL), have been widely adopted to enhance various aspects of human life. ML algorithms can be categorized into four main types: Supervised Learning, Unsupervised Learning, Semi-supervised Learning, and Reinforcement Learning. A significant challenge in these techniques is the requirement for sufficient labeled data for training. Active Learning (AL) is a machine learning framework that addresses this issue by selecting instances to be labeled in a smart way to optimize model training, i.e., AL reduces labeling time and leads to better-performing models by dynamically selecting the most representative samples to be labeled during the training phase. AL has been proven effective in different scenarios, and its choice of querying a label depends on the cost and gain of obtaining the information. This thesis presents two novel approaches for active learning in meta-learning models. The proposed methods, called LAL-IGradV and LAL-IGradV-VAE, select instances to be labeled using an estimate of their impact on the current classifier. This is achieved by evaluating the importance of previously labeled instances in training the classification model and training another model that estimates the importance of unlabeled instances. The approaches can be instantiated with any classifier that is trainable through gradient descent optimization, and in this study a formulation using a deep neural network is provided. These approaches have not been thoroughly investigated in previous learning-to-active-learn methods, and experimental results demonstrate their promising performance in scenarios where there are only a limited number of initially labeled instances.
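To make the general active learning query step concrete, the sketch below uses plain entropy-based uncertainty sampling: the unlabeled instances whose predicted class distribution is most uncertain are sent for labeling first. This is a standard baseline chosen for illustration and is not LAL-IGradV or LAL-IGradV-VAE, which instead learn an estimate of each instance's impact on the classifier. The function names and the sample probabilities are hypothetical.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool_probs: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the ids of the `budget` most uncertain unlabeled instances."""
    ranked = sorted(pool_probs, key=lambda i: entropy(pool_probs[i]), reverse=True)
    return ranked[:budget]


# Hypothetical class probabilities predicted for three unlabeled instances.
pool = {
    "a": [0.99, 0.01],
    "b": [0.50, 0.50],
    "c": [0.70, 0.30],
}
to_label = select_queries(pool, budget=2)
```

After the selected instances are labeled, the classifier is retrained and the step repeats until the labeling budget is exhausted.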
