Consistent hashing is used in distributed systems and networking applications to spread data evenly and efficiently across a cluster of nodes. In this paper, we present MementoHash, a novel consistent hashing algorithm that eliminates known limitations of state-of-the-art algorithms while keeping optimal performance and minimal memory usage. We describe the algorithm in detail, provide a pseudo-code implementation, and formally establish its solid theoretical guarantees. To measure the efficacy of MementoHash, we compare its performance, in terms of memory usage and lookup time, to that of state-of-theart algorithms, namely, AnchorHash, DxHash, and JumpHash. Unlike JumpHash, MementoHash can handle random failures. Moreover, MementoHash does not require fixing the overall capacity of the cluster (as AnchorHash and DxHash do), allowing it to scale indefinitely. The number of removed nodes affects the performance of all the considered algorithms. Therefore, we conduct experiments considering three different scenarios: stable (no removed nodes), one-shot removals (90% of the nodes removed at once), and incremental removals. We report experimental results that averaged a varying number of nodes from ten to one million. Results indicate that our algorithm shows optimal lookup performance and minimal memory usage in its best-case scenario. It behaves better than AnchorHash and DxHash in its average-case scenario and at least as well as those two algorithms in its worst-case scenario

Preprint

A Note on Bayesian Networks with Latent Root Variables

We characterise the likelihood function computed from a Bayesian network with latent variables as root nodes. We show that the marginal distribution over the remaining, manifest, variables also factorises as a Bayesian network, which we call empirical. A dataset of observations of the manifest variables allows us to quantify the parameters of the empirical Bayesian net. We prove that (i) the likelihood of such a dataset from the original Bayesian network is dominated by the global maximum of the likelihood from the empirical one; and that (ii) such a maximum is attained if and only if the parameters of the Bayesian network are consistent with those of the empirical model.

Journal

Efficient Computation of Counterfactual Bounds

Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber, and Dario Azzimonti

International Journal of Approximate Reasoning, Jan 2024

We assume to be given structural equations over discrete variables inducing a directed acyclic graph, namely, a structural causal model, together with data about its internal nodes. The question we want to answer is how we can compute bounds for partially identifiable counterfactual queries from such an input. We start by giving a map from structural casual models to credal networks. This allows us to compute exact counterfactual bounds via algorithms for credal nets on a subclass of structural causal models. Exact computation is going to be inefficient in general given that, as we show, causal inference is NP-hard even on polytrees. We target then approximate bounds via a causal EM scheme. We evaluate their accuracy by providing credible intervals on the quality of the approximation; we show through a synthetic benchmark that the EM scheme delivers accurate results in a fair number of runs. In the course of the discussion, we also point out what seems to be a neglected limitation to the trending idea that counterfactual bounds can be computed without knowledge of the structural equations. We also present a real case study on palliative care to show how our algorithms can readily be used for practical purposes.

Proceedings

On the Limitations of Zero-Shot Classification of Causal Relations by LLMs

In Proceedings of Text2Story — Seventh Workshop on Narrative Extraction From Texts held in conjunction with the 46th European Conference on Information Retrieval (ECIR 2024), Mar 2024

Edge labelling represents one of the most challenging processes for knowledge graph creation in unsu-pervised domains. Abstracting the relations between the entities, extracted in the form of triplets, and assigning a single label to a cluster of relations might be quite difficult without supervision and tedious if based on manual annotations. This seems to be particularly the case for applications in literary text understanding, which is the focus of this paper. We present a simple but efficient way to label the edges between the character entities in the knowledge graph extracted from a novel or a short story using a two-level clustering based on BERT-embedding with supersenses and hypernyms. The lack of benchmark datasets in the literary domain poses significant challenges for evaluations. In this work-in-progress paper, we discuss preliminary results to understand the potential for further research.

Workshop

Learning Credal Conditional Probability Tables with Qualitative Knowledge

Saurabh Mathur, Alessandro Antonucci, and Sriraam Natarajan

Bayesian networks are a popular class of directed probabilistic graphical models that allow for closed-form learning of the local parameters if complete data are available. However, learning the parameters is challenging when the data are sparse, incomplete, and uncertain. In this work, we present an approach to this problem based on credal networks, a generalization of Bayesian networks based on set-valued local parameters. We derive an algorithm to learn such set-valued parameters from data using qualitative knowledge in the form of monotonic influence statements. Our preliminary empirical evaluation shows that using qualitative knowledge reduces uncertainty about the parameters without significant loss in accuracy.

Workshop

Zero-shot Causal Graph Extrapolation from Text via LLMs

Alessandro Antonucci, Gregorio Piqué, and Marco Zaffalon

In First XAI4Sci Workshop on Explainable Machine Learning for Sciences co-located at AAAI 2024), Feb 2024

We evaluate the ability of large language models (LLMs) to infer causal relations from natural language. Compared to traditional natural language processing and deep learning techniques, LLMs show competitive performance in a benchmark of pairwise relations without needing (explicit) training samples. This motivates us to extend our approach to extrapolating causal graphs through iterated pairwise queries. We perform a preliminary analysis on a benchmark of biomedical abstracts with ground-truth causal graphs validated by experts. The results are promising and support the adoption of LLMs for such a crucial step in causal inference, especially in medical domains, where the amount of scientific text to analyse might be huge, and the causal statements are often implicit.

2023

Journal

Approximating counterfactual bounds while fusing observational, biased and randomised data sources

Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, and David Huber

International Journal of Approximate Reasoning, Nov 2023

We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset affected by a selection bias. We show that the likelihood of the available data has no local maxima. This enables us to use the causal expectation-maximisation scheme to approximate the bounds for partially identiﬁable counterfactual queries, which are the focus of this paper. We then show how the same approach can address the general case of multiple datasets, no matter whether interventional or observational, biased or unbiased, by remapping it into the former one via graphical transformations. Systematic numerical experiments and a case study on palliative care show the effectiveness of our approach, while hinting at the beneﬁts of fusing heterogeneous data sources to get informative outcomes in case of partial identiﬁability.

Workshop

Tractable Bounding of Counterfactual Queries by Knowledge Compilation

David Huber, Yizuo Chen, Alessandro Antonucci, Adnan Darwiche, and Marco Zaffalon

In Proceedings of the Sixth Workshop on Tractable Probabilistic Modeling (@UAI 2023), Aug 2023

We discuss the problem of bounding partially identifiable queries, such as counterfactuals, in Pearlian structural causal models. A recently proposed iterated EM scheme yields an inner approximation of those bounds by sampling the initialisation parameters. Such a method requires multiple (Bayesian network) queries over models sharing the same structural equations and topology, but different exogenous probabilities. This setup makes a compilation of the underlying model to an arithmetic circuit advantageous, thus inducing a sizeable inferential speed-up. We show how a single symbolic knowledge compilation allows us to obtain the circuit structure with symbolic parameters to be replaced by their actual values when computing the different queries. We also discuss parallelisation techniques to further speed up the bound computation. Experiments against standard Bayesian network inference show clear computational advantages with up to an order of magnitude of speed-up.

Proceedings

Hierarchical Multi-Agent Reinforcement Learning for Air Combat Maneuvering

Ardian Selmonaj, Oleg Szehr, Giacomo Del Rio, Alessandro Antonucci, Adrian Schneider, and
1 more author

In Proceedings of the 2023 International Conference on Machine Learning and Applications (ICMLA), Dec 2023

The application of artificial intelligence to simulate air-to-air combat scenarios is attracting increasing attention. To date the high-dimensional state and action spaces, the high complexity of situation information (such as imperfect and filtered information, stochasticity, incomplete knowledge about mission targets) and the nonlinear flight dynamics pose significant challenges for accurate air combat decision-making. These challenges are exacerbated when multiple heterogeneous agents are involved. We propose a hierarchical multi-agent reinforcement learning framework for air-to-air combat with multiple heterogeneous agents. In our framework, the decision-making process is divided into two stages of abstraction, where heterogeneous low-level policies control the action of individual units, and a high-level commander policy issues macro commands given the overall mission targets. Low-level policies are trained for accurate unit combat control. Their training is organized in a learning curriculum with increasingly complex training scenarios and league-based self-play. The commander policy is trained on mission targets given pre-trained low-level policies. The empirical validation advocates the advantages of our design choices.

Proceedings

Probabilistic Modelling for Trustworthy Artificial Intelligence in Drone-Supported Autonomous Wheelchairs

Franca Corradini, Francesco Flammini, and Alessandro Antonucci

In Proceedings of the First International Symposium on Trustworthy Autonomous Systems, Jul 2023

In this work, we address the potential of probabilistic modelling approaches for ensuring trustworthy AI in sensing subsystems within drone-supported autonomous wheelchairs. The combination of drones and autonomous wheelchairs provides an innovative solution for enhancing mobility and independence of motion-impaired people. However, safety is a critical concern when deploying such systems in real-world scenarios. To address this challenge, probabilistic models can be used to capture uncertainty and non-stationarity in the environment and sensory system, enabling the device to make informed decisions while ensuring safe autonomy. The approach is being developed in the context of a recently started European project named REXASI-PRO, which addresses the modelling methodology, tools, reference architecture, design and implementation guidelines. In the project, relevant indoor and outdoor navigation use cases are addressed to demonstrate the effectiveness of the proposed approach in providing trustworthy autonomous wheelchairs in real-world environments.

Journal

Imprecise Probabilistic Model Checking for Stochastic Multi-agent Systems

Alberto Termine, Alessandro Antonucci, Giuseppe Primiero, and Alessandro Facchini

Standard techniques for model checking stochastic multi-agent systems usually assume the transition probabilities describing the system dynamics to be stationary and completely specified. As a consequence, neither non-stationary systems nor systems whose stochastic behaviour is partially unknown can be treated. So far, most of the approaches proposed to overcome this limitation suffer from complexity issues making them poorly efficient in the case of large state spaces. A fruitful but poorly explored way out is offered by the formalism of imprecise probabilities and the related imprecise Markov models. The aim of this paper is to show how imprecise probabilities can be fruitfully involved to model-check multi-agent systems characterised by non-stationary behaviours. Specifically, the paper introduces a new class of multi-agent models called Imprecise Probabilistic Interpreted Systems and their relative extensions with rewards. It also introduces a proper logical language to specify properties of such models and corresponding model checking algorithms based on iterative procedures to compute probabilistic and epistemic inferences over imprecise Markov models.

In Proceedings of Text2Story — Sixth Workshop on Narrative Extraction From Texts held in conjunction with the 45th European Conference on Information Retrieval (ECIR 2023), Apr 2023

Edge labelling represents one of the most challenging processes for knowledge graph creation in unsu-pervised domains. Abstracting the relations between the entities, extracted in the form of triplets, and assigning a single label to a cluster of relations might be quite difficult without supervision and tedious if based on manual annotations. This seems to be particularly the case for applications in literary text understanding, which is the focus of this paper. We present a simple but efficient way to label the edges between the character entities in the knowledge graph extracted from a novel or a short story using a two-level clustering based on BERT-embedding with supersenses and hypernyms. The lack of benchmark datasets in the literary domain poses significant challenges for evaluations. In this work-in-progress paper, we discuss preliminary results to understand the potential for further research.

Journal

Rubric-based Learner Modelling via Noisy Gates Bayesian Networks for Computational Thinking Skills Assessment

Giorgia Adorni, Francesca Mangili, Alberto Piatti, Claudio Bonesana, and Alessandro Antonucci

Journal of Communications Software and Systems, Mar 2023

In modern and personalised education, there is a growing interest in developing learners’ competencies and accurately assessing them. In a previous work, we proposed a procedure for deriving a learner model for automatic skill assessment from a task-specific competence rubric, thus simplifying the implementation of automated assessment tools. The previous approach, however, suffered two main limitations: (i) the ordering between competencies defined by the assessment rubric was only indirectly modelled; (ii) supplementary skills, not under assessment but necessary for accomplishing the task, were not included in the model. In this work, we address issue (i) by introducing dummy observed nodes, strictly enforcing the skills ordering without changing the network’s structure. In contrast, for point (ii), we design a network with two layers of gates, one performing disjunctive operations by noisy-OR gates and the other conjunctive operations through logical ANDs. Such changes improve the model outcomes’ coherence and the modelling tool’s flexibility without compromising the model’s compact parametrisation, interpretability and simple experts’ elicitation. We used this approach to develop a learner model for Computational Thinking (CT) skills assessment. The CT-cube skills assessment framework and the Cross Array Task (CAT) are used to exemplify it and demonstrate its feasibility.

Proceedings

Machine Learning Explanations by Surrogate Causal Models (MaLESCaMo)

Alberto Termine, Alessandro Antonucci, and Alessandro Facchini

In Joint Proceedings of the xAI-2023 Late-breaking Work, Demos and Doctoral Consortium co-located with the First World Conference on eXplainable Artificial Intelligence (xAI-2023), Jul 2023

Inferring causal explanations for machine learning models is a challenging task for eXplainable Artificial Intelligence (XAI). Counterfactual explanations, which are techniques to decide how to modify the model input to achieve a desired outcome, represents a possible first step in this direction. However, existing counterfactual explanation methods do not produce genuinely causal counterfactuals. These methods only exploit the correlations between features and target variables while ignoring the causal mechanisms among them. The project presented in this paper (and called MaLESCaMo) aims to develop a novel local and model-agnostic XAI procedure to generate genuine causal counterfactual explanations. Given a black-box predictor and an instance of the features, the procedure computes a counterfactual query in a surrogate causal model trained from a local neighbourhood of the input instance. To ease the domain expert elicitation of the causal model, we propose to adopt algorithms for partial ancestral graphs as a possible pre-processing step. A specialisation of the expectation maximisation algorithm is used instead to practically compute the causal queries.

2022

Proceedings

Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach

Francesca Mangili, Giorgia Adorni, Alberto Piatti, Claudio Bonesana, and Alessandro Antonucci

In 30th International Conference on Software, Telecommunications and Computer Networks, SoftCOM 2022, Sep 2022

Automatic assessment of learner competencies is a fundamental task in intelligent tutoring systems. An assessment rubric typically and effectively describes relevant competencies and competence levels. This paper presents an approach to deriving a learner model directly from an assessment rubric defining some (partial) ordering of competence levels. The model is based on Bayesian networks and exploits logical gates with uncertainty (often referred to as noisy gates) to reduce the number of parameters of the model, so to simplify their elicitation by experts and allow real-time inference in intelligent tutoring systems. We illustrate how the approach can be applied to automatize the human assessment of an activity developed for testing computational thinking skills. The simple elicitation of the model starting from the assessment rubric opens up the possibility of quickly automating the assessment of several tasks, making them more easily exploitable in the context of adaptive assessment tools and intelligent tutoring systems.

Proceedings

Bounding Counterfactuals under Selection Bias

Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber, and Dario Azzimonti

In Proceedings of The 11th International Conference on Probabilistic Graphical Models, Oct 2022

Causal analysis may be affected by selection bias, which is defined as the systematic exclusion of data from a certain subpopulation. Previous work in this area focused on the derivation of identifiability conditions. We propose instead a first algorithm to address both identifiable and unidentifiable queries. We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal. This enables us to use the causal expectation-maximisation scheme to obtain the values of causal queries in the identifiable case, and to compute bounds otherwise. Experiments demonstrate the approach to be practically viable. Theoretical convergence characterisations are provided.

Proceedings

Intelligent Tutoring Systems by Bayesian Nets with Noisy Gates

Alessandro Antonucci, Francesca Mangili, Claudio Bonesana, and Giorgia Adorni

Directed graphical models such as Bayesian nets are often used to implement intelligent tutoring systems able to interact in real-time with learners in a purely automatic way. When coping with such models, keeping a bound on the number of parameters might be important for multiple reasons. First, as these models are typically based on expert knowledge, a huge number of parameters to elicit might discourage practitioners from adopting them. Moreover, the number of model parameters affects the complexity of the inferences, while a fast computation of the queries is needed for real-time feedback. We advocate logical gates with uncertainty for a compact parametrization of the conditional probability tables in the underlying Bayesian net used by tutoring systems. We discuss the semantics of the model parameters to elicit and the assumptions required to apply such approach in this domain. We also derive a dedicated inference scheme to speed up computations.

Preprint

Belief Revision in Sentential Decision Diagrams

Lilith Mattei, Alessandro Facchini, and Alessandro Antonucci

Belief revision is the task of modifying a knowledge base when new information becomes available, while also respecting a number of desirable properties. Classical belief revision schemes have been already specialised to binary decision diagrams (BDDs), the classical formalism to compactly represent propositional knowledge. These results also apply to \emphordered BDDs (OBDDs), a special class of BDDs, designed to guarantee canonicity. Yet, those revisions cannot be applied to \emphsentential decision diagrams (SDDs), a typically more compact but still canonical class of Boolean circuits, which generalizes OBDDs, while not being a subclass of BDDs. Here we fill this gap by deriving a general revision algorithm for SDDs based on a syntactic characterisation of Dalal revision. A specialised procedure for DNFs is also presented. Preliminary experiments performed with randomly generated knowledge bases show the advantages of directly perform revision within SDD formalism.

2021

Workshop

ADAPQUEST: A Software for Web-Based Adaptive Questionnaires based on Bayesian Networks

Claudio Bonesana, Francesca Mangili, and Alessandro Antonucci

In AI4EDU: Artificial Intelligence For Education (@ IJCAI2021), Jan 2021

We introduce ADAPQUEST, a software tool writ- ten in Java for the development of adaptive ques- tionnaires based on Bayesian networks. Adaptiveness is intended here as the dynamical choice of the question sequence on the basis of an evolving model of the skill level of the test taker. Bayesian networks offer a flexible and highly interpretable framework to describe such testing process, especially when coping with multiple skills. ADAPQUEST embeds dedicated elicitation strategies to simplify the elicitation of the questionnaire parameters. An application of this tool for the diag- nosis of mental disorders is also discussed together with some implementation details.

Proceedings

Cautious Classification with Data Missing not at Random using Generative Random Forests

Julissa V. Llerena, Denis. D. Mauá, and Alessandro Antonucci

In Symbolic and Quantitative Approaches to Reasoning With Uncertainty, Sep 2021

Missing data present a challenge for most machine learning approaches. When a generative probabilistic model of the data is available, an effective approach is to marginalize missing values out. Probabilistic circuits are expressive generative models that allow for efficient exact inference. However, data is often missing not at random, and marginalization can lead to overconfident and wrong conclusions. In this work, we develop an efficient algorithm for assessing the robustness of classifications made by probabilistic circuits to imputations of the non- ignorable portion of missing data at prediction time. We show that our algorithm is exact when the model satisfies certain constraints, which is the case for the recent proposed Generative Random Forests, that equip Random Forest Classifiers with a full probabilistic model of the data. We also show how to extend our approach to handle non-ignorable missing data at training time.

Proceedings

CREPO: An Open Repository to Benchmark Credal Network Algorithms

Rafael Cabañas, and Alessandro Antonucci

In International Symposium on Imprecise Probabilities: Theories and Applications (ISIPTA-2021), Sep 2021

Credal networks are a popular class of imprecise prob- abilistic graphical models obtained as a Bayesian net- work generalization based on, so-called credal, sets of probability mass functions. A Java library called CREMA has been recently released to model, process and query credal networks. Despite the NP-hardness of the (exact) task, a number of algorithms is available to approximate credal network inferences. In this paper we present CREPO, an open repository of synthetic credal networks, provided together with the exact re- sults of inference tasks on these models. A Python tool is also delivered to load these data and interact with CREMA, thus making extremely easy to evaluate and compare existing and novel inference algorithms. To demonstrate such benchmarking scheme, we propose an approximate heuristic to be used inside variable elimination schemes to keep a bound on the maximum number of vertices generated during the combination step. A CREPO-based validation against approximate procedures based on linearization and exact techniques performed in CREMA is finally discussed.

Proceedings

Robust Model Checking with Imprecise Markov Reward Models

Alberto Termine, Alessandro Antonucci, Alessandro Facchini, and Giuseppe Primiero

In Proceedings of the Twelfth International Symposium on Imprecise Probabilities: Theories and Applications (ISIPTA ’21), Sep 2021

In recent years probabilistic model checking has become an important area of research because of the diffusion of computational systems of stochastic nature. Despite its great success, standard probabilistic model checking suffers the limitation of requiring a sharp specification of the probabilities governing the model behaviour. The theory of imprecise probabilities offers a natural approach to overcome such limitation by a sensitivity analysis with respect to the values of these parameters. However, only extensions based on discrete-time imprecise Markov chains have been considered so far for such a robust approach to model checking. We present a further extension based on imprecise Markov reward models. In particular, we derive efficient algorithms to compute lower and upper bounds of the expected cumulative reward and probabilistic bounded rewards based on existing results for imprecise Markov chains. These ideas are tested on a real case study involving the spend-down costs of geriatric medicine departments.

Proceedings

Logic and Model Checking by Imprecise Probabilistic Interpreted Systems

Alberto Termine, Alessandro Antonucci, Giuseppe Primiero, and Alessandro Facchini

In European Conference on Multi-Agent Systems (EUMAS 2021), Sep 2021

Stochastic multi-agent systems raise the necessity to extend probabilistic model checking to the epistemic domain. Results in this direction have been achieved by epistemic extensions of Probabilistic Computation Tree Logic and related Probabilistic Interpreted Systems. The latter, however, suffer of an important limitation: they require the probabilities governing the system’s behaviour to be fully specified. A promising way to overcome this limitation is represented by imprecise probabilities. In this paper we introduce imprecise probabilistic interpreted systems and present a related logical language and model-checking procedures based on recent advances in the study of imprecise Markov processes.

Proceedings

A New score for Adaptive Tests in Bayesian and Credal Networks

Alessandro Antonucci, Francesca Mangili, Claudio Bonesana, and Giorgia Adorni

In Symbolic and Quantitative Approaches to Reasoning With Uncertainty, Sep 2021

A test is adaptive when the sequence and number of ques- tions is dynamically tuned on the basis of the estimated skills of the taker. Graphical models, such as Bayesian networks, are used for adaptive tests as they allow to model the uncertainty about the questions and the skills in an explainable fashion, especially when coping with multiple skills. A better elicitation of the uncertainty in the question/skills relations can be achieved by interval probabilities. This turns the model into a credal net- work, thus increasing the inferential complexity of the queries required to select questions. This is especially the case for the information-theoretic quantities used as scores to drive the adaptive mechanism. We present an alternative family of scores, based on the mode of the posterior proba- bilities, and hence easier to explain. This makes considerably simpler the evaluation in the credal case, without significantly affecting the quality of the adaptive process. Numerical tests on synthetic and real-world data are used to support this claim.

Workshop

Causal Expectation-Maximisation

Marco Zaffalon, Alessandro Antonucci, and Rafael Cabañas

Structural causal models are the basic modelling unit in Pearl’s causal theory; in principle they allow us to solve counterfactuals, which are at the top rung of the ladder of causation. But they often contain latent variables that limit their application to special settings. This appears to be a consequence of the fact, proven in this paper, that causal inference is NP-hard even in models characterised by polytree-shaped graphs. To deal with such a hardness, we introduce the causal EM algorithm. Its primary aim is to reconstruct the uncertainty about the latent variables from data about categorical manifest variables. Counterfactual inference is then addressed via standard algorithms for Bayesian networks. The result is a general method to approximately compute counterfactuals, be they identifiable or not (in which case we deliver bounds). We show empirically, as well as by deriving credible intervals, that the approximation we provide becomes accurate in a fair number of EM runs. These results lead us finally to argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.

Workshop

Structural Learning of Probabilistic Sentential Decision Diagrams under Partial Closed-World Assumption

Alessandro Antonucci, Alessandro Facchini, and Lilith Mattei

In The Fourth Tractable Probabilistic Modelling Workshop (@ UAI 2021), Jul 2021

Probabilistic sentential decision diagrams are a class of structured-decomposable probabilistic circuits especially designed to embed logical constraints. To adapt the classical LearnSPN scheme to learn the structure of these models, we propose a new scheme based on a partial closed-world assumption: data implicitly provide the logical base of the circuit. Sum nodes are thus learned by recursively clustering batches in the initial data base, while the partitioning of the variables obeys a given input vtree. Preliminary experiments show that the proposed approach might properly fit training data, and generalize well to test data, provided that these remain consistent with the underlying logical base, that is a relaxation of the training data base.

In Proceedings of AI4Narratives — Workshop on Artificial Intelligence for Narratives
in conjunction with the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI 2020), Jul 2020

When coping with literary texts such as novels or short stories, the extraction of structured informa- tion in the form of a knowledge graph might be hindered by the huge number of possible relations between the entities corresponding to the characters in the novel and the consequent hurdles in gathering supervised information about them. Such issue is addressed here as an unsupervised task empow- ered by transformers: relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically simi- lar relations. All the sentences in the same cluster are finally summarized (with BART) and a descriptive label extracted from the summary. Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable pre- processing for semi-supervised approaches.

Proceedings

SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces

Vani Kanjirangat, Sandra Mitrovic, Alessandro Antonucci, and Fabio Rinaldi

In Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection), Dec 2020

Lexical semantic change detection (also known as semantic shift tracing) is a task of identifying words that have changed their meaning over time. Unsupervised semantic shift tracing, focal point of SemEval2020, is particularly challenging. Given the unsupervised setup, in this work, we propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings. As such, disagreements in obtained clusters naturally allow to quantify the level of semantic shift per each target word in four target languages. To leverage this idea, clustering is performed on contextualized (BERT-based) embeddings of word occurrences. The obtained results show that our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.

Journal

Tractable Inference in Credal Sentential Decision Diagrams

Lilith Lilith Mattei, Alessandro Antonucci, Denis Deratani Mauá, Alessandro Facchini, and Julissa Villanueva Llerena

International Journal of Approximate Reasoning, Oct 2020

Probabilistic sentential decision diagrams are logic circuits where the inputs of disjunctive gates are annotated by probability values. They allow for a compact representation of joint probability mass functions defined over sets of Boolean variables, that are also consistent with the logical constraints defined by the circuit. The probabilities in such a model are usually “learned” from a set of observations. This leads to overconfident and prior-dependent inferences when data are scarce, unreliable or conflicting. In this work, we develop the credal sentential decision diagrams, a generalisation of their probabilistic counterpart that allows for replacing the local probabilities with (so-called credal) sets of mass functions. These models induce a joint credal set over the set of Boolean variables, that sharply assigns probability zero to states inconsistent with the logical constraints. Three inference algorithms are derived for these models. These allow to compute: (i) the lower and upper probabilities of an observation for an arbitrary number of variables; (ii) the lower and upper conditional probabilities for the state of a single variable given an observation; (iii) whether or not all the probabilistic sentential decision diagrams compatible with the credal specification have the same most probable explanation of a given set of variables given an observation of the other variables. These inferences are tractable, as all the three algorithms, based on bottom-up traversal with local linear programming tasks on the disjunctive gates, can be solved in polynomial time with respect to the circuit size. The first algorithm is always exact, while the remaining two might induce a conservative (outer) approximation in the case of multiply connected circuits. A semantics for this approximation together with an auxiliary algorithm able to decide whether or not the result is exact is also provided together with a brute-force characterization of the exact inference in these cases. For a first empirical validation, we consider a simple application based on noisy seven-segment display images. The credal models are observed to properly distinguish between easy and hard-to-detect instances and outperform other generative models not able to cope with logical constraints.

Proceedings

Approximate MMAP by Marginal Search

Alessandro Antonucci, and Thomas Tiotto

In Proceedings of the Thirty-Third International Florida Artificial Intelligence Research Society Conference (FLAIRS-33), May 2020

We present a heuristic strategy for marginal MAP (MMAP) queries in graphical models. The algorithm is based on a reduction of the task to a polynomial number of marginal inference computations. Given an input evidence, the marginals mass functions of the variables to be explained are computed. Marginal information gain is used to decide the variables to be explained first, and their most probable marginal states are consequently moved to the evidence. The sequential iteration of this procedure leads to a MMAP explanation and the minimum information gain obtained during the process can be regarded as a confidence measure for the explanation. Preliminary experiments show that the proposed confidence measure is properly detecting instances for which the algorithm is accurate and, for sufficiently high confidence levels, the algorithm gives the exact solution or an approximation whose Hamming distance from the exact one is small.

Proceedings

Temporal Embeddings and Transformer Models for Narrative Text Understanding

In Proceedings of Text2Story - Third Workshop on Narrative Extraction From Texts co-located with 42nd European Conference on Information Retrieval (ECIR 2020), Apr 2020

We present two deep learning approaches to narrative text understanding for character relationship modelling. The temporal evolution of these relations is described by dynamic word embeddings, that are designed to learn semantic changes over time. An empirical analysis of the corresponding character trajectories shows that such approaches are e↵ective in depicting dynamic evolution. A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters. The empirical validation shows that such events (e.g., two characters belonging to the same family) might be spotted with good accuracy, even when using automatically annotated data. This provides a deeper understanding of narrative plots based on the identification of key facts. Standard clustering techniques are finally used for character de-aliasing, a necessary pre-processing step for both approaches. Overall, deep learning models appear to be suitable for narrative text understanding, while also providing a challenging and unexploited benchmark for general natural language understanding.

Proceedings

Temporal Word Embeddings for Narrative Understanding

Claudia Claudia Volpetti, Vani Kanjirangat, and Alessandro Antonucci

In Proceedings of the Twelfth International Conference on Machine Learning and Computing (ICMLC 2020), Feb 2020

We propose temporal word embeddings as a suitable tool to study the evolution of characters and their sentiments across the plot of a narrative text. The dynamic evolution of instances within a narrative text is a challenging task, where complex behavioral evolutions and other characteristics specific to the narrative text need to be inferred and interpreted. While starting from an existing approach to the learning of these models, we propose an alternative initialization procedure which seems to be especially suited for the case of narrative text. As a validation benchmark, we use the Harry Potter series of books as a challenging case study for such character trait evolution. A benchmark data set based on temporal word analogies related to the characters in the plot of the series is considered. The results are promising, and the empirical validation seems to support the working ideas behind this proposal.

Workshop

A Bayesian Approach to Conversational Recommendation Systems

Francesca Mangili, Denis Broggini, Alessandro Antonucci, Marco Alberti, and Lorenzo Cimasoni

In AAAI 2020 Workshop on Interactive and Conversational Recommendation Systems (WICRS-20), Feb 2020

We present a Bayesian approach to conversational recommender systems. After any interaction with the user, a probability mass function over the items is updated by the system. The conversational feature corresponds to a sequential discovery of the user preferences based on questions. Information-theoretic criteria are used to optimally shape the interactions and decide when the conversation ends. Most probable items are consequently recommended. Dedicated elicitation techniques for the prior probabilities of the parameters modelling the interactions are derived from basic structural judgements based on logical compatibility and symmetry assumptions. Such prior knowledge is combined with data for better item discrimination. Our Bayesian approach is validated against matrix factorization techniques for cold-start recommendations based on metadata using the popular benchmark data set MovieLens. Results show that the proposed approach allows to considerably reduce the number of interactions while maintaining good ranking performance.

Proceedings

Two Reformulation Approaches to Maximum-A-Posteriori Inference in Sum-Product Networks

Denis Deratani Mauà, Heitor Ribeiro, Gustavo Katague, and Alessandro Antonucci

In Proceedings of the 10th International Conference on Probabilistic Graphical Models (PGM 2020), Sep 2020

There exists a dichotomy between classical probabilistic graphical models, such as Bayesian net- works (BNs), and modern tractable models, such as sum-product networks (SPNs). The former generally have intractable inference, but provide a high level of interpretability, while the latter admit a wide range of tractable inference routines, but are typically harder to interpret. Due to this dichotomy, tools to convert between BNs and SPNs are desirable. While one direction – compiling BNs into SPNs – is well discussed in Darwiche’s seminal work on arithmetic circuit compilation, the converse direction – decompiling SPNs into BNs – has received surprisingly little attention. In this paper, we fill this gap by proposing SPN2BN, an algorithm that decompiles an SPN into a BN. SPN2BN has several salient features when compared to the only other two works decompiling SPNs. Most significantly, the BNs returned by SPN2BN are minimal independence-maps that are more parsimonious with respect to the introduction of latent variables. Secondly, the output BN produced by SPN2BN can be precisely characterized with respect to a compiled BN. More specifically, a certain set of directed edges will be added to the input BN, giving what we will call the moral-closure. Lastly, it is established that our compilation-decompilation process is idempotent. This has practical significance as it limits the size of the decompiled SPN.

Proceedings

CREDICI: A Java Library for Causal Inference by Credal Networks

Rafael Cabañas, Alessandro Antonucci, David Huber, and Marco Zaffalon

In Proceedings of the 10th International Conference on Probabilistic Graphical Models (PGM 2020), Sep 2020

We present CREDICI, a Java open-source tool for causal inference based on credal networks. Credal networks are an extension of Bayesian networks where local probability mass functions are only constrained to belong to given, so-called credal, sets. CREDICI is based on the recent work of Zaffalon et al. (2020), where an equivalence between Pearl’s structural causal models and credal networks has been derived. This allows to reduce a counterfactual query in a causal model to a standard query in a credal network, even in the case of unidentifiable causal effects. The necessary transformations and data structures are implemented in CREDICI, while inferences are eventually computed by CREMA (Huber et al., 2020), a twin library for general credal network inference. Here we discuss the main implementation challenges and possible outlooks.

Proceedings

CREMA: A Java Library for Credal Network Inference

David Huber, Rafael Cabañas, Alessandro Antonucci, and Marco Zaffalon

In Proceedings of the 10th International Conference on Probabilistic Graphical Models (PGM 2020), Sep 2020

We present CREMA (Credal Models Algorithms), a Java library for inference in credal networks. These models are analogous to Bayesian networks, but their local parameters are only con- strained to vary in, so-called credal, sets. Inference in credal networks is intended as the com- putation of the bounds of a query with respect to those local variations. For credal networks the task is harder than in Bayesian networks, being NPPP-hard in general models. Yet, scalable approximate algorithms have been shown to provide good accuracies on large or dense mod- els, while exact techniques can be designed to process small or sparse models. CREMA embeds these algorithms and also offers an API to build and query credal networks together with a specification format. This makes CREMA, whose features are discussed and described by a simple example, the most advanced tool for credal network modelling and inference developed so far.

Proceedings

Structural Causal Models are (Solvable by) Credal Networks

Marco Zaffalon, Alessandro Antonucci, and Rafael Cabañas

In Proceedings of the 10th International Conference on Probabilistic Graphical Models (PGM 2020), Sep 2020

A structural causal model is made of endogenous (manifest) and exogenous (latent) variables. We show that endogenous observations induce linear constraints on the probabilities of the exogenous variables. This allows to exactly map a causal model into a credal network. Causal inferences, such as interventions and counterfactuals, can consequently be obtained by standard algorithms for the updating of credal nets. These natively return sharp values in the identifiable case, while intervals corresponding to the exact bounds are produced for unidentifiable queries. A characterization of the causal models that allow the map above to be compactly derived is given, along with a discussion about the scalability for general models. This contribution should be regarded as a systematic approach to represent structural causal models by credal networks and hence to systematically compute causal inferences. A number of demonstrative examples is presented to clarify our methodology. Extensive experiments show that approximate algorithms for credal networks can immediately be used to do causal inference in real-size problems.

Proceedings

Learning Probabilistic Sentential Decision Diagrams by Sampling

Renato Geh, Denis Deratani Mauá, and Alessandro Antonucci

In Proceedings of the Eight Symposium on Knowledge Discovery, Mining and Learning (KMILE), Oct 2020

Probabilistic circuits are deep probabilistic models with neural-network-like semantics capable of accurately and efficiently answering probabilistic queries without sacrificing expressiveness. Probabilistic Sentential Decision Diagrams (PSDDs) are a subclass of probabilistic circuits able to embed logical constraints to the circuit’s structure. In doing so, they obtain extra expressiveness with empirical optimal performance. Despite achieving competitive performance compared to other state-of-the-art competitors, there have been very few attempts at learning PSDDs from a combination of both data and knowledge in the form of logical formulae. Our work investigates sampling random PSDDs consistent with domain knowledge and evaluating against state-of-the-art probabilistic models. We propose a method of sampling that retains important structural constraints on the circuit’s graph that guarantee query tractability. Finally, we show that these samples are able to achieve competitive performance even on larger domains.

Proceedings

Conversational Recommender System by Bayesian Methods

Francesca Mangili, Denis Broggini, and Alessandro Antonucci

In Proceedings of the Fourteenth International Conference on Scalable Uncertainty Management (SUM 2020), Sep 2020

We present a Bayesian approach to conversational recom- mender systems. After any interaction with the user, a probability mass function over the items is updated by the system. The conversational feature corresponds to a sequential discovery of the user preferences based on questions. Information-theoretic criteria are used to optimally shape the interactions and decide when the conversation ends. Most probable items are consequently recommended. Dedicated elicitation techniques for the prior probabilities of the parameters modelling the interactions are derived from basic structural judgements based on logical compatibility and symmetry assumptions. Such prior knowledge is combined with data for better item discrimination. Our Bayesian approach is validated against matrix factorization techniques for cold-start recommendations based on metadata using the popular benchmark data set MovieLens. Results show that the proposed approach allows to considerably reduce the number of interactions while maintaining good ranking performance.

2019

Proceedings

Credal Sentential Decision Diagrams

Alessandro Antonucci, Alessandro Facchini, and Lilith Mattei

In Proceedings of the Eleventh International Symposium on Imprecise Probabilities: Theories and Applications (ISIPTA ’19), Jul 2019

Probabilistic sentential decision diagrams are logical circuits annotated by probability mass functions on the disjunctive gates. This allows for a compact representation of joint mass functions consistent with logical constraints. We propose a \emphcredal generalisation of the probabilistic quantification of these models, that allows to replace the local probabilities with (credal) sets of mass functions specified by linear constraints. This induces a joint credal set, that sharply assigns probability zero to states inconsistent with the constraints. These models can support cautious estimates of the local parameters when only small amounts of training data are available. Algorithmic strategies to compute lower and upper bounds of marginal and conditional queries are provided. The task can be achieved in linear time with respect to the diagram size for marginal queries. The same can be done for conditional queries if the topology of the circuit is singly connected.

Workshop

Exploring the Space of Probabilistic Sentential Decision Diagrams

Lilith Mattei, Décio L. Soares, Alessandro Antonucci, Denis Deratani Mauà, and Alessandro Facchini

In Proceedings of the third Tractable Probabilistic Modeling Workshop, 36th International Conference on Machine Learning, June 2019, Jun 2019

Probabilistic sentential decision diagrams (PSDDs) are annotated circuits providing a possibly compact specification of joint probability mass functions consistent with a formula over a set of propositional variables. PSDD inference is tractable in the sense that marginal queries can be achieved in linear time with respect to the circuit size by traversal algorithms. Unlike other probabilistic graphical models such as Bayesian networks, the problem of learning the structure for PSDDs received relatively little attention. We discuss some preliminary ideas related to the development of pure likelihood-score-based search methods for the learning of PSDD structures fitting a formula and a data set of consistent observations. A sampling algorithm for these models is also provided.

Proceedings

Reliable Discretisation of Deterministic Equations in Bayesian Networks

Alessandro Antonucci

In Proceedings of the Thirty-Second International Florida Artificial Intelligence Research Society Conference (FLAIRS-32), Jun 2019

We focus on the problem of modeling deterministic equations over continuous variables in discrete Bayesian networks. This is typically achieved by a discretisation of both input and output variables and a degenerate quantification of the corresponding conditional probability tables. This approach, based on classical probabilities, cannot properly model the information loss induced by the discretisation. We show that a reliable modeling of such epistemic uncertainty can be instead achieved by credal sets, i.e., convex sets of probability mass functions. This transforms the original Bayesian network in a credal network, possibly returning interval-valued inferences, that are robust with respect to the information loss induced by the discretisation. Algorithmic strategies for an optimal choice of the discretisation bins are also discussed.

Proceedings

NOVEL2GRAPH: Visual Summaries of Narrative Text Enhanced by Machine Learning

In Proceedings of Text2Story - Second Workshop on Narrative Extraction From Texts co-located with 41th European Conference on Information Retrieval (ECIR 2019), Jun 2019

A machine learning approach to the creation of visual summaries for narrative text is presented. Standard natural language processing tools for named entities recognition are used together with a clustering algorithm to detect the characters of the novel and their aliases. The most relevant ones and their relations are evaluated on the basis of a simple statistical analysis. These characters are visually depicted as nodes of an undirected graph whose edges describe relations with other characters. Specialized sentiment analysis techniques based on sentence embedding decide the colours of characters/nodes and their relations/edges. Additional information about the characters (e.g., gender) and their relations (e.g., siblings or partnerships) are returned by binary classifiers and visually depicted in the graph. For those specialized tasks, small amounts of manually annotated data are sufficient to achieve good accuracy. Compared to analogous tools, the machine learning approach we present allows for a richer representation of texts of this kind. A case study to demonstrate this approach for a series of books is also reported.

2018

Proceedings

Imaginary Kinematics

Sabina Marchetti, and Alessandro Antonucci

In UAI 2018: Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, Jun 2018

We introduce a novel class of adjustment rules for a collection of beliefs. This is an extension of Lewis’ imaging to absorb probabilistic evidence in generalized settings. Unlike standard tools for belief revision, our proposal may be used when information is inconsistent with an agent’s belief base. We show that the functionals we introduce are based on the imaginary counterpart of probability kinematics for standard belief revision, and prove that, under certain conditions, all standard postulates for belief revision are satisfied.

Journal

Modeling Deterministic Equations in Discrete Bayesian Networks by Credal Networks

A reliable modeling of uncertain evidence in Bayesian networks based on a set-valued quantification is proposed. Both soft and virtual evidences are considered. We show that evidence propagation in this setup can be reduced to standard updating in an augmented credal network, equivalent to a set of consistent Bayesian networks. A characterization of the computational complexity for this task is derived together with an efficient exact procedure for a subclass of instances. In the case of multiple uncertain evidences over the same variable, the proposed procedure can provide a set-valued version of the geometric approach to opinion pooling.

Proceedings

A linear programming based approach for evaluating interval-valued influence diagrams

Rafael Cabañas, Andrés Cano, Manuel Gómez-Olmedo, and Alessandro Antonucci

In XVIII Conferencia de la Asociación Española para la Inteligencia Artificial (CAEPIA 2018): avances en Inteligencia Artificial., Jun 2018

In Proceedings of the 5th International Workshop on Probabilistic Logic Programming co-located with the 28th International Conference on Inductive Logic Programming (ILP 2018), Jun 2018

Probabilistic sentential decision diagrams are a class of arithmetic circuits locally annotated by probability mass functions. This allows for a compact representation of joint mass functions in the presence of logical constraints. Here we propose a set-valued generalisation of the probabilistic quantification in these models, that allows to replace the sharp specification of the local probabilities with linear constraints over them. This induces a (convex) set of joint probability mass functions, all consistent with the assigned logical constraints. These models should be adopted for a cautious learning of the local parameters if only small amount of data are available. Algorithmic strategies to compute the lower and upper bounds of marginal and conditional queries with respect to these sets of joint mass functions are sketched. The task can be achieved in linear time with respect to the diagram size for marginal queries and, if the diagram is singly connected, the same can be done for conditional queries.

Proceedings

A Credal Extension of Independent Choice Logic

Alessandro Antonucci, and Alessandro Facchini

In Scalable Uncertainty Management (SUM 2018): Proceedings of the Twelfth International Conference on Scalable Uncertainty Management), Jun 2018

We propose an extension of Poole’s independent choice logic based on a relaxation of the underlying independence assumptions. A credal semantics involving multiple joint probability mass functions over the possible worlds is adopted. This represents a conservative approach to probabilistic logic programming achieved by considering all the mass functions consistent with the probabilistic facts. This allows to model tasks for which independence among some probabilistic choices cannot be assumed, and a specific dependence model cannot be assessed. Preliminary tests on an object ranking application show that, despite the loose underlying assumptions, informative inferences can be extracted.

2017

Journal

Evaluating Interval-valued Influence Diagrams

Rafael Cabañas, Alessandro Antonucci, Andrés Cano, and Manuel Gómez-Olmedo

International Journal of Approximate Reasoning, Jan 2017

Influence diagrams are probabilistic graphical models used to represent and solve sequential decision problems under uncertainty. Sharp numerical values are required to quantify probabilities and utilities. This might be an issue with real models, whose parameters are typically obtained from expert judgements or partially reliable data. We consider an interval-valued quantification of the parameters to gain realism in the modeling and evaluate the sensitivity of the inferences with respect to perturbations in the sharp values of the parameters. An extension of the classical influence diagrams formalism to support such interval-valued potentials is presented. The variable elimination and arc reversal inference algorithms are generalized to cope with these models. At the price of an outer approximation, the extension keeps the same complexity as with sharp values. Numerical experiments show improved performances with respect to previous methods. As a natural application, we propose these models for practical sensitivity analysis in traditional influence diagrams. The maximum perturbation level on single or multiple parameters preserving the optimal strategy can be computed. This allows the identification of the parameters deserving a more careful elicitation.

Proceedings

Reliable Knowledge-Based Adaptive Tests by Credal Networks

Francesca Mangili, Claudio Bonesana, and Alessandro Antonucci

In Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Jun 2017

An adaptive test is a computer-based testing technique which adjusts the sequence of questions on the basis of the estimated ability level of the test taker. We suggest the use of credal networks, a generalization of Bayesian networks based on sets of probability mass functions, to implement adaptive tests exploiting the knowledge of the test developer instead of training on databases of answers. Compared to Bayesian networks, these models might offer higher expressiveness and hence a more reliable modeling of the qualitative expert knowledge. The counterpart is a less straightforward identification of the information-theoretic measure controlling the question-selection and the test-stopping criteria. We elaborate on these issues and propose a sound and computationally feasible procedure. Validation against a Bayesian-network approach on a benchmark about German language proficiency assessments suggests that credal networks can be reliable in assessing the student level and effective in reducing the number of questions required to do it.

Journal

The Multilabel Naive Credal Classifier

Alessandro Antonucci, and Giorgio Corani

International Journal of Approximate Reasoning, Apr 2017

A credal classifier for multilabel data is presented. This is obtained as an extension of Zaffalon’s naive credal classifier to the case of non-exclusive class labels. The dependence relations among the labels are shaped with a tree topology. The classifier, based on a polynomial-time algorithm to compute whether or not a class label is optimal, returns a compact description of the set of optimal sequences of labels. Extensive experiments on real multilabel data show that the classifier gives more robust predictions than its Bayesian counterpart. In practice, when multiple sequences are returned in output, the Bayesian model is more likely to be inaccurate, while the sequences returned by the credal classifier are more likely to include the correct one.

2016

Proceedings

Technical Gestures Recognition by Set-valued Hidden Markov Models with Prior Knowledge

Yann Soullard, Alessandro Antonucci, and Sébastien Destercke

In Soft Methods for Data Science (SMPS 2016: Proceedings of the Eight International Conference on Soft Methods in Probability and Statistics), Sep 2016

Hidden Markov models are popular tools for gesture recognition. Once the generative processes of gestures have been identified, an observation sequence is usually classified as the gesture having the highest likelihood, thus ignoring possible prior information. In this paper, we consider two potential improvements of such methods: the inclusion of prior information, and the possibility of considering convex sets of probabilities (in the likelihoods and the prior) to infer imprecise, but more reliable, predictions when information is insufficient. We apply the proposed approach to technical gestures, typically characterized by severe class imbalance. By modelling such imbalances as a prior information, we achieve more accurate results, while the imprecise quantification is shown to produce more reliable estimates.

Proceedings

Adaptive Testing by Bayesian Networks with Application to Language Assessment

Francesca Mangili, Claudio Bonesana, Alessandro Antonucci, Marco Zaffalon, Elisa Rubegni, and
1 more author

In ITS 2016: Intelligent Tutoring Systems: 13th International Conference, Jun 2016

We present a general procedure for computerized adaptive testing based on probabilistic graphical models, and show on a real-world benchmark how this procedure can increase the internal consistency of the test and reduce the number of questions without affecting accuracy.

Journal

Hidden Markov models with set-valued parameters

Denis Deratani Mauá, Alessandro Antonucci, and Cassio Polpo Campos

Hidden Markov models (HMMs) are widely used probabilistic models of sequential data. As with other probabilistic models, they require the specification of local conditional probability distributions, whose assessment can be too difficult and error-prone, especially when data are scarce or costly to acquire. The imprecise HMM (iHMM) generalizes HMMs by allowing the quantification to be done by sets of, instead of single, probability distributions. iHMMs have the ability to suspend judgment when there is not enough statistical evidence, and can serve as a sensitivity analysis tool for standard non-stationary HMMs. In this paper, we consider iHMMs under the strong independence interpretation, for which we develop efficient infer- ence algorithms to address standard HMM usage such as the computation of likelihoods and most probable explanations, as well as performing filter- ing and predictive inference. Experiments with real data show that iHMMs produce more reliable inferences without compromising the computational efficiency.

2015

Journal

Approximate Credal Network Updating by Linear Programming with Applications to Decision Making

Alessandro Antonucci, Cassio Polpo Campos, David Huber, and Marco Zaffalon

International Journal of Approximate Reasoning, Mar 2015

Credal nets are probabilistic graphical models which extend Bayesian nets to cope with sets of distributions. An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to accurate inferences. A transformation is also derived to reduce decision making in credal networks based on the maximality criterion to updating. The decision task is proved to have the same complexity of standard inference, being NP^PP-complete for general credal nets and NP-complete for polytrees. Similar results are derived for the E-admissibility criterion. Numerical experiments confirm a good performance of the method.

Journal

Imprecision in Machine Learning and AI

Cassio Polpo Campos, and Alessandro Antonucci

The IEEE Intelligent Informatics Bulletin, Dec 2015

Hidden Markov models are popular tools for the modeling of multivariate time series. A set-valued quantification of the parameters might increase realism in the description of non-stationarity. A recent work shows that the computation of the bounds of the likelihood of a sequence with respect to such imprecise quantification can be performed in the same polynomial time of a precise computation with sharp values. This is the basis for a credal classifier of time series. For each training sequence we learn a set-valued model and compute the interval likelihood of the test sequence. The returned classes are those of the models associated to the undominated intervals. The approach is particularly accurate on the instances for which single classes are returned. We therefore apply this method to early classi- fication of streaming data. As soon as the credal classifier returns a single output we assign a class label even if the stream is not terminated. Tests on a speech recognition benchmark suggest that the proposed approach might outperform a thresholding of the precise likelihoods with the classical models.

Inproceedings

The Multilabel Naive Credal Classifier

Alessandro Antonucci, and Giorgio Corani

In ISIPTA ’15: Proceedings of the Ninth International Symposium on Imprecise Probability: Theories and Applications, Jul 2015

We present a credal classifier for multilabel data. The model generalizes the naive credal classifier to the multilabel case. An imprecise-probabilistic quantifi- cation is achieved by means of the imprecise Dirichlet model in its global formulation. A polynomial-time algorithm to compute whether or not a label is optimal according to the maximality criterion is derived. Experimental results show the importance of robust predictions in multilabel problems.

Proceedings

Variable Elimination for Interval-valued Influence Diagrams

Rafael Cabañas, Alessandro Antonucci, Andrés Cano, and Manuel Gómez-Olmedo

In ECSQARU 2013: Proceedings of the 13th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Jul 2015

Influence diagrams are probabilistic graphical models used to represent and solve decision problems under uncertainty. Sharp numerical values are required to quantify probabilities and utilities. Yet, real models are based on data streams provided by partially reliable sensors or experts. We propose an interval-valued quantification of these parameters to gain realism in the modelling and to analyse the sensitivity of the inferences with respect to perturbations of the sharp values. An extension of the classical influence diagrams formalism to support interval-valued potentials is provided. Moreover, a variable elimination algorithm especially designed for these models is developed and evalu- ated in terms of complexity and empirical performances.

Journal

Robust Classification of Multivariate Time Series by Imprecise Hidden Markov Models

Alessandro Antonucci, Rocco Rosa, Alessandro Giusti, and Fabio Cuzzolin

International Journal of Approximate Reasoning, Jul 2015

A novel technique to classify time series with imprecise hidden Markov models is presented. The learning of these models is achieved by coupling the EM algorithm with the imprecise Dirichlet model. In the stationarity limit, each model corresponds to an imprecise mixture of Gaussian densities, this reducing the problem to the classification of static, imprecise-probabilistic, information. Two classifiers, one based on the expected value of the mixture, the other on the Bhattacharyya distance between pairs of mixtures, are developed. The computation of the bounds of these descriptors with respect to the imprecise quantification of the parameters is reduced to, respectively, linear and quadratic optimization tasks, and hence efficiently solved. Classification is performed by extending the k-nearest neighbors approach to interval-valued data. The classifiers are credal, meaning that multiple class labels can be returned in the output. Experiments on benchmark datasets for computer vision show that these methods achieve the required robustness whilst outperforming other precise and imprecise methods.

2014

Proceedings

Algorithms for Hidden Markov Models with Imprecisely Specified Parameters

Denis Deratani Mauá, Cassio Polpo Campos, and Alessandro Antonucci

In BRACIS 2014: Proceedings of the Brazilian Conference on Intelligent Systems, Oct 2014

Hidden Markov models (HMMs) are widely used models for sequential data. As with other probabilistic graphical models, they require the specification of precise probability values, which can be too restrictive for some domains, espe- cially when data are scarce or costly to acquire. We present a generalized version of HMMs, whose quantification can be done by sets of, instead of single, probability distributions. Our models have the ability to suspend judgment when there is not enough statistical evidence, and can serve as a sensitivity analysis tool for standard non-stationary HMMs. Efficient inference algorithms are developed to address standard HMM usage such as the computation of likelihoods and most probable explanations. Experiments with real data show that the use of imprecise probabilities leads to more reliable inferences without compromising efficiency.

Journal

Probabilistic Inference in Credal Networks: New Complexity Results

Credal networks are graph-based statistical models whose parameters take values in a set, instead of being sharply specified as in traditional statistical models (e.g., Bayesian networks). The computational complexity of inferences on such models depends on the irrelevance/independence concept adopted. In this paper, we study inferential complexity under the concepts of epistemic irrelevance and strong independence. We show that infer- ences under strong independence are NP-hard even in trees with binary variables except for a single ternary one. We prove that under epistemic irrelevance the polynomial-time complexity of inferences in credal trees is not likely to extend to more general models (e.g., singly connected topologies). These results clearly distinguish networks that admit efficient inferences and those where inferences are most likely hard, and settle several open questions regarding their computational complexity. We show that these results remain valid even if we disallow the use of zero probabilities. We also show that the computation of bounds on the probability of the future state in a hidden Markov model is the same whether we assume epistemic irrelevance or strong independence, and we prove a similar result for inference in naive Bayes structures. These inferential equivalences are important for practitioners, as hidden Markov models and naive Bayes structures are used in real applications of imprecise probability.

Journal

Credal ensembles of classifiers

G. Corani, and A. Antonucci

Computational Statistics & Data Analysis, Mar 2014

It is studied how to aggregate the probabilistic predictions generated by different SPODE (Super-Parent-One-Dependence Estimators) classifiers. It is shown that aggregating such predictions via compression-based weights achieves a slight but consistent improvement of performance over previously existing aggregation methods, including Bayesian Model Averaging and simple average (the approach adopted by the AODE algorithm). Then, attention is given to the problem of choosing the prior probability distribution over the models; this is an important issue in any Bayesian ensemble of models. To robustly deal with the choice of the prior, the single prior over the models is substituted by a set of priors over the models (credal set), thus obtaining a credal ensemble of Bayesian classifiers. The credal ensemble recognizes the prior-dependent instances, namely the instances whose most probable class varies when different prior over the models are considered. When faced with prior-dependent instances, the credal ensemble remains reliable by returning a set of classes rather than a single class. Two credal ensembles of SPODEs are developed; the first generalizes the Bayesian Model Averaging and the second the compression-based aggregation. Extensive experiments show that the novel ensembles compare favorably to traditional methods for aggregating SPODEs and also to previous credal classifiers.

Proceedings

Global Sensitivity Analysis for MAP Inference in Graphical Models

Jasper De Bock, Cassio Polpo Campos, and Alessandro Antonucci

In Advances in Neural Information Processing Systems 27 (NIPS 2014), Dec 2014

We study the sensitivity of a MAP configuration of a discrete probabilistic graph- ical model with respect to perturbations of its parameters. These perturbations are global, in the sense that simultaneous perturbations of all the parameters (or any chosen subset of them) are allowed. Our main contribution is an exact algorithm that can check whether the MAP configuration is robust with respect to given per- turbations. Its complexity is essentially the same as that of obtaining the MAP configuration itself, so it can be promptly used with minimal effort. We use our algorithm to identify the largest global perturbation that does not induce a change in the MAP configuration, and we successfully apply this robustness measure in two practical scenarios: the prediction of facial action units with posed images and the classification of multiple real public data sets. A strong correlation between the proposed robustness measure and accuracy is verified in both scenarios.

Proceedings

Trading Off Speed and Accuracy in Multilabel Classification

Giorgio Corani, Alessandro Antonucci, Denis Deratani Mauá, and Sandra Gabaglio

In PGM ’14: Proceedings of the Seventh European Workshop on Probabilistic Graphical Models, Sep 2014

In previous work, we devised an approach for multilabel classification based on an ensemble of Bayesian networks. It was characterized by an efficient structural learning and by high accuracy. Its shortcoming was the high computational complexity of the MAP inference, necessary to identify the most probable joint configuration of all classes. In this work, we switch from the ensemble approach to the single model approach. This allows important computational savings. The reduction of inference times is exponential in the difference between the treewidth of the single model and the number of classes. We adopt moreover a more sophisticated approach for the structural learning of the class subgraph. The proposed single models outperforms alternative approaches for multilabel classification such as binary relevance and ensemble of classifier chains.

Proceedings

Decision Making with Hierarchical Credal Sets

Alessandro Antonucci, Alexander Karlsson, and David Sundgren

In Information Processing and Management of Uncertainty in Knowledge-Based Systems, Jul 2014

Credal networks are graph-based statistical models whose parameters take values in a set, instead of being sharply specified as in traditional statistical models (e.g., Bayesian networks). The computational complexity of inferences on such models depends on the irrelevance/independence concept adopted. In this paper, we study inferential complexity under the concepts of epistemic irrelevance and strong independence. We show that inferences under strong independence are NP-hard even in trees with ternary variables. We prove that under epistemic irrelevance the polynomial time complexity of inferences in credal trees is not likely to extend to more general models (e.g. singly connected networks). These results clearly distinguish networks that admit efficient inferences and those where inferences are most likely hard, and settle several open questions regarding computational complexity.

Proceedings

CREDO: A Military Decision-Support System Based on Credal Networks

Alessandro Antonucci, David Huber, Marco Zaffalon, Philippe Luginbühl, Ian Chapman, and
1 more author

In Proceedings of the 16th Conference on Information Fusion (FUSION 2013), Jul 2013

A software tool especially designed for military domains to create and query decision-support systems is presented. Credal networks, which are Bayesian networks whose parameters have the freedom to vary in convex sets, are used to model the relations among the system variables. A novel elicitation procedure of these sets, which allows the military experts to report their knowledge by purely qualitative judgements, is proposed. Two high-level fusion procedures to cope with multiple experts in this framework are also derived. All these features are supported by the software and demonstrated in an application to space security tested during the last NATO multinational experiment.

Proceedings

An Ensemble of Bayesian Networks for Multilabel Classification

Alessandro Antonucci, Giorgio Corani, Denis Deratani Mauá, and Sandra Gabaglio

In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI-13), Aug 2013

We present a novel approach for multilabel classification based on an ensemble of Bayesian networks. The class variables are connected by a tree; each model of the ensemble uses a different class as root of the tree. We assume the features to be conditionally independent given the classes, thus generalizing the naive Bayes assumption to the multiclass case. This assumption allows us to optimally identify the correlations between classes and features; such correlations are moreover shared across all models of the ensemble. Inferences are drawn from the ensemble via logarithmic opinion pooling. To minimize Hamming loss, we compute the marginal probability of the classes by running standard inference on each Bayesian network in the ensemble, and then pooling the inferences. To instead minimize the subset 0/1 loss, we pool the joint distributions of each model and cast the problem as a MAP inference in the corresponding graphical model. Experiments show that the approach is competitive with state-of-the-art methods for multilabel classification.

Proceedings

Temporal Data Classification by Imprecise Dynamical Models

Alessandro Antonucci, Rocco Rosa, Alessandro Giusti, and Fabio Cuzzolin

In ISIPTA ’13: Proceedings of the Eighth International Symposium on Imprecise Probability: Theories and Applications, Aug 2013

We propose a new methodology to classify temporal data with imprecise hidden Markov models. For each sequence we learn a different model by coupling the EM algorithm with the imprecise Dirichlet model. As a model descriptor, we consider the expected value of the observable variable in the limit of stationarity of the Markov chain. In the imprecise case, only the bounds of this descriptor can be evaluated. In practice the sequence, which can be regarded as a trajectory in the feature space, is summarized by a hyperbox in the same space. We classify these static but interval-valued data by a credal generalization of the k-nearest neighbors algorithm. Experiments on benchmark datasets for computer vision show that the method achieves the required robustness whilst outperforming other precise and imprecise methods.

Proceedings

Approximating Credal Network Inferences by Linear Programming

Alessandro Antonucci, Cassio Polpo Campos, David Huber, and Marco Zaffalon

In Proceedings of the 12th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Jul 2013

An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to very accurate inferences. The approach can also be specialized to classification with credal networks based on the maximality criterion. A complexity analysis for both the problem and the algorithm is reported together with numerical experiments, which confirm the good performance of the method. While the inner approximation produced by the algorithm gives rise to a classifier which might return a subset of the optimal class set, preliminary empirical results suggest that the accuracy of the optimal class set is seldom affected by the approximate probabilities.

2012

Proceedings

Active Learning by the Naive Credal Classifier

Alessandro Antonucci, Giorgio Corani, and Sandra Gabaglio

In Proceeding of the sixth European Workshop on Probabilistic Graphical Models (PGM 2012), Jul 2012

In standard classification a training set of supervised instances is given. In a more general setup, some supervised instances are available, while further ones should be chosen from an unsupervised set and then annotated. As the annotation step is costly, active learning algorithms are used to select which instances to annotate to maximally increase the classification performance while annotating only a limited number of them. Several active learning algorithms are based on the naive Bayes classifier. We work instead with the naive credal classifier, namely an extension of naive Bayes to imprecise probability. We propose two novel methods for active learning based on the naive credal classifier. Empirical comparisons show performance comparable or slightly superior to that of approaches solely based on the naive Bayes.

ECAI

Compression-based AODE Classifiers

Giorgio Corani, Alessandro Antonucci, and Rocco Rosa

In Proceedings 20th European Conference on Artificial Intelligence (ECAI 2012), Aug 2012

We propose the COMP-AODE classifier, which adopts the compression-based approach to average the posterior probabilities computed by different non-naive classifier (SPODEs). COMP-AODE improves classification performance over the well-known AODE model. COMP-AODE assumes a uniform prior over the SPODEs; we then develop the credal classifier COMP-AODE*, substituting the uniform prior by a set of priors. COMP-AODE* returns more classes when the classification is prior-dependent, namely if the most probable class varies with the prior adopted over the SPODEs. COMP-AODE* achieves higher classification utility than both COMP-AODE and AODE.

Proceedings

An Interval-Valued Dissimilarity Measure for Belief Functions Based on Credal Semantics

Alessandro Antonucci

In Belief Functions: Theory and Applications - Proceedings of the 2nd International Conference on Belief Functions, May 2012

Evidence theory extends Bayesian probability theory by allowing for a more expressive model of subjective uncertainty. Besides standard interpretation of belief functions, where uncertainty corresponds to probability masses which might refer to whole subsets of the possibility space, credal semantics can be also considered. Accordingly, a belief function can be identified with the whole set of probability mass functions consistent with the beliefs induced by the masses. Following this interpretation, a novel, set-valued, dissimilarity measure with a clear behavioral interpretation can be defined. We describe the main features of this new measure and comment the relation with other measures proposed in the literature.

Proceedings

Likelihood-Based Robust Classification with Bayesian Networks

Alessandro Antonucci, Marco Cattaneo, and Giorgio Corani

In Advances in Computational Intelligence, Jul 2012

Bayesian networks are commonly used for classification: a structural learning algorithm determines the network graph, while standard approaches estimate the model parameters from data. Yet, with few data the corresponding assessments can be unreliable. To gain robustness in this phase, we consider a likelihood-based learning approach, which takes all the model quantifications whose likelihood exceeds a given threshold. A new classification algorithm based on this approach is presented. Notably, this is a credal classifier, i.e., more than a single class can be returned in output. This is the case when the Bayesian networks consistent with the threshold constraint assign different class labels to a test instance. This is the first classifier of this kind for general topologies. Experiments show how this approach provide the desired robustness.

2011

Proceedings

Decision Making by Credal Nets

Alessandro Antonucci, and Cassio Polpo Campos

In Proceedings of the International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC 2011), Aug 2011

Credal nets are probabilistic graphical models which extend Bayesian nets to cope with sets of distributions. This feature makes the model particularly suited for the implementation of classifiers and knowledge-based systems. When working with sets of (instead of single) probability distributions, the identification of the optimal option can be based on different criteria, some of them eventually leading to multiple choices. Yet, most of the inference algorithms for credal nets are designed to compute only the bounds of the posterior probabilities. This prevents some of the existing criteria from being used. To overcome this limitation, we present two simple transformations for credal nets which make it possible to compute decisions based on the maximality and E-admissibility criteria without any modification in the inference algorithms. We also prove that these decision problems have the same complexity of standard inference, being NP^PP-hard for general credal nets and NP-hard for polytrees.

The naive credal classifier extends the classical naive Bayes classifier to imprecise probabilities, substituting the uniform prior by the imprecise Dirichlet model. As an alternative to the naive credal classifier, we present a hierarchical likelihood-based approach, which extends in a novel way the naive Bayes towards imprecise probabilities; in particular, it considers any possible quantification (each one defining a naive Bayes classifier) apart from those assigning to the available data a probability below a given threshold level. Besides the available supervised data, in the likelihood evaluation we also consider the instance to be classified, for which the value of the class variable is assumed missing-at-random. We obtain a closed formula to compute the dominance according to the maximality criterion for any threshold level. As there are currently no well-established metrics for comparing credal classifiers which have considerably different determinacy, we compare the two classifiers when they have comparable determinacy, finding that in those cases they generate almost equivalent classifications.

Proceedings

The Imprecise Noisy-OR Gate

Alessandro Antonucci

In FUSION 2011: Proceedings of the 14th International Conference on Information Fusion, Jul 2011

The noisy-OR gate is an important tool for a compact elicitation of the conditional probabilities of a Bayesian network. An imprecise-probabilistic version of this model, where sets instead of single distributions are used to model uncertainty about the inhibition of the causal factors, is proposed. This transforms the original Bayesian network into a so-called credal network. Despite the higher computational complexity generally characterizing inference on credal networks, it is possible to prove that, exactly as for Bayesian networks, the local complexity to update probabilities on an imprecise noisy-OR gate takes only linear, instead of exponential, time in the number of causes. This result is also extended to fault tree analysis and allows for a fast fusion of the causal effects on models with an imprecise-probabilistic quantification of the initiating events.

Proceedings

Action Recognition by Imprecise Hidden Markov Models

Alessandro Antonucci, Rocco Rosa, and Alessandro Giusti

In Proceedings of the 2011 International Conference on Image Processing, Computer Vision and Pattern Recognition, IPCV 2011, Jul 2011

Hidden Markov models (HMMs) are powerful tools to capture the dynamics of a human action by providing a sufficient level of abstraction to recognise what two video sequences, depicting the same kind of action, have in common. If the sequence is short and hence only few data are available, the EM algorithm, which is generally employed to learn HMMs, might return unreliable estimates. As a possible solution to this problem, a robust version of the EM algorithm, which provides an interval-valued quantification of the HMM probabilities is provided. This takes place in an imprecise-probabilistic framework, where action recognition can be based on the (bounds of the) likelihood assigned by an imprecise HMM to the considered video sequence. Experiments show that this approach is quite effective in discriminating the hard-to-recognise sequences from the easy ones. In practice, either the recognition algorithm returns a set of action labels, which typically includes the right one, either a single answer, which is very likely to be correct, is provided.

2010

Journal

An Aggregation Framework Based on Coherent Lower Previsions: Application to Zadeh’s Paradox and Sensor Networks

Alessio Benavoli, and Alessandro Antonucci

International Journal of Approximate Reasoning, Nov 2010

The problem of aggregating two or more sources of information containing knowl- edge about a common domain is considered. We propose an aggregation framework for the case where the available information is modelled by coherent lower previsions, corresponding to convex sets of probability mass functions. The con- sistency between aggregated beliefs and sources of information is discussed. A closed formula, which specializes our rule to a particular class of models, is also derived. Two applications consisting in a possible explanation of Zadeh’s paradox and an algorithm for estimation fusion in sensor networks are finally reported.

Journal

Epistemic Irrelevance in Credal Nets: The Case of Imprecise Markov trees

Gert Cooman, Filip Hermans, Alessandro Antonucci, and Marco Zaffalon

International Journal of Approximate Reasoning, Nov 2010

We focus on credal nets, which are graphical models that generalise Bayesian nets to imprecise probability. We replace the notion of strong independence commonly used in credal nets with the weaker notion of epistemic irrelevance, which is arguably more suited for a behavioural theory of probability. Focusing on directed trees, we show how to combine the given local uncertainty models in the nodes of the graph into a global model, and we use this to construct and justify an exact message-passing algorithm that computes updated beliefs for a variable in the tree. The algorithm, which is linear in the number of nodes, is formulated entirely in terms of coherent lower previsions, and is shown to satisfy a number of rationality requirements. We supply examples of the algorithm’s operation, and report an application to on-line character recognition that illustrates the advantages of our approach for prediction. We comment on the perspectives, opened by the availability, for the first time, of a truly efficient algorithm based on epistemic irrelevance.

Journal

Generalized Loopy 2U: A New Algorithm for Approximate Inference in Credal Networks

Alessandro Antonucci, Sun Yi, Cassio Polpo Campos, and Marco Zaffalon

International Journal of Approximate Reasoning, Jan 2010

Credal networks generalize Bayesian networks by relaxing the requirement of precision of probabilities. Credal networks are considerably more expressive than Bayesian networks, but this makes belief updating NP-hard even on polytrees. We develop a new efficient algorithm for approximate belief updating in credal networks. The algorithm is based on an important representation result we prove for general credal networks: that any credal network can be equivalently reformulated as a credal network with binary variables; moreover, the transformation, which is considerably more complex than in the Bayesian case, can be implemented in polynomial time. The equivalent binary credal network is then updated by L2U, a loopy approximate algorithm for binary credal networks. Overall, we generalize L2U to non-binary credal networks, obtaining a scalable algorithm for the general case, which is approximate only because of its loopy nature. The accuracy of the inferences with respect to other state-of-the-art algorithms is evaluated by extensive numerical tests.

Chapter

Building Knowledge-based Expert Systems: A Tutorial

Alberto Piatti, Alessandro Antonucci, and Marco Zaffalon

Knowledge-based systems are computer programs achieving expert-level competence in solving problems for specific task areas. This chapter is a tutorial on the construction of knowledge-based systems in the theoretical framework of credal networks. Credal networks are a generalization of Bayesian networks where credal sets, i.e., closed convex sets of probability measures, are used instead of precise probabilities. This allows for a more flexible process of elicitation than in the case of Bayesian networks. In fact, credal sets allow to represent ambiguity, contrast and contradiction in a natural and realistic way. The procedure we propose is based on a sharp distinction between the domain knowledge and the process linking this knowledge to the perceived evidence, which we call the observational process. This distinction leads to a very flexible representation of both domain knowledge and knowledge about the way the information is collected, together with a procedure of aggregation of the information coming from the different sources. The overall procedure is illustrated along the chapter by a simple knowledge-based system for the prediction of the result of a football match.

Proceedings

Credal Sets Approximation by Lower Probabilities: Application to Credal Networks

Alessandro Antonucci, and Fabio Cuzzolin

In Computational Intelligence for Knowledge-Based Systems Design, 13th International Conference on Information Processing and Management of Uncertainty, IPMU 2010, Jun 2010

Credal sets are closed convex sets of probability mass functions. The lower probabilities specified by a credal set for each element of the power set can be used as constraints defining a second credal set. This simple procedure produces an outer approximation, with a bounded number of extreme points, for general credal sets. The approximation is optimal in the sense that no other lower probabilities can specify smaller supersets of the original credal set. Notably, in order to be computed, the approximation does not need the extreme points of the credal set, but only its lower probabilities. This makes the approximation particularly suited for credal networks, which are a generalization of Bayesian networks based on credal sets. Although most of the algorithms for credal networks updating only return lower posterior probabilities, the suggested approximation can be used to evaluate (as an outer approximation of) the posterior credal set. This makes it possible to adopt more sophisticated decision making criteria, without having to replace existing algorithms. The quality of the approximation is investigated by numerical tests.

2009

Proceedings

Modeling Unreliable Observations in Bayesian Networks by Credal Networks

Alessandro Antonucci, and Alberto Piatti

In Scalable Uncertainty Management, Third International Conference, SUM 2009, Jun 2009

Bayesian networks are probabilistic graphical models widely employed in AI for the implementation of knowledge-based systems. Standard inference algorithms can update the beliefs about a variable of interest in the network after the observation of some other variables. This is usually achieved under the assumption that the observations could reveal the actual states of the variables in a fully reliable way. We propose a procedure for a more general modeling of the observations, which allows for updating beliefs in different situations, including various cases of unreliable, incomplete, uncertain and also missing observations. This is achieved by augmenting the original Bayesian network with a number of auxiliary variables corresponding to the observations. For a flexible modeling of the observational process, the quantification of the relations between these auxiliary variables and those of the original Bayesian network is done by credal sets, i.e., convex sets of probability mass functions. Without any lack of generality, we show how this can be done by simply estimating the bounds for the likelihoods of the observations. Overall, the Bayesian network is transformed into a credal network, for which a standard updating problem has to be solved. Finally, a number of transformations that might simplify the updating of the resulting credal network is provided.

Proceedings

Multiple Model Tracking by Imprecise Markov Trees

Alessandro Antonucci, Alessio Benavoli, Marco Zaffalon, Gert Cooman, and Filip Hermans

In FUSION 2009: Proceedings of the 12th International Conference on Information Fusion, Jun 2009

We present a new procedure for tracking manoeuvring objects by hidden Markov chains. It leads to more reliable modelling of the transitions between hidden states compared to similar approaches proposed within the Bayesian framework: we adopt convex sets of probability mass functions rather than single ’precise probability’ specifications, in order to provide a more realistic and cautious model of the manoeuvre dynamics. In general, the downside of such increased freedom in the modelling phase is a higher inferential complexity. However, the simple topology of hidden Markov chains allows for efficient tracking of the object through a recently developed belief propagation algorithm. Furthermore, the imprecise specification of the transitions can produce so-called indecision, meaning that more than one model may be suggested by our method as a possible explanation of the target kinematics. In summary, our approach leads to a multiple-model estimator whose performance, investigated through extensive numerical tests, turns out to be more accurate and robust than that of Bayesian ones.

Journal

Credal Networks for Military Identification Problems

Alessandro Antonucci, Ralph Brühlmann, Alberto Piatti, and Marco Zaffalon

International Journal of Approximate Reasoning, Apr 2009

Credal networks are imprecise probabilistic graphical models generalizing Bayesian networks to convex sets of probability mass functions. This makes credal networks particularly suited to model expert knowledge under very general conditions, including states of qualitative and incomplete knowledge. In this paper, we present a credal network for risk evaluation in case of intrusion of civil aircrafts into a restricted flight area. The different factors relevant for this evaluation, together with an independence structure over them, are initially identified. These factors are observed by sensors, whose reliabilities can be affected by variable external factors, and even by the behaviour of the intruder. A model of these observation processes, and the necessary fusion scheme for the information returned by the sensors measuring the same factor, are both completely embedded into the structure of the credal network. A pool of experts, facilitated in their task by specific techniques to convert qualitative judgements into imprecise probabilistic assessments, has made possible the quantification of the network. We show the capabilities of the proposed model by means of some preliminary tests referred to simulated scenarios. Overall, we can regard this application as a useful tool to support military experts in their decision, but also as a quite general imprecise-probability paradigm for information fusion.

Proceedings

Epistemic Irrelevance in Credal Networks: The Case of Imprecise Markov Trees

Gert Cooman, Filip Hermans, Alessandro Antonucci, and Marco Zaffalon

In ISIPTA ’09: Proceedings of the Sixth International Symposium on Imprecise Probability: Theories and Applications, Apr 2009

We replace strong independence in credal networks with the weaker notion of epistemic irrelevance. Focusing on directed trees, we show how to combine local credal sets into a global model, and we use this to construct and justify an exact message-passing algorithm that computes updated beliefs for a variable in the tree. The algorithm, which is essentially linear in the number of nodes, is formulated entirely in terms of coherent lower previsions. We supply examples of the algorithm’s operation, and report an application to on-line character recognition that illustrates the advantages of our model for prediction.

Proceedings

Aggregating Imprecise Probabilistic Knowledge

Alessio Benavoli, and Alessandro Antonucci

In Proceedings of the Sixth International Symposium on Imprecise Probability: Theories and Applications (ISIPTA ’09), Jul 2009

The problem of aggregating two or more sources of information containing knowledge about a same domain is considered. We propose an aggregation rule for the case where the available information is modeled by coherent lower previsions, corresponding to convex sets of probability mass functions. The consistency between aggregated beliefs and sources of information is discussed. A closed formula, which specializes our rule to a particular class of models, is also derived. Finally, an alternative explanation of Zadeh’s paradox is provided.

2008

Journal

Decision-Theoretic Specification of Credal Networks: A Unified Language for Uncertain Modeling with Sets of Bayesian Networks

Credal networks are models that extend Bayesian nets to deal with imprecision in probability, and can actually be regarded as sets of Bayesian nets. Credal nets appear to be powerful means to represent and deal with many important and challenging problems in uncertain reasoning. We give examples to show that some of these problems can only be modeled by credal nets called non-separately specified. These, however, are still missing a graphical representation language and updating algorithms. The situation is quite the opposite with separately specified credal nets, which have been the subject of much study and algorithmic development. This paper gives two major contributions. First, it delivers a new graphical language to formulate any type of credal network, both separately and non-separately specified. Second, it shows that any non-separately specified net represented with the new language can be easily transformed into an equivalent separately specified net, defined over a larger domain. This result opens up a number of new outlooks and concrete outcomes: first of all, it immediately enables the existing algorithms for separately specified credal nets to be applied to non-separately specified ones. We explore this possibility for the 2U algorithm: an algorithm for exact updating of singly connected credal nets, which is extended by our results to a class of non-separately specified models. We also consider the problem of inference on Bayesian networks, when the reason that prevents some of the variables from being observed is unknown. The problem is first reformulated in the new graphical language, and then mapped into an equivalent problem on a separately specified net. This provides a first algorithmic approach to this kind of inference, which is also proved to be NP-hard by similar transformations based on our formalism.

PhD Thesis

Imprecise Probabilistic Graphical Models: Equivalent Representations, Inference Algorithms and Applications

Credal networks are probabilistic graphical models that extend Bayesian nets to deal with imprecision in probability, and can actually be regarded as sets of Bayesian nets. Credal nets appear to be powerful means to represent and deal with many important and challenging problems in uncertain reasoning. The counterpart of having more freedom in the modeling phase is an increased inferential complexity of inferences, e.g., the so-called belief updating becomes a hard task even on relatively simple topologies. In this thesis, I start my investigation on credal networks by considering equivalent representations of those models. More specifically, I first deliver a new graphical language, which is called decision-theoretic being inspired by the formalism of decision graphs, for a unified representation of credal networks of any kind. I also present another representation, which is called binarization, being in fact a reformulation of a credal network solely based on binary variables. Remarkably, I prove that if a credal net is first reformulated by its decision-theoretic representation and then by the corresponding binarization, the resulting representation is completely equivalent. An equivalence relation between Bayesian and credal nets, when the reason for the missingness of some of the variables in the Bayesian nets is unknown, is also provided. The developed equivalent representations are applied to inference problems. First, I show that, by a decision-theoretic formulation, the algorithms that have been already designed for credal networks, which are mostly referred to a specific class of models, called separately specified nets, can be generalized to credal networks of any kind. Similar formalisms are also employed to solve inference and classification problems with missing observations. I also present a state-ofthe- art updating algorithm which is based on the equivalent binary representation. This algorithm, called GL2U, offers an efficient procedure for approximate updating of general credal nets. The quality of the overall approximation is investigated by promising numerical experiments. As a further theoretical investigation, I consider a classification problem for Bayesian networks for which a hardness proof together with a fast algorithm for a subclass of models is provided. Finally, two real-world applications of credal networks are presented. First, I consider a military identification problem, consisting in the detection of the goal of an intruder entering a no-fly area. The problem, together with the necessary fusion of the information gathered by the sensors is mapped by our techniques into a credal network updating task. The solution is then obtained by the GL2U algorithm. The second application is an environmental model for hazard assessment of debris flows by credal networks. A credal network evaluates the level of risk, corresponding to the observed values of the triggering factors, for this specific natural hazard. For some factors, whose observations are more difficult, the corresponding soft evidential information is embedded by our formalism into the structure of the network. This model is employed for extensive numerical analysis on the Swiss territory.

Proceedings

Spatially Distributed Identification of Debris Flow Source Areas by Credal Networks

Andrea Salvetti, Alessandro Antonucci, and Marco Zaffalon

In iEMSs 2008: International Congress on Environmental Modelling and Software Integrating Sciences and Information Technology for Environmental Assessment and Decision Making (Transactions of the 4th Biennial Meeting of the International Environmental Modelling and Software Society), Apr 2008

Debris flows represent a very destructive natural hazard, affecting buildings, transport infrastructures, and, very often, causing human losses in mountain regions. That makes the identification of potential source areas of debris flows inside a watershed particularly important. In this paper we present a general identification procedure based on the credal network (that is an imprecise probabilistic graphical model generalizing Bayesian networks) originally introduced by Antonucci et al. (2004). That model is significantly improved by a more refined description of the meteorological and hydrological processes contributing to the debris flow initiation. As a counterpart of such improvement, the model pays a slight increase in terms of computational time for identifications. That does not prevent its extensive, spatially distributed, application to whole basins, thanks to a preliminary deterministic analysis that rejects local areas where the triggering of a debris flow cannot take place. The overall procedure is tested for a debris flow prone watershed in Southern Switzerland. The model detects the areas in the basin more prone to debris flow initiation and also shows that different rainfall return periods produce different patterns of hazard in the basin. That makes it possible with this procedure to determine the return period of the critical rainfall that triggers debris flow as a result of channel-bed failure in a specific point along the drainage network.

Proceedings

Generalized Loopy 2U: A New Algorithm for Approximate Inference in Credal Networks

Alessandro Antonucci, Marco Zaffalon, Sun Yi, and Cassio Polpo Campos

In PGM’08: Proceedings of the Fourth European Workshop on Probabilistic Graphical Models, Sep 2008

Credal nets generalize Bayesian nets by relaxing the requirement of precision of probabilities. Credal nets are considerably more expressive than Bayesian nets, but this makes belief updating NP-hard even on polytrees. We develop a new efficient algorithm for approximate belief updating in credal nets. The algorithm is based on an important representation result we prove for general credal nets: that any credal net can be equivalently reformulated as a credal net with binary variables; moreover, the transformation, which is considerably more complex than in the Bayesian case, can be implemented in polynomial time. The equivalent binary credal net is updated by L2U, a loopy approximate algorithm for binary credal nets. Thus, we generalize L2U to non-binary credal nets, obtaining an accurate and scalable algorithm for the general case, which is approximate only because of its loopy nature. The accuracy of the inferences is evaluated by empirical tests.

2007

Journal

Fast Algorithms for Robust Classification with Bayesian Nets

We focus on a well-known classification task with expert systems based on Bayesian networks: predicting the state of a target variable given an incomplete observation of the other variables in the network, i.e., an observation of a subset of all the possible variables. To provide conclusions robust to near-ignorance about the process that prevents some of the variables from being observed, it has recently been derived a new rule, called conservative updating. With this paper we address the problem to efficiently compute the conservative updating rule for robust classification with Bayesian networks. We show first that the general problem is NP-hard, thus establishing a fundamental limit to the possibility to do robust classification efficiently. Then we define a wide subclass of Bayesian networks that does admit efficient computation. We show this by developing a new classification algorithm for such a class, which extends substantially the limits of efficient computation with respect to the previously existing algorithm. The algorithm is formulated as a variable elimination procedure, whose computation time is linear in the input size.

Proceedings

Credal Networks for Operational Risk Measurement and Management

Alessandro Antonucci, Alberto Piatti, and Marco Zaffalon

In Proceedings of the 11th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems: KES2007, Lectures Notes in Computer Science, Sep 2007

According to widely accepted guidelines for self-regulation, the capital requirements of a bank should relate to the level of risk with respect to three different categories. Among them, operational risk is the more difficult to assess, as it requires merging expert judgments and quantitative information about the functional structure of the bank. A number of approaches to the evaluation of operational risk based on Bayesian networks have been recently considered. In this paper, we propose credal networks, which are a generalization of Bayesian networks to imprecise probabilities, as a more appropriate framework for the measurement and management of operational risk. The reason is the higher flexibility provided by credal networks compared to Bayesian networks in the quantification of the probabilities underlying the model: this makes it possible to represent human expertise required for these evaluations in a credible and robust way. We use a real-world application to demonstrate these features and to show how to measure operational risk by means of algorithms for inference over credal nets. This is shown to be possible, also in the case when the observation of some factor is vague.

Proceedings

Credal Networks for Military Identification Problems

Alessandro Antonucci, Ralph Brühlmann, Alberto Piatti, and Marco Zaffalon

In ISIPTA ’07: Proceedings of the fifth International Symposium on on Imprecise Probability: Theories and Applications, Jul 2007

Credal networks are imprecise probabilistic graphical models generalizing Bayesian networks to convex sets of probability mass functions. This makes credal networks particularly suited to capture and model expert knowledge under very general conditions, including states of qualitative and incomplete knowledge. In this paper, we present a credal network for risk evaluation in case of intrusion of civil aircrafts into a no-fly zone. The different factors relevant for this evaluation, together with an independence structure over them, are initially identified. These factors are observed by sensors, whose reliabilities can be affected by variable external factors, and even by the behavior of the intruder. A model of these observation mechanisms, and the necessary fusion scheme for the information returned by the sensors measuring the same factor, are both completely embedded into the structure of the credal network. A pool of experts, facilitated in their task by specific techniques to convert qualitative judgments into imprecise probabilistic assessments, has made possible the quantification of the network. We show the capabilities of the proposed network by means of some preliminary tests referred to simulated scenarios. Overall, we can regard this application as an useful tool to support military experts in their decision, but also as a quite general imprecise-probability paradigm for information fusion.

Chapter

Credal Networks for Hazard Assessment of Debris Flows

Alessandro Antonucci, Andrea Salvetti, and Marco Zaffalon

We establish an intimate connection between Bayesian and credal nets. Bayesian nets are precise graphical models, credal nets extend Bayesian nets to imprecise probability. We focus on traditional belief updating with credal nets, and on the kind of belief updating that arises with Bayesian nets when the reason for the missingness of some of the unobserved variables in the net is unknown. We show that the two updating problems are formally the same.

Proceedings

Binarization Algorithms for Approximate Updating in Credal Nets

Alessandro Antonucci, Marco Zaffalon, Jaime S. Ide, and Fabio Gagliardi Cozman

In STAIRS’06: Proceedings of the third European Starting AI Researcher Symposium, Aug 2006

Credal networks generalize Bayesian networks relaxing numerical parameters. This considerably expands expressivity, but makes belief updating a hard task even on polytrees. Nevertheless, if all the variables are binary, polytree-shaped credal networks can be efficiently updated by the 2U algorithm. In this paper we present a binarization algorithm, that makes it possible to approximate an updating problem in a credal net by a corresponding problem in a credal net over binary variables. The procedure leads to outer bounds for the original problem. The binarized nets are in general multiply connected, but can be updated by the loopy variant of 2U. The quality of the overall approximation is investigated by promising numerical experiments.

Credal networks are models that extend Bayesian nets to deal with imprecision in probability, and can actually be regarded as sets of Bayesian nets. Evidence suggests that credal nets are a powerful means to represent and deal with many important and challenging problems in uncertain reasoning. We give examples to show that some of these problems can only be modelled by credal nets called non-separately specified. These, however, are still missing a graphical representation language and solution algorithms. The situation is quite the opposite with separately specified credal nets, which have been the subject of much study and algorithmic development. This paper gives two major contributions. First, it delivers a new graphical language to formulate any type of credal network, both separately and non-separately specified. Second, it shows that any non-separately specified net represented with the new language can be easily transformed into an equivalent separately specified net, defined over a larger domain. This result opens up a number of new perspectives and concrete outcomes: first of all, it immediately enables the existing algorithms for separately specified credal nets to be applied to non-separately specified ones.

Proceedings

Fast Algorithms for Robust Classification with Bayesian Nets

We focus on a well-known classification task with expert systems based on Bayesian networks: predicting the state of a target variable given an incomplete observation of the other variables in the network, i.e., an observation of a subset of all the possible variables. To provide conclusions robust to near-ignorance about the process that prevents some of the variables from being observed, it has recently been derived a new rule, called conservative updating. With this paper we address the problem to efficiently compute the conservative updating rule for robust classification with Bayesian networks. We show first that the general problem is NP-hard, thus establishing a fundamental limit to the possibility to do robust classification efficiently. Then we define a wide subclass of Bayesian networks that does admit efficient computation. We show this by developing a new classification algorithm for such a class, which extends substantially the limits of efficient computation with respect to the previously existing algorithm.

2004

Proceedings

Assessing Debris Flow Hazard by Credal Nets

Alessandro Antonucci, Andrea Salvetti, and Marco Zaffalon

In Proceedings of the Second International Conference on Soft Methods in Probability and Statistics (SMPS-2004) - Soft Methodology and Random Information Systems, Sep 2004

Hazard Assessment of Debris Flows by Credal Networks

Alessandro Antonucci, Andrea Salvetti, and Marco Zaffalon

In iEMSs 2004: Complexity and Integrated Resources Management, Transactions of the 2nd Biennial Meeting of the International Environmental Modelling and Software Society, Jun 2004

Debris flows are destructive natural hazards that affect human life, buildings, and infrastructures. Despite their importance, debris flows are only partially understood, and human expertise still plays a key role for hazard identification. This paper proposes filling the modelling gap by using credal networks, an imprecise-probability model. The model uses a directed graph to capture the causal relationships between the triggering factors of debris flows. Quantitative influences are represented by probability intervals, determined from historical data, expert knowledge, and theoretical models. Most importantly, the model joins the empirical and the quantitative modelling levels, in the direction of more credible inferences. The model is evaluated on real case studies related to dangerous areas of the Ticino Canton, southern Switzerland. The case studies highlight the good capabilities of the model: for all the areas the model produces significant probabilities of hazard.

edited books and journal issues

2021

Journal

Uncertain reasoning

Alessandro Antonucci, Salem Benferhat, and Kamal Premaratne

2021

Annals of Mathematics and Artificial Intelligence (Springer), Volume 89, issue 10-11

Symbolic and Quantitative Approaches to Reasoning with Uncertainty - 14th European Conference, ECSQARU 2017, Lugano, Switzerland, July
10-14, 2017, Proceedings

Alessandro Antonucci, Laurence Cholvy, and Odile Papini

2017

Lecture Notes in Computer Science 10369 (Springer)