## Program

#### Thursday November 20th, 2014

##### 9h00 Registration

##### 9h15 Welcoming address

##### 9h30 Session 1

**Invited speaker**:

- "*Preference Learning: Machine Learning meets MCDA*", Eyke Hüllermeier, Department of Computer Science, Universität Paderborn, Germany

The topic of "preferences" has recently attracted considerable attention in artificial intelligence in general and machine learning in particular, where preference learning has emerged as a new, interdisciplinary research field with close connections to related areas such as operations research, social choice and decision theory. Roughly speaking, preference learning is about methods for learning preference models from explicit or implicit preference information, which are typically used for predicting the preferences of an individual or a group of individuals. Approaches relevant to this area range from learning special types of preference models, such as lexicographic orders, through "learning to rank" for information retrieval, to collaborative filtering techniques for recommender systems. The primary goal of this tutorial is to provide a brief introduction to the field of preference learning and, moreover, to elaborate on its connection to multiple criteria decision aid.

##### 10h30 Session 2

- "*On the use of copulas to simulate multicriteria data*", Jairo Cugliari, Antoine Rolland, Thi-Minh-Thuy Tran (ERIC, Université Lyon 2)

Several methods have been proposed in the past decades to deal with Multicriteria Decision Aiding (MCDA) problems. However, comparing these methods is always arduous, as very few datasets have been proposed in the literature. One limitation of the existing datasets is that MCDA methods generally deal with very small sets of data: typically, an MCDA problem involves no more than 20 or 30 alternatives, and often fewer. It would therefore be useful to simulate new data based on an existing dataset, i.e. taking into account the potential links that may exist between the criteria. We introduce in this paper the use of the statistical functions known as copulas to simulate such data. A practical way to use copulas is proposed, and the quality of the obtained data is discussed.
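
The copula recipe sketched in the abstract can be illustrated in a few lines. The following is a minimal Gaussian-copula sketch (our own illustration, not the authors' code): the dependence between criteria is summarized by a correlation matrix, correlated normal scores are pushed through the normal CDF to obtain copula samples, and each margin is inverted empirically so the simulated alternatives stay within each criterion's observed range.

```python
import numpy as np
from math import erf, sqrt

def simulate_from_copula(data, n_new, seed=0):
    """Simulate new alternatives whose criterion-wise distributions and
    inter-criteria dependence mimic a small existing MCDA dataset.
    Illustrative Gaussian-copula sketch; all names are ours."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # dependence between criteria, summarized by a correlation matrix
    corr = np.corrcoef(data, rowvar=False)
    # sample correlated normal scores, then map them to (0, 1) with the
    # normal CDF: this is exactly a draw from the Gaussian copula
    z = rng.multivariate_normal(np.zeros(data.shape[1]), corr, size=n_new)
    u = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))
    # invert each empirical margin so every criterion keeps its own range
    return np.column_stack(
        [np.quantile(data[:, j], u[:, j]) for j in range(data.shape[1])]
    )
```

Richer copula families (e.g. Archimedean) would follow the same two-step pattern: sample dependent uniforms, then invert the margins.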

- "*Data Generation Techniques for Label Ranking*", Massimo Gurrieri, Philippe Fortemps, Xavier Siebert, Marc Pirlot, Nabil Aït-Taleb (MATHRO, UMONS)

In light of the lack of benchmark data for label ranking, experiments are typically performed on data sets derived from classification or regression data sets. The generation of artificial datasets is however not trivial, since instances have to be associated with rankings over a finite set of labels, and attributes (i.e. the feature vector) have to be linked (correlated) with such rankings. This paper discusses and proposes dataset generation techniques in order to provide artificial datasets suitable for label ranking.

##### 11h30 Coffee Break

##### 12h00 Session 3

**Invited speaker**:

- "*Boolean functions for classification: logical analysis of data*", Yves Crama, University of Liège, Belgium

Boolean functions are among the simplest and most fundamental objects investigated in mathematics. In spite, or because of their simplicity, they find applications in many scientific fields, including logic, combinatorics, operations research, artificial intelligence, computer science, game theory, engineering, and so forth.

In this talk, we present a collection of Boolean models that have been developed over the last 25 years under the name of "Logical Analysis of Data" (or LAD) in order to handle a large variety of classification problems. We focus on the frequent situation where a decision-maker has observed a number of data points (say, vectors of binary attributes) which have been classified either as "positive" or as "negative" examples of a phenomenon under study. The task of the decision-maker is then to develop a classification system that allows her to assign one of the "positive" or "negative" qualifiers to any point that may be presented to her in the future, in a way that remains consistent with the initial observations.

We first recall useful facts about partially defined Boolean functions and their extensions, and we introduce the main concepts and definitions used in the LAD framework: support (or "sufficient") sets of attributes, patterns (or "elementary classification rules"), theories (obtained by combining patterns), etc. We show how these building blocks can be used to develop simple interpretable classifiers that perform and generalize well in a variety of experimental situations. Moreover, we argue that these classifiers satisfy some minimal requirements for "justifiability". Finally, we clarify the relation between the LAD classifiers and certain popular classifiers used in the machine learning literature, such as those computed by nearest neighbor classification algorithms or decision trees.

##### 13h00 Lunch

##### 14h20 Group photo

##### 14h30 Session 4

**Invited speaker**:

- "*Learning and identifying monotone Boolean functions*", Endre Boros, Rutgers University, NJ, USA

Numerous applications require the task of learning and/or identifying a hidden monotone Boolean function. In this talk, we first review several learning models and clarify the corresponding learning complexity when the hidden function is known to be monotone. The considered models include extending a given partially defined Boolean function, or one with missing bits, within a specified class of monotone Boolean functions, and learning a certain type of monotone function using membership queries.

In the second part of the talk we consider identification problems, a special case/extension (depending on how one views it) of learning by membership queries. Identifying a monotone function means generating all of its minimal true (resp. maximal false) points. This problem is strongly related to Boolean dualization, or equivalently to finding all minimal transversals of a hypergraph. We survey some of the related results and provide a sample of the standard algorithmic techniques.
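
As a toy illustration of identification by membership queries (our sketch, not from the talk): for a monotone function, a single greedy pass over the coordinates finds one minimal true point using n queries.

```python
def minimal_true_point(f, n):
    """Given membership-query access to a monotone Boolean function f on
    n variables, with f(all-ones) assumed true, greedily zero out
    coordinates to reach one minimal true point in n queries.
    Illustrative sketch; full identification would enumerate all such
    points, which relates to hypergraph dualization."""
    x = [1] * n
    for i in range(n):
        x[i] = 0
        if not f(x):        # monotonicity: once f turns false, restore the bit
            x[i] = 1
    return tuple(x)
```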

##### 15h30 Coffee break

##### 16h00 Session 5

- "*Learning the parameters of a majority rule sorting model taking attribute interactions into account*", Olivier Sobrie^{1,2}, Vincent Mousseau^{1} and Marc Pirlot^{2} (^{1}LGI, Ecole Centrale Paris; ^{2}MATHRO, UMONS)

We consider a multicriteria sorting procedure based on a majority rule, called MR-Sort. This procedure sorts each object of a set, evaluated on multiple criteria, into a category selected among a set of pre-defined and ordered categories. With MR-Sort, the ordered categories are separated by profiles, which are vectors of performances on the different attributes. An object is assigned to a category if it is at least as good as the category's lower profile and not better than its upper profile. To determine whether an object is as good as a profile, the weights of the criteria on which the object performs at least as well as the profile are summed up and compared to a threshold. To improve the expressiveness of the model, we modify it by introducing capacities to quantify the power of the coalitions. We describe a mixed integer program and a metaheuristic that make it possible to learn the parameters of this model from assignment examples, and we test the metaheuristic on real datasets.
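
The MR-Sort assignment rule described above is simple enough to sketch directly. The code below is an illustrative rendering of the plain additive variant (names and data are ours; the paper's capacitive variant would replace the weighted sum with a capacity over coalitions):

```python
def mr_sort(performance, profiles, weights, threshold):
    """Assign an alternative to one of k+1 ordered categories (0 = worst),
    given k profiles ordered from lowest to highest. The alternative
    outranks a profile when the weights of the criteria on which it is
    at least as good as the profile reach the majority threshold."""
    category = 0
    for profile in profiles:
        support = sum(w for perf, ref, w in zip(performance, profile, weights)
                      if perf >= ref)
        if support >= threshold:
            category += 1   # clears this lower profile, move up a category
        else:
            break
    return category
```

For example, with integer weights (3, 3, 4), threshold 6 and profiles at levels 10 and 15, an alternative (12, 16, 9) musters support 6 against the first profile but only 3 against the second, landing in the middle category.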

- "*Conjoint axiomatization of the Choquet integral for two-dimensional heterogeneous product sets*", Mikhail Timonin (Queen Mary University of London)

We propose an axiomatization of the Choquet integral model for the general case of a heterogeneous product set X = X_{1} × X_{2}. Previous axiomatizations of the Choquet integral have been given for the particular cases X = Y^{n} and X = R^{n}. The major difference of this paper from the earlier axiomatizations is that the notion of "comonotonicity" cannot be used in the heterogeneous structure, as there does not exist a "built-in" order between elements of the sets X_{1} and X_{2}. However, such an order is implied by the representation. Our characterization does not assume commensurateness of criteria a priori. We construct the representation and study its uniqueness properties.

- "*Utilitaristic Choquistic Regression*", Ali Fallah Tehrani^{1}, Christophe Labreuche^{2}, Eyke Hüllermeier^{1} (^{1}Department of Mathematics and Computer Science, University of Marburg; ^{2}Thales Research & Technology)

Traditionally in machine learning, the attributes are normalized (standardized) a priori, and their normalization is not part of the learning process. Taking inspiration from multi-criteria decision aid, we investigate in this paper the interest of also learning the utility function. More specifically, we extend two classification methods, namely logistic regression and Choquistic regression, to learn both the normalization and the aggregation of the criteria. Some preliminary results are presented in this paper.

- "*About the French hospitals rankings: a MCDA point of view*", Brice Mayag (LAMSADE, Université Paris Dauphine)

The aim of this paper is to convince the MultiCriteria Decision Aid (MCDA) and Preference Learning communities to investigate and contribute to the development of methodologies dedicated to hospital ranking. To do so, we present the French hospital rankings and show how such rankings can be built properly through two existing methods: decision trees and ELECTRE Tri.

##### 19h00 Workshop Banquet

- at Restaurant "*Le Berny*", 127 Avenue Aristide Briand, 92160 Antony, tel.: 01 42 37 72 40

#### Friday November 21, 2014

##### 9h00 Session 6

**Invited speaker**:

- "*Scaling Optimization Methods for Data-driven Marketing*", Craig Boutilier, University of Toronto, Canada

The emergence of large-scale, data-driven analytics has greatly improved the ability to predict the behavior of, and the effect of marketing actions on, individual consumers. Indeed, the potential for fully personalized "marketing conversations" is very real. Unfortunately, advances in predictive analytics have significantly outpaced the ability of current decision support tools and optimization algorithms, precisely the tools needed to transform these insights into marketing plans, policies and strategies. This is especially true in large marketing organizations, where large numbers of campaigns, business objectives, product groups, etc. place competing demands on marketing resources, the most important of which is customer attention.

In this talk, I will describe a new approach, called dynamic segmentation, for solving large-scale marketing optimization problems. We formulate the problem as a generalized assignment problem (or other mathematical program) and create aggregate segmentations based on both (statistical) predictive models and campaign-specific and organizational objectives. The resulting compression allows problems involving hundreds of campaigns and millions of customers to be solved optimally in tens of milliseconds. I'll briefly describe how the data-intensive components of the algorithm can be distributed to take advantage of modern cluster-computing frameworks. I will also discuss how the platform supports real-time scenario analysis and re-optimization, allowing decision makers to explore tradeoffs across multiple objectives in real-time.

Time permitting, I'll hint at how the technique might be extended to solve sequential, stochastic problems formulated as Markov decision processes, and briefly mention other potential applications of this class of techniques.

##### 10h00 Session 7

- "*Factorization of large tournaments for the median linear order problem*", Alain Guénoche (Institut de Mathématiques de Marseille, I2M, CNRS)

Computing a median linear order for a given set of linear orders on n elements is an ordinary task in preference aggregation. The problem is formalized by a tournament (complete directed graph) with n vertices, whose arcs correspond to majority preferences. Building a median linear order amounts to making the tournament transitive with a minimum number of arc-reversal operations; this minimum defines the remoteness of any median linear order to the tournament. A minimum series of arc reversals is usually computed with a Branch & Bound algorithm, which cannot be applied when n exceeds a few tens. In this text we decompose a large tournament (n > 100) into sub-tournaments and assemble the median orders of each into a linear order on n elements. We show, through several simulations on random tournaments, weighted or unweighted, that this decomposition strategy is efficient.
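
The notions of remoteness and median order can be made concrete as follows (an illustrative sketch in our own notation; exhaustive search is only viable for tiny n, which is exactly the motivation for the paper's decomposition strategy):

```python
import itertools

def remoteness(order, beats):
    """Arcs of the majority tournament violated by a candidate linear
    order, i.e. the number of arc reversals it requires. The tournament
    is given as a dict mapping each vertex to the set it beats."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(1 for a in beats for b in beats[a] if pos[a] > pos[b])

def median_order(vertices, beats):
    """Exact median linear order by exhaustive search over permutations.
    O(n!) — only for tiny n; Branch & Bound or decomposition is needed
    beyond that."""
    return min(itertools.permutations(vertices),
               key=lambda order: remoteness(order, beats))
```

On a 3-cycle (a beats b, b beats c, c beats a) every linear order leaves exactly one arc reversed, so the minimum remoteness is 1; on a transitive tournament it is 0.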

- "*Listing the families of Sufficient Coalitions of criteria involved in Sorting procedures*", Eda Ersek Uyanık^{1}, Olivier Sobrie^{1,2}, Vincent Mousseau^{2} and Marc Pirlot^{1} (^{1}MATHRO, UMONS; ^{2}LGI, Ecole Centrale Paris)

Certain sorting procedures derived from ELECTRE TRI, such as MR-Sort or the Non-Compensatory Sorting (NCS) model, rely on a rule of the type: if an object is better than a profile on a "sufficient coalition" of criteria, this object is assigned to a category above this profile. In some cases the strength of a coalition can be numerically represented by the sum of weights attached to the criteria, and a coalition is sufficient if its strength passes some threshold. This is the type of rule used in the MR-Sort method. In more general models, such as Capacitive-MR-Sort or the NCS model, criteria are allowed to interact, and a capacity is needed to model the strength of a coalition. In this contribution, we investigate the gap in expressiveness between the two models. To this end, we explicitly generate the list of all possible families of sufficient coalitions for up to 6 criteria, and we categorize them according to the degree of additivity of a capacity that can model their strength. Our goal is twofold: being able to draw a sorting rule at random, and having examples at hand to support a theoretical investigation of the families of sufficient coalitions.

##### 11h00 Coffee break

##### 11h30 Session 8

**Invited speaker**:

- "*Surrogate loss functions for preference learning*", Krzysztof Dembczynski, Poznan University of Technology, Poland

In preference learning we use a variety of different performance measures to train and test prediction models. The most popular measures are pairwise disagreement (also referred to as rank loss), discounted cumulative gain, average precision, and expected reciprocal rank. Unfortunately, these measures are usually neither convex nor differentiable, so their optimization becomes a hard computational problem. However, instead of optimizing them directly we can reformulate the problem and use surrogate or proxy loss functions which are easier to minimize. A natural question arises whether optimization of a surrogate loss provides a near-optimal solution for a given performance measure. For some of the performance measures the answer is positive, but in the general case the answer is rather negative. During the tutorial we will discuss several results obtained so far.
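
A minimal example of the surrogate idea (our illustration, not from the tutorial): the pairwise disagreement of a scoring function is a 0/1 count, neither convex nor differentiable, while a hinge on the score margins is a convex upper bound on it that standard solvers can minimize.

```python
def pairwise_disagreement(scores, prefs):
    """0/1 rank loss: number of preferred pairs (i, j), i preferred to j,
    that the scores order the wrong way."""
    return sum(1 for i, j in prefs if scores[i] <= scores[j])

def hinge_surrogate(scores, prefs):
    """Convex surrogate: hinge on the score margin of each preferred pair.
    Whenever a pair is misordered its margin is <= 0, so the hinge term is
    >= 1; hence the surrogate upper-bounds the 0/1 rank loss."""
    return sum(max(0.0, 1.0 - (scores[i] - scores[j])) for i, j in prefs)
```

Whether driving such a surrogate to its minimum also yields a near-optimal value of the target measure is exactly the consistency question the tutorial addresses.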

##### 12h30 Lunch

##### 13h20 Poster Session

**Posters**:

- "*A Metaheuristic Approach for Preference Learning in Multi-Criteria Ranking based on Reference Points*", Jinyan Liu, Wassila Ouerdane, Vincent Mousseau

In this paper, we are interested in an aggregation method called the multi-criteria ranking method based on Reference Points (RMP). Briefly, instead of comparing alternatives pairwise with one another, pairs of alternatives are judged with respect to reference points. The introduction of such points facilitates the comparison of any two alternatives between which a dominance relationship does not necessarily exist. However, little attention has been paid to how the parameters of this kind of model can be learned. To tackle this problem, we propose a preference learning methodology for the RMP method, aimed at learning its parameters when decision makers provide a large set of data or information. Specifically, we provide an algorithm that combines an evolutionary approach with linear programming. Experimental tests and analysis are also presented.

- "*Inferring the parameters of a majority rule sorting model with vetoes on large datasets*", Alexandru-Liviu Olteanu, Patrick Meyer

The article is centred on the problem of inferring the parameters of a majority rule sorting model when large sets of assignment examples are considered. Besides proposing an approach for solving this problem, the main focus of the paper lies in the inclusion of veto thresholds in the majority rule model, which, as we illustrate, increases the expressiveness of the model. However, due to its complexity, an exact approach for inferring its parameters is not practical, especially when large datasets are considered. We therefore propose a metaheuristic approach to overcome this difficulty. The approach is validated on a set of constructed benchmarks as well as on several datasets containing real data.

- "*A Dataset Repository for Benchmark in MCDA*", Antoine Rolland and Thi-Minh-Thuy Tran

Several methods have been proposed in the past decades to deal with Multicriteria Decision Aiding (MCDA) problems. However, comparing these methods is always arduous, as there is no benchmark in this domain. At the same time, people proposing new MCDA methods have no standardized data with which to demonstrate the interest of their methods. We propose the creation of a web MCDA DataSet Repository to address this lack of data, and we present the repository in detail in this paper.

- "*An Arrow-like theorem over median algebras*", Miguel Couceiro and Bruno Teheux

We present an Arrow-like theorem for aggregation functions over conservative median algebras. In doing so, we give a characterization of conservative median algebras by means of forbidden substructures and by providing their representation as chains.

- "*User Experience Driven Design of MCDA Problems with DecisionCloud*", Michel Zam, Brice Mayag, Meltem Ozturk

The incremental transformation of stakeholders' decision problems into robust models remains a challenging and complex task that needs better tools. Realistic user experience gives the most valuable input but usually requires several life-cycles. This takes too long, costs too much, and lets precious ideas die. Sketching tools are too superficial, formal modeling tools are too cryptic, and development tools are not productive enough.

We address the evolution vs. consistency challenge and provide an agile solution approach spanning the whole collaborative modeling process of multicriteria decision problems, including sketching, modeling, and interacting with running apps. DecisionCloud is an MCDA extension of the MyDraft platform. Beyond declaring criteria, alternatives, constraints, and evaluations, and running classical decision problems, DecisionCloud provides features such as domain modeling and instant GUI prototyping. The whole evolutionary process runs in the cloud and is fully traced. Users, designers, and coders, if any, collaborate consistently using only their web browsers and grow their decision models directly in the cloud.

##### 14h00 Session 9

**Invited speaker**:

- "*Preference modeling with Choquet integral*", Michel Grabisch, University Paris 1, France

In this talk, we show how capacities and the Choquet integral emerge as natural ingredients when building a multicriteria decision model, especially when the criteria cannot be considered independent. To tame the complexity of the model, we provide efficient submodels based on k-additive capacities, which are naturally connected with the interaction indices quantifying the interaction among criteria within a group of criteria. The case of 2-additive capacities seems to be of particular interest, since it leads to a model which is a convex combination of an additive model and of max and min terms over pairs of criteria. Lastly, we address the issue of identifying the model from learning data and preferences.
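
The Choquet integral itself is short to state in code. Below is an illustrative sketch (our own, with the capacity given in extension as a dict over coalitions; the talk's k-additive submodels would parametrize it far more compactly):

```python
def choquet(x, capacity):
    """Choquet integral of a nonnegative score vector x with respect to a
    capacity, given as a dict mapping frozensets of criterion indices to
    [0, 1], with capacity of the full set equal to 1. Sorts the scores and
    accumulates each increment weighted by the capacity of the coalition
    of criteria scoring at least that much."""
    order = sorted(range(len(x)), key=lambda i: x[i])   # ascending scores
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        coalition = frozenset(order[k:])   # criteria scoring at least x[i]
        total += (x[i] - prev) * capacity[coalition]
        prev = x[i]
    return total
```

With an additive capacity the integral collapses to a weighted sum, and with a capacity that is 1 only on the full set it collapses to the minimum, which is exactly the kind of behavior a convex combination of additive, min and max terms interpolates.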

##### 15h00 Coffee break

##### 15h30 Session 10

- "*Principled Techniques for Distance-based Clustering of Rankings*", Paolo Viappiani (LIP6, Université Pierre et Marie Curie)

We consider the problem of clustering rank data, focusing on distance-based methods. Two main steps need to be performed: aggregating rankings of the same cluster into a representative ranking (the cluster’s centroid) and assigning each ranking to its closest centroid according to some distance measure. A principled way is to specify a distance measure for rankings and then perform rank aggregation by explicitly minimizing this distance. But if we want to aggregate rankings in a specific way, perhaps using a scoring rule giving more importance to the first positions, which distance measure should we use?

Motivated by the (known) observation that the aggregated ranking minimizing the sum of the Spearman distance with a set of input rankings can be computed efficiently with the Borda rule, we build a taxonomy of aggregation measures and corresponding distance measures; in particular we consider extensions of Spearman that can give different weights to items and positions.
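
That observation is easy to check on a toy instance (our sketch, in our own notation): Borda scores each item by the number of items ranked below it, and the resulting order attains the minimum summed Spearman distance to the inputs, as brute force confirms.

```python
import itertools

def spearman(r1, r2):
    """Spearman distance: sum of squared position differences between two
    rankings, each given as a sequence of items, best first."""
    p1 = {v: i for i, v in enumerate(r1)}
    p2 = {v: i for i, v in enumerate(r2)}
    return sum((p1[v] - p2[v]) ** 2 for v in p1)

def borda(rankings):
    """Borda aggregation: an item earns, in each input ranking, one point
    per item ranked below it; items are then sorted by total score."""
    items = list(rankings[0])
    n = len(items)
    score = {v: 0 for v in items}
    for r in rankings:
        for i, v in enumerate(r):
            score[v] += n - 1 - i
    return tuple(sorted(items, key=lambda v: -score[v]))
```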

- "*An interactive approach for multiple criteria selection problem*", Anıl Kaya^{1}, Özgür Özpeynirci^{1}, Selin Özpeynirci^{2} (^{1}Izmir University of Economics, Department of Logistics Management; ^{2}Izmir University of Economics, Industrial Engineering Department)

In this study, we develop an interactive algorithm for the multiple criteria selection problem that aims to find the most preferred alternative among a set of known alternatives evaluated on multiple criteria. We assume the decision maker (DM) has a quasiconcave value function that represents his/her preferences. The interactive algorithm selects the pairs of alternatives to be asked to the DM based on the estimated likelihood that an alternative is preferred to another one. After the DM selects the preferred alternative, a convex cone is generated based on this preference information and the alternatives dominated by the cone are eliminated. Then, the algorithm updates the likelihood information for the unselected pairwise questions. We present the algorithm on an illustrative example problem.

- "*FlowSort parameters elicitation: the case of partial sorting*", Dimitri Van Assche, Yves De Smet (CoDE, Université libre de Bruxelles)

We consider the context of partial sorting. We address the problem of finding the parameters of the FlowSort method using an existing categorization. This contribution constitutes an extension of a method we have developed in the context of complete sorting. It relies on the use of a dedicated Genetic Algorithm based on variations of search parameters. We show how to manage the problem of correct categorization prediction, which is more difficult, since ranges of categories are considered. The method is tested on three different datasets for which a partial sorting has been generated with a particular instantiation of FlowSort.

- "*On confident outrankings with multiple criteria of uncertain significance*", Raymond Bisdorff (Université du Luxembourg)

We develop Monte Carlo simulation techniques for taking into account uncertain criteria significance weights and ensuring an a priori level of confidence of the Condorcet outranking digraph, depending on the decision maker. Those outranking situations that cannot be ensured at a required level of confidence are assumed to be indeterminate. This approach allows us to associate a given confidence degree to the decision aiding artifacts computed from a bipolarly-valued outranking, which accounts for the essential and unavoidable uncertainty of numerical criteria weights.
