Title: Chaotic Behavior: From Self-Similarity To Prediction
Speaker: Dr. Michael Siek, Delft University of Technology and Unesco-Ihe
When and Where:
Date: April 18, 2012
Room: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10, room 0.015
Chaos theory has become an important field of study in mathematics for understanding and modeling complex dynamical systems, with applications in many areas, including engineering, biology, economics, geophysics and psychology. Methods in chaos theory have been significantly enhanced by the considerable body of research that has emerged since Edward Lorenz's discovery in 1963 during his experiments with a simplified atmospheric model. He discovered the sensitivity to initial conditions, now known as the butterfly effect, that gave rise to chaos theory.
Behavior in a complex dynamical system appears irregular or unpredictable but is actually deterministic. Nowadays, a large number of dynamical systems have been discovered to behave in a chaotic manner, whereas previously these systems were believed to act randomly. Much of what one calls coincidence or chance is actually chaos, for example the rolling of a pair of dice. Dynamical systems characterized by deterministic chaos are predictable, and if one can control the chaos, with just a little bit of mathematics, a great deal becomes controllable.
Several case studies, such as storm surge dynamics in the North Sea, have been explored to reveal the predictability of chaotic models. The model exploits the dynamical neighbors (self-similarity) found in the higher-dimensional reconstructed phase space as the main control for prediction. Self-similarity is identified here as similar storm surge patterns or developments in the past. A number of additional features can be incorporated into a chaotic model to enhance its predictability, e.g. the use of a data assimilation technique or an ensemble of chaotic models. The modeling results show that the predictive chaotic model outperforms other machine learning techniques, such as neural network models. This demonstrates that modeling techniques based on chaos theory can serve as an efficient tool for accurate and reliable short- and mid-term predictions.
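The prediction principle described in this abstract can be sketched in a few lines: reconstruct a phase space from a scalar time series by delay embedding and predict the next value from the successors of the nearest dynamical neighbours. This is only a minimal illustration of the general idea, not the speaker's actual model; the function names and parameter choices (embedding dimension, delay, number of neighbours) are hypothetical.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Reconstruct a phase space from a scalar time series (delay embedding)."""
    n = len(series) - (dim - 1) * tau
    return np.array([series[i:i + (dim - 1) * tau + 1:tau] for i in range(n)])

def predict_next(series, dim=3, tau=1, k=3):
    """Predict the next value by averaging the successors of the k nearest
    dynamical neighbours of the current state in the reconstructed space."""
    emb = delay_embed(series, dim, tau)
    current = emb[-1]
    past = emb[:-1]                       # states whose successor is known
    dists = np.linalg.norm(past - current, axis=1)
    nearest = np.argsort(dists)[:k]
    # the successor of embedded state i is series[i + (dim-1)*tau + 1]
    successors = [series[i + (dim - 1) * tau + 1] for i in nearest]
    return float(np.mean(successors))
```

On a smooth signal with near-repeating patterns (a stand-in for "similar storm surge developments in the past"), the nearest-neighbour successor is already a good one-step predictor.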
Title: Teaching a Robot How to Perform New Tasks Through Demonstrations and Feedback
Speaker: Eduardo F. Morales,
Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Mexico
Time: Tuesday March 27, 13:30-14:30
Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10, room 0.015
Abstract: Service robots are rapidly expanding and will soon become part of our everyday life. To personalize service robots to users' needs, robots will have to learn new tasks according to the preferences of their users, who will have to teach them in natural and accessible ways.
In this talk, we will show how to teach a robot a new task using simple demonstrations and voice feedback. We consider three demonstration modes: (i) controlling the robot with a joystick, (ii) instructing the robot with voice, and (iii) executing the task while the robot watches with a Kinect sensor. Demonstrations provide initial guidance and focus the search space, and are followed by a reinforcement learning process to improve over the original sequences performed by the user. The low-level sensor information from the user's demonstration is transformed into a more abstract representation, allowing the system to learn general control policies applicable to different instances of the task. During the learning process, the user can provide on-line feedback in the form of actions or qualifiers of the robot's performance, which are translated into additional rewards that provide further guidance. The effectiveness of the proposed approach is shown in simulation and on real robots for simple navigation tasks and for a pick-and-place task.
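The learning loop in this abstract (a demonstration seeds the policy, reinforcement learning refines it, and user feedback becomes an extra reward) can be sketched on a toy one-dimensional navigation task. The corridor, reward values and feedback rule below are all invented for illustration and are not the system presented in the talk.

```python
import random

def q_learning(n_states=6, goal=5, episodes=200, alpha=0.5, gamma=0.9,
               demo=None, feedback_bonus=0.5, seed=0):
    """Tabular Q-learning on a 1-D corridor: a demonstration seeds the
    Q-table, and positive user feedback adds a small shaping reward."""
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(n_states) for a in (-1, 1)}
    if demo:  # initial guidance: bias Q towards the demonstrated actions
        for s, a in demo:
            Q[(s, a)] += 1.0
    for _ in range(episodes):
        s = 0
        while s != goal:
            # epsilon-greedy action selection
            a = max((-1, 1), key=lambda x: Q[(s, x)]) \
                if random.random() > 0.1 else random.choice((-1, 1))
            s2 = min(max(s + a, 0), n_states - 1)
            r = 10.0 if s2 == goal else -1.0
            if a == 1:          # the user says "good" when moving goalward
                r += feedback_bonus
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, -1)], Q[(s2, 1)])
                                  - Q[(s, a)])
            s = s2
    return Q
```

After learning, the greedy policy walks straight to the goal; the demonstration and the feedback bonus both accelerate this convergence.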
Title: Heuristic Search of Multiagent Influence Space
Speaker: Dr. Frans Oliehoek, Massachusetts Institute of Technology.
Date/Time: Thursday, 17 November, 2011, 11:00-12:00
Location: Maastricht University, Department of Knowledge Engineering, St. Servaasklooster 39, Swarmlab (third floor).
Multiagent planning under uncertainty has seen important progress in recent years. Two techniques, in particular, have substantially advanced the efficiency and scalability of multiagent planning. First, heuristic search gains traction by pruning large portions of the joint policy space. Second, influence-based abstraction reformulates the search space of joint policies into a smaller space of influences, which represent the probabilistic effects that agents' policies may exert on one another.
These techniques have been used independently, but never together, to solve larger problems (for Dec-POMDPs and their subclasses) than was previously possible.
In this talk, I will explain how we combine multiagent A* search and influence-based abstraction into a single algorithm. This enables an initial investigation into whether or not the two techniques bring complementary gains. Our empirical results indicate that, even with a relatively simple heuristic, A* can provide significant computational savings on top of those already afforded by influence-space search, thereby bringing a significant contribution to the field of multiagent planning under uncertainty.
Title: Is there enough biology in bioinformatics?
Speaker: David Morrison (Swedish University of Agricultural Sciences)
Date/Time: Thursday, May 12, 2011, 16:00-17:00
Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10, room 0.009
There is a long tradition of mathematicians and computer scientists collaborating with people in the life sciences (e.g., most medical research is a collaboration between medical practitioners and statisticians), but in the past 20 years this tradition has taken a quantitative leap with the development of bioinformatics, a nexus between mathematicians, computer scientists and biologists the like of which has not been seen before. The biological insights gained through bioinformatics are truly impressive, especially in molecular genetics. Nevertheless, it is pertinent to note that biologists are clearly the junior partner in this exercise, with most of the work being driven by computational interests (and restrictions) rather than the long-established requirements of biologists. In this talk I will present two examples from my own research experience in which key biological prerequisites have been overlooked by all of the parties concerned: multiple sequence alignment, and phylogenetic networks. This has led to the production of bioinformatic techniques that are inappropriate for some purposes, although the associated computer programs have been widely used (and therefore misused). It seems that biologists are not yet communicating clearly enough with the mathematicians and computer scientists (or vice versa), at least with regard to clearly recognizing and enunciating all the issues that may be of importance in developing computational methods.
- Slides (PDF, 1,14 MB)
Title: Fleet Intelligence: Industrial Asset Management meets Machine Learning
Speakers: Frank Kirschnick, Ph.D (CEO, Cassantec Ltd) and Katerina Stamou, M.Sc., (Solution Architect, Cassantec Ltd)
Date: Thursday, March 17, 2011, 16:00 - 17:00h
Operation of mission-critical industrial assets, such as turbines, generators, transformers, pumps, compressors, oscillators or pulverizers, is a challenging task. The commercial objectives of operators are jeopardized by the risk and costs of downtime and lost output, in a trade-off with preventive maintenance and insurance costs. To ensure safe and reliable operation of assets in the power and processing industries, operators rely on condition monitoring and diagnostic techniques. These techniques are primarily based on recorded asset condition and process data, such as vibration, lubricant, thermal, acoustic, ultrasonic, electrical, pressure, flow or speed parameters. With this data, diagnostic techniques allow inferring and reporting malfunctions before failure and damage occur. This may significantly reduce the risk and costs of unscheduled downtime. Typically, the earlier a malfunction warning is given, the higher the benefits for the asset operator. In many circumstances, the prognostic horizon of condition data is key to commercially optimal asset management. Yet malfunction prognostics require more disciplined condition and process data management, more sophisticated stochastic modelling and more computational intelligence than pure diagnostics. So far, most "predictive diagnostic" approaches rely on operator gut feel rather than exploiting the full prognostic horizon of the available data histories, often archived over several years. We present a new approach to intelligent malfunction prognostics, utilizing the prognostic horizon of the available data histories and exploiting complementary operator experience and manufacturer know-how.
Based on a novel combination of best-practice techniques from Operations Research, Artificial Intelligence and Data Mining, we show how to drive a fleet-wide, automated, distributed learning process that allows detecting and preventing mechanical and electrical malfunctions earlier and with higher accuracy than traditional condition monitoring and diagnostic systems. The presentation will be supported by an online demo of recent practical applications in the power industries. Background: www.cassantec.com
Title: Depth-First Proof-Number Search in the Presence of Repetitions
Speaker: Dr. Akihiro Kishimoto (Tokyo Institute of Technology)
Date: Wednesday, February 23, 2011, 16:00 - 17:00h
Depth-first proof-number search (df-pn) is a powerful AND/OR tree search algorithm proven to be effective in solving positions in games. However, df-pn has a notorious problem of infinite loops when applied to domains with repetitions. Df-pn(r) practically cures it by ignoring proof and disproof numbers that may lead to infinite loops. In this talk, I point out that df-pn(r) has a serious issue of underestimating proof and disproof numbers, while it also suffers from the overestimation problem that occurs in directed acyclic graphs. I present two practical solutions to these problems. While bypassing infinite loops, the threshold controlling algorithm (TCA) solves the underestimation problem by increasing the thresholds of df-pn. The source node detection algorithm (SNDA) detects the cause of overestimation and modifies the computation of proof and disproof numbers. Both TCA and SNDA are implemented on top of df-pn to solve tsume-shogi (checkmating problems in Japanese chess). Results show that df-pn with TCA and SNDA is far superior to df-pn(r). Our tsume-shogi solver is able to solve several difficult positions previously unsolved by any other solver.
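Proof and disproof numbers, the quantities that df-pn and its variants manipulate, can be illustrated on a plain AND/OR tree. The sketch below shows only the textbook bottom-up computation, ignoring repetitions, transpositions and the threshold mechanism that df-pn adds on top; the tree encoding is invented for illustration.

```python
INF = float('inf')

def proof_numbers(node):
    """Return the (proof, disproof) numbers of an AND/OR tree node.
    Leaves are 'win' (proved) or 'loss' (disproved); internal nodes are
    ('OR', children) or ('AND', children)."""
    if node == 'win':
        return 0, INF        # already proved, can never be disproved
    if node == 'loss':
        return INF, 0        # already disproved
    kind, children = node
    pns = [proof_numbers(c) for c in children]
    if kind == 'OR':         # prove one child; disprove all children
        return min(p for p, d in pns), sum(d for p, d in pns)
    else:                    # AND: prove all children; disprove one
        return sum(p for p, d in pns), min(d for p, d in pns)
```

Df-pn expands, depth-first, the child whose proof (or disproof) number is smallest, within thresholds; the infinite-loop and under/overestimation issues discussed in the talk arise when the graph below a node is not a tree.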
Title: Diffusion in Social Networks with Competing Products
Speaker: Prof. Dr. Krzysztof Apt (Universiteit van Amsterdam)
Date/Time: Wednesday, February 9, 2011, 16.00h - 17.00h
We study a threshold model in social networks, in which the nodes influenced by their neighbours can adopt one out of several alternatives. We provide polynomial time algorithms that allow us to determine whether in a given social network a specific alternative can, respectively has to, be adopted by all nodes. We also provide a polynomial time algorithm that allows us to determine whether a given social network yields a unique outcome. Finally, we exhibit a class of social networks that possess a unique outcome. Our algorithms are obtained by first providing a characterization of the class of graphs that result in possible (respectively necessary) adoption of a product by the whole network and a characterization of the networks that admit a unique outcome.
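A minimal single-product version of such a threshold model can be simulated directly: an undecided node adopts once a sufficient fraction of its neighbours have adopted, and iteration reaches a fixpoint. The talk's algorithms decide adoption questions in polynomial time via a graph characterization rather than by simulation, and they handle several competing alternatives; the graph, seed set and threshold below are hypothetical.

```python
def diffuse(adj, seeds, threshold=0.5):
    """Iterate a threshold model to a fixpoint: an undecided node adopts
    the product once at least `threshold` of its neighbours have adopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in adj.items():
            if v not in adopted and nbrs:
                if sum(n in adopted for n in nbrs) / len(nbrs) >= threshold:
                    adopted.add(v)
                    changed = True
    return adopted
```

On a path graph with threshold 1/2, a single seed cascades to the whole network; with threshold 1, adoption stalls immediately.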
Title: Enumeration and Exact Design of Weighted Voting Games
Speaker: Dr. Tomas Klos, Delft University of Technology
Date/Time: Wednesday 26th of January, 2011, 16.00h - 17.00h
In many multiagent settings, situations arise in which agents must collectively make decisions while not every agent is supposed to have an equal amount of influence on the outcome of such a decision. Weighted voting games are often used to deal with these situations. The amount of influence that an agent has in a weighted voting game can be measured by means of various power indices. This paper studies the problem of finding a weighted voting game in which the distribution of influence among the agents is as close as possible to a given target value. We propose a method to solve this problem exactly. The method relies on a new efficient procedure for enumerating weighted voting games with a fixed number of agents. The enumeration algorithm we propose works by exploiting the properties of a specific partial order over the class of weighted voting games. It enumerates weighted voting games with a fixed number of agents in time exponential in the number of agents, and polynomial in the number of games output. As a consequence we obtain an exact anytime algorithm for designing weighted voting games.
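The notion of measuring influence by a power index can be made concrete with the normalized Banzhaf index, computed here by plain enumeration of all coalitions. This is a textbook computation for illustration only, not the enumeration algorithm of the talk, which exploits a partial order over the class of games.

```python
from itertools import combinations

def banzhaf(weights, quota):
    """Normalized Banzhaf power index of each player in a weighted voting
    game: count, per player, the winning coalitions that would lose
    without that player (coalitions in which the player is critical)."""
    n = len(weights)
    counts = [0] * n
    for r in range(n + 1):
        for coalition in combinations(range(n), r):
            w = sum(weights[i] for i in coalition)
            if w >= quota:                      # winning coalition
                for i in coalition:
                    if w - weights[i] < quota:  # i is critical
                        counts[i] += 1
    swings = sum(counts)
    return [c / swings for c in counts] if swings else counts
```

Note the well-known mismatch between weight and influence that motivates the design problem: with weights (4, 2, 1) and quota 4, the first player is a dictator and the others have zero power despite positive weights.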
Title: Stochastic games with endogenous transitions
Speaker: Dr. Reinoud Joosten (University of Twente)
Date/Time: Wednesday, January 12, 2011, 16.00h - 17.00h
We introduce a stochastic game in which transition probabilities depend on the history of the play, i.e., the players' past action choices. To solve this new type of game under the limiting average reward criterion, we determine the set of jointly-convergent pure-strategy rewards which can be supported by equilibria involving threats. For motivational and expository purposes, we examine the following setting. Each period, two agents exploiting a fishery choose between catching with restraint or without. The fish stock is in either of two states, High or Low, and in the latter each action pair yields lower payoffs. Restraint is harmless to the fish, but it is a dominated strategy in each stage game. Absence of restraint damages the resource, i.e., the less restraint the agents show, the higher the probabilities that Low occurs at the next stage of the play. This state may even become 'absorbing', i.e., transitions to High become impossible.
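The fishery setting can be mimicked with a small simulation in which the transition probability to the Low state depends on the whole history of play, which is what makes the transitions endogenous. The payoff numbers and the particular transition rule below are made up for illustration and are not the payoffs of the talk.

```python
import random

def average_reward(restraint_prob, periods=5000, seed=0):
    """Two symmetric agents fish with restraint with probability
    `restraint_prob`; the chance of moving to the Low state grows with
    the fraction of past periods played without restraint."""
    random.seed(seed)
    state, greedy_past, total = 'High', 0, 0.0
    for t in range(1, periods + 1):
        greedy = sum(random.random() >= restraint_prob for _ in range(2))
        # stage payoff: High yields more than Low; no restraint pays extra now
        total += (4.0 if state == 'High' else 1.0) + greedy
        greedy_past += greedy
        # endogenous transition: P(Low) = fraction of past greedy choices
        p_low = greedy_past / (2.0 * t)
        state = 'Low' if random.random() < p_low else 'High'
    return total / periods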
Title: Single Player Monte-Carlo Tree Search
Speaker: Prof. Dr. Tristan Cazenave (Université Paris-Dauphine)
Date/Time: Wednesday, December 15, 2010, 11:15-12:00
Monte-Carlo Tree Search has recently been very successful for game playing particularly for games where the evaluation of a state is difficult to compute, such as Go or General Games. In this talk I compare Nested Monte-Carlo Search (NMC), Upper Confidence bounds for Trees (UCT-T), UCT with transposition tables (UCT+T) and a simple combination of NMC and UCT+T (MAX) on single-player games of the past GGP competitions. I show that transposition tables improve UCT and that MAX is the best of these four algorithms. Using UCT+T, the program Ary won the 2009 GGP competition. MAX and NMC are slight improvements over this 2009 version.
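A generic Nested Monte-Carlo Search can be written in a few lines: a level-0 search is a random playout, and a level-n search plays, at every step, the move whose level-(n-1) search gave the best final score. This is the basic published scheme, not the implementation inside Ary, and the toy three-digit game used below is invented for illustration.

```python
import random

def playout(state, legal, play, score):
    """Level 0: finish the game with uniformly random moves."""
    while legal(state):
        state = play(state, random.choice(legal(state)))
    return score(state), state

def nested(state, level, legal, play, score):
    """Level n: evaluate each legal move with a level-(n-1) search, play
    the best one, and remember the best terminal state seen anywhere."""
    if level == 0 or not legal(state):
        return playout(state, legal, play, score)
    best_val, best_state = float('-inf'), state
    while legal(state):
        results = [(nested(play(state, m), level - 1, legal, play, score), m)
                   for m in legal(state)]
        (val, final), move = max(results, key=lambda r: r[0][0])
        if val > best_val:
            best_val, best_state = val, final
        state = play(state, move)
    return best_val, best_state
```

For instance, on a game where one picks three digits from {0, 1, 2} and the score reads them as a three-digit number, a level-1 search reliably reaches the maximum 222, because earlier digits dominate any playout noise in later ones.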
Title: Decomposition of Constraint Systems: Three equivalent perspectives
Speaker: Prof. Dr. Cees Witteveen (TU Delft)
Date/Time: Wednesday, December 15, 2010, 10:30 - 11.15h
Abstract: Decomposition is a technique to split a problem into a number of parts such that some global property of the problem can be obtained or preserved by independent, distributed processing of the parts. Decomposition has been used in the database and constraint communities to allow distributed parties to find solutions for a set of constraints, or to modify a set of constraints independently from each other. We consider the complexity of finding suitable decompositions in constraint systems, distinguishing different decomposition aims. One such aim, especially popular in the artificial intelligence community, is finding solutions for constraint systems by distributed computation. Decomposition should then preserve solutions, that is, local solutions should be mergeable into a global solution. Another aim, often encountered in the database community, is to preserve consistency while allowing each party to add arbitrary local constraints to its local constraint store. We show that, contrary to intuition, these two modes do not differ from each other. Furthermore, we address some computational complexity results with respect to finding solutions for constraint systems by applying decomposition techniques.
Title: High Dimensional, Small Sample Size Data Classification Using Advanced Machine Learning Methods
Speaker: Dr. Oleg Okun
Date/Time: Tuesday, November 16, 2010, 11.00h - 12.00h
In this talk, I will present my latest research on one of the main machine learning tools, classifier ensembles, applied to challenging problems where the number of features far exceeds the number of available data instances. The main messages of the talk are a simple yet powerful ensemble generation technique and a low-bias, low-variance technique for assessing classification error. The practical application in which both were successfully demonstrated is cancer classification.
Title: A coalition formation value.
Speaker: Prof. Dr. Michel Grabisch (Sorbonne Economic Centre, Université Paris 1)
Date/Time: Tuesday, June 15 2010, 13.00h - 14.00h
Coalition formation is an important topic related to game theory, multi-agent systems and economics. Supposing that coalitions earn some benefit or pay some cost for using goods or services, a central problem is how to determine the allocation of the benefit or cost among all participating players/agents. Our talk consists of two parts, dealing with two particular situations. Let N denote the set of all players, of cardinality n.
The first part considers the case of games with externalities, best modelled as games in partition function form: it is assumed that the worth of a coalition depends on the organization of the society of the remaining players. Furthermore, we suppose that the starting point is the society of all individual players (no coalition is formed), and that step by step this society evolves until it attains the grand coalition (all players in a single coalition). We introduce the notions of scenario and process. A process is a sequence of partitions of the players, starting from the finest partition (the society of all individual players) and finishing at the coarsest partition, that is, the set of players itself; between two steps, there is one and only one merging of two blocks of the current partition. Each process gives rise to n scenarios, one per player, which are simply the process seen from the viewpoint of a given player. The original point of our approach is to define a value for scenarios, considered as elementary bricks, then from this a value for processes, and finally for the game itself. We propose two axiomatizations of the scenario value, reflecting well the idea of coalition formation.
In the second part, we do not consider externalities, but coalitions can be formed completely freely. That is, the sequence of coalitions is arbitrary and, in particular, need not be a chain. The model behind this is a Markovian approach to coalition formation, i.e., coalitions are viewed as states, and therefore from the transition matrix we can compute the probability of occurrence of each sequence of coalitions. Sequences with positive probability correspond to the scenarios of the first part. We define a value for scenarios, and the value of the game under a given Markov process is the expectation over all scenario values. We also give axiomatizations of this scenario value.
Title: Multi-Agent Learning and Repeated Matrix Games.
Speaker: Dr. Bruno Bouzy (LIPADE - UFR de mathematiques et d'informatique, Universite Paris Descartes)
Date/Time: Monday, May 10, at 16:00h
This talk presents work in the field of multi-agent learning (MAL) using repeated matrix games (RMG). In the first part, basic game theory concepts such as one-shot matrix games and correlated and Nash equilibria are presented. The second part lists the main approaches of MAL algorithms playing RMG. The third part describes experiments to evaluate MAL algorithms playing repeated matrix games. Previous work found that Q-learning surpassed Nash-based MAL algorithms. Our work shows that M-Qubed, S and bandit-based algorithms such as UCB are the best algorithms on general-sum games, Exp3 being the best on cooperative games and zero-sum games. Moreover, our experiments show that two features, forgetting the far past and using recent history as states, improve most of the algorithms. Finally, the best algorithms are Q-learning and UCB, enhanced with the two features, and M-Qubed.
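The bandit view of repeated matrix games evaluated in the talk can be sketched with UCB1 choosing the rows of a payoff matrix. The matrix and the uniformly random opponent below are illustrative assumptions; the talk's experiments pit learning algorithms against each other rather than against a fixed player.

```python
import math
import random

def ucb1(payoffs, rounds=2000, seed=1):
    """UCB1 playing the rows of a matrix game against an opponent who
    picks columns uniformly at random; payoffs[i][j] is the row player's
    payoff. Returns how often each row was played."""
    random.seed(seed)
    n = len(payoffs)
    counts, sums = [0] * n, [0.0] * n
    for t in range(1, rounds + 1):
        if t <= n:
            arm = t - 1                      # play each action once first
        else:
            arm = max(range(n), key=lambda i:
                      sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        col = random.randrange(len(payoffs[0]))
        counts[arm] += 1
        sums[arm] += payoffs[arm][col]
    return counts
```

Against a fixed opponent, UCB1 concentrates play on the dominant row while still paying the logarithmic exploration cost of the confidence bonus.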
Title: Computational Analysis of Nonlinear Hybrid Systems.
Speaker: Pieter Collins (Centrum voor Wiskunde en Informatica, Amsterdam)
Date/Time: Wednesday, April 28, at 16:00h
In this talk I will give an overview of the tool Ariadne for reachability analysis and verification of nonlinear and hybrid systems. This is especially an important problem for embedded systems, where software-driven devices need to interact with a physical environment. Among other issues, I will discuss the data structures for describing numbers, functions, sets and systems, the main computational sub-problems and the algorithms used for each one. I will also indicate how the capabilities of the tool might be used to study other questions about dynamical systems, with some examples from systems biology.
Title: Modularity in reinforcement learning using functional gradient boosting.
Speaker: Kurt Driessens (Department of Computer Science, K.U. Leuven)
Date/Time: Friday, April 9, at 14:00h
In this talk I will present non-parametric policy gradients and illustrate the opportunity they provide for learning and re-using modular policies in model-free reinforcement learning. Non-parametric policy gradients use functional gradient boosting to avoid the need for the fixed parameterization that is standard in typical policy gradient learning techniques. By representing the learned policy as a sum of arbitrary potential functions, it becomes possible to join multiple representation and generalization techniques into a single gradient-based policy search algorithm. In theory, this makes it simple to produce and re-use partial potential functions (and thus partial policies) that can be combined to address non-disjoint goals. I will introduce a test-bed proposal based on stunt-kite control which would allow one to test, verify and illustrate the applicability of this type of autonomous incremental learning and its possibly beneficial combination with learning from demonstrations.
Title: Towards a "Strategic" Web
Speaker: Dr. Michael Rovatsos (University of Edinburgh, UK)
Date/Time: Monday, January 25, at 14:00h
Strategic reasoning and decision making, i.e. reasoning in environments in which one is interacting with other agents who may have conflicting views or objectives, is ubiquitous in large-scale multiagent environments such as the Web. Yet, in the current Web landscape, users are not supported by existing Web technologies in the process of managing their strategic interactions, and this constitutes a major technological barrier to better exploitation of the "strategic layer" of the Web. In this talk, I will present a vision of what could be termed the "Strategic" Web, i.e. knowledge-based methods based on multiagent systems, AI planning, and game theory that may have the potential of scaling up sufficiently well to be used in real-world applications in complex domains of strategic interaction. I will present some recent contributions, both from our own work as from the literature that give a flavor for the kinds of techniques that could be relevant, and discuss the main challenges of potential future work.
Title: Protecting Against Overfitting in Empirical Reinforcement Learning
Speaker: Shimon Whiteson (UvA).
Date/Time: Friday, December 4, at 14:00h
In this talk, I ask the question: what is the best way to conduct empirical evaluations in reinforcement learning? In general, designing empirical methodologies is difficult because learners can overfit test evaluations and thereby obtain misleadingly high scores. In data overfitting, the learned function is overfit to the data on which it was trained, while in task overfitting the learning algorithm itself is overfit to the task. I propose a model of empirical methodologies in which an adversarial designer submits learners to an evaluator to be scored. Given a methodology, I consider how well the evaluator can protect against data and task overfitting in a one-shot setting, in which all learners are evaluated at the same time, or a timeless setting, in which learners evaluated across time are compared. I analyze several single-task methodologies and conclude that, while often adequate in supervised learning, they fail to provide any reliable protection against task overfitting in reinforcement learning, wherein it is infeasible to rely on fixed data sets. To address these limitations, I advocate departing from single-task methodologies and propose a new generalized task methodology in which evaluations are based on multiple tasks sampled from a distribution. Generalized tasks make it possible to specify, by selecting the distribution, what types of task generality are desired from the learner. The resulting methodology can then timelessly protect against overfitting to tasks within that distribution. Furthermore, since it does not rely on fixed data sets, this protection is reliable even in reinforcement learning. Finally, I present illustrative results in a generalized helicopter hovering task based on a recent Reinforcement Learning Competition.
Title: Agents under pressure: Cooperation arises before extinction.
Speaker: Konstantin Klemm, Department of Bioinformatics, University of Leipzig, Germany.
Date/Time: Wednesday, November 11, at 16:00h
Cooperation and altruistic behavior are often enhanced in spatially structured populations as compared with mean-field scenarios [Nowak and May, 1992]. In most studies of evolutionary game theory with spatial structure, however, the size of the population is assumed to be constant. Here we demonstrate that the variation of population size due to environmental pressure induces non-trivial effects: as increasing pressure drives the population towards extinction, cooperation prevails even if altruistic acts have an arbitrary benefit-cost ratio. The overall population size behaves non-monotonically with environmental pressure: increased attacks from predators may lead to a growth of the prey population. These effects are captured analytically by pair approximation and lead to a complex phase diagram for cooperation.
Title: Rational Altruistic Punishment with Foresighted Policy Gradient Reinforcement Learning
Speaker: Sander Bohte (CWI)
Website of the speaker: http://www.bohte.com/sander/
Date/Time: October 9th, 13h-14.30h
It is well known that in social dilemmas, like Hardin's Tragedy of the Commons or the classic iterated Prisoner's Dilemma, it can be rational for self-interested agents to promote and sustain cooperation by altruistically dispensing costly punishment to other agents, thus maximizing their own long-term reward. However, agents using most current multi-agent reinforcement learning algorithms will not sustain cooperation in social dilemmas: the algorithms do not sufficiently capture the consequences on the agent's reward of the interactions that it has with other agents. Recent more foresighted algorithms specifically account for such expected consequences, and have been shown to work well for the small scale Prisoner's Dilemma. However, they quickly become intractable for larger social dilemmas. Here, we advance on this work and develop a stateless foresighted policy gradient reinforcement learning algorithm that scales well to larger settings like the Tragedy of the Commons. We show for a variety of settings that large groups of self-interested agents using this algorithm will robustly find and sustain cooperation in social dilemmas where agents can punish the behavior of other agents.
Title: An overview of multi-index assignment problems
Speaker: Prof. Dr. Frits Spieksma (ORSTAT, Faculty of Business and Economics, K.U. Leuven)
Date/Time: Wednesday, June 3 2009, 16.00h - 17.00h
Abstract: In this presentation we give an overview of applications of, and algorithms for, multi-index assignment problems (MIAPs). MIAPs and their relatives have a long history, both in applications and in theoretical results, starting at least in the 1950s. In particular, we focus here on the complexity and approximability of special cases of MIAPs. A prominent example of a MIAP is the so-called axial three-index assignment problem (3AP), which has many applications in a variety of domains, including clustering. A description of 3AP is as follows. Given are three n-sets R, G, and B. For each triple in R × G × B a cost c_ijk is given. The problem is to find n triples such that each element of R ∪ G ∪ B is in exactly one triple, while minimizing total cost.
We show positive and negative results for finding an optimal solution to this problem that depend upon different ways of how the costs c_ijk are specified.
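The 3AP just described can be solved exactly, for tiny n, by fixing the R-elements in order and enumerating the permutations of G and B (n! · n! candidates). This brute force is meant only to make the problem statement concrete; the talk concerns the complexity and approximability of the problem, not this naive algorithm.

```python
from itertools import permutations

def solve_3ap(cost):
    """Exact axial three-index assignment by brute force.
    cost[i][j][k] is the cost of the triple (R_i, G_j, B_k)."""
    n = len(cost)
    best_val, best_triples = float('inf'), None
    for pg in permutations(range(n)):        # G-element matched to R_i
        for pb in permutations(range(n)):    # B-element matched to R_i
            val = sum(cost[i][pg[i]][pb[i]] for i in range(n))
            if val < best_val:
                best_val = val
                best_triples = [(i, pg[i], pb[i]) for i in range(n)]
    return best_val, best_triples
```

Unlike the classical two-index assignment problem, which is polynomially solvable, 3AP is NP-hard in general, which is why the specification of the costs c_ijk matters so much for positive results.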
Title: Role identification in complex biological processes applied to the cell cycle model of fission yeast
Speaker: Stef Zeemering (Maastricht University, IDEE)
Date/Time: Wednesday, 13 May 2009, 16.00h - 17.00h
The talk will focus on automated role identification in complex biological processes based on linearized interaction matrices, using notions and techniques from systems and control theory. A first step in modeling biological processes is to increase the understanding of the role or function of a process entity within the entire process. A method has been developed to distinguish between activating and inactivating roles of regulatory process entities. These roles may vary during different process phases along with changes in the internal feedback control mechanisms. The developed method is applied to the well-known Tyson-Novak model of the cell division cycle of fission yeast. The roles and role transitions of the model entities have been computed by analyzing the structure of a linearized version of the model at several points on the limit cycle. These roles and role transitions coincide to a large extent with the known cell cycle phases (G1, S, G2 and M) and checkpoints (G1/S and G2/M). This indicates that the developed procedure for automated role identification is able to detect cell cycle phases and checkpoints in the Tyson-Novak model of the cell division cycle of fission yeast. Two approaches have been tested, based on either the values or the signs of the linearized model interactions. Both approaches yield promising results, most notably the detection of the strong central role of Cdc13T. An important characteristic of the procedure is that it does not simply register peaks or switches, but instead looks at the role of an entity in the overall process. The procedure can serve as a useful tool in the identification of phases and checkpoints in complex biological processes in which the underlying dynamics are largely unknown.