Data Science Expert Series
The Data Science Expert Series aims to bring together data science experts from around the world to Maastricht University, in order to discuss groundbreaking work.
In this edition, Professor Kostas Stathis (Department of Computer Science, Royal Holloway University of London) will speak about "Adaptive Strategy Templates using Deep Reinforcement Learning for Multi-Issue Bilateral Negotiation".
We present a novel multi-issue bilateral negotiation model that allows an agent to learn a time-dependent heuristic strategy from a given set of predefined tactics. A strategy template enables the model to learn (a) choice parameters that decide whether a tactic (e.g., a Boulware tactic) should be included in the strategy; (b) time parameters that determine when a tactic should be triggered (e.g., use Boulware at the start); and (c) attribute-value parameters that determine the values constraining acceptance or bidding (e.g., reject offers with utility below a threshold). The model's learning can be accelerated by training with teacher strategies, and further improved after training through negotiation in different settings, resulting in adaptive, non-pre-programmed and transferable strategies. The model can also handle uncertainty about the preferences of the user it represents and derive a user model that best approximates a given user preference profile, specified only partially when the negotiation begins.
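As a rough illustration of the three kinds of template parameters described above, the following Python sketch hand-sets values that the model in the talk would learn. It is not the speaker's implementation: the names TacticSlot, StrategyTemplate and time_dependent_target are hypothetical, the concession curve is the standard textbook time-dependent form (an assumption, not taken from the abstract), and no reinforcement learning is involved.

```python
from dataclasses import dataclass
from typing import List


def time_dependent_target(t: float, e: float) -> float:
    """Target utility at normalized time t in [0, 1].

    Assumed textbook form: 1 - t**(1/e). With e < 1 the agent concedes
    slowly (Boulware-like); with e > 1 it concedes quickly (Conceder-like).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if e <= 0.0:
        raise ValueError("e must be positive")
    return 1.0 - t ** (1.0 / e)


@dataclass
class TacticSlot:
    """One tactic in the template (hypothetical structure)."""
    name: str
    included: bool     # (a) choice parameter: is the tactic used at all?
    start_time: float  # (b) time parameter: when the tactic is triggered
    e: float           # concession exponent of the tactic


@dataclass
class StrategyTemplate:
    slots: List[TacticSlot]
    min_utility: float  # (c) attribute-value parameter: acceptance threshold

    def active_slot(self, t: float) -> TacticSlot:
        # The most recently triggered included tactic is the active one.
        candidates = [s for s in self.slots if s.included and s.start_time <= t]
        if not candidates:
            raise ValueError("no tactic is active at this time")
        return max(candidates, key=lambda s: s.start_time)

    def target_utility(self, t: float) -> float:
        slot = self.active_slot(t)
        # Never aim below the threshold, however far the tactic concedes.
        return max(self.min_utility, time_dependent_target(t, slot.e))

    def accepts(self, offer_utility: float, t: float) -> bool:
        return offer_utility >= self.target_utility(t)


# Hand-set parameters standing in for learned ones.
template = StrategyTemplate(
    slots=[
        TacticSlot("boulware", included=True, start_time=0.0, e=0.2),
        TacticSlot("conceder", included=True, start_time=0.8, e=2.0),
        TacticSlot("unused", included=False, start_time=0.5, e=1.0),
    ],
    min_utility=0.4,
)
```

With these values the agent holds out for high utility early on (Boulware from the start), concedes quickly after 80% of the deadline has passed, and never accepts an offer worth less than 0.4 to it.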
To perform experiments with the model, we discuss a proof-of-concept prototype with an actor-critic architecture that uses deep reinforcement learning to estimate the strategy template parameters. To handle user preference uncertainty, we use stochastic search to derive the user model that best approximates a given partial preference profile. Our implementation also applies multi-objective optimization and multi-criteria decision-making methods as one of the bidding tactics to generate near Pareto-optimal bids during negotiation. An extensive experimental evaluation demonstrates empirically that our work outperforms the state-of-the-art in terms of both individual and social-welfare utilities. Our results further demonstrate that it is possible to combine the learning from teacher strategies and different choice, time and attribute-value parameters to transfer the experience over multiple domains. This in turn provides a significant advantage when an agent based on our model confronts unknown opponents in unseen negotiation scenarios.
Kostas Stathis studies the development of autonomous systems with cognitive and social capabilities, with the twin goals of understanding interactive intelligence in symbolic computational terms and developing applications that support what people do in their everyday activities. He holds a PhD from Imperial College London and is Professor of Artificial Intelligence at Royal Holloway, where he is a founding member of the Centre for Intelligent Systems and founder and director of the Distributed and Intelligent Computing Environments Lab. He is internationally known for developing agent models and leading the implementation of platforms that support the deployment of agents in applications such as Negotiation, Telemedicine, Ambient Intelligence and Cyber Physical Systems.
His most recent work focuses on modelling how an autonomous agent can explicitly represent the environment in which it is situated to recognise decision-making opportunities for achieving its goals, and how multiple agents can be organised over distributed networks to interact with other agents by inhabiting electronic environments and accessing physical worlds. He is a Fellow of the BCS, a member of AAAI, ACM and IEEE, and a practitioner of the British Education Academy.