The inaugural International Joint Conference on Theoretical Computer Science (IJTCS) will be held online on August 17-21, 2020. The conference is jointly organized by Peking University, the China Society for Industrial and Applied Mathematics (CSIAM), the China Computer Federation (CCF), and the ACM China Council, and hosted by the Center on Frontiers of Computing Studies (CFCS), Peking University.
The theme of the conference is "recent advances and focus problems in theoretical computer science". It features seven tracks, covering algorithmic game theory, blockchain technology, multi-agent reinforcement learning, machine learning theory, quantum computing, machine learning and formal methods, and algorithms and complexity. The conference also hosts a Young PhD Forum, a Female Scholar Forum, and an Undergraduate Research Forum, bringing together renowned experts and scholars from home and abroad to focus on frontier problems in theoretical computer science. Information will be updated continuously; stay tuned!
This issue presents the "Multi-Agent Reinforcement Learning" track.
Multi-agent reinforcement learning (MARL) is an emerging research area that combines game theory with deep reinforcement learning to tackle collective decision-making problems over complex state and action spaces; it has broad application prospects in game AI, industrial robotics, social prediction, and beyond. Chinese researchers have made substantial progress on problems such as convergence theory for multi-agent algorithms, learning algorithms for multi-agent communication, and large-scale multi-agent systems, and are advancing MARL research together with researchers worldwide. The IJTCS MARL track will focus on frontier topics including multi-agent communication algorithms, world-model-based reinforcement learning, multi-agent policy evaluation, and solution concepts for MARL; we hope to explore the future directions of MARL with the broader research community.
Online Search and Pursuit-Evasion in Robotics
In search and pursuit-evasion problems, one team of mobile entities is asked to seek a set of fixed objects, or to capture another team of moving objects, in an environment. The search strategy, or motion planning, plays a key role in either scenario. In this talk, we briefly introduce several models of exploration and search in an unknown environment and propose a number of challenging algorithmic problems.
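To make the online-search setting concrete, here is a toy sketch (an illustration only, not a model from the talk): the searcher learns a node's neighbors only upon visiting it, and depth-first traversal with backtracking visits every node while traversing each edge at most twice.

```python
def explore(start, neighbors):
    """Online DFS exploration of an unknown graph: neighbors(v) is
    revealed only when v is visited; each edge is traversed at most
    twice (once forward, once when backtracking)."""
    visited, path = {start}, [start]

    def dfs(v):
        for u in neighbors(v):
            if u not in visited:
                visited.add(u)
                path.append(u)   # move along edge (v, u)
                dfs(u)
                path.append(v)   # backtrack to v

    dfs(start)
    return visited, path

# A small example graph (a 4-cycle) given as adjacency lists.
G = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
visited, path = explore(0, lambda v: G[v])
```

The returned `path` is a physically realizable walk: every consecutive pair of nodes is connected by an edge, which is what distinguishes exploration by a mobile entity from ordinary graph search.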
A Distance Function to Nash Equilibrium
Nash equilibrium has long been a desired solution concept in economics and game theory. Although the complexity literature has closed the door on computing exact equilibria efficiently, approximation methods are still sought after in application fields such as online marketing, crowdsourcing, and the sharing economy. In this paper, we present a new approach to computing approximate Nash equilibria in any N-player normal-form zero-sum game with discrete action spaces, which, with some pre-processing, can be applied to any general N-player game. Our approach defines a new measure of the distance between the players' current joint strategy profile and a Nash equilibrium, transforming the task of finding an equilibrium into one of global minimization. We solve the latter with a gradient descent algorithm and prove its convergence under moderate assumptions. Experimental comparisons with baselines show consistent and significant improvement in approximate Nash equilibrium computation, and the algorithm remains robust as the game size increases.
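As a toy illustration of the general idea (the authors' actual distance measure is not specified here; this sketch instead uses exploitability, a standard distance-like quantity for two-player zero-sum matrix games, and minimizes it by projected subgradient descent):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0)

def exploitability(A, x, y):
    """How much each player could gain by deviating to a best response;
    zero exactly at a Nash equilibrium of the zero-sum game A."""
    return np.max(A @ y) - np.min(x @ A)

def descend(A, steps=2000, lr=0.5):
    """Projected subgradient descent on exploitability; returns the
    averaged iterates, which converge toward an approximate equilibrium."""
    n, m = A.shape
    x, y = np.ones(n) / n, np.ones(m) / m
    x_avg, y_avg = np.zeros(n), np.zeros(m)
    for t in range(1, steps + 1):
        i_star = np.argmax(A @ y)    # row player's best response to y
        j_star = np.argmin(x @ A)    # column player's best response to x
        gx = -A[:, j_star]           # subgradient of exploitability w.r.t. x
        gy = A[i_star, :]            # subgradient w.r.t. y
        step = lr / np.sqrt(t)
        x = project_simplex(x - step * gx)
        y = project_simplex(y - step * gy)
        x_avg += x
        y_avg += y
    return x_avg / steps, y_avg / steps

# Example: matching pennies; the unique Nash equilibrium is uniform play.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = descend(A)
```

The key move mirrors the abstract: distance-to-equilibrium becomes an objective function, and equilibrium finding becomes global minimization.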
Model-based Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning (MARL) typically suffers from low sample efficiency due to wasteful multi-agent exploration in the joint state-action space. In single-agent RL, there has been increasing interest in building environment dynamics models and performing model-based RL to improve sample efficiency. In this talk, I will present an attempt to build model-based methods for sample-efficient MARL. First, I will discuss several important settings for model-based MARL tasks and their key challenges. Then I will delve into the decentralized model-based MARL setting, which applies to almost all decentralized model-free MARL methods. A theoretical bound on the policy value discrepancy will be derived, based on which an efficient decentralized model-based MARL algorithm will be introduced. I will then show preliminary experimental results. The final takeaway of this talk will be a discussion of the feasibility and challenges of model-based MARL.
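A minimal sketch of the model-based ingredient, under toy assumptions of my own (a two-agent environment with linear dynamics and binary actions; this is not the speaker's method): fit a dynamics model from real transitions, then use it to generate cheap "imagined" rollouts on which policies could be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-agent environment: state in R^2, each agent picks an
# action in {-1, +1}; the true (unknown) dynamics are
# s' = 0.9 * s + 0.1 * [a1, a2] + noise.
def env_step(s, a):
    return 0.9 * s + 0.1 * np.array(a) + rng.normal(0, 0.01, size=2)

# Collect real transitions by executing random joint actions.
S, A, S_next = [], [], []
s = np.zeros(2)
for _ in range(500):
    a = rng.choice([-1.0, 1.0], size=2)
    s2 = env_step(s, a)
    S.append(s); A.append(a); S_next.append(s2)
    s = s2

# Fit a linear dynamics model s' ~ [s; a] @ W by least squares.
X = np.hstack([np.array(S), np.array(A)])
Y = np.array(S_next)
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Imagined rollout from the learned model: synthetic experience that a
# model-based MARL method can use instead of costly real interaction.
def model_step(s, a):
    return np.concatenate([s, a]) @ W

s_model = np.zeros(2)
for _ in range(10):
    s_model = model_step(s_model, rng.choice([-1.0, 1.0], size=2))
```

The sample-efficiency argument in the abstract corresponds to the last step: once `W` is accurate, rollouts in the model are essentially free.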
Solution Concepts in Multi-agent Reinforcement Learning
Nash equilibrium has long been a well-studied solution concept in game theory, and multi-agent reinforcement learning algorithms usually take Nash equilibrium as the learning objective. However, in many situations, other solution concepts such as Stackelberg equilibrium and correlated equilibrium can perform better than Nash equilibrium. In this talk, we will present two MARL algorithms, bi-level actor-critic (Bi-AC) and signal instructed coordination (SIC), which aim to find Stackelberg and correlated equilibria, respectively.
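To make the Stackelberg concept concrete (a hypothetical toy of mine, not the Bi-AC algorithm itself, which learns this bi-level structure with actor-critic networks): the leader commits to an action, the follower best-responds, and commitment can strictly beat the simultaneous-move Nash outcome.

```python
import numpy as np

def pure_stackelberg(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg by enumeration: for each leader
    commitment, the follower best-responds; the leader picks the
    commitment with the highest resulting payoff."""
    best = None
    for i in range(leader_payoff.shape[0]):
        j = int(np.argmax(follower_payoff[i]))  # follower best response
        if best is None or leader_payoff[i, j] > best[2]:
            best = (i, j, leader_payoff[i, j])
    return best

# Example payoffs (rows = leader, columns = follower). In simultaneous
# play the leader's row 0 dominates, the follower answers with column 0,
# and the leader gets 2. Committing to row 1 instead induces column 1
# and yields 3: the value of commitment.
L = np.array([[2.0, 4.0], [1.0, 3.0]])
F = np.array([[1.0, 0.0], [0.0, 2.0]])
i, j, payoff = pure_stackelberg(L, F)
```

Bi-level optimization of exactly this kind (an outer leader problem constrained by an inner follower best response) is what the Bi-AC objective generalizes to learned policies.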
Learning Multi-Agent Cooperation
Cooperation is a widespread phenomenon in nature, from viruses, bacteria, and social amoebae to insect societies, social animals, and humans. It is also crucially important to enable agents to learn to cooperate in multi-agent environments for many applications, e.g., autonomous driving, multi-robot control, traffic light control, smart grid control, network optimization, etc. In this talk, I will focus on the latest reinforcement learning methods for multi-agent cooperation via joint policy learning, communication, agent modeling, etc.
An Overview of Game-Based AI Competitions: From a Perspective of AI Evaluation
Intelligence exists when we measure it! A game-based AI competition makes our conception of intelligence explicit, so holding this kind of competition has recently become popular at AI conferences such as AAAI and IJCAI. With clear and precise problem definitions, a unified platform environment, a fair performance-assessment mechanism, open datasets, and benchmarks, game-based AI competitions have attracted many researchers, accelerating the development of artificial intelligence technology.
There is a new trend of hosting long-running competitions on an online platform, which encourages researchers and AI enthusiasts to work on a task continuously and share information at any time. Such platforms also let us test the learning ability of bots. Under this trend, we face the problem of evaluating an enormous number of bots quickly and fairly.
Through collecting and analyzing various competitions, we find that the games used in competitions are becoming more complex, as are the techniques used in matches. Judging a match is becoming more time-consuming and sometimes yields results with randomness. These problems, combined with growing numbers of participants, require organizers to improve the competition process in order to produce fair results on time.
An emerging AI evaluation method based on MCTS (Monte Carlo Tree Search) is worthy of our attention. This method may make it possible to measure a bot's intelligence level quantitatively, and possibly to compare bots created for different games. Beyond this, measuring a bot's cooperative ability in a multi-agent system (three or more agents) remains an open problem.
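For readers unfamiliar with MCTS itself, here is a minimal UCT sketch on a toy subtraction game (an illustration of the search technique only; the abstract does not specify how the evaluation method uses it): players alternately remove 1 or 2 stones, and whoever takes the last stone wins, so positions with n % 3 == 0 are lost for the player to move.

```python
import math
import random

random.seed(0)

def moves(n):
    return [m for m in (1, 2) if m <= n]

class Node:
    def __init__(self, n, parent=None, move=None):
        self.n, self.parent, self.move = n, parent, move
        self.children, self.untried = [], moves(n)
        self.visits = 0
        self.wins = 0.0  # wins for the player who moved INTO this node

def uct_select(node, c=1.4):
    return max(node.children, key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(n):
    """Uniformly random playout; returns 1 iff the player to move at n wins."""
    to_move = 0
    while n > 0:
        n -= random.choice(moves(n))
        to_move ^= 1
    return 1 if to_move == 1 else 0

def mcts(root_n, iters=3000):
    root = Node(root_n)
    for _ in range(iters):
        node = root
        while not node.untried and node.children:   # selection
            node = uct_select(node)
        if node.untried:                            # expansion
            m = node.untried.pop()
            node = Node(node.n - m, parent=node, move=m)
            node.parent.children.append(node)
        result = rollout(node.n)                    # simulation
        while node:                                 # backpropagation
            node.visits += 1
            node.wins += 1 - result  # credit the mover into this node
            result = 1 - result      # flip perspective at each level
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

best_from_4 = mcts(4)  # optimal play: take 1, leaving the losing position 3
best_from_5 = mcts(5)  # optimal play: take 2, leaving 3
```

An evaluation scheme built on such a search can, for instance, compare a bot's chosen moves against search-derived value estimates, which is one way a quantitative, game-independent score could arise.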
Registration is now officially open to the public! Each participant may register for free to watch the online talks, or pay a fee to further discuss the talks with the speakers and take part in more of the conference's activities.
Registration deadline: August 15, 2020, 23:59
*Student registration: after registering on the website, please photograph the pages of your student ID containing your personal and school information and send them to IJTCS@pku.edu.cn, with the email subject in the format "Student Registration + Name".
John Hopcroft
Foreign Member of the Chinese Academy of Sciences; Visiting Chair Professor, Peking University
Pingwen Zhang
Member of the Chinese Academy of Sciences; President of CSIAM; Professor, Peking University
Conference website:
https://econcs.pku.edu.cn/ijtcs2020/IJTCS2020.html
Registration link:
https://econcs.pku.edu.cn/ijtcs2020/Registration.htm
For sponsorship, partnership, and other inquiries, please contact: IJTCS@pku.edu.cn
Copyright Notice
All content on this WeChat official account that was created or collected by the Center on Frontiers of Computing Studies, Peking University (text, images, and audio/video) is copyrighted by the Center's WeChat account; text, images, and audio/video collected from public channels, curated, or reprinted with authorization remain the copyright of their original authors. If an original author does not wish their content to appear on this account, please notify us promptly and it will be removed.