
Paul Colognese
Biography
Title: Goal Detection for AI Catastrophe Prevention
Abstract: As AI agents advance in capability, they promise immense benefits but also pose significant risks. If we build powerful AI agents that pursue misaligned goals, the consequences could be catastrophic. Training AI to consistently pursue aligned goals is challenging for several reasons, including the potential for deceptive AI agents to actively mislead oversight processes. This talk explores whether goals might be represented in the computational substrate running the agent and how we might leverage these representations for goal oversight, thereby contributing to the prevention of potential AI-driven catastrophes.
