July 2022 Newsletter


MIRI has put out three major new posts:

AGI Ruin: A List of Lethalities. Eliezer Yudkowsky lists reasons AGI appears likely to cause an existential catastrophe, and reasons why he thinks the current research community—MIRI included—isn't succeeding at preventing this from happening.

A central AI alignment problem: capabilities generalization, and the sharp left turn. Nate Soares describes a core obstacle to aligning AGI systems: 

[C]apabilities generalize further than alignment (once capabilities start to generalize real well (which is a thing I predict will happen)). And this, by default, ruins your ability to direct the AGI (that has slipped down the capabilities well), and breaks whatever constraints you were hoping would keep it corrigible.

On Nate’s model, very little work is currently going into this problem. He advocates for putting far more effort into addressing this challenge in particular, and making it a major focus of future work.

Six Dimensions of Operational Adequacy in AGI Projects. Eliezer describes six criteria an AGI project likely needs to satisfy in order to have a realistic chance at preventing catastrophe at the time AGI is developed: trustworthy command, research closure, strong opsec, common good commitment, alignment mindset, and requisite resource levels.

Other MIRI updates

News and links

  • Paul Christiano and Zvi Mowshowitz share their takes on the AGI Ruin post.
  • Google’s new large language model, Minerva, achieves 50.3% performance on the MATH dataset (problems at the level of high school math competitions), a dramatic improvement on the previous state of the art of 6.9%.
  • Jacob Steinhardt reports generally poor forecaster performance on predicting AI progress, with capabilities work moving faster than expected and robustness slower than expected. Outcomes for both the MATH and Massive Multitask Language Understanding datasets "exceeded the 95th percentile prediction".
  • In the wake of April/May/June results like Minerva, Google’s PaLM, OpenAI’s DALL-E, and DeepMind’s Chinchilla and Gato, Metaculus’ "Date of Artificial General Intelligence" forecast has dropped from 2057 to 2039. (I’ll mention that Eliezer and Nate’s timelines were already pretty short, and I’m not aware of any MIRI updates toward shorter timelines this year. I’ll also note that I don’t personally put much weight on Metaculus’ AGI timeline predictions, since many of them are inconsistent and this is a difficult and weird domain to predict.)
  • Conjecture is a new London-based AI alignment startup with a focus on short-timeline scenarios, founded by EleutherAI alumni. The organization is currently hiring engineers and researchers, and is "particularly interested in hiring devops and infrastructure engineers with supercomputing experience".

The post July 2022 Newsletter appeared first on Machine Intelligence Research Institute.
