The Last Question - Insufficient Data

In the 1950s, Isaac Asimov wrote a short story called "The Last Question" about the impact a superintelligent computer would have on the world, and its message is incredibly prescient today.

Although it was written over half a century ago, it brings to light a very interesting limitation of AI: most of our big problems in society are not intelligence problems at all.

In the story, characters across successive eras keep pressing the superintelligence for a way to reverse entropy, and it always comes back with the same answer: insufficient data.

Despite OpenAI and many other companies promising solutions to society's most technically challenging problems, such as cancer and limitless fusion energy, these problems paradoxically might not benefit from AI or even AGI.

Large language models (LLMs) were trained on a vast share of the text on the internet and are still imperfect, even though language has defined characters, defined rules, defined grammar and an incredible degree of predictability. In biology or physics, the information is far more limited and noisy, the rules are often undefined or poorly understood, and the problems are more complex and chaotic. Because these problems have vastly more degrees of freedom, they require vastly more data, and yet we have far less of it than we do for language.

What this means is that even with unlimited compute and boundless energy, we won't be able to solve the big problems. Solving them requires the ability to exponentially increase the amount of scientific data that can be generated and fed into models. Many in Silicon Valley will claim that this data can be created synthetically via in silico models, but in silico models for these difficult problems are fundamentally flawed: they are themselves models built on an incomplete understanding of the underlying science.

This data needs to be generated in the real physical world, and AI will not dramatically accelerate progress on these problems until we greatly increase the amount of data we generate. AI can support the design of better experiments, but someone still needs to run them in the real world, and the pace of that will not change until AI truly merges with the physical world through robotics and can generate this type of data at scale.

Until that happens, we are going to be very disappointed with the ability of AI tools to tackle the big problems our world faces, and even if we build AGI, it's going to respond to our queries with: "insufficient data."
