And “shut it all down” is what the OpenAI board seems to have had in mind when it pushed the panic button and kicked Altman out. But the effort collapsed when OpenAI’s workers and financial backers all insisted on Altman’s return. Because they all realized that “shut it all down” has no exit strategy. Even if you tell yourself you’re only temporarily pausing AI research, there will never be any change — no philosophical insight or interpretability breakthrough — that will even slightly mitigate the catastrophic risks that the EA folks worry about. Those risks are ineffable by construction. So an AI “pause” will always turn into a permanent halt, simply because it won’t alleviate the perceived need to pause.
Noah Smith has some good comments on the OpenAI stuff as well:
And a permanent halt to AI development simply isn’t something AI researchers, engineers, entrepreneurs, or policymakers are prepared to do. No one is going to establish a global totalitarian regime like the Turing Police in Neuromancer who go around killing anyone who tries to make a sufficiently advanced AI. And if no one is going to create the Turing Police, then AI-focused EA simply has little to offer anyone.
What we need to do is to build the Most Powerful A.I., ever, to tell us how to stop the development of powerful A.I. !
Yeah, I liked that part as well. I particularly like the following paragraph:
Yeah, what would it take to actually halt AI development? Making it straight-up criminal to dabble in it? And then, of course, there's the matter of how to effectively enforce such a ban. And, superpowers will be worried that the other guys will get there first, so we have to be the ones who have it first.
(5) Another key idea that Christiano, Amodei, and Buck Shlegeris have advocated is some sort of bootstrapping. You might imagine that AI is going to get more and more powerful, and as it gets more powerful we also understand it less, and so you might worry that it also gets more and more dangerous. OK, but you could imagine an onion-like structure, where once we become confident of a certain level of AI, we don’t think it’s going to start lying to us or deceiving us or plotting to kill us or whatever—at that point, we use that AI to help us verify the behavior of the next more powerful kind of AI. So, we use AI itself as a crucial tool for verifying the behavior of AI that we don’t yet understand.
There have already been some demonstrations of this principle: with GPT, for example, you can just feed in a lot of raw data from a neural net and say, “explain to me what this is doing.” One of GPT’s big advantages over humans is its unlimited patience for tedium, so it can just go through all of the data and give you useful hypotheses about what’s going on.
This is on aligning rather than halting AI, but:
https://scottaaronson.blog/?p=6823
(that article is a bit old, I just happened to be reading it today and your post reminded me of that idea)
That's only if you like pushing it one step further on - if the next gen of AI is more powerful than our "pet AI" it simply fools that pet AI rather than us.
You’ve probably heard AI is a “black box”. No one knows how it works. Researchers simulate a weird type of pseudo-neural-tissue, “reward” it a little every time it becomes a little more like the AI they want, and eventually it becomes the AI they want. But God only knows what goes on inside of it.
This is bad for safety. For safety, it would be nice to look inside the AI and see whether it’s executing an algorithm like “do the thing” or more like “trick the humans into thinking I’m doing the thing”. But we can’t. Because we can’t look inside an AI at all.
Until now! Towards Monosemanticity, recently out of big AI company/research lab Anthropic, claims to have gazed inside an AI and seen its soul. It looks like this:
Scott Alexander has a very interesting post on some new work done by Anthropic on AI interpretability, which is an important part of alignment work:
Now, I'm not a mathematitactical computron-sciencelord (*everyone gasps*) but it seems to me (*hitches thumbs through suspenders*) (*American suspenders, not UK suspenders, you perverts!*) that the gist of this magical AI is less like "we've created a thinking thing" than "we've created a thing that stores information in a way that's complicated and obscure to our vision".
That is fascinating - that's my reading for the week sorted.
It does also suspiciously sound like, for the first time, we are making real progress toward understanding how memory may work in humans, with hints about cognition. That's something I wondered whether the current generative AIs would help us start to understand.
Sean Carroll did a solo episode of his Mindscape Podcast on what people get wrong about LLMs and why they are far from any actual A.I.
Yudkowsky is an arrogant, self-serving crank who frequently, not to say incessantly, spouts drivel.
Yeah, I liked that part as well.
Eliezer Yudkowsky wrote an article in Time magazine suggesting an international treaty to limit the number of GPUs that could be used to train any new models, and even pointed out that to be effective it would have to be backed up by military power, specifically missile strikes on rogue data centers. As I recall, shortly after that article was published he posted on Twitter that, yes, we should be willing to risk nuclear war to prevent the development of AI more advanced than GPT-4, but he quickly deleted that post.
He's taken a lot of flak for that article, but he still maintains his position. A lot of the EA movement, though by no means all, is pretty close to Eliezer's position.
These days, many people are worried that we will lose control of artificial intelligence, leading to human extinction or a similarly catastrophic “AI takeover.” We hope the arguments in this essay make such an outcome seem implausible. But even if future AI turns out to be less “controllable” in a strict sense of the word— simply because, for example, it thinks faster than humans can directly supervise— we also argue it will be easy to instill our values into an AI, a process called “alignment.” Aligned AIs, by design, would prioritize human safety and welfare, contributing to a positive future for humanity, even in scenarios where they, say, acquire the level of autonomy current-day humans possess.
In what follows, we will argue that AI, even superhuman AI, will remain much more controllable than humans for the foreseeable future. Since each generation of controllable AIs can help control the next generation, it looks like this process can continue indefinitely, even to very high levels of capability. Accordingly, we think a catastrophic AI takeover is roughly 1% likely— a tail risk worth considering, but not the dominant source of risk in the world. We will not attempt to directly address pessimistic arguments in this essay, although we will do so in a forthcoming document. Instead, our goal is to present the basic reasons for being optimistic about humanity’s ability to control and align artificial intelligence into the far future.
I'm not feeling the doom at the moment, and I haven't read Roboramma's links so maybe this was discussed. But we shouldn't consider only what large publicly owned corporations in the U.S. -- with all their built-in financial and social guardrails -- might do. We also have to consider what bad actors and rogue nations might do. I gather this tech isn't as resource intensive as, say, nuclear weapons, yet even impoverished North Korea has nukes. So for all the talk of "We can limit AI's capabilities," we have to ask "What about players who won't?"
Australians are already losing work to AI, but the impact so far has been largely hidden from view.
Economists say it's also creating jobs at an unprecedented rate, but not always for the people in the firing line.
Benjamin* says he was one of those people earlier this year, although it's unlikely to ever show up in official figures.
"All our jobs were replaced by chatbots, data scraping and email," he says.
"We all got AI-ed."
His job in wine subscription sales was one of 121 positions made redundant in July by the ASX-listed Endeavour Group, which owns a number of prominent retail brands such as Dan Murphy's, BWS and Jimmy Brings.
Benjamin says staff were given the strong impression at the time that AI was a key factor...
I just want to point out that the first of those links was against the doom scenario.
Regarding the latter part of your post: the issue of "If we don't do it, other, less safety minded folk, will do it first" is, at least according to them, the reason that both OpenAI and Anthropic were founded.
Best countermeasure against bad AI with nukes is good AI with nukes
We're definitely heading toward that Star Trek episode where the two computers duke it out, and people willingly walk into death chambers because the data says they're dead.
Don't worry, though, Captain Kirk will save us.
No we aren't.
Joking. I knew I should've made that more obvious.
Has anyone asked the question "Is artificial stupidity distinguishable from real stupidity?"
The real question here.
On the other side of the coin from the paranoid phobia that an emergent AI will suddenly decide "humanity is a threat" and unilaterally hijack the world's electronics and weaponry to kill everyone off, is the delusional fantasy entertained by AI proponents that AI will "solve all of our problems", as in, social and geopolitical problems like poverty and unemployment. AI proponents have a somewhat cultish aspirational vision that a true AI won't merely be sentient, but sentient minus all of the flaws that sentient humans have. Without any reason to think as much (and every reason to believe the opposite), they assert as a just-so proposition that an AI will be unbiased and immune to lies and propaganda; that complex societal issues are just math problems that humans simply aren't advanced enough to tackle yet, but that an AI ubermind will be able to teach itself the requisite skills and then solve these problems handily and their solutions will be so inherently trustworthy that humanity will not hesitate to cheerfully implement them.
Do you remember when we thought that AIs would break down if they were ever exposed to a serious case of cognitive dissonance?