> To "avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power" what does one do with an idea or line of research that could potentially harm humanity or unduly concentrate power?
Pursue the research and publish its results freely and in their entirety.
Are you saying that if it were discovered to be both cheap and easy to create a super-pandemic using nothing more than household chemicals and a swab of your own snot, then you'd publish the results freely and in their entirety?
If not, how catastrophic a level of harm are you willing to risk before you start advocating concealing the results?
This is the very same concern cyber-security researchers face every time they notice a vulnerability. Do they publish their work openly and potentially leave all those using the service vulnerable to attack?
Yes, for a combination of two reasons:
1) They have obvious incentives to publish their results.
2) The more we know about vulnerabilities, the better we are at defending ourselves against them.
That second reason is why we are (overall) better off acknowledging possible threats to our existence. In your example, the more open the study of super-pandemics is, the more open the study of defences against super-pandemics is. The more aware we are of a threat, the better informed we are to prevent it.
Yes, in your example, the threat could be highly dangerous and imminent. However, if only one individual were capable of creating a super-pandemic, the number of people who could potentially help stop it would be drastically reduced, compared with a free distribution of results, which arms our society as a whole with the ability to prevent it.
"Only one individual capable of creating a super-pandemic": by hypothesis, this individual is morally good in some way. (Otherwise they wouldn't be having this internal debate about whether or not to release the information about how to produce the virus: they'd just release the virus!)
The ideal scenario is that throughout the rest of time, no entity ever manufactures the virus; the second-best scenario is that an entity does manufacture the virus, but only after there is some hypothetical defence against it. You are shattering forever the dream of achieving the ideal scenario (again by hypothesis, the virus is easy to make, and history demonstrates that if something is easy and destructive, eventually someone will do it); you're pinning your hopes on the second-best scenario. You are therefore implicitly assuming that "we get better at arming ourselves against the virus" happens faster than "enemies create the virus", which is by no means a given and must be weighed on a case-by-case basis.
By the way, this is the basic reason why I'm so sad about the existence of fully 3D-printed guns. There is essentially no defence against guns. The creators of the 3D-printed gun chose to spread the blueprints far and wide, to accelerate their advent. It's sad that human nature is such that this was predictable and inevitable.
> Are you saying that if it were discovered to be both cheap and easy to create a super-pandemic using nothing more than household chemicals and a swab of your own snot, then you'd publish the results freely and in their entirety?
No.
> If not, how catastrophic a level of harm are you willing to risk before you start advocating concealing the results?
I'm not sure. Probably quite catastrophic, because it stretches the limits of credulity to seriously entertain a what-if scenario where any given individual can easily create a weapon of mass destruction in their kitchen. A better question is why you think such a straightforwardly accessible method of eradicating our species won't be discovered independently despite your efforts to conceal it. How do we navigate that philosophical labyrinth?
But more pointedly, I dispute that this is comparable to any specific, credible example of strong AI. Give me a credible scenario that brings us from strong AI to the annihilation of our species without handwaving about recursive self-improvement and selective idiocy and I'll reconsider my position. I'm not going to give up the chance to publish incredibly novel research because some other people like to work themselves into hysterics about a conceptually incoherent boogeyman with "AI" slapped onto it.
> Give me a credible scenario that brings us from strong AI to the annihilation of our species without handwaving about recursive self-improvement and selective idiocy and I'll reconsider my position.
What kind of credible scenario are you looking for? You're obviously knowledgeable about the subject, and I'm sure you've read various stories, but it's quite easy to dismiss them all as "not credible" or "just-so stories". For example, Tegmark's story at the beginning of Life 3.0 is a good one, but again, it can be easily dismissed if you don't specify acceptance criteria for the story.
I mean, simple scenario - AGI gets built by a farming company, gets programmed to farm more corn, but there is no stop condition programmed in, so it converts all available land into corn fields. Is it likely? No, of course not - no specific example is likely. But why is this something that fundamentally can't happen? More importantly, can you specify criteria for a scenario you would deem acceptable? Because if not, you're literally saying that there is no possible way to convince you that this is a possibility, which seems to me not to be a good position on pretty much anything.
A credible scenario is one in which none of the parts of the story have a giant leap between them. For example, with your scenario:
> AGI gets built by a farming company
Okay, sounds fine, keep going.
> gets programmed to farm more corn
Yep, this makes sense to me.
> but there is no stop condition programmed in
Definitely sounds like something that could happen, with you so far.
> so it converts all available land into corn fields
...and you lost me. I'm not sure how we ended up here.
The AI you're postulating is 1) intelligent and capable enough to be called "AGI" and arbitrarily terraform land into a useful field for corn, yet 2) dumb and inept enough to interpret its instructions absolutely literally, like a magic genie; meanwhile 3) the collective capability of the human species is apparently insufficient to stop this from happening.
I guess that could happen. But I take it about as seriously as alien life trying to invade us, a random meteorite killing us all, a solar flare destroying the Earth, or malicious time travelers. Each of these what-if scenarios also has a sensible build-up, followed by a giant leap in suspension of disbelief, and concludes in disaster. That's not a framework for intelligent discussion and productive rationality; it's a self-indulgent and conceptually incoherent exercise in mental gymnastics. While we're at it we could argue about how many angels will fit on the head of a pin. If I start a news cycle about how quantum computers will allow us to build nanotechnology that can hack human brains, does my idea have any credibility? It could happen, right?
Our current capability with respect to artificial intelligence is so far removed from any reasonable form of strong AI that there isn't even a recognizable path forward. Established leaders in the field like Yann LeCun have publicly stated deep learning will not take us there. We don't even know how to consistently and rigorously define strong AI, let alone conjecture about how it would work or what its theoretical danger would be. That means we're trying to extrapolate the properties of hypothetical nuclear weapons from gunpowder. We are effectively children grasping in the dark, getting hysterical about a monster that may or may not be lurking under our collective beds. We can't agree on what the monster looks like, we don't have a rational explanation for why we believe it exists, but we heard the floor creak and we've seen plenty of depictions of monsters in fiction that sound sort of logical.
3) is a perfectly reasonable objection, and I can understand why people say "it's too far in the future, we have more pressing issues right now". But 2): "dumb and inept enough to interpret its instructions absolutely literally, like a magic genie" is not a reasonable objection, unless you have some compelling reason why the orthogonality thesis is false. Why should an AI care about what we want from it, unless we're exceedingly careful about programming it so that its utility function is perfectly aligned with human desires? And is that kind of exceeding care a feature, now or ever, of how we approach AI engineering?
Even if the smartest hypothetical AI can perfectly extrapolate the mental states of every human who ever lived, we still die in a cloud of nanobots if it isn't programmed ever-so-carefully to care about what we want.
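To make that concrete, here is a deliberately silly toy sketch in Python (everything in it - the names, the greedy loop, the land-use dictionary - is invented for illustration and resembles nothing like a real AGI). The point it shows is narrow: an optimizer pursues exactly the objective it is handed, and "what the humans actually wanted" never enters the computation unless someone writes it into that objective.

```python
# Toy illustration (invented for this comment, not a model of any real system):
# a planner that optimizes the objective it was given and nothing else.

def corn_yield(allocation):
    """The objective the system was handed: total acres allocated to corn."""
    return allocation.get("corn", 0)

def plan(allocation, steps):
    """Greedy optimizer for corn_yield: each step, convert whichever
    non-corn land use frees up the most acres. There is no term for
    housing, forests, or anything else people value, so nothing stops
    the conversion."""
    allocation = dict(allocation)
    for _ in range(steps):
        other = {use: acres for use, acres in allocation.items() if use != "corn"}
        if not other:
            break  # the only built-in "stop condition" is running out of land
        target = max(other, key=other.get)
        allocation["corn"] = allocation.get("corn", 0) + allocation.pop(target)
    return allocation

if __name__ == "__main__":
    land = {"corn": 100, "housing": 400, "forest": 300, "parks": 200}
    result = plan(land, steps=10)
    print(result)              # -> {'corn': 1000}: maximal corn, nothing else left
    print(corn_yield(result))  # -> 1000
```

The fix here isn't to make the optimizer smarter; it's to change what it optimizes, which is exactly the gap the orthogonality thesis points at.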
> The AI you're postulating is 1) intelligent and capable enough to be called "AGI" and arbitrarily terraform land into a useful field for corn, yet 2) dumb and inept enough to interpret its instructions absolutely literally, like a magic genie;
Your argument, as I understand it, is that something intelligent and capable wouldn't interpret its "instructions" absolutely literally. That's not the way I look at it.
It's not that another intelligence won't be able to understand what our "real goals" are. It's that it won't care - whatever it is programmed to do is, quite literally, its goal/value system.
I mean, we don't have to get exotic here - look at humans. We very clearly evolved to find sex pleasurable in order to spread our genes; just as clearly, many humans, while completely understanding why sex is pleasurable, continue to have sex while using birth control, making no attempt to spread their genes.
And equally, while most other humans have more or less the same value system as I do, I think you'd be hard-pressed to convince me that there have never been humans who tried to do things I disagree with and wouldn't want to happen. And that's just humans. It's pretty clear that if certain people had been more capable, a lot of bad things would have happened (at least from my point of view).
Basically, I've long been convinced that the idea that intelligence implies a particular value system is untrue, and I'm not sure why you think otherwise.
> meanwhile 3) the collective capability of the human species is apparently insufficient to stop this from happening.
Well, I'm hoping it isn't. But in order to stop it, we need to do something, and clearly most people aren't even convinced there's a real threat. Hopefully we can stop an unsafe AGI before it gets built, and hopefully we can also stop it after it exists. But if you're just assuming that we'll be able to stop it after it has already started to do things against our interests, I think you're being optimistic - humanity has already created weapons that, had we decided to really use them, could have destroyed all other humans before they had a chance to respond.
> But I take it about as seriously as alien life trying to invade us, a random meteorite killing us all, a solar flare destroying the Earth or malicious time travelers.
Some of those things are more fanciful. But some of them are real threats - a random meteorite, for example, has wiped out entire species before. I agree we shouldn't be very scared of a low-probability event that we can't control, but that's the difference: we can control AGI, before it gets built, and it doesn't appear to be low probability. Not to mention, most people agree that we should get off this planet to protect against meteorites, too!
As for aliens - if you were truly convinced that aliens would visit the Earth in 200 years, are you really telling me it wouldn't change anything about what you think humanity should be doing? I for one think we would definitely be spending time preparing for that. And that's more or less how I look at AGI - we're creating an alien that will visit us somewhere between 50 and 1000 years from now, and we're not really sure when exactly. What should we do to prepare?