Hacker News | nx2059's comments

So in this paper (https://arxiv.org/pdf/1707.02389.pdf), Terence Tao seems (if I'm grokking it correctly) to embed a Turing machine in a vector space. So is using a single vector not a real problem, or am I just not getting it?


The article was pretty over my head, but does your argument apply to the various augmented neural net systems, such as Neural Turing Machines or Differentiable Neural Computers?


The argument isn't so much about the type of agent (which I think is what Neural Turing Machines etc. are about); it's about the type of environment. In traditional Reinforcement Learning, environments give real-number-valued rewards (or even rational-number-valued rewards, which are more constrained still). Presumably this decision was made with hardly a second thought, because real numbers are most familiar to people... but such "appeals to familiarity" are totally irrelevant for such an alien field as AGI :) The point of the paper is that a genuine AGI should be able to comprehend environments whose rewards have a more sophisticated structure than real numbers can accurately represent.


There are multi-reward agents and multi-task agents, but all rewards get added into a final scalar value, and gradient-based methods need that one scalar value to derive gradients from.

Since you mentioned higher-dimensional representations for rewards, I want to point out that the subfields of Inverse RL and model-based RL are concerned with reward representation and prediction by neural nets.

Also, it doesn't seem like a good idea to try to disprove an entire field with a purely theoretical (a priori) argument. There should be at least some consideration given to the state of the art in the field.
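To make the scalarization point concrete, here is a minimal sketch (the reward components and weights are made-up numbers, not from any particular system) of how a multi-objective reward vector gets collapsed into the single scalar a gradient-based method optimizes:

```python
import numpy as np

# Illustrative sketch: a multi-objective reward vector is collapsed into
# one scalar via practitioner-chosen weights before any gradient is taken.
# All values here are invented for illustration.
reward_vector = np.array([1.0, 0.5, -0.2])  # e.g. score, progress, energy cost
weights = np.array([1.0, 0.3, 0.1])         # hand-tuned trade-offs

# The single value the loss/gradient machinery actually sees:
scalar_reward = float(weights @ reward_vector)
```

Whatever structure the reward components had, only this one real number reaches the optimizer.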


I make no attempt to disprove an entire field; rather, I question an implicit assumption, namely that real-number rewards are flexible enough to capture all relevant environments an AGI should be able to comprehend. Quite the opposite: I indicate ways reinforcement learning could be modified to get past the roadblock with real numbers that I point out.


IIUC, the claim is that the very idea of a (real-valued) "objective function" to be "optimized" is broken?


Broken in the sense that it's not flexible enough to apply to all conceivable environments a genuine AGI could navigate without misleading that AGI. But I should stress that real-number objective functions are probably fine for many specific interesting environments; I'm not trying to say they're useless, just that they aren't flexible enough to cover all environments :)


Fair enough. It would be interesting/instructive to construct (relatively simple) examples where we can see that they’re broken :-)


Easy to come up with examples using exotic money-related constructions. Suppose there's something called a "superdollar". If you have a superdollar, you can use it to create an arbitrary number of dollars for yourself, any time you want, which you can trade for goods and services. If you want, you can also trade the superdollar itself. Now picture an environment with two buttons, one of which always rewards you one dollar, and the other of which always rewards you one superdollar. Shoe-horning this environment into traditional RL, you'd have to assign the superdollar button some finite reward, say a million. But then you would mislead the traditional-RL-agent into thinking a million dollars was as good as one superdollar, which clearly is not true.
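A toy sketch of the mismatch (my own illustration, not from the paper): represent rewards as (superdollars, dollars) pairs compared lexicographically, and note that any finite scalar assigned to the superdollar button is eventually overtaken by plain dollars, misrepresenting the true preference:

```python
# Toy illustration (not from the paper): rewards as (superdollars, dollars)
# pairs, where one superdollar beats any number of plain dollars.
def better(a, b):
    """True if reward pair a is strictly preferred to reward pair b."""
    return a > b  # Python tuples already compare lexicographically

superdollar = (1, 0)
assert better(superdollar, (0, 10**12))  # beats even a trillion dollars

# Scalarizing misleads the agent: assign the superdollar any finite value M,
# and M + 1 one-dollar rewards look strictly better -- contradicting the
# lexicographic preference above.
M = 1_000_000
scalar_view_prefers_dollars = (M + 1) > M           # True under scalarization
true_preference = better(superdollar, (0, M + 1))   # True: superdollar wins
```

The point of the sketch: the two comparisons disagree, so no finite choice of M reproduces the real ordering.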


But isn't the sophisticated structure what leads to the real number? If you change the rewards to something more complex, surely you still have to pick between actions, and at some point you'll have to evaluate which one is "better", and I can't see why you couldn't use real numbers to represent that utility.

I mean, humans are general intelligences, and you can translate pretty much any human reward into money, which is a real number.

The paper is super long, though. Maybe someone can give a TL;DR that makes sense.


Question: when you say "I can't see why you couldn't use real numbers to represent utility", does your reasoning have anything to do with Dedekind cuts, Cauchy sequences, or complete ordered fields? Because that's what the real numbers _are_. If your reasoning has nothing to do with those things, then it can't possibly be sound, because in order to argue that X has such-and-such a property, you need to know what X actually _is_.

To repeat an example I posted for someone else: Suppose there's something called a "superdollar". If you have a superdollar, you can use it to create an arbitrary number of dollars for yourself, any time you want, which you can trade for goods and services. If you want, you can also trade the superdollar itself. Now picture an environment with two buttons, one of which always rewards you one dollar, and the other of which always rewards you one superdollar. Shoe-horning this environment into traditional RL, you'd have to assign the superdollar button some finite reward, say a million. But then you would mislead the traditional-RL-agent into thinking a million dollars was as good as one superdollar, which clearly is not true.


Good example, although what if you just assigned it a reward of, say, 100 trillion dollars? It might not be exactly correct, but then you're assuming that exactly correct rewards are required for AGI, which seems like a pretty big assumption.

Actually, I thought about this some more, and maybe money wasn't the best example, but I think there must be some internal measure of utility that humans use that can be represented by real numbers.

Imagine you are presented with an array of possible actions with associated (possibly estimated) rewards, and you can only pick one. Maybe there are some doors but you can only open one: behind the first is $1M, behind the second is a superdollar, behind the third is a button that cures world hunger, behind the fourth is your loving family, whatever.

As a human I can pick one, no matter what the rewards are. Even if one reward is "you essentially become God". That means I can order them, and therefore that they can be represented by real numbers (plus infinity for the god option).

I don't see why the infinity would cause an issue: the "you can now do literally anything" reward is worth more than every other reward, but it's the only one. Also, it doesn't actually exist, so who cares?

Actually, I guess it can exist in games, e.g. God mode in Quake. But that should have an infinite reward, and agents should choose it over everything else, so I can't see the problem really.
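The "reals plus infinity" ordering described above can be sketched like so (the utility numbers are invented for illustration; `math.inf` stands in for the "become God" option):

```python
import math

# Made-up utilities for the door-picking thought experiment above;
# +infinity models the single "you can now do literally anything" option.
door_utility = {
    "a million dollars": 1_000_000.0,
    "cure world hunger": 5_000_000.0,   # illustrative value only
    "loving family": 7_000_000.0,       # illustrative value only
    "god mode": math.inf,
}

# The extended-real ordering picks the infinite option over everything else.
best_door = max(door_utility, key=door_utility.get)
```

This works as long as there is only one infinite option, which is exactly the commenter's caveat.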


>I mean, humans are general intelligences, and you can translate pretty much any human reward into money, which is a real number.

A lot of people have written quite a lot of arguments that this is false.


A lot of people have written a lot of arguments about everything. Has anyone actually demonstrated that it isn't true?


Why was this downvoted? Do people here seriously believe that the US doesn't have a system of crony capitalism? Is it so implausible to believe that regulations, even when created with the best intentions, are susceptible to corruption via lobbying by powerful corporations to keep competitors out of the market?


The fact that kmix was downvoted points to a problem with a LOT of the youth today. I grew up in the '80s during the Cold War. I remember the freedoms we "had" in the USA, and how people defected from the communist "utopia" of the USSR.

When I was young I learned that our constitutional republic is based on the assumption that people in power are corrupt, and I believed in quotes such as "When the government fears the people there is liberty; when the people fear the government there is tyranny", which gives a rather obvious interpretation to the Second Amendment: you can have the type of weapons that will cause the government to think twice before taking away our rights. Or "Those who give up liberty for safety and security deserve neither and will have neither", a deliberate nod to the Fourth Amendment.

The war on drugs comes from completely ignoring the fact that the federal government has enumerated powers; Prohibition in the 1920s was only legal because a constitutional amendment had to be made. We have things like "enhanced interrogation" and prison rape, a deliberate ignoring of the prohibition on "cruel and unusual punishment", which is just accepted without a second thought today.

I could go on and on, but it's become clear that something my dad told me years ago is true: in a democracy, you get the government you deserve.


I learned the hard way that it doesn't matter if it happened yesterday; people will 100% forget about it.

Stalin? Misunderstood. The USSR killed more people than the Nazis? Impossible.


I think the solution might be somewhat simple, something I've noticed recently while trying to deal with my own stress. I recalled that as a child I would often do something very strange: I would just lie on the floor staring at the carpet, completely mesmerized by the most banal of things. I mean, there's less happening than watching paint dry. Or I would sit out in the yard in the fresh air and stare at some patch of dirt and grass for some period of time.

I spoke to my girlfriend, and she said she recalled doing something similar; she had a very distinct memory of just staring at a doorknob. I think maybe most children do this instinctively and forget it when they grow up. Rather than worrying about complex esoteric philosophies, just practicing this simple childlike act for some time each day can help. (And yes, I know this is similar to practices known as grounding, but maybe more time needs to be devoted to it for it to restore inner calm.)

