
The whole idea of fork is strange - the design pattern of "child process is executing exactly where the parent process is executing" is foreign to me. Don't we want to direct where the child process is executing? Like, when creating a thread? Why is fork() so conceptually orthogonal to that? Is there a good reason? A historical reason?

I don't find fork() to be obvious or useful or natural. I work hard to never do it.



fork()–exec() separation indeed exists for historical reasons: https://www.bell-labs.com/usr/dmr/www/hist.html

Search for the phrase "Process control in its modern form was designed and implemented within a couple of days."


To me it makes creating processes easy, once you understand how it works:

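    while (1) {
        client_len = sizeof(client_addr);
        int client_socket = accept(listen_socket, (struct sockaddr *)&client_addr, &client_len);
        if (client_socket < 0) {
            // handle error
            continue;
        }
        pid_t pid = fork();
        if (pid < 0) {
            // handle error
        } else if (pid == 0) {
            // child: serve this client, then exit so it never re-enters the accept loop
            close(listen_socket);
            handle_connection(client_socket, &client_addr);
            _exit(0);
        }
        // parent: the child has its own copy of the descriptor, so drop ours
        close(client_socket);
    }

No need to do anything complex to start a new process, such as having to pass arguments to it in some way.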


Oh I understand how it works. I implemented it, in the first POSIX implementation. I just don't get how anybody wants to do that.

Yes, there's the example right there. But it shows the awkwardness immediately - decoding what the f happened by checking a side effect (is pid == 0? wtf?)

How about spoon(handle_connection, ...) or something like that? See how much better?


That makes it harder to pass context. You have to resort to the classic void * context argument, which is awkward to use, or fall back on globals. To me the fork idea is more elegant: it duplicates the program's flow of execution in place.


If you want the child to start executing some other code but you have fork(), it's easy to do it yourself by calling that function.
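To make that concrete, here is a minimal sketch of a function-pointer style call layered on top of fork(). The name spoon is borrowed from the suggestion earlier in the thread; it is hypothetical, as are spoon_fn, add_one and wait_exit_code, and none of them is an existing API. It assumes a POSIX system.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: the child runs fn(ctx) and exits with its result. */
typedef int (*spoon_fn)(void *ctx);

/* Hypothetical helper (not a real API): start a child that runs fn(ctx).
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spoon(spoon_fn fn, void *ctx)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: run the callback and leave without returning to the caller.
         * _exit() avoids flushing stdio buffers copied from the parent. */
        _exit(fn(ctx));
    }
    return pid;
}

/* Illustrative callback: reads an int through the void * context. */
static int add_one(void *ctx)
{
    return *(int *)ctx + 1;
}

/* Wait for a child and return its exit status, or -1 on failure. */
int wait_exit_code(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would write something like spoon(add_one, &x) and later collect the result with wait_exit_code(pid). Note that the void * context from the sibling comment shows up immediately, which is the trade-off being argued about.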

But on the other hand, if you do want the child to execute code at the same place as the parent, but a hypothetical fork() asks you to provide a function pointer, it would be a bit more complicated.


It's a leaky abstraction and everything it does can be done manually, and possibly better. It exists purely because, at some point in the past, threads didn't exist.

If you design your program without fork, you'll probably end up with a cleaner and faster solution. Some things are best forgotten or never learned in the first place.


Can it though?

The beauty of (v)fork(+exec) is that it doesn't need a new interface for configuring the environment in whichever way you want before the other process starts. Instead you get to use the exact same means of modifying the environment to your needs, and once it's done, you can call exec and the new process inherits those things.
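A rough sketch of what that looks like in practice, assuming a POSIX system: the child redirects its own stdout with plain open() and dup2(), the same calls any program would use, and only then calls exec. The helper name run_redirected is made up for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdout redirected to out_path.
 * Returns the child's exit status, or -1 on failure. */
int run_redirected(const char *out_path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ordinary system calls reshape the environment before exec. */
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
            _exit(127);
        if (fd != STDOUT_FILENO)
            close(fd);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything else you can do to a process (chdir, setenv, closing descriptors, dropping privileges) could go in the same spot between fork and exec without any new interface.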

I mean, just look at the interface of posix_spawn.
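For a sense of that interface, here is a hedged sketch that runs a program with stdout redirected to a file using posix_spawnp(). The helper name spawn_redirected is invented for this example; the posix_spawn_* calls are the standard POSIX ones. The redirection has to be described up front as a "file action" rather than performed with a plain open().

```c
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] with stdout redirected to out_path.
 * Returns the child's exit status, or -1 on failure. */
int spawn_redirected(const char *out_path, char *const argv[])
{
    posix_spawn_file_actions_t actions;
    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;

    /* Each pre-exec step needs its own dedicated "add" call. */
    int err = posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO, out_path,
                                               O_WRONLY | O_CREAT | O_TRUNC, 0644);
    pid_t pid = -1;
    if (err == 0)
        err = posix_spawnp(&pid, argv[0], &actions, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&actions);
    if (err != 0)
        return -1;

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Every kind of setup step needs its own entry in the file-actions or attributes objects, and anything the standard did not anticipate has no slot at all.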

I grant though that this isn't without its problems (including performance) and IMO e.g. FD_CLOEXEC is one example of how those problems can be patched up. It's like the reverse problem: you have too wide implicit interface in it, and then you need to come up with all these ways to be explicit about some things.
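As an illustration of that kind of patch, marking a descriptor close-on-exec is a short fcntl() dance. This is a minimal sketch; set_cloexec is a made-up helper name, while F_GETFD, F_SETFD and FD_CLOEXEC are the standard POSIX names.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark fd close-on-exec so it is not inherited by
 * whatever program the process later execs. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

The descriptor is still copied into the child by fork(); the flag only takes effect at exec, which is exactly the "explicit opt-out from an implicit interface" shape described above.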


Add to that, fork is (was) very inefficient. You had to duplicate the entire process state (page tables etc). Then the damn program would exec(), and you would tear it all down again. Took 100ms on older computers. Complete waste.

We would resort to making a weak copy, with page tables faulting in only if you used them. A lot of drama, so the user could make a goofy call that they didn't really want most of the time.


A thread is not the same thing as a process. There are situations where a thread is fine, and others where you need a process.


Think of it as the CS equivalent of cell division and differentiation in biology.



