Now that all *that* isn't so raw, I'd love to know how you felt about the other ...

luckydude · on Feb 20, 2021

Actually I was part of an SCM conference put together by Facebook and Google recently. People are starting to think about what happens after Git.

Unfortunately, even now, it seems that there is a lot catching up to BK still to be done. To be fair, we had kernel level programmers working on it, we don't think anyone will pick up our code, you pretty much have to be a top 1-2% programmer to work on it, it's all in very disciplined C, people don't seem to like that any more.

So far as I know, BK is the only system that gets files right, we have a graph per file, everyone else has a single graph per repository. The problem with that is the repository GCA may be miles away from the file GCA; BK gets that 100% right, other systems guess at the GCA. Graph per file means each file has a unique identifier, like an inode in the kernel. So create/delete/renames are actually recorded instead of being guessed at. SCM systems shouldn't guess in my opinion (actually in anyone with a clue's opinion, would you like it if your bank guessed about your balance? Of course not, so why is guessing OK in an SCM? It's not). Graph per file means that bk blame is instant no matter how much history you have.
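To make the "unique identifier, like an inode" point concrete, here is a minimal sketch of recording create/rename/delete against a stable file id instead of inferring them from paths. It is an illustration only, not BitKeeper's actual data model: `Repo`, `TrackedFile`, and every method name below are hypothetical.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class TrackedFile:
    """One file's identity and its own recorded history (hypothetical model)."""
    file_id: int
    events: list = field(default_factory=list)  # (kind, path) tuples


class Repo:
    """Toy repository that keys history by file id, not by path."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.files = {}    # file_id -> TrackedFile
        self.by_path = {}  # current path -> file_id

    def create(self, path):
        f = TrackedFile(next(self._ids))
        f.events.append(("create", path))
        self.files[f.file_id] = f
        self.by_path[path] = f.file_id
        return f.file_id

    def rename(self, old, new):
        # The rename is written down as an event; nothing is guessed later.
        fid = self.by_path.pop(old)
        self.files[fid].events.append(("rename", new))
        self.by_path[new] = fid

    def delete(self, path):
        fid = self.by_path.pop(path)
        self.files[fid].events.append(("delete", path))


repo = Repo()
first = repo.create("a.c")
repo.rename("a.c", "b.c")
second = repo.create("a.c")  # a new file that happens to reuse the old path

print(repo.files[first].events)
print(first != second)
```

In this toy model the file that now lives at `b.c` keeps its id and its history across the rename, and the new `a.c` is a different file even though it reuses the path.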

BK is the only system that even attempts to get sub-modules (we call them components) right. Where by "right" I mean you can have a partially populated collection and you get identical semantics from the same commands whether it is a mono-repo or a collection of repos. Nobody else has anything close; Git sub-modules turn Git's workflow into CVS workflow (no sideways pulls).

I tried my best to show what we did in BK at that conference, I have no idea if they will swipe any of it. It's not like BK is perfect, it didn't do everything, no named branches, a clone is a branch, which is a model that absolutely will not scale to what people are doing today (we can argue whether TB repos should exist, but they do).

But for the problems BK did solve, it tended to solve them very well. Hell, just our regression tests are a treasure trove of things that can go wrong in the wild and we open sourced both the tests and the test harness.

JNRowe · on Feb 20, 2021

Thanks, although I think you're demonstrating here and in the other comments why you should write a real history.

Was the conference recorded? I've tried searching, but I'm not turning anything up.

As an outsider you get my worthless full agreement on strictness of history, and on solving the monorepo or vendoring dilemmas. My employer at the time of the upheaval was a bitmover customer, and as we slowly switched away one repo at a time it definitely felt like a sideways step. I'd hesitate to say backwards because it did come with some big process improvements for us, but definitely not forwards.

I'd surely have been proud of solving problems with the quality that BK did too. I remember playing with a lot of the open source systems of the time¹, and none of them were in the same league. I'll make no apologies for this sounding like truly weird fan mail.

¹ I'm remembering hg, darcs, monotone, $some_implementation_of_arch, prcs, codeville but there were a lot of people in the space to some degree.

luckydude · on Feb 20, 2021

I think it was recorded, I'll go look.

Apologize for saying BK is quality? None needed, we prided ourselves on producing a quality product. And great support, our average response time, 24x7, was 24 minutes. It was only that "slow" because we were North America based. If you only considered the US work week, response time was usually under 2 minutes, but that's not reasonable because we had customers all over the world.

I'm gonna start with a write up of the SCCS weave, with a goal that it is enough of a spec that you could go implement it. Maybe add some notes about how I did it because the way I did it was unusual and had the side benefit that you could extract the GCA, left tip, and right tip for a merge in one pass.

nextaccountic · on Feb 21, 2021

What do you think about patch-based systems like Darcs and Pijul instead of snapshot-based systems like Git?

Recent article on Pijul: https://initialcommit.com/blog/pijul-version-control-system/

luckydude · on Feb 21, 2021

I think if you are asking this question either I have completely failed to explain why weaves are cool or you haven't read what I said about why weaves are cool.

Patch based systems are idiotic, that's RCS, that is decades old technology that we know sucks (I've had a cocktail, it's 5pm, so salt away).

Do you understand the difference between pass by reference and pass by value? You must, but in case you don't, you can pass by reference in sizeof(void *), 4-8 bytes. Pass by value and you are copying sizeof(whatever it is you are passing) onto the stack. Obviously, pass by reference is immensely faster.

But in SCM, it isn't just about speed (and space), it's about authorship. In a patch based system, imagine that there is user A who is doing all the work on the trunk, there is user B who is doing all the work on a branch, and then there is user U who is merging the branch to the trunk. Let's say B added a bunch of work on the branch, it all automerged. U did the merge. In a patch based system, all of the B work is going to be copied (passed by value) to the trunk and the authorship of that work will change from B to U (since U did the merge).

Flip forward to a month from now, the code has panicked or asserted, whatever, B's code on the trunk took a crap. People run git blame to see who did that, and who did it? U did. But U didn't, B did, but U merged it and it was a copy so it looks like U did it.

That's just the SCM being dishonest because it has no choice, it is pass by value.

Weaves are pass by reference. If you merged in BitKeeper and it automerged, you run blame (we call it annotate but I should make blame be an alias if I haven't already, I'm the guy that came up with blame as that verb), you would only see A and B as the authors.

Weaves mean authorship is correct and that whole repack nonsense that Git does? Yeah, that goes away, you are passing everything by reference so there is only one copy of the code no matter how many branches it has been merged from/to.

Anyone who is pushing a patch based system (and Git is one as well) just doesn't have a clue about how to do source management. Maybe something better than a weave will come along (and if it does, rbsmith will do it, that guy bug fixed my crappy weave implementation) but I think it will just be a better weave with new operators like MOVE (current weaves know INSERT and DELETE, that's it).
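The A/B/U scenario above can be sketched with a toy weave. This is a simplified illustration in the spirit of the SCCS weave, not the real SCCS or BitKeeper file format: the `Ctl` type, the `AUTHORS` table, and the visibility rule (innermost enclosing INSERT must be in the version, no enclosing DELETE may be) are assumptions made for the example.

```python
from typing import NamedTuple


class Ctl(NamedTuple):
    """Control marker in the toy weave (hypothetical encoding)."""
    op: str      # "I" = insert block, "D" = delete block, "E" = end of block
    serial: int  # delta that introduced the block


# Delta 1: A creates the file. Delta 2: B adds a line on a branch.
# Delta 3: A replaces a line on the trunk. Delta 4: U merges, adding no text.
AUTHORS = {1: "A", 2: "B", 3: "A", 4: "U"}

WEAVE = [
    Ctl("I", 1),
    "int main() {",
    Ctl("I", 2), '    puts("from the branch");', Ctl("E", 2),
    Ctl("D", 3), "    return 1;", Ctl("E", 3),
    Ctl("I", 3), "    return 0;", Ctl("E", 3),
    "}",
    Ctl("E", 1),
]


def annotate(weave, included):
    """Return (author, line) pairs for the version made of `included` deltas."""
    active = []  # currently open blocks, innermost last
    out = []
    for item in weave:
        if isinstance(item, Ctl):
            if item.op == "E":
                # Close the most recently opened block with this serial.
                for i in range(len(active) - 1, -1, -1):
                    if active[i].serial == item.serial:
                        del active[i]
                        break
            else:
                active.append(item)
            continue
        inserts = [c.serial for c in active if c.op == "I"]
        deleted = any(c.op == "D" and c.serial in included for c in active)
        if inserts and inserts[-1] in included and not deleted:
            # Authorship comes from the delta that inserted the line.
            out.append((AUTHORS[inserts[-1]], item))
    return out


merged = annotate(WEAVE, {1, 2, 3, 4})
for author, line in merged:
    print(f"{author}: {line}")
```

In this sketch the merge delta (4, by U) only widens the set of included deltas. Each line is stored once and is attributed to whichever delta inserted it, so U never shows up as an author of B's line.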

Sorry if I'm being a dick, not looking for sympathy but I've got health problems, my feet hurt like crazy and I get kind of terse at the end of the day. If you truly want to understand more, and this goes for all of hacker news, I'm happy to get on a zoom call and talk this stuff through. And it is blindingly obvious I need to write up the SCCS weave and I will do so, you guys have inspired this 58 year old, burned out, can't code to save his life, dude to at least try and pass on some knowledge. I would love to be working with some young person who has some juice and pass on what I know. I don't know everything about SCM but I know a lot. I'm done, it's time for someone else to carry things forward, I'll help if you want. The world deserves a better answer than what we have now.

neolog · on Feb 21, 2021

I'd be interested to see a conversation between you and the Pijul author. He hangs out in the forum at https://discourse.pijul.org/ and the chat at https://pijul.zulipchat.com/.

pmeunier · on Feb 21, 2021

Turns out I also hang out here sometimes ;-)

I was about to reply point by point with arguments, but I changed my mind:

I don't think I'd be interested in that conversation, the comment above is worse than a public display of a very poor understanding of Git and Pijul: it also shows a complete ignorance of other actors in the market. It turns out the market leader for big repositories, Perforce, is itself based on RCS, possibly (but I don't know for sure, since Perforce is proprietary) because RCS scales much better to large binary assets than other solutions (I'd argue Pijul solves that, but that's beside my point).

I am a big fan of Git, Mercurial and Darcs myself, my co-author on Pijul was actually a maintainer of Darcs for many years, and I'm actively collaborating with the maintainers of Mercurial at the moment. And even though Perforce and Plastic are closed-source, they do solve one problem (scalability) which distributed systems are only beginning to understand (if there's one thing I think we've achieved with Pijul, it is about getting beyond the "distributed vs. scalable" trade-off).

Here's my take on the comment above: I don't think you can build good systems without understanding how others are made. There are no free lunches, no silver bullets, no geniuses. Only good ideas, bibliography, and hard work.

luckydude · on Feb 21, 2021

I've built two commercially successful systems, NSElite that was used to develop the Solaris kernel and was productized as Avocet/CodeManager/TeamWare (don't blame me, I didn't name it), and BitKeeper which was, at one point, in use on every continent other than Antarctica. If you are running on a 5 year old Intel CPU, that was developed using BitKeeper.

The comment that RCS scales for binaries couldn't be more wrong, RCS hates binaries with a passion.

The fact that you didn't address any of the points I raised says you either don't understand what a weave file format is, or, you do, you recognize it is a much better format but don't want to talk about that.

This part of this thread reminds me of something Ron Minnich once said: "Don't worry about people stealing your ideas, you are going to have to cram them down their throats".

I'll drop it, we can revisit when I write up how weaves work. I don't think any objective person would argue that a patch based system is as good, let alone better.

pmeunier · on Feb 22, 2021

I came back here to say:

> I've built two commercially successful systems,

This is a great achievement, congrats on that!

All I was saying in my previous comment was, these very cool achievements don't prevent you from calling all other people "idiotic", and from saying "they don't have a clue" without even looking at their designs, or even thinking that they might have had ideas different from yours.

Also, calling Git "patch-based" is quite wrong, but my point wasn't technical.

pmeunier · on Feb 21, 2021

> The comment that RCS scales for binaries couldn't be more wrong, RCS hates binaries with a passion.

That isn't what I meant, I was actually talking about Perforce. RCS doesn't even handle multiple files.

neolog · on Feb 21, 2021

Tbh I think it's not likely to be a productive conversation unless insinuations of bias and ignorance can be eliminated.

luckydude · on Feb 21, 2021

I think the most productive thing I could do is write up how the weave works. People don't seem to understand that, if they did, nobody would be talking about patches.

I'm sorry if I pissed off the other guy, but I'm retired, and I'm tired, and I have no interest in arguing with people who don't get it. But it's on me to provide the info that lets them get it. I'll do that and what people do with it is up to them.

pmeunier · on Feb 21, 2021

(Also, our Discourse and Zulip have been pretty civil places until now, I'd like to keep it that way).

ymbeld · on Feb 20, 2021

> Unfortunately, even now, it seems that there is a lot catching up to BK still to be done. To be fair, we had kernel level programmers working on it, we don't think anyone will pick up our code, you pretty much have to be a top 1-2% programmer to work on it, it's all in very disciplined C, people don't seem to like that any more.

Oh my oh my.

luckydude · on Feb 20, 2021

Believe me, I would do backflips if someone wanted the BK source base, it's got almost 2 decades of my work in it, north of 140 man years. It's a lot to just let fade away.

But I assembled a team of people better and smarter than me, I did my best to keep the code simple but I didn't always succeed.

If you, or anyone, wants to pick it up, I'm happy to answer questions.

mucholove · on Feb 20, 2021

Will take a look at it.

At this point, the trick to creating a sizable git alternative I think is to Trojan horse coding into a new realm with “no-code” like apps.

One of my favorites is Fossil by Richard Hipp. I wonder what your thoughts on it are. I think it is RCS, but usability wise I think it’s way ahead of git.

I just recently learned C and I really like the coding style Hipp uses as well. :)

luckydude · on Feb 20, 2021

Richard is awesome. He also did sqlite. As for Fossil, I'm a fan of his UI and code, but haven't looked in detail at the design. I'm pretty sure it is patch based with everything stored in sqlite. It's an OK way of doing things but not if you understand a weave.

But no disrespect intended towards Richard, I've met him plenty of times and enjoyed it each time. Great guy.