> Solely to allow us to provide the Service and to host the Content you upload to the Website without violating any rights you have in it, you grant GitHub and our successors a nonexclusive, worldwide, transferable, fully-paid and royalty-free license to use, reproduce, display, modify, adapt, distribute, and perform the Content in connection with rendering the Website and providing the Service.
This appears to let Github take any software repository hosted there, and use it under this license to provide features to the github website. So they could take AGPL licensed software and modify and use it without complying with the source disclosure requirements of the AGPL because they have been provided this other more permissive license. It's essentially a free BSD license to everything on Github, for Github alone.
Also, "the Service" is defined as any product or service Github provides, so this could be expanded into any business.
The "transferable" in that also might let GitHub contract with some other company to provide part of "the Service", perhaps in a business entirely unrelated to the current GitHub website, and transfer the license to any free software they like to that other company.
I don't know if it was intended to be used this way, and I am not a lawyer so I could be misinterpreting it, but I will not be hosting any software on GitHub if they adopt this TOS, without consulting a lawyer.
We've gotten a few questions about this wording, and we'll revise it to clear up confusion. Thanks for the feedback.
We have absolutely no intention of taking any of the code people store with us or using it for our own purposes. As we said, this is solely to allow us to host your content without violating your rights.
No. Like it says, the license is solely to allow the service to function and display what you have uploaded to it.
Virtually every website that allows user uploads has a provision like this so that you cannot upload a copyrighted work and then turn around and sue them for infringement.
Except Github is hosting software, and this license allows them to modify and use that software in any way they like, without complying with the software's normal license.
Yes, even if your license says you have to quack like a duck when you view the source, if you intentionally upload it to GitHub... they have a right to host it on their servers without quacking. This is not malicious. And it's kind of obvious?
Here's what bitbucket says:
>Subject to the terms of this Agreement, you hereby grant to Atlassian a non-exclusive, worldwide, royalty-free right to (a) collect, use, copy, store, transmit, modify and create derivative works of Your Data, in each case solely to the extent necessary to provide the applicable Hosted Service to you and (b) for Hosted Services that enable you to share Your Data or interact with other people....
Wouldn't using, say, a high performance source code search engine in their backend, be "in connection with rendering the Website and providing the Service"?
Isn't there a (theoretical) concern for someone hosting GPL'd frameworks/servers/etc. on GitHub, that GitHub might want to use to replace or build their services?
This is the wording that concerns me most, because the Service is defined as:
> The “Service” refers to the applications, software, products, and services provided by GitHub.
Thus, GitHub could legally use this to use any software hosted by them, so long as it entered their stack at some point.
Contrived example:
They could use a modified Linux kernel at the bottom of their stack, and refuse to give anyone access to the source. That would breach the GPL, but the GPL wouldn't apply, simply because the Linux kernel is mirrored on GitHub.
> They could use a modified Linux kernel at the bottom of their stack, and refuse to give anyone access to the source.
Sure, if they wanted to risk billions of dollars of valuation for no good reason. After such a thing was discovered, how long do you think folks would continue hosting their private proprietary projects there?
That's not Github's primary use for the software they host.
The idea that they'd want to save a few thousand bucks by stealing someone's proprietary code to run Github.com itself is simply silly - it'd risk their reportedly multi-billion dollar valuation.
While I wrote my comment from a free software POV, this also seems to apply to proprietary software hosted on Github. Both software in private repositories there, and software with published code whose license does not allow modification.
Github does not need to sell the content to do anything I described. The content is software, and they can run it, modify it, and sell access to the site that runs it.
And "Service" is defined as "anything github is doing", so distribution inside the Service could involve any number of entities.
Service does have a definition, as it should in any legal document. Which isn't quite everything that they are doing:
> The “Service” refers to the applications, software, products, and services provided by GitHub.
If you focus the sentence on the "provided by" phrasing, then anything they take using their new TOS, would have to somehow be involved in directly forwarding something on to customers, as it needs to provide something. So they couldn't use it for anything internal only, such as say, a spreadsheet program, because the user doesn't directly benefit. They could however use a code auditing tool, as they could claim they are providing a better product to the public by auditing. It's a little bit grey, but there are limits.
Not to say that I agree with the new TOS, as I have some serious doubts about using it, and it does make me consider removing some of my projects from GitHub, though they are working on the wording still. We'll see.
> You may scrape the website for the following reasons:
> Researchers may scrape public, non-personal information from GitHub for academic research purposes, only if any publications resulting from that research are open access.
> Archivists may scrape GitHub for public data for archival purposes.
> It is prohibited to scrape GitHub for spamming purposes, including for the purposes of selling GitHub users' personal information, such as to recruiters, headhunters, and job boards.
How about a level of indirection?
Alice the Archivist scrapes GitHub and compiles a compendium of names / emails. She then publishes this information in bulk for anybody who'd like to use it. Alice's cousin Roger the Recruiter downloads the archive and does what he does best. Alice isn't selling anything, she's just providing the information bundled up. Roger isn't getting it from GitHub, he's getting it from the public feed (say via BitTorrent or a public data set posted somewhere).
I just don't think the whole "you can do X but only if you use the data in ways we want" is a good idea.
It's completely unenforceable, and there are millions of loopholes like the one you just said that will let them skate around the restrictions both legally and in action.
This seems like an addition because of the "GeekedIn leak"[0] of a bunch of scraped GitHub data from Nov of last year, but I just don't get what it's trying to do. Bad actors are just going to ignore it like normal (or find loopholes) and good actors weren't doing this in the first place.
Personally, I wish they left out the "what you can do with the data" and focused their ToS only on the "how you are allowed to scrape it". Give solid limits on amounts and speeds and other various aspects of the scraping, make them restrictive enough to prevent a service like that from working, then provide a point of contact so that an "archivist" can be allowed to do more on a case-by-case basis.
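To make the "solid limits on amounts and speeds" idea concrete, here is a minimal sketch of a token-bucket throttle that a scraper could use to stay under a published limit. The numbers are made up for illustration; nothing in the ToS specifies a rate or burst size.

```python
import time


class TokenBucket:
    """Allow at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # hypothetical limit: tokens added per second
        self.capacity = capacity  # hypothetical limit: maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may be sent now, consuming one token."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A scraper would call `allow()` before each request and sleep when it returns False. An "archivist" exception would then just be a larger `rate` granted case by case.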
I would imagine the idea is to have a legal recourse to go after a company built atop breaking the terms of service. An example of that would be Craigslist going after Padmapper. There's tons of scraping of Craigslist and I'm pretty sure it's a violation of the terms, but they'd reserve legal enforcement for specific circumstances.
I just think that there are better ways of limiting that (however I'll freely admit that I don't really have any experience in the area, especially the legal aspect, so I could be horribly off on this).
I've seen many very cool, creative, and useful tools pop up on top of GitHub and sites like it, and I just fear that changes like this will have the (perhaps unintended) side effect of shutting them down.
GitHub has an API. Use of an API is not 'scraping'. The act of scraping involves pulling full HTML pages and parsing out data. Because scraping and API usage are different, there are different ToS for each.
I suggest that useful tools are sticking to the API.
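For anyone unsure what the difference looks like in practice, here is a rough sketch of the API route: request JSON from the public REST endpoint for a repository and read fields out of it, with no HTML parsing at all. The function names and the `example-tool` User-Agent are placeholders of mine, and the summary only picks out a few of the fields the endpoint returns.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_repo_request(owner, repo):
    """Build (but do not send) a request for one repository's metadata."""
    return urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}",
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "example-tool",  # placeholder client name
        },
    )


def summarize_repo(payload):
    """Pick a few fields out of the JSON body returned by the API."""
    data = json.loads(payload)
    license_info = data.get("license") or {}  # null when no license is detected
    return {
        "name": data["full_name"],
        "stars": data["stargazers_count"],
        "license": license_info.get("spdx_id"),
    }


def fetch_repo_summary(owner, repo):
    """Send the request and summarize the response (needs network access)."""
    with urllib.request.urlopen(build_repo_request(owner, repo)) as resp:
        return summarize_repo(resp.read())
```

Unauthenticated API calls are rate limited, so a real tool would also want authentication and some backoff.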
Maybe not enforceable (I dunno) but the academic scraping clause seems very good. You can't really hide that you scraped from GH if you want to publish it according to accepted standards (transparency, i.e. there will be a "data was collected this way" section). So if you don't open access your work you risk writing a worse paper if you want to hide where you got the data. At the very least this gives us a nice opportunity to shame people for not OAing their Github-data based papers. I like it :)
If Alice's archive gets no traffic, and Roger's recruiting firm does 40% of their business using Alice's data, and Alice and Roger get dinner together every week, Alice seems to be doing something wrong.
Even on his own, Roger is still scraping Github. The defense "It wasn't me, it was the Python program that was in the wrong!" is just as ludicrous as trying to hide behind the fact that Alice's archive is scraping the site itself. They're putting limitations on the information, not limitations on how computers are allowed to talk to each other.
But how can GH establish intent? Legally it's unenforceable (good luck getting an answer from my lawyer), practically it's irrelevant (I will archive this data now, maybe in the future I will be convinced that selling them is ethical), etc.
The only time it'll be tested is in hindsight. If you scrape the data and put it on your laptop, they don't really care. If you then sell the data, they can come after you for breach of service.
They're not trying to keep you from scraping, they're trying to keep you from using the data in ways that would be distasteful to their users.
What if you're downloading just the git repository and scraping that. In theory you're not scraping github but the files github allows you to download.
This is why there are "derivative" clauses on many licences. "You may do this, so long as what you produce/derive is under the same conditions". That would stop the problem you highlight.
Section D, rule 7 seems like it could use some work:
> GitHub employees do not access private repositories unless required to for security or maintenance, or for support reasons, with the consent of the repository owner. [...] If we have reason to believe the contents of a private repository are in violation of the law or of these Terms, we have the right to remove them.
So, basically, GitHub says that there's a possibility for them to remove a private repo because of a hunch that it violates ToS, without actually looking at it to make sure?
It says "reason to believe" which means they will have evidence (the "reason"). You might need to sue them in order to see it, but that's usually how these things go.
If they wanted to give themselves free rein to remove repos they would say something like "we have the right to remove private repos at any time, for any reason, at our sole discretion."
> Default Contributor License: To address growing confusion over licensing and contributions to others’ projects, we added a simple default contributor license. If it does not suit your needs, you may add your own Contributor License Agreement to your repository.
That is much needed. Does anyone know what the new default contributor license is?
"Additionally, unless there is a Contributor License Agreement to the contrary, whenever you make a contribution to a repository containing notice of a license, you license your contribution under the same terms, and agree that you have the right to license your contribution under those terms."
It makes me so unreasonably happy to see that added to the terms of GitHub..
Looks like, by default, it licenses the code to be viewable and usable by any third party, but it can only be duplicated on GitHub itself. Actually, being able to use/perform would seem to contradict the duplication statement, since you'd likely have to duplicate the code somewhere other than GitHub to view/perform it. Another issue is, what if I start under the default terms, someone clones the repo, and then I post it under, say, a GPL? Does that mean the fork of the original is now perpetually under the terms of the default license, and can the fork be forked? Seems like you'd want to be very careful about allowing a default license like this to take effect... so much so that it raises a secondary issue...
I'm not a lawyer but, I wonder if your terms of service can passively force a licensing agreement like this and still hold legal muster. We're talking about much more than simply licensing GitHub to display your code if you don't license it otherwise. Here GitHub is essentially causing you to enter into agreements with unrelated third parties on an opt out basis. Unless there are warnings about what you are about to do, seems like it could cause problems for both the licensor and licensee.
Some say a default license is needed, but there is a default license position: you are not licensed to use the code unless granted permission to do so. When I don't select a license, it seems more natural for GitHub to simply block forking and repo access to anyone other than the owner, and to allow the unlicensed code only to be displayed per an agreement with GitHub. That only requires that the GitHub user and GitHub have an agreement, rather than the GitHub user and an unrelated third party. Or explicitly force a license choice on repo creation: pick one or upload something. That, too, would get past this without forcing a license, which, to me at least, looks problematic.
I'm in the process of moving all of my repositories hosted on GitHub to fossil scm instead. Unlike GitHub it is open source[1] and with a simple cgi script[2] I am able to host all of my repositories on my own web server. After knowing git, fossil wasn't very hard to learn and it comes with ticketing and wiki built-in and accessible via a web UI locally on any machine I'm using.
One of the benefits in addition to being able to read fossil source code and being able change/style the web interface however I want is that I don't have to worry about staying up to date on Terms of Service changes I really have no say in.
Edit: failed to mention fossil can import/export to/from a git repository[3]
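For reference, the Git-to-Fossil direction is the pipeline `git fast-export --all | fossil import --git new.fossil` from the Fossil docs. Below is a small sketch that wraps it in Python; the helper names are my own, and it assumes both `git` and `fossil` are on the PATH.

```python
import subprocess


def git_to_fossil_commands(fossil_repo):
    """Return the two halves of `git fast-export --all | fossil import --git`."""
    return (
        ["git", "fast-export", "--all"],
        ["fossil", "import", "--git", fossil_repo],
    )


def pipe(producer_cmd, consumer_cmd, cwd=None):
    """Run `producer | consumer` and raise if either side fails."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE, cwd=cwd)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout, cwd=cwd)
    # Drop our copy of the pipe so the producer notices if the consumer exits.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    if producer_rc or consumer_rc:
        raise RuntimeError(f"pipeline failed: {producer_rc=}, {consumer_rc=}")


def import_git_into_fossil(git_dir, fossil_repo):
    """Convert the Git repository at git_dir into a new Fossil repository file."""
    export_cmd, import_cmd = git_to_fossil_commands(fossil_repo)
    pipe(export_cmd, import_cmd, cwd=git_dir)
```

Pass an absolute path for `fossil_repo`, since both commands run inside `git_dir`. Going the other way is the mirror image: `fossil export --git` piped into `git fast-import`.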
I'm almost in the same boat. I'm putting all new repos in Fossil.
I like being able to have complete control over the web interface. I like that it's lightweight, unlike say Gitlab, yet provides a full website. I like that the bug tracker is part of the repo rather than part of a company's proprietary infrastructure.
The biggest advantage, by far, is that it's not Git. I agree that Git is good for a lot of projects. It's an unnecessarily complicated beast for small projects with a couple of collaborators, neither of whom are willing to spend hours dealing with weird commands to handle a repo that got messed up for unknown reasons. Fossil, unlike Git, is version control that others are willing to use.
I agree 100%, I'm a lot happier with it for personal projects especially once I got the hang of it, and you get a lot of web tools that are baked into it nicely. The only concern going into it was knowing that I couldn't change/undo previous history but since using it I actually think that is a version control feature you want (assuming you don't slip in a password) so you don't risk losing previous data, you just have to be more thorough in looking at what you are committing.
You actually can delete a password using shunning, but it's not a regular part of the development process. (I only mention that because many potential users are scared off because they misinterpret not being able to change history.)
I've tried GitLab before and wasn't too impressed. Gogs looks interesting; it looks like it might just be a single binary file like fossil, which would be nice. With fossil, all I had to do on Windows was add the location of the single binary to the path, and on Linux I just had to move it to my bin folder and it worked. I've been really happy with fossil so far, so I'm not looking to switch anytime soon, but Gogs certainly looks interesting.
I'd definitely recommend GitLab for teams and workplaces... But not so much an individual. Though, the CI stuff being so damn simple makes for some niceness.
Gogs/Gitea are great for individuals as a single drop in file, but it is a hosted service.
If you're willing to drop git and use fossil, it comes with the benefit of pretty much everything any hosted git solution has, as well as not needing to run a service. It has ticketing and wiki inbuilt. The only thing a hosted solution might supply is CI/CD.
If you're going well with fossil, with its ability to integrate with git repos, I don't think you're going to find anything else that feels as good.
Yes, I believe Gogs is a single binary and fairly easy to deploy as a result. GitLab is nice, but is a lot to set up and maintain unless you like that sort of thing, as I do.
Do what a lot of people do, find your favorites, cobble together your own and go from there. Of course a lawyer is best, but until you can pay for one and at least while things like privacy policies are important for SEO, fake it till you make it.
Of course you should actually abide by your own policies. You can even search for common phrases[0] to see how many templates are out there.
We open-sourced all of our docs where I work. As previously noted, it's no substitute for a lawyer if something serious is really on the line, but it's probably a good place to get started. If you're not selling B2B you probably don't need to get mired down on the specifics of legality or need to have a framework to negotiate terms and policies. The biggest thing you'll want to do is set limitations on liability for usage. You'll save money giving a lawyer 80% of T&C to tune vs. having a lawyer start from a boiler plate T&C. Especially if you need to manage any vertical-specific nuances that a biz lawyer may not be specifically attuned to.
If you can't afford a lawyer then you probably can't afford the side project. With some things you simply get what you pay for. Start an LLC, pay a lawyer a couple hundred bucks and you'll be set for quite a while.
A side project doesn't have to be an LLC. Most basic web services require some sort of Terms of Service, but if you're only building it for fun, paying a couple of hundred bucks for a lawyer to write up terms of service no one cares about doesn't seem cost effective.
Section K.3. is very problematic - it is (essentially) the same as the previous TOS. It states that github can restrict/delete your account for any reason, at any time, without warning. This may be fine for trivial projects, but if you're running a business and relying on github for actual revenue, this is completely unacceptable. I have asked people about this, and their answer has essentially been, Cross our fingers and pray it doesn't become a problem. I won't use github for anything remotely serious until this substantially changes.
A very limited and explicit list of what can cause account/content restrictions/deletions. We have to know what behaviors/content can trigger this, otherwise it's just groping around in the dark, which is an awful way to run a business.
The new license grant to others is nicely explicit on what permissions are granted on an unlicensed repo vs the old vague "you agree to allow others to view and fork your repositories." Note that you CANNOT legally edit your fork of an unlicensed repo. That would be violating the original repo's copyright.
Right, sorry, I meant to say 'public'. And I don't see that it is beside the point, because there's no other relevant and public repository that I can identify - no github/terms or whatever.
(The current TOS has nothing like this in it.)