Hacker News | adamnemecek's comments

Good call, arXiv seems like one of the most important institutions out there right now.

The French government put a bit of money on the table to help researchers fulfil their open science requirements for government and EU grants, and funded the HAL repository ( https://hal.science/ ). It’s much smaller than arXiv, but it exists. In other countries like the UK there are clusters of smaller repositories as well, but it’s not as well centralised.

It’s so important, in fact, that there should be more than one such institution.

People keep falling into the same trap. They love monopolies, then are shocked when those monopolies jerk them around.


I have been using Zenodo instead for a while now. It is also more user-friendly.

Zenodo is more for IT papers and datasets, isn't it?

It can host large datasets as well, yes. It is hosted by CERN, so it is not specifically IT in any way. It also allows you to restrict access to the files of your submission. It does not require you to submit your LaTeX sources; any PDF will be fine. There are also no restrictions on who can publish. You'll get a DOI, of course.

Everything published on arXiv could also be published on Zenodo, but not the other way around.


Oh, interesting, I didn't know this.

I like it as well, it works great. But I wonder if it would scale if at some point there were a massive exodus from arXiv.

I think it already hosts much more data than arXiv, given that they also host large datasets.

It is just a preprint repository. It is pretty open (the stories where a preprint was rejected or delayed unreasonably are extremely rare). It offers the basic services for a math/compsci/physics themed preprint repository.

I don't see much of a monopoly, nor any "moat" apart from it being recognised. You can already post preprints on a personal website or on GitHub, and there are "alternatives" such as ResearchGate or Zenodo that can also host preprints. There are even some lesser-known alternatives. I do not see anything special in hosting preprints online apart from the convenience of having a centralised place to put them and search for them (which you call a "monopoly"). If anything, the recognisability and centrality of arXiv helped a lot in the old, darker days to establish open access to papers. There was a time when many journals would not let you publish a preprint, or had all kinds of weird rules about when you can and when you can't. Some probably still do, to some degree.


There is: bioRxiv.

It just hosts PDFs, no?

It does do a fair amount of filtering of submissions, and it's a long term archive (e.g. for the next 100+ years). I suspect both (but with the former dominating) are the issue.

Just put out a torrent and people of the sort at r/DataHoarder will keep it alive for longer than bureaucrats.

Well, technically, it can also compile your TeX file if you upload the TeX file instead of the PDF directly, which helps a lot in standardizing the stylistic structure between preprints. Most other repositories are a wild west and inconsistent. I really appreciate the similarity in style applied to most preprints there. Moreover, this means you can download not just the PDF but the source TeX file too, which can be very useful.

The similarity in style comes from conference and journal templates, not from arXiv. You can style your paper with LaTeX in any style; arXiv doesn't care. On arXiv you mostly see preprints that people submit to conferences and journals, and those venues enforce the style.

It also hosts the sources and has a very tame but useful pre-acceptance process.

Technically yes, socially no.

Who could have predicted that a guy who has no experience developing superintelligence would fail at developing superintelligence?

Gradually develop an OS that is not just an Android fork, but a full-blown OS people can contribute to. And of course write it in Rust, since the problems with Java are so apparent in Android.


This feels like a perfect use case for AI.


How would AI help?


Robots or drones with ground penetrating radar?


It might be a good idea to look into the research on streams as coalgebras. There is quite a bit, for example here: https://cs.ru.nl/~jrot/CTC20/

Coalgebras might seem too academic, but so did monads at some point, and now they are everywhere.
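
For anyone who hasn't seen the idea before, here is a rough sketch of what "streams as coalgebras" means in practice. A stream is described by a state plus a single step function that maps a state to a pair (next element, next state), and the stream itself is whatever you get by unfolding that step from a seed. The names `unfold`, `fib_step`, `nat_step` and `take` are made up for this illustration and are not taken from the linked material.

```python
from itertools import islice
from typing import Callable, Iterator, List, Tuple, TypeVar

S = TypeVar("S")  # hidden state of the coalgebra
A = TypeVar("A")  # type of the elements the stream emits


def unfold(step: Callable[[S], Tuple[A, S]], seed: S) -> Iterator[A]:
    """Unfold a (possibly infinite) stream from a step function and a seed.

    `step` plays the role of the coalgebra: state -> (head, next state).
    """
    state = seed
    while True:
        value, state = step(state)
        yield value


def fib_step(state: Tuple[int, int]) -> Tuple[int, Tuple[int, int]]:
    """Fibonacci as a coalgebra: the state is a pair of consecutive values."""
    a, b = state
    return a, (b, a + b)


def nat_step(n: int) -> Tuple[int, int]:
    """Natural numbers as a coalgebra: emit n, move on to n + 1."""
    return n, n + 1


def take(n: int, stream: Iterator[A]) -> List[A]:
    """Observe only the first n elements of a stream."""
    return list(islice(stream, n))


if __name__ == "__main__":
    print(take(10, unfold(fib_step, (0, 1))))
    print(take(5, unfold(nat_step, 0)))
```

The point of the framing is that the consumer only ever observes the stream through the step function, so two different state representations that produce the same observations count as the same stream.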


Did you consider that your view might be skewed because you work on a CRUD app?


That is very sad to hear.


But you have to know what book you are looking for.


The previews are still there though, they just don't rank.


Right, that's what I'm saying. For whatever reason it seems publishers decided they don't want their preview-only books as part of the full-text search across all books. If they decide that, Google has to comply.

This isn't like web search where web pages are publicly available and so Google can return search results across whatever it wants. For books, it relies on publisher cooperation to both supply book contents for indexing under license and give permissions for preview. If publishers say to turn off search, Google turns off search.


But why would people train on excerpts from Google Books when whole books can be downloaded on libgen and such?


Google Books is much bigger than libgen.


Copyright reasons?


Both are copyright violations.

