SQLite is a great embedded database, and thanks to its use in browsers and on mobile it is the most widely used database in the world, by orders of magnitude.
But it also comes with lots of limitations.
* there is no type safety, unless you run with the new strict mode, which comes with some significant drawbacks (e.g. it is limited to a handful of primitive types)
* a very narrow set of column types, and limited functionality in general
* the big one for me: limited migration support, requiring quite a lot of ceremony for common tasks (e.g. rewriting a whole table and swapping it out)
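For concreteness, the "rewrite and swap" ceremony looks roughly like this. This is a sketch using Python's stdlib sqlite3 module; all table and column names are invented for the example:

```python
import sqlite3

# Hypothetical example: remove a column the "old" way (before DROP COLUMN
# existed) by rebuilding the table inside a transaction and swapping it in.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, legacy_flag INTEGER);
    INSERT INTO users (name, legacy_flag) VALUES ('alice', 1), ('bob', 0);

    BEGIN;
    -- 1. create the new shape of the table
    CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT);
    -- 2. copy the data across
    INSERT INTO users_new (id, name) SELECT id, name FROM users;
    -- 3. swap it in place of the old table
    DROP TABLE users;
    ALTER TABLE users_new RENAME TO users;
    COMMIT;
""")
rows = con.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(rows)  # [(1, 'alice'), (2, 'bob')]
```

Indexes, triggers, and views on the old table would also need to be recreated, which is where most of the ceremony comes from.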
These approaches (like fly.io's) with read replication also (apparently?) seem to throw away read-after-write consistency. That might be fine for certain use cases, and even desirable for resilience, but it can impact application design quite a lot.
With SQLite you have to do a lot more in your own code because the database gives you fewer tools. That is usually fine, because most usage is "single writer, single or a few local readers". Moving that to a distributed setting with multiple deployed versions of code is not without difficulty.
This seems to be mitigated/solved here, though, by the ability to run worker code "next to the database".
I'm somewhat surprised they went this route. It probably makes sense given the constraints of Cloudflare's architecture and the complexity of running a more advanced globally distributed database.
On the upside: hopefully this usage in somewhat unusual domains can lead to funding for more upstream SQLite features.
> * the big one for me: very limited migration support, requiring quite a lot of ceremony for common tasks (eg rewriting a whole table and swapping it out)
I don't know where this idea of having to swap out a whole table in SQLite came from, but it simply isn't true. Over the last 13 years I have upgraded production HashBackup databases at customer sites a total of 35 times without rewriting and swapping out tables, using the ALTER statement just like in other databases.
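To illustrate the kind of in-place ALTER migrations I mean, here is a sketch using Python's stdlib sqlite3 module; the table is invented, and the version requirements are noted in the comments:

```python
import sqlite3

# Typical in-place schema migrations using ALTER, with no table rewrite.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE backups (id INTEGER PRIMARY KEY, path TEXT)")
con.execute("INSERT INTO backups (path) VALUES ('/data/a')")

con.execute("ALTER TABLE backups ADD COLUMN size INTEGER")         # long supported
con.execute("ALTER TABLE backups RENAME COLUMN path TO filename")  # SQLite >= 3.25
# ALTER TABLE backups DROP COLUMN size;                            -- SQLite >= 3.35

cols = [row[1] for row in con.execute("PRAGMA table_info(backups)")]
print(cols)  # ['id', 'filename', 'size']
```

Existing rows don't need to be touched for ADD COLUMN, which is why these migrations are cheap even on large tables.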
For the most recent upgrade, I upgraded to strict tables, which I could also have done without a rebuild/swap. I chose to do a rebuild/swap this one time because I wanted to reorder some columns. Why? Because columns stored with default or null values don't have row space allocated if the column is at the end of the row.
For a long time SQLite did not have DROP COLUMN and RENAME COLUMN support, both of which are pretty essential.
I'm embarrassed to admit that I didn't realize RENAME COLUMN was actually added in 3.25, almost four years ago.
DROP COLUMN was only just added last year in 3.35.
I'm surprised a database schema lasted 9/12 years without ever renaming or dropping a column.
This changes things! But even now, ALTER TABLE is not transactional. So, especially with many concurrent readers, there can definitely be situations where you'd still want to rewrite.
I'm not sure what you mean by "not transactional". SQLite implements transaction support at the "page" level, and builds all other database operations on top of it, which means anything that touches the bytes of the database file is transaction-safe. You can verify this for yourself:
sqlite> CREATE TABLE foo(a,b,c);
sqlite> INSERT INTO foo VALUES (1,2,3);
sqlite> BEGIN;
sqlite> ALTER TABLE foo DROP COLUMN b;
sqlite> SELECT * FROM foo;
1|3
sqlite> ROLLBACK;
sqlite> SELECT * FROM foo;
1|2|3
It's of course still subject to SQLite's normal restrictions on locking, which means a long-running ALTER statement will block concurrent writers (and probably also concurrent readers if you're not running in WAL mode).
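A quick way to see the WAL-mode difference for yourself. This is a sketch with Python's stdlib sqlite3; in WAL mode a second connection keeps reading the last committed snapshot while a write transaction is open:

```python
import sqlite3, tempfile, os

# WAL mode lets readers proceed concurrently with a single writer; in the
# default rollback-journal mode a writer blocks readers as well.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # 'wal'

con.execute("CREATE TABLE t (x)")
con.execute("BEGIN IMMEDIATE")           # take the write lock
con.execute("INSERT INTO t VALUES (1)")  # not yet committed

reader = sqlite3.connect(path)
visible = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(visible)  # 0: the reader sees the pre-write snapshot, without blocking
con.commit()
```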
> I'm surprised a database schema lasted 9/12 years without ever renaming or dropping a column.
I did have a couple of columns that were no longer needed and would have dropped them, but instead I just set them to null and ignored them. Nulls only take 1 byte of space in a row. I dropped them when DROP COLUMN was added.
It would really help if SQLite3 had a `MERGE`, or, failing that, `FULL OUTER JOIN`. In fact, I want it to have `FULL OUTER JOIN` even if it gains a `MERGE`.
`FULL OUTER JOIN` is the secret to diff'ing table sources. `MERGE` is just a diff operation + insert/update/delete statements to make the target table more like the source one (or even completely like the source one).
`FULL OUTER JOIN` is essential to implementing `MERGE`. Granted, one could implement `MERGE` without implementing `FULL OUTER JOIN` as a public feature, but that seems silly.
Sadly, the SQLite3 dev team specifically says they will not implement `FULL OUTER JOIN`[0].
Implementing `MERGE`-like updates without `FULL OUTER JOIN` is possible (using two `LEFT OUTER JOIN`s), but it's an O(N log N) operation instead of O(N).
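For anyone curious, the two-`LEFT OUTER JOIN` workaround for diffing two tables looks something like this (a sketch using Python's stdlib sqlite3, with made-up tables; native RIGHT/FULL JOIN landed in SQLite 3.39, which makes this unnecessary on newer versions):

```python
import sqlite3

# Emulate FULL OUTER JOIN: a LEFT JOIN for all left rows, plus a UNION ALL
# of the right-only rows -- the standard pre-3.39 workaround for table diffs.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE src (id INTEGER PRIMARY KEY, v TEXT);
    CREATE TABLE dst (id INTEGER PRIMARY KEY, v TEXT);
    INSERT INTO src VALUES (1, 'a'), (2, 'b');  -- 2 exists only in src
    INSERT INTO dst VALUES (1, 'a'), (3, 'c');  -- 3 exists only in dst
""")
diff = con.execute("""
    SELECT s.id, d.id FROM src s LEFT JOIN dst d ON s.id = d.id
    UNION ALL
    SELECT s.id, d.id FROM dst d LEFT JOIN src s ON s.id = d.id
    WHERE s.id IS NULL
""").fetchall()
print(sorted(diff, key=lambda r: r[0] or r[1]))
# [(1, 1), (2, None), (None, 3)] -> matched / insert candidate / delete candidate
```

From that diff, a `MERGE`-like update is just issuing the corresponding INSERT/UPDATE/DELETE statements against the target.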
The lack of `FULL OUTER JOIN` is a serious flaw in SQLite3. IMO.
RIGHT and FULL JOIN are on the trunk branch of SQLite and will (very likely) appear in the next release. Please grab a copy of the latest pre-release snapshot of SQLite (https://sqlite.org/download.html) and try out the new RIGHT/FULL JOIN support. Report any problems on the forum, or directly to me at drh at sqlite dot org.
"D1 will create read-only clones of your data, close to where your users are, and constantly keep them up-to-date with changes."
Sounds like there will be no synchronous replication and instead there will be a background process to "constantly keep [read-only clones] up-to-date". This means that a stale read from an older read replica can occur even after a write transaction has successfully committed on the "primary" used for writes.
So, while the consistency is not "thrown away", it's no longer strong consistency? Anyway, Kyle from Jepsen will figure it out soon, I guess :)
Yeah, so you can always opt-in to strong consistency by transferring execution to the primary (see the "Embedded Compute" section of the blog). Then it's pretty much exactly the same as a DO.
Just clarifying - D1 without read replicas is strongly consistent. If you add read replicas, those can have replication lag and will not be strongly consistent.
Thanks for the clarification, that is what I would expect.
Does SQLite support some kind of monotonic transaction id that can be used as a cache coherency key? Say a client writes a new record to the database which returns `{"result": "ok", "transaction_id": 123}`, then to ensure that subsequent read requests are coherent they provide a header that checks that the read replica has transaction_id >= 123 and either waits for replication before serving or fails the request. (Perhaps a good use for the embedded worker?)
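As far as I know SQLite has no built-in client-visible transaction id, so you'd roll your own. A hedged sketch of the idea (Python stdlib sqlite3; the `txlog` table and function names are invented for illustration):

```python
import sqlite3

# Maintain a monotonically increasing "txid" in the SAME transaction as each
# write; a replica only serves a read once it has replicated up to that txid.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT);
    CREATE TABLE txlog (id INTEGER PRIMARY KEY AUTOINCREMENT);
""")

def write(con, k, v):
    with con:  # one transaction: the write and the txid bump commit together
        con.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (k, v))
        cur = con.execute("INSERT INTO txlog DEFAULT VALUES")
        return cur.lastrowid  # returned to the client as its coherency key

def replica_can_serve(replica_con, required_txid):
    # Serve the read only if this replica has caught up to required_txid;
    # otherwise wait for replication or fail the request.
    (latest,) = replica_con.execute("SELECT IFNULL(MAX(id), 0) FROM txlog").fetchone()
    return latest >= required_txid

txid = write(con, "user:1", "alice")
print(txid, replica_can_serve(con, txid))  # here the "replica" is the primary itself
```

The obvious downside is exactly the write amplification mentioned above: every write transaction also touches `txlog`, duplicating bookkeeping the WAL already does internally.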
Yes, you could do it manually, but it would be nice if the solution didn't require carefully managing update queries so the journal addition isn't missed, or increasing write amplification by manually updating a journal table, when that information probably already exists somewhere in the WAL implementation.
Interesting that D1 is built on top of Durable Objects. Does this mean that it would be practical for a single worker to access multiple D1 databases, so it could use, for example, a separate database for each tenant in a B2B SaaS application? Edit: And could each database be in a different primary region?
That is interesting. I wish CF would give us some more information as I've assumed that there must be a lack of strong consistency which would be a major drawback.
Edit: But that would mean that durable objects can't be replicated asynchronously? That would mean a big latency hit. Then what's the difference from a central DB in one datacenter?
I’m not familiar with Durable Objects. When D1 does replication to read replicas, if it’s not doing synchronous replication, then it’s not strongly consistent, is that correct?
I'd like to see some up-front D1 & R2 benchmarks (read/write/IOPS). I can't judge the value of invocation pricing until I can judge my use case's performance. Here's hoping it's better than NVMe RAID 10 under the hood of D1, as some big SQLite reads suffer under slow storage.
Why? Yes, SQLite doesn't have all the features Postgres has. Postgres doesn't have all the features SQLite has either. What's wrong with having different tools with different sets of tradeoffs? It's a different shape of Lego, and that's fine: some things call for a 1/3-height 2x2 and others call for a full-height 1x8.
I haven't tried DuckDB, but I have been googling about it. I think I saw a discussion where it was mentioned that DuckDB isn't a replacement for SQLite. It is an OLAP database [0], which makes its ingestion slower than SQLite's, I think. So it is meant for analytics, not as a full-fledged replacement for SQLite.