How is a “SQL database” different from a “relational database”? I’m in industry,...

mamcx · on Dec 12, 2023

Is the same as "C++" and "OOP".

An "SQL database" is an implementation of a "partial" and inconsistent version of the relational model.

So is not wrong, IMHO, to say "SQL databases" to be "Relational Databases" (nitpicking as "Relational-inspired").

Is not that the relational model as first envisioned is perfect. Is ok to implement "partially" and also "extend" the vision it provides.

What is flipping is that what `SQL` does is "inconsistent". That is the major sin.

fipar · on Dec 12, 2023

I'm in industry too, though I did go through formal studies at university and databases was one of the topics that interested me the most back then (late 90s). I dropped out without graduating so I'm definitely not in academia, but I do keep up with some of what's going on though my ACM membership, and it is within that context that I made my comment about the authors' choice of words in their paper.

With that said, some examples of differences between an SQL database and a relational one (which, to be honest, AFAIK, is something that doesn't exist in a production-ready available implementation):

- In a relational database everything is a relation (including the result of applying any operator). Within the Relational Model, this is called the Information Principle, and among other characteristics, the header and tuples in a relation are a set and therefore have no order. SQL is not set oriented, evidenced by the fact that columns and rows do have an order (to the point that most, if not all SQL products let you specify the place in a table's definition in which you want to insert a new column, something that makes no sense whatsoever for a relation).

- In line with the previous item, SQL allows duplicates, which relations don't have (because sets don't have duplicates).

- According to some (notably, Codd disagreed with this), NULLs and three-valued logic are not part of the relational model, while SQL obviously supports NULLs.

- In a relational database, values are stored in relation variables, and these change by being replaced completely with the relational assignment operator. Say you have a relation users, and you want to remove fipar from it, the relational way to do that would be to say users := users minus (users where username = fipar) (pardon the crude pseudocode, hopefully the intent is clear). This means there's no way for the update to be done partially. SQL databases used transactionally comply with this, but some let you relax ACID properties for performance, and when that is done, the universe of possible results for operations includes outcomes that would not happen with relational assignment.

The list is much bigger, and there are lots of edge cases. For a proper treatment from someone who knows what they're talking about I'd recommend this book for a thorough answer to your question: https://www.oreilly.com/library/view/sql-and-relational/9781...

Not a difference, but since this is a common misconception, the "relational" in a relational database is not about relationships between tables. If we simplify by saying that a relation in a relational database is analogue to a table in an SQL database, you can have a relational database with a single relation (i.e., a single table). The relation is between the header and the tuples (rows in SQL). The idea is that the headers form a statement about the world modeled by the database, and every tuple is a combination of values for which the statement is true. In light of this, it should be obvious why duplicates make no sense in a relation. Saying something twice doesn't make it more true!

Another misconception is that relational databases don't scale. That makes no sense, because the relational model has nothing to say about the implementation. Saying the model doesn't scale because you can't use Foreign Keys after certain data size and throughput threshold are crossed in MySQL (say) makes as much sense as saying that arithmetic doesn't scale because you hit an overflow while using a specific model of calculator.

naasking · on Dec 12, 2023

> In light of this, it should be obvious why duplicates make no sense in a relation. Saying something twice doesn't make it more true!

Saying something twice doesn't make it more true, but that could also entail that duplicates should be benign. After all, "true AND true" is still true. So I don't think the conclusion that duplicates make no sense follows from this fact, it rather entails that duplicates should have no effect (idempotency).

In fact, a higher performance implementation is possible if we permit duplicates at some levels of the system because we don't need to check for duplicates. If duplicates have to be removed, that can arguably be done at the final materialization stage, or some other stage where it makes the most sense and overall work is minimized.

mamcx · on Dec 12, 2023

Yeah, I worked on https://tablam.org and https://spacetimedb.com.

It becomes pretty clear that `order` is a significant property to make useful (and performant!) programs. "Duplicates" is also required to make usefull programs.

One nonobvious reason for this: You wanna report that a `customer` has a duplicated key `1`. If you CAN'T model `[(customer.id = 1), (customer.id = 1)]` then you can't report errors! And `erroneous` data is VITAL to make useful programs because then the only possibility is "perfect" data, and that is not possible!

Another reason is that we want to `count` duplicates, to see `duplicates`, and other NON-obvious at first: "What is a duplicate?". Get fun with floats, Unicode, combining case and non-case sensitive input... and is obvious that for useful programs IS REQUIRED to support bags in an extended version of the relational model.

And yet...

IS very important to remember about `set semantics` and try to adhere to it when makes sense. Your query planner will like it. You "valid" constraints like it. And `unique index` like it. And so on...

proamdev123 · on Dec 13, 2023

Thanks for the detailed response. Very helpful.