Why not use SQL to query an API instead of GraphQL?

Question

nalzok posted an interesting question over a year ago: "Why not use SQL instead of GraphQL?".

Respondents seem to suggest three main problems:

One should not directly expose one's database to external users.
SQL is too powerful to allow external users to have direct access.
This exposes the internal structure of one's database.

I'd like to piggy-back on this question by asking why not use SQL as the query language for an API instead of GraphQL.

Here is my high-level understanding of direct SQL, GraphQL, and (theoretical) SQL API functioning:

Direct SQL: External User --> SQL Query --> Database
GraphQL: External User --> GraphQL --> Database
(Theoretical) SQL API: External User --> TheoreticalSQLAPI --> Database

TheoreticalSQLAPI would seem to resolve all three of the major problems listed above:

The database is not directly exposed.
Can be restricted to a subset of SQL.
Need not be a direct mapping of the internal database structure, it could be abstracted.

The one advantage I've seen mentioned for GraphQL is that it is not restricted to querying databases but can query any type of data - but this would seem to be a fairly minor problem for SQL since it has been implemented as a query mechanism for a wide variety of data formats.

Is the answer simply that no one has created a popular implementation? Are there other advantages GraphQL offers over SQL?

I'm probably only asking this question b/c I've been writing SQL for ~20 years and GraphQL seems like yet another technology to master.

"and GraphQL seems like yet another technology to master." - give it a few years yet and nobody will even remember it. — Steve, CommentedAug 2, 2020 at 18:08
And how are you going to "restrict to a subset of SQL"? Write your own parser/compiler for it? That sounds an awful lot like reinventing GraphQL. — Telastyn, CommentedAug 2, 2020 at 19:30
@Steve - Will be interesting to see, but atm I don't see anything else really challenging, do you? — davemackey, CommentedAug 4, 2020 at 16:36
@Telastyn and Ewan - To restrict to a subset there would need to be an implementation for this purpose and, yes, this would be similar to GraphQL - the big difference is that one could write in SQL instead of GraphQL. It seems less-than-ideal to me to need to write one's queries in SQL and then one's queries in GraphQL...though I understand some will be using non-sql data sources. — davemackey, CommentedAug 4, 2020 at 16:38

Jörg W Mittag · Accepted Answer · 2020-08-02 17:10:46Z

SQL is a relational query language. Most Web Service APIs do not expose a relational data model, they expose a graph data model. Hence, you need a query language for graphs.

Another, more theoretical reason is that, as of SQL 2003, SQL is Turing-complete which means I can write a query that never terminates. This opens the service up to Denial-of-Service attacks. And it is impossible to analyze a query before running to determine whether it will run forever. The only way to handle this is with a timeout, but no matter how high you set the timeout, you will always terminate queries that are actually not infinite.

Regarding DOS attacks, wouldn't a subset of SQL solve this problem? I mean, if you have only a SELECT statement and a WHERE with some basic operators, and you can only query specific views, such subset, while still useful to interact with the API, wouldn't be any less secure compared to GraphQL, would it? — Arseni Mourzenko, CommentedAug 3, 2020 at 9:29
While interesting, I don't think Turing-completeness is a big practical problem. It's not a common problem in day-to-day programming, where an infinite loop would be trivial to write. In SQL it's probably be a challenge even if you were trying. Plus I'll probably need a timeout anyway, because I don't really care if it's infinite or just 10 hours, both are way too long. — Mark, CommentedAug 3, 2020 at 15:57

Arseni Mourzenko · Accepted Answer · 2020-08-03 10:41:44Z

I've been writing SQL for ~20 years and GraphQL seems like yet another technology to master.

The exact same thing would happen with a subset of SQL. Keywords such as SELECT and WHERE would look familiar, but it would be a different technology, with its own quirks.

SQL features

The difficulty, actually, is to create the actual subset of SQL. The starting point, I suppose, is to have something very basic. Otherwise, you'll open your API to a series of attacks (Jörg W Mittag mentioned DOS, but I suppose there would be other risks as well). But then, different usages would require adding more and more features, and it wouldn't be obvious to know what should be included, and what shouldn't.

Moreover, developers familiar with SQL would also be expecting their favorite features to be included. And every such feature adds complexity. Say I want to SELECT ... WHERE ... BETWEEN .... Can I do it? If not, that would be disappointing. If yes, a statement as basic as BETWEEN could keep you busy for a long time, given that you don't know the underlying data source (and don't know how the data is structured). From a different perspective, imagine developing a bridge between the SQL-style API and, say, MongoDB. How much code would it take?

Tooling

SQL is not just a language, it's also a series of tools that you learnt to use for the past twenty years. Sure you have your favorite profiler, and you know how to read execution plans. But with an API, you don't have any of that. You don't even have the ability to launch your favorite graphical tool to see what views the API provides, which columns are there, which types are being used, what are the constraints, etc.

I suppose you have used products such as Hazelcast, which, for marketing purposes, pretend that they use SQL as the query language. They are great examples of how bad could it be to reuse a language without actually providing all the features of the language and the tools which come with relational databases.

Conclusion

So, essentially, you end up with something which only looks like SQL, but which isn't, and which brings no benefits and no familiarity of an actual SQL database. And therefore you can't leverage your twenty years experience to work with it. On the contrary, you are constantly asking yourself, why this damn API is not behaving like your favorite relational database, and why you can't use your favorite tool or some SQL statement which would make your life so much easier.

If you have to learn the subset of SQL and its quirks, you can as well learn any other language/technology.

At the same time, junior programmers who never wrote a single line of SQL would look surprised at the fact that they need to use something which looks like SQL to use an API. Actually, lots and lots of programmers don't find SQL particularly fashionable, and would rather avoid using anything which is associated with SQL.

You raise a good point regarding NoSQL data sources, I don't have an answer to that... If I was creating a standard, open API that uses SQL I'd probably have a config file that defined common defaults for allowed sql queries and then most other commands could be opt-in to enable. I would think also that while one might not emulate one's exact db it might be advantageous if the internal db looked somewhat similar to the sql api, both for the sake of internal devs working on both and in troubleshooting both internal/external by reducing complexity in mappings. — davemackey, CommentedAug 4, 2020 at 16:44

JacquesB · Accepted Answer · 2020-08-03 17:24:50Z

You would have to parse the SQL into some safe intermediate format and then generate the actual SQL from that. Just sanitizing the passed SQL would not be safe enough IMHO. So it is not trivial to implement, and you will not get anything "for free" by using the same query language as the underlying database engine.

So if the purpose is just to avoid learning GraphQL, I assure you it will be easier to just learn GraphQL. It is not rocket science.

That said, there may be cases where providing a relational oriented query API could have advantages compared to a graph oriented query API. The relational model is generally more powerful and flexible. (It is no accident that relational databases have largely replaced graph databases.) But in APIs you don't necessarily want maximum power and flexibility because clients using the API in unexpected ways makes it harder to evolve.

One significant downside to SQL as API is that it always returns data in table shape. This is a desired property in databases (called relational closure) but for API's it will often result in inefficient requests. Every time you have a one-to-many join, you will get repeated data on the one side.

I'm definitely learning and using GraphQL...I'm more curious why things are going this way than suggesting we should build a SQL-based alternative. Regarding the table shape, I would think in such a theoretical API the API provider would create functions to handle common requests/shapes that might not work well with a table shape. I wouldn't see this as a huge departure seeing one can create views / procedures in a SQL db. — davemackey, CommentedAug 4, 2020 at 16:53
> It is no accident that relational databases have largely replaced graph databases --- what does "replaced" mean in this context? Have graph databases ever dominated over relational? — Zen-Ventzi-Marinov, CommentedMar 15, 2023 at 19:26
@ZenVentzi: Graph databases and hierarchical databases dominated before the relational model was invented. The relational model was designed to alleviate the problems with the hierarchical and graph databases. — JacquesB, CommentedMar 15, 2023 at 19:52

Stack Exchange Network

Why not use SQL to query an API instead of GraphQL?

3 Answers 3

SQL features

Tooling

Conclusion

Linked

Hot Network Questions

Why not use SQL to query an API instead of GraphQL?

3 Answers 3

SQL features

Tooling

Conclusion

Linked

Related

Hot Network Questions