Write Consistency
This is quite a common scenario. You have a shared resource, and a lot of applications tapping into that shared resource. The shared resource holds state, the applications read and change that state. So far nothing particularly fancy. What things would you worry about?
Off the top of my head:
- Access control – Exerting control over who can modify and see what.
- Concurrency control – Prevent race conditions from messing things up.
- Types – Make sure everyone has a unified interface into the state.
- Integrity checks – Don’t let one application ruin the day for other applications.
And one way to deal with all of this is by enforcing write consistency.
First, let’s define write. I’d like to talk about the application perspective, and what it wants to do. It might be updating a single record, or a bulk update. It might create one record and update another, and maybe these two are controlled by a single transaction. Either way, there’s some point at which the application says “these are the changes I’m making, please write them down.”
Write consistency is about reaching a consistent state at the end of the write. The simplest definition I'm going to offer is this: a write followed by a read returns the same state. This is of course the general case; we can make some allowances, for example, the case where you write less data than you read: default values, auto-generated keys, etc. But conceptually those are all the same: at the end of the write you can predict what will come out of a subsequent read.
So that's write consistency. What happens if I don't have write consistency? I might be writing ten new records, follow up with a read, and only retrieve five of them; it takes a second read to find the other five. I can say that over some period of time I reached a consistent state, but it didn't happen in the write itself.
Read Consistency
There's a part of me that wakes up in the morning and wishes that every application I develop will only ever have to deal with write consistency. Imagine if all your Web resources had cascading deletes: no more broken links! Imagine search engines updating their indexes as the content gets updated.
Then there’s the other part of me that’s looking forward to building new applications that exist only because of situations that have no write consistency. Again, like the Web.
A while back I got frustrated with the limited vocabulary I had to describe these kinds of scenarios. What you might call a lack of a 'framework for reasoning'. Whatever design decisions I made were very opportunistic, with no coherent set of practices I could take from one project to another. Worse, it involved a lot of hand waving and broad generalizations like, "When you're using Google ..." or "The RESTful way ...".
So I decided to create two buckets. I looked at the overall characteristics, and found one bucket that deals with write consistency, and another that deals with read consistency. Those are not opposites, nor mutually exclusive, and there's also a third bucket for everything else, but I'm not concerned with that. Here I'm only going to talk about things that fall in one of these two buckets.
So what is read consistency? Read consistency is being able to construct a consistent state by reading it. And I use it as an umbrella term to describe all that makes it possible: resolving references across resources, specific error codes, version numbers, etc. Even the way we would do updates in a read consistent environment: no locks, conflict detection, compensation, etc.
As an illustration, I'm going to use read consistency to perform a search and return a list of records, but only after I prune the search results to discard dead-ends (404s in HTTP parlance). I don't have a write consistency guarantee that deleting a record immediately removes it from the index, or that updating a record immediately reflects it in the index. But I can easily create a consistent state by only retrieving existing records. And if I want to be fancier, I can even match the record's version number against the index, or do subsequent filtering by running the search again on every record I find.
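To make that concrete, here is a minimal sketch of read-time pruning in Python. The search_index object and its query() method are stand-ins for whatever index you happen to query, not a real API; the only real call is requests.head(), used to cheaply check whether each record still exists.

```python
# Read-time pruning: keep only the results that still resolve to a live record.
import requests

def consistent_search(search_index, query):
    results = []
    for url in search_index.query(query):    # candidate URLs, possibly stale
        if requests.head(url).status_code == 200:
            results.append(url)               # drop dead-ends (404s) at read time
    return results
```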
You might be wondering why I would need a new term to describe something that's quite common, something we intuitively do every time we use a search engine. This all goes back to the framework for reasoning, or being able to explain what I mean when I talk about ___.
Let's look at this from another point of view. Would I use a search engine that returns links to non-existing records? My initial instinct says "no, why would I care to search for something that doesn't exist?". But I just talked about that use case in detail. Am I contradicting myself? I really don't care for a search engine that never returns useful results. But I do care for a search engine that returns relevant results, even if there's a time lag between updates to the collection of records and the search index. I'm interested in it, because it's merely a consistency problem that I can resolve at reading time.
Enough With The Search Engines
That’s my initial reaction whenever I read about the mythical search engine use case. I’m thinking two things: content and search. But in practice, I don’t have a big content problem to solve, most often I get to deal with structured data, and what works well for one doesn’t work as well for the other. And search is interesting, but hardly a top priority. In most applications it’s a second or third category feature.
So why am I talking about search? Because of all the examples I came up with, this seemed like the most neutral case, one that's devoid of any 'we do it this way!' complications that happen when you start talking about orders or financing or customers. Think of it as a Lorem Ipsum use case for data management.
It also raises two important points.
I'm personally sick of structured search forms. I don't care that the database has ten data fields you can query on, I want one search bar, just like a search engine. Doing smart search on structured data is a simple problem to solve; have a look at Lucene and how people are using it. So the first point is that we need to look at application design beyond the narrow view of entity-relationship diagrams, and imagine things that are possible beyond Visual Basic forms.
The second point is, I think most people intuitively understand how search works. You keep the data in one place, you keep the index in another, and the index is updated asynchronously to catch up with the data. You can do certain things to speed it up, like pinging the search engine, or mapping out your data. That’s the simplest, most intuitive example for read consistency. In fact, asynchronously updating indexes is a feature that read consistency databases offer, and we’ll get to talk about that too.
Shared Resources
Some database servers are designed for the sole purpose of storing data. But I think most of you are far more familiar, and more actively involved with using, database servers designed to solve a different type of problem. I’m talking about database servers designed to be used as shared resources.
In a typical client-server environment — the one in which the relational databases of today spent their formative years — we have a lot of different applications hitting the same resource, sharing data through it. Storage is not good enough, we also need to handle the coordination problem. Coordination in this sense is making sure that you don’t have one application ruining the day for everyone else. And so our first instinct is to centralize by moving as much business logic as possible into the database itself. We let the database enforce compliance on its clients.
What kind of business logic? We can require every order to contain a date field that must hold a datetime value. We can require every order to always reference an existing customer, so you can no longer delete the customer and keep the order hanging around, but we'll make it easy to delete a customer and all their orders. Those are all declarative business rules.
We also have imperative rules, which we implement using triggers and stored procedures. In some cases, we’re going to decide the rules are complicated enough that no application has privileges to update order records directly, but instead needs to use stored procedures. In other cases, we’ll just write the entire application inside the database, and use clients as dumb terminals. Performance is another coordination problem, and we can control that too using stored procedures and views.
Web of Services
In a loosely coupled environment, we're going to find a different pattern. Here we have a variety of applications accessing a variety of services. I'm using these two terms, rather than client and server, because it's easier to conceptualize how these roles play together. An application is something I have visibility into, the codebase I'm designing or working on, and services are black-boxes of functionality. Of course, what is an application to me may be a service to you, and vice versa.
Both applications and services need to store state, possibly using a database. But the important distinction here is that we’re pursuing a loosely coupled architecture, so we no longer share data directly through the database. Each service is independent, it exposes an interface that we’re going to use to retrieve and change state. We just downsized databases from the role of shared resources to mere storage engines.
Where am I going to put my business logic now? I favor the application. Since the database is no longer a point of coordination, I don't get as much benefit from moving my business logic into the database server. In fact, the last thing I want is the unpleasant reminder of the days we wrote COBOL programs and deployed them on mainframes. And that analogy is not accidental: today's database design traces all the way back to those days.
I'd much rather write validation logic inside my application where I can return errors like '555-XYZ is not a valid phone number' rather than 'SQL error 704: invalid column value'. It's easier to develop, maintain, package and reuse, not to mention the variety of i18n/l10n libraries I can use. And by the same token, I'd much rather write any complex update, query or business rule I need using a modern day programming language.
Scalability
So let's talk for a second about Google. I have nothing to say. I'm only bringing it up so we can get it out of the way, because it seems like a common and misguided knee-jerk reaction every time someone brings up scalability. It's a false dichotomy that your scaling needs are either large or non-existent. The problem I have is not being Google.
Me: Relational databases are hard to scale.
DBA: Phfft. What do you know!
Me: Well, my server is about to max out at 1TB.
DBA: Piece of cake.
DBA: Get a budget approved for a bigger server, once it ships, come back and we’ll schedule a migration.
DBA: The last project that needed a bigger server got it done in a couple of weeks.
Me: I already have one server, I don’t have budget to replace it for a bigger one.
DBA: Then get another just like it.
DBA: Here’s some material to explain the difference between read-only slaves and master-master, and how it will affect your code.
DBA: Once it ships, come back so we can provision rack space, and then we’ll schedule a day to install and synchronize the two.
Me: I see.
Me: Hey, I’m just wondering.
DBA: Yes?
Me: Say I used Amazon S3 to store my data, and just hit 1TB. What would I need to do then?
DBA: Hmm. Nothing. I guess?
The other side of scalability is the unconscious design decisions we make when we're conservative about what we store. Think of all the applications you didn't know you could develop before you saw a demo of AJAX. Think how you designed Web apps back then, and how you design them today. The same thing is going to happen with your database.
For me, this will enable new types of applications I wouldn't even imagine today. Right now, my design decisions are focused on limiting the data I store, and discarding it as quickly as possible. I'm being conservative with disk space and, unfortunately, also conservative with application features. That's about to change.
Smart Databases, Dumb Databases
Smart databases (I'm borrowing the term from smart clients) do more than just handle data. They combine data with business logic. Some is declarative, some is imperative, and it all comes from the need to solve the coordination problem of a shared resource.
Dumb databases, on the other hand, just store and retrieve data. They contain no business logic, zero, zilch, none. The only thing they can do is access and modify records efficiently.
A lot of applications developed today use smart database servers in this minimal form. They shift all the imperative logic into the application, and replicate the declarative logic in both places. That’s a usage pattern, but that’s not a dumb database. A dumb database will not contain declarative logic, it won’t know what to make of it.
If you take a database server and dumb it all the way down, you end up with a glorified file system, of which we have enough. The dumb databases I’m talking about have two other interesting characteristics that separate them from smart databases and file stores. They’re particularly good at dealing with read consistency. And they’re particularly good at delegating to the application in all manners relating to logic.
So let’s look at what a read consistency dumb database looks like.
Open Schemas
Database schemas serve two purposes. One is to enforce rules on the structure, values and semantics of the data they store, for the benefit of the database and as a unified interface for all applications accessing the shared resource. My application already deals with that much better than any CREATE TABLE can do, and if you're using an ORM or similar technology you have already captured all that information inside your application as well. I don't feel a particular need to replicate this logic in two places, nor any joy in migrating schema changes.
The second purpose is to increase the density of bald spots on my head. The original design traces back to the days when we stored years as double digits, and considered fixed-length CHAR fields a feature rather than a bug. Those days are gone. Although modern databases allow you more flexibility in the form of BLOBs, array fields and such, it’s clear that they really don’t like it that much and penalize you for doing so.
So the first feature a dumb database has is no schema definitions. That part is delegated to the application.
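As a sketch of what that delegation can look like, here the 'schema' is ordinary application code and a plain dictionary stands in for the schemaless store; the validate_order() helper and the field names are illustrations, not any particular database's API.

```python
from datetime import datetime, timezone

def validate_order(order: dict) -> dict:
    """Application-side rules: they live in code, not in CREATE TABLE."""
    assert order.get("items"), "an order needs at least one line item"
    assert all(item["quantity"] > 0 for item in order["items"])
    order.setdefault("created", datetime.now(timezone.utc).isoformat())
    return order

store = {}  # stand-in for the schemaless store
store["orders/123"] = validate_order(
    {"items": [{"link": "/products/456", "quantity": 5}]}
)
```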
Versions and Generations
Write consistency databases can scale out in one of two ways. One is replication, but you have to keep all replicas identical, so writes don't scale, and you can only have as much storage as fits on a single node. The other is partitioning, which is read consistency on top of a write consistency database.
A read consistency database scales with ease. Partitioning is something that happens, not something you have to work hard for, writes scale as well as reads, and if you run out of space, you just add another database. The price you pay for more data is the cost of space, think Amazon S3 if you need to visualize the economics of it. So if you can expand to fit all available space, what would you do?
I’m going to quote the wise GMail: “Don’t delete, archive!” You just store and store and store. Of course you don’t have to turn into a pack rat and store data that will never be used, and you do need to delete stuff, retention policies and all that. But since you can afford the space, your default mode of operation would be ’store at will, lazy on delete’.
So now you can start keeping versions around, the same way a Wiki retains all previous edits. Turns out being lazy with deletes solves the read consistency problems very well. You can use generational counters to retrieve a view of the database at a particular point in time. I call those generational counters, rather than versions, because they may span multiple records from different tables.
Separately, we’re going to use versions to help clients cache data and perform conditional updates. This is a common enough pattern that we expect the database to handle it for us, on every single table, offering Last-Modified and ETag on the cheap.
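Here is a sketch of what cheap ETags buy the client, using the Python requests library for a conditional GET; it assumes the database exposes records over HTTP and honors If-None-Match, and the in-memory cache is only an illustration.

```python
import requests

cache = {}  # url -> (etag, body)

def cached_get(url):
    """Fetch a record, reusing the cached copy when the server answers 304."""
    headers = {"If-None-Match": cache[url][0]} if url in cache else {}
    response = requests.get(url, headers=headers)
    if response.status_code == 304:        # not modified, cached copy is still good
        return cache[url][1]
    cache[url] = (response.headers.get("ETag"), response.text)
    return response.text
```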
Update Feeds
Like I said before, I don’t have much love for triggers and stored procedures, I’d much rather use a proper programming language. So how do we get those to happen in the application?
Going back to the big picture, we have applications hitting multiple services at the same time, and I don't fall for the fallacies of distributed computing, so I'm going to care about latency by doing as much work as possible outside the request-response cycle. Request comes in, I do the minimum amount of work on the inputs, store the minimum amount of data, and quickly send back a 200 (OK), 201 (Created) or whatever other status fits the bill.
First instinct would be to suggest a message queue. Not a bad idea, but let’s get something straight: it’s a design pattern. Design patterns are the way we work around limitations of the original design, by introducing boilerplate complexity. So let’s instead tackle it at the database level by introducing update feeds.
We get two kinds of update feeds. Push feeds are callbacks to the application that inform it of individual writes (create, update, or delete). This is done asynchronously, so it does not extend the write or require any locking of resources. Still, it's blocking, so we're going to reserve it for simple and priority updates, and one specific case that we'll cover shortly. It is done at least once for each write, so we have a guarantee that a push feed will always see the most recent updates to the database.
Pull feeds allow the application to grab all the recent updates and process them at once. Since the database keeps track of Last-Modified, it’s a simple matter to catch up on all the updates since the last pull. Like push feeds, we can determine all the recent create/update/delete writes on a table. The difference is that we are pulling, so we can perform longer units of work, or decide to pull at different intervals that depend on the workload.
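A rough sketch of a pull feed consumer: fetch_events and handle are placeholders for whatever the client library and your application provide, and the event shape (a type, a record, an 'at' marker) is an assumption.

```python
import time

def consume_pull_feed(fetch_events, handle, poll_interval=30):
    """Periodically catch up on every write since the last event we processed."""
    last_seen = None
    while True:
        for event in fetch_events(since=last_seen):   # create/update/delete events
            handle(event)                             # longer units of work are fine here
            last_seen = event["at"]                   # e.g. the event's Last-Modified marker
        time.sleep(poll_interval)
```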
Push and pull feeds are great for a variety of uses. One we talked about is minimizing the response time, performing the bulk of the work asynchronously, an architectural pattern supported by the database. Another is chaining updates together, which I’m exploring in the context of pushing updates to multiple services and handling complex transactions with compensation. I’ll talk about this at length in a future post.
We can also use pull feeds to collect updates from existing tables and use those to populate computed tables. Computed tables are one way we can trade space for time, using more space to store duplicate data, but improving query time. There are enough uses for this in typical applications that do not cross over into the territory of OLAP (e.g. ranking, recommendations, social graphs). If the words map-reduce cross your mind, then you obviously know of one particular implementation for handling this type of workload.
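For example, a pull-feed job might maintain a 'sales per product' table like the sketch below, rebuilding it whenever it catches up on new order writes; the order shape is an assumption borrowed from the example later in this post.

```python
from collections import Counter

def recompute_sales(orders):
    """Trade space for time: a precomputed table the application can query cheaply."""
    sales = Counter()
    for order in orders:
        for item in order["items"]:
            sales[item["link"]] += item["quantity"]
    return sales   # persist this as a computed table, rebuild (or update) from the feed
```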
Keep in mind that, unlike queries returning records, update feeds return events. While you can use queries to retrieve records based on their timestamp, you need update feeds to determine when records are deleted. Besides the use cases we described above, you can also use update feeds for replication and for indexing records outside the database. Remember the mythical search engine scenario? Update feeds are an easy way to feed structured data into a search engine.
Update feeds give us two important characteristics: a database that delegates all the logic to the application, and that acts as the primary place for storing the application state. It’s also a critical feature for handling indexes.
The Relational Model
I use the relational model principles to design, analyze and make predictions about the database. I use it to decide when to store data in 3rd normal form, and when to denormalize liberally. Yes, I denormalize data! Which is why I’m still thinking relational model, even though I’m talking about something other than a relational database. But if you are looking for a database that enforces 3rd normal form, and optimizes for tabular data, then you’ll be disappointed.
So now example time:
```
GET /orders/123

<order>
  <item>
    <link>/products/456</link>
    <quantity>5</quantity>
  </item>
  <item>
    <link>/products/789</link>
    <quantity>1</quantity>
  </item>
  <total>15.99</total>
  <created-by>assaf.labnotes.org</created-by>
</order>

GET /products/456

<product>
  <text>LOLcat picture frame</text>
  <price>4.99</price>
</product>
```
This is an elephant. Some people look at it and see XML data, some people look and see service calls, some people look and see relations. I think they’re all there.
I brought up this example to illustrate several points. First of which, is that everything you know about data and relations still holds. In this example I wanted to illustrate how I can join data pulled over HTTP from a Web service. The other two points deal with the way we’re going to model our entities. Differently.
In modeling the entities, I realized that orders and products are distinct, with weak ties between them. They may in fact be offered by different services, or stored in different databases, so they might as well be in different tables. It goes without saying that I won't even dream of duplicating product details inside the order, or listing orders inside a product record.
But in modeling the order entity, I made two different decisions. The first is to calculate the order total and store the computed result in the order itself. Was that a good idea or short sighted on my part? I won't argue either way, but if you do have an opinion on the matter, preferably a strong one, then you're using your relational model instincts to reason about read consistency databases. I just wanted to illustrate that all that we know is still useful.
The other decision I made was to store line items inside the order itself. I realized I have no compelling use case to keep those separate. When I add or remove a line item, I’m changing the order, I expect the order to have a new version and updated timestamp. When I delete the order, I assume all the line items will go away. And when I query the order, I intend to find all the line items there, without resorting to Cartesian join and result-set gymnastics.
So I designed the order entity from that perspective, and simplified the application logic. I also created three issues that we’ll talk about next: indexes, updates and conflicts.
Asynchronous Indexing
As the number of orders grows, I'm going to face a problem. How can I find all the orders related to a product without scanning through the entire orders table?
If I used a relational database, I would break the line items and orders into separate tables. One reason is to allow fine-grained updates to the order. Another is the constraint imposed on indexes: an index is derived by reducing a table row into a set of fields, ordering these fields, and sorting over the collection.
Since I'm designing for a read consistency database that scales extremely well for writes, I'm not too concerned about the granular updates. I'd much rather optimize for reads (more frequent) by preserving data locality. As for indexes, well, that's a separate issue.
Remember that my dumb database can hold no declarative logic, it can’t by itself decide what goes in the index. All the dumb database is able to do is store the indexes and use them efficiently to retrieve records, but it needs the application to decide what data goes in the index. This is a special case for update feeds: the database delegates write events to the application, and the application resolves each event into a set of index records.
This sounds a little bit complicated, so let's work that into our example. I'm going to define an index by giving it a name in the database, and a callback function that, given an order, will return a list of product URLs. I can then access that index with a product URL to find all the orders that contain that product. The database does all the heavy lifting, but the index structure is decided by the application.
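Here is that shape as a Python sketch. The products_in_order() callback is the part the application owns; build_index() is an in-memory stand-in for the work the database would do for us off the update feed, not a real client API.

```python
from collections import defaultdict

def products_in_order(order):
    """The application-supplied callback: map an order to its index keys."""
    return [item["link"] for item in order.get("items", [])]

def build_index(records, index_fn):
    """Stand-in for the database's side of the bargain, shown in memory."""
    index = defaultdict(list)
    for record_id, record in records.items():
        for key in index_fn(record):
            index[key].append(record_id)
    return index

orders = {"orders/123": {"items": [{"link": "/products/456", "quantity": 5}]}}
by_product = build_index(orders, products_in_order)
print(by_product["/products/456"])   # ['orders/123']
```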
With that index, I get efficient queries on my orders, without having to create and maintain a separate line item table for the sole purpose of indexing. I only need to design indexes that support my queries.
I happen to think asynchronous indexes are a powerful feature that simplifies entity management. Here's another example. Given the same list of orders, I'm going to define a function that only selects completed orders, takes the difference between completed and created dates, and reduces that into a 'days to complete' index. I can now find all orders that took five days to complete, without having to store computed values in the table, or see NULLs in my index.
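The same shape works here; this sketch assumes ISO-formatted created and completed fields and reuses the build_index() stand-in from the previous sketch.

```python
from datetime import date

def days_to_complete(order):
    """Only completed orders produce an entry, so the index stays sparse."""
    if "completed" not in order:
        return []
    created = date.fromisoformat(order["created"])
    completed = date.fromisoformat(order["completed"])
    return [(completed - created).days]

# Feed days_to_complete to build_index() and query for key 5 to get back every
# order that took five days to complete, with no computed column and no NULLs.
```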
Asynchronous indexes have three interesting properties. The decision on what and how to index is done by the application, which also means they are sparse indexes. And they are updated asynchronously, much like a search index, which reduces contention on writes.
Transactions
Two things you need to know about read consistency databases: 1) there are no locks, and 2) there are no locks.
You may have already decided that it's impossible to build a database without some sort of locking mechanism. Perhaps, although some voices from the functional programming world may argue otherwise. Either way, what I mean by no locks is that you can't use one action to block another, and you certainly can't deadlock. This makes the database, and the application that uses it, a parallel problem. And parallel problems yield nicely to multi-core CPUs and banks of inter-connected nodes.
There’s a mythical example that explains how relational databases work. It involves an atomic transaction that moves data from one account (debit balance) to another (credit balance), in such a way that both happen together without intermediate results. In the past I used that as an example to illustrate the role of atomic transactions in storage. Nowadays I use this as an example to illustrate a bit of social engineering. Don’t laugh.
The point of this example is to confuse database transactions with financial transactions, using something that affects us directly: our checking account. Banks don’t work that way, in real life financial transactions are much different. In fact, the transaction in our checking account is a record of the money changing hands. The sum of these records is the bank account. And we can use these records to calculate a snapshot and store it as the daily balance, or present the current balance from daily balance and pending transactions combined. A classical application of read consistency.
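As a toy illustration of that view, the balance is derived at read time from a stored snapshot plus the records accumulated since; the numbers and field names are made up.

```python
def current_balance(daily_snapshot_cents, pending_transactions):
    """The account is the sum of its records, not a mutable cell behind a lock."""
    return daily_snapshot_cents + sum(t["cents"] for t in pending_transactions)

print(current_balance(10_000, [{"cents": -1_999}, {"cents": 4_500}]))   # 12501
```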
For a large number of applications, what you need is the ability to make progress, dodge race conditions, and end up with a consistent view.
So the first thing we need to understand is how we develop applications for real life scenarios. In real life scenarios, we’re going to deal with incremental state changes (e.g. credit card charges take time to clear), resolve conflicts as they happen (e.g. order ready to ship, when we lost the last item we had in stock), and coordinate outcomes at higher levels (two phase commit for airplane tickets and hotel rooms? show me). Those all fit well within our read consistency model.
While we like the database to be dumb, we don't tolerate stupidity: we can't stand for lost updates. That's a critical requirement we can address in a variety of ways. Idempotent writes, so clients can retry those until successful (see below). Reliable storage, through traditional mechanisms like RAID, logs and geographical fail-over. At-least-once semantics; that one is important since we do a lot of work asynchronously, so it's baked into the update feeds (see above).
Lost updates is also a term that describes one update overwriting another. We don’t have locks, but we’re going to use conditional updates instead (relatives of optimistic locks). We already identified this as a feature offered by the database itself in the form of versions and cheap ETags. Since we tend to handle coarse grain entities, the ones that represent our units of data, we can often perform the equivalent of a transaction in a single update. We can also use update feeds to chain two updates together, so a change in one record will be reflected in another.
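Here is a sketch of such a conditional update over HTTP, using the Python requests library; it assumes the service exposes orders at a URL and honors ETag/If-Match as described above.

```python
import requests

def add_line_item(order_url, item, max_retries=3):
    """Read, modify, and write back only if nobody changed the order in between."""
    for _ in range(max_retries):
        current = requests.get(order_url)
        order = current.json()
        order["items"].append(item)
        response = requests.put(order_url, json=order,
                                headers={"If-Match": current.headers["ETag"]})
        if response.status_code != 412:    # 412 Precondition Failed: we lost the race
            return response
    raise RuntimeError("too much contention, giving up")
```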
There’s still an issue of pushing updates to multiple independent entities. This, it turns out, is a much larger architectural problem to solve. How do you push updates reliably into your ERP and CRM, when those are independent services? So we start thinking in terms of versions, ordering of operations, chained updates and compensation. At this point I’m going to wave my hands a little. It’s a really interesting topic, but much larger in scope for this post, so I’ll defer it to some other time.
There are obviously applications for which this is not enough, and applications for which you would prefer the convenience of ACID transactions. But for a large class of applications that are sensitive about the correctness of their data, and must handle it reliably, a read consistency database would work just fine.
Identities
Let’s start with the basic stuff. Each record has a unique identifier created by the database. These identifiers are opaque, you can’t use them to infer order or locality, but you can certainly use them for equality. Nothing new so far, but that’s not the only type of identity we have to contend with.
How can we create a record exactly once? Remember that we don’t have transactions, but we do have conditional updates, and conditional updates are slightly different from optimistic locks. We use conditional updates to update a record only when it has a certain value we know, typically from a previous read, but sometimes any value will do.
So we're going to ask the database to allocate a new record identifier for us, a fairly cheap request. Then we're going to make an update on the condition that the record doesn't already exist. If you're familiar with HTTP, think of a GET to a known resource (e.g. /orders/new), extracting the URL out of the Location header, and using it to make a PUT with If-None-Match set to '*'. In short, we're only going to make a successful update if no one else beat us to the punch, including any previous attempt we made before. Create once.
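Spelled out as a sketch with the requests library, following the /orders/new and Location flow from the paragraph above (a real service may allocate identifiers differently):

```python
import requests

def create_order(base_url, order):
    """Allocate a URL, then PUT on the condition that nothing exists there yet."""
    allocated = requests.get(base_url + "/orders/new")        # cheap identifier allocation
    order_url = allocated.headers["Location"]
    response = requests.put(order_url, json=order,
                            headers={"If-None-Match": "*"})   # fail if already created
    if response.status_code == 412:
        raise RuntimeError("already created, possibly by an earlier attempt of ours")
    return order_url
```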
So what does this have to do with identity?
A common scenario is one where the entity has an identity, different from the unique identifier created by the database. Imagine for example that we’re creating user accounts, we decide on the username as identity, therefore no two users can have the same username. We better be able to do that.
Let's revisit asynchronous indexes. They're updated asynchronously, duh, so we're opening ourselves up to a race condition in which two writers create two records, and the index updates to point at both. We can decide this is a read consistency issue, and simply formulate our query to ignore all but the first record. Or, given that our indexes are sparse, decide to create an index entry once, and the first record wins.
This works nicely for asynchronous updates, since race conditions are rare and we don’t care much for a few orphaned records. But what if we’re doing something synchronously: the user is waiting for us to confirm the new account, or ask them to pick a different user name? We can block. Create the record, wait for it to show up in the index, decide if it’s the same record as the one created, and return the appropriate response. That’s one option.
Let’s look at another use for conditional updates. We’re going to first allocate a new record identifier, then update the index to point there on the condition that no index entry exists, and then update the record on the condition that no record exists. Why do we need both conditions? On the chance that the index entry was previously created, and then abandoned before creating the record. Might have been us in a different thread. So if we do find an index entry, we’ll use that to make a conditional update.
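The same sequence as a sketch against a hypothetical client library; allocate_id(), claim_index_entry() and put() are invented names for illustration, not any real database's API.

```python
def create_user(db, username, profile):
    """Claim the identity in the index first, then create the record, all without locks."""
    record_id = db.allocate_id()                      # cheap, no record exists yet
    winner = db.claim_index_entry("users_by_name", key=username,
                                  value=record_id, if_absent=True)
    if winner != record_id:
        # The name was claimed earlier (maybe by an abandoned attempt of ours):
        # fall through to a conditional update on that record instead.
        record_id = winner
    db.put(record_id, dict(profile, username=username), if_absent=True)
    return record_id
```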
All this, without locks. And obviously, it’s abstracted by the client library, so we don’t need to run the entire sequence, just ask to create a record with identity, or create-new/update-existing. But it’s helpful to know how this is handled by the database.
"Wait!", you say, "so you can update indexes directly, why didn't you say that before?" Because the more you work with traditional databases, the more you're conditioned to think of storage as a synchronous problem. And that's wrong. There's a world of possibilities out there that comes from thinking about and solving problems asynchronously. Read consistency databases open that door, but it's also important to understand how to use them properly. So I wanted to focus more on this new frame of mind than on reaffirming the habits of the past.
Conflict Resolution
Data partitioning means placing different subsets of the data in different places. Data partitioning is free in the sense that you don’t have to work to make it happen inside the application. We’re going to leave it up to the database (or a proxy, think about that) to decide how to distribute individual records, how to locate them and combine results, and how to shuffle data around when adding new nodes. All the features we covered so far make this transparent to the application.
However, data partitioning is not enough for all workloads, sometimes we need replication. Replication brings with it a different problem, that of network partitioning. When the network partitions, it’s possible to make an update in one replica, but read a stale value from another. It is also possible to perform independent updates on replicas of the same records.
Read consistency helps us deal with out-of-sync replicas at read time, but we still need to solve independent writes and reconcile those. Again, we're going to use update feeds to delegate conflict resolution to the application; it's just another type of update event.
Time To Junk The RDBMS?
That depends on how you're using it, and I'm the last to suggest you junk your RDBMS just because a new shiny object comes around and becomes the sound bite of the day. If it ain't broke, build something new.
But if anything I wrote sounds vaguely familiar because you somehow managed to dumb your RDBMS into storing structured data in BLOBs, added versions and timestamps on all records, grappled with minimizing transactions and locks, denormalized data like there's no tomorrow, or relied too much on a message queue, then it's time to rethink. Are you using a hammer to polish your china? (Tip: not a good idea, invest in a soft cloth)
The thing about relational databases: dumbing them down doesn't create a dumb database that you can scale easily, and doing read consistency on top of write consistency is two problems to solve. It's still a shared resource programmed in COBOL pretending to be a mainframe, from the days when structured data fit nicely in tabular form. Which, granted, is perfectly fine for a lot of applications. And insufficient for others.