CP For Distributed Databases
Some distributed databases are characterized as being CP. BigTable is an example, and in this respect it stands in contrast with Dynamo, which is AP. If you think about it, however, things are a little odd. If you've ever used BigTable or its open-source clone HBase, you know that it doesn't stop accepting writes if one node goes down. And at the scale at which these tools are employed, nodes go down all the time. So is the characterization wrong? Or is there something missing from the description?
In truth, the CAP Theorem is defined in a very precise way. It gives us a lot of intuition about the real world, but it does not model it that accurately. It deals with a so-called read-write register: a distributed object which can be read or written. This is the entity which can be CP, AP or CA. However, even databases with relatively simple interfaces, like the NoSQL key-value stores, are much more complicated than a single register. For BigTable/HBase, a single row (key-value pair) can be likened to the register. Indeed, operations on a row happen transactionally, and the row is CP in terms of its behaviour.
Physically, the key-space is divided into ranges called tablets. Each tablet is handled by a tablet server, and is usually replicated three times. If a tablet server goes down or if there's a partition, the tablets it handles become unworkable, per the CP behaviour. More precisely, one cannot write to them successfully, since a write requires all replicas to receive a copy of the data. However, all the other tablets are still OK, and the system as a whole is still more or less in a functioning state.
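To make the per-tablet behaviour concrete, here is a minimal toy sketch in Python. It is not the BigTable or HBase API: the names `ToyTable`, `TabletUnavailable`, `split_keys` and `reachable` are made up for illustration, and the rule that a read needs at least one reachable replica is my own assumption. The only rule taken from the description above is that a write needs every replica of the owning tablet.

```python
import bisect


class TabletUnavailable(Exception):
    """Raised when the tablet that owns a key cannot serve the request."""


class ToyTable:
    """Toy model of a key-space split into tablets (hypothetical, not a real API)."""

    def __init__(self, split_keys, replicas=3):
        # Tablet i holds keys from split_keys[i-1] (inclusive) up to
        # split_keys[i] (exclusive); the last tablet holds everything above.
        self.split_keys = sorted(split_keys)
        tablet_count = len(self.split_keys) + 1
        self.rows = [{} for _ in range(tablet_count)]
        # reachable[t][r] is True while replica r of tablet t can be contacted.
        self.reachable = [[True] * replicas for _ in range(tablet_count)]

    def tablet_for(self, key):
        # Binary search over the range boundaries to find the owning tablet.
        return bisect.bisect_right(self.split_keys, key)

    def put(self, key, value):
        tablet = self.tablet_for(key)
        # Rule from the text: a write succeeds only if every replica gets a copy.
        if not all(self.reachable[tablet]):
            raise TabletUnavailable(f"tablet {tablet} cannot accept writes")
        self.rows[tablet][key] = value

    def get(self, key):
        tablet = self.tablet_for(key)
        # Assumption for this sketch: a read needs at least one reachable replica.
        if not any(self.reachable[tablet]):
            raise TabletUnavailable(f"tablet {tablet} cannot serve reads")
        return self.rows[tablet].get(key)


table = ToyTable(split_keys=["h", "p"])
table.put("alice", 1)          # lands in tablet 0
table.put("kate", 2)           # lands in tablet 1
table.reachable[1][0] = False  # one replica of tablet 1 becomes unreachable
table.put("zoe", 3)            # tablet 2 is unaffected and still accepts writes
try:
    table.put("kim", 4)        # tablet 1 refuses the write
except TabletUnavailable as error:
    print(error)
```

Only keys that fall into the affected range are refused. Writes to the other two tablets keep succeeding, which is the sense in which each row behaves as CP while the table as a whole keeps working.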
What this means in practice is that, for example, a small number of users of the system will see issues if their data was stored on the failed server. Similarly, some of the data might be unavailable to an offline analysis. This is not the end of the world, as a total shutdown would be, and as would happen with a single-node database.
A much better article in the same vein is Martin Kleppmann's "Please stop calling databases CP or AP".