Like Bigtable, Cassandra addresses a real problem: storing large amounts of structured data at very large scale, in the middle ground between key-value stores and relational databases. This is becoming increasingly important.
The main idea of Cassandra is to decentralize all of the processing, moving away from a typical master-slave architecture. Decision making, e.g. failure detection, is done in a decentralized manner through gossip. This removes reliance on a single point of failure or contention.
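To make the gossip idea concrete, here is a minimal sketch of heartbeat gossip with a failure detector. It is not Cassandra's implementation: the paper uses a Φ accrual failure detector, whereas this toy version assumes a fixed timeout measured in gossip rounds. The names `Node`, `gossip_round`, `timeout`, and `choose` are hypothetical, chosen only for illustration.

```python
import random


class Node:
    """Toy gossip participant (illustrative; not Cassandra's design)."""

    def __init__(self, name, members, timeout=5):
        self.name = name
        self.peers = [m for m in members if m != name]
        self.timeout = timeout  # assumption: fixed timeout, in rounds
        self.alive = True
        self.clock = 0  # local round counter
        # member -> (highest heartbeat seen, local clock when it last rose)
        self.table = {m: (0, 0) for m in members}

    def tick(self):
        """Advance the local clock and bump this node's own heartbeat."""
        self.clock += 1
        beat, _ = self.table[self.name]
        self.table[self.name] = (beat + 1, self.clock)

    def merge(self, remote_table):
        """Adopt any heartbeat that is newer than the one held locally."""
        for member, (beat, _) in remote_table.items():
            if beat > self.table[member][0]:
                self.table[member] = (beat, self.clock)

    def suspected(self):
        """Peers whose heartbeat has not advanced within the timeout."""
        return sorted(
            p for p in self.peers
            if self.clock - self.table[p][1] > self.timeout
        )


def gossip_round(nodes, choose=random.choice):
    """One round: each live node exchanges tables with one chosen peer."""
    for node in nodes.values():
        if node.alive:
            node.tick()
    for node in nodes.values():
        if not node.alive:
            continue
        target = nodes[choose(node.peers)]
        if target.alive:  # messages to a crashed node are simply lost
            target.merge(node.table)
            node.merge(target.table)
```

No node is special here: every node keeps its own view of who is alive, and a crashed node is eventually suspected by everyone simply because its heartbeat stops advancing in the tables being passed around. To try it, build a dict of `Node` objects keyed by name, call `gossip_round` a few times, set one node's `alive` to `False`, and check `suspected()` on the others after more than `timeout` further rounds.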
While removing the reliance on a master is nice, it makes me wonder: is it really necessary? We saw in Bigtable that a master-based system can be made very efficient by keeping the master's operations infrequent (e.g. in Bigtable most requests don't involve the master). Is the extra complexity of the decentralized system worth it? Cassandra's widespread adoption may suggest that the answer is yes, but I am not entirely convinced.
I don't particularly see this being highly influential in 10 years. While Cassandra is a well-designed system incorporating many good ideas, it didn't seem to have many original ideas of its own; it is a well put-together combination of existing ones.