The problem is somewhat real - while GFS / HDFS currently do a pretty good job of supporting large-scale compute applications, they have a single-point bottleneck (the master/namenode server) and, as the paper mentions, require that applications exploit data locality.
The solution's main idea is to distribute all files across many disks and to read and write from all of them in parallel for very high throughput, while also removing any central bottleneck: reads and writes are done without contacting any central server.
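To make that data path concrete, here is a minimal sketch (mine, not the paper's) of what a client-side striped write might look like, assuming the client can compute the responsible data server for each stripe locally; the stripe size, the `locate_node` function, and the `write_stripe` call are all hypothetical placeholders.

```python
import concurrent.futures

STRIPE_SIZE = 8 * 1024 * 1024  # hypothetical fixed stripe size (8 MB)

def write_blob(blob_guid, data, locate_node):
    """Split a blob into fixed-size stripes and push them to many data
    servers in parallel; no central server sits on the data path."""
    stripes = [data[i:i + STRIPE_SIZE] for i in range(0, len(data), STRIPE_SIZE)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        futures = []
        for idx, stripe in enumerate(stripes):
            node = locate_node(blob_guid, idx)  # computed client-side, no metadata RPC
            futures.append(pool.submit(node.write_stripe, blob_guid, idx, stripe))
        # Surface any per-stripe failure to the caller.
        for f in concurrent.futures.as_completed(futures):
            f.result()
```

The point of the sketch is just that aggregate throughput scales with the number of disks being written concurrently, since every stripe goes to a different server and nothing serializes through a master.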
This solution differs from previous work primarily in its assumption of full network bisection bandwidth. Previously this was a very unrealistic assumption, but the authors argue that newer network switches make it feasible, which allows data locality to be de-emphasized.
There are some hard tradeoffs here. For one, the GUID identifier for blobs lets a client find the node to contact without a central server, but it does not as easily support arbitrary path strings or a directory hierarchy. Another tradeoff is the need for more expensive and complex network switches; while the authors argue that full-bisection-bandwidth networks are now feasible, providing that bandwidth is still a non-negligible expense.
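A rough illustration of that first tradeoff, under the assumption that placement is a deterministic function of the GUID and stripe index over a locally cached server list (this is not the paper's exact placement scheme, and the hashing choice and server addresses are made up):

```python
import hashlib

class StripeLocator:
    """Maps (blob GUID, stripe index) to a data server using only a
    locally cached list of servers -- no metadata-server lookup."""

    def __init__(self, servers):
        self.servers = servers  # e.g. ["10.0.0.1:7000", "10.0.0.2:7000", ...]

    def locate(self, blob_guid, stripe_index):
        key = f"{blob_guid}:{stripe_index}".encode()
        digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        return self.servers[digest % len(self.servers)]

# The flip side: with only a hashed GUID there is no natural place to ask
# "which blobs live under /logs/2012/?" -- listing a directory would need a
# separate index, which a path-based namespace server gives you for free.
```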
I can see this paper being influential in 10 years. If full-bisection-bandwidth networks truly are becoming more feasible, they will be even more so in the future, and some of the Hadoop clusters now being spun up are reaching the limits of what HDFS can handle with its single namenode. There are some very interesting ideas here, and the overall design is relatively simple.