Wednesday, November 25, 2015

Review of "One Size Fits All"

Today, this paper seems to address a nonissue; throughout my career in computer science I've been of the mindset that "one size fits all" has no place. It's strange to think how ubiquitous the DBMS was in times past - amazing that all of the systems I think of today as distinct (OLAP, OLTP, analytics, warehousing, etc.) were once run through the same platform. I don't know whether this paper in particular was influential in causing the shift to today's mindset, but in any case, the shift certainly occurred.

One interesting question that arises from this: will there ever be a time when we have systems so efficient that we can return to a one-size-fits-all paradigm? It currently doesn't seem likely, but that would be the "holy grail" of sorts. The Hadoop ecosystem today does at least provide a unified platform that many such systems can run on, with a unified data layer (e.g. HDFS, Tachyon) and unified schedulers (Mesos, YARN).

Monday, November 16, 2015

Review of "Jellyfish: Networking Data Centers Randomly"

The main idea of Jellyfish is to connect servers/switches completely randomly, rather than conforming to a regular structure such as the traditional tree or fat-tree. The primary motivation was the ease of incrementally adding new servers, but the authors found that the random layout actually increases bandwidth capacity on the same hardware, because paths between servers are shorter on average.
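To get a feel for how simple the construction is, here is a rough sketch (my own illustration, not the paper's code) that wires a made-up number of switches into a random regular graph with networkx and checks the average path length, which is where the extra capacity comes from:

    # Sketch of a Jellyfish-style topology: a random regular graph over switches.
    # The switch/port counts are hypothetical, chosen only for illustration.
    import networkx as nx

    N_SWITCHES = 64      # hypothetical number of top-of-rack switches
    SWITCH_PORTS = 8     # ports per switch reserved for switch-to-switch links

    jellyfish = nx.random_regular_graph(d=SWITCH_PORTS, n=N_SWITCHES, seed=0)

    # Shorter average paths between switches are the source of the bandwidth gains.
    # (Assumes the random graph came out connected, which it almost always does
    # at this degree.)
    print(nx.average_shortest_path_length(jellyfish))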

The VL2 paper also involved an element of randomness, though at the routing level rather than the physical connection level, which makes me wonder if there may be anything fundamental about the use of randomness in networking to avoid congestion and increase connectivity.

There are two main tradeoffs I see here: complexity of routing, and length/complexity of cabling. Since servers are no longer connected to nearby neighbors in a tree fashion, and may instead be wired to machines anywhere else in the data center, the average cable length may increase significantly. The cabling also follows a less regular structure, so it may result in a "dread spaghetti monster" (the authors' words). They discuss ways to mitigate this, including clustering the switches (of which there are many in large clusters) in the middle of the data center, constraining most of the cabling to that area. They also discuss a number of ways to deal with the routing complexity issue.

This idea is very interesting, and I am curious to see if in the 3 years since publication anyone has tried this.

Review of "PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric"

PortLand tries to solve problems extremely similar to VL2's: making the network appear flat, allowing VMs to migrate gracefully, reducing the amount of configuration necessary, and enabling efficient communication between any pair of nodes. Again, this is definitely a real issue.

PortLand uses a "PMAC" (pseudo-MAC) address to abstract away physical location details; this plays a role similar to the directory lookups used in VL2. The two systems take very similar approaches: add another layer of abstraction to make physical location transparent to the application developer. PortLand places less emphasis on providing high bandwidth between every pair of servers, instead requiring no change in network topology and focusing on fault tolerance (whereas VL2 adds a denser linking structure to provide high bandwidth).
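If I remember the paper correctly, the PMAC packs pod, position, port, and VM id fields into the 48-bit address so that routing can simply follow the hierarchy. A toy sketch of that encoding (the field widths here are my guess, for illustration only):

    # Hypothetical PMAC encoding: pod.position.port.vmid packed into 48 bits.
    def encode_pmac(pod: int, position: int, port: int, vmid: int) -> int:
        return (pod << 32) | (position << 24) | (port << 16) | vmid

    def decode_pmac(pmac: int):
        return ((pmac >> 32) & 0xFFFF,   # pod
                (pmac >> 24) & 0xFF,     # position within the pod
                (pmac >> 16) & 0xFF,     # switch port
                pmac & 0xFFFF)           # VM id

    pmac = encode_pmac(pod=3, position=1, port=5, vmid=42)
    print(hex(pmac), decode_pmac(pmac))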

Since both of these papers are about six years old, I am curious to know whether either of these systems (or something similar) is in use anywhere. I'm sure some of the ideas have appeared by now, and it would be interesting to find out which approach was more successful. My intuition is that PortLand's approach, which requires fewer network links, probably appears more often, though I don't think I have ever heard of a PMAC-style abstraction for hiding physical location.


Sunday, November 15, 2015

Review of "VL2: A Flexible and Scalable Data Center Network"

The issue of getting good network bandwidth between arbitrary machines within a data center is very important; tasks are often placed based on data locality rather than network proximity, and having to juggle both of these concerns simultaneously is very difficult. Making the network flatter in terms of performance eases this issue.

They have a few central ideas: use Valiant Load Balancing to route flows (rather than packets, as in typical VLB) randomly across the network, use a lookup table to map IP addresses to physical locations such that machines can easily be relocated, and push work to the end nodes to reduce state and complexity of switches.
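A minimal sketch of how I understand the flow-level VLB piece: hash the flow identifier to pick an intermediate switch, so every packet of a flow follows the same path while different flows are spread randomly (the switch names and hash choice here are mine, purely for illustration):

    # Flow-level Valiant Load Balancing sketch: the flow's 5-tuple picks a
    # random-but-consistent intermediate switch to bounce traffic through.
    import hashlib

    INTERMEDIATE_SWITCHES = ["int-0", "int-1", "int-2", "int-3"]  # hypothetical

    def pick_intermediate(src_ip, dst_ip, src_port, dst_port, proto):
        flow = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
        h = int.from_bytes(hashlib.md5(flow).digest()[:4], "big")
        return INTERMEDIATE_SWITCHES[h % len(INTERMEDIATE_SWITCHES)]

    print(pick_intermediate("10.0.0.1", "10.0.1.7", 5123, 80, "tcp"))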

There are a number of tradeoffs; for example, the directory lookup structure for IP addresses helps create the flat-network illusion they work to achieve, but introduces some overhead. Their network also requires considerably more connections than standard topologies, e.g. a complete bipartite graph between the intermediate and aggregation switches. In exchange for this cost, they get much improved performance and ease of programming and task placement.

I can see this being influential in the future; as data center computing becomes completely ubiquitous, we expect it to become easier, with more of its complexities hidden, and the flat network abstraction is very helpful in achieving this. The performance gains compared to a standard tree hierarchy are also substantial. I don't necessarily think this system in particular will take off, but the ideas seem important.


Monday, November 9, 2015

Review of TAO

As more data is represented as graphs to capture its intrinsic connectivity, the ability to store information in a way that is aware of this structure becomes more important.

The main idea of TAO is to provide a graph-aware caching layer on top of a MySQL database, intelligently fetching data based on its structure. It is essentially a replacement for memcache that is tuned specifically to graph-structured data - nothing too revolutionary.
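To make the data model concrete, here is a toy in-memory sketch of objects and associations, loosely modeled on the assoc_add / assoc_range style of calls described in the paper; it is my own illustration, not Facebook's API:

    # Toy TAO-style store: typed objects plus time-ordered association lists.
    from collections import defaultdict

    objects = {}                    # object id -> dict of fields
    assocs = defaultdict(list)      # (id1, assoc type) -> list of (time, id2)

    def obj_add(oid, **fields):
        objects[oid] = fields

    def assoc_add(id1, atype, id2, time):
        assocs[(id1, atype)].append((time, id2))
        assocs[(id1, atype)].sort(reverse=True)   # newest first

    def assoc_range(id1, atype, pos, limit):
        return [id2 for _, id2 in assocs[(id1, atype)][pos:pos + limit]]

    obj_add(1, name="alice")
    obj_add(2, name="bob")
    assoc_add(1, "friend", 2, time=100)
    print(assoc_range(1, "friend", 0, 10))   # -> [2]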

One trade-off they made that surprises me is the use of MySQL as the backing storage engine. It seems to me that this translation layer from graph-structured to relational must impose unnecessary overheads, but it also means that transitioning was likely easier, and they get the reliability guarantees of MySQL automatically.

I don't see this being influential in 10 years - there's uncertainty over whether we will want to continue to use graphs as the primary representation of data moving forward, and even if we do, it doesn't seem that this work is particularly novel. 

Review of Neo

Neo provides a data storage solution built from the ground up to store graph-structured data, unlike TAO, which simply acts as a layer on top of an inherently relational storage model.

While this seems like a good thing, the details provided in the paper about how performant the system is, and how that performance was achieved, are very sparse (essentially nonexistent). I would be very curious to know how it performs compared to traditional databases and NoSQL stores, and if it does perform well, how they managed to achieve this. It reads more like a marketing document than an academic one.

The traverser / navigational model is interesting, and I would be curious to see how more complex queries are written in this manner. If the data really is naturally graph-structured, it seems it would be intuitive to write queries this way, but without any examples I am unsure whether that intuition is correct.
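Absent examples in the paper, here is my guess at what a navigational query might look like, written as plain Python over an adjacency map rather than any real Neo4j API (the graph and relationship names are made up):

    # "Friends of friends of alice" as a two-hop traversal instead of a join.
    graph = {
        "alice": {"KNOWS": ["bob", "carol"]},
        "bob":   {"KNOWS": ["dave"]},
        "carol": {"KNOWS": ["dave", "erin"]},
        "dave":  {"KNOWS": []},
        "erin":  {"KNOWS": []},
    }

    def traverse(start, rel, depth):
        frontier = {start}
        for _ in range(depth):
            frontier = {n for node in frontier for n in graph[node].get(rel, [])}
        return frontier

    print(traverse("alice", "KNOWS", 2))   # -> {'dave', 'erin'}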

Wednesday, November 4, 2015

Review of "GraphX: Graph Processing in a Distributed Dataflow Framework"

I am not sure how often one needs to join graphs against other data when doing graph computations, but it does seem that it could be a common situation, and GraphX can certainly help solve it while remaining extremely performant.

The main nugget is essentially the ability to run graph algorithms on top of Spark, which gives you fault tolerance for free and lets you seamlessly intermingle graph data with other data you may want to join against it. One important step in making this successful was optimizing the common join-map-group pattern.
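As a rough illustration of that join-map-group pattern, here is one PageRank-style step written with plain Python collections standing in for Spark's vertex and edge tables (not GraphX's actual API):

    # join-map-group sketch: vertex table joined to edges, mapped to
    # contributions, then grouped (summed) by destination vertex.
    vertices = {"a": 1.0, "b": 1.0, "c": 1.0}        # vertex id -> rank
    edges = [("a", "b"), ("a", "c"), ("b", "c")]     # (src, dst)

    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1

    # join + map: attach each source's rank to its edges and emit contributions
    contribs = [(dst, vertices[src] / out_degree[src]) for src, dst in edges]

    # group: sum contributions per destination vertex
    new_ranks = {}
    for dst, c in contribs:
        new_ranks[dst] = new_ranks.get(dst, 0.0) + c

    print(new_ranks)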

One trade-off is the programming model: the vertex-program model of GraphLab and Pregel may be more intuitive for graph processing than fitting an extension of Spark's RDD API to the problem, and I certainly had a somewhat harder time reasoning about a program's behavior. Performance may also be an issue in some cases, but in general GraphX seems to perform very well.


Review of "PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs"

I thought this paper was pretty interesting - intuitively the power-law degree distribution of natural graphs makes sense, but I would not have thought about it on my own when working with something like GraphLab. I'm not sure exactly how general the power-law distribution is, but it definitely holds for social networks, which are extremely important for graph processing in today's world, so this is definitely a worthwhile problem to solve.

There are two main ideas: first, separate the program explicitly into Gather / Apply / Scatter phases, allowing PowerGraph to optimize the program because it has a better understanding of what work needs to happen where. Second, partition the graph by edges rather than vertices, since edges are more representative of the work that needs to be done.
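Here is a sketch of how I understand the Gather / Apply / Scatter decomposition, using PageRank in plain Python; the phase names follow the paper, but the graph and everything else is my own illustration:

    # Gather / Apply / Scatter phases for a PageRank-like vertex program.
    in_nbrs  = {"a": [], "b": ["a"], "c": ["a", "b"]}          # who links to me
    out_nbrs = {"a": ["b", "c"], "b": ["c"], "c": []}          # who I link to
    rank = {v: 1.0 for v in in_nbrs}

    def gather(v):
        # combine information from in-edges (parallelizable per edge/partition)
        return sum(rank[u] / len(out_nbrs[u]) for u in in_nbrs[v])

    def apply(v, acc, damping=0.85):
        # update the vertex's own state from the gathered sum
        rank[v] = (1 - damping) + damping * acc

    def scatter(v):
        # signal out-neighbors so they recompute next round
        # (shown for completeness; not needed for this single synchronous pass)
        return out_nbrs[v]

    for v in rank:
        apply(v, gather(v))
    print(rank)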

The main trade-off I see is usability/programmability; it seems a bit easier to reason about programs written in GraphLab and Pregel than in PowerGraph, since the developer has to be more aware of PowerGraph's distinct phases. However, this increase in complexity is relatively minor, and it certainly seems worth the performance benefits in applicable situations.

I can definitely see this being impactful in 10 years - graph processing is a very hot subject right now, and this paper introduces ideas that seem very promising.