I thought this paper was pretty interesting - intuitively the power law distribution of natural graphs makes sense, but I would not have originally thought about it when working with something like GraphLab. I'm not sure exactly how general the power law distribution is, but it's definitely true for social networks, which are extremely important for graph processing in today's world, so this is definitely a worthwhile problem to solve.
There are two main ideas: first, separate the program explicitly into Gather / Apply / Scatter phases, allowing PowerGraph to optimize your program by having a better understanding of what needs to occur where. Second, partition the graph based on edges, not vertices, since this is more representative of the work that needs to be done.
The first trade-off I see is simply usability / programmability; it seems that it is a bit easier to reason about the programs which are written in GraphLab and Pregel than in PowerGraph due to the developer having to be more aware of the different phases in PowerGraph. However, this increase in complexity is relatively minor, and certainly seems worth the performance benefits in applicable situations.
I can definitely see this being impactful in 10 years - graph processing is a very hot subject right now, and this paper seems to introduce some ideas that seem very promising.
No comments:
Post a Comment