Monday, September 21, 2015

Review of "IOFlow: A Software-Defined Storage Architecture"

I am unsure how real a problem they are solving here. While I can see how flow control and speed guarantees could be useful, I haven't personally heard of this being an issue and am not sure how common it is. That said, I suppose it must be a real problem somewhere at Microsoft for them to have put so much effort into solving it.

The main idea is to use a centralized control application to manage policies at the various layers that a distributed IO call must traverse (down into the core of the client, across the network, and through the server's NIC into its file system). The controller tells each layer how to treat different incoming requests (e.g. based on where the call originated or what data it is asking for).
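To make the idea concrete, here is a minimal sketch of a central controller pushing classification rules to the layers along the IO path. It is only an illustration of the concept as I understand it: the names `Rule`, `Stage`, `Controller`, `install`, and `classify` are my own inventions, not the paper's API, and a real system would do this in drivers rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: match on request origin and target, pick a queue."""
    source: str  # e.g. a VM name; "*" matches any source
    target: str  # e.g. a file share; "*" matches any target
    queue: str   # name of the queue the request is routed to


@dataclass
class Stage:
    """One layer on the IO path (client side, network, server side, ...)."""
    name: str
    rules: list = field(default_factory=list)

    def classify(self, source: str, target: str) -> str:
        # First matching rule wins; unmatched requests go to a default queue.
        for rule in self.rules:
            if rule.source in ("*", source) and rule.target in ("*", target):
                return rule.queue
        return "default"


class Controller:
    """Central control application that pushes the same rule to every stage."""

    def __init__(self, stages):
        self.stages = list(stages)

    def install(self, rule: Rule) -> None:
        for stage in self.stages:
            stage.rules.append(rule)


stages = [Stage("client"), Stage("server")]
controller = Controller(stages)
controller.install(Rule(source="vm1", target="*", queue="high-priority"))
```

With this rule installed, every stage sends requests from the hypothetical `vm1` to the `high-priority` queue, and everything else falls through to `default`.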

This solution is different for a few reasons. First, it assumes that the cluster is relatively small (hundreds of servers), which does not hold in the general case (e.g. it would not currently work on Amazon EC2). Second, it assumes a somewhat new workload: numerous VMs living on a single node, many such nodes making up a networked cluster, and each VM (potentially) owned by a different user or application. This multi-tenancy gives a somewhat new take on prioritization.

One fundamental trade-off is the use of a centralized control application: it limits the scalability of the system, since all of the servers need to contact it for instructions and policies. Another is overall speed vs flexibility: overall throughput does decrease as a result of the IOFlow system, but in exchange you get the power to guarantee a certain bandwidth for critical applications.
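One common way to enforce a bandwidth limit on a queue is a token bucket, and capping less critical tenants is one way to leave headroom for critical ones. The sketch below is a generic token bucket, not a claim about how the paper implements its guarantees; the class name, parameters, and the choice to pass the clock in as `now` (to keep the example deterministic) are all my assumptions.

```python
class TokenBucket:
    """Generic token-bucket rate limiter (illustrative, not the paper's code)."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float):
        self.rate = rate_bytes_per_s      # refill rate
        self.capacity = capacity_bytes    # maximum burst size
        self.tokens = capacity_bytes      # start with a full bucket
        self.last = 0.0                   # time of the last refill, in seconds

    def allow(self, size_bytes: float, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        # Admit the request only if enough tokens are available.
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=100, capacity_bytes=100)
first = bucket.allow(80, now=0.0)   # bucket is full, so this is admitted
second = bucket.allow(80, now=0.1)  # only ~30 tokens available, so it is held back
third = bucket.allow(80, now=1.0)   # bucket has refilled, so this is admitted
```

The extra check on every request is also a small illustration of where the throughput cost comes from: each IO pays for classification and accounting before it is allowed to proceed.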

I don't see this paper being influential in 10 years. I can see the utility of IOFlow, but I'm not aware of this being a particularly pressing problem.
