As companies accumulate ever-larger amounts of data, they need scalable ways to deal with it. Most of what we have looked at in this class has been fairly complex - how to process large amounts of data, etc. - but we can't forget one of the simplest applications: just finding things within this wealth of data, aka search.
What sets this apart from previous offerings is simply the volume of data. Local searching and indexing are by no means new concepts, so the main contribution of Elasticsearch was intelligently distributing and sharding the index so that it can be queried quickly and scalably.
The trade-off is primarily expressiveness vs. speed. Elasticsearch's query model is pretty simple: you get textual searches on certain fields, plus things like range filters on numeric types, but not much beyond that. There are no relational joins, and its analytics are limited compared to a full SQL engine. On the other hand, this restricted model is what lets it run extremely quickly on very large datasets.
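To make "textual searches on certain fields, plus range filters" concrete, here is a sketch of an Elasticsearch query DSL body built as a plain Python dict. The field names (`title`, `year`) are made up for illustration; in practice you would POST this JSON to a cluster's `_search` endpoint.

```python
import json

def build_query(text, year_min, year_max):
    """Sketch of an Elasticsearch query DSL body: a full-text match
    combined with a numeric range filter. Field names are hypothetical."""
    return {
        "query": {
            "bool": {
                "must": [
                    # full-text match on the "title" field
                    {"match": {"title": text}}
                ],
                "filter": [
                    # numeric range filter on the "year" field
                    {"range": {"year": {"gte": year_min, "lte": year_max}}}
                ],
            }
        }
    }

body = build_query("distributed search", 2010, 2020)
print(json.dumps(body, indent=2))
```

Note that even this "rich" query is just matching and filtering on individual fields of each document - there is no way to express a join against another index, which is exactly the restriction that keeps queries cheap to fan out across shards.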
I don't see this particular paper being influential in 10 years, but I do see Elasticsearch in general (or at least some derivative) continuing to be very important moving forward.