I recently gave an interview about Algorithms.io. One of the questions was “What were some of the biggest technology challenges you had to overcome?” My answer is below:
First off, what we’re doing today is really enabled by the latest advances in cloud computing. Even 18 months ago it would have been nearly impossible to build this, and all of the people able to do it would have been locked up at one of the large technology companies.
That said, there were two really difficult technical challenges to overcome: abstracting algorithms into reusable components and building a cross-cloud orchestration layer.
Abstracting and normalizing algorithms written in different programming languages into a standard format was an interesting challenge. We can currently take in algorithms written in R, Java, PHP, Python, and C, as well as MapReduce and Pig functions for Hadoop, and chain them all together without writing custom glue code. This approach makes our system highly customizable and allows our customers to leverage the best open source code alongside their own proprietary algorithms. The net impact is that companies get more leverage out of their current data science team, and those without a data scientist can start using these tools with limited ramp-up.
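To make the idea of a standard format concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Algorithms.io implementation: it assumes every algorithm can be wrapped as a step that takes a JSON-style record and returns one, and the names `Step`, `command_step`, `chain`, `normalize`, and `mean` are all hypothetical.

```python
import json
import subprocess
from typing import Callable, Dict, List

# Assumption: every algorithm, whatever its language, is exposed as a
# function from one JSON-style record to another.
Record = Dict[str, object]
Step = Callable[[Record], Record]


def command_step(argv: List[str]) -> Step:
    """Wrap an external program (an R script, a Java jar, ...) as a Step.

    Assumed contract: the program reads one JSON object on stdin and
    writes one JSON object on stdout.
    """
    def run(record: Record) -> Record:
        result = subprocess.run(
            argv,
            input=json.dumps(record),
            capture_output=True,
            text=True,
            check=True,  # raise if the external program fails
        )
        return json.loads(result.stdout)
    return run


def chain(*steps: Step) -> Step:
    """Compose steps so the output of each one feeds the next."""
    def run(record: Record) -> Record:
        for step in steps:
            record = step(record)
        return record
    return run


# Two in-process steps standing in for real algorithms.
def normalize(record: Record) -> Record:
    values = record["values"]
    top = max(values)
    return {**record, "values": [v / top for v in values]}


def mean(record: Record) -> Record:
    values = record["values"]
    return {**record, "mean": sum(values) / len(values)}


pipeline = chain(normalize, mean)
result = pipeline({"values": [2.0, 4.0, 8.0]})

# An external step would slot into the same chain, for example:
# scored = chain(normalize, command_step(["Rscript", "score.R"]))  # hypothetical script
```

Because every step shares one signature, chaining needs no per-pair glue code: adding a new language only requires a wrapper that honors the same record-in, record-out contract.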
Our cross-cloud orchestration layer is another useful piece of technology, and a fairly unusual one. The platform currently runs across Rackspace, Amazon Web Services, Google Compute, and HP Cloud, and Microsoft Azure is on our product roadmap. This allows us to process our customers’ data in whichever cloud environment they already use. We can plug in new database and execution environments and dynamically spin execution resources (e.g. a Hadoop cluster, R server, or Java server) up and down in each of those clouds. With our alpha system alone, that is over 55,000 unique data analysis combinations to manage PER CLOUD. We do all of this programmatically today, and it’s getting better all the time.
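One common way to structure an orchestration layer like this is a single provider interface with one adapter per cloud. The Python sketch below is only an illustration of that pattern, not our production code: `CloudProvider`, `InMemoryProvider`, and `Orchestrator` are hypothetical names, and the in-memory provider stands in for real cloud API calls.

```python
from abc import ABC, abstractmethod
from typing import Dict


class CloudProvider(ABC):
    """Uniform interface that each cloud adapter is assumed to implement."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def start(self, kind: str) -> str:
        """Start a resource of the given kind and return its id."""

    @abstractmethod
    def stop(self, resource_id: str) -> None:
        """Stop a previously started resource."""

    @abstractmethod
    def list_running(self) -> Dict[str, str]:
        """Return a mapping of resource id to resource kind."""


class InMemoryProvider(CloudProvider):
    """Stand-in adapter; a real one would call the cloud's own API."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self._running: Dict[str, str] = {}
        self._counter = 0

    def start(self, kind: str) -> str:
        self._counter += 1
        resource_id = f"{self.name}-{kind}-{self._counter}"
        self._running[resource_id] = kind
        return resource_id

    def stop(self, resource_id: str) -> None:
        del self._running[resource_id]

    def list_running(self) -> Dict[str, str]:
        return dict(self._running)


class Orchestrator:
    """Routes spin-up and spin-down requests to the right cloud."""

    def __init__(self) -> None:
        self._providers: Dict[str, CloudProvider] = {}

    def register(self, provider: CloudProvider) -> None:
        self._providers[provider.name] = provider

    def spin_up(self, cloud: str, kind: str) -> str:
        return self._providers[cloud].start(kind)

    def spin_down(self, cloud: str, resource_id: str) -> None:
        self._providers[cloud].stop(resource_id)

    def running(self, cloud: str) -> Dict[str, str]:
        return self._providers[cloud].list_running()


orchestrator = Orchestrator()
for cloud_name in ("rackspace", "aws"):
    orchestrator.register(InMemoryProvider(cloud_name))

cluster_id = orchestrator.spin_up("aws", "hadoop-cluster")
r_server_id = orchestrator.spin_up("rackspace", "r-server")
orchestrator.spin_down("aws", cluster_id)  # release the cluster when the job ends
```

With this shape, supporting another cloud means writing one more adapter and registering it; the code that schedules analyses does not change.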