Friday, December 13, 2013

We have been acquired by LumenData

On behalf of the team, I’m pleased to announce that LumenData, a leader in the Master Data Management (MDM) and Data Strategy space, has acquired our company.  LumenData’s solutions are used by dozens of enterprises to address the difficult problems of data integration, cleansing, and maintenance.  The company’s MDM software aggregates data across the enterprise into a “single source of truth” that customers can rely on to effectively run their business.

We founded this company to help businesses use data generated by connected devices to identify insights and products.  As adoption of our cloud machine learning platform has grown, we’ve come to realize that expertise in data strategy and integration is an integral part of providing larger, more meaningful solutions.  The Big Data era is still young, and while the market has been flooded with new technologies, companies still need a trusted advisor to help them understand what data to use and how best to assemble it in order to get the most value from analytics.  By combining LumenData’s expertise in overall data strategy with our powerful cloud platform, we are well positioned to work with our collective customers from initial concept through production deployment.

We would like to thank all of our customers, partners and advisors who have supported us over the past two years.  We’re thrilled to begin this next chapter of our journey and look forward to continuing to provide innovative solutions for companies looking to transform streaming data into business value.


Andy Bartley

Tuesday, October 1, 2013

Visit Us At Booth 118 At Dataweek Tomorrow and Thursday

Come visit us tomorrow and Thursday at Dataweek 2013 at the Fort Mason Center in San Francisco.  We will be at booth #118 giving a live demo of our new machine learning platform for wearable devices.

This new platform intelligently classifies streaming data from wearable devices into actionable events that can be used to build predictive applications.  It combines the roles of data scientist, DevOps engineer, and developer into one simple service.

We've received a very positive response about the new platform, including being voted Dataweek's September Startup of The Month.

For those who can't make it to Dataweek, you can catch us later this month at the following events:

Oct. 17
@ San Jose Conference Center

Oct. 21
@Hacker Dojo

Oct. 25
@The Hub, Seattle


We are transitioning to focus on streaming data

The past six months have been exciting for us.  We publicly launched our first algorithms-as-a-service offering in April, won pitch competitions in May and June, and have been heads down working with customers since July.

It's been a lot of fun going from idea to revenue. We've had the opportunity to work with great companies in a variety of industries including semiconductors, online education, and venture capital.  We've enjoyed building recommendation systems, predictors, and intelligent content classifiers.  But every once in a while, a calling comes that you just can't ignore.

That calling for us is the internet of things (IoT).  While we've had our eye on this space for some time, the last several months have made it clear that the right fit for our technology and team is working with streaming data from connected devices.

Our technical team has been working with massive streams of data for years in the security space (security was big data before "big data" even existed as a term).  In security, as with IoT, classifying data is the name of the game.  Systems need to be able to ingest massive amounts of raw data and figure out WHAT it means.  Is there a threat? Is a system going out of whack? Is a system being hacked?

All of these events have a digital fingerprint that can be identified with the right data models.  This requires infrastructure to properly ingest and store streaming data, a modeling process to build the "digital fingerprint" of what you're looking for, and machine learning so that those models get better over time.
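As a toy illustration of the modeling piece (this is a sketch, not our production pipeline; the window size and choice of features are arbitrary), a "digital fingerprint" can be as simple as summary statistics computed over a sliding window of sensor readings:

```python
from statistics import mean, pstdev

def window_features(samples, size=4):
    """Slide a fixed-size window over a stream of numeric samples and
    summarize each window as a (mean, std-dev) pair -- a crude
    'fingerprint' of what the signal is doing in that window."""
    features = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        features.append((mean(window), pstdev(window)))
    return features

# A calm signal followed by a noisy burst yields two very different fingerprints.
calm = [1.0, 1.1, 0.9, 1.0]
burst = [1.0, 9.0, -7.0, 8.0]
print(window_features(calm + burst, size=4))
```

A spike in the per-window standard deviation is exactly the kind of signature that separates "system going out of whack" from normal operation; real models use richer features and learn their thresholds from data rather than hard-coding them.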

Data from IoT devices needs the same intelligence.  Devices should be able to easily stream raw data into scalable storage.  It should be easy to build data models for "digital fingerprints" and apply machine learning to find and refine those fingerprints over time.  Today, there are several great companies and open-source projects tackling the infrastructure piece, but machine learning solutions that solve the big problem for IoT, classification (and, to a lesser degree, anomaly detection), are still lacking.

It is our goal to make this powerful machine learning technology available to companies of all sizes.  We are starting out with wearables companies, and will be announcing some exciting new projects in manufacturing and healthcare soon, so stay tuned.  We're excited to be playing in such a rapidly growing space, and look forward to working with a cadre of great customers and partners in the months to come.

Wednesday, September 25, 2013

Data format required for streaming platform alpha launch

In response to several recent questions, here's a quick update on the data structure our system can handle for now.  

Input: what the data should look like.  Each piece of data represents a point in time.  Your sensor gathers data and then emits it.  You don't explicitly mark the time on each piece of data; as events are created, they are sent to the system.

Each piece of data (call it an event, if you like) has some properties associated with it, depending on what your sensor is designed to do.  A gyroscope, for example, emits rotation around the x, y, and z axes, so a single sample might look like gyro.x = -84.2, gyro.y = -132.2, gyro.z = -80.  All of that data represents one event.  You can send as many of these "events" to the system as you want.
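For concreteness, here is a sketch of what one such gyroscope event might look like serialized as JSON; the field names are illustrative, not a published schema:

```python
import json

# One event: a single gyroscope sample with its three axis readings.
# No timestamp is set by the sender; events are sent as they are created.
event = {
    "sensor": "gyro",
    "gyro.x": -84.2,
    "gyro.y": -132.2,
    "gyro.z": -80,
}

payload = json.dumps(event)
print(payload)
```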

Most sensors can emit at a very high frequency.  You can try to send all that data into the system (which it can handle), but is your internet connection good enough to send all that data?  Do you even need that high a sample rate for your application?  What is the right sample rate?  That's another topic for another blog post.
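When bandwidth is tight, one simple approach (a hypothetical sketch, not a platform feature) is to decimate before sending: average every N raw readings into one event.

```python
def downsample(samples, factor):
    """Reduce the sample rate by averaging each group of `factor` raw
    readings into one value.  Trailing readings that don't fill a full
    group are dropped."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]

# Eight high-frequency readings become two averaged events.
raw = [1.0, 3.0, 2.0, 2.0, 10.0, 12.0, 11.0, 11.0]
print(downsample(raw, factor=4))  # → [2.0, 11.0]
```

The right `factor` depends on how fast the phenomenon you care about actually changes, which is the sample-rate question above.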

Once you have all this time series data streamed in, you can do some very cool stuff with it.  For our users, the most important thing to do with the data is classify it WITHOUT having to pre-define the device state.  What that means is being able to tell from the raw data what's going on, without requiring pre-set parameters that define the events ("right now I am running, all data coming in is related to running") for a specific period of time.
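To make "classifying without pre-defined states" concrete, here is a toy example of the general idea (not our actual algorithm): letting the data split itself into groups by similarity, with no labels supplied, using a minimal two-cluster k-means on one-dimensional readings.

```python
def two_means(values, iterations=10):
    """Cluster 1-D values into two groups without any pre-set labels:
    a minimal k-means with k=2.  Assumes the stream actually spans two
    distinct regimes (min != max)."""
    c1, c2 = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        # Assign each value to its nearest center, then recompute centers.
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted((c1, c2))

# Raw activity intensities separate into "low" and "high" states
# with no one telling the system which is which.
readings = [0.9, 1.1, 1.0, 7.8, 8.2, 8.0]
print(two_means(readings))
```

Real streaming classifiers work on multi-dimensional feature windows and refine their models over time, but the principle is the same: the states emerge from the data.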

We'll have more on this in our support FAQ.

Friday, September 13, 2013

Now Providing Machine Learning For Streaming Data

The team is excited to announce that we will now be providing a version of our machine learning platform specifically for streaming data. We will be supporting flat data from any type of connected device.
This offering comes as a result of customer demand over the past several months. We had initially planned to launch our streaming platform in Q1 of 2014. However, the number of requests we’ve received for streaming support convinced us that the time was right to bring our Beta offering to market now.
So, what are the top three problems this platform solves for our customers?

First, turning sensor data into useful information is not easy. Many of our customers have limited data science resources on their teams, and developing and moving machine learning algorithms into production is a challenge. Our catalog of machine learning algorithms provides a complete data science solution out of the box. This is especially useful for startups looking to raise capital, as our platform provides a Big Data and Machine Learning story for pitches.

Second, time to market. The wearables market is exploding, and being able to provide valuable apps to users based on device data is critical. Our platform provides all the infrastructure needed to ingest, store, and predictively model device data and return results that power consumer web and mobile apps. This eliminates months of development time, allowing our customers to stay focused on delighting their own customers and building awesome developer communities.

Third, scalability. It’s one thing to build out the infrastructure; it’s another to scale it across thousands of devices and millions of events. Our cross-cloud architecture automatically scales with our customers’ businesses. It leverages multiple geographies across multiple cloud providers to ensure service is never interrupted.

For our technical friends, the platform includes:
  • Web Sockets For True Streaming
  • Multiple Classification Algorithms
  • Time-Series Data Storage
  • Streaming Visualizations
  • API for Developers
You can find more information on the platform here:
We’re now accepting new users for our Beta. If you’re interested in joining, please email me at andy(at)

Tuesday, August 20, 2013

Do developers actually spend time looking for new algorithms?

One question we've been curious about when considering the potential of algorithms as a service is how often developers look for new algorithms.

So, we ran a quick two-day survey to get an initial indication of what the answer might be.

With a sample size of n = 80, we'd be remiss not to point out the obvious fact that the statistical validity of the responses wouldn't hold up in Judge Judy's court.  That said, as a quick finger-in-the-air test, it gives an idea of whether any wind is blowing at all.

The answers in this population were a bit lower than we initially expected.  For a service to be meaningful, we'd want it to have the potential to engage users on at least a monthly basis (to justify the monthly fee).  With most respondents looking for new algos only around 2 - 4 times per year, a service that provides new algorithms would require a significant number of eyeballs to manage the churn that would likely come with disengagement (assuming subscription sign-ups in the first place).

We'll be broadening the scope of this survey to get a larger sample size and more meaningful results.  The results will be posted when available, so stay tuned.

Wednesday, July 24, 2013

Machine Learning as a Service gaining in popularity with those who matter - the implementers

I had the opportunity to speak on a panel last night on the topic of Machine Learning as a Service.  My fellow panelists were from BigML, Snap Analytx, and Grok.
One of the questions I was asked before the event was “What needs to happen for broader acceptance of Machine Learning as a Service?”  My answer is below:
“I think this hinges on a broader understanding of how intelligent algorithms like machine learning can add value.  This means greater discussion of successful use cases on how the use of these algorithms has a meaningful impact, especially as compared to traditional approaches.  I’m looking forward to the day when there are so many compelling stories that Amazon and Netflix aren’t the first examples people use to talk about Machine Learning.
“I have noticed that an increasing number of people we talk to have spent some time educating themselves on this concept, and it’s not all techies.  Many are being exposed to the topic as part of the broader discussion around Big Data.  Their analysis is evolving from “How would I use this in my organization?” to “How do I fit this within my organization’s budget constraints?”  People are wondering whether it is something they are going to build out internally or whether there is a way to access some of the technology without building out internal resources.  Those are the people that we really enjoy talking to.”

Saturday, July 20, 2013

Two Technology Hurdles We’ve Overcome

I recently conducted an interview about the company.  One of the questions asked was “What were some of the biggest technology challenges you had to overcome?”  My answer is below:

First off, what we’re doing today is really enabled by the latest advances in cloud computing.  Even 18 months ago it would have been nearly impossible to build this, and all of the people able to do it would have been locked up at one of the large technology companies.

That said, there were two really difficult technical challenges to overcome: abstracting algorithms into reusable components and building a cross-cloud orchestration layer.

Being able to abstract and normalize algorithms written in different programming languages into a standard format was an interesting challenge.  We can currently take algorithms written in R, Java, PHP, Python, and C, as well as MapReduce and Pig functions for Hadoop, and chain them all together without having to write custom glue code.  This approach makes our system infinitely customizable and allows our customers to leverage the best open source code along with their own proprietary algorithms.  The net impact is that companies get more leverage out of their current data science team, or, for those without a data scientist, can start using these tools with limited ramp-up.
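In very simplified form, the abstraction can be pictured as wrapping every algorithm, whatever language it runs in, behind one uniform data-in/data-out interface so that steps chain without glue code. The class and function names below are invented purely for illustration:

```python
from typing import Any, Callable

class AlgorithmStep:
    """Wrap any runner -- an R script, a Java jar, a Python function --
    behind one uniform interface: data in, data out."""
    def __init__(self, name: str, runner: Callable[[Any], Any]):
        self.name = name
        self.runner = runner

    def run(self, data):
        return self.runner(data)

def chain(steps, data):
    """Feed each step's output into the next step's input, no glue code."""
    for step in steps:
        data = step.run(data)
    return data

# Two toy "algorithms" chained together.
normalize = AlgorithmStep("normalize", lambda xs: [x / max(xs) for x in xs])
threshold = AlgorithmStep("threshold", lambda xs: [x > 0.5 for x in xs])
print(chain([normalize, threshold], [2.0, 8.0, 10.0]))  # → [False, True, True]
```

The real system additionally has to marshal data across language runtimes and execution environments, but the contract each step sees is the same.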

Our cross-cloud orchestration layer is also a useful piece of technology that’s fairly unique.  The platform currently sits across Rackspace, Amazon Web Services, Google Compute, and HP Cloud, and we have Microsoft Azure on our product roadmap.  This allows us to process our customers’ data in whichever cloud environment they are currently using.  We’re able to plug in new database and execution environments and dynamically spin execution resources (i.e. a Hadoop cluster, an R server, a Java server, etc.) up and down in each of those clouds.  With our alpha system alone, that is over 55,000 unique data analysis combinations to manage PER CLOUD.  We do all of this programmatically today, and it’s getting better all the time.

Friday, July 12, 2013

We Now Support Async Calls

We have just released new functionality that allows you to make an asynchronous call to any of our algorithms via our API.  Prior to this, every algorithm’s API call was synchronous, meaning that if an algorithm takes 5 minutes to run, the API call holds the connection for 5 minutes.  This is not ideal in some situations.  The new asynchronous functionality allows you to run the same algorithm, but instead of waiting around for it to finish, the system returns a job ID that you can use to query for the status of the job.
Let me show you how this works.  It is not too different from before.
Making an asynchronous call.  The only thing that changes in the call is that the parameter “method” is now set to “async”:
curl -X POST \
-d 'method=async' \
-d 'outputType=json' \
-d 'train=3339' \
-d 'test=3340' \
-d 'dependentVariable=closed' \
-H "authToken: <AUTH_TOKEN>" \
"<ALGORITHM_URL>"
This call will return immediately.  Using the “job_id”, you can then query for the status of the job:
curl -X GET \
-H "authToken: <AUTH_TOKEN>" \
"<JOB_STATUS_URL>"
{
    "additional_info": {
        "final": {
            "output": { ... }
        }
    },
    "datasource": { ... },
    "created": {
        "date": "2013-07-09 22:20:29"
    },
    "last_modified": {
        "date": "2013-07-09 22:20:29"
    }
}
The job will move through various states depending on the algorithm, with one of two final states: “ERROR” or “COMPLETED”.  Once you see either of these, the output is final.  The results from the algorithm are placed into a datasource for you.  The datasource ID can be found in two places in the job query response; both will have the same ID.
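In client code, the async pattern reduces to a poll loop: fetch the job status, stop on a final state, otherwise wait and retry. A sketch in Python, where `fetch_status` stands in for the job-status GET call above, and the simulated responses (including the datasource ID) are made up for illustration:

```python
import time

def poll_job(fetch_status, interval=5.0):
    """Poll until the job reaches a final state.
    `fetch_status` is any callable returning the parsed job-status dict."""
    while True:
        job = fetch_status()
        if job.get("status") in ("COMPLETED", "ERROR"):
            return job                  # final state: the output is ready
        time.sleep(interval)            # not final yet: wait and retry

# Simulated status sequence standing in for real API responses.
responses = iter([{"status": "RUNNING"},
                  {"status": "RUNNING"},
                  {"status": "COMPLETED", "datasource": 3341}])
print(poll_job(lambda: next(responses), interval=0.0))
```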
That's about it.  Making asynchronous calls is easy!

Saturday, July 6, 2013

The increasing importance of Algorithms: Article by The Guardian's Leo Hickman

The Guardian’s Leo Hickman published a new article this week about the increasingly pervasive use of algorithms in today’s data-driven world.
You can find the full article transcript here:
The article included an interesting quote from Chris Steiner, author of Automate This: How Algorithms Came to Rule Our World.
Steiner argues that we should not automatically see algorithms as a malign influence on our lives, but we should debate their ubiquity and their wide range of uses. “We’re already halfway towards a world where algorithms run nearly everything. As their power intensifies, wealth will concentrate towards them. They will ensure the 1%-99% divide gets larger. If you’re not part of the class attached to algorithms, then you will struggle.”
In today’s digital world, more businesses than ever before have access to data about their customers and operations.  Much more is to come, as Silicon Valley has invested over $1B in Big Data since Q2 2011.  The opportunities to turn this resource into revenue will continue to evolve.  Those businesses that are able to do this with algorithms will get the most leverage and, as demonstrated in ecommerce with recommendations, will set a new standard that all others must follow or fail.

Thursday, June 27, 2013

Thoughts on why Machine Learning as a Service is useful for developers

I recently had someone ask me why Machine Learning as a Service is a viable business.  It’s not the first time I’ve heard that question.  Below is an excerpt from my response:
“I think one of the best trends to look at is the increase in the number of APIs that are available, and the increasing number of applications that are mashups of various APIs. Some of these mashups (e.g. Summly) have been very successful.
With the web now highly programmable, basic functions are now commodities (payments, social, location, etc). As this happens the standard for all applications rises, and developers look to new technologies to add competitive differentiation to their applications.
We believe Machine Learning will be a key competitive differentiator for applications. Being able to turn raw data into intelligent output to build better apps and user experiences has already been shown to be a significant advantage (Google, Netflix, Amazon). The same artificial intelligence that helps make these companies great must eventually be delivered in a way that the broader market can consume, because the market will demand it to stay competitive.
We’re still early in the adoption cycle; most of the developers I’ve talked to today would rather consume ML inside of a complete application, but we’re starting to see an increase in the number of new inquiries we receive about specific algorithms. Developers and companies are realizing that if they don’t begin to do more with their data, they will be left behind by competitors. Not all of them are going to learn ML or hire someone who knows it, so having access to it via a simple API can be an attractive option.”

Thursday, June 20, 2013

Much Love For Mashape

Launching a new API isn’t easy.   There is a lot of noise in the market, and discovery can be challenging.  That’s why finding great partners to work with in this space is of vital importance.
We made the decision early on to partner with Mashape to bring our Machine Learning APIs to market.  For those who aren’t familiar, Mashape is the leading open market for APIs.  They help new APIs get discovered and integrated into awesome applications.
Mashape has been an exceptional partner from day one.  Before we even launched with them, Aghi, their CEO, was willing to meet with me in person to explain their marketplace.  As we rolled out our offerings, the technical team was available around the clock to help make sure our APIs were working properly.
Most recently, Mashape has brought Chris Ismael on to help with promotion of APIs.  Chris has done an unbelievable job for us.  He’s lined up several hackathon sponsor opportunities and media interviews that have exposed our company to an extensive new audience.
Here’s what Chris has to say about working at Mashape:
“As a Mashape evangelist, I get to have a unique and interesting perspective on APIs. While everyone’s hacking on the Facebooks and Twitters of the world, I get to play with the ones that don’t get mainstream attention. This gap between developers and the ‘long-tail’ APIs presents a huge opportunity cost to the API economy. Which is why I enjoy my job of promoting APIs from Mashape – aside from the fact that I get to show off complex stuff without driving myself crazy (e.g. need to build a recommendation engine like Amazon? There’s an API for that), I get to contribute back to the API providers by giving them valuable feedback from the developers I show their APIs to. If I can bridge that gap and get them closer, I’ll be a happy camper.”
Chris and the Mashape team are helping to ensure that the potential of the web as a programming platform is realized.  If you are launching a new API I strongly encourage you to connect with the Mashape team to find out how they can help you make your service known.

Friday, June 14, 2013

Our cloud infrastructure for Algorithm APIs

Launching a new product or service is an exciting time.  New customers, new opportunities, new success stories.  That’s why we are excited to announce the launch of our new cloud infrastructure as a service for Algorithm-based APIs.
This offering comes as the result of two key trends we are observing:
1. In the world of the Programmable Web, algorithms are a competitive advantage:  With more APIs available than ever before, it’s becoming easier to build interesting new applications from nothing more than mashing up APIs.  When anyone can build an app, how can developers differentiate themselves?  We believe that adding intelligence to applications will be one of the key ways to accomplish this.  And we’re not alone in this belief, because…
2. The number of APIs based on complex algorithms is on the rise:  Natural language processing, audio feature extraction, image and facial recognition, and machine learning are just a handful of the complex new algorithm classes available via API.  As these become better understood, they will evolve from being a competitive differentiator to an industry standard.  The market demand for these complex algorithms has fueled an explosion in the number of new algorithm-based APIs.  Mashape recently posted an article on their blog highlighting over 40 machine learning algorithms available through their marketplace.
In response to this demand, we decided to offer our cloud back end to developers looking to bring a new algorithm-based API to market.  Think of our platform as the Heroku for Algorithm APIs.  It’s a cross-cloud platform that manages the development, provisioning, and scaling of cloud infrastructure across Rackspace, AWS, and Google Compute.
The platform includes all of the necessary components to ingest, store and process data with algorithms:
  • Use any database you like (SQL, NoSQL, Graph, etc)
  • Supports algorithms written in Java, C, Python, PHP, R, MapReduce, and PIG
  • Messaging queue and load balancer
  • Orchestration queue executes jobs automatically in the proper environment
  • Multi-tenant with security roles and permissions – one platform for many customers
Our service is white-labeled, meaning developers control their own brand.  With a web page and tutorial content built over a weekend, developers can bring their APIs to market.
If you have an idea for the next awesome algorithm API, or you’re sick of dealing with infrastructure maintenance for your current API, contact us today and let us help you launch and grow your business.

Tuesday, June 4, 2013

Voted “Most Likely To Succeed” at SV Forum Launch: Silicon Valley

I’m pleased to announce that we were one of the six companies to receive the “Most Likely to Succeed” award at SV Forum’s Launch: Silicon Valley 2013 conference this past Tuesday. We presented alongside 29 other great companies and are humbled to have been chosen for this award.
It was a great event with many interesting speakers. Ray Kurzweil opened the conference and emphasized the importance of algorithms. His speech provided a great introduction to our pitch, which also highlighted the importance of algorithms in today’s data-driven world.
We received interest from many companies throughout the day, further validating our belief that delivering algorithms as a cloud service is essential in today’s tech ecosystem. The age of machine learning is underway!

You can see the video of our pitch on Tube Chop here: