Friday, December 13, 2013

We Have Been Acquired by LumenData

On behalf of the team, I’m pleased to announce that LumenData, a leader in the Master Data Management (MDM) and Data Strategy space, has acquired our company.  LumenData’s solutions are used by dozens of enterprises to address the difficult problems of data integration, cleansing, and maintenance.  The company’s MDM software aggregates data across the enterprise into a “single source of truth” that customers can rely on to effectively run their business.

We founded our company to help businesses use data generated by connected devices to uncover insights and build new products.  As adoption of our cloud machine learning platform has grown, we’ve come to realize that expertise in data strategy and integration is an integral part of providing larger, more meaningful solutions.  The days of Big Data are still young, and while the market has been flooded with new technologies, companies still need a trusted advisor to help them understand what data to use and how to best assemble it in order to get the most value from analytics.  By combining LumenData’s expertise in overall data strategy with our powerful cloud platform, we are well positioned to work with our collective customers from initial concept through production deployment.

We would like to thank all of our customers, partners, and advisors who have supported us over the past two years.  We’re thrilled to begin this next chapter of our journey and look forward to continuing to provide innovative solutions for companies looking to transform streaming data into business value.


Andy Bartley

Tuesday, October 1, 2013

Visit Us At Booth 118 At Dataweek Tomorrow and Thursday

Come visit us tomorrow and Thursday at Dataweek 2013 at the Fort Mason Center in San Francisco.  We will be in booth #118 giving a live demo of our new machine learning platform for wearable devices.

This new platform intelligently classifies streaming data from wearable devices into actionable events that can be used to build predictive applications.  It combines a data scientist, dev ops engineer, and developer all into one simple service.

We've received a very positive response about the new platform, including being voted Dataweek's September Startup of The Month.

For those who can't make it to Dataweek, you can catch us later this month at the following events:

Oct. 17
@ San Jose Conference Center

Oct. 21
@ Hacker Dojo

Oct. 25
@ The Hub, Seattle


We are transitioning to focus on streaming data

The past six months have been exciting for us.  We publicly launched our first algorithms-as-a-service offering in April, won pitch competitions in May and June, and have been heads down working with customers since July.

It's been a lot of fun going from idea to revenue.  We've had the opportunity to work with great companies in a variety of industries, including semiconductors, online education, and venture capital.  We've enjoyed building recommendation systems, predictors, and intelligent content classifiers.  But every once in a while, a calling comes that you just can't ignore.

That calling for us is the Internet of Things (IoT).  While we've had our eye on this space for some time, the last several months have made it clear that the right fit for our technology and team is working with streaming data from connected devices.

Our technical team has been working with massive streams of data for years in the security space (security was big data before big data even existed as a term).  In security, as with IoT, classifying data is the name of the game.  Systems need to be able to ingest massive amounts of raw data and figure out WHAT it means.  Is there a threat?  Is a system going out of whack?  Is a system being hacked?

All of these events have a digital fingerprint that can be identified with the right data models.  This requires infrastructure to properly ingest and store streaming data, a modeling process to build the "digital fingerprint" of what you're looking for, and machine learning so that those models get better over time.

Data from IoT devices needs the same intelligence.  Devices should be able to easily stream raw data into scalable storage.  It should be easy to build data models for "digital fingerprints" and apply machine learning to find and refine those fingerprints over time.  Today, there are several great companies and open-source projects tackling the infrastructure piece, but machine learning solutions that solve the big problems for IoT, classification and (to a lesser degree) anomaly detection, are still lacking.

It is our goal to make this powerful machine learning technology available to companies of all sizes.  We are starting out with wearables companies, and will be announcing some exciting new projects in manufacturing and healthcare soon, so stay tuned.  We're excited to be playing in such a rapidly growing space, and look forward to working with a cadre of great customers and partners in the months to come.

Wednesday, September 25, 2013

Data format required for streaming platform alpha launch

In response to several recent questions, here's a quick update on the data structure our system can handle for now.  

Input: what the data should look like.  Each piece of data represents a point in time.  Your sensor gathers data and then emits it.  You don't need to mark the time on each piece of data; as events are created, they are sent to the system.

Each piece of data (call it an event, if you like) has some properties associated with it, depending on what your sensor is designed to do.  For example, a gyroscope sensor emits rotation around the x, y, and z axes.  A single sample from the sensor might look like gyro.x = -84.2, gyro.y = -132.2, gyro.z = -80.  All of that data represents one event.  You can send as many of these "events" to the platform as you want.
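As a rough illustration (the field names and JSON shape here are hypothetical, not the platform's documented format), a single gyroscope event might be assembled and serialized like this:

```python
import json

# One "event": a single gyroscope sample with its properties.
# Field names are illustrative only; adapt them to your own sensor.
event = {
    "sensor": "gyro",
    "x": -84.2,   # rotation around the x axis
    "y": -132.2,  # rotation around the y axis
    "z": -80.0,   # rotation around the z axis
}

# The device serializes each event and streams it out as it is created;
# no explicit timestamp is attached by the device itself.
payload = json.dumps(event)
print(payload)
```

Each such payload counts as one event, and a stream is simply a sequence of them sent as the sensor produces samples.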

Most sensors can emit at a very high frequency.  You can try to send all of that data to the platform (which it can handle), but is your internet connection good enough to send it all?  And do you need that high a sample rate for your application?  What is the right sample rate?  That's another topic for another blog post.
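As one hedged sketch of that trade-off, naively decimating a high-frequency stream before transmission cuts bandwidth by the chosen factor (the factor of 10 below is arbitrary, and real applications may prefer averaging or filtering to avoid aliasing):

```python
def downsample(samples, factor=10):
    """Keep every Nth sample from a high-frequency sensor stream.

    Naive decimation, for illustration only: it reduces bandwidth by
    `factor` but discards the intermediate samples entirely.
    """
    return samples[::factor]

# 1000 raw samples reduced to 100 before sending.
raw = list(range(1000))
reduced = downsample(raw, factor=10)
print(len(reduced))  # 100
```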

Once you have all of this time series data streamed in, you can do some very cool stuff with it.  For our users, the most important thing to do with the data is classify it WITHOUT having to pre-define the device state.  That means being able to tell from the raw data what's going on, rather than requiring pre-set parameters that define the events ("right now I am running, so all data coming in is related to running") for a specific period of time.
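One way to read "classification without pre-defined states" is unsupervised clustering: grouping raw sensor readings by similarity and letting the groups stand in for device states, with no labels supplied up front. A minimal toy sketch (two clusters over 1-D magnitudes; this is an assumption for illustration, not a description of the platform's actual method):

```python
def two_means(values, iters=20):
    """Cluster 1-D sensor magnitudes into two groups (e.g. "still" vs.
    "moving") without any pre-defined state labels -- a toy stand-in
    for unsupervised event classification."""
    lo, hi = min(values), max(values)  # initial centroids
    for _ in range(iters):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        if a:
            lo = sum(a) / len(a)
        if b:
            hi = sum(b) / len(b)
    return [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]

# Low-magnitude samples separate from high-magnitude ones with no
# pre-set "running" / "resting" parameters supplied by the user.
stream = [0.1, 0.2, 0.15, 9.8, 10.1, 9.9, 0.05]
print(two_means(stream))  # [0, 0, 0, 1, 1, 1, 0]
```

A production system would of course work on windows of multi-axis data and refine its models over time, but the principle is the same: the states emerge from the data rather than being declared in advance.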

We'll have more on this in our support FAQ.

Friday, September 13, 2013

Now Providing Machine Learning For Streaming Data

The team is excited to announce that we will now be providing a version of our machine learning platform specifically for streaming data. We will be supporting flat data from any type of connected device.

This offering comes as a result of customer demand over the past several months. We had initially planned to launch our streaming platform in Q1 of 2014. However, the number of requests we’ve received for streaming support convinced us that now was the right time to bring our Beta offering to market.
So, what are the top three problems this platform solves for our customers?

First, turning sensor data into useful information is not easy. Many of our customers have limited data science resources on their teams. Developing machine learning algorithms and moving them into production is a challenge. Our catalog of machine learning algorithms provides a complete data science solution out of the box. This is especially useful for startups looking to raise capital, as our platform provides a Big Data and Machine Learning story for pitches.

Second, time to market. The wearables market is exploding, and being able to provide valuable apps to users based on device data is critical. Our platform provides all the infrastructure needed to ingest, store, predictively model, and return results that power consumer web and mobile apps. This eliminates months of development time, allowing our customers to stay focused on delighting their own customers and building awesome developer communities.

Third, scalability. It’s one thing to build out the infrastructure; it’s another to scale it across thousands of devices and millions of events. Our cross-cloud architecture automatically scales with our customers’ business. It leverages multiple geographies across multiple cloud providers to ensure service is never interrupted.

For our technical friends, the platform includes:
  • Web Sockets For True Streaming
  • Multiple Classification Algorithms
  • Time-Series Data Storage
  • Streaming Visualizations
  • API for Developers
You can find more information on the platform here:
We’re now accepting new users for our Beta. If interested in joining the beta, please email me at andy(at)

Tuesday, August 20, 2013

Do developers actually spend time looking for new algorithms?

One question we've been curious about when considering the potential of algorithms as a service is how often developers look for new algorithms.

So, we ran a quick two-day survey to get an initial indication of what the answer might be.

With a sample size of n = 80, we'd be remiss not to point out that the statistical validity of the responses wouldn't hold up in Judge Judy's court.  That said, as a quick finger-in-the-air test, it gives an idea of whether any wind is blowing at all.

The answers in this population were a bit lower than we initially expected.  For a service to be meaningful, we'd want it to have the potential to engage users on at least a monthly basis (to justify the monthly fee).  With most respondents looking for new algorithms only around 2-4 times per year, a service that provides new algorithms would require a significant number of eyeballs to manage the churn that would likely come with disengagement (assuming subscription sign-ups in the first place).

We'll be broadening the scope of this survey to get a larger sample size, and more meaningful results.  The results will be posted when available so stay tuned.

Wednesday, July 24, 2013

Machine Learning as a Service gaining popularity with those who matter - the implementers

I had the opportunity to speak on a panel last night on the topic of Machine Learning as a Service.  My fellow panelists were from BigML, Snap Analytx, and Grok.
One of the questions I was asked before the event was “What needs to happen for broader acceptance of Machine Learning as a Service?”  My answer is below:
“I think this hinges on a broader understanding of how intelligent algorithms like machine learning can add value.  This means greater discussion of successful use cases on how the use of these algorithms has a meaningful impact, especially as compared to traditional approaches.   I’m looking forward to the day when there are so many compelling stories that Amazon and Netflix aren’t one of the first examples people use to talk about Machine Learning.
“I have noticed that an increasing number of people we talk to have spent some time educating themselves on this concept, and it’s not all techies.  Many are being exposed to the topic as part of the broader discussion around Big Data.  Their analysis is evolving from ‘How would I use this in my organization?’ to ‘How do I fit this within the budget of my organizational constraints?’  People are wondering if it is something they are going to build out internally or if there is a way for them to have access to some of the technology without building out internal resources.  Those are the people that we really enjoy talking to.”