Data science: a healthy and rapidly maturing market

by Julien Deblander - Data Scientist

There can be no science without instruments, and data science is no exception to that rule. Luckily, going by the latest Gartner Magic Quadrant for Data Science and Machine Learning (DSML) platforms, the market for data science tools is “beyond healthy and thrillingly innovative”.

This exceptionally good health clearly shows in the numbers Gartner has gathered, which run into the double digits. Overall revenue from DSML platforms grew by 19% in 2018, making this the second-fastest-growing segment of today’s total analytics and BI software market. The segment’s market share grew from 14.1% in 2017 to 15.1% in 2018, with several of the smaller and younger vendors sustaining hypergrowth.
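
To put those figures in perspective, here is a rough back-of-the-envelope sketch. It assumes that both share percentages refer to the same total (analytics and BI software revenue) and that the 19% is year-over-year revenue growth; the function and variable names are my own and do not come from Gartner. Under those assumptions, the three numbers together imply how fast the total market must have grown.

```python
def implied_total_growth(segment_growth, share_before, share_after):
    """Growth of the total market implied by a segment's growth and its share shift.

    Assumption: share = segment revenue / total revenue in each year, so
    total_after / total_before = (1 + segment_growth) * share_before / share_after.
    """
    return (1 + segment_growth) * share_before / share_after - 1


# Figures quoted in the text above
dsml_growth = 0.19   # DSML platform revenue growth in 2018
share_2017 = 0.141   # DSML share of the analytics and BI market, 2017
share_2018 = 0.151   # DSML share of the analytics and BI market, 2018

# Absolute gain in share, in percentage points
point_gain = (share_2018 - share_2017) * 100

# Relative gain in share, as a fraction of the 2017 share
relative_gain = share_2018 / share_2017 - 1

# Implied growth of the total analytics and BI market (an estimate, not a Gartner figure)
total_growth = implied_total_growth(dsml_growth, share_2017, share_2018)

print(f"Share gain: {point_gain:.1f} percentage points ({relative_gain:.1%} relative)")
print(f"Implied total market growth: {total_growth:.1%}")
```

If the assumptions hold, the total market would have grown by roughly 11%, so the DSML segment clearly outpaced it. Treat this as an illustration of the arithmetic rather than a reported number.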

As Gartner points out, however, this particular market space represents quite a broad mix of vendors offering a granular range of capabilities, with solutions appropriate for most levels of maturity. Because most of these vendors continue to push out innovations and work very hard at differentiating their solutions, the market landscape is extremely fragmented and complex.

User mix: data science professionals

In a previous post, I explained how expert data scientists differ from citizen data scientists. I also claimed that DSML platforms are aimed at both user groups, supporting variously skilled data scientists in multiple tasks across the data and analytics pipeline. This claim is supported by Gartner, whose analysts note that “many vendors are now aiming for a sweet spot with their platforms to simultaneously appeal to and delight both expert data scientists and citizen data scientists.” More specifically, it seems that “vendors that previously only catered to expert data scientists are now adding augmented capabilities and improved interfaces to appeal to citizen data scientists.”

In fact, definitions and parameters of data science and data scientists continue to evolve, to the extent that Gartner sees data science professionals in general as the primary users of DSML platforms. Not surprisingly, therefore, Gartner also includes data engineers and machine learning (ML) specialists in its DSML user mix. “As participation from a supporting cast in the data science life cycle becomes more common,” the research company explains, “vendors are adding more capabilities designed for data engineers, developers and ML engineers.” It’s basically the old story of vendors wanting to expand the footprint and availability of their solutions to maximize customer return on their platform investments.

Coherent integration is key

According to Gartner’s definition, a DSML platform consists of a core product supported by a portfolio of coherently integrated products, components, libraries and frameworks. These can be proprietary, open source or developed by a partner. Coherent integration in Gartner’s book means that “the core product and its supporting portfolio provide a consistent ‘look and feel’ and create a user experience where all components are reasonably interoperable in support of an analytics pipeline.” Typical tasks across that analytics pipeline include, among others, data ingestion, data preparation, data exploration, model creation and training, model testing, and collaboration.

In its research, Gartner found that many organisations start their DSML initiatives using free or low-cost open-source and public cloud offerings. Having built up their knowledge that way and explored those offerings’ possibilities, they are then likely to adopt commercial software to tackle broader use cases and to operationalize the deployment and management of their DSML models. “While enterprise data science success with a purely open-source stack is possible,” Gartner concludes, “the vast majority of mature and impactful data science teams have invested in a commercial platform.”