From Tableau to Palantir: What is next for analytic and decision support software

Think of an ICE agent who comes to the office, presses a button, and sees the list of people who need to be arrested for deportation next. The list is updated daily as new information arrives from hundreds of data sources. What could be a better example of "actionable" insights?

It is a shame that we see a software system such as Palantir's applied in such agencies. But theirs is the state of the art in analytics software. Companies and organizations sit at various levels of analytic maturity and use various processes and tools in this $25 billion market. I am not going to give a comprehensive list of such tools and describe when and where each is applicable; instead I will try to draw a bigger picture of where the industry started and where it is going.

As far back as the Medici bank, organizations have kept track of their data in a database (paper or electronic). The analytic software category, however, started about 15 years ago when big-data warehousing software such as Hadoop was introduced. For the first time, increasingly large datasets became queryable. At the time, companies needed to invest large sums just to implement and maintain a data analytics infrastructure. Today, this problem is largely solved by infrastructure-as-a-service tools such as Snowflake and by the cloud-based warehousing offered by all major cloud providers (e.g., Redshift, BigQuery).

Without user-friendly data manipulation and visualization tools, extracting insights from the data warehouse is possible only for specialized data analysts. Many companies therefore add another set of tools for graphical querying and dashboarding. Here sits another large set of providers (e.g., Tableau, which was acquired for about $15 billion).

To date, however, the problems remain: many organizations are not data driven and sit at the lower end of analytic maturity. The problems are twofold:

(i) Data Integration

The integration of heterogeneous data is still a major problem. Despite all the infrastructure tools, gathering all data into a single store that is cross-queryable (e.g., via SQL joins) is not easy. No tool can automatically clean and organize your data. The amount of manual, specialized work needed to bring in and maintain the data deprives many organizations of the ability to perform analytic tasks. A company like Palantir solves this by bundling a consulting contract with custom software development and licensing.

The need for this heavy manual work upfront makes it hard to offer fully SaaS software to many organizations. No one's data sits in clean, tidy datasets ready to be ingested and queried. It is not surprising that the fastest-growing category in this space is data wrangling tools.

In the future, I expect AI-based approaches to be used more for data ingestion, maintenance, and cleaning. Imagine a tool that can be pointed at various internal datasets and automatically find joinable columns, duplicates, and similar fields, then store everything in a warehouse. You could then use Tableau on top of this easily and with confidence.
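To make the idea of automatically finding joinable columns concrete, here is a toy sketch (my own illustration, not any vendor's algorithm): score every pair of columns from two datasets by the overlap of their values, and propose the high-overlap pairs as join keys. The table names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of column values."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def find_joinable_columns(table_a, table_b, threshold=0.5):
    """Return (col_a, col_b, score) pairs whose value overlap suggests a join key.

    Tables are dicts mapping column name -> list of values, a toy
    stand-in for real dataframes or warehouse tables.
    """
    candidates = []
    for col_a, vals_a in table_a.items():
        for col_b, vals_b in table_b.items():
            score = jaccard(vals_a, vals_b)
            if score >= threshold:
                candidates.append((col_a, col_b, score))
    return sorted(candidates, key=lambda c: -c[2])

# Hypothetical internal datasets with differently named customer keys.
orders = {"cust_id": [1, 2, 3, 4], "amount": [10, 20, 15, 40]}
crm = {"customer_id": [2, 3, 4, 5], "region": ["N", "S", "E", "W"]}
print(find_joinable_columns(orders, crm))  # flags cust_id <-> customer_id
```

A production version would of course also use column names, data types, and statistical profiles, but value overlap alone already recovers the key pair here.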

(ii) Actionable Insights

Easily obtaining actionable insights, or prescriptive analytics, still remains a distant dream. Currently, the most sophisticated action scenario available in most tools is rule-based: you set certain rules, say on a dashboard, so that when the data satisfies a rule, a notification to act fires. For example, when inventory falls below a threshold, we need to re-order.
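The rule-then-notify pattern described above can be sketched in a few lines; the rule and thresholds here are hypothetical examples, not any particular product's API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # predicate over a data snapshot
    action: str                        # the notification to emit

def evaluate_rules(snapshot, rules):
    """Return the action for every rule the current data satisfies."""
    return [r.action for r in rules if r.condition(snapshot)]

# Hypothetical rule: re-order when inventory drops below a threshold.
rules = [Rule("low-inventory",
              lambda s: s["inventory"] < s["reorder_point"],
              "Re-order stock")]

print(evaluate_rules({"inventory": 12, "reorder_point": 50}, rules))  # fires
print(evaluate_rules({"inventory": 80, "reorder_point": 50}, rules))  # silent
```

This is exactly the ceiling the paragraph describes: the system can only react to conditions a human anticipated and encoded in advance.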

More advanced data science techniques such as anomaly detection, churn prediction, recommender systems, etc. have been used in the most mature organizations as an additional layer on top of the data. In all these cases, data science teams build prediction machines so that a decision can be made given the information the organization has.
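As one minimal instance of such a prediction layer, here is a toy z-score anomaly detector over a metric stream (the order counts are made up for illustration; real deployments use far richer models).

```python
import statistics

def zscore_anomalies(series, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)
    return [x for x in series if abs(x - mean) / sd > threshold]

# Hypothetical daily order counts with one obvious spike.
daily_orders = [100, 98, 103, 101, 99, 102, 400, 97, 100, 101]
print(zscore_anomalies(daily_orders, threshold=2.0))  # flags the spike
```

Note that even this "advanced" layer only surfaces what happened; deciding what to do about the spike is still left to a human.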

In many situations an actionable decision is contingent not simply on the organization's established data in the warehouse but on external data and exogenous variables in the environment. In addition, a decision usually requires exploring a space of possible actions to find an optimal point. To be reliable, this search needs to be assisted by simulation and experimentation.

I expect another set of tools to emerge that connects today's prediction machines to a digital twin of the environment, so that a decision maker has a chance to explore possible outcomes. Hypotheses can then be applied to the actual environment for further experimentation. We are still at the infancy of this space. Although simulation of the real world, as a concept, has been around for a long time, there are game-changing shifts: first, big-data and compute tools are now cheap and accessible enough to capture the details necessary for realistic outcomes; second, complex ML algorithms let us treat areas that cannot be modelled explicitly as black boxes learned from actual observations.
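The idea of searching an action space against a simulated environment can be illustrated with a deliberately crude Monte Carlo sketch: a toy inventory "twin" with random demand, over which we evaluate candidate re-order policies. All the costs, demand range, and candidate policies are assumptions for illustration only.

```python
import random

def simulate_policy(reorder_point, order_qty, days=365, seed=0):
    """Toy digital twin: simulated profit of a re-order policy under random demand."""
    rng = random.Random(seed)           # fixed seed keeps runs reproducible
    inventory, profit = order_qty, 0.0
    for _ in range(days):
        demand = rng.randint(0, 20)     # exogenous, uncertain daily demand
        sold = min(demand, inventory)
        inventory -= sold
        profit += sold * 5.0            # assumed margin per unit
        profit -= inventory * 0.1       # assumed daily holding cost
        if inventory < reorder_point:
            inventory += order_qty
            profit -= 50.0              # assumed fixed cost per order
    return profit

# Explore the action space: which (reorder_point, order_qty) looks best?
candidates = [(rp, q) for rp in (10, 30, 50) for q in (50, 100, 200)]
best = max(candidates, key=lambda a: simulate_policy(*a))
print("best policy:", best)
```

The point is not this particular model but the shape of the workflow: the simulator stands in for the environment, and the decision becomes a search over actions rather than a rule someone wrote by hand.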

The Path Forward

Two dimensions of progress can be considered along the above challenges, shown in the schematic map below. To deliver more value, an insight needs to be actionable, and that requires more data integration and stronger prediction, simulation, and hypothesis testing.

Analytics software landscape map
The analytics software landscape: from data to actionable insights

To conclude, it appears that the worlds of data warehousing, data science, and decision science will converge. The winning platform will be the one with powerful integration tools and a combination of prediction and simulation machines that help its customers arrive at optimal, actionable decisions.