Considering AI? Understand what data you need at the outset

Joris Stork, Senior Data Scientist, DataProphet

At the core of today’s state-of-the-art Artificial Intelligence (AI) algorithms is the ability to learn patterns from a sample of data. In a manufacturing context, one such pattern might be the way in which a set of factory process parameters recorded in that data vary together. When considering AI, it is important to understand its data requirements at the outset.

The general answer to what constitutes the “right” data for AI-enabled process optimisation is the set of data sufficient to describe how changes to a process’s parameters affect quality. The bulk of process data can generally be represented as a table, or a collection of tables, comprising columns (parameters) and rows (production examples, for instance one production batch per row). To be meaningful as a representation of a process, or more specifically of the history of a process, these tables need to be accompanied by some explanatory information.
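As a rough illustration of the tabular layout described above, the following Python sketch holds a tiny process table in which each column is a parameter and each row is one production batch. The column names and values are hypothetical and chosen only for illustration; they do not come from any real plant.

```python
# Minimal sketch of a process data table: columns are parameters,
# rows are production examples (here, one batch per row).
# All names and values below are hypothetical, for illustration only.

COLUMNS = ["batch_id", "furnace_temp_c", "pour_time_s", "defect_rate"]

ROWS = [
    ("B001", 1450.0, 12.5, 0.031),
    ("B002", 1462.0, 11.8, 0.024),
    ("B003", 1438.0, 13.1, 0.040),
]


def column(name):
    """Return the values of one parameter (one column) across all batches."""
    idx = COLUMNS.index(name)
    return [row[idx] for row in ROWS]


def as_records():
    """Return the table as one dictionary per batch (one per row)."""
    return [dict(zip(COLUMNS, row)) for row in ROWS]
```

In this sketch `defect_rate` stands in for a quality measurement, while the remaining numeric columns stand in for process parameters whose effect on quality would be of interest.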

The key pieces of explanatory information, required by the data science team, are:

  • A high-level description of the physical process;
  • A description of the flow of production through the process (normally in the form of a process flow diagram), including in some contexts the time offsets between process steps;
  • A description of how the data table(s) relate to the process.
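
One possible way to record the explanatory information listed above alongside the data is sketched below in Python. The process, step names, time offsets and column names are all hypothetical assumptions made for illustration, and a real plant would draw them from its own documentation.

```python
from dataclasses import dataclass, field

# Hypothetical high-level description of the physical process.
PROCESS_DESCRIPTION = "Illustrative casting line: melting, pouring, inspection."


@dataclass
class ProcessStep:
    name: str
    offset_min: float  # assumed minutes elapsed since the first step
    columns: list = field(default_factory=list)  # table columns measured here


# Hypothetical process flow, in production order, with the data
# columns that each step contributes to the table.
STEPS = [
    ProcessStep("melting", 0.0, ["furnace_temp_c"]),
    ProcessStep("pouring", 45.0, ["pour_time_s"]),
    ProcessStep("inspection", 240.0, ["defect_rate"]),
]


def step_for_column(column_name):
    """Look up which process step a given data column belongs to."""
    for step in STEPS:
        if column_name in step.columns:
            return step
    raise KeyError(column_name)


def lag_between(col_a, col_b):
    """Time offset in minutes from the step of col_a to the step of col_b."""
    return step_for_column(col_b).offset_min - step_for_column(col_a).offset_min
```

For example, `lag_between("furnace_temp_c", "defect_rate")` gives the assumed delay between a melting measurement and the inspection result for the same production, which is the kind of offset needed to line up rows recorded at different steps.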

Many of these descriptions can be obtained from a manufacturing plant’s technical documentation. Due to the nature of AI-enabled parameter optimisation, there are some clear fundamentals that the bulk of the data needs to satisfy. This paper outlines these fundamentals in terms of data columns as well as row-wise requirements.

To learn more about the data required when considering AI, read the detailed paper here.