An AI algorithm works by learning a simplified description of the world from the data we show it. We call this simplified description a model. The state of the art in AI today doesn't know how to learn models of the whole world, which is why having the right data is so important: it needs to be a window onto the particular thing we want the AI algorithm to learn about.
If you're looking to optimize your manufacturing process with AI, you need historical data that represents changes in the process that could affect the outcome. That data will generally be a set of process parameters you can control, plus other variables that you may measure but can't control. That's one piece of the picture. The other is historical data that tells the algorithm about the things you're optimizing, such as product quality and yield.
If you run a manufacturing process, you're probably already collecting process and quality measurements. Your historical data could live in physical logbooks, in spreadsheets, or in a database. You probably have high-quality automated process measurements in your factory PLCs, but we've also been successful in training our AI from handwritten process logs.
How much data you need depends a lot on the process. At a high level, the algorithm should typically see thousands of examples of the production cycle. The data also needs to be representative: if you have 10 production lines running in parallel, you probably need to collect data from all 10; otherwise, your model isn't going to generalize across them. At a lower level, the nature of the process dictates the volume of data needed. For example, some die-casting processes need, say, 100 measurements per second to properly track what's happening.
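The representativeness check described above can be sketched in a few lines. This is a hypothetical illustration (the `line_id` field, the line names, and the `MIN_CYCLES` threshold are assumptions, not part of any specific DataProphet pipeline): count how many production cycles each line contributes and flag lines that are missing or under-sampled.

```python
from collections import Counter

# Assumed rule of thumb: thousands of examples per line.
MIN_CYCLES = 1000
# Assumed plant layout: 10 parallel production lines.
ALL_LINES = {f"line_{i}" for i in range(1, 11)}

def coverage_report(records):
    """records: one dict per production cycle, with a 'line_id' key.
    Returns (lines with no data at all, lines with too few cycles)."""
    counts = Counter(r["line_id"] for r in records)
    missing = ALL_LINES - counts.keys()
    under = {line: n for line, n in counts.items() if n < MIN_CYCLES}
    return missing, under

# Toy dataset: line_1 is under-sampled, line_10 is absent entirely.
records = [{"line_id": "line_1"}] * 200 + [
    {"line_id": f"line_{i}"} for i in range(2, 10) for _ in range(1500)
]
missing, under = coverage_report(records)
```

A model trained on this toy dataset would likely generalize poorly to `line_1` and not at all to `line_10`, which is exactly what the report surfaces.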
Most challenges are easy enough to overcome, whether that's collecting and aggregating data from factory process controllers or provisioning enough bandwidth on the factory network. Some are more difficult. For example, it can be hard to work with data that has been stored in a heavily pre-aggregated form, since that introduces a lot of noise between the reality of the process and what the AI model sees.
If I had to limit myself to three recommendations for preparing manufacturing data for AI, the first would be to store raw data as close to the actual measurements as possible, rather than storing aggregated or interpreted views of it.
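A tiny sketch of why aggregation destroys information. The numbers here are invented (imagine raw pressure samples from one casting cycle): a brief spike that matters to the process is obvious in the raw samples but vanishes almost entirely in the stored mean.

```python
# Hypothetical raw sensor samples from one production cycle,
# containing a single brief spike at 180.
raw = [100, 101, 99, 100, 180, 100, 101, 99, 100, 100]

mean = sum(raw) / len(raw)   # what a pre-aggregated store would keep
peak = max(raw)              # visible only if raw samples were stored

# The spike is dramatic in the raw data...
assert peak - min(raw) > 50
# ...but the aggregate looks almost normal, so a model trained on
# aggregates can never learn what the spike does to quality.
assert abs(mean - 108) < 1
```

The same logic applies to any lossy summary (means, per-shift totals, pass/fail flags): whatever the summary throws away, the model can never recover.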
The second would be to leverage your existing data, whether it's handwritten Industry 1.0 logs or Industry 3.0 PLC data, by starting your AI project with what you already have rather than first installing new Internet of Things sensors.
The third recommendation would be to adopt an append-only approach to data storage: your historical data is your source of truth, so you should only ever add to it, never change it.
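The append-only idea can be made concrete with a minimal sketch. This is an illustrative toy, not any specific DataProphet tooling: records are only ever appended as JSON lines, and a correction is a new row rather than an edit to an old one, so the historical record is never rewritten.

```python
import json
import os
import tempfile

class AppendOnlyLog:
    """Toy append-only store: one JSON record per line, writes only append."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Opening in 'a' mode means we can only add to the end of the file.
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")

    def read_all(self):
        with open(self.path) as f:
            return [json.loads(line) for line in f]

path = os.path.join(tempfile.mkdtemp(), "process.log")
log = AppendOnlyLog(path)
log.append({"cycle": 1, "temp_c": 212.4})
log.append({"cycle": 1, "temp_c": 212.9})  # a correction is a new row...
records = log.read_all()                    # ...so the history stays intact
```

Because nothing is ever overwritten, you can always reconstruct exactly what the process looked like at training time, which matters when auditing or retraining a model.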
Download our white paper to learn more about the data requirements for AI in manufacturing.