Faster Batch Reports from Multiple Data Sources

by Steve Comeau on Apr 29, 2021 10:30:00 AM

Faster batch reports

No one wants to wait minutes for a control system batch report to run. When reports need to run fast, design choices must be used that focus on the needs of the end user and consumers of the data. The standard for report design is to base the report data on “a single version of the truth”. This approach generally means getting the raw data for the reports as close to the source as possible. But in some cases this approach does not work well for the end user.

As the report data expectations increase in high-production environments, it becomes difficult to design batch reports which return results to users quickly. The time needed to extract data from multiple data sources such as several SQL databases and many Historian tags, can become unacceptable. This is even more of a challenge in when users ask for results from thousands batches occurring over many months and with data merged from multiple data sources.

To solve this problem the “a single version of the truth” dogma needs to be challenged. A workable solution is to pre-process and store the data in a single wide-table. This results in a copy of the data records from the various sources being stored as a single record of results for each batch. For batches with multiple values per batch, the multi-record related data is stored as a separate table from the batch record. The objective is to store the data in a form that is optimized for reporting and not for the transactional database. The pre-processing can be a continuous automated process that can occur frequently, such as once per hour. When properly implemented, the result is fast and accurate reports for your end users; therefore less frustration, delay, and wasted time.

This solution will work, but there are some caveats. First, the measured values must be values that cannot be changed after first logged. This is normally the case with historical data, so this is not a problem. There can be a potential problem with attributes that give context for the measured values. An issue can arise where an attribute that describes the batching equipment, the location, or the product, changes at a point in time. In most cases, this would not be a problem since the new attribute value would correctly describe the attribute and the change would be legitimate. But if the change in value is intended to correct an error or is a new way to describe the same value, then records from the past will need to be either reprocessed or updated. Change needs to be considered and a plan should be in place that is intended to maintain data quality and possible changes to attribute values. But this should not be a deterrent to building solutions that provide fast and accurate batch data for the end users.

simplified data.jpg

An example of this approach is to combine manufacturing batch records with plant equipment metrics that are logged to a data historian during the batch. A batch record will have a Batch Id, Equipment Id, Product Id, Recipe Id, Start Time, End Time, and other columns related to the batch. This information comes from an SQL relational database. A complex query may be needed to obtain the batch records from many tables. During the batch, there may be many measured values such as flows, volumes and temperatures. These measured values can be merged with the batch record either as summary data, such as minimum, maximum, and average; as sample value at various points in time; or as a series of values for each batch. Since extracting the data from the historian is a slow computing process, this is the primary speed bottleneck. During the pre-processing, the historian data is extracted and stored in the SQL database with the batch record.

The end result of pre-processing batch data from multiple data sources is that the data is now available in a greatly simplified form. The batch records can then be accessed by reports or user queries simply and fast without a need for knowledge of the underlying complexity of the source databases or tagnames.

Hallam-ICS can help you with the installation and configuration of reporting and data analytics solutions. Please contact us for more information.

If you would like more information about Data Historians you can contact Steve directly via email scomeau@Hallam-ICS.com or through our Help Desk.

About the Author

Steve has retired from Hallam-ICS, but his contributions to the company continue to be valued.

Read My Hallam Story

About Hallam-ICS

Hallam-ICS is an engineering and automation company that designs MEP systems for facilities and plants, engineers control and automation solutions, and ensures safety and regulatory compliance through arc flash studies, commissioning, and validation. Our offices are located in Massachusetts, Connecticut, New York, Vermont and North Carolina and our projects take us world-wide.