A few weeks ago we posted about the concept of “local data” that is often used in our analysis algorithms. This is data already present somewhere at the client that is very valuable in the analysis process. It is either used as a means of classifying the different data streams (what is the installation doing?) or as actual data that is processed and transformed into parameters indicating a state of health or a warning of upcoming failures. To get this data flowing into our algorithms, we need to get it out of its source. This is always done using a specific “protocol”: a language and a way of transferring the data so that it happens in a structured and reliable manner.
The protocols we encounter most often are: FTP, OPC UA, OPC DA, Modbus TCP, Modbus RTU, Profibus, Profinet, BACnet, EtherCAT, MQTT and API. The goal of this post is not to explain all of these in detail (we will do that later on), but to give you an introduction and explain what they’re used for.
A first observation from our side is that different companies are structured…differently. In some cases the department responsible for setting up the connections with the data sources is referred to as the OT department. Other clients bring us in touch with IT, while others still refer us to the Automation department.
All of these protocols have advantages and disadvantages, but one is seldom free to choose. Most often, historical choices related to the PLC architecture determine what is possible, or the choice is imposed by the brand of the historian or data source. Even the version of the database has an impact on what is possible: older database versions have no way of interacting through an API, while some of the newer versions are API-only.
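To give a feel for what the API route can look like, here is a minimal sketch that pulls a batch of measurements from a hypothetical REST endpoint of a historian. The URL, tag name, query parameters and token are illustrative assumptions, not the interface of any specific product.

```python
import requests

# Hypothetical REST endpoint of a historian exposing time-series data.
# URL, tag name and token are placeholders for illustration only.
BASE_URL = "https://historian.example.com/api/v1/measurements"
TOKEN = "replace-with-real-token"

def fetch_measurements(tag, start, end):
    """Fetch raw samples for one tag between two ISO timestamps."""
    response = requests.get(
        BASE_URL,
        params={"tag": tag, "from": start, "to": end},
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()   # fail loudly on HTTP errors
    return response.json()        # e.g. a list of {"t": ..., "v": ...}

if __name__ == "__main__":
    samples = fetch_measurements(
        "pump_01.vibration", "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z"
    )
    print(f"received {len(samples)} samples")
```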
An advantage of FTP, for example, is that it is known by many people, but on the flip side it comes with serious security concerns. These can be addressed by using SFTP, but the downside remains that there is no way to ensure the data is structured in the correct way. If the file contents are altered at the source side, this can quite easily cause the automated data transfer and ingestion process to stop. Another disadvantage is that this protocol relies on files: on the sender side, files often need to be generated from a database, and on the receiver side, file manipulations are needed to get the data back into a database structure.
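To make the “no guaranteed structure” point concrete, here is a rough sketch of what an ingestion step often ends up doing: fetch a CSV export and check its header before loading it. For simplicity it uses FTP over TLS (FTPS) from the Python standard library rather than SFTP, which would need an SSH library such as paramiko; the host, credentials, file name and expected columns are all made-up examples.

```python
import csv
import io
from ftplib import FTP_TLS  # FTP over TLS (FTPS); plain FTP would use ftplib.FTP

# Illustrative connection details and file layout -- not a real installation.
HOST, USER, PASSWORD = "ftp.example.com", "datauser", "secret"
REMOTE_FILE = "daily_export.csv"
EXPECTED_COLUMNS = ["timestamp", "tag", "value"]

def fetch_and_validate():
    """Download one export file and verify that its header matches what the pipeline expects."""
    ftps = FTP_TLS(HOST)
    ftps.login(USER, PASSWORD)
    ftps.prot_p()  # also encrypt the data channel

    buffer = io.BytesIO()
    ftps.retrbinary(f"RETR {REMOTE_FILE}", buffer.write)
    ftps.quit()

    rows = list(csv.reader(io.StringIO(buffer.getvalue().decode("utf-8"))))
    header = rows[0] if rows else []
    if header != EXPECTED_COLUMNS:
        # This is the failure mode described above: a change at the source
        # silently breaks every downstream step, so fail explicitly here.
        raise ValueError(f"unexpected header {header}, expected {EXPECTED_COLUMNS}")
    return rows[1:]
```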
Both OPC (Open Platform Communications) options, OPC UA and OPC DA, stem from standardization efforts that allow for fast and efficient data transfer between different sources. As the name states, the goal is to be Open. In practice, however, it is sometimes difficult to configure, and different vendors include different flavors of the OPC protocol, which adds to the complexity. Once set up, however, the transfer is quite smooth and easy to scale up. The DA flavor (Data Access) is the older version, born around 1995; the UA flavor (Unified Architecture) is a more recent approach that offers more flexibility in data types and is designed to cope with situations that are not yet known today: it allows for evolution.
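As a rough illustration of what reading a value over OPC UA can look like, the sketch below uses the open-source python-opcua client. The endpoint URL and the node identifier are made-up examples; in a real plant you would obtain them from the server’s address space or its documentation.

```python
from opcua import Client  # python-opcua (FreeOpcUa) client library

# Illustrative endpoint and node id -- these depend entirely on the server.
ENDPOINT = "opc.tcp://192.168.1.10:4840"
NODE_ID = "ns=2;s=Line1.Pump01.BearingTemperature"

client = Client(ENDPOINT)
try:
    client.connect()
    node = client.get_node(NODE_ID)   # resolve the node in the address space
    value = node.get_value()          # one synchronous read of the current value
    print(f"{NODE_ID} = {value}")
finally:
    client.disconnect()
```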
We’ll dig a bit deeper into all of these protocols at a later moment, but for now it should be clear that different options for data transfer exist, and that in real-life situations the choice will only partially depend on what is optimal: it will be governed mostly by what is possible given the IT/OT infrastructure and hardware available in your plant.