Size of tables in PostgreSQL vs Apache Cassandra…

PostgreSQL table with ts+key as primary key:  ~43GB

PostgreSQL wide column table with ts as primary key : 247GB
Cassandra wide column table with ts as primary  key: 4.5GB

Strange that in PostgreSQL a table with much less rows (but much more columns) needs a lot of more space (both tables store the same amount of data). )

It seems that the Apache Cassandra Column Store can compress the columns pretty good – factor 10 less disk space!

The source table in PostgreSQL (TimescaleDB) with a timestamp and a key column and 8 data columns had about 170 Mio rows.

CREATE TABLE candles
(
    instrument character varying(10) NOT NULL,
    ts timestamp(3) without time zone NOT NULL,
    o numeric,
    h numeric,
    l numeric,
    c numeric,
    primary key (instrument, ts)
)

I needed to flatten the table so that i have just the timestamp as primary key and many columns and each column is of a type. It ends up in a table with about 1.6 Mio rows and many columns.

CREATE TYPE price AS (
    o       float,
    c       float,
    h       float,
    l       float,
    volume  float
);

CREATE TABLE candles_wide
(
   ts timestamp(3) without time zone NOT NULL,
   AU200_AUD price,
   AUD_CAD price,
   AUD_CHF price,
   AUD_HKD price,
   AUD_JPY price,
   AUD_NZD price,
   ... 124 columns

Apache Cassandra wide column store table with ts as primary key and many columns.

CREATE TABLE candles (ts timestamp,
   AU200_AUD tuple<float,float,float,float,float>,    
   AUD_CAD tuple<float,float,float,float,float>,  
   AUD_CHF tuple<float,float,float,float,float>,  
   ... 124 tuples

 

 

Python and WinCC OA…

Connected Python to WinCC OA through a Websocket Manager. Python programs can connect to WinCC OA and read/write datapoints. Communication is JSON based, it’s simple to use in Python, see examples below (ws://rocworks.no-ip.org can be used for tests, but will not be available all the time).

https://github.com/vogler75/oa4j-wss

  1. dpGet
  2. dpSet
  3. dpConnect
  4. dpQueryConnect
  5. dpGetPeriod
  6. … more functions will be implemented

Required Python modules:

  • pip3 install websocket-client
  • pip3 install matplotlib

############################################################
# Open Connection
############################################################
import json
import ssl
from websocket import create_connection
url='ws://rocworks.no-ip.org/winccoa?username=demo&password=demo'
ws = create_connection(url, sslopt={"cert_reqs": ssl.CERT_NONE})

############################################################
# dpGetPeriod
############################################################
cmd={'DpGetPeriod': {
 'Dps':['ExampleDP_Trend1.'],
 'T1': '2018-02-07T18:10:00.000', 
 'T2': '2018-02-07T23:59:59.999',
 'Count': 0, # Optional (Default=0)
 'Ts': 0 # Optional (0...no ts in result, 1...ts as ms since epoch, 2...ts as ISO8601)
 }}
ws.send(json.dumps(cmd))
res=json.loads(ws.recv())
#print(res)
if "System1:ExampleDP_Trend1.:_offline.._value" in res["DpGetPeriodResult"]["Values"]:
 values=res["DpGetPeriodResult"]["Values"]["System1:ExampleDP_Trend1.:_offline.._value"]
 print(values)
else:
 print("no data found")

# Plot result of dpGetPeriod
%matplotlib inline 
import matplotlib.pyplot as plt
plt.plot(values)
plt.ylabel('ExampleDP_Trend1.')
plt.show()

############################################################
# dpGet
############################################################
cmd={'DpGet': {'Dps':['ExampleDP_Trend1.', 'ExampleDP_Trend2.']}}
ws.send(json.dumps(cmd))
res=json.loads(ws.recv())
print(json.dumps(res, indent=4, sort_keys=True))

############################################################
# dpSet
############################################################
from random import randint
cmd={'DpSet': {'Wait': True, 
 'Values':[{'Dp':'ExampleDP_Trend1.','Value': randint(0, 9)}, 
 {'Dp':'ExampleDP_Trend2.','Value': randint(0, 9)}]}}
ws.send(json.dumps(cmd))
res=json.loads(ws.recv())
print(json.dumps(res, indent=4, sort_keys=True))

############################################################
# dpConnect
############################################################
from threading import Thread

def read():
    while True:
        res=json.loads(ws.recv())
        print(res)
Thread(target=read).start()
    
cmd={"DpConnect": {"Id": 1, "Dps": ["ExampleDP_Trend1."]}}
ws.send(json.dumps(cmd))



Clojure connected to WinCC OA…

Because Clojure is a JVM language, oa4j can be used to connect to WinCC Open Architecture.

(def manager (new JManager))

(defn callback [values]
  (let [v (reduce #(+ %1 %2) (map #(.getValueObject %) values))];
    (dpSet :ExampleDP_Trend1. v)))

(defn -main [& args]
  (.init manager (into-array args))
  (.start manager)
  (dpSet {:ExampleDP_Arg1. 2.0 :ExampleDP_Arg2. 3.0})
  (println (clojure.string/join "," 
    (dpGet [:ExampleDP_Arg1. :ExampleDP_Arg2.])))
  (let [c (dpConnect [:ExampleDP_Arg1. :ExampleDP_Arg2.] callback)]
    (Thread/sleep 180000)
    (.disconnect c))
  (.stop manager))

Full example can be found here.

Matrix Multiplication in Clojure…

Clojure code for matrix multiplication (source: Rosetta Code) … short & smart…

(def mmul (fn [a b]
  (let [nested-for (fn [f x y] (map (fn [a] (map (fn [b] (f a b)) y)) x))
        transpose (fn [m] (apply map vector m))]
    (nested-for (fn [x y] (reduce + (map * x y))) a (transpose b)))))

(def ma [[1 2 3 4]
         [4 3 2 1]])

(def mb [[1]
         [2]
         [3]
         [4]])

(defn -main [& args]
  (println (mmul ma mb)))

Streaming SQL for Apache Kafka & WinCC OA with Docker…

KSQL makes it easy to read, write, and process streaming data in real-time, at scale, using SQL-like semantics. It offers an easy way to express stream processing transformations as an alternative to writing an application in a programming language such as Java or Python. https://www.confluent.io/product/ksql/

With WinCC OA Java (https://github.com/vogler75/oa4j) we can stream data from WinCC OA to Apache Kafka, use KSQL to produce some insights and send it back to WinCC OA by using a WinCC OA Driver written in Java connected to Kafka.

Attached you will find a docker-compose.yml to setup KSQL + WinCC OA Connector and Driver to test it. Just use “docker-compose up -d” to start up everything. Before you should set the “data” and “event” environment variables in the docker-compose.yml to point to a running WinCC OA project.

root@docker1:~/docker/builds/winccoa# docker-compose up -d

Creating winccoa_frontend_1 ==> collect data from OA and publish it by ZeroMQ

Creating winccoa_backend-kafka_1 ==> get the data from the Frontend and write it to Kafka

Creating winccoa_driver-kafka_1 ==> OA driver to read data from kafka.

Creating winccoa_zookeeper_1
Creating winccoa_kafka_1
Creating winccoa_schema-registry_1
Creating winccoa_ksql-cli_1

We use Docker to startup WinCCOA Mangers (frontend, backend) and Drivers.

Afterwards you can start KSQL: docker-compose exec ksql-cli ksql-cli local –bootstrap-server kafka:29092

Create a stream of the topic which is sent from WinCC OA to kafka (currently every change of value in WinCC OA is sent to Kafka):

CREATE STREAM Scada_FloatVar (TimeMS BIGINT, Status BIGINT, Value DOUBLE) WITH (kafka_topic=’Scada_FloatVar’, value_format=’JSON’);

Create a result table in KSQL which will be read by the WinCC OA Driver, here we detect if a datapoint changes more often than 5 times in 10 seconds. Just a simple example to show how KSQL can be used:

CREATE TABLE result WITH (PARTITIONS=1) AS SELECT rowkey AS “Name”, count(*) AS “Value” FROM Scada_FloatVar WINDOW TUMBLING (size 10 second) GROUP BY rowkey HAVING count(*) > 5;

In WinCC OA you should put a peripheral address on a datapoint with the example driver (num 4) to get the result back (you will need the panels and scripts from here https://github.com/vogler75/oa4j/tree/master/Project/Drivers/Kafka to use the driver).

docker-compose.yml

Siemens IoT2000 and WinCC OA….

Got an IoT2040 from Siemens. It is Arduino compatible and runs on Yocto Linux. ETM managed to compile the SCADA System WinCC Open Architecture on it, additionally we used a MQTT server (mosquitto), Node-Red with MQTT, and a MQTT Driver for WinCC OA written in Java… runs well. Just connected a simple LED and a Button with Node-Red, those can now be controlled by WinCC OA. Because WinCC OA runs on it, it can be used to connect a wide range of different devices (a lot of drivers are available for WinCC OA).

Perfect as an Edge device to collect data from the field… Industrial IoT with Siemens!