PostgreSQL table with ts+key as primary key: ~43GB
PostgreSQL wide column table with ts as primary key : 247GB
Cassandra wide column table with ts as primary key: 4.5GB
Strange that in PostgreSQL a table with much less rows (but much more columns) needs a lot of more space (both tables store the same amount of data). )
It seems that the Apache Cassandra Column Store can compress the columns pretty good – factor 10 less disk space!
The source table in PostgreSQL (TimescaleDB) with a timestamp and a key column and 8 data columns had about 170 Mio rows.
CREATE TABLE candles
(
instrument character varying(10) NOT NULL,
ts timestamp(3) without time zone NOT NULL,
o numeric,
h numeric,
l numeric,
c numeric,
primary key (instrument, ts)
)
I needed to flatten the table so that i have just the timestamp as primary key and many columns and each column is of a type. It ends up in a table with about 1.6 Mio rows and many columns.
CREATE TYPE price AS (
o float,
c float,
h float,
l float,
volume float
);
CREATE TABLE candles_wide
(
ts timestamp(3) without time zone NOT NULL,
AU200_AUD price,
AUD_CAD price,
AUD_CHF price,
AUD_HKD price,
AUD_JPY price,
AUD_NZD price,
... 124 columns
Apache Cassandra wide column store table with ts as primary key and many columns.
CREATE TABLE candles (ts timestamp,
AU200_AUD tuple<float,float,float,float,float>,
AUD_CAD tuple<float,float,float,float,float>,
AUD_CHF tuple<float,float,float,float,float>,
... 124 tuples