Functional Data Engineering — a modern paradigm for batch data processing

Reproducibility

Pure tasks

Table partitions as immutable objects

[Figure: A simplified DAG of partitions]

A persistent and immutable staging area

Changing logic over time

But what about dimensions?

Unit of work

A DAG of partitions

Past dependencies

[Figure: A simplified DAG of partitions with a past dependency]

Late arriving facts

A standard deviation

In retrospect

Maxime Beauchemin

Founder and CEO at Preset, creator of Apache Superset and Apache Airflow