The example you gave touches on two distinct practices.
- We should not do heavy processing inside the Airflow executor. The executor should simply trigger the job on a dedicated system. Since you mentioned copying data from MySQL to an object store, that kind of task can be handed off to a Sqoop service running on a cluster.
- What I meant is this: suppose we have two tasks, and the second task needs data produced by the first. In that case we should not pass the data itself between them; instead, we store the intermediate data outside Airflow and pass only a reference to it. XComs are stored in the Airflow metadata DB, so passing data directly between tasks via XComs will overload the metadata DB.
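
To illustrate the pass-a-reference pattern, here is a minimal plain-Python sketch (it does not import Airflow; a local temp directory stands in for the object store, and the returned path string stands in for what you would push through XCom). The function names and the JSON file layout are my own illustration, not anything from Airflow's API:

```python
import json
import tempfile
from pathlib import Path

# Stand-in for an object store (in production: S3 / GCS / HDFS).
STORE = Path(tempfile.mkdtemp())

def extract() -> str:
    """First task: write the (possibly large) dataset to external
    storage and return only a small reference string.

    In Airflow, only this short string would go into XCom,
    not the rows themselves."""
    rows = [{"id": i, "value": i * i} for i in range(1000)]
    path = STORE / "extract_output.json"
    path.write_text(json.dumps(rows))
    return str(path)

def transform(ref: str) -> int:
    """Second task: receive the reference via XCom, load the data
    from external storage, and process it there."""
    rows = json.loads(Path(ref).read_text())
    return sum(r["value"] for r in rows)

ref = extract()          # XCom carries just `ref`, a short string
total = transform(ref)   # the heavy payload never touches the metadata DB
print(total)
```

The metadata DB only ever sees the short path string, so it stays small no matter how large the intermediate dataset grows.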