Exit Oracle Data Integrator (or other ELT): migrate and permanently break down the complexity of flows.


ODI (Oracle Data Integrator) and other legacy ELT solutions automatically generate SQL queries that execute directly in the database. We're not telling you anything new 😉.


As the system evolves, the ELT engine generates ever more nested subqueries and joins. These transformations end up buried inside massive SQL blocks that teams barely understand, as in the sketch below.
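
Here is a hypothetical illustration of the kind of machine-generated SQL this produces (table and column names are invented for the example):

```sql
-- Hypothetical example of generated, deeply nested ELT SQL.
-- Each layer was added by a different job revision; nobody owns the whole.
SELECT c.customer_id, c.customer_name, t.total_amount
FROM (
    SELECT customer_id, SUM(amount) AS total_amount
    FROM (
        SELECT o.customer_id, o.amount
        FROM orders o
        JOIN (
            SELECT order_id FROM order_status WHERE status = 'VALID'
        ) s ON s.order_id = o.order_id
        WHERE o.order_date >= DATE '2024-01-01'
    ) filtered_orders
    GROUP BY customer_id
) t
JOIN customers c ON c.customer_id = t.customer_id;
```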

 

Furthermore, traditional ELTs are facing significant challenges from IT modernization projects, the cornerstone of which is often a "move to the cloud."
Cloud-native ELTs, which use the same logic as their predecessors, are rapidly expanding: Matillion, DBT, Dataflow (Google), and Glue (Amazon) are just a few examples. They are coupled with powerful cloud databases such as Redshift, BigQuery, and Snowflake. The combination is impressively efficient! 

 

In a migration context, the goal is to move transformations from the source to the target while implementing upstream mechanisms that create simplicity and observability.
Two approaches then compete for how the flows are executed in the target:

  • Continue to run the transformations in the database (ELT mode), to leverage the power of the target Data Warehouse.
  • Or offload job execution to containers (ETL mode), for maximum control.

 

We'll explain it to you.

 

Essentialize, migrate, simplify data pipelines

👉 Only essential transformations

 

  • Jobs and sub-jobs from ODI (or another legacy ELT) will be analyzed to identify sources, targets, filters, and the various transformation logics.
  • Intersecting actual information usage with technical data lineage will make it possible to eliminate unnecessary transformations upstream of the migration (sometimes more than 50% of the whole); see the sketch after this list.
  • Thus, only truly useful transformations will be carried over to the target.
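
As a minimal sketch of that intersection, assuming hypothetical `lineage` and `usage_stats` metadata tables (all names invented for illustration):

```sql
-- Hypothetical metadata: 'lineage' maps each target table to the job that
-- produces it; 'usage_stats' logs actual reads of each table.
-- Targets never read during the observation window are candidates to drop.
SELECT l.target_table,
       l.producing_job
FROM lineage l
LEFT JOIN usage_stats u
       ON u.table_name = l.target_table
      AND u.last_read_at >= CURRENT_DATE - INTERVAL '1' YEAR
WHERE u.table_name IS NULL;  -- no read in a year: likely unnecessary
```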

👉 Conversion to SQL

 

  • Once the key transformations are isolated, they will be converted into SQL queries, step by step, as in the sketch below.
  • Each query will correspond to a distinct phase of the source processing.
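
Continuing the invented example from above, the single nested query becomes a sequence of small, clearly scoped queries, one per phase:

```sql
-- Phase 1: keep only valid, recent orders.
CREATE TABLE stg_valid_orders AS
SELECT o.customer_id, o.amount
FROM orders o
JOIN order_status s ON s.order_id = o.order_id
WHERE s.status = 'VALID'
  AND o.order_date >= DATE '2024-01-01';

-- Phase 2: aggregate per customer.
CREATE TABLE stg_customer_totals AS
SELECT customer_id, SUM(amount) AS total_amount
FROM stg_valid_orders
GROUP BY customer_id;

-- Phase 3: enrich and load the target.
CREATE TABLE tgt_customer_totals AS
SELECT c.customer_id, c.customer_name, t.total_amount
FROM stg_customer_totals t
JOIN customers c ON c.customer_id = t.customer_id;
```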

👉 Break down the complexity

 

  • This step-by-step SQL approach will allow the processing to be structured into several levels, avoiding the complex monolithic queries found in the source.
  • The nested SQL will be reworked, which is crucial for the maintainability of the flows.
  • Materialized tables or volatile tables can also be used to store intermediate results, which gives a much better understanding of the data flow.
  • We will create explicit checkpoints at each SQL transformation step (something "legacy" ELTs such as ODI do not offer); see the sketch below.
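
A minimal sketch of such a checkpoint, again with invented names: after each intermediate table is built, its row count (or any other invariant) is recorded, making every step observable.

```sql
-- Hypothetical audit table: one row per pipeline step per run.
CREATE TABLE IF NOT EXISTS etl_checkpoints (
    run_id     VARCHAR(36),
    step_name  VARCHAR(100),
    row_count  BIGINT,
    checked_at TIMESTAMP
);

-- Recorded right after phase 1 above; repeat after each phase.
INSERT INTO etl_checkpoints
SELECT 'run-2024-06-01',   -- run identifier supplied by the orchestrator
       'stg_valid_orders',
       COUNT(*),
       CURRENT_TIMESTAMP
FROM stg_valid_orders;
```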
 

Option #1:

Execute the transformations in the target database, in ELT mode

👉 Adaptation to the target

 

  • SQL transformations will be directly integrated into Cloud-native ELT jobs (see the dbt-style sketch after this list).
  • The generated SQL can be adjusted to the target's syntax.
  • Optionally: add the jobs to the project configuration, define dependencies between jobs, etc.
  • Jobs will be tested to verify their behavior, and possibly the data as well, to guarantee its integrity.
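
As one possible incarnation, assuming dbt as the cloud-native ELT (file paths, sources, and model names are invented for the example), each phase above becomes a model, and ref() declares the dependencies so dbt orders the runs:

```sql
-- models/staging/stg_valid_orders.sql
-- Phase 1 as a dbt model, materialized in the target warehouse.
{{ config(materialized='table') }}

SELECT o.customer_id, o.amount
FROM {{ source('sales', 'orders') }} o
JOIN {{ source('sales', 'order_status') }} s ON s.order_id = o.order_id
WHERE s.status = 'VALID'

-- models/marts/customer_totals.sql
-- Phase 2; the ref() below makes the dependency explicit.
SELECT customer_id, SUM(amount) AS total_amount
FROM {{ ref('stg_valid_orders') }}
GROUP BY customer_id
```

The data tests mentioned above can then be expressed as dbt tests on these models.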

👉 Advantages

 

  • Latency reduction through transformations within the database.
  • Native exploitation of the computing capabilities of the target Data Warehouse.
 

Option #2:

Execute the transformations in containers, in ETL mode

👉 Containers

 

  • The transformations will be executed in an external container, detached from the target database and the target ELT. It serves as a pre-configured application gateway: simple to set up and easy to duplicate in order to parallelize certain pipelines if necessary.
  • Temporary containers will serve as intermediate storage areas where data is loaded before transformation (see the sketch after this list).
  • The volume of the container holding the buffer database will be externalized so that developers can access it.
  • The jobs will be tested to verify their functionality (and possibly the data, to guarantee its integrity).
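
A minimal sketch of the in-container flow, assuming a PostgreSQL buffer database inside the container (all names and paths invented): data lands in a staging table on the externalized volume, is transformed locally, and only the result is handed to the target's loader.

```sql
-- Runs inside the container's buffer database (hypothetical PostgreSQL).

-- 1. Load the raw extract into a staging table from the mounted volume.
CREATE TABLE buffer_orders (
    order_id    BIGINT,
    customer_id BIGINT,
    amount      NUMERIC(12,2),
    status      VARCHAR(20)
);
COPY buffer_orders FROM '/data/in/orders.csv' WITH (FORMAT csv, HEADER true);

-- 2. Transform locally, without consuming target-warehouse compute.
CREATE TABLE buffer_customer_totals AS
SELECT customer_id, SUM(amount) AS total_amount
FROM buffer_orders
WHERE status = 'VALID'
GROUP BY customer_id;

-- 3. Export the result for bulk-loading into the target database.
COPY buffer_customer_totals TO '/data/out/customer_totals.csv' WITH (FORMAT csv, HEADER true);
```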

👉 Advantages

 

  • Reduced load on the target database: jobs will be executed in parallel with a "multi-threaded" architecture.
  • Continuity of service ensured even in the event of intensive processing (heavy processing or batch operations).
  • Increased flexibility for workflow execution and orchestration.
 
