Switching from Teradata to BigQuery? Automate the process!

The Teradata release: a technical challenge

Teradata is an appliance chosen by countless organizations. Its specialization in data warehousing and analytics has made it possible to build solutions with exceptional computing power and high scalability. But most of Teradata's historical users are now moving to the cloud. One of our clients decided to move its Teradata assets to Google Cloud Platform (BigQuery) and, at the same time, to migrate a number of data visualization technologies (SAP BO, Power BI).
We are sharing the methodology implemented as part of this migration.
Some key indicators regarding the platform to be migrated
|
|
|
|
|
Data assets:
- 400,000 data "containers";
- 270,000 tables;
- 11,000 files.
|
|
|
|
|
Script assets:
- 122,000 views;
- 1,200 macros;
- 500 BTEQ injection processes;
- 600 BO universes;
- 100,000 web reports;
- 30,000 data manipulation processes distributed across 450 Stambia ETL projects.
|
|
|
|
|
Usage statistics*:
- 30% of the data used;
- 30% of tables/views/files used;
- 50% of the transformations used.
*usage that leads to a decision-making report or to application use
|
|
|
|
Processing:
- 1,500,000 requests per day;
- 880,000 insert/update/merge operations;
- ...
|
|
|
Three major challenges were identified to ensure the success of this migration:
- continuously mapping the existing state of the source platform, with all its dependencies;
- maintaining a permanent inventory of the migration's progress: what has been migrated and what remains to be migrated (see the sketch after this list);
- sharing the migration process with everyone, to avoid misunderstandings.
Automating these tasks was essential.
|
|
|
|
|
Step #1: Mastering the source platform by mapping it
|
|
|
{openAudit} made it possible to understand internal processes via physical data lineage, at field level, in BTEQ scripts, but also in the ETL/ELT flows, views, macros, and other scripts involved in feeding the data.
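As an illustration of the idea (this is not openAudit's engine), field-level lineage can be extracted by parsing the SQL of a view. The sketch below uses the open-source sqlglot parser on a hypothetical Teradata view.

```python
# Sketch: extract field-level lineage from a Teradata view definition
# with the open-source sqlglot parser. The view below is hypothetical.
import sqlglot
from sqlglot import exp

view_sql = """
CREATE VIEW dwh.v_sales AS
SELECT s.order_id, s.amount * r.rate AS amount_eur
FROM dwh.sales AS s
JOIN dwh.fx_rates AS r ON s.currency = r.currency
"""

tree = sqlglot.parse_one(view_sql, read="teradata")

# Every projected field, with the physical columns it depends on
for select in tree.find_all(exp.Select):
    for projection in select.expressions:
        inputs = sorted(c.sql() for c in projection.find_all(exp.Column))
        print(projection.alias_or_name, "<-", inputs)
# order_id   <- ['s.order_id']
# amount_eur <- ['r.rate', 's.amount']
```

Applied to 122,000 views and the associated macros and BTEQ scripts, this kind of parsing is what makes a field-level map of the platform tractable.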
|
|
|
|
|
{openAudit} helped identify information usage patterns through an analysis of the audit database logs, covering both data consumption and data injection.
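A hedged sketch of the same idea, assuming the query-log entries have been exported to flat files; the file and column names are illustrative only.

```python
# Sketch: spot used vs. dormant objects by crossing the full catalog
# with the audit/query log. File and column names are illustrative.
import pandas as pd

catalog = pd.read_csv("dbc_tables.csv")        # every table/view/file
access_log = pd.read_csv("dbql_access.csv")    # one row per object access

used = set(access_log["object_name"])
dormant = set(catalog["object_name"]) - used

print(f"{len(used)} objects actually queried, {len(dormant)} dormant")
```

Cross-checks of this kind are what produce usage figures like the ~30% quoted above, and they decide what is worth migrating at all.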
|
|
|
|
|
{openAudit} analyzed task scheduling and linked it to data lineage, as well as to data usage.
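One way to picture this linkage, sketched here with the networkx graph library; the job, table, and report names are hypothetical.

```python
# Sketch: tie scheduling to lineage by modelling jobs, tables and reports
# as one directed graph, then asking what a given job ultimately feeds.
import networkx as nx

g = nx.DiGraph()
g.add_edge("job:load_sales", "table:dwh.sales")      # scheduler -> table
g.add_edge("table:dwh.sales", "view:dwh.v_sales")    # lineage
g.add_edge("view:dwh.v_sales", "report:sales_board") # usage

# Everything downstream of the nightly load job:
print(nx.descendants(g, "job:load_sales"))
# {'table:dwh.sales', 'view:dwh.v_sales', 'report:sales_board'}
```

With scheduling, lineage, and usage in one graph, the impact of moving or retiring any job can be read directly off its descendants.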
|
|
|
|
|
{openAudit} highlighted the impacts on the data visualization tools connected to Teradata (e.g., Power BI, SAP BO) in order to expose the related complexity (business rules) and enable truly end-to-end data lineage.
|
|
|
|
|
Step #2: Automating the migration
|
|
|
|
|
Through a series of mechanisms, {openAudit} reproduced the essential processing in BigQuery: parsing the source scripts, then producing enriched standard SQL. Note that some encapsulations (Shell, among others) may degrade the output. Note also that an ETL/ELT process in the source system requires a transposition of its processes; for some of these, {openAudit} can accelerate the project.
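The core idea, parse then re-emit, can be sketched with the open-source sqlglot transpiler. This is not openAudit's engine, and the query below is hypothetical.

```python
# Sketch: parse Teradata SQL and re-emit BigQuery standard SQL.
# Constructs without a direct equivalent may pass through unchanged
# and still need manual review, as noted above.
import sqlglot

teradata_sql = """
SELECT order_id, ZEROIFNULL(amount) AS amount
FROM dwh.sales
QUALIFY ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY load_ts DESC) = 1
"""

print(sqlglot.transpile(teradata_sql, read="teradata", write="bigquery")[0])
```

At the scale of 1,500,000 requests per day, automating this translation rather than rewriting by hand is what keeps the project duration bounded.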
|
|
|
|
|
Step #3: Mastering deployment in GCP through mapping
|
|
|
{openAudit} performed dynamic parsing of BigQuery: scheduled queries, view scripts, and load files such as JSON and CSV, to enable intelligent reconstruction of the flows.
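For instance, view definitions can be pulled for parsing from BigQuery's INFORMATION_SCHEMA. This sketch uses the official google-cloud-bigquery client; the project and dataset names are placeholders.

```python
# Sketch: pull every view definition of a dataset so it can be parsed.
# `my_project.my_dataset` is a placeholder.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
SELECT table_name, view_definition
FROM `my_project.my_dataset.INFORMATION_SCHEMA.VIEWS`
"""
for row in client.query(sql).result():
    print(row.table_name, len(row.view_definition), "chars of SQL")
```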
|
|
|
|
|
{openAudit} analyzed the logs in Google Cloud's Operations suite (formerly Stackdriver) to immediately understand how the information is used.
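A sketch of the kind of query involved, using the google-cloud-logging client; the filter targets BigQuery audit entries and may need adjusting to your audit-log configuration.

```python
# Sketch: read BigQuery audit entries from Cloud Logging (ex-Stackdriver)
# to see which jobs ran and when. The filter below is an assumption
# based on the legacy jobservice.jobcompleted audit events.
import itertools
from google.cloud import logging

client = logging.Client()
log_filter = (
    'resource.type="bigquery_resource" '
    'AND protoPayload.methodName="jobservice.jobcompleted"'
)
for entry in itertools.islice(client.list_entries(filter_=log_filter), 10):
    print(entry.timestamp, entry.log_name)
```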
|
|
|
|
|
{openAudit} mapped out the task scheduling in order to link it to data lineage and data usage, as on the source platform.
|
|
|
|
|
{openAudit} introspected certain "target" data visualization technologies that rely on GCP (Looker, Data Studio, BO Cloud, Power BI, etc.) in order to reconstruct the reporting intelligence by comparing responses between source and target.
|
|
|
|
|
Furthermore, the connectors could be migrated to BigQuery (in some cases via the Datometry Hyper-Q middleware, at the cost of some performance degradation).
Conclusion

We do not believe a migration of this ambition can be driven purely by a "kick-off" and "deadlines". It requires an intelligent process grounded in real mastery of both the source platform and the target platform, via continuous technical introspection of processes and usage, together with a graphical representation of the information systems that everyone can understand and use. In this context, automating the migration brings undeniable added value.
|
|
|
|
|
|
|