25
12

CDC using In-Out parameters in the Cloud Data Integration mapping

Last updated at Posted at 2023-12-14

Introduction

Change Data Capture (CDC) is a crucial aspect of modern data integration, enabling organizations to capture and track changes in their data sources efficiently. Informatica Intelligent Data Management Cloud (IDMC) provides robust tools for implementing CDC. In this guide, we'll explore the use of in-out parameters in the Data Integration mapping to enhance the CDC process and streamline data updates.

In the realm of Change Data Capture (CDC), the Mass Ingestion service within Informatica Intelligent Data Management Cloud (IDMC) stands out as a powerful solution. However, it comes with a trade-off – while excelling in rapid data ingestion, it offers limited transformation options. For scenarios where CDC implementation is coupled with the necessity for intricate data transformations, an alternative approach takes center stage. This article delves into a solution that leverages In-Out parameters in Informatica Cloud Mapping, providing an ideal choice for customers seeking the perfect balance between efficient CDC and robust data transformation capabilities.

What are in-out parameters?

In-out parameters are a type of variable in Informatica IDMC that can hold and modify values during the execution of a mapping. They differ from regular input parameters in that they can be updated and used within the mapping itself.

Background:

A company maintains customer data in an Oracle database. The user wants to implement Change Data Capture (CDC) to extract data from the "CUSTOMER" table incrementally based on the "MODIFIED_DATE" column. The goal is to perform a full data extraction initially and then capture and load incremental changes on a daily basis into a Snowflake target.

Implementing CDC with in-out parameters in Informatica IDMC:

  1. Create an in-out parameter: In the parameter panel, define a parameter with the desired data type (typically date/time) and set the default value to a specific date (e.g., initial start date).
    image.png

  2. Map the parameter to your CDC source: In the source properties, use Query Options to define the expression to dynamically filter data based on the parameter value. For example, filter based on a field representing the last update date, ensuring it's greater than the parameter value.
    image.png
    image.png

  3. Update the parameter within the mapping: Use an expression transformation to update the parameter value after processing each batch of data. You can extract the maximum update date from the captured data and assign it to the parameter.
    image.png
    image.png

  4. Configure a Target connection and table. Save and validate the mapping.

  5. Create a Mapping task on top of the mapping.

  6. Before you run the mapping task, you can see the default value and current value have the same value that you defined while creating the parameter in the mapping.
    image.png

  7. The first time you run the mapping task, it will extract all data from the source and load it into Snowflake. Once the mapping task is completed, you can see the in-out parameter value will be updated automatically to the maximum value from the source CDC column.
    image.png

  8. For validation, you can confirm this value by comparing it with your source table.
    image.png

  9. The next day or before the next run, I observed that there are 3 records newly created in my source table. I want to automatically identify and bring this data into my target.
    image.png

  10. Automate the process: Schedule your mapping to run periodically. The updated parameter value will be automatically used in subsequent runs, ensuring continuous and efficient CDC.
    image.png

  11. You have the flexibility to either reset to the initial value or set a custom date for data extraction by using the 'edit' option in the mapping task.
    image.png

Advantages:

Imagine managing customer data across a CRM system and a marketing automation platform. By utilizing an in-out parameter for CDC, you can:
• Capture only new or updated customer records after the last successful synchronization.
• Eliminate redundant data transfer and optimize processing time.
• Dynamically adjust the extraction range based on specific business needs, such as focusing on changes from a particular campaign.
• Simplify the mapping design and reduce maintenance overhead.

Conclusion:

In-out parameters offer a powerful and flexible approach to performing CDC in Informatica Cloud. By leveraging this technique, you can significantly improve performance, gain greater control over the extraction process, and ensure efficient data synchronization across your systems.

25
12
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
25
12