Dedup Component

🎒Understanding the Dedup Component

The Dedup Component identifies and removes duplicate records within a dataset, so that each record is unique with respect to the configured field. Removing these redundant entries improves data quality.

🚀 Key Features

Duplicate Detection: Identifies duplicate records based on specified fields.
Automatic Removal: Removes duplicates to maintain a clean and accurate dataset.

📘 How to Use the Dedup Component

Configuration Steps

Add a New Component: Select “Dedup” as the component type and assign a descriptive name that reflects its function.
Configure the Component: Click on the component to open its configuration settings.
Configure the Deduplication Field: Specify the field used to identify duplicates. Records that share the same value for this field are treated as duplicates, and the component removes them automatically.

Component Input

Type: List of Dictionaries
Description: The input is a list in which each item is a dictionary. All dictionaries share the same fields, and each field holds the same data type across items.
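
To make the input requirement concrete, here is a minimal Python sketch of a pre-check you could run on data before passing it to the component. The function name `validate_records` and the sample fields (`id`, `email`) are hypothetical and are not part of the component itself; the check simply tests that every item is a dictionary with the same fields and value types as the first item.

```python
def validate_records(records):
    """Return True if `records` is a list of dicts sharing the same fields and value types.

    Hypothetical helper for illustration; not part of the Dedup Component.
    """
    if not isinstance(records, list):
        return False
    if not all(isinstance(item, dict) for item in records):
        return False
    if not records:
        # An empty list trivially satisfies the shape requirement.
        return True
    # Use the first record as the reference schema: field name -> value type.
    schema = {key: type(value) for key, value in records[0].items()}
    return all(
        {key: type(value) for key, value in item.items()} == schema
        for item in records
    )


sample_input = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
input_is_valid = validate_records(sample_input)
```

A list whose items differ in field names or value types (for example, `id` as an integer in one item and as a string in another) would fail this check.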

Component Output

Type: List of Dictionaries
Description: The output is the input list with duplicates removed. No two entries share the same value for the configured field.
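
The input-to-output behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the component's actual implementation: the function name `dedup_by_field` is hypothetical, the sketch keeps the first record seen for each value (the documentation does not say which duplicate is retained), and it assumes the field's values are hashable (strings, numbers, and similar).

```python
def dedup_by_field(records, field):
    """Return a new list with at most one record per distinct value of `field`.

    Assumption: the first occurrence of each value is kept and input order
    is preserved. The real component may behave differently.
    """
    seen = set()      # values of `field` already encountered
    unique = []       # records kept so far, in input order
    for record in records:
        value = record[field]  # assumes every record has `field`
        if value not in seen:
            seen.add(value)
            unique.append(record)
    return unique


records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},  # duplicate email
]
deduplicated = dedup_by_field(records, "email")
# deduplicated keeps the records with id 1 and id 2
```

Tracking seen values in a set keeps the pass linear in the number of records, and the original list is left unmodified.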

📖 Scenarios