Merging two datasets is the process of bringing two datasets together into one, and aligning the rows from each based on a common column. By simple drag and drop this operation can be added to the flow. After the dragging, the combining datasets should be specified as left on and right on. At the how part specific types of merging should be specified as inner, left, right, and outer.

Select Combination

  • left_on
    Columns that you wish to match from the left dataset.
  • right_on
    Columns that you wish to match from the right dataset.

You can add multiple left-right combinations, to do that, simply click the plus button to add another left-right match.

How

Merge Types
There are four different types of merges you can choose from: left, right, inner, and outer. Let's go through each one with examples.

1.1 Left Merge

A left merge takes all the rows from the left dataset (Dataset A) and tries to find matching rows in the right dataset (Dataset B) based on the common column. If a match is found, the information from Dataset B will be added to the corresponding row in Dataset A. If no match is found, the resulting row will still contain the data from Dataset A, but with empty values (NaN) for the columns from Dataset B.

Example

Dataset A:

Employee_IDName
1Alice
2Bob
3Carol

Dataset B:

Employee_IDDepartment
1HR
3Finance
4IT

Left Merge Result:

Employee_IDNameDepartment
1AliceHR
2BobNaN
3CarolFinance

1.2 Right Merge

A right merge is similar to a left merge, but with a focus on the right dataset (Dataset B) instead. This time, all rows from the right dataset will be included in the result, and matching rows from the left dataset (Dataset A) will be added. If no match is found, the resulting row will contain the data from Dataset B, but with empty values (NaN) for the columns from Dataset A.

Right Merge Result:

Employee_IDNameDepartment
1AliceHR
3CarolFinance
4NaNIT

1.3 Inner Merge

An inner merge combines the datasets based on the common column but only includes rows where a match is found in both datasets. In other words, the result will only contain rows that have matching data in both Dataset A and Dataset B.

Inner Merge Result:

Employee_IDNameDepartment
1AliceHR
3CarolFinance

1.4 Outer Merge

An outer merge includes all rows from both datasets, whether or not there is a match. If a match is found, the data from both datasets will be combined in the resulting row. If no match is found, the row will contain data from either Dataset A or Dataset B, with empty values (NaN) for the columns from the other dataset. This is also known as a full outer join.

Outer Merge Result:

Employee_IDNameDepartment
1AliceHR
2BobNaN
3CarolFinance
4NaNIT