Merging two datasets is the process of bringing two datasets together into one, and aligning the rows from each based on a common column. By simple drag and drop this operation can be added to the flow. After the dragging, the combining datasets should be specified as left on and right on. At the how part specific types of merging should be specified as inner, left, right, and outer.
Columns that you wish to match from the left dataset.
Columns that you wish to match from the right dataset.
You can add multiple left-right combinations, to do that, simply click the plus button to add another left-right match.
There are four different types of merges you can choose from: left, right, inner, and outer. Let's go through each one with examples.
1.1 Left Merge
A left merge takes all the rows from the left dataset (Dataset A) and tries to find matching rows in the right dataset (Dataset B) based on the common column. If a match is found, the information from Dataset B will be added to the corresponding row in Dataset A. If no match is found, the resulting row will still contain the data from Dataset A, but with empty values (NaN) for the columns from Dataset B.
Left Merge Result:
1.2 Right Merge
A right merge is similar to a left merge, but with a focus on the right dataset (Dataset B) instead. This time, all rows from the right dataset will be included in the result, and matching rows from the left dataset (Dataset A) will be added. If no match is found, the resulting row will contain the data from Dataset B, but with empty values (NaN) for the columns from Dataset A.
Right Merge Result:
1.3 Inner Merge
An inner merge combines the datasets based on the common column but only includes rows where a match is found in both datasets. In other words, the result will only contain rows that have matching data in both Dataset A and Dataset B.
Inner Merge Result:
1.4 Outer Merge
An outer merge includes all rows from both datasets, whether or not there is a match. If a match is found, the data from both datasets will be combined in the resulting row. If no match is found, the row will contain data from either Dataset A or Dataset B, with empty values (NaN) for the columns from the other dataset. This is also known as a full outer join.
Outer Merge Result:
Updated 5 months ago