Merge as of

The merge_asof operation is similar to a merge but is particularly useful when you need to combine datasets with non-exact matches based on a common column. This is often used when dealing with time series data, where you want to merge the datasets based on the closest matching timestamp without going past it.

Select Combination

  • Left On
    Columns that you wish to match from the left dataset.
  • Right On
    Columns that you wish to match from the right dataset.
  • Allow Exact Matches
    By default, it is set to True, which means that exact matches are allowed in the merge. This means that rows with identical values in the join columns will be included in the merged dataframe. If it is set to False, exact matches won't be allowed, and rows with identical values in the join columns will result in NaN match values.

The "direction" parameter determines how the function should find the nearest matching row in the right dataframe based on the key column values. There are three possible values for the "direction" parameter:

  • 'forward': When the direction is set to 'forward', the function looks for the nearest row in the right dataframe that has a key value greater than or equal to the key value in the left dataframe. In other words, it searches forward in the right dataframe for the first matching row. If no matching row is found, a NaN value will be assigned.
  • 'backward': When the direction is set to 'backward', the function looks for the nearest row in the right dataframe that has a key value less than or equal to the key value in the left dataframe. In other words, it searches backward in the right dataframe for the closest matching row. If no matching row is found, a NaN value will be assigned.
  • 'nearest': When the direction is set to 'nearest', the function looks for the row in the right dataframe that has the nearest key value to the key value in the left dataframe, regardless of whether it is greater or smaller. In case of a tie, the function chooses the first row found in the right dataframe.

The choice of direction depends on the specific use case and the desired behavior when merging the dataframes.

Example
Dataset A:

TimestampValue
2023-01-01 10:00:0010
2023-01-01 10:15:0012
2023-01-01 10:30:0014

Dataset B:

TimestampMetric
2023-01-01 10:05:00A
2023-01-01 10:25:00B

Results:

TimestampValueMetric
2023-01-01 10:00:0010NaN
2023-01-01 10:15:0012A
2023-01-01 10:30:0014B