Replace Records Via Joins in Power Query

I got an email from a friend today who was using some complicated logic to replace specific records in a table with records from another table.  His query was running pretty slow, so he reached out for a little help. In this post I'll show how to replace records via joins in Power Query; a much easier (and what should be a faster) solution to his issue.

Data Background

The data footprint that was sent to me looked something like this:

image

And the desired output is shown below:

image

So basically, we want to take the record for Unit002 from the Override table and replace the Unit002 value in the Original Data table.

At first glance, this looks hard.  And my friend cooked up something pretty complicated to make this work.  Funny thing is (and believe me… I've had this happen to me as recently as last week…) when you put another pair of eyes on it, you suddenly realize it's much easier than you first saw.

In this case we can actually solve this very easily by using a couple of Power Query's different Join types!

Laying the Groundwork

If you want to follow along, grab the sample workbook here.  You'll notice that we have taken the following actions already:

  • Select any cell in the Original Data table
  • Create a New Query –> From Table
  • Go to Home –> Close & Load To… –> Connection Only
  • Select any cell in the Override With table
  • Create a New Query –> From Table
  • Go to Home –> Close & Load To… –> Connection Only

Which leaves us with the following queries in the Workbook Queries pane:

image

We are now set to replace the records.

Replace Records Via Joins in Power Query

This actually takes a Merge and an Append in order to complete the job.  So let's start at the merge.

  • Right click the "Original" query –> Reference

This creates a pointer to the data in the "Original" query, showing all four rows of data in the table.  The challenge here is that we only want the rows which are NOT being replaced.  The secret to getting those?  An Anti-Join!

  • Go to Home –> Combine –> Merge Queries
  • Choose the Override query
  • Select the Unit column on both the top and bottom queries
  • Change the Join Kind to "Left Anti (rows only in first)"

image

  • Click OK

At this point, you'll have 3 rows left, as shown below:

image

Why only 3 rows?  Because the Left Anti Join only returns the rows which don't match what is in the other table.  So where Unit002 exists in the second table, it cause it to pull everything EXCEPT Unit002 from the left table.  (For more on using Anti-Joins in Power Query, see this blog post.)

Joining tables does create a new column however, even if it is full of null values (as this one is.)  Since we don't need it, let's just delete that column:

  • Right click the NewColumn column –> Remove

Now we just need to add the record(s) from the Override table to this list.  That's fairly easy:

  • Go to Home –> Combine –> Append
  • Choose the Override table
  • Right click the Unit column –> Sort –> Ascending (this step is optional, and done for readability only.)

And you're done!  5 steps (after the connection only queries were created), 100% user interface drive, and should perform quite quickly. Smile

6 thoughts on “Replace Records Via Joins in Power Query

  1. Nice trick, Ken 🙂 Very usefull.
    But does you know that we can do this in this way?

    let
    Source = Table.Combine({Override,Original}),
    #"Remove duplicates" = Table.Distinct(Source, {"Unit"})
    in
    #"Remove duplicates"

    I want only to say "we can" and not "this is a better way".
    Truly, I do not know which is better or faster 🙂

    Cheers

  2. Great point, Bill. Many ways to skin this cat! Matt Allington also mentioned to me (via email) that we should consider doing two anti joins. The issue he raised is what would happen if the "replacement' table contained a record that did not exist in the original table. Both my and your solutions would add that record. By using an anti join the other way... we could avoid that.

  3. This is not a big problem 🙂
    The only we have to do is preparing the "Override" table.
    I mean something like this below.
    let
    OrgUnit = Original[Unit],
    Source = Excel.CurrentWorkbook(){[Name="Org_Override"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Unit", type text}, {"Organization", type text}, {"Department", type text}, {"Team", type text}}),
    ToRemoveFromOverride = List.Buffer(List.Intersect({#"Changed Type"[Unit], OrgUnit})),
    OverrideReady = Table.SelectRows(#"Changed Type", each List.Contains(ToRemoveFromOverride,[Unit]) )
    in
    OverrideReady

    Cheers 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *