Update: Refresh Power Queries With VBA

Some time back I posted a routine to refresh Power Queries with VBA, allowing you to refresh all Power Queries in a workbook.

Things Changing…

The challenge with that specific macro is that it refers to the connection name, which may not always start with Power Query.  (As we move closer to Excel 2016, we are now aware that these queries will NOT have names starting with Power Query.)

I have seen an approach where someone modified the code I provided earlier to include a variable, allowing you to easily change the prefixed text from Power Query to something else.  While that works, it’s still not ideal, as names can be volatile.  It makes more sense to try and find a method that is cannot be broken by a simple name change.

Refresh Power Queries With VBA

The macro below is a more targeted approach that checks the source of the query.  If it is a connection that is built against Power Query engine (Microsoft.Mashup), then it will refresh the connection, otherwise it will ignore it.

Public Sub UpdatePowerQueriesOnly()
Dim lTest As Long, cn As WorkbookConnection
On Error Resume Next
For Each cn In ThisWorkbook.Connections
lTest = InStr(1, cn.OLEDBConnection.Connection, "Provider=Microsoft.Mashup.OleDb.1")
If Err.Number <> 0 Then
Err.Clear
Exit For
End If
If lTest > 0 Then cn.Refresh
Next cn
End Sub

This macro works with Excel 2010, 2013 and 2016, and should survive longer than the original version I posted.

Remove All Rows Up To A Specific Value

A couple of weeks ago, Rudi asked how you would go about setting up a query to remove all rows up to a specific value.  Specifically, his question was this:

The other day I was asked if Power Query could delete all top rows up to a found value. I could not find a solution and its been a burning question till now.
For example: If I import a csv file containing columnar info, but the headings for the list are in different rows for each import. I know that the first heading in column A is called "ID Number", but each import has this heading in a different row.
How do I determine an applied step to delete all rows above "ID Number". I cannot use the delete top rows as its not always 5 rows, some import the headings start in row 10, others in row 3...but the label I am looking for is always "ID Number".

While the question was answered in the initial post, I still though it would be interesting to do a full post on this for others who might need to create similar functionality.

The Data Set

I didn’t have Rudi’s exact data, so I knocked up a little sample with an ID Number and Amount column starting in row 3, which you can download here.

The components I was after here was the ID Number header and an extra column of some kind.  In addition, I wanted to have some garbage data above, as I didn’t want to give the impression we can just filter out blank rows.

Now, I assume this data is loaded from an external file, but it doesn’t really matter, I’ll just load this from a range, as it’s just a data source.  The key is that the header row is not the first row.  So I defined a new range to cover the data

  • Select A1:B7 –> Name Box –> Data

I can then select the name from the Name box to select my data:

SNAGHTML1549a4a4

I then loaded it into Power Query by creating a new query –> From Table

Determine the Header Row

This is the first job.  In order to remove any rows above the header row, we need to know which row the header resides.  To do that, we need to add an index column, and filter to the specific header that we’re after.

  • Add Column –> Index Column –> From 0
  • Filter Column1 –> only select ID Number

This results in a single row of data, and conveniently has the Index number of the ID Number row listed.  In this case it’s 2.

Call up the Original Table

We’ll park that value for a moment, and call up the original table.  To do that, we click on the fx button beside the formula bar to create a new step, then replace the formula with =Source.

image

Remove All Rows Up To A Specific Value

Now comes the tricky part.

We cant’ remove all rows up to a specific value immediately.  We need to insert a step that removes the top 2 rows first, then modify it.  So let’s do that:

Home –> Remove Rows –> Remove Top Rows –> 2

This gives us the following table and formula:

image

The key is to understand this formula first.

  • The Table.Skip() function removes (actually skips importing) the first x rows
  • Custom1 is the name of our previous step
  • 2 is the number of rows

So what we really need to get is the number of rows.  We can extract that from the Filtered Rows step like this:

#”Filtered Rows”[Index]{0}

Where:

  • #”Filtered Rows” is the name of the step in the Applied Steps window
  • [Index] is the column we want to look at in that step
  • {0} indicates the first row of that step (since Power Query starts counting at 0

So let’s modify the function to:

= Table.Skip(Custom1,#"Filtered Rows"[Index]{0})

As you can see, it works nicely:

SNAGHTML1559cc91

In fact, we can go even better than this.  Why are we referring to Custom1 at all?  Doesn’t that just refer to Source anyway?  Let’s also replace Custom1 with Source, and remove the Custom1 step from our query:

SNAGHTML155cd378

Cleanup

We can now promote our header rows, remove the unnecessary Changed Type step and set our data types correctly:

SNAGHTML155f21ae

Testing

If you try inserting new rows at the top of the data range, then refreshing the completed query… it just works!  The output table will never end up with extra rows at the top, as we’re filtering for to start at the ID Number row.

SNAGHTML156196d4

Using M

If you learn to write your own M code from scratch, you can start combining lines as well, skipping the next to explicitly filter the primary table.  Bill S provided code in answer to Rudi’s question (adapted to this workbook), which is shown below:

let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
AddIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
Skip = Table.Skip(Source, Table.SelectRows(AddIndex, each ([Column1] = "ID Number")){0}[Index]),
PromoteHeaders = Table.PromoteHeaders(Skip)
in
PromoteHeaders

It essentially replicates the same steps as the query I built via the user interface, only he combined the Filtered Rows step into the Row Removal step.  To bonus here is that the code is a bit shorter.  The drawback is that it might not be quite so “de-buggable” to someone who isn’t as experienced with M.

Multiplying NULL values in Power Query

I was working through a scenario today and came up against something unexpected when multiplying NULL values in Power Query.

Background

I have a fairly simple table of transactions that looks like this:

image

And wanted to turn it into this:

image

Seems simple enough, but I ran into an odd problem.

Steps

Getting started was pretty easy:

  • Pull the data into Power Query using From Table
  • Remove the final column
  • Select the Bank Fee column –> Transform –> Standard –> Multiply –> –1

So far everything is good:

image

Then I tried to do the same to the Discount column.

Multiplying NULL values

At this point, something odd happened.  I did the same thing:

  • Select the Discount column –> Transform –> Standard –> Multiply –> –1

But instead of getting a NULL or 0 for John’s record, it gave me –1.  Huh?

image

This is honestly the last result I expected.  How can a NULL (or empty) cell be equivalent to 1?  I think I’d rather have an error than this.

Regardless, easy enough to fix, I just inserted a step before the multiplication step to replace null with 0:

image

Good stuff, so now just finish it off:

  • Right click the Customer column –> UnPivot Other columns

And all looks good…

image

… until I load it into the Excel table:

image

Seriously?  Negative zero?

To be honest, if I’m feeding a PivotTable with this anyway, I really don’t need the discount record for John.  To fix this I just went back to the Power Query and inserted another step right before the Unpivot portion when replaced 0 with null.  The result is now really what I was originally after:

image

End Thoughts

I can’t help but think that this behaviour has changed, as I actually tripped upon it refreshing a previous solution.  The killer is that the data has changed, but I’m pretty sure the old data set had these issues in it too, and it wasn’t a problem.

Regardless, I”m a little curious as to your opinions.  Is this expected behaviour?  Bug or feature?

Request new features for Excel

Have you ever wanted to request new features for Excel?  Maybe an upgrade of data validation, a new function, or something else?  Well guess what… there is a way…

In a recent blog post on the Office Blogs called “Your top 10 questions about Office 2016 for Mac answered”, answer number 7 was to the question “How can I request a new feature?”

And the cool thing?  This isn’t just applicable to Mac Excel.  It’s applicable to ANY AND EVERY flavour of Excel (Windows, Android, iPad, Mac, Online)

How to request new features for Excel

Go to the Excel Suggestion Box site at http://excel.uservoice.com

  • Create a user account if you want to add a new suggestion or vote on existing suggestions.
  • Select one of the feedback forums listed.
  • Check out the ideas others have suggested and vote on your favourites.

The Excel team does suggest that you search for your suggestion among existing requests before you post a new one.  This is actually important as you can then vote someone else’s suggestion up.  The more votes, the more likely it has a chance of being seriously looked at.

So on that note, check out the ones that are there.  Vote them up and comment if they are in line with what you’re thinking.  But add your own if you can’t already find one in place.

How to request features for Word, PowerPoint, Outlook and OneNote

As it would happen, they all have their own sites.  You can find them at the links below:

Rolling 12 Months in Power Query

Last week, in a blog comment, a reader asked how to filter their data to only show the most recent rolling 12 month period.  This post looks at how I made that work in Power Query.

Background

Assume we have a table set up as follows:
image

As you can see, we have sales categories down the left, months across the top, and values in the middle.  A classic setup when users are tracking information.  And now we need to pull the most recent 12 months only from that table.

You can download the completed sample file if you’d like to follow this along as well.

Filter the most recent rolling 12 months from a table in Power Query

Step 1: Grab the data

First thing we need to do is grab the data. To do this, I clicked in the table and:

  • Power Query –> From Table –> Confirm the range (if required)
  • Changed the query name to Rolling12

Helpfully, Power Query identified the data types in all the columns for me, so I’m pretty much ready to go.

Step 2: Show the most recent record first

To do this, we really need to get the data into an unpivoted list.  Easy enough to do:

  • Select the “Sales” column
  • Right click –> UnPivot Other Columns

Note: The “unpivot other columns” command was added to the right click menu in version 2.24.  While you can most likely still access this command, you are definitely running an older version, and for a multitude of important reasons, you should really update to the latest version of Power Query.

If you don’t want to (or can’t) do this for some reason, then go to Transform –> Unpivot –> Unpivot Other columns to accomplish the same thing.

We now get a nice unpivoted list:

SNAGHTML846714

Next I cleaned up that Attribute column:

  • Right click the Attribute column –> Rename –> Date
  • Right click the Date column –> Changed Type –> Date

And finally I sorted the records to show the most recent ones at the top:

  • Click the Filter icon on the Date column –> Sort Descending

Leaving us with this:

SNAGHTML87924c

Step 3: Create variables to hold the required data range

Next we need to work out the dates that we want to use for the top and bottom range of dates in the query.  This is a bit tricky, but uber powerful once you realize you can do it.

To start, click the fx icon in the formula bar:

image

NOTE: If you don’t see the formula bar, go to the View tab, and check the box next to Formula Bar

What this does is add a new step in the formula bar called “Custom1”.  And if you check the formula bar you’ll see that it just refers to the previous step:

image

The cool thing here is that we can modify this.  Why don’t we add some data to the end of that statement to pull the first value from the Date column of that table?  To do that, change the text in the formula bar to read as follows:

=#”Sorted Rows”[Date]{0}

Recognize here that:

  • #”Sorted Rows” refers to the table in the previous step
  • [Date] tells Power Query that you only want the Date column
  • {0} tell Power Query that you want the first value from that column (remember that Power Query starts counting from 0)

The result is a single cell with the most recent date:

image

Let’s keep things clean in our Steps window… right click the Custom1 step and rename it to “MaxDate”.

image

What’s cool here is that we’ve essentially created a variable to work out and hold the most recent date.

So now that we have the top of the range, why don’t we create another step to modify it to the date for the lower end of the range?

  • Click the fx step in the formula bar
  • Modify the formula to read as follows

=Date.AddYears(MaxDate,-1)

And the result is a new Custom1 step that shows in the formula bar as follows:

image

Note:  The formulas you can use are documented at Microsoft’s site here: http://office.microsoft.com/en-us/excel-help/power-query-formula-categories-HA104122363.aspx

Pretty cool, don’t you think?  We’ve now got a step that holds the lower end of the results too.  Let’s do our cleanup again.

Right click the Custom1 step and rename it to “AfterDate”

Step 4: Implement the variables into a filter

Our last step is to implement the variables into the filter, cutting the data down to the most recent rolling 12 months of data.  Before we can do that, however, we really need to get back to the table we had in the “Sorted Rows” step.

The challenge is that we can’t select and work from that step, as it’s earlier in the process than the creation of our variables.  So how do we get back to that step AFTER we’ve created our variables?

  • Click the fx button in the formula bar

Once again, we get a new step in the formula bar:

image

The problem is that it’s referring to the previous step.  So what if we changed it to point to the #”Sorted Rows” step?

image

Now how cool is that?  Not only can we refer to the previous step of our query, we can change that to point to ANY previous step, or type in our own formulas against any previous step!

NOTE:  If your steps have spaces in the, don’t forget to wrap the step name in quotes and then preface it with the # symbol.  If your steps don’t have spaces, then you don’t need to do this.

Okay, so now let’s filter our data.  We’ll start by doing it manually:

  • Click the filter icon on the Date column
  • Select Date Filters –> Custom Filter
  • Set up your filter as follows:

image

Notice that we have to pick the values from the list here.  (Wouldn’t it be cool if we could type in our variable names here?  That would be awesome!)  Regardless, we can set up the filter as we’d expect to use it.  This will filter our list, and leave us with the following formula in the formula bar:

image

NOTE:  You can expand the formula bar to show as I have by clicking the little down arrow icon.

Good stuff… now we need to do a little surgery.  Let’s replace the manual dates with our variables:

= Table.SelectRows(Custom1, each [Date] <= MaxDate and [Date] > AfterDate)

If you check the table now, you’ll see that it is filtered down to only contain records between Mar 1, 2014 and Feb 28, 2015.  And better yet, because the variables are created dynamically when the query is run, it will ALWAYS return the most recent rolling 12 months!

Step 5: Pivoting the data back into the original format.

Now we need to put the data back into the format the user wanted.  To do this, we need to pivot it back.

The trick to pivoting in Power Query is to select the column you want to use as the new column headers.  This time it is the Date column.  So…

  • Select the Date column –> Transform –> Pivot Column
  • Change the “Values” column to the one that holds your values (in this case it’s actually called Value)

image

And the result:

image

Bingo!  The most recent rolling 12 months of data from our table.

At the point you can click File –> Close & Load, and load it to a table.

Proof Positive

Go and add a new column of data.  You can insert it into the existing table, put it on the end, it really doesn’t matter since Power Query will sort it anyway.  Once you’re done, right click the new table and refresh it, and you’ll find it works nicely.

One minor point of note… in the version I did we’ve actually reversed the column order (the most recent date has moved to the left from the right.)  If we wanted to change that it’s fairly easy too.  Just before we pivot the data back into the pivoted form, just sort it in descending order.

Important Power Query Update Available

At last, the Power Query update I’ve been waiting for has finally landed in the download site.  This is version 2.24, and sets my parameter tables technique back to a working state!

You can download version 2.24 direct from Microsoft’s site by clicking here.

Why is this Power Query Update important?

This Power Query update is pretty important for a few reasons:

  • It fixes the issues with the MultipleUnclassified/Trusted error on refreshing parameter tables (as I blogged about here)
  • If you’re running version 2.22 it also fixes issues with loading to the data model

Does this Power Query Update have any NEW features?

Of course it does!

  • Improvements to ODBC Connector.
  • Navigator dialog improvements.
  • Option to enable Fast Data Load vs. Background Data Load.
  • Support for Salesforce Custom environments in Recent Sources list.
  • Easier parsing of Date or Time values out of a Text column with Date/Time information.
  • Unpivot Other Columns entry in column context menu.

The big thing to me, though, is the Power Query update fixes the critical bug(s) listed above.  If you’re running 2.22 or 2.23 I highly recommend updating.  (And if you’re running an older version I’d update too, as there is new functionality released every month.)

Fix Date Errors

I was teaching a course on Power Query yesterday in which we imported a text file.  Almost immediately, some of my users pointed out that their dates weren’t importing correctly, and we had to cover how to fix date errors right away.

And to underscore the importance of this… this morning I woke up to comments on one of my previous blog posts with similar issues.  I figured it’s time to cover the easy way to fix date errors.

The Symptom

My data came from a text file, and was shown in the following format:

image

As you can see, the data is set up in the Month/Day/Year format.  The issue for the user is that their system is set to a different format… in the case of the users in my class their default windows settings were set to Day/Month/Year, which is the Canadian date format.  When they tried to convert the data from an “any” data type to a date, it messed it up.

The problem comes in to play because many systems export into a MDY format as they were programmed using US date standards.  But with our operating system set to a different format, it tries to interpret dates in the standard set there.  So when importing a text file, it looks at 1/12/2000 and interprets it as Dec 1, 2000, not Jan 12, 2000.

Then it hits a date like 1/13/2000.  Because there is no 13th month, it returns an error.

One of the class attendees brought up an interesting point as I was explaining this… “I thought that dates were just a format on top of a serial number?  So how can it get it wrong”.  He was absolutely correct.  But in this case we are importing data from another source into Excel (via Power Query)… Excel (Power Query) is trying to determine what that date serial number is based on the system settings.  That’s where the issue hits us.

How to fix date errors

At first, you might be tempted to flip the date format in your Windows settings, but that won’t actually help you in the long run.  In fact, in the worst case it may fix the issue for the current import, and blow apart other solutions that you’ve built.  So that’s really not a practical solution.  What we need is a way to tell Power Query what the date settings are for THIS data source.  Fortunately, there is a way to override the date format.  In truth, this doesn’t so much fix date errors, but rather prevents them from occurring in the first place.

What we do is select the column with our dates in it then:

  • Right click the column
  • Choose Change Type –> Using Locale

image

(Yeah, I know… this is hardly a term that Excel users are familiar with, but it allows you to force a different regional setting on the data source.)

You’ll then be prompted with a new dialog where you’ll choose the date, then the Locale you want to use to read it:

image

The key here is to recognize WHICH locale your data format is emulating.  There are hundreds of countries in this listing.  My guess is that you’re probably going to pick either your own or English (United States) most of the time.  In truth, when working with dates, the country is actually not the important part.  The important part is that you pick a country where the MDY or DMY format is consistent with your data source.

Future Proofing

I’ve been fortunate enough to have to deal with this issue very little in my career.  I generally leave my system in a US English configuration and most of my imports follow the US date standard, so no issue.  But I recognize this as a huge issue for Europeans, as well as any company that conducts business in multiple countries.  In both cases this issue comes up over an over again.

There are two great things about using the “Change Type With Locale” feature:

The first is that it avoids relying on an implicit shortcut, explicitly declaring the source data format.  This is REALLY handy for future proofing.  In Canada, we typically end up with some users in the organization using Canadian standards and some who use US standards (we are a very confused country sometimes.)  By specifically declaring the data type, I know that this solution will continue to work even when I send it to someone who uses a different date standard on their PC.  Why?  Because it’s now defined for the DATA, not the SYSTEM.

The second great thing is that this is a DATA SOURCE SPECIFIC feature.  I can set a different format for each data source used in my solution, allowing me to combine several data sets from different countries and still get it consistent.

Drawbacks/Improvements

One thing that I struggled with is this.  Back in the old text import tool for Excel, we had a nice feature that looked like this when we were setting data types:

image

This was fantastic, as I didn’t need to try and figure out which region the data came from, I simply chose the date format that I could see.  While it’s great that we can now exhibit some national pride by choosing our country, that doesn’t always help.  In the case of Canada, I’d bet that if I asked 5 different people what our official data format is, I’d get 5 different answers.

clip_image002

It would be SO handy if the Power Query team would add some indicator at the end of these options to indicate what the format is.  That would be such an easy change to make here, and SOOOOO useful.  I honestly don’t think I need to care if my setting is set to English Australia/UK/Ireland/Canada/Belize if it gives me interprets my date in the correct MDY order.

(I actually did email this thought to one of the program managers a few days ago.  So hopefully one day we’ll see that change take place.)

MultipleUnclassified/Trusted error

If you’ve been using my fnGetParameter function to build dynamic content into your queries, you may have noticed that passing some variables recently started triggering an error message that reads as follows:

We couldn’t refresh the connection ‘Power Query – Categories’. Here the error message we got:

Query ‘Staging-Categories’ (Step ‘Added Index’) used ‘SingleUnclassified/Trusted/CurrentWorkbook/’ data when last evaluated, but now is attempting to use ‘MultipleUnclassified/Trusted’ data.

If you’re like me, that was promptly followed by cursing and head scratching in the pursuit of a solution.  (I would suggest clicking the “was this information helpful?” link in the bottom of the error, and submitting a “no”.

image

As far as I can tell, this error started showing up in Power Query around v 2.21 or so.  (I don’t have an exact version.)  I know that it existed in 2.22 for sure, and still exists in version 2.23.

Triggering the MultipleUnclassified/Trusted error

Triggering the error isn’t difficult, but it does seem to show up out of the blue.  To throw it, I built a solution that uses my fnGetParameter function to read the file path from a cell, and feed it into a query as follows:

let
fPath = fnGetParameter("FilePath"),
    Source = Csv.Document(File.Contents(fPath & "SalesCategories.txt"),null,",",null,1252),
#"First Row as Header" = Table.PromoteHeaders(Source),
#"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",{{"POSPartitionCode", Int64.Type}, {"POSCategoryCode", type text}, {"POSCategoryDescription", type text}, {"POSReportingGroupCode", type text}, {"POSTaxTypeCode", type text}}),
#"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Lnk_Category", each Text.Combine({Text.From([POSPartitionCode], "en-US"), [POSCategoryCode]}, "-"), type text),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"POSCategoryCode"}),
#"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1)
in
#"Added Index"

The killer here is that – when you originally build the query – it works just fine!  But when you refresh it you trigger the error.

Fixing the MultipleUnclassified/Trusted error

There are two options to fix this particular error:

Option 1

The first method to fix this problem is to avoid the fnGetParameter function all together and just hard code the file paths.  While this works, you cut all dynamic capability from the query that you went to the effort of implementing.  In my opinion, this option is awful.

Option 2

If you want to preserve the dynamic nature of the fnGetParameter function, the only way to fix this error today is to perform the steps below in this EXACT order!

  1. Turn on Fast Combine (Power Query –> Options –> Privacy –> Ignore Privacy Levels)
  2. Save the workbook
  3. Close Excel
  4. Restart Excel
  5. Refresh the query

Each of the steps above are critical steps to ensure that transient caches are repopulated with the “fast combine” option enabled – merely setting the “fast combine” option is not enough on its own.

Expected Bug Fix

Microsoft is aware of this, and has been working on a fix.  I believe (although no guarantees), that it should be released in version 2.24.  Given the update schedule in the past, we should see it any day now.  Fingers crossed, as this is just killing my solutions!

I’ll keep checking the Power Query download page and post back when the new version goes live.

Create a Dynamic Calendar Table

I know this topic has been covered before, but I’m teaching a course on Power Pivot tomorrow, and it’s something that I’ll probably be brining up.  As we need a calendar table for our Power Pivot solutions, a method to create a dynamic calendar table is pretty important.  If you haven’t seen this before, I think you’ll be surprised at how easy it is to create a complete calendar driven by only a few Excel formulas and Power Query.

Setting up a dynamic source

The first key we need to do is set up a parameter table in order to hold the start and end date.  To do that, I’m creating a basic parameter table as described in this post.  Mine looks like this:

SNAGHTML3f7266cf

  • B3 is simply a hard coded January 1, 2014
  • B4 contains the =TODAY() function, returning today’s date

Once I created this table, I named it “Parameters” (as described in the aforementioned blog post), then created the fnGetParameter function in Power Query (again, as described in the aforementioned blog post.)

With that work done, it was time to move on to creating my calendar.

How to create a dynamic calendar table in Power Query

What I did at this point was create a new blank Power Query:

  • Power Query –> From Other Sources –> Blank Query

In the formula bar, I created a simple list by typing the following:

={1..10}

SNAGHTML3f7892f2

As you’ll see if you try this, it creates a simple list from 1 through 10.  That’s great, but it’s just a temporary placeholder.  Now I need to get my hands a bit dirty… I want to use the fnGetParameter function to load in my start and end dates as date serial numbers (not actual dates.)  To do this, I’ll retrieve them and explicitly force them to be Numbers.

  • Go to View –> Advanced Editor
  • Insert two new lines as follows:

let
    StartDate = Number.From(fnGetParameter("Start Date")),
EndDate = Number.From(fnGetParameter("End Date")),

Source = {1..10}
in
Source

So, as you can see, we’ve used the fnGetParameter function to retrieve the Start Date and End Date values from the Excel table, then converted them to the date serial numbers (values) by using Number.From.

With that in place, we can then sub the StartDate and EndDate variables into the list that we created in the Source step:

let
StartDate = Number.From(fnGetParameter("Start Date")),
EndDate = Number.From(fnGetParameter("End Date")),
Source = {StartDate..EndDate}
in
Source

And when we click OK, we now have a nice list of numbers that spans from the first date serial number to the last from the Excel table:

SNAGHTML3f810c61

“Great,” you’re thinking, “but I really want dates, not the date serial numbers.”  No problem, lets do that.

First we need to convert our list to a table:

  • Go to Transform –> Convert to Table –> Click OK (with the default options)

image

  • Select Column1 –> Transform –> Data Type –> Date
  • Right click Column1 –> Rename –> Date

And look at that!

SNAGHTML3f84f29c

And now it’s just a matter of adding the different date columns we need.  If you select the Date column, you’ll find a great variety of formats available under Add Column –> Date.  Just browse into the subcategory you want (year, month, day, week) and choose the piece you want to add.  In the table below, I added:

  • Date –> Year –> Year
  • Date –> Month –> Month
  • Date –> Month –> End of Month

SNAGHTML3f87a345

There are a lot of transformations for a variety of dates built in… for numeric or date values.  One thing that’s missing though, is text versions.  For those you need to add a custom column.  Here’s 3 formulas that you may find useful if you want to add text dates to your table:

  • Date.ToText([Date],"ddd")
  • Date.ToText([Date],"MMM")
  • Date.ToText([Date],"MMMM dd, yyyy")

To use them, go to Add Column –> Add Custom Column and provide those as the formula.  Their results add a bit more useful data to our query:

SNAGHTML3f907072

As you can see, they work like Excel’s TEXT function, except that the characters are case sensitive.

Conclusion

Overall, it’s super easy to create a dynamic calendar table using Power Query to read the start and end date from Excel cells.  This makes it very easy to scope your calendar to only have the date range you need, and also gives you the ability to quick add columns on the fly for formats that you discover you need, rather than importing a massive calendar with a ton of formats that you will never use.

In addition to being easy, it’s also lightening quick if you’re prepared.  It takes seconds to create the Excel parameter table, a few more seconds to set up the fnGetParameter function (if you have the code stored in a text file/bookmarked), and only a little while longer to create the original list and plumb in the variables once you’re used to it.  I can knock up a calendar like this in less than two minutes, and let it serve my data model every after.  :)

PowerQuery.Training

I should also mention that this is one of the techniques (amongst MANY others) that we cover in our PowerQuery.Training course.  We’ll be announcing a new intake soon, so don’t forget to sign up for the newsletter in the footer of the site so you’ll know when that happens!

Listing Outstanding Cheques

Power Query is all about transforming and filtering data, and automating the process.  One of the tedious tasks that accountants get to deal with all the time is bank reconciliations, which is essentially the process of filtering and matching items to see what is left over.  It’s been on my list for a while now, but I’ve been thinking that we can use Power Query for listing outstanding cheques (or checks if you’re in the USA).

The completed workbook is available for download by clicking here.

Background

Because I don’t want to run on to 100 pages, I’m going to start with two lists that show just the cheques, in two tables:

SNAGHTML1901ec84

The table on the left is the table of cheques that have been issued, as per the list maintained in the General Ledger. We’re making the assumption that we’ve dumped that list into an Excel worksheet, formatted it as a table, and given the table the name “GLListing”.

The table on the right is the table of the cheques that have cleared the bank.  Again, the assumption is that we’ve been able to download a list of the transactions, and filtered them down to show just the cheques that have cleared.  (Maybe we’d even use Power Query to do this.)  This table has been named “Bank”

Creating the data Staging queries

The first step is to create the staging queries to connect to these two tables.  One of the important things I wanted to ensure is that I can match transactions where both the cheque number and amount are identical.  (If a cheque clears for the wrong amount, I want to list it as outstanding at this point, as I need to review it.)  I’m going to keep that in mind as I create my staging tables.

The GLListing table:

To set this up I:

  • Clicked inside the GLListing table –> Power Query –> From Table
  • Set the data types on each column (Whole Number, Date, Decimal)
  • Selected the Cheque and Amount columns –> Add Column –> Merge
    • Separator:  Custom (I used a dash)
    • Name:  Issued

The end result in Power Query:

image

  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

The Bank table:

It’s virtually the identical process:

  • Click inside the Bank table –> Power Query –> From Table
  • Set the data types on each column (Whole Number, Date, Decimal)
  • Selected the Cheque and Amount columns –> Add Column –> Merge
    • Separator:  Custom (I used a dash)
    • Name:  Cleared

The end result in Power Query:

image

  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

Listing Outstanding Cheques

Now to build the important part.

  • In the Workbook Queries pane, right click the GLListing query—> Reference

At this point you’ll have a pointer to the GLListing table.  We also want a pointer to the Bank table.  To do that, let’s click the fx icon on the formula bar:

image

This will create a new step in your query.  The formula in the formula bar will read =Source (it refers to the previous step), and you’ll see a new step in your Applied Steps area called Custom1.  Let’s update both of those:

  • Change the formula to =Bank
  • Right click and rename the step to “Bank”

image

The key things we now have are a Source step (which contains the output of the GLListing query) and a Bank step (which contains the output of the Bank query).  The Source step has an Issued column, the Bank query a Cleared column, and we’ll like to know which items between those two columns are different.

To work this out we’re going to knock up a little M code.  Here’s how:

  • Click the fx icon on the formula bar
  • Replace the formula (it will read =Bank) with this:

=List.Difference(Source[Issued],Bank[Cleared])

The result will be as follows:

image

So what happened here?  Let’s break this down.

List.Difference generates the items that are different between two provided lists.  And fortunately, when we feed a column to the function, it passes it in as a list.  So that’s what you’re seeing there:

  • Source[Issued] is the Issued column from the Source step of our query
  • Bank[Cleared] is the Cleared column from the Bank step of our query

And the result is the only items that don’t exist in both lists.

Expanding the Details

As great as this is, it is returning a list of values.  We want to convert this back into a table, and get the original data of issue as well.  So let’s do that.

  • Go to Transform –> To Table

image

  • When prompted, select that the list has a Custom Delimiter of a dash and click OK

You should now have a nice table split into two columns:

image

Let’s clean this up:

  • Right click Column1 –> Rename –> Cheque
  • Right click Column2 –> Rename –> Amount

Now, the next tricky part is getting the issue date back in.  I’d like to feed this from the current query – just to keep it self contained – but it’s easier to start by merging it with another query.

  • Go to Home –> Merge Queries –> GLListing
  • Choose to merge based on the Cheque field on both tables
  • Make sure you check to “Only include matching rows”

image

  • Click OK

This works nicely to add our column, but we’ve already pulled this data into this query once, so why reach outside it again?  If you look in the formula bar, you can see that the formula reads as follows:

= Table.NestedJoin(#"Renamed Columns",{"Cheque"},GLListing,{"Cheque"},"NewColumn",JoinKind.Inner)

Highlighted in the middle of the text is the name of the table we merged into this one.  So why not just replace that with the step name from this query that has the same table?  Modify the formula in the formula bar to read:

= Table.NestedJoin(#"Renamed Columns",{"Cheque"},Source,{"Cheque"},"NewColumn",JoinKind.Inner)

It doesn’t look like anything happened, does it?  That’s okay.  Remember that the source step just pulls in the data from the GLListing query.  Since we didn’t do anything to it in that step, it SHOULD look identical.

Now we can continue on and finalize the query:

  • Expand the “NewColumn” column:
    • Only expand the Date column, as we have the others we need
    • Uncheck the “Use original column name as prefix” setting
  • Move the Date column between then Cheque and Amount columns

image

  • Rename the query to “Outstanding”
  • Go to Home –> Close & Load

And the final result:

SNAGHTML1b93e715

Final Thoughts

Figuring out which records match is actually pretty easy.  We simply merge two tables, and choose to only include matching rows.  Working out differences is obviously a bit harder.  (Wouldn’t it be awesome if there was an inverse setting on that merge dialog that let us only include unmatched rows?)

I left my full time controllership job before I ever got the chance to implement this technique for our bank reconciliations.  Currently there is a lot of VBA and manual work needed to clear both the cheques and deposits on a monthly basis.  Given this, however, I know that I could have re-written the bank reconciliation to very quickly eliminate all the records that match, leaving me with only the transactions that I actually needed to focus on.

Taking it even one step further, with another table of adjustments added in to the mix, I’m sure I could build it to actually produce an ever diminishing listing of un-reconciled transactions, and most likely even an output report replicating a full bank reconciliation.  Pretty cool, especially when you consider how much could be refreshed when you start the process next month!