MultipleUnclassified/Trusted error

If you’ve been using my fnGetParameter function to build dynamic content into your queries, you may have noticed that passing some variables recently started triggering an error message that reads as follows:

We couldn’t refresh the connection ‘Power Query – Categories’. Here’s the error message we got:

Query ‘Staging-Categories’ (Step ‘Added Index’) used ‘SingleUnclassified/Trusted/CurrentWorkbook/’ data when last evaluated, but now is attempting to use ‘MultipleUnclassified/Trusted’ data.

If you’re like me, that was promptly followed by cursing and head scratching in the pursuit of a solution.  (I would suggest clicking the “was this information helpful?” link at the bottom of the error, and submitting a “no”.)

image

As far as I can tell, this error started showing up in Power Query around v 2.21 or so.  (I don’t have an exact version.)  I know that it existed in 2.22 for sure, and still exists in version 2.23.

Triggering the MultipleUnclassified/Trusted error

Triggering the error isn’t difficult, but it does seem to show up out of the blue.  To throw it, I built a solution that uses my fnGetParameter function to read the file path from a cell, and feed it into a query as follows:

let
    fPath = fnGetParameter("FilePath"),
    Source = Csv.Document(File.Contents(fPath & "SalesCategories.txt"),null,",",null,1252),
    #"First Row as Header" = Table.PromoteHeaders(Source),
    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",{{"POSPartitionCode", Int64.Type}, {"POSCategoryCode", type text}, {"POSCategoryDescription", type text}, {"POSReportingGroupCode", type text}, {"POSTaxTypeCode", type text}}),
    #"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Lnk_Category", each Text.Combine({Text.From([POSPartitionCode], "en-US"), [POSCategoryCode]}, "-"), type text),
    #"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"POSCategoryCode"}),
    #"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1)
in
    #"Added Index"

The killer here is that – when you originally build the query – it works just fine!  But when you refresh it you trigger the error.

Fixing the MultipleUnclassified/Trusted error

There are two options to fix this particular error:

Option 1

The first method to fix this problem is to avoid the fnGetParameter function altogether and just hard-code the file paths.  While this works, you cut all dynamic capability from the query that you went to the effort of implementing.  In my opinion, this option is awful.

Option 2

If you want to preserve the dynamic nature of the fnGetParameter function, the only way to fix this error today is to perform the steps below in this EXACT order!

  1. Turn on Fast Combine (Power Query –> Options –> Privacy –> Ignore Privacy Levels)
  2. Save the workbook
  3. Close Excel
  4. Restart Excel
  5. Refresh the query

Each of the steps above is critical to ensure that transient caches are repopulated with the “fast combine” option enabled – merely setting the “fast combine” option is not enough on its own.

Expected Bug Fix

Microsoft is aware of this, and has been working on a fix.  I believe (although no guarantees) that it should be released in version 2.24.  Given the update schedule in the past, we should see it any day now.  Fingers crossed, as this is just killing my solutions!

I’ll keep checking the Power Query download page and post back when the new version goes live.

Create a Dynamic Calendar Table

I know this topic has been covered before, but I’m teaching a course on Power Pivot tomorrow, and it’s something that I’ll probably be bringing up.  As we need a calendar table for our Power Pivot solutions, a method to create a dynamic calendar table is pretty important.  If you haven’t seen this before, I think you’ll be surprised at how easy it is to create a complete calendar driven by only a few Excel formulas and Power Query.

Setting up a dynamic source

The first thing we need to do is set up a parameter table to hold the start and end dates.  To do that, I’m creating a basic parameter table as described in this post.  Mine looks like this:

image

  • B3 is simply a hard coded January 1, 2014
  • B4 contains the =TODAY() function, returning today’s date

Once I created this table, I named it “Parameters” (as described in the aforementioned blog post), then created the fnGetParameter function in Power Query (again, as described in the aforementioned blog post.)
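If you don’t have that post handy, here is a minimal sketch of what a function like fnGetParameter could look like.  It assumes the Excel table is named Parameters and has two columns headed Parameter and Value; those column headers are an assumption on my part, so adjust them to match your own table.

```powerquery
// Sketch only: assumes an Excel table named "Parameters"
// with columns headed "Parameter" and "Value" (assumed headers).
(ParameterName as text) =>
let
    // Read the Parameters table from the current workbook
    ParamSource = Excel.CurrentWorkbook(){[Name = "Parameters"]}[Content],
    // Keep only the row whose Parameter cell matches the requested name
    ParamRow = Table.SelectRows(ParamSource, each [Parameter] = ParameterName),
    // Return null when the name is not found, otherwise the matching Value
    Value = if Table.IsEmpty(ParamRow) then null else ParamRow{0}[Value]
in
    Value
```

Saved as a query named fnGetParameter, it would be called as fnGetParameter("Start Date").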

With that work done, it was time to move on to creating my calendar.

How to create a dynamic calendar table in Power Query

What I did at this point was create a new blank Power Query:

  • Power Query –> From Other Sources –> Blank Query

In the formula bar, I created a simple list by typing the following:

={1..10}

image

As you’ll see if you try this, it creates a simple list from 1 through 10.  That’s great, but it’s just a temporary placeholder.  Now I need to get my hands a bit dirty… I want to use the fnGetParameter function to load in my start and end dates as date serial numbers (not actual dates.)  To do this, I’ll retrieve them and explicitly force them to be Numbers.

  • Go to View –> Advanced Editor
  • Insert two new lines as follows:

let
    StartDate = Number.From(fnGetParameter("Start Date")),
    EndDate = Number.From(fnGetParameter("End Date")),
    Source = {1..10}
in
    Source

So, as you can see, we’ve used the fnGetParameter function to retrieve the Start Date and End Date values from the Excel table, then converted them to the date serial numbers (values) by using Number.From.

With that in place, we can then sub the StartDate and EndDate variables into the list that we created in the Source step:

let
    StartDate = Number.From(fnGetParameter("Start Date")),
    EndDate = Number.From(fnGetParameter("End Date")),
    Source = {StartDate..EndDate}
in
    Source

And when we click OK, we now have a nice list of numbers that spans from the first date serial number to the last from the Excel table:

image

“Great,” you’re thinking, “but I really want dates, not the date serial numbers.”  No problem, let’s do that.

First we need to convert our list to a table:

  • Go to Transform –> Convert to Table –> Click OK (with the default options)

image

  • Select Column1 –> Transform –> Data Type –> Date
  • Right click Column1 –> Rename –> Date

And look at that!

image
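If you’d rather see those clicks as code, the sketch below does roughly the same thing in one query.  To keep it self-contained it uses two hard-coded dates instead of fnGetParameter, and the step names (ToTable, Typed) are just names I picked; the UI will generate its own.

```powerquery
let
    // Hypothetical fixed dates so the sketch runs without the Parameters table
    StartDate = Number.From(#date(2014, 1, 1)),
    EndDate = Number.From(#date(2014, 1, 10)),
    // One serial number per day in the range
    Source = {StartDate..EndDate},
    // Convert the list to a single-column table and call the column Date
    ToTable = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}),
    // Turn the serial numbers into real dates
    Typed = Table.TransformColumnTypes(ToTable, {{"Date", type date}})
in
    Typed
```

Swap the two hard-coded dates for the fnGetParameter calls and the table follows whatever is in the Excel cells.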

And now it’s just a matter of adding the different date columns we need.  If you select the Date column, you’ll find a great variety of formats available under Add Column –> Date.  Just browse into the subcategory you want (year, month, day, week) and choose the piece you want to add.  In the table below, I added:

  • Date –> Year –> Year
  • Date –> Month –> Month
  • Date –> Month –> End of Month

image

There are a lot of transformations for a variety of dates built in… for numeric or date values.  One thing that’s missing, though, is text versions.  For those you need to add a custom column.  Here are three formulas that you may find useful if you want to add text dates to your table:

  • Date.ToText([Date],"ddd")
  • Date.ToText([Date],"MMM")
  • Date.ToText([Date],"MMMM dd, yyyy")

To use them, go to Add Column –> Add Custom Column and provide those as the formula.  Their results add a bit more useful data to our query:

image

As you can see, they work like Excel’s TEXT function, except that the characters are case sensitive.
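For reference, here is a small sketch of those three custom columns written out as query steps.  The two sample dates and the new column names (Day Name, Month Name, Long Date) are made up for illustration, and the text you get back depends on your locale.

```powerquery
let
    // Hypothetical two-row calendar to try the formats on
    Source = #table(type table [Date = date], {{#date(2014, 1, 1)}, {#date(2014, 2, 14)}}),
    // Short day name
    DayName = Table.AddColumn(Source, "Day Name", each Date.ToText([Date], "ddd"), type text),
    // Short month name
    MonthName = Table.AddColumn(DayName, "Month Name", each Date.ToText([Date], "MMM"), type text),
    // Full month name, two-digit day and four-digit year
    LongDate = Table.AddColumn(MonthName, "Long Date", each Date.ToText([Date], "MMMM dd, yyyy"), type text)
in
    LongDate
```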

Conclusion

Overall, it’s super easy to create a dynamic calendar table using Power Query to read the start and end date from Excel cells.  This makes it very easy to scope your calendar to only have the date range you need, and also gives you the ability to quickly add columns on the fly for formats that you discover you need, rather than importing a massive calendar with a ton of formats that you will never use.

In addition to being easy, it’s also lightning quick if you’re prepared.  It takes seconds to create the Excel parameter table, a few more seconds to set up the fnGetParameter function (if you have the code stored in a text file/bookmarked), and only a little while longer to create the original list and plumb in the variables once you’re used to it.  I can knock up a calendar like this in less than two minutes, and let it serve my data model ever after.  :)

PowerQuery.Training

I should also mention that this is one of the techniques (amongst MANY others) that we cover in our PowerQuery.Training course.  We’ll be announcing a new intake soon, so don’t forget to sign up for the newsletter in the footer of the site so you’ll know when that happens!

Listing Outstanding Cheques

Power Query is all about transforming and filtering data, and automating the process.  One of the tedious tasks that accountants get to deal with all the time is bank reconciliations, which is essentially the process of filtering and matching items to see what is left over.  It’s been on my list for a while now, but I’ve been thinking that we can use Power Query for listing outstanding cheques (or checks if you’re in the USA).

The completed workbook is available for download by clicking here.

Background

Because I don’t want to run on to 100 pages, I’m going to start with two lists that show just the cheques, in two tables:

image

The table on the left is the table of cheques that have been issued, as per the list maintained in the General Ledger. We’re making the assumption that we’ve dumped that list into an Excel worksheet, formatted it as a table, and given the table the name “GLListing”.

The table on the right is the table of the cheques that have cleared the bank.  Again, the assumption is that we’ve been able to download a list of the transactions, and filtered them down to show just the cheques that have cleared.  (Maybe we’d even use Power Query to do this.)  This table has been named “Bank”.

Creating the data Staging queries

The first step is to create the staging queries to connect to these two tables.  One of the important things I wanted to ensure is that I can match transactions where both the cheque number and amount are identical.  (If a cheque clears for the wrong amount, I want to list it as outstanding at this point, as I need to review it.)  I’m going to keep that in mind as I create my staging tables.

The GLListing table:

To set this up I:

  • Clicked inside the GLListing table –> Power Query –> From Table
  • Set the data types on each column (Whole Number, Date, Decimal)
  • Selected the Cheque and Amount columns –> Add Column –> Merge
    • Separator:  Custom (I used a dash)
    • Name:  Issued

The end result in Power Query:

image

  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection
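Behind the scenes, the Merge step boils down to a Table.AddColumn call.  Here is a minimal sketch of it; the sample rows are invented so the query runs on its own, and the exact step the UI writes for you may differ slightly.

```powerquery
let
    // Hypothetical sample rows standing in for the GLListing Excel table
    Source = #table(
        type table [Cheque = Int64.Type, Date = date, Amount = number],
        {
            {1001, #date(2015, 1, 5), 250},
            {1002, #date(2015, 1, 7), 75.5}
        }
    ),
    // Join the cheque number and amount with a dash, e.g. 1001-250
    Merged = Table.AddColumn(
        Source,
        "Issued",
        each Text.Combine({Text.From([Cheque], "en-US"), Text.From([Amount], "en-US")}, "-"),
        type text
    )
in
    Merged
```

Because the key combines both values, a cheque that clears for a different amount gets a different key and won’t match.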

The Bank table:

It’s virtually the same process:

  • Clicked inside the Bank table –> Power Query –> From Table
  • Set the data types on each column (Whole Number, Date, Decimal)
  • Selected the Cheque and Amount columns –> Add Column –> Merge
    • Separator:  Custom (I used a dash)
    • Name:  Cleared

The end result in Power Query:

image

  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

Listing Outstanding Cheques

Now to build the important part.

  • In the Workbook Queries pane, right click the GLListing query –> Reference

At this point you’ll have a pointer to the GLListing table.  We also want a pointer to the Bank table.  To do that, let’s click the fx icon on the formula bar:

image

This will create a new step in your query.  The formula in the formula bar will read =Source (it refers to the previous step), and you’ll see a new step in your Applied Steps area called Custom1.  Let’s update both of those:

  • Change the formula to =Bank
  • Right click and rename the step to “Bank”

image

The key things we now have are a Source step (which contains the output of the GLListing query) and a Bank step (which contains the output of the Bank query).  The Source step has an Issued column, the Bank step a Cleared column, and we’d like to know which items between those two columns are different.

To work this out we’re going to knock up a little M code.  Here’s how:

  • Click the fx icon on the formula bar
  • Replace the formula (it will read =Bank) with this:

=List.Difference(Source[Issued],Bank[Cleared])

The result will be as follows:

image

So what happened here?  Let’s break this down.

List.Difference returns the items from the first list that don’t appear in the second list.  And fortunately, when we feed a column to the function, it passes it in as a list.  So that’s what you’re seeing there:

  • Source[Issued] is the Issued column from the Source step of our query
  • Bank[Cleared] is the Cleared column from the Bank step of our query

And the result is only those items from the Issued list that don’t appear in the Cleared list.
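To see the behaviour in isolation, here is a tiny sketch with two hard-coded lists (the cheque keys are made up).  Note that it only works one way: an item that appears in the second list but not the first is not returned.

```powerquery
let
    // Hypothetical cheque-amount keys
    Issued = {"1001-250", "1002-75.5", "1003-1200"},
    Cleared = {"1001-250", "1003-1200", "1004-40"},
    // Items of Issued that have no match in Cleared
    Outstanding = List.Difference(Issued, Cleared)
in
    Outstanding
```

This returns a one-item list containing 1002-75.5.  The stray 1004-40 on the bank side is ignored.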

Expanding the Details

As great as this is, it is returning a list of values.  We want to convert this back into a table, and get the original date of issue as well.  So let’s do that.

  • Go to Transform –> To Table

image

  • When prompted, select that the list has a Custom Delimiter of a dash and click OK

You should now have a nice table split into two columns:

image

Let’s clean this up:

  • Right click Column1 –> Rename –> Cheque
  • Right click Column2 –> Rename –> Amount

Now, the next tricky part is getting the issue date back in.  I’d like to feed this from the current query – just to keep it self contained – but it’s easier to start by merging it with another query.

  • Go to Home –> Merge Queries –> GLListing
  • Choose to merge based on the Cheque field on both tables
  • Make sure you check to “Only include matching rows”

image

  • Click OK

This works nicely to add our column, but we’ve already pulled this data into this query once, so why reach outside it again?  If you look in the formula bar, you can see that the formula reads as follows:

= Table.NestedJoin(#"Renamed Columns",{"Cheque"},GLListing,{"Cheque"},"NewColumn",JoinKind.Inner)

In the middle of the formula is the name of the table we merged into this one (GLListing).  So why not just replace that with the step name from this query that has the same table?  Modify the formula in the formula bar to read:

= Table.NestedJoin(#"Renamed Columns",{"Cheque"},Source,{"Cheque"},"NewColumn",JoinKind.Inner)

It doesn’t look like anything happened, does it?  That’s okay.  Remember that the source step just pulls in the data from the GLListing query.  Since we didn’t do anything to it in that step, it SHOULD look identical.

Now we can continue on and finalize the query:

  • Expand the “NewColumn” column:
    • Only expand the Date column, as we have the others we need
    • Uncheck the “Use original column name as prefix” setting
  • Move the Date column between the Cheque and Amount columns

image

  • Rename the query to “Outstanding”
  • Go to Home –> Close & Load

And the final result:

image

Final Thoughts

Figuring out which records match is actually pretty easy.  We simply merge two tables, and choose to only include matching rows.  Working out differences is obviously a bit harder.  (Wouldn’t it be awesome if there was an inverse setting on that merge dialog that let us only include unmatched rows?)
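For what it’s worth, the M language does define a JoinKind.LeftAnti value, even where the merge dialog doesn’t offer it.  If your version of Power Query accepts it, a sketch like the one below would keep only the unmatched rows.  The sample tables are invented, and this is an alternative to the List.Difference route above, not what the workbook uses.

```powerquery
let
    // Hypothetical stand-ins for the GLListing and Bank queries
    GL = #table(type table [Cheque = Int64.Type, Amount = number], {{1001, 250}, {1002, 75.5}, {1003, 1200}}),
    Bank = #table(type table [Cheque = Int64.Type, Amount = number], {{1001, 250}, {1003, 1200}}),
    // Keep only the GL rows with no matching cheque and amount in Bank
    Joined = Table.NestedJoin(GL, {"Cheque", "Amount"}, Bank, {"Cheque", "Amount"}, "BankRows", JoinKind.LeftAnti),
    // The nested column is empty for every remaining row, so drop it
    Outstanding = Table.RemoveColumns(Joined, {"BankRows"})
in
    Outstanding
```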

I left my full time controllership job before I ever got the chance to implement this technique for our bank reconciliations.  Currently there is a lot of VBA and manual work needed to clear both the cheques and deposits on a monthly basis.  Given this, however, I know that I could have re-written the bank reconciliation to very quickly eliminate all the records that match, leaving me with only the transactions that I actually needed to focus on.

Taking it even one step further, with another table of adjustments added in to the mix, I’m sure I could build it to actually produce an ever diminishing listing of un-reconciled transactions, and most likely even an output report replicating a full bank reconciliation.  Pretty cool, especially when you consider how much could be refreshed when you start the process next month!

Calculate Hours Worked

Chandoo posted an interesting challenge on his blog last Friday, challenging users to calculate hours worked for an employee named Billy.  This example resonated with me for a couple of reasons.  The first is that I’ve had to do this kind of stuff in the past, the second is because I’ve got a new toy I’d use to do it.  (Yup… that toy would be Power Query.)

It always blows my mind how many people respond on Chandoo’s blog.  As the answers were pouring in, I decided to tackle the issue my own way too.  I thought I’d share a bit more detailed version of that here as I think many users still struggle with time in Excel.

Background and Excel Formula Solution

Chandoo provided a sample file on his blog, so I downloaded it.  The basic table of data looks like this:

image

Now, for anyone who has done this a long time, there are a few key pieces to solving this:

  • The recognition that all times are fractions of days,
  • The recognition that if you omit the day it defaults to 1900-01-01, and
  • The data includes End times that are intended to be the day following the Start time

The tricks we use to deal with this are:

  • Test if the End time is less than the Start time.  If so, add a day.  (This allows us to subtract the Start from the End and get the difference in hours.)
  • Multiply the hours by 24.  (This allows us to convert the fractional time into a number that represents hours instead of fractions of a day.)

Easy enough, and the following submitted formula (copied down from F4:F9 and summed) works:

=(D4+IF(C4>D4,1,0)-C4)*24

image

Also, there was a great comment that Billy shouldn’t get paid for his lunch break.  Where I used to work (before I went out on my own), we had a rule that if you worked any more than 4 hours you MUST take a lunch break.  Plumbing in that logic, we’d need a different formula.  There’s lots that would work, and this is one:

=((D4+IF(C4>D4,1,0)-C4)*24)-IF(((D4+IF(C4>D4,1,0)-C4)*24)>4,1,0)

image

So why Power Query?

If we can do this in Excel, why would we cook up a Power Query solution?  Easy.  Because I’m tired of having to actually write the formula every time Billy sends me his timesheet. Formula work is subject to error, so why not essentially automate the solution?

Using Power Query to Calculate Hours Worked

Okay, first thing I’m going to do is set up a system.  I’ll set up a template, email it to Billy and get him to fill it out and email it back to me every two weeks.  I’ll save the file in a folder, and get to work.

  • Open a blank workbook –> Power Query –> From File –> From Excel
  • Browse and locate the file
  • Select the “Billy” worksheet (Ok, to be fair, it would probably be called Sheet1 in my template)

image

  • Click Edit

And now the fun begins…

  • Home –> Remove Rows –> Remove Top Rows –> 2
  • Transform –> Use First Row as Headers
  • Filter the Day column to remove (null) values
  • Select the Day:End columns –> right click –> Remove Other Columns

And we’ve now got a nice table of data to start with:

image

Not bad, but the data type for the Start and End columns is set to “any”.  That’s bad news to me, as I want to do some Date/Time math.  The secret here is that we need our values to be Date/Times (not just times), so let’s force that format on them, just to be safe:

  • Select Start:End –> Transform –> Date/Time

Next, we need to test if the Start Date occurs after the End Date.  Let’s use one step to test that and add one day if it’s true:

  • Add Column –> Add Custom Column
    • Name:  Custom
    • Formula:  =if [Start]>[End] then Date.AddDays([End],1) else [End]

So basically, we add 1 day to the End date if the Start time is greater than the End time.  Once we’ve done that, we can:

  • Right click the End column –> Remove
  • Right click the Custom column –> Rename –> End

And, as you can see, we’ve got 3 records that have been increased by a day (they are showing 12/31/1899 instead of 12/30/1899):

image

Good stuff, let’s figure out the difference between these two. The order of the next 3 steps is important…

  • Select the End column
  • Hold down the CTRL key and select the Start column
  • Go to Add Column –> Time –> Subtract

Because we selected the Start column second, it is subtracted from the End column we selected first:

image

Now we can set the Start and End columns so that they only show times, as we don’t need the date portion any more.  In addition, we want to convert the TimeDifference to hours:

  • Select the Start:End columns –> Transform –> Time
  • Select the TimeDifference column –> Transform –> Decimal Number

Hmm… that didn’t work quite as cleanly as we’d like:

image

Ah… but times are fractions of days, right?  Let’s multiply this column by 24 and see what happens:

  • With the TimeDifference column selected:  Transform –> Standard –> Multiply –> 24
  • Right click the TimeDifference column –> Rename –> Hours

Nice!

image

Oh… but what about those breaks?

  • Add Column –> Add Custom Column
    • Name:  Breaks
    • Formula:  =if [Hours]>4 then -1 else 0
  • Add Column –> Add Custom Column
    • Name:  Net Hours
    • Formula:  =[Hours]+[Breaks]

And here we go:

image
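Pulled together, the logic of the steps above looks something like the sketch below.  It is not the exact code the UI generates: the two sample shifts are invented, the corrected end time goes in a column I’ve called EndFixed, and I’ve used Duration.TotalHours rather than converting to a decimal and multiplying by 24.

```powerquery
let
    // Hypothetical two-shift timesheet; dates default to 12/30/1899 as in the post
    Source = #table(
        type table [Day = text, Start = datetime, End = datetime],
        {
            {"Mon", #datetime(1899, 12, 30, 8, 0, 0), #datetime(1899, 12, 30, 17, 0, 0)},
            {"Tue", #datetime(1899, 12, 30, 22, 0, 0), #datetime(1899, 12, 30, 6, 0, 0)}
        }
    ),
    // If the shift runs past midnight, push the end time to the next day
    FixedEnd = Table.AddColumn(Source, "EndFixed", each if [Start] > [End] then Date.AddDays([End], 1) else [End], type datetime),
    // Subtracting two datetimes gives a duration; convert it to hours
    Hours = Table.AddColumn(FixedEnd, "Hours", each Duration.TotalHours([EndFixed] - [Start]), type number),
    // Anything over 4 hours loses an hour for lunch
    Breaks = Table.AddColumn(Hours, "Breaks", each if [Hours] > 4 then -1 else 0, Int64.Type),
    NetHours = Table.AddColumn(Breaks, "Net Hours", each [Hours] + [Breaks], type number)
in
    NetHours
```

With these sample rows, Monday works out to 9 hours (8 net) and the overnight Tuesday shift to 8 hours (7 net).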

At this point I would generally:

  • Change the name of the query to something like:  Timesheet
  • Close and Load to a Table
  • Add a total row to the table

image

But just in case you only cared about the total of the Net Hours column, we could totally do that in Power Query as well.  Even though it’s not something I would do (I’m sure Billy would trust YOU implicitly and never want to see the support that proved you added things up correctly…), here’s how you’d do it:

  • Go to Transform –> Group By
  • Click the – character next to the Day label to remove that grouping level
  • Set up the Grouping column:
    • Name:  Net Hours
    • Operation:  Sum
    • Column:  Net Hours

Here’s what it looks like if you set the column details up first, indicating where to click to remove the grouping level:

image

And the result after you click OK:

image
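In code, a Group By with no grouping columns is simply a Table.Group call with an empty key list.  A minimal sketch, using a made-up three-row timesheet:

```powerquery
let
    // Hypothetical output of the timesheet query
    Timesheet = #table(
        type table [Day = text, #"Net Hours" = number],
        {{"Mon", 8}, {"Tue", 7}, {"Wed", 3.5}}
    ),
    // No grouping columns, so the whole table collapses to a single row
    Total = Table.Group(Timesheet, {}, {{"Net Hours", each List.Sum([Net Hours]), type number}})
in
    Total
```

For these rows the result is a one-row table with Net Hours of 18.5.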

Holy Cow that’s a LOT of Work!?!

Not really.  Honestly, it took me about a minute to cook it up.  (And a LOT longer to write this post.)  But even better, this work was actually an investment.  Next time I get a timesheet, I just save it over the old one, open this file, right click the table and click Refresh.  Done, dusted, finished and time to move on to more challenging problems.

Even better, if I wanted to get really serious with it, I could implement a parameter function to make the file path relative to the file, and then I could pass it off to someone else to refresh.  Or automate the refresh completely.  After all, why write formulas every month if you don’t have to?

:)

Naming Conflict Fun

Jeff Weir published a post at DDOE last week on global names freaking out when a local name is encountered.  It reminded me that I ran into something similar when I was testing text functions in Power Query a while back; a naming conflict when I created a table from Power Query.

Interestingly, I can replicate this without Power Query at all using just native table functionality.

Set up a Table

To begin with I created a very simple table:

image

Then I gave the table a name.  In this case, for whatever reason, I chose “mid”:

image

Enter Wonkiness (Naming Conflict)…

Okay, it’s a weird table name.  I get that.  But in my original example I was comparing Power Query’s Text.Range function with the MID function, which is why I named my table mid…  anyway…

Add a new column and type in =MID

image

You can see that we’ve plainly chosen the MID that refers to the function, not the table.  I even set the capitalization correctly to make sure I got the right one.  Now complete the formula:

=MID([@Product],3,1)

image

And press Enter:

image

Nice!  Apparently Excel is too smart for its own good and overrules the interpretation of built-in functions with table names, resulting in a #REF! error.

Fixing the issue

The solution to fix this should be pretty obvious… rename the table.  When you do, you’ll see that it also updates the formula:

image

Plainly Excel was very confused! So now we just need to fix the formula:

image

And we’re good.  :)

End Thoughts

To cause this issue from Power Query, you simply need to give your query a name that conflicts with an Excel function (like MID).  When it's loaded to an Excel table, that table inherits the query name as the table name.

The naming conflict issue has probably existed since tables were implemented in Excel.  It’s not good, but at the same time, it’s taken me a long time to trip on this, as I don’t usually use a table name that conflicts with a built in function name… at least not one that I use.

Long story short:  Avoid naming your tables (or Power Queries) after Excel function names.  😉

Retrieve Related Tables in Power Query

As I was working on one of the assignments for our upcoming Power Query training course, it occurred to me that I’ve never blogged about this feature: how to retrieve related tables in Power Query.

Retrieve Related Tables in Access

If you’re using Access, you’re stuffed.  I mean… you can do it manually then merge the tables.  Here’s the database I connected to:

image

Yet when I pull the tblChits table, I get the following columns:

image

All the columns from the tblChits table, but nothing from the other tables.  Too bad really.

So basically, you can still retrieve related tables in Access, you just need to create a connection to each table, then merge the queries manually.  The steps to do this, in brief:

  • Create a “Connection Only” query called Chits that points to the tblChits table
  • Create a “Connection Only” query called Items that points to the tblItems table
  • Create a “Connection Only” query called Categories that points to the tblCategories table
  • Create a new query that references the Chits query
  • Merge the Items query, making the relationship between the POSItemCode in both queries
  • Merge the Categories query, making the relationship between the POSCategoryCode in both queries
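As a rough sketch, the two merges might look like this in M.  The three small tables are invented stand-ins for the connection-only queries, and I’m assuming the category code reaches the chits by way of the items table; your own schema decides which columns are really there.

```powerquery
let
    // Hypothetical stand-ins for the Chits, Items and Categories queries
    Chits = #table(type table [ChitID = Int64.Type, POSItemCode = text], {{1, "A10"}, {2, "B20"}}),
    Items = #table(type table [POSItemCode = text, POSCategoryCode = text], {{"A10", "FOOD"}, {"B20", "BEV"}}),
    Categories = #table(type table [POSCategoryCode = text, POSCategoryDescription = text], {{"FOOD", "Food"}, {"BEV", "Beverages"}}),
    // Merge Items on POSItemCode, then pull the category code across
    WithItems = Table.NestedJoin(Chits, {"POSItemCode"}, Items, {"POSItemCode"}, "Items", JoinKind.LeftOuter),
    ExpandedItems = Table.ExpandTableColumn(WithItems, "Items", {"POSCategoryCode"}),
    // Merge Categories on POSCategoryCode, then pull the description across
    WithCategories = Table.NestedJoin(ExpandedItems, {"POSCategoryCode"}, Categories, {"POSCategoryCode"}, "Categories", JoinKind.LeftOuter),
    Result = Table.ExpandTableColumn(WithCategories, "Categories", {"POSCategoryDescription"})
in
    Result
```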

So basically, we can still retrieve related tables in Access, we just need to understand the relationships in our database and merge them manually.

Retrieve Related Tables in SQL

In SQL Server, it actually pulls some things over for you nicely.

In this case I connected to the AdventureWorks database that we’re hosting in Windows Azure for course participants to use.  (Does anyone else provide an Azure hosted database to practice with?)  Specifically, I:

  • Connected to the Azure hosted AdventureWorks database
  • Selected the SalesOrderHeader table (only)
  • Filtered down the dataset to get a shorter table
  • Removed a bunch of unnecessary columns from the SalesOrderHeader table

And look what I’m left with:

image

The Sales.SalesOrderDetail column contains a full table of related records for each row in my table.  Those related records come from a completely separate table that I didn’t even ask for.  How awesome is that?

image

And the Sales.SalesTerritory column shows a “value” for each, which also has more data that can be expanded (including sub tables):

image

Pretty slick, and saves me the effort of having to sleuth out and perform the joins manually.  I sure wish Access had this ability as well!

Don’t have an Azure database?

This technique works for SQL Server (on prem) as well.  I’m not 100% sure which other databases support this technique, as I don’t have access to others, but if you do know, please comment here and I’ll add them to the list.

And if you really wish you had access to try out sourcing data from an Azure database… it’s not too late to sign up for our course!

Data From Different TimeZones

A friend of mine emailed yesterday asking how to compare data from different timezones.  With how good the UI is in Power Query, you’d think this would be easy.  Unfortunately it’s a bit less than that, so I thought it would make a good example for today’s post.

Background

Let’s assume that we’ve got two columns of data; an Order Date and a Shipping Date.  We’d like to work out the number of days it took to ship our order.  Easy enough, we just need to subtract one from the other… except… the system that holds the Order Date reports it in UTC +0:00, and the shipping date is done from my home time zone (UTC –7:00).

The data table we’re starting with looks like this:

image

And you can download a copy of the workbook from my OneDrive here if you’d like to follow along.

Avoiding Temptation

So the first thing to do is pull the data in to Power Query.  So I clicked in the table, went to the Power Query tab, and chose From Table.  At this point we’re greeted with a nice table, and our first temptation is to go directly to the Transform tab and set the Data Type to Date/Time/Timezone:

image

And herein lies a problem.  The system has forced my local TimeZone on the data.  As specified in the initial problem, I need this to carry a UTC +0:00 distinction.

It’s a shame that there is no intermediate step here (how often do I ask for MORE clicks?) that would allow you to specify WHICH TimeZone.  If you’re into working with data from different regions (i.e. this feature), I don’t think I’m venturing out on a limb to say that this is pretty important.

To further complicate things, that is the extent of the TimeZone functionality in the UI.  And that’s not going to help us.  So let’s knock off the “Changed Type” step and look at this another way.

Using M to Deal with Data From Different TimeZones

The secret to making this work is to take explicit control of the time zone settings using some Power Query M functions.  It’s not as hard as it sounds.  In fact, we’re only going to use two in this example:

  • DateTime.AddZone to add a time zone to a DateTime data type
  • DateTimeZone.SwitchZone to convert from one time zone to another

I discovered both of these functions by searching the Power Query formula categories article on Microsoft’s site.

Forcing a DateTime to a Specific Time Zone

So we’re currently looking at this data in Power Query:

image

Let’s create a new column to convert the OrderDate:

  • Add Column –> Add Custom Column
    • Name:  OrderDate (UTC +0:00)
    • Formula:  =DateTime.AddZone([OrderDate],0)

The secret here is in the last parameter, as we get to specify the time zone.  Since we know these dates/times come out of our system in UTC +0:00, we’re good to not add anything to it.  The result is shown below:

image

Converting a DateTime to a Different Time Zone

Now, in order to be able to compare our DateTimes easily, we want them both to be based in our own time zone.  Since my business works in UTC –7:00, I really want my Order Date represented in that time zone as well.  So let’s convert it.

  • Add Column –> Add Custom Column
    • Name:  OrderDate (UTC -7:00)
    • Formula:  =DateTimeZone.SwitchZone([#"OrderDate (UTC +0:00)"],-7)

image

Beautiful.

Just a note here… It may have been tempting to force this data to UTC –7:00 when we added the time zone above, but that would have assigned the date based in the wrong time zone.  I.e. our first record would have returned 7/4/1996 1:12:00 PM –07:00, which is not the same as what we ended up with.
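A quick sketch makes the difference concrete.  It uses the first record’s order time quoted above; the step and field names are just labels I chose.

```powerquery
let
    // First record's order time, which the source system reports in UTC +0:00
    OrderDate = #datetime(1996, 7, 4, 13, 12, 0),
    // Label the value as UTC without changing the clock time
    AsUtc = DateTime.AddZone(OrderDate, 0),
    // Same instant, expressed in UTC -7:00 (the clock time shifts back 7 hours)
    InLocal = DateTimeZone.SwitchZone(AsUtc, -7),
    // Forcing -7 directly keeps 1:12 PM but stamps it -07:00, a different instant
    Forced = DateTime.AddZone(OrderDate, -7),
    // How far the forced value sits from the real one
    GapDuration = Forced - AsUtc
in
    [Utc = AsUtc, Local = InLocal, ForcedLocal = Forced, Gap = GapDuration]
```

The Gap field comes back as a duration of seven hours, which is exactly the error you’d bake into every shipping calculation by forcing the wrong zone.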

Forcing another DateTime to a Different Time Zone

Now we need to deal with the ShippedDate column, forcing that to my local time.  I could just select the column and turn it into a Date/Time/Timezone data type, but I won’t.  Why?  What if I send this workbook to another user?  It will return THEIR time zone, not mine.  And that could be different.  Much better to explicitly set it.

  • Add Column –> Add Custom Column
    • Name:  ShippedDate (UTC -7:00)
    • Formula:  =DateTime.AddZone([ShippedDate],-7)

Notice that this time we do force it to be in the –7 time zone, as these DateTimes originated from that time zone. The result:

SNAGHTML1f0e9e46

Fantastic.  We’ve added time zone data, without changing the original times.

Let’s just go do a little bit of cleanup now:

  • Select the OrderDate and ShippedDate columns
  • Transform –> Data Type –> Date/Time
  • Select OrderDate (UTC +0:00) through ShippedDate (UTC -7:00)
  • Transform –> Data Type –> Date/Time/Timezone

Excellent.  Now they should show up correctly when we load them to an Excel table instead of losing their formatting.
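For reference, those Data Type changes boil down to a Table.TransformColumnTypes call in M.  Here’s a minimal sketch, assuming a one-row stand-in table with made-up values and only two of the columns:

```powerquery
let
    // Hypothetical stand-in for the query so far (values are sample data)
    Source = #table(
        {"OrderDate", "OrderDate (UTC +0:00)"},
        {{#datetime(1996, 7, 4, 13, 12, 0), #datetimezone(1996, 7, 4, 13, 12, 0, 0, 0)}}
    ),
    // One {column name, type} pair per column being set
    Typed = Table.TransformColumnTypes(
        Source,
        {
            {"OrderDate", type datetime},
            {"OrderDate (UTC +0:00)", type datetimezone}
        }
    )
in
    Typed
```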

Making Comparisons

We’re at the final step now: working out the time to ship.  This is relatively straightforward:

  • Add Column –> Add Custom Column
    • Name:  Days to Ship
    • Formula:  =[#"ShippedDate (UTC -7:00)"]-[#"OrderDate (UTC -7:00)"]
  • Select the Days to Ship column
  • Transform –> Data Type –> Duration

Note:  You can just double click the column names in the formula wizard and it will put the # characters in there for you.

And the final look in Power Query:

SNAGHTML1f1602fa
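Pulling the steps together, the whole transformation could be sketched as a single query.  The Source table and its values are hypothetical sample data (a real query would start from your own source), and the step names are just ones I picked:

```powerquery
let
    // Hypothetical sample data: OrderDate is in UTC, ShippedDate is local (UTC -7:00)
    Source = #table(
        type table [OrderDate = datetime, ShippedDate = datetime],
        {{#datetime(1996, 7, 4, 13, 12, 0), #datetime(1996, 7, 16, 9, 0, 0)}}
    ),
    // Declare that OrderDate is a UTC +0:00 value
    AddedUtc = Table.AddColumn(Source, "OrderDate (UTC +0:00)",
        each DateTime.AddZone([OrderDate], 0), type datetimezone),
    // Express that same moment in UTC -7:00
    AddedLocal = Table.AddColumn(AddedUtc, "OrderDate (UTC -7:00)",
        each DateTimeZone.SwitchZone([#"OrderDate (UTC +0:00)"], -7), type datetimezone),
    // ShippedDate originated in UTC -7:00, so it only needs the label
    AddedShipped = Table.AddColumn(AddedLocal, "ShippedDate (UTC -7:00)",
        each DateTime.AddZone([ShippedDate], -7), type datetimezone),
    // Subtracting two datetimezone values returns a duration
    AddedDays = Table.AddColumn(AddedShipped, "Days to Ship",
        each [#"ShippedDate (UTC -7:00)"] - [#"OrderDate (UTC -7:00)"], type duration)
in
    AddedDays
```

Passing the type as the fourth argument of Table.AddColumn sets the data type as each column is created, which is one way to avoid a separate type-changing step.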

With that all complete, the final step is to give the query a name (I chose ShippingTimes) and load it to a worksheet:

image

Final Thoughts

Personally, I like to take explicit control over my data types.  Call me a control freak if you like (I’ve been called much worse), but relying on implicit conversions that default to “local time” scares me a bit, particularly if I’m going to be sending my workbook off to someone who lives in a different zone than I do.  Once you know how to do this it’s not super difficult, and I now know EXACTLY how it will be represented on their side.

I’ll admit also that I’m a bit disappointed in the UI for datetime conversions.  To me, anyone playing in this field needs very granular control over every column.  An extra option in the Transform to Date/Time/Timezone step would go a long way to solving this, as you’d be able to skip writing custom formulas.  Hopefully that’s on the Power Query team’s radar for the future, as well as a full datetime menu that would allow us to easily choose from/add/convert to the majority of the formulas found in the article referenced above.

Power Query Training

Also, don't forget: if you love Power Query or are intrigued by the things you can do with it, we have an online training course coming up soon.  Check it out and register at www.powerquery.training/course

What Power Query Functions Exist?

I know that this topic has been covered before by others, but I think it’s still pretty valuable for a user to be able to figure out what Power Query functions exist, especially since they are often different than what we’re used to in Excel.

NOTE:  This article was updated 2015-05-20 at the request of a reader to include more coverage on implementing the discovered function into the solution.

 

Power Query Functions Documentation on the web

There’s a pretty good resource site available on the Microsoft Support site.  Personally I have that one bookmarked and head over there often when I’m looking for a new function to do something.  I find that with a quick CTRL + F on the page, I can quickly search and narrow in on the function I think I need in order to learn its syntax.

To be fair, I’m not always in love with the actual examples (many lack a Power Query UI view), but overall the site is fairly useful.

Power Query Functions Documentation in the client

Now that’s all good, but what if you’re working on a plane with no WIFI, and you need to figure out the syntax for a new function?

As luck would have it, there is a way to pull up the list for most functions right in the client.  To do this, I:

  • Clicked Add Column –> Add Custom Column
  • Typed a 1 and clicked OK
  • Went to the Power Query formula bar and typed the formula below.  (Notice that this is case sensitive.)

 =#shared

(Why the custom column? Because typing in the formula bar replaces the previous step, and I want to be able to revert to that since it’s part of my logic.)

image

Now, you’ll see you get a list of (almost) all the functions that you can access:

SNAGHTML894ffc53

Now, let’s assume I’m trying to find a formula to remove certain characters from a text string.  I really need to search for “Text.”, but there isn’t a search option.  No big deal, let’s convert this list into a table:

SNAGHTML89522a10

Once we’ve done that, we get a nice table of all of the functions, and we can filter them to our heart’s content.  Here’s my table filtered down to just rows that begin with “Text”:

SNAGHTML897b95c3

And a page or so down, I found something that looks like it might work:  Text.Remove.
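If you’d rather skip the clicking, roughly the same result can be had directly in M.  This is a sketch rather than the exact code the UI generates; it relies on Record.ToTable, which returns one row per field with Name and Value columns:

```powerquery
let
    // #shared is a record of (almost) everything available in the current session
    Functions = Record.ToTable(#shared),
    // Keep only the entries whose name starts with "Text."
    TextOnly = Table.SelectRows(Functions, each Text.StartsWith([Name], "Text."))
in
    TextOnly
```

Swap "Text." for "Date.", "List." or any other prefix to browse a different family of functions.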

Investigating the Function Syntax

I clicked on the green Function beside the Text.Remove entry.  It pops up an Invoke Function box, and behind that is the syntax for how it’s supposed to work.  So that’s pretty cool.  I tried it out with some text, as shown below:

txtremove1

Clicking OK returned the following:

txtremove2

Now this is a bit… weird… and frustrating. Value? Why Value? (I actually don’t know why.  You’d think it would have been the function name, wouldn’t you?)

I stepped back to the Value step of the query, as I wanted to look at the syntax page that popped up behind the Invoke Function dialog:

image_thumb.png

 

My only complaint here is that once you land in this window, the only indicator of the actual function name is in the smallest font on the page, buried in the middle. You’d think that the name would show up a little more prominently. Regardless, I copied the name of the function, then stepped back to the Invoked FunctionValue step and replaced Value in the formula bar with the function name:

txtremove3

Perfect, it works.
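Written out on its own, a call like this could look as follows.  It’s a minimal sketch that assumes the characters to strip are hyphens; both the sample text and the removal list here are illustrative:

```powerquery
let
    // Second argument: a single character or a list of characters to remove
    Cleaned = Text.Remove("My -Dog -Has -Fleas", {"-"})
in
    Cleaned  // "My Dog Has Fleas"
```

Adding more entries to the list, such as {"-", "_"}, would strip several characters in one pass.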

Implementing the Function in the Solution

Now let’s see if I can get it into my original query. To do that I:

  • Copied that entire line of M from the formula bar,
  • Selected the Source step (I wouldn’t be able to do this if I had typed #shared while I had the Source step selected originally),
  • Chose to Add New Column –> Add Custom Column –> Accept the inserted step,
  • Pasted the copied M into the formula area, and
  • Replaced the original text (“My –Dog –Has –Fleas”) with the name of the appropriate column from my data set.

Visually, it looks like this:

txtremove4

 

And then I checked the query to see that it worked:

txtremove5
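In M terms, the custom column step ends up looking something like the sketch below.  The table, the Description column and the new column name are all hypothetical stand-ins for whatever is in your own data set:

```powerquery
let
    // Hypothetical sample table; substitute your own Source step
    Source = #table({"Description"}, {{"My -Dog -Has -Fleas"}, {"No-Hyphens-Here"}}),
    // Apply Text.Remove to every row of the chosen column
    Cleaned = Table.AddColumn(Source, "Clean Description",
        each Text.Remove([Description], {"-"}), type text)
in
    Cleaned
```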

 

Cleanup

Now that I’ve been able to explore the functions and found and implemented the one I’d like to use, I can just knock off the extra steps shown below in yellow, returning me back to my next step:

txtremove6

Learning more about Power Query functions

For reference, this is one of the many things that Miguel and I will be covering in our upcoming Power Query training workshops.  Learn more about the workshop and register here:  http://powerquery.training/course/

Announcing Online Power Query Training!

I’m really pleased to announce that a new project I’ve been working on is live: powerquery.training, a website that offers online Power Query training!

online Power Query training

PowerQuery.Training

About PowerQuery.Training

PowerQuery.Training is a joint effort between Miguel Escobar (of PoweredSolutions.co) and myself: the two guys who are writing the Power Query book, M is for Data Monkey.

M is for Data Monkey

What you can find at PowerQuery.Training

We believe that Power Query is a super important tool in the toolbox of every Excel user out there.  It's so easy to use, and so powerful, that we think everyone should know about it.  To that end, we've decided to include the following two areas on the site:

Power Query patterns

One of the areas of focus is to showcase Power Query patterns (techniques).  These patterns are well illustrated articles with supporting workbooks, which will help you build solutions for common scenarios.  There will be more coming as we have time to add them, but today you'll already find:

Online Power Query Training

We know that not everyone wants to read pages of documentation, trying to figure out how to use a new technology.  Some people want to carve out a block of time, link up with an instructor, and be taught the basics, and how to avoid the inevitable pitfalls.  That's our aim with our online Power Query training workshops.

That’s right, there will be live online offerings in order to help get you skilled up on Power Query without ever leaving the comfort of your own office!  All you need to do is dedicate 3 days of your time – well okay… and some space on your credit card – and you’ll be off to the races with this software, taming and automating the cleanup and refresh of your data.

Interested in seeing what the course covers?  Download the Course Agenda here.

Load Power Query directly to Power Pivot in Excel 2010

One of the cool features in Excel 2013’s Power Query is being able to load to the Data Model (PowerPivot) directly.  But Excel 2010 doesn’t appear to have this feature.  Interestingly, you can still load Power Query directly to Power Pivot in Excel 2010; it just takes a bit of a careful workaround.

Let’s look at the required steps.

Step 1: Create Your Connection

First, I’m going to load in the content of a text file.  So I:

  • Go to Power Query –> From File –> From Text
  • Browse to the file I need, and import it into Power Query
  • Do whatever cleanup is needed and name the query Sales
  • Go to the Home tab –> Close and Load –> Close and Load To…

And here’s the important part:

  • Choose “Only Create Connection” –> Load

And I’ve now got a basic connection to my sales table without landing it in a worksheet:

image

Step 2: Grab the Connection String

Now, here comes the secret.  We need to get the connection string that Excel uses to connect to the Power Query.  Here’s how:

  • Go to the Data tab –> Connections

In there, you’ll see the name of your new connection:

image

  • Select your Query and click Properties
  • Click the Definition tab

Now you’ll be looking at something like this:

image

Notice that this query is actually an OLE DB Query that is simply “SELECT * FROM [Sales]”.  That seems easy to work with.  But the key for us is the connection string shown (#2 in the image above).

  • Select the ENTIRE connection string
  • Press CTRL + C to copy it
  • Click Cancel

Note:  Make sure you start at “Provider=” and highlight all the way to the end.  (It’s much longer than what you see in that little box.)

Step 3: Load Power Query directly to Power Pivot

Finally, we’re going to pull this into Power Pivot.  To do this:

  • Go to the Power Pivot tab –> PowerPivot Window
  • From Other Sources –> Others (OLEDB/ODBC) –> Next

image

  • Name your table
  • Paste your Connection String in the box

image

  • Click Next –> Next –> Finish –> Close

And voila!  We have our Power Query linked directly into Power Pivot in Excel 2010!

image

Just remember… if you do this, NEVER modify this table in Power Pivot.  Always go back to modify the table in the Power Query stage.  Failure to do so could set the table into a non-refreshable state.