Pulling Excel Named Ranges into Power Query

The comments of my last post collected a tip that I thought it was worth exploring.

I made the claim that Excel MUST have the data in an official Excel table.  As LoganEatsWorld pointed out, that’s actually not true any more.  If you’d like to give this a go, you can download this workbook to follow along.

What’s in the file?

The file is very basic.  It simply contains one table, and one named range of data:

image

The table is highlighted in the blue table style, and bears the name “Stats”.  The named range is surrounded by the black outline, and is called “Breeds”.

Connecting to Excel Data:

The reason I never found this is that my method was always to go to the Power Query tab and click –> From Table.  That will work great to get the data out of a table, but it won’t work for the named range.  So let’s try this a different way…

  • Go to Power Query –> From Other Sources –> Blank Query
  • Click in the formula bar and type the following:

= Excel.CurrentWorkbook()

(Yes, it’s case sensitive… I’m starting to reconcile myself to the fact that it’s Power Query so I’m just going to have to get over it.)

What ends up happening is a bit of magic:

image

Interesting… we have two tables listed!  The first is our official table, the second is our named range.  Cool!

Let’s click in the blank space to the right of the green “Table” text in the Breeds row:

image

The preview pops up and, sure enough, that’s our named range data:

image

Working With The Data

All right, let’s click the green Table text and break open that named range:

image

One notable difference here (in fact really the only one), is that Power Query doesn’t automatically recognize the header row.  This is due to the fact that an Excel table actually has a named header row to promote, where a named range does not.  No big deal though, as we can easily deal with that:

  • Go to Transform –> Use First Row As Headers

At this point, we could save the table to the worksheet or data model, as we need.

Observations

So this is cool.  It’s awesome that we can get to named ranges, as I have a LOT of workbooks that use these, and there are occasions where I don’t want to convert them to official Excel tables.  Despite the fact that we can, however, you pretty much need a secret decoder ring to find it, and that’s not so good.

It would sure be nice if there was a more discoverable way to pull in a named range… but where?

Suggested Accessibility Option 1

When I look at Power Query’s “Get External Data” function, it seems logical to me that it should end up somewhere in that area.  Looking at the group:

image

I kind of like the ability that comes with the “From Table” feature which works from the table you’re in (if you are), and lets you create a table if you’re not inside a table when you click that button.  But I wonder if it would be better served as a SplitButton/menu/submenu structure that offered the following options:

  • From Table
    • Current Table
    • Other Table
      • List of other tables in the workbook
    • Named Range
      • List of named ranges in the workbook
    • Create New Table
    • Create New Named Range

Actually, there is another change I would make to that group, and that’s to move the “From Blank Query” out of the “From Other Sources”, and give it it’s own button.  (I create a LOT of queries from scratch now, and it’s just extra clicks in my way to do so.)

Suggested Accessibility Option 2

I’m not sure this is so much of an alternate as something additional I’d like to see, actually.  An “additional sources” button on the Home tab would be awesome.  If that had the ability to pull up all the existing tables or named ranges in the workbook, and add them to the Power Query script as a “Source2=…”

I think the implications of this would be two fold:

  1. It would allow you to add a data source after creating a blank query, and/or
  2. It would allow you to add additional data sources into the same query.

The latter is certainly something I do semi frequently, as I don’t want to have multiple Power Queries created that are then merged together.  I’ll add both sources manually in the same query then merge them.

At any rate, just some thoughts.  If you have any on the subject please feel free to leave them in the comments.  :)

Un-pivoting With Subcategories in Power Query

My last few posts have been relatively technical, so this time I figured I’d look at something practical.  I can’t believe it’s been almost a year since I blogged about Un-Pivoting data in Power Query, so it’s about time we looked at that again… but this time with a twist.  This time we’ll add sub categories to the data.

Background

The data we’re going to start with looks like a typical financial report.  Whether a restaurant or a shoe store, in the manager’s office you’re liable to come up with a report that looks something along the lines of this:

SNAGHTMLf92afc

Now for the challenge… someone decides they need an alternate view of this data.  So how do you quickly un-pivot this into a format that you can use for other things?

If you want to follow along, you can download the file from my OneDrive.

Issue 1 – Getting the Data into Power Query

The first issue we come across is that, while Power Query can consume data from inside an Excel file, it MUST be formatted as a table.  But this hardly looks like it’s conducive to a table format with all those blank rows and such.  But what the heck… let’s apply a table to it anyway, and see what happens.

  • Click anywhere
  • Choose Power Query –> From Table
  • Adjust the range to cover all of the data (A4:H17)
  • Uncheck the box that indicates your table has headers

When you’re done, the box should look as follows:

image

And when you say OK, you should be taken into Power Query.

If you take a quick peek back at Excel, you can see that the table has indeed been applied, and that there are generic column headers above each column:

SNAGHTML100f05e

This is also reflected in Power Query:

image

Data Cleanup

Before we get into the trick of how to deal with subcategory columns, let’s clean up some of the garbage here.  Ideally what we’d like to get here is a nice pure table that we can easily un-pivot, just like we did in the prior article.

Cleanup Step 1:

Looking at the first column, we’ve got a bunch of null values in there, as well as some section headers.  What we really need is those section headers repeated on the lines below them.  So let’s make that happen.

  • Select Column1
  • Go to Transform –> Fill –> Down

You’ll see that the section headers are filled into any of the null areas. As soon as they encounter data however, they stop. (I’ve drawn a box around the Revenue lines below – notice how they fill until they reach Total Revenues, which then fill until they reach Expenses, and so on.)

image

Cleanup Step 2:

Now, we don’t really need any of the rows that are showing null values in Column3 through Column 8.  Let’s filter Column3 to remove those.  Click the drop down on Column3 and uncheck (null).  The result looks like this:

image 

So… why didn’t I filter the null values out of Column2?  After all, there are blank data rows, and with a PivotTable we can recreate the subtotals…  The answer is that I’m not ready to lose the first two rows yet.  I need those in order to un-pivot my data. ;)

Issue 2 – Un-Pivoting the Data

If things were lined up perfectly, we could just select Column3 through Column 8 and un-pivot it now.  Unfortunately, if we do we’ll get some really wonky results.  (Go ahead and try it if you like. Remember to click the x to the left of the “Unpivoted Columns” step once you confirm you’ve made a disaster of it!)

image

Preparing to Un-Pivot

Okay, so what do we need to do… well, the first thing we need to do is fill the first row (containing April and May) across the columns.  Here’s the rub though… there is no Fill—>Across feature.  So how do we do it?

Transposing the Table

To an accountant, transposition is an evil word that means you made a mistake and flipped two digits around.  It’s nasty and something we never look forward to.  But to Power Query it’s simply awesome.  Check this out…

  • Go to Transform—>Transpose

This instantly flips the columns to rows and rows to columns!

SNAGHTML1147243

And would you look at that… April and May are in Column1 and below them… null values!  We know what to do with those now!

  • Select Column1
  • Go to Transform –> Fill –> Down

Is this enough though?  Nope, sorry.  You might be tempted to “un-transform” and then un-pivot it, but you’d still end up with garbage.  We still need to do a bit more.

Concatenating the Category and Subcategory

This is the trick to un-pivoting with subcategories: you need to concatenate them first, then un-pivot, then split them up.  So let’s get to it.  Using the tip from Merging Columns with Power Query, let’s join up Column1 and Column2.

  • Select Column1
  • Hold down CTRL or SHFT and select Column2
  • Click Transform –> Merge Columns
  • Choose the Colon for the separator (or whichever you prefer)

Note:  If you don’t have the Merge Columns feature, you’re using an old version of Power Query. Either download the latest version, or you’ll need to manually join the columns by:

  • Adding a new column
  • Using the formula =[Column1]&”:”&[Column2]
  • Delete Column1 and Column2
  • Move the new column to the first position

Once you’ve got it done, the output should look as follows:

SNAGHTML120c673

Un-transposing the Table

Awesome… we’ve got concatenated headers now.  We just need to flip the table back right side up and we’re almost ready to un-pivot it:

  • Go to Transform –> Transpose

Final Preparations

The very last thing we need to do before we un-pivot our table is provide some decent headers.  This will ensure that the data will make sense when it is un-pivoted.  To that end:

  • Go to Transform –> Use First Row as Headers
  • Rename the first column to “Class”
  • Rename the second column to “Category”
  • Filter “Category” to remove the null values

Our table now looks nice and clean:

image

And we’re ready!

Un-Pivot It!

We now follow the steps of a regular un-pivot operation, with only a minor extra step:

  • Select the Class column
  • Hold down CTRL or SHFT
  • Select the Category column
  • Go to Transform –> Unpivot Columns –> Unpivot Other Columns
  • Rename the “Value” column to “Amount”

And now the extra step:

  • Select the “Attribute” column
  • Go to Transform –> Split Column –> By Delimiter –> Colon
  • Rename the Attribute.1 column to “Month”
  • Rename the Attribute.2 column to “Measure”

The results:

SNAGHTML12d5ffa

That’s pretty much it.  The last thing I’d do is change the Query name from Table1 to something more intelligible… maybe Data or something… then load it to the worksheet.

From a Static Report to a Data Source

Now that we’ve got our report reformatted into a data source, we can click anywhere in the table and pivot it to our heart’s content!

A Quick Recap

To be fair, this post has been pretty long, but only because I included a LOT of pictures and detailed instructions.  Once you’ve got the process nailed down, it doesn’t take all that long at all.  Remember, the big key to this is:

  • Suck your report into a table (without headers)
  • Fill any rows you can
  • Transpose the data
  • Concatenate your category and subcategory fields together
  • Un-transpose it
  • Un-pivot it
  • Split the category and subcategory back into their pieces

Enjoy!  :)

Power Query – The Round Function

The other day I asked one of my co-workers how many ways he knew of to round a number.  His answer was one… if it ends in .4 it rounds down and if it ends in .5 it rounds up.  My guess is that most people would answer along similar lines.

Interestingly though, there are a bunch of different ways to round, depending on your needs, and Excel has a bunch of functions to support them: ROUND, ROUNDUP, ROUNDDOWN, FLOOR, CEILING, EVEN, ODD, TRUNC, INT and MROUND.

Power Query also has a bunch of rounding formulas as well but, given that the function names can be somewhat different in Power Query (as we first saw here), I thought it might be interesting to see how to convert each of those functions from Excel to Power Query’s structure.

Background Setup

To start with, I created a very simple structure: a two column table with some random values in the “Value” column and then rounded them to 2 decimals using the formula =ROUND([@Value],2)  The output, after feeding it through Power Query, looks like this:

SNAGHTML20a5291b

The blue table on the left is the Excel table, and the green table on the right is the Power Query output.  (There is a completed example file available hereexample file available here.)

Creating the Round function

I love the ROUND function in Excel.  I use it constantly – rounding everything that uses multiplication or division – and pretty much have it burned into muscle memory.  So to me this was a logical place to start with Power Query.  Naturally, the syntax is just a bit different from Excel though:

Syntax
Excel =ROUND(number,digits)
Power Query =Number.Round(value, digits, roundingMode)

Hmm… we know that the Power Query function will be case sensitive.  In addition, it has an extra parameter.  The valid options are:

  • RoundingMode.Up
  • RoundingMode.Down
  • RoundingMode.AwayFromZero
  • RoundingMode.TowardZero
  • RoundingMode.ToEven

Let’s see what we can do with this.

I open up my query, select Add Column, and put in the formula as shown below:

image

Pretty easy, just Number.Round([Value],2).  In fact, it’s so similar to Excel it’s awesome!

So I click OK, save the query, and have a look at my results.  And that’s when I notice something isn’t quite right.  I’ve added some conditional formatting to the table below so you can see it easily:

image

This is the default?

Notice all the numbers that don’t match?  Can you spot the pattern?  It’s the oddest damn thing I’ve ever seen, to be honest, and I’ve never heard of anyone rounding in this way.

The default “RoundingMode” for Power Query is “Round to Even”.  What that means is that if there is a tie in the numbers it will round up or down to the closest even number.  So in the case of 1.645, it will round down to 1.64 as that is closer than 1.66.  (1.64 is .05 away from 1.645, where 1.66 is 0.15 away from 1.645).

I find this deeply disturbing.  I personally think that every user would expect Excel and Power Query’s default rounding methods to line up exactly, and this doesn’t.  How serious is this?  I’m not sure.  I think I’ll let someone from the scientific community ponder that.

Using RoundingMode.Up

Since the default plainly doesn’t work for us, it looks like it’s time to figure out which of the additional parameters we need.  Let’s try adding RoundingMode.Up to see if that will fix it.

I open Power Query again, and added a new custom column with the following formula:

=Number.Round([Value],2,RoundingMode.Up)

And the results are as follows:

image

Um… uh oh.  It seems to work above 0, but below is another matter.  That –5.245 is rounding down, not up! (Yes, from a technical perspective I am aware you can argue the words I used, but you get the idea.)

Using RoundingMode.Down

Now I’d be surprised if this came up with numbers consistent with the Excel formula, but let’s just check it for good measure.  The formula is:

=Number.Round([Value],2,RoundingMode.Down)

And the results:

image

So now numbers greater than 0 get rounded down, where numbers less than 0 are being rounded up (away from zero).

Let’s try another:

Using RoundingMode.AwayFromZero

Here’s our next option:

=Number.Round([Value],2,RoundingMode.AwayFromZero)

And these results are pleasing!

image

Look at that… we finally found the one that works!

Using RoundingMode.TowardZero

We’ve only got one other option we haven’t explored, so we might as well use it too, just for the sake of completeness:

=Number.Round([Value],2,RoundingMode.TowardZero)

For some reason, I’m incapable of typing TowardZero the first time I type this.  I always type TowardsZero and end up with an error!  At any rate, the results:

image

Thoughts

As a tool that is built for the Excel audience, I am having some real difficultly accepting the default parameter for this function in Power Query.  I HOPE that this is a bug, and not a design choice, although the documentation would suggest it is the latter.  If that’s the case, I think it’s a HUGE mistake.

Excel’s ROUND formula defaults to round away from zero.  Power Pivot’s DAX ROUND formula defaults to round away from zero.  VBA’s Application.Round function defaults to round away from zero. (As pointed out by Rory Archibald on Twitter, VBA’s Round function – without the application. prefix – does use banker’s rounding though.)

In my impression, if the Power Query formula holds the same name (at least after the Number. portion) it should return the same results as the Excel function.  In fact, I would venture to say that virtually every Excel pro would expect this.

My bigger concern would be that, with one of Power Query’s big selling features being it’s ability to re-shape and process large volumes of data, how quickly will a user realize that the Rounding function they thought they had is NOT working the way they expected?  Not good news at all.

I’m curious to hear your impressions.  Please leave a comment!

Want to see if for yourself?

Download the example file with all the formulas already in place.

Power Query – Multi Condition Logic

In my last post, we looked at creating an IF statement using Power Query.  This time we’re going to go a bit deeper and look at a scenario where we need to choose between several options.

The base scenario

In the last post I talked about my billing code setup.  As a reminder, it’s a 10 digit code that is constructed as follows:

  • Char 1 – Alpha – Indicates the division (G = Golf, F = Fitness, M = Marina)
  • Char 2 – Alpha – Indicates the billing type (D = Dues, S = Pass, A = Annual Moorage, P = Periodic Moorage)
  • Char 3-4 – Numeric – Indicates the number of months of coverage for the product (1-12)
  • Char 5-6 – Numeric – Indicates the start month (and subsequent anniversary) for the customer’s product
  • Char 7-8 – Variable – Slip length (in feet) for a boat in the case of marina customers, or SG, CP or CS for golf (indicating single, couple primary or couple spouse)
  • Char 9 – Text – A variety of single letter codes indicating specific things we want to know. (Outlined below)
  • Char 10 – Text – Indicates the payment method (F = Financed, P = Paid up front, C = Comp/Honorary)

(Note that the sample data only includes records for Marina data)

Sample file

If you’d like to follow along, download the sample file here.

Multi condition logic – Using IF

So, building on my previous two posts (using text functions and creating IF statements), we could easily break the first character into pieces by nesting 2 IF tests together:

=if Text.Start([BillingCode],1)=”G” then “Golf” else “two options left”

=if Text.Start([BillingCode],1)=”F” then “Fitness” else “Marina”

into

if Text.Start([BillingCode],1)=”G” then “Golf” else if Text.Start([BillingCode],1)=”F” then “Fitness” else “Marina”

Not too hard really.  In fact, we can even build each IF statement separately, then just copy the second to replace the “two options left” part without making any other changes at all.  No parentheses or anything needed.

More than 3 options

But what if you have a whole bunch of options that you need to work with?  Let’s look at the 9th character in our billing code.  I haven’t given the details yet for that one, but here are the options:

E = Employee, S = Yacht Club, N = Non-Taxable, R = Restricted, I = Inactive, L = Social, M = Medical, U = Regular

Wow.  That’s a whole lot of possibilities, and would make for one monster nested IF statement.  That wouldn’t be a lot of fun to write, nor maintain.  So how would we deal with it?

In Excel proper, we would probably separate these options into a separate table, then use VLOOKUP to pull the appropriate value into the table.  So we just need a Power Query VLOOKUP function… except there isn’t one.

We actually have a couple of different methods to deal with this.  We could either:

  1. Split the first character into one column, create an Excel table with the first letter in column 1 and the appropriate match in column 2, then merge the two using Power Query’s merge function. (Maybe I’ll write a future post on it.)
  2. Build our own CHOOSE function inside Power Query (or SWITCH if you prefer Power Pivot’s DAX version.)  This is way more fun, so let’s do that.  :)

Building a CHOOSE function

This actually isn’t too hard once you know the basic structure.  It basically goes like this:

function_name = (input) => let
   values = {
         {result_1, return_value_1},
         {input, “Undefined”}
     },
   Result = List.First(List.Select(values, each _{0}=input)){1}
in
   Result,

The key parts to recognize here are:

  • We can change the “function_name” part to be whatever we want/need,
  • result_1 is the first of the possibilities that we may pass TO the function
  • return_value_1 is the value that we’d like to return if the first value is result_1
  • if we need more values, we just insert another comma after the value_1 section and put in a value_2 section
  • we can keep adding as many values as we need.
  • the “Undefined” value will return the text “Undefined” if the value you pass isn’t in your provided list of options (it’s the Else portion of the CHOOSE statement)

Using this structure, we could write a CHOOSE function for our scenario as follows:

fnChoose_CustCode = (input) => let
   values = {
         {“E”, “Employee”},
         {“S”, “SCYC”},
         {“N”, “Non-Taxable”},
         {“R”, “Restricted”},
         {“I”, “Inactive”},
         {“L”, “Social”},
         {“M”, “Medical”},blah
         {“U”, “Regular”},
         {input, “Undefined”}
     },
   Result = List.First(List.Select(values, each _{0}=input)){1}
in
   Result,

Notice that I changed a couple of things:

  1. I gave the function a name so that I can recognize it, and also so that I can create more than one function with different names.  This one is fnChoose_CustCode.
  2. I created a list of all the options I needed.

Implementing the CHOOSE function

Okay, so now we have this, how do we use it?  Again, we’ve got two options.  I’ll focus on the other option at some other time, but for this scenario I want to build this right into an existing query.  So here’s how I do it.

First I created a brand new query that just pulls my table into Power Query, resulting in the following:

SNAGHTML76c03ee

Let’s go and inspect the code that pulls this in.  We need to click View –> Advanced Editor.  That will bring up the following code:

let
    Source = Excel.CurrentWorkbook(){[Name="Customers"]}[Content]
in
    Source

Not too complicated (yet).  Let’s paste in our code just before the Source = line:

let

fnChoose_CustCode = (input) => let
   values = {
         {“E”, “Employee”},
         {“S”, “SCYC”},
         {“N”, “Non-Taxable”},
         {“R”, “Restricted”},
         {“I”, “Inactive”},
         {“L”, “Social”},
         {“M”, “Medical”},blah
         {“U”, “Regular”},
         {input, “Undefined”}
     },
   Result = List.First(List.Select(values, each _{0}=input)){1}
in
   Result,

   Source = Excel.CurrentWorkbook(){[Name="Customers"]}[Content]
in
    Source

Perfect.  And yet it doesn’t exactly look like much.  In fact, beyond adding a new line in the Steps section of the Editor, we don’t see any changes:

image

So what good did that do then?

As it turns out, we’ve only pasted in our function to make it available to the Power Query engine.  We haven’t actually told Power Query to do anything with it.  So why don’t we do that now?

Using our new CHOOSE function

You’re going to be amazed how easy this is…

First we’re going to add a new column (Add Column –> Add Custom Column).  When the dialog pops up, we’ll create a formula to return the letter we want to pass to the function:

=Text.Range([BillingCode],8,1)

And that gives us the following result (assuming we provided the column name of Status):

SNAGHTML29d7316

Cool stuff.  Now, let’s amp this up and use our function.  We’ll click the gear next to the Status step and wrap the existing formula with our function call.  (Don’t forget the extra  parenthesis needed at the end):

=fnChoose_CustCode(Text.Range([BillingCode],8,1))

Which gives us the following:

SNAGHTML2a06cbf

The end effect

You’ll find that all the sample codes in the data work just fine, and that nothing comes back as undefined.  If you’d like to see how the query reacts to different items, go back to the Customers table and try changing the second to last letter to something else.  When you refresh the table, you’ll find that it will evaluate the new character and return the appropriate result.

Caveat

It should be noted that the function as written above is case sensitive, meaning that a code of MP010450uP would return “Undefined”.  This is expected in my case, as valid codes are made up of upper case letters.

If I wanted to accept either case I would need to modify my Text.Range function to force it to upper case.  This would result in a function call that reads as follows:

=fnChoose_CustCode(Text.Upper(Text.Range([BillingCode],8,1)))

Which would work, as you can see here:

SNAGHTML2be7f6a

Power Query “Trusted” Locations

My last two blog posts detailed some frustration that I ran into when working with Power Query.  First the issue that I couldn’t trust folders, but got stuck trusting files, and then the issue where I overloaded my credentials area.  I’ve actually got a solution, in a way, for both.

One thing to be aware of here… there are actually two distinct entities that show up in the Data Source Settings dialog: Security Credentials and Data Privacy Settings.  You can see this in the following picture:

image

Notice how Exchange and the Database don’t show a privacy level on the right?  Those are security credentials.  Interestingly here, you can see two entries for the same thing (like a website), where one is the security credential, the other the privacy setting.

Ok, with that out of the way, let’s get on to the solutions…

Clearing out Security Credentials

In my last post I had over 700 security credential entries for ip-api.com that needed to be cleared.  To clear them manually took clicking the item, clicking delete and clicking ok before moving on to the next item.  Even if I could do one click per second (I find this UI slow and sometimes I missed the target), it would still take me 35 minutes.  I really wanted a “remove all” feature.  Sure I’ll lose Exchange, sure I’ll lose my database, but so what.  Small pain vs spending 30-60 minutes clicking Delete.

So I coded something to do it.  The download link is at the end of the post, but one of the features of the tool is to let you clear out your entire list of security credentials with on click:

image

And voila!

image

Now, are you going to use this every day?  Hell no!  But if you screw up like I did, you just may need a weapon like this in your arsenal.  :)

“Trusted” Folders/Files and URLs

As I mentioned in my last post, Ehren, a developer on the Power Query team messaged me on Twitter to tell me of one solution to the “Trusted Folder” problem.

If you set the privacy level for the folder, it applies to everything within that folder, including files in subfolders

Now that sounded cool.  So I set out to test it out.  First thing I did was went to add a new Folder to my Data Source Settings dialog and… there’s no option to do that. Hmm… okay… so how?

Well, you could go and build a solution that references a folder, just so that you can get a folder in there, I suppose.  But that’s awkward and contrived really.  You have to waste a bunch of time concocting something you hope will work.  And the worst part to me is that I don’t just have to do it on my system, I somehow have to deploy it to other people as well.  Ugh.

Screw it, I’m building a tool.  Here’s the features of what we’ve got:

Clear out all privacy levels

To be fair, this was probably more useful for me while testing, but I included it in case you run into the same issues.  Basically you click this button:

image

And now all the Data Privacy Settings are gone too:

image

Add Folder

So this one is way more useful to solving my issues.  I’m going to click “Add Folder” on the left.  It will let me browse for a folder:

image

And prompt me for the Privacy Level:

image

And once I click OK, it sets it up as a folder in the Data Source Privacy settings.  And yes, it really does trust all files in all subfolders:

image

Add Files

Same thing really, it just allows you to select a specific file.  This is a bit redundant to just declaring them as you go along but hey, once I already wrote the code for the folder this was a snap to add:

image

Add URLs

I love this one.  Easy to use, just provide the URL and the privacy level:

image

And boom!  There you are:

image

And again, this trusts all sub-sites of the main domain.  This was particularly key for me as I’m querying data from this site and have to provide a different URL for each year.

Now, again, on this one, the first time you query the site you’ll get another entry for each URL you touch:

image

This is because you have declared the privacy level, but not the security (unfortunately I can’t get in to modify the security files, for obvious reasons.)  So here’s what I’m going to do to fix this:

  • Select the first URL in the list
  • Click Edit Credential
  • Change the setting to apply to the root domain:

image

  • Click Save

You’ll notice that the first one disappears now.  What actually happened was that it merged the security credentials with the privacy level in one entry.  Cool.  So now I’ll just go back and delete the other two. And it never asks me again on refresh.  :)

The “Big Red Button”

So when it all goes really wrong, and you want to reset Power Query to a default state, what then?  That’s what this button is for.  It will wipe out all of your security credentials, data privacy settings, reset your Power Query formula bar and more.  It’s like a total factory reset of Power Query.

image

Some More Technical Stuff

I’ve now had this installed on Office 2010, Office 2013 Professional Plus and Office 365 Pro Plus versions of Excel, both 32 and 64 bit.  In addition, it’s been installed on Windows 7 Pro and Windows 8.1 systems as well.

Interesting enough, despite being an .MSI installer file, in my tests it has NOT required admin privileges to install.  (On one machine I installed this first without prompting, then got prompted for admin credentials when I went to update Power Query.)

Some Final Observations

I found this project pretty interesting, and it’s given me some ideas for some more useful tools to work with Power Query and other Power BI add-ins.  While I can’t promise a timeline on delivery, I do plan on adding a bunch of new useful stuff to this add-in and releasing a Pro Tools version at some point.

Before you download and install this, I also want to make something very clear.  THIS IS BETA SOFTWARE.  I’m pretty sure it’s stable, and shouldn’t affect anything else, BUT YOU DOWNLOAD AND INSTALL AT YOUR OWN RISK.

The Installer

You can download the installer from this link.

Pain Points – Power Query Credential Management

After my last post on Power Query Security Woes, Ehren – a developer on the Power Query team – sent me a message on Twitter about it.  I will blog about that later, after I’ve had a chance to test it out, but before I did, I thought I’d go and clean up some PQ stuff that I messed up.

A little history…

A while back I got the bright idea to pull down our web leads (at my day job).  I then though, “Hey, wouldn’t it be cool if I could plot all of them on Power Map?”  Sure it would.  But all I had was their IP address.  “No problem!”, I figured, “I’ll just use Power Query to feed out the IP to a web service and return a function to turn it into a real address!”

Success!

I cooked up a little script fed the IP to http://ip-api.com, which did exactly that. Here’s a sample from a random address:

image

I actually tried several services before I settled on this one, but ultimately  it seemed to generate the most accurate results.  Perfect, so I let the script fly, and it was awesome!

Success turns to failure…

Except… somewhere during my batch of 3,500 addresses it stopped working.  And when I went back to the site, I found out that I’d been blocked.  Ooops.  Apparently if you send over a certain threshold of queries in a certain amount of time you’re not a good person. (That’s why I’m not sharing the script.)  Sorry about that ip-api.com, I promise I’ll be good in future!

… and creates a mess in the process…

At any rate, the issue I then ran into is that I ended up with an entry for each URL in my Data Source Settings.  I was a while back, so I don’t remember if it happened by default, or if it happened because I was learning and just clicked the wrong thing.  End result is that I have an entry in my list for each IP I queried.

image

Since I’m going to be a good boy and not steamroll their site any more, I kind of want to clean them up.  No problem, right?  Click on the first address on screen, hold down SHIFT and click on the last address on screen and… huh?  I’ve only got the last address?  Okay, let’s try with CTRL… click, click.. ARGH!!

… and the mess turns into major frustration!

This interface only allows you to select one item at a time?  Really?  So to delete my 200 entries I have to click the line item, click Delete (at the top), click “Yes I’m sure” and then move on to the next one.  Ouch.  I see a lot of wasted time in my future if I really want to clean up now.  :(

The missing feature(s)

Power Query team, (if you’re still listening,) please give us the ability to multi-select in this dialog, and multi-delete items.  We need to be able to clean up, as we’re all human and make mistakes as we go along.  This feels like a severe punishment right now!

In addition, while I’m here, why isn’t there an “Add” button in this interface?  When I go to edit one of these addresses, I see the following:

image

I’m guessing that if I’d just set that to the root at the beginning, I wouldn’t have an entry for every IP in that query.  Man!  If only I’d been able to declare this up front and realized that this was an option!  Yet there is no way to do this from a simple UI.  Instead, I (believe) I have to:

  • Create a query to one page at the domain
  • Set it’s security level as I pull data
  • Save the query
  • Go back to this interface and edit the anonymous access

Life would be SO much easier if I just had the ability to go into this UI and create my favourite and most used sites.  (And bonus points if I could push them out to users through group policy like I can with Office Trusted Locations!)

Okay, enough for now.  I’ll be back later once I’ve tested Ehren’s suggestion.  :)

Power Query Security Woes

Recently I’ve been working on automating a business process for a client.  It’s become a really interesting project that uses a mix of Excel tables, Power Query, Power Pivot and VBA, as well as a WinAutomation script.  Before I talk about the issue that is facing me, it would be helpful to provide a bit of background on what we’re doing, and why so many of the Power BI pieces are needed.

Solution Background

My client has a business in which they outsource employees to other companies.  Each of those companies maintains the records of hours, and lets my client download a spreadsheet version of the hours logged by the subcontracted employee.  This is fairly helpful, as my client is the one that pays all of the employees, so getting the hours lets him know who gets paid for how many hours.  Easy enough, right?  But each spreadsheet is in a different format, and that causes a lot of manual entry to try and standardize it into a file that can be uploaded to the payroll processor.

So here we enter Power Query.  With Power Query we can import each of the files individually, reformat the data into a consistent set of columns, and output it into a table.

Each pay period we start a new Excel file, and import the data files for that pay period by running a WinAutomation script.  The script logs in to the remote systems, passes the correct parameters to the system, and downloads the Excel and CSV files required.  It then saves them in a subdirectory of our application under the pay period end date.  My Power Query scripts then use the pay period end date, dynamically build a file path to the file and import the data.

It’s beautiful… except…

The Issue

Here’s where the pain begins…  Every time you touch a new data file in Power Query that you haven’t used before you get a security message asking you what type of data (Public, Private or Organizational) you are using.  (You can learn more about those here:  http://office.microsoft.com/en-us/excel-help/privacy-levels-HA104009800.aspx) The issue I have is that each payroll I create new files in a subdirectory, so Power Query sees them as unique.  To that end I can’t just trust the data files once and be done with it.

Now, there is a workaround… just enable the Fast Combine feature.  That kills off the warning and lets them go, but it also has an issue.  This setting is workbook specific, which means that when my VBA saves the master payroll control file under a new name (we preserve history this way) the setting doesn’t stay selected.  Grrr.  Given that there is no way to touch this setting via code, my user has to remember to click the Fast Combine button every time they run the update.  Is it minor?  Sure.  But it’s ridiculous, it gets forgotten, and concerns them when they get the permissions messages.

The Solution We Need

Now don’t get me wrong, I’m all for security, but where it makes sense.  In Office 2007 we got a new macro paradigm that allowed us to trust folders on our computers/networks.  This setting is set on an application level basis, and persists between Excel sessions.  Beautiful, as I can set it to a specific folder and forget it.  I throw trusted files in there, I don’t throw in the ones I don’t know.  It actually allows me to practice safer computing overall.

In my opinion, this setting is drastically lacking in Power Query.  I really need the ability to set my Power Query add-in to have trusted root folders with trusted subfolders.  This would allow me to trust my application’s data directory and not have to remember to click Fast Combine each time I create new files.  It also means that I might pay attention to the message when it does come up in future, as it would be unusual.  Currently I see the error so much I just cancel it and go straight to Fast Combine.  Hmm… kind of like how we set macro security to “Low” in Excel 2003 and earlier to bypass the annoying message, which essentially left us unprotected.

The Solution (Some of us) Can Implement

To be fair, I believe I do have a workaround for this now, but I don’t think it should be necessary.

Basically what I could do is use VBA to drive the refresh of my tables, and therefore the Power Query scripts.  Given that, I could use VBA to copy and replace the files in a central repository where the hierarchy does not change.  I.e. I could set up a folder like AppData\Timesheets\Current and use VBA to copy the required data files from my dated subfolders, replacing the ones in the “current” folder.  I would then target my Power Query scripts against the files in the “Current” folder, and build my solution off that.

Because Power Query holds a list of trusted files at an application level (rather than workbook level), these files should stay trusted even if I do create a new file, removing the need to constantly click the “Fast Combine” button.  So I think this should work.

So what’s the problem?  It only involves creating a VBA macro to do that.  Fine for me, as I know how to code with VBA and make this happen.  But for most of the Power Query target market I would suggest that this is probably outside of their normal skill set.

Final Thought

Even though it is technically possible to work around this issue, I still argue should not be necessary.  We need a proper option to trust the files in a local folder so that dynamically referring to a file in Power Query can be scripted without invoking a painful security paradigm.

Importing Dates in Power Query

A couple of weeks ago I was teaching a course on Power Query.  I distributed a bunch of sample files, started to walk the attendees through the steps, and a hand went up.

“I get an error in my date column…”

For a second it stumped me… I tested my examples 15 times before pushing them out to the class.  Why now?

As it turns out it’s that pesky issue that drives many non-North Americans crazy.  I keep my regional settings set to use the MM/DD/YYYY format.  I just find it makes life a lot easier for me.  What we ran up against here was a user who was using the Canadian standard of DD/MM/YYYY.  Yuck.

I promised a way to show how to fix it, and am finally getting around to posting it…

Replicating the Issue

The file I’m working with here is a csv file (saved in a txt format) that uses the DD/MM/YYYY format, as that replicates the issue for me.  So I import it and end up with this:

image

So what’s the problem?  Well, if you look at the areas I circled in red, the ChangedType step changed the data type for the TranDate column to Text.  That’s no good.  So I highlight the TranDate column and change it to Date:

image

Great.  An error.

So this is what I’m talking about.  The dates above have been converted to dates, but not the 1st of February, 2nd of February and so on.  Instead it’s January 2nd, February 2nd, etc.  As soon as it gets to the 13 it gags as there is no 13th month.

This is not an uncommon issue, by any means.  If you database exports in one format and your system settings are in another, you’ll run into this.  Multinationals that exchange files across borders see this all the time (just look in any forum for the amount of help posts on the topic.)

I REALLY wish that Power Query were smarter about this.  I also wish there was an easy way to just tell Power Query which date format your data (not your system) was using, or an easy way to convert it.  Anything to let me do something in one click versus what we need to do to fix this.  (As an added insult, the old text import wizard DID let you declare this.  Go figure!)

The Solution

All right, so how do we fix it?  I’m sure there are a variety of ways, and hopefully a new ribbon will come along to make this obsolete, but here is ONE way.

Step 1: Remove the garbage

First I deleted the “Changed Type” step of the query.

Step 2: Split by the / character

Next I selected the TranDate column, went to the Transform Tab (if you don’t have it download the latest Power Query update), chose Split Column, then By Delimiter, then Custom.  I put in a / and clicked OK.

As an aside here, I wish Power Query suggested what it thought you wanted to use as the delimiter, without actually clicking OK for you.  In this case I think it’s pretty obvious, even though it’s not in the list of defaults.  Granted, I might want to override it with something crazy like a 0, which is why I wouldn’t want it to just click OK for me, but it might save me time if it made an intelligent suggestion.

At any rate, I end up with this:

image

Step 3: Rename the columns

This part is important for the next piece.  Rename your columns to reflect which part of the date they belong to.  You can do this by right clicking the column header and choosing Rename…  (Alternately you can edit the M code in the formula bar manually, but you need to edit the “SplitColumnDelimiter” step first, then edit the ChangedType step to reflect the new column names.  Probably easier to just right click and rename them:

image

Step 4: Putting the date together… correctly

OK, now the tricky part.  We need to insert a column and write a Power Query formula.  To do this, go to the Insert tab, and click Insert Custom Column.  You’ll be presented with a dialog to write your formula.  And here’s the formula you need:

=#date([Year],[Month],[Day])

Sadly, there really isn’t a lot of documentation yet that explains this stuff.  I believe that the # sign tells PowerQuery that this is a data type, the “date” portion determines what type it is, and then we feed it the components that make it up (for each row).  I also changed the name at the top of the dialog to “Date” and clicked OK:

image

Step 5: Force the data type

I’m not sure why this is, but even after declaring this as a date, it formats the column visually as a date, but the data type is left blank.  For whatever reason, I had to select the column and force the data type to a date in order to make it recognize it properly as such.

Step 6: Cleanup

So this part is optional, but you can now delete the Day, Month and Year columns if you don’t need them any more.  Unlike standard Excel, this move won’t leave you with a bunch of #REF! errors down the Date column.  :)

Implications/Modification

The implications of this are somewhat interesting to me.  Let’s assume that your data goes the other way.  I send you a file in MM/DD/YYYY format, and you use DD/MM/YYYY format.  You should be able to follow all of the above steps, with the only difference being which column you identify as which when you rename them.  (In this case it would go Month Day Year after splitting them.)

I’d love to get some feedback from others on this, but I believe the formula in the custom column should work for you without modification as long as the columns are named correctly.

I’ve attached two files to this post for testing purposes.  Right click the links below and SaveAs to get copies to play with:

Try them both, you should be able to generate a correct date for your format from either using these steps.

PowerXL Course Live in Victoria, BC

We are very excited to announce that we will be hosting an “Introduction to Power Excel” session in Victoria, BC on June 6, 2014.

PowerPivot is revolutionizing the way that we look at data inside Microsoft Excel. Allowing us to link multiple tables together without a single VLOOKUP statement, it enables us to pull data together from different tables and databases where we never could before. But linking data from multiple sources, while powerful, only scratches the surface of the impact that it is making in the business intelligence landscape. Not only do we look at PowerPivot in this session, but we’ll also explore the incredible companion product Power Query; a tool that will surely blow your mind. Come join Ken as he walks you through the process of building a Business Intelligence system out of text files, databases and so much more.

Full details, including an early bird signup offer, can be found at http://www.excelguru.ca/forums/calendar.php?do=getinfo&e=42&day=2014-6-6

Activating Debugging Symbols with Add-In Express

Some time ago I published a blog post about an add-in that I’m building, and the reasons why I elected to start using Add-In Express to manage the process.  I still think Add-In Express is the best product for this kind of task, but figured I’d publish this post mainly to save me time the next time I get a new PC and have to go through this again.

The issue I ran into is that I need to set the CLR to work with .NET 3.5.  To do this you have to set the host file to v2.0x.  This is pretty easy, and Eugene has a good picture on how to do this here:  http://www.add-in-express.com/forum/read.php?FID=5&TID=7912

Following the steps that Eugene provided set me off to the races… with Excel 2010.  But when I tried to debug in Excel 2013 I wasn’t having much luck.  I changed the “Program to Start” to Excel 2013, set a breakpoint in my code, and started debugging.  Even though I’d set a breakpoint in my code, it was never activated.  Instead, the breakpoint goes from a red dot to a red circle with a white fill, and mouse-ing over it yields the message “The breakpoint will not currently be hit.  No symbols have been loaded for this document.”

SNAGHTML64bc16

This really sucks, as it’s pretty tough to work with.  So what’s different, and how come Excel 2010 works, but Excel 2013 doesn’t?

The action of modifying that host config file as described by Eugene actually creates a new file that is stored within the appropriate Office subfolder held within the Program Files folder.  Unfortunately, the program that is defined in your “Program to Start” settings seems to be ignored, and it actually creates the file in the folder that holds the oldest version of Office on your system.  So in my case it created the following file:

  • C:\Program Files\Microsoft Office\Office14\excel.exe.config

Now, I stand to be corrected on this, but I believe that this file is only used when launching the debugging tools from Visual Studio.  When the app is started it will look for this file and, if it finds it will use it to provides the debugging symbols back to Visual Studio for the .NET framework specified.

At any rate, the key here is to copy this file and paste it into the same folder that contains the exe file specified in the “Program to Start” area in the Project Properties of Visual Studio.  And once you do that:

SNAGHTML72dbee

Beautiful!  Works nicely.

It’s also worth mentioning that this setup works whether you are using a version of Office that comes from an MSI installer file (such as a DVD or volume license version of Excel/Office), or if you are using the Click to Run (C2R) version of Office that you can download from Office365.  The only thing you need to be concerned about is the file path in which to store the config file and find the Excel.exe executable:

  • 64bit MSI:  C:\Program Files\Microsoft Office\Office15
  • 32bit MSI:  C:\Program Files (x86)\Microsoft Office\Office15
  • 64bit C2R: C:\Program Files\Microsoft Office 15\root\office15
  • 32bit C2R: C:\Program Files (x86)\Microsoft Office 15\root\office15

I know this isn’t super Excel focussed, but hopefully it will help someone out there if they run into this error.

I should also throw a shout out to Andrei from Add-In Express who helped me sort this out last time I needed to figure this out.  The support there was awesome in helping me get it resolved.