Power Pivot eBook Coming Soon

It's been a long time coming, but we are putting the finishing touches on the third installment of our free 'DIY BI' series. Consequently, we are excited to announce that the Power Pivot eBook will be officially released on Tuesday, July 3, 2018!

Power Pivot eBook

This brand new book will feature five of Ken's top tips, tricks, and techniques for Power Pivot, including:

  • Hiding fields from a user
  • Hiding zeros in a measure
  • Using DAX variables
  • Retrieving a value from an Excel slicer
  • Comparing data using one field on multiple slicers

Power Pivot eBook

 

About the 'DIY BI' Series

This free eBook series is available to anyone who signs up for the monthly(ish) Excelguru email newsletter. The series includes four books, one edition each for Excel, Power Query, Power Pivot, and Power BI. Each book contains five of our favourite tips, tricks, and techniques which Ken developed over years of research and real-world experience.

DIYBI eBook Series

We first launched this series in the spring of 2017 with the Excel Edition, and the Power Query edition followed later that summer. You can read some more about why Ken decided to create this series in his initial blog post about it.

The Excelguru Newsletter

The monthly Excelguru email newsletter features the latest updates for Excel and Power BI, as well as upcoming training sessions and events, new products, and other information that might be of interest to the Excel and Power BI community.

Don't Miss Out, Get Your Free Copy of the Series

If you're not already a newsletter subscriber, you can sign up here. We will send you the Excel Edition right away, and the Power Query Edition a few days later. All of our current and new subscribers will receive the Power Pivot edition once it is released on July 3, 2018. Be sure to keep an eye on your inbox for the new book.

We will be continuing to work on the fourth and final book, the Power BI Edition, over the coming months so stay tuned for details!

Number rows by group using Power Query

After one of my previous sorting posts, Devin asked if we can number rows by group.  Actually, that's a paraphrase… what he really asked was:

Any thoughts on how to produce something like a ROW_NUMBER function with PARTITION from T-SQL? Similar to this: https://docs.microsoft.com/en-us/sql/t-sql/functions/row-number-transact-sql?view=sql-server-2017#d-using-rownumber-with-partition

I've never used the PARTITION function in SQL, so I checked in with him just to confirm what he was after.  And here it is: take the first three columns of the table below, and add a final column with index numbers sorted by and restarting with each group:

Number rows by group example

And of course there is.  For this one, though, we actually do need to write a little code.

Preparing to Number rows by Group

Now here's the interesting part.  The source data looks exactly the same as what you see above, the only difference being that in the output we also added a number rows by group.  So how?

Well, we start by grouping the data.  Assuming we start by pulling in a table like the above with only the first 3 columns:

  • Sort the data based on the Sales column (in descending order)
  • Group the data by Group
  • Add a aggregation column for "All Rows"

Like this:

Group By dialog

Which yields this:

Table after aggregation

We are now ready to add the numbering.

Now to Number rows by Group

In order to number rows by group, what we really want to do is add an Index column, but we need to add it for each of the individual tables in our grouped segments.  If we try to add it here, we'll get 2 new values, showing 1 for Alcohol and 2 for Food.  That's not exactly what we need.  But if we expand the column, then the index will not reset when it hits Food, and the numbering won't be right either.

The secret here is to add an Index column to each of the individual tables in the Data column, then expand the columns.  The numbering will then be correct.

To do this, I added a custom column using the following code:

=Table.AddIndexColumn([Data], "Index", 1, 1)

Where did I get this code?  I actually added an Index column to the whole table.  That added the following code in the formula bar:

=Table.AddIndexColumn(#"Grouped Rows", "Index", 1, 1)

I copied that code, and deleted the step. Then I added my custom column, pasted the code, and replaced #"Grouped Rows" (the name of the previous step) with the column I wanted to use.

The result was this:

Table after Index column added

The final steps to clean this up:

  • Remove the Data column
  • Expand all columns from the Partitioned table except the Group (since we already have it)

Which leaves us with our table in place including the new column which does number rows by Group as originally planned.

If you want to play with this one, the example file can be found here.

I would also be remiss if I didn't mention that we have a great video in our Power Query Academy that covers this kind of operation (among others).  It's called "Advanced Row Context" (in our M deep dive section) where Miguel shows all kinds of cool stuff that you can do by adding new columns to Grouped Rows.

Trick to Protect Excel Tables

Slobodan emailed me to describe a trick to protect Excel tables that he is using to drive data validation lists.  The data validation lists are sourced from tables loaded via Power Query, and leverage a little hack to hide them from prying users eyes.  I thought it would be cool if he shared it with everyone, so asked him to write up a little blog post on it, and here it is!

Take it away Slobodan…

Hello everybody,

Recently, my team and I had faced a problem with refreshing PQ tables that we managed to solve with a simple trick (no VBA coding), and shared it with Ken who asked me to share it with community. Thank you Ken for this opportunity! Glad to make some kind of contribution, to all of you PQ users.

Solution Background

We created calculation model for our sales people (Full cost calculation).  Inside this Excel file, they have a lots of drop down lists from which they can choose customer, partner etc. The idea is to make these dropdown lists dynamic.  In other words, whenever a new customer is created in SAP, they should be able to select this customer in Excel using a dropdown list. This is where Power Query comes to the rescue.

We have scheduled daily export of all our customers from SAP to a file on a network drive, and use this file as the data source for a local PQ table in the workbook. We then use our Power Query table “Customers” as the source for dropdown lists in calculation model.

The Challenge

How to make it fully automated? We have two goals here:

  1. We want Power Query to be scheduled for automatic refresh on a daily basis
  2. At the same time, we would like to protect Excel tables sourced via Power Query from careless users

For the first point, we have Power Update - a tool which allows you to schedule daily refresh.

Note from Ken: I haven't seen Slobodan's model, so there may be a need to use Power Update to do what he's doing.  If you only need your Power Queries to update each time the Excel workbook is opened, however, you could force an update by changing the table's connection properties to force an update upon open.

Second issue, in order to protect Power Query table, we need to hide these sheets and protect the workbook.  The end result is that our Customers table is hidden and cannot be unhidden and everything looks promising.

clip_image002

Of course, Excel protects the whole workbook structure using this method, which causes Power Update to fail. In fact, query refreshes also fail if we try to refresh data manually.

clip_image001

So the obvious solution doesn't work.  I spent time Googling for solution to this but could not find one 🙂

Our Solution

I am not VBA guy, but I remembered one tip from Mynda Treacy’s dashboard course which I applied here.

Step 1

  • Hide the worksheet and open the Visual Basic Editor (press Alt+F11)

Step 2

  • In the Project Explorer Window (Ctrl + R if it's not showing) select the sheet which  contains the Power Query table

clip_image003

Step 3

  • In the Properties Window (press F4 to display this), set the Visible property to "2 - xlSheetVeryHidden"

clip_image004

Step 4

  • Go to Tools --> VBAProject Properties --> Protection
  • Check the box next to "Lock Project for Viewing"
  • Set a password so only you can access it
  • Close the Visual Basic Editor

The Effect

Our sheet containing the Customers table is hidden, and there is no possibility to unhide it.  It doesn't even show up in the menu!.

clip_image005

At this point the only way to unhide the worksheet is to go into the Visual Basic Editor, and reset the worksheet's Visible property - but you protected the VBA project with a password so no one can get in there.

The great thing is that refreshing the Power Query tables will work, because you didn’t actually lock the workbook structure.

Caveat

This solution is intended to protect data from regular excel users, who can easily mess up your workbook.  Do be aware that users with VBA skills will be able to break the password, or extract the hidden sheet contents.

Hopefully someone finds this useful 🙂

Take care!

Your Voice Matters – Power Query Quality

Some time ago we embarked on a bit of a crusade to get Microsoft to fix a specific issue with Power Query related to performance.  I posted about it in detail on the Power Pivot Pro blog, and have been encouraging people to vote on this at every possible turn.  Conferences, classes, my free e-Book series, even the Microsoft Data Insights Summit - none of them were immune to hearing me drum up support to get this fixed.

Your voice matters

You voted, and Microsoft listened.  I'm super excited to let you know that they have architected a fix based on your votes, and it's finally out of testing!

Even though the idea is still marked as "started" on UserVoice, (I got this directly from a reliable source,) it is starting to roll out to the Office Insider channels on Office 365.  I'm told that so long as you're on version 1801, build 9001 or higher, you're good to go and have the fix in place.  You can locate your version and build information via File --> Account:

image

Thank you for voting!

While Microsoft does listen to me, my voice only goes so far.  Your voice matters a huge amount in this process, as it validates that it is bigger than just one person.  I want to thank everyone who voted for this fix to raise it's importance along the way!

Can't wait for the fix?

You can get on the Insider channel, and that will get you the fix.  There are two methods to do this:

For consumer licenses of Office 365, you can find out more here.

For commercial users, you have to download and configure an installer, which you can do here.

Will this fix all your Power Query refresh speeds?

Absolutely not.  This deals with one specific technical issue that we identified which was re-doing work multiple times.  There is still much improvement that needs to be done, but at least we've got a start here to bring Excel back to parity with Power BI Desktop.

In the mean time, if your queries are going slow, you might want to consider our Power Query Academy.  We have a module on Query Optimization which teaches how to use Buffer functions, provides a strategy to reduce lag during development and also shows a few settings that can be tweaked to make your code run faster.

Ranking Method Choices in Power Query

My recent post on showing the Top X with Ties inspired a discussion on ranking methods.  Where I was looking to rank using what I now know as a standard competition rank, Daniil chose to use a dense ranking method instead.  Oddly, as an accountant, I've never really been exposed to how many different ways there are to rank things - and I'd certainly never heard the terms skip and dense before. (At least not tied to ranking!)

Naturally, after a few emails with Daniil and a bit of a read over at Wikipedia on 6 different common ranking methods, I had to see if I could reproduce them in Power Query.

What are the 6 different ranking methods?

Let's look at a visual example first.   These were all created in Excel using standard formulae:

image

The first thing I had to do was figure out what each ranking method actually does.  So here's a quick summary according to Wikipedia's article on the subject:

  • Ordinal Ranking - This ranking method uses sequential number for each row of data, without concern for ties
  • Standard Competition Ranking - Also know as a form of a Skip ranking, this method gives ties the same rank, but the following value(s) are skipped.  In this case, our values go 1,2,3,4,4, 6.  (5 is skipped as the 5th item is tied with the 4th)
  • Modified Competition Ranking - This is similar to the Standard Competition ranking method, but the skipped values come before the ties.  In this case, we would get 1,2,3, 5, 5, 6.  (As 4 and 5 are tied, they both get ranked at the lower rank.)
  • Dense Rank - In this ranking method, ties are given the same value, but the next value is not skipped.  In this case we have 1, 2, 3, 4, 4, 5.
  • Fractional Rank - Now this one is just weird to me, and I'd love to know if anyone has actually used this ranking method in the real world.  In this algorithm, ties are ranked as the mean of the tied ordinal rank.  Very strange to me, but it won't stop me from building it!

So know that we know what they all are, let's build them in Power Query so that we can perform them in both Power BI and Excel.

Groundwork for demonstrating the ranking methods

If you download the sample workbook, you'll see that it has the full table shown above.  To make this easy, I set up a staging table called SalesData as via the following steps:

  • Select a cell in the Excel table --> Data --> From Table/Range
  • Select the Item and Sales columns --> right click --> Remove Other Columns
  • Load it as a connection only

This gave me a simple table with only the product names and values as shown here:

image

As you can see, the values column has already been sorted in descending order, something that is key to ranking our ties.

One thing I should just mention now is that - for every ranking method - we will actually start every new query by:

  • Referencing the SalesData query
  • Renaming the new query to represent the ranking method being demonstrated

That means that I'm just going to give the steps each time based on the view above, since that's what we should get from the referencing step.

Ranking Method 1: Ordinal Rank

This ranking method is super easy to create:

  • Sort the Sales column in descending order
  • Sort the Item column in ascending order (to rank ties alphabetically)
  • Go to Add Column --> Index Column --> From 1
  • Rename the Index column to Rank
  • Reorder the columns if desired

Yes, that's it.  It simply adds a row number to the way you sorted your data, as shown  here:

Ordinal Ranking Method in Power Query

Ranking Method 2: Standard Competition Rank

This ranking method involves using a little grouping to get the values correct:

  • Sort the Sales column in descending order
  • Add an Index column from 1
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Rank which uses the Min operation on the Index column
      • Data which uses the All Rows operation
  • Expand the Item column
  • Reorder the columns if desired

The result correctly shows that the Dark Lager and Winter Ale - 4th and 5th in the list, but tied at 557, each earn a rank of 4, and the Member Pale Ale (6th in the list) comes in with a rank of 6.  There is no item ranked 5th, since their rank was improved to be in a 4th place tie.

Standard Competition Ranking Method in Power Query

Ranking Method 3: Modified Competition Rank

To create ranking following the Modified Competition ranking method, we need to:

  • Sort the Sales column in descending order
  • Add an Index column from 1
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Rank which uses the Max operation on the Index column
      • Data which uses the All Rows operation
  • Expand the Item column
  • Reorder the columns if desired

The only real difference between this ranking method and the standard competition rank is that we create the Rank column using the Max of the Index column instead of the Min used in the previous method.

The result correctly shows that the Dark Lager and Winter Ale - 4th and 5th in the list, but tied at 557, now earn a rank of 5 (not 4 like the standard rank).  There is no item ranked 4th, since their rank was dropped to reflect a 5th place tie.

Modified Competition Ranking Method in Power Query

Ranking Method 4: Dense Rank

The dense ranking method requires a change to the order of the steps from what we did in the standard competition ranking method.  Namely the Group By command must come before the addition of the Index column:

  • Sort the Sales column in descending order
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Data which uses the All Rows operation
  • Add an Index column from 1
  • Expand the Item column
  • Reorder the columns if desired

This method will yield the results found here:

Dense Ranking Method in Power Query

The result correctly shows that the Dark Lager and Winter Ale - 4th and 5th in the list, but tied at 557, ranked in 4th place - just the same as the Standard Competition rank.  But where it differs can be seen in the ranking of the Member Pale Ale.  6th in the list, it is ranked 5th, as there are no gaps left after the ties.

Ranking Method 5: Fractional Rank

As I mentioned at the outset, I find this to be one of the strangest methods of ranking.  Like the others though, it's actually really easy to create when you know how. (And certainly more straight forward than using an Excel formula to calculate it!)

  • Sort the Sales column in descending order
  • Add an Index column from 1
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Rank which uses the Average operation on the Index column
      • Data which uses the All Rows operation
  • Expand the Item column
  • Reorder the columns if desired

One thing I will say… it's certainly makes it obvious that there are other ties in the table.  Maybe that's the point of it?

Fractional Ranking Method in Power Query

Final Thoughts

I was actually surprised to see how easy it is to change the ranking methods with just some minor modifications to the order of steps and/or the aggregation chosen when applying the grouping method.  It certainly gives us some robust choices!

And while we can certainly create each ranking method using Excel formulas (each is demonstrated in the sample file if you're curious), this is even more awesome.  Now we don't need to load data and land it in the grid.  We can go straight to Power Pivot or Power BI should be need to.

If you'd like to download a file with each of the methods illustrated, just click here.