Disaster Recovery in Puerto Rico with Power Query

When Ken was at the Microsoft Business Applications Summit a few weeks ago, he met Mr. J.A. Garcia who has been doing some amazing work with Power Query. We wanted to share his story about how he has been using Power Query in helping with disaster recovery efforts in Puerto Rico:

"[In] my line of work there's been two defining moments that have changed the way we look at our tools. The first one was the Zika outbreak and the second one was Hurricane Maria.

The first time I saw Power Query was [as part of] Power BI during the Zika outbreak [in 2016]. One of our clients needed up-to-date information of the Zika outbreak and its effect on healthcare. With the help of a consultant, we started using Power BI and Power Query.

Aedes aegypti mosquito

An Aedes aegypti mosquito, one of the main transmitters of Zika virus.

I began taking courses during that time, and one of them was about Excel. That's when I learned about Get & Transform in Excel 2016.

Any new job that I received, I tried to use Power Query. I taught myself SQL so I could understand better the process of extracting data and how to integrate it into Power Query.

Our job was changing. We could give the tools to our clients that would let them refresh when they needed it the most. No more waiting [on] our area for a data refresh!

Then Hurricane Maria hit Puerto Rico [in September/October 2017]. It was a harsh two weeks of no communication. As soon as I came back from work, I noticed the change in attitude. As a healthcare company, we began doing Public Health.

Hurricane Maria - Disaster Recovery with Power Query

Hurricane Maria is regarded as being the worst natural disaster on record to affect Dominica and Puerto Rico and the deadliest Atlantic hurricane since Hurricane Stan in 2005.

My main job was identifying members with certain serious conditions. I used Power Query and Excel to create processes that obtain information from the assessment done to keep track of the efforts of the company. The clients could refresh the data and see who was missing, fix any data entry errors and more.

I'm very proud of my work, and Power Query in Excel and Power BI has been a large part of my growth. In the present, we have created a tool that refreshes constantly to help identify members with serious conditions. Now in case of any emergency, we'll know who to attend."

~ J.A. Garcia

We were very inspired by how Mr. Garcia began his Power Query journey as part of the disaster recovery efforts after these emergencies, and that he and his team continue to leverage this powerful tool in both Excel and Power BI. Power Query really can help save lives!

Do you have a story to share about your Power Query journey? Maybe it hasn't saved your life literally, but perhaps it has saved you hours of time and effort, a significant amount of money, or even your sanity! Let us know in the comments below or contact us through the Excelguru site.

Power Query Challenge #2 Results

What an overwhelming response to Power Query Challenge #2!  We had 40 submissions, and some with multiple entries in a single submission.  Plainly you all enjoyed this!

Naturally, there were a couple of submissions that involved custom functions, and a couple who wrote manual grouping functions to get things done.  These folks obviously know how M works, so I'm going to focus more on the other entries to show different UI-driven routes to accomplish the goal.  Each of those is included in the workbook that you can download here.

The Base Query

I'm going to start this by creating a base query called "Source Data" which has only 2 steps:

  • Connect to the Data Table
  • Set the data types

This is going to make it easy to demo certain things, but also replicates what a lot of you did anyway.
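For reference, the M behind this base query is minimal. Here's a sketch, where the source table name "Transactions" and the Customer/Membership/Date column names are assumptions for illustration (adjust them to match your own data):

```m
// Base "Source Data" query: connect to the Data Table, then set data types.
// "Transactions" and the column names are assumed for illustration.
let
    Source = Excel.CurrentWorkbook(){[Name="Transactions"]}[Content],
    SetTypes = Table.TransformColumnTypes(
        Source,
        {{"Customer", type text}, {"Membership", type text}, {"Date", type date}}
    )
in
    SetTypes
```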

Most Popular Solutions to Power Query Challenge #2

By far the most popular solution to Power Query Challenge #2 started with one of the following two methods:

Method 1A - Group & Merge

  • Reference the Source Data query
  • Merge Customer & Membership
  • Remove duplicates on the merged column
  • Group by the Customer column and add a Count of Rows

Method 1B - Group & Merge

  • Reference the Source Data query
  • Remove all columns except Customer & Membership
  • Group by the Customer column and add a Count of Distinct Rows
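In M, Method 1B boils down to something like this. (A sketch with assumed column names; the "Count Distinct Rows" aggregation in the Group By dialog is what generates the Table.RowCount(Table.Distinct(_)) call.)

```m
// Method 1B sketch: keep Customer & Membership, then group by Customer
// with a distinct row count (column names assumed from the challenge data)
let
    Source = #"Source Data",
    KeptColumns = Table.SelectColumns(Source, {"Customer", "Membership"}),
    Grouped = Table.Group(
        KeptColumns,
        {"Customer"},
        {{"Count", each Table.RowCount(Table.Distinct(_)), Int64.Type}}
    )
in
    Grouped
```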

Either of these methods would leave you with something similar to this:

image

Method 1 Completion

No matter which way you sliced the first part, you would then do this to finish it off:

  • Filter the Count column to values greater than 1
  • Merge the filtered table against the original data set:
    • Matching the Customer column
    • Using an Inner join
  • Remove all columns except the new column of tables

image

  • Expand all columns
  • Set the data type of the Data column and you're good

image
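Expressed in M, the completion steps above might look like this. (A sketch only: it assumes the grouped-and-counted output was saved as a query named "Grouped" with a "Count" column, and that the source columns are Customer, Membership and Date.)

```m
// Method 1 completion sketch: filter, inner join back to the source,
// then keep and expand the nested table column (names are illustrative)
let
    Filtered = Table.SelectRows(#"Grouped", each [Count] > 1),
    Merged = Table.NestedJoin(
        Filtered, {"Customer"},
        #"Source Data", {"Customer"},
        "Data", JoinKind.Inner
    ),
    KeptColumn = Table.SelectColumns(Merged, {"Data"}),
    Expanded = Table.ExpandTableColumn(
        KeptColumn, "Data", {"Customer", "Membership", "Date"}
    ),
    SetTypes = Table.TransformColumnTypes(
        Expanded,
        {{"Customer", type text}, {"Membership", type text}, {"Date", type date}}
    )
in
    SetTypes
```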

Of the 34 entries, this variation showed up in at least 25 of them.  Sometimes it was all done in a single query (referencing prior steps), sometimes in 3 queries, and sometimes it wasn't quite as efficiently done, but ultimately this was the main approach.

A Unique Solution to Power Query Challenge #2

I only had one person submit this solution to Power Query Challenge #2.  Given that it is 100% user interface driven and shows something different, I wanted to show it as well.  I've labelled this one as Pivot & Merge.

Here's the steps:

  • Reference the Source Data query
  • Remove all columns except Customer & Membership
  • Select both columns --> Remove Duplicates
  • Pivot the Customer column (to get a count of products by customer)

image

  • Demote the headers to first row
  • Transpose the table
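As a sketch in M, those steps look roughly like this (assuming the pivot uses a Count aggregation, and the usual assumed Customer/Membership column names):

```m
// Pivot & Merge sketch: distinct Customer/Membership pairs, pivoted with
// a count so each customer becomes a column, then demoted and transposed
let
    Source = #"Source Data",
    KeptColumns = Table.SelectColumns(Source, {"Customer", "Membership"}),
    Deduped = Table.Distinct(KeptColumns),
    Pivoted = Table.Pivot(
        Deduped,
        List.Distinct(Deduped[Customer]),
        "Customer", "Membership",
        List.Count
    ),
    Demoted = Table.DemoteHeaders(Pivoted),
    Transposed = Table.Transpose(Demoted)
in
    Transposed
```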

And at that point, you have this view:

image

Look familiar?  You can now finish this one using the steps in "Method 1 Completion" above.

Personally, I don't think I'd go this route, only because the Pivot/Transpose could be costly with large amounts of data.  (To be fair, I haven't tested any of these solutions with big data.)  But it is cool to see that there are multiple ways to approach this.

The Double Grouping Solution to Power Query Challenge #2

This is the solution that I cooked up originally, and is actually why I threw this challenge out.  I was curious how many people would come up with this, and only a couple of people put this out there.  So here's how it works:

  • Reference the Source Data query
  • Stage 1 grouping:
    • Group the data by Customer and Membership
    • Add a column called "Transactions" using the All Rows operation

This leaves you here:

image

Now, you immediately group it again using a different configuration:

  • Group by Customer
  • Add columns as follows:
    • "Products" using the Count Distinct Rows operation
    • "Data" using the All Rows operation

Which leaves you at this stage:

image

It's now similar to what you've seen above, but we have a nested table that contains our original data.  To finish this off, we now need to do this:

  • Filter Products to Greater than 1
  • Expand only the Transactions column from the Data column
  • Right click the Transactions column --> Remove Other Columns
  • Expand all fields from the Transactions column
  • Set the data types for all the columns
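Put together in M, the whole Double Grouping approach might be sketched like this (step and column names are assumptions for illustration):

```m
// Double Grouping sketch: group twice, filter, then unwind the
// nested tables (column names assumed from the challenge data)
let
    Source = #"Source Data",
    Stage1 = Table.Group(
        Source,
        {"Customer", "Membership"},
        {{"Transactions", each _, type table}}
    ),
    Stage2 = Table.Group(
        Stage1,
        {"Customer"},
        {
            {"Products", each Table.RowCount(Table.Distinct(_)), Int64.Type},
            {"Data", each _, type table}
        }
    ),
    Filtered = Table.SelectRows(Stage2, each [Products] > 1),
    ExpandedData = Table.ExpandTableColumn(Filtered, "Data", {"Transactions"}),
    KeptColumn = Table.SelectColumns(ExpandedData, {"Transactions"}),
    Expanded = Table.ExpandTableColumn(
        KeptColumn, "Transactions", {"Customer", "Membership", "Date"}
    ),
    SetTypes = Table.TransformColumnTypes(
        Expanded,
        {{"Customer", type text}, {"Membership", type text}, {"Date", type date}}
    )
in
    SetTypes
```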

And you're there!

image

Final Thoughts

Beyond the approaches covered above, there were more solutions submitted for Power Query Challenge #2.  We had:

  • A couple of custom function submissions (of which each was slightly different)
  • A couple of custom grouping solutions (not written through the UI)
  • A couple of solutions that used grouping, then used a custom column to create a table based on the grouped output which filtered to distinct items

If I haven't covered yours here and you feel that I missed something important, please drop it in the comments below!

The part that fascinates me most about this is that we had UI-driven submissions involving merging, transposing and grouping.  Three different methods to reach the same end result.

Thanks for the submissions everyone!

Update on the Master Your Data Book (Data Monkey v2)

Miguel and I were at the Microsoft Business Applications Summit last week, and we were frequently asked for an update on the Master Your Data book (aka M is for Data Monkey version 2.0). We were told that it’s time. People pointed out that they had pre-ordered it on Amazon ages ago. Enthusiasts asked why we don’t have a subscription model with monthly updates like Power Query does.

Master Your Data Book Cover

There’s a hunger to see the new version. We’re flattered that you rely on us, and honestly, we’re gutted that you are still waiting for it. And after attending the summit, we know that we need to give you an update on the Master Your Data book.

Some background on publishing…

Before we tell you what’s happening, I’d like to just explain a bit about the back story on what we have to consider when we write books on technology. The primary factors are:

  1. How to fit it in with our schedules. Even over the long term, books don’t come close to earning anywhere near the financial rewards of just dedicating time to consulting projects. (This is a big factor in the subscription question.)
  2. In today’s world of constant updates, we know that there are new features added on a monthly basis. The question is, which ones are serious enough to cause us to delay the release?

If you look back at M is for Data Monkey, we are really proud of its long-term value and continued relevance. It kills us that merges aren’t in there… they came out a few days after the book went to print. Would we have held the book for them if we had known? Yes. Would we have held the book for conditional columns? No. This is just one of the kinds of decisions we have to face.

No matter which way we go, we’ll always wish we waited for the next great feature. And we can’t. We know that. But our goal is to make sure that the material inside the book stands the test of time well and continues to hold relevance as features are added and changed. We believe that we’ve managed to do that with M is for Data Monkey fairly well, all things considered. Are there easier ways to do some things today? Yes. But does the book give you a deeper understanding and still let you accomplish the same goals? We believe it does.

The factors that lead to delays

Features are one thing. They generally add new functionality. But User Interface changes are something else entirely…

Earlier this year, we made the call to delay the Master Your Data book in order to get a clearer picture on what Excel 2019 was going to look like. We needed to know which Power Query features would be there, and which wouldn’t. It just doesn’t make sense to publish a book around the same time as Excel 2019’s release with Excel 2016 screen shots. We’ve already been told that some stuff in M is for Data Monkey looks “dated”. We certainly didn’t want the new Master Your Data book to be “dated” on the day of release due to a User Interface change.

And now, at the Microsoft Business Applications Summit, we saw a preview of what is targeted for release into Power Query in the next few months. These new features are significant and impactful. But most relevant to us is that they contain a significant change to the Power Query User Interface. They will affect every single screen we use. They will affect every single screen shot we take. And if we don’t wait, we will deliver to you a brand-new book with pictures that don’t look anything like the User Interface you see on screen. Even if we were to push material to the publisher today, it takes 2-3 months to get the book to Amazon, so best case, you get 2-3 months' use out of the book.

To us, that is irresponsible. We refuse to take your money and deliver you a substandard product. It’s just not right.

Just how significant were the MS Biz Apps announcements?

If you weren’t at the Microsoft Business Applications Summit, you might not know about these announcements. You can read the full list here, but let’s recap the key ones for us. We can divide them into two categories:

  1. Awesome-but-not-critical (i.e. we would cry because we couldn’t include these, but wouldn’t delay the book for them)
    • New data connectors (including extract from PDF)
    • Fuzzy lookup
  2. Critical features (stuff that must be in the Excel version of Power Query before we can test material, write about nuances, shoot images and release the book)
    • Data profiling (quality) previews
    • M Intellisense in the Advanced Editor, formula bar and Add Custom Column window

These last two features will have a significant effect on the images of the book, as you can see here:

Power Query UI Preview

What is the revised timeline for the Master Your Data book?

The new Power Query features are estimated to arrive in Power BI Desktop by October 2018. And based on the historical pattern, these features will show up in Excel within 2-3 months of their Power BI release date. Giving us time to test the new features, take screenshots, revamp the book order to best tell the data story the way we need to… We are hoping to have the book in print by the end of Q1 2019. It’s still aggressive on our side, but that is our refined target. If the builds ship later, or things take longer than anticipated, it could slide into Q2.

Yes, we know it’s a long way away. We know you’ve been waiting, and we wish it could be faster. But again, we hope that you understand that we are doing this to truly give you the best book that will last longer than it could otherwise.

In the meantime – can we give you something else?

We have been working on another product as well: Master Your Data in Excel & Power BI recipe cards.

Naturally, all members of our Power Query Academy will get a free copy of these. And due to the significant delay of the book, we’d also like to offer a free copy to anyone who has pre-ordered Master Your Data on Amazon. (More on this below.)

So, why can we do these, but not the book? It’s because they assume you already know Power Query’s User Interface, so they only provide the steps to accomplish the goal. Loaded with before and after pictures, and the route to get from one to the other, we aren’t bound by User Interface design changes.

Here’s a quick sample of one of the cards:

Power Query Recipe: Pivoting Stacked Data

And another:

Power Query Recipe: Split Records into Columns

How will these be sold?

Ultimately, we plan on selling this product on a subscription basis through our web store as follows:

  • $14.99 for the purchase of the downloadable card set
  • $2.99/quarter for a subscription to updates

We already have 26 cards designed, with more on the way. As we expand the set, new purchases will include all of the cards released to that point. For those on subscription, we will update your original purchase and give you access to the new cards as we release them. We’re not intending to hold these for quarterly release, but rather send a new one every time we build it. You might get five in one quarter and one in the next, but our intention is to keep delivering new patterns as we discover them and build summary tip cards to illustrate them.

Wait… didn’t you say subscription doesn’t work for publishers?

For books, yes, it’s really hard. They’re complicated and require ensuring that things are taught in the right order, with all the updated techniques along the way. These Master Your Data recipe cards are snapshots of what to do in certain data cases when working with data in Excel and Power BI, so they have a much more refined scope.

It’s way easier for us to update cards, or add entirely new ones, as it doesn’t require re-writing precedent chapters. So in this case, it makes sense, as we can provide an initial catalogue of patterns, and add more over time. We’ve already got ideas for a bunch more to expand this set.

How can I get my hands on the Master Your Data Recipe Card set?

There are a few ways…

For those of you who are members of the Power Query Academy, we will add the tip cards as a resource as soon as we have them ready. It’s part of your subscription so, as long as you’re still an Academy member, you’ll get all the new ones we create. In addition, we will also make sure you get a copy of the new Master Your Data book as soon as it is released (even if your subscription has expired and you’re no longer an Academy member).

For those of you who have pre-ordered our new book on Amazon, please follow the Excelguru blog. We will post when the recipe cards are ready and will let you know what you need to do to get your free download of the initial package of cards. The subscription for updates will be available as well, but will be entirely optional.

And if you’ve just been waiting for the Master Your Data book and haven’t purchased yet, all good. We’ll be setting these up in an online store to allow you to buy the download version and (optionally) sign up for the updates as well.

When will the Master Your Data Recipe Card set be available?

Soon. We are in final design for the card set now and need to set up our web store to handle subscriptions. Our target is to have that all done by September 15, 2018, if not earlier. Keep watching here for the official announcement.

Ultimately…

…we wish we could send the Master Your Data book to you today, but hope that this will make a reasonable substitute to get you over the hump until we can. Thank you for your patience, understanding and trust in us as we work to deliver you the best version we possibly can.

Power Query Challenge #2

I'm at the Microsoft Business Applications Summit this week, so I thought I'd post another Power Query challenge, especially since our last one was so successful.

For this Power Query challenge …

Our business challenge here is that we are in the process of working out how to reward customers that buy memberships in multiple business areas across our organization.  To perform this analysis, we have been provided a list of transactions that looks like this:

image

What we've been asked to generate here is a list of all the transactions pertaining to customers who have purchased from multiple business units.  In other words, we want this output:

image

There's a couple of pieces we need to watch for here:

  1. Susan and Bob are the only people in this list who bought memberships to multiple business units.
  2. Susan bought multiple Golf Course memberships (one for her and one for her spouse).  We need to keep both those transactions - even though they are in the same division - as she also bought a Marina and Fitness Club membership.
  3. Claire also bought two Golf Course memberships.  While she bought multiple products, they are from the same business area, so we want to ignore them.

What are the rules for Power Query challenge #2?

It's pretty simple really.  This is a Power Query challenge.  That means you can use Power Query in Excel, or Power Query in Power BI.  VBA, SQL and Excel formula results don't count.

I've got a Power Query driven solution all cooked up to return the results above.  And now I'm curious to see how you would solve this problem using the same tool.

Ready to give this Power Query challenge a try?

Like we did with our last Power Query challenge, we're going to ask you: Please Do NOT post your answer below.  (We don't want to spoil it for anyone who wants to play along.)

Please note: the submission period for the Challenge is now closed. The submissions are being reviewed and will be discussed next week!

You can download the source data for this Power Query challenge here.  Have fun!

Power Pivot eBook Coming Soon

It's been a long time coming, but we are putting the finishing touches on the third installment of our free 'DIY BI' series. We are excited to announce that the Power Pivot eBook will be officially released on Tuesday, July 3, 2018!

Power Pivot eBook

This brand new book will feature five of Ken's top tips, tricks, and techniques for Power Pivot, including:

  • Hiding fields from a user
  • Hiding zeros in a measure
  • Using DAX variables
  • Retrieving a value from an Excel slicer
  • Comparing data using one field on multiple slicers

Power Pivot eBook

 

About the 'DIY BI' Series

This free eBook series is available to anyone who signs up for the monthly(ish) Excelguru email newsletter. The series includes four books, one edition each for Excel, Power Query, Power Pivot, and Power BI. Each book contains five of our favourite tips, tricks, and techniques which Ken developed over years of research and real-world experience.

DIYBI eBook Series

We first launched this series in the spring of 2017 with the Excel Edition, and the Power Query edition followed later that summer. You can read some more about why Ken decided to create this series in his initial blog post about it.

The Excelguru Newsletter

The monthly Excelguru email newsletter features the latest updates for Excel and Power BI, as well as upcoming training sessions and events, new products, and other information that might be of interest to the Excel and Power BI community.

Don't Miss Out, Get Your Free Copy of the Series

If you're not already a newsletter subscriber, you can sign up here. We will send you the Excel Edition right away, and the Power Query Edition a few days later. All of our current and new subscribers will receive the Power Pivot edition once it is released on July 3, 2018. Be sure to keep an eye on your inbox for the new book.

We will be continuing to work on the fourth and final book, the Power BI Edition, over the coming months so stay tuned for details!

Number rows by group using Power Query

After one of my previous sorting posts, Devin asked if we can number rows by group.  Actually, that's a paraphrase… what he really asked was:

Any thoughts on how to produce something like a ROW_NUMBER function with PARTITION from T-SQL? Similar to this: https://docs.microsoft.com/en-us/sql/t-sql/functions/row-number-transact-sql?view=sql-server-2017#d-using-rownumber-with-partition

I've never used the PARTITION clause in SQL, so I checked in with him just to confirm what he was after.  And here it is: take the first three columns of the table below, and add a final column of index numbers that are sorted within each group and restart with each new group:

Number rows by group example

And of course there is a way.  For this one, though, we actually do need to write a little code.

Preparing to Number rows by Group

Now here's the interesting part.  The source data looks exactly the same as what you see above, the only difference being that in the output we have also numbered the rows by group.  So how?

Well, we start by grouping the data.  Assuming we start by pulling in a table like the above with only the first 3 columns:

  • Sort the data based on the Sales column (in descending order)
  • Group the data by Group
  • Add an aggregation column for "All Rows"

Like this:

Group By dialog

Which yields this:

Table after aggregation

We are now ready to add the numbering.

Now to Number rows by Group

In order to number rows by group, what we really want to do is add an Index column, but we need to add it for each of the individual tables in our grouped segments.  If we try to add it here, we'll get 2 new values, showing 1 for Alcohol and 2 for Food.  That's not exactly what we need.  But if we expand the column, then the index will not reset when it hits Food, and the numbering won't be right either.

The secret here is to add an Index column to each of the individual tables in the Data column, then expand the columns.  The numbering will then be correct.

To do this, I added a custom column using the following code:

=Table.AddIndexColumn([Data], "Index", 1, 1)

Where did I get this code?  I actually added an Index column to the whole table.  That added the following code in the formula bar:

=Table.AddIndexColumn(#"Grouped Rows", "Index", 1, 1)

I copied that code, and deleted the step. Then I added my custom column, pasted the code, and replaced #"Grouped Rows" (the name of the previous step) with the column I wanted to use.

The result was this:

Table after Index column added

The final steps to clean this up:

  • Remove the Data column
  • Expand all columns from the new custom column except Group (since we already have it)

Which leaves us with our table in place, including the new column that numbers rows by Group, as originally planned.
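Putting all of the steps together, the whole query might be sketched like this. (The table name "Sales" and the Group/Item/Sales column names are assumptions for illustration; match them to your own data.)

```m
// Number rows by group: full query sketch. Sort, group with All Rows,
// add an index inside each nested table, then expand everything back out.
let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    Sorted = Table.Sort(Source, {{"Sales", Order.Descending}}),
    GroupedRows = Table.Group(Sorted, {"Group"}, {{"Data", each _, type table}}),
    AddedCustom = Table.AddColumn(
        GroupedRows, "Custom",
        each Table.AddIndexColumn([Data], "Index", 1, 1)
    ),
    RemovedData = Table.RemoveColumns(AddedCustom, {"Data"}),
    Expanded = Table.ExpandTableColumn(
        RemovedData, "Custom", {"Item", "Sales", "Index"}
    )
in
    Expanded
```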

If you want to play with this one, the example file can be found here.

I would also be remiss if I didn't mention that we have a great video in our Power Query Academy that covers this kind of operation (among others).  It's called "Advanced Row Context" (in our M deep dive section) where Miguel shows all kinds of cool stuff that you can do by adding new columns to Grouped Rows.

Trick to Protect Excel Tables

Slobodan emailed me to describe a trick to protect Excel tables that he is using to drive data validation lists.  The data validation lists are sourced from tables loaded via Power Query, and leverage a little hack to hide them from prying users' eyes.  I thought it would be cool if he shared it with everyone, so I asked him to write up a little blog post on it, and here it is!

Take it away Slobodan…

Hello everybody,

Recently, my team and I faced a problem with refreshing PQ tables that we managed to solve with a simple trick (no VBA coding), and shared it with Ken, who asked me to share it with the community. Thank you Ken for this opportunity! Glad to make some kind of contribution to all of you PQ users.

Solution Background

We created a calculation model for our sales people (Full cost calculation).  Inside this Excel file, they have a lot of drop down lists from which they can choose customer, partner etc. The idea is to make these dropdown lists dynamic.  In other words, whenever a new customer is created in SAP, they should be able to select this customer in Excel using a dropdown list. This is where Power Query comes to the rescue.

We have scheduled a daily export of all our customers from SAP to a file on a network drive, and use this file as the data source for a local PQ table in the workbook. We then use our Power Query table “Customers” as the source for the dropdown lists in the calculation model.

The Challenge

How to make it fully automated? We have two goals here:

  1. We want Power Query to be scheduled for automatic refresh on a daily basis
  2. At the same time, we would like to protect Excel tables sourced via Power Query from careless users

For the first point, we have Power Update - a tool which allows you to schedule daily refresh.

Note from Ken: I haven't seen Slobodan's model, so there may be a need to use Power Update to do what he's doing.  If you only need your Power Queries to update each time the Excel workbook is opened, however, you could change the table's connection properties to refresh the data when the file is opened.

For the second issue, in order to protect the Power Query table, we need to hide these sheets and protect the workbook.  The end result is that our Customers table is hidden and cannot be unhidden, and everything looks promising.

clip_image002

Of course, Excel protects the whole workbook structure using this method, which causes Power Update to fail. In fact, query refreshes also fail if we try to refresh data manually.

clip_image001

So the obvious solution doesn't work.  I spent time Googling for a solution to this but could not find one 🙂

Our Solution

I am not VBA guy, but I remembered one tip from Mynda Treacy’s dashboard course which I applied here.

Step 1

  • Hide the worksheet and open the Visual Basic Editor (press Alt+F11)

Step 2

  • In the Project Explorer Window (Ctrl + R if it's not showing) select the sheet which contains the Power Query table

clip_image003

Step 3

  • In the Properties Window (press F4 to display this), set the Visible property to "2 - xlSheetVeryHidden"

clip_image004

Step 4

  • Go to Tools --> VBAProject Properties --> Protection
  • Check the box next to "Lock Project for Viewing"
  • Set a password so only you can access it
  • Close the Visual Basic Editor

The Effect

Our sheet containing the Customers table is hidden, and there is no possibility to unhide it.  It doesn't even show up in the menu!

clip_image005

At this point the only way to unhide the worksheet is to go into the Visual Basic Editor, and reset the worksheet's Visible property - but you protected the VBA project with a password so no one can get in there.

The great thing is that refreshing the Power Query tables will work, because you didn’t actually lock the workbook structure.

Caveat

This solution is intended to protect data from regular Excel users, who can easily mess up your workbook.  Do be aware that users with VBA skills will be able to break the password, or extract the hidden sheet contents.

Hopefully someone finds this useful 🙂

Take care!

Your Voice Matters – Power Query Quality

Some time ago we embarked on a bit of a crusade to get Microsoft to fix a specific issue with Power Query related to performance.  I posted about it in detail on the Power Pivot Pro blog, and have been encouraging people to vote on this at every possible turn.  Conferences, classes, my free e-Book series, even the Microsoft Data Insights Summit - none of them were immune to hearing me drum up support to get this fixed.

Your voice matters

You voted, and Microsoft listened.  I'm super excited to let you know that they have architected a fix based on your votes, and it's finally out of testing!

Even though the idea is still marked as "started" on UserVoice, it is starting to roll out to the Office Insider channels on Office 365 (I got this directly from a reliable source).  I'm told that so long as you're on version 1801, build 9001 or higher, you're good to go and have the fix in place.  You can locate your version and build information via File --> Account:

image

Thank you for voting!

While Microsoft does listen to me, my voice only goes so far.  Your voice matters a huge amount in this process, as it validates that the issue is bigger than just one person.  I want to thank everyone who voted for this fix to raise its importance along the way!

Can't wait for the fix?

You can get on the Insider channel, and that will get you the fix.  There are two methods to do this:

For consumer licenses of Office 365, you can find out more here.

For commercial users, you have to download and configure an installer, which you can do here.

Will this fix all your Power Query refresh speeds?

Absolutely not.  This deals with one specific technical issue we identified, where work was being re-done multiple times.  There is still much improvement that needs to be done, but at least we've got a start here to bring Excel back to parity with Power BI Desktop.

In the meantime, if your queries are running slowly, you might want to consider our Power Query Academy.  We have a module on Query Optimization which teaches how to use Buffer functions, provides a strategy to reduce lag during development, and also shows a few settings that can be tweaked to make your code run faster.
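For context, the Buffer functions mentioned above work by pinning a copy of a table (or list) in memory so that downstream steps re-use that copy instead of re-evaluating the upstream source on every reference. A minimal illustrative sketch (the query name and sort step are placeholders, not a recommendation for any particular model):

```m
// Illustrative only: buffer the source once, then run further steps
// against the in-memory copy instead of re-hitting the data source
let
    Source = Table.Buffer(#"Source Data"),
    Sorted = Table.Sort(Source, {{"Customer", Order.Ascending}})
in
    Sorted
```

Note that buffering is a trade-off: it helps when a table is referenced repeatedly, but it loads the whole table into memory and prevents query folding, so it can just as easily slow things down.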

Ranking Method Choices in Power Query

My recent post on showing the Top X with Ties inspired a discussion on ranking methods.  Where I was looking to rank using what I now know as a standard competition rank, Daniil chose to use a dense ranking method instead.  Oddly, as an accountant, I've never really been exposed to how many different ways there are to rank things - and I'd certainly never heard the terms skip and dense before. (At least not tied to ranking!)

Naturally, after a few emails with Daniil and a bit of a read over at Wikipedia on 6 different common ranking methods, I had to see if I could reproduce them in Power Query.

What are the 6 different ranking methods?

Let's look at a visual example first.   These were all created in Excel using standard formulae:

image

The first thing I had to do was figure out what each ranking method actually does.  So here's a quick summary according to Wikipedia's article on the subject:

  • Ordinal Ranking - This ranking method uses a sequential number for each row of data, without concern for ties
  • Standard Competition Ranking - Also known as a form of Skip ranking, this method gives ties the same rank, but the following value(s) are skipped.  In this case, our values go 1, 2, 3, 4, 4, 6.  (5 is skipped as the 5th item is tied with the 4th)
  • Modified Competition Ranking - This is similar to the Standard Competition ranking method, but the skipped values come before the ties.  In this case, we would get 1, 2, 3, 5, 5, 6.  (As 4 and 5 are tied, they both get ranked at the lower rank.)
  • Dense Rank - In this ranking method, ties are given the same value, but the next value is not skipped.  In this case we have 1, 2, 3, 4, 4, 5.
  • Fractional Rank - Now this one is just weird to me, and I'd love to know if anyone has actually used this ranking method in the real world.  In this algorithm, ties are ranked as the mean of the ordinal ranks they would otherwise receive.  In this case we get 1, 2, 3, 4.5, 4.5, 6.  Very strange to me, but it won't stop me from building it!

So now that we know what they all are, let's build them in Power Query so that we can perform them in both Power BI and Excel.

Groundwork for demonstrating the ranking methods

If you download the sample workbook, you'll see that it has the full table shown above.  To make this easy, I set up a staging table called SalesData via the following steps:

  • Select a cell in the Excel table --> Data --> From Table/Range
  • Select the Item and Sales columns --> right click --> Remove Other Columns
  • Load it as a connection only

This gave me a simple table with only the product names and values as shown here:

image

As you can see, the values column has already been sorted in descending order, something that is key to ranking our ties.

One thing I should just mention now is that - for every ranking method - we will actually start every new query by:

  • Referencing the SalesData query
  • Renaming the new query to represent the ranking method being demonstrated

That means that I'm just going to give the steps each time based on the view above, since that's what we should get from the referencing step.

Ranking Method 1: Ordinal Rank

This ranking method is super easy to create:

  • Sort the Sales column in descending order
  • Sort the Item column in ascending order (to rank ties alphabetically)
  • Go to Add Column --> Index Column --> From 1
  • Rename the Index column to Rank
  • Reorder the columns if desired

Yes, that's it.  It simply adds a row number based on the way you sorted your data, as shown here:

Ordinal Ranking Method in Power Query
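For reference, the M code those steps generate looks something like this (a sketch; it assumes the SalesData staging query and the Item/Sales column names described above):

```
let
    Source = SalesData,
    // Ties sort alphabetically by Item, so the row number is deterministic
    Sorted = Table.Sort(Source, {{"Sales", Order.Descending}, {"Item", Order.Ascending}}),
    Ranked = Table.AddIndexColumn(Sorted, "Rank", 1, 1),
    Reordered = Table.ReorderColumns(Ranked, {"Rank", "Item", "Sales"})
in
    Reordered
```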

Ranking Method 2: Standard Competition Rank

This ranking method involves using a little grouping to get the values correct:

  • Sort the Sales column in descending order
  • Add an Index column from 1
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Rank which uses the Min operation on the Index column
      • Data which uses the All Rows operation
  • Expand the Item column
  • Reorder the columns if desired

The result correctly shows that the Dark Lager and Winter Ale (4th and 5th in the list, but tied at 557) each earn a rank of 4, and the Member Pale Ale (6th in the list) comes in with a rank of 6.  There is no item ranked 5th, since the 5th item's rank was improved to a 4th place tie.

Standard Competition Ranking Method in Power Query
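The generated M for this one might look as follows (again a sketch based on the SalesData query):

```
let
    Source = SalesData,
    Sorted = Table.Sort(Source, {{"Sales", Order.Descending}}),
    Indexed = Table.AddIndexColumn(Sorted, "Index", 1, 1),
    // Min of the Index gives every tied row the best (lowest) position
    Grouped = Table.Group(Indexed, {"Sales"},
        {{"Rank", each List.Min([Index]), Int64.Type},
         {"Data", each _, type table}}),
    Expanded = Table.ExpandTableColumn(Grouped, "Data", {"Item"})
in
    Expanded
```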

Ranking Method 3: Modified Competition Rank

To create ranking following the Modified Competition ranking method, we need to:

  • Sort the Sales column in descending order
  • Add an Index column from 1
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Rank which uses the Max operation on the Index column
      • Data which uses the All Rows operation
  • Expand the Item column
  • Reorder the columns if desired

The only real difference between this ranking method and the standard competition rank is that we create the Rank column using the Max of the Index column instead of the Min used in the previous method.

The result correctly shows that the Dark Lager and Winter Ale (4th and 5th in the list, but tied at 557) now earn a rank of 5 (not 4 as in the standard competition rank).  There is no item ranked 4th, since their rank was dropped to reflect a 5th place tie.

Modified Competition Ranking Method in Power Query
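In the generated M, swapping the aggregation is literally a one-function change (a sketch, as before):

```
let
    Source = SalesData,
    Sorted = Table.Sort(Source, {{"Sales", Order.Descending}}),
    Indexed = Table.AddIndexColumn(Sorted, "Index", 1, 1),
    // Max instead of Min is the only change from the standard competition rank
    Grouped = Table.Group(Indexed, {"Sales"},
        {{"Rank", each List.Max([Index]), Int64.Type},
         {"Data", each _, type table}}),
    Expanded = Table.ExpandTableColumn(Grouped, "Data", {"Item"})
in
    Expanded
```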

Ranking Method 4: Dense Rank

The dense ranking method requires a change to the order of the steps from what we did in the standard competition ranking method.  Namely the Group By command must come before the addition of the Index column:

  • Sort the Sales column in descending order
  • Go to Transform --> Group
    • Group by the Sales column
    • Create a Data column which uses the All Rows operation
  • Add an Index column from 1
  • Rename the Index column to Rank
  • Expand the Item column
  • Reorder the columns if desired

Note that no aggregation is needed to build the Rank here: because grouping collapses the ties into a single row per Sales value, the Index column added afterwards is already a dense rank.

This method will yield the results found here:

Dense Ranking Method in Power Query

The result correctly shows that the Dark Lager and Winter Ale (4th and 5th in the list, but tied at 557) rank in 4th place, just the same as in the Standard Competition rank.  Where this method differs can be seen in the ranking of the Member Pale Ale: 6th in the list, it is ranked 5th, as no gaps are left after the ties.
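A sketch of the M this produces (same SalesData assumptions as the earlier methods):

```
let
    Source = SalesData,
    Sorted = Table.Sort(Source, {{"Sales", Order.Descending}}),
    // Grouping first collapses the ties into one row per Sales value...
    Grouped = Table.Group(Sorted, {"Sales"}, {{"Data", each _, type table}}),
    // ...so the Index added afterwards is already a dense rank
    Ranked = Table.AddIndexColumn(Grouped, "Rank", 1, 1),
    Expanded = Table.ExpandTableColumn(Ranked, "Data", {"Item"})
in
    Expanded
```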

Ranking Method 5: Fractional Rank

As I mentioned at the outset, I find this to be one of the strangest methods of ranking.  Like the others though, it's actually really easy to create when you know how. (And certainly more straightforward than using an Excel formula to calculate it!)

  • Sort the Sales column in descending order
  • Add an Index column from 1
  • Go to Transform --> Group
    • Group by the Sales column
    • Create the following columns:
      • Rank which uses the Average operation on the Index column
      • Data which uses the All Rows operation
  • Expand the Item column
  • Reorder the columns if desired

One thing I will say… it certainly makes it obvious that there are other ties in the table.  Maybe that's the point of it?

Fractional Ranking Method in Power Query
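And the corresponding M sketch, which differs from the competition ranks only in the aggregation chosen:

```
let
    Source = SalesData,
    Sorted = Table.Sort(Source, {{"Sales", Order.Descending}}),
    Indexed = Table.AddIndexColumn(Sorted, "Index", 1, 1),
    // Averaging the ordinal positions turns a 4th/5th place tie into 4.5
    Grouped = Table.Group(Indexed, {"Sales"},
        {{"Rank", each List.Average([Index]), type number},
         {"Data", each _, type table}}),
    Expanded = Table.ExpandTableColumn(Grouped, "Data", {"Item"})
in
    Expanded
```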

Final Thoughts

I was actually surprised to see how easy it is to change the ranking methods with just some minor modifications to the order of steps and/or the aggregation chosen when applying the grouping method.  It certainly gives us some robust choices!

And while we can certainly create each ranking method using Excel formulas (each is demonstrated in the sample file if you're curious), this is even more awesome.  Now we don't need to load data and land it in the grid.  We can go straight to Power Pivot or Power BI should we need to.

If you'd like to download a file with each of the methods illustrated, just click here.

Creating Dynamic Parameters in Power Query

A couple of years ago, the Power Query team added Parameters as a proper object, but I kept on creating dynamic parameters in Excel's Power Query the same way as I always had.  The reason for this is two-fold: the first is that I was used to it, and the second is that the built-in Parameters are quite static.  Sure, you can set up a list and change them at run time, but you have to enter the Power Query editor to do that.  And is that really something you want your boss doing?

So why do we care about creating dynamic parameters, anyway?

Let's take a look at my last technical blog post to understand this.  In that post, I pulled a Top 5 value from an Excel cell, and used that to drive how I grouped my items.  It works great, and is truly dynamic.  It puts control of the grouping in Excel, allowing a friendly user interface for the end user to work with.  They simply change a cell value, hit refresh, and all is good.

The challenge here is not from the end user's perspective, it's from the developer's.  One of the instructions I gave in the post last week was to:

  • Create a Custom Column using the following formula:
    • if [Rank] <= TopX then [Item] else "Other"

Why a Custom Column?  Why not just use the Conditional Column dialog?  The answer is simple… TopX in this case was a query that returned a value, but it was not a proper Power Query Parameter.  Does it work the same in code?  Pretty much yes, but you can't see it in the Conditional Column dialog as you're building the query.

Even worse, if you want to make any modifications to the logic, you have to do it in either the formula bar or the Advanced Editor, as the gear icon re-opens the Conditional Column builder but can't resolve the query:

Conditional Column Dialog

Wouldn't it be nice if we could create dynamic parameters that actually show up as valid Parameters to Power Query?  That's the goal of this post.

Groundwork - Creating the dynamic parameters in Excel

There are two different ways we can do this:

1. Fetching dynamic parameters using a Named Range

This is the super easy method.  To do this:

  • Enter your parameter value in a worksheet cell
  • Go to the Name Manager and define a name for the cell (I called mine rngKeep)
  • Select the cell and pull the data into Power Query
  • Right click the value in the table's cell --> Drill Down
  • Rename the query
  • Load it as a Connection only

For this example, I renamed my query to XL_TopX_NamedCell.  It's a long name, I know, but you'll see why in a bit.
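If you peek at XL_TopX_NamedCell in the Advanced Editor, the drill-down should have produced something like this (a sketch; Column1 is the default name Excel assigns when pulling in a single named cell):

```
let
    Source = Excel.CurrentWorkbook(){[Name="rngKeep"]}[Content],
    Value = Source{0}[Column1]
in
    Value
```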

2. Fetching dynamic parameters from a Parameter Table using the fnGetParameter function

I've detailed this technique on the blog before, so if you'd like to review this technique, you can find a detailed post on that here.  The quick summary:

  • Create a two column table called Parameters, with Parameter and Value columns

Parameters Table

  • Copy in the fnGetParameter function (from the other post)
  • Call the function as needed

Just to check the value, I then:

  • Created a blank query
  • Entered the following formula in the formula bar
    • =fnGetParameter("Keep top")
  • Named this query XL_TopX_fnGetParameter
  • Loaded it as a connection only

Query to check the dynamic parameter value

So what makes a parameter a "Real" parameter?

At this point, I decided to create a new parameter and look at what happens.  To do this, go in to the Power Query editor and…

  • Go to Home --> Manage Parameters --> New Parameter
  • Give the Parameter a name (I used Test)
  • Set a Current Value of 0
  • Click OK

Next, right click the Parameter in the Queries pane on the left and go to the Advanced Editor.  You should see code that looks like this:

0 meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]

So this is interesting… 0 is the value, and the rest is just a meta tag to tell Power Query that this is a real parameter.  This got me wondering… am I stuck with a static value, or can I feed it Power Query code and actually create a dynamic parameter that updates at run time?

Converting a Query to a dynamic Parameter - Take 1

The first thing I did here was copy everything after the 0, then exit the query.  I then:

  • Jumped over to the XL_TopX_NamedCell query
  • Entered the Advanced Editor
  • Pasted the copied line of code at the end
  • Clicked OK

And it didn't work.  Not one to give up, I jumped back into the Advanced Editor and wrapped the original query in parentheses like this:

Wrapping the original query in the Advanced Editor
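In other words, the full query body ends up looking something like this sketch (the let expression being the original XL_TopX_NamedCell code, with the copied meta tag appended):

```
(let
    Source = Excel.CurrentWorkbook(){[Name="rngKeep"]}[Content],
    Value = Source{0}[Column1]
in
    Value)
meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]
```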

And this time, something did change:

Dynamic parameter appears in the query list

There are 3 things worth noting here:

  1. It has the parameter icon (Yay!)
  2. It doesn't show a current value but shows an exclamation icon
  3. It shows the value of (…) in the name - meaning it doesn't know what the value is

I wasn't too worried about this last one though.  Dynamic named ranges show the same way in Excel, so would this work to create dynamic parameters?

Conditional Column Dialog

It sure does!  Not only does it show up in any parameter drop down, but the value gets read correctly and allows me to make my comparisons.  How cool is that?  I've actually got a dynamic parameter now!

Converting a Query to a dynamic Parameter - Take 2

Now, as cool as this was, there is something that bothered me about it.  When you tack the meta tag onto the end of a functional query and turn it into a parameter, you lose the applied steps.  If anything goes wrong, that makes it hard to debug.  (Reminds me of the classic custom function setup.)

To solve this, I decided to remove all the meta tags and parentheses from the XL_TopX_NamedCell query, returning it to what it was before I turned it into a parameter.  I then created a new blank query called TopX_NamedCell and edited the code in the Advanced Editor to read as follows:

XL_TopX_NamedCell meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]

Why?  Because I now have the query that pulls in the original data.  When I click on it, I can see the values and debugging steps to get there:

Checking the dynamic parameter value

And I also have a Parameter, which pulls from this value and can be used in my drop downs:

Conditional Column Dialog

Extending dynamic Parameters to leverage the fnGetParameter function

If you've used the fnGetParameter function before, it only makes sense that you'd want to know if we can leverage this function to pull values and return real Parameters.  And indeed you can.

Parameters that pull from fnGetParameter

Here's the quick and dirty way to create dynamic Parameters by calling the fnGetParameter function directly:

  • Create a new blank query
  • Name your new Parameter (I called mine TopX_DirectFromFunction)
  • Go into the Advanced Editor
  • Paste in the following code:

fnGetParameter("<Variable Name>") meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]

  • Replace <Variable Name> with the name of the variable you want from the Excel Parameter table.  In the example this would be fnGetParameter("Keep top")
  • Click OK

Yes, it's just that easy.  You've now got a fully functional and dynamic Parameter… at least, you do if you replaced the variable name correctly with one that exists in the Parameter table!

NOTE:  I recommend that you rename your query before you edit the M code since you lose the applied steps window during the process.  You can still rename a parameter, but you'll need to right click it in the queries pane on the left and choose Rename to do so.

Making dynamic parameters that pull from fnGetParameter auditable

There's only one problem with the above approach: how do you test that the value is resolving correctly before you try to use it?  And how do you see what is actually happening when your downstream queries return an error?

For this reason, I actually recommend that you don't use the fnGetParameter query in a real Parameter as outlined in the previous section.  What I recommend you do is create an intermediary query which leverages fnGetParameter to pull the value from the Excel table, then reference that query from the Parameter query.  So in short:

Create an intermediary query

This is also fairly easy to set up.  The full process would be:

    • Copy in the fnGetParameter function
    • Set up the Parameters table in Excel and populate it with data
    • Create a new blank query to retrieve the parameter value
      • Name it
      • Enter the following in the formula bar:
        • =fnGetParameter("<variable name>")
        • replace <variable name> with the name of the parameter you wish to retrieve
      • Load as Connection only
    • Create a new blank query to be the real Parameter
      • Name the parameter as you'd like to see it in drop down lists
      • Go into the Advanced Editor and enter the following
        • QueryName meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]
        • Replace QueryName with the name of the query you created above
      • NOTE: Parameters will automatically load as Connection Only queries
    • Use the new Parameter in other queries
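Put together, the pair of queries might look like this sketch (the query names and the "Keep top" variable are just examples from this post's setup):

```
// Intermediary query, e.g. named "TopX_Value" - loaded as Connection Only,
// so its applied steps remain visible for debugging
fnGetParameter("Keep top")

// Parameter query, e.g. named "TopX_Parameter" - references the query above
TopX_Value meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]
```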

See it in action…

The attached sample file contains three different variables based on the methods above, any of which can be used to drive the Conditional Columns step in the Grouped query:

Dynamic Parameters listed in the Conditional Column dialog

And if you're curious, they are related as shown below.  The TopX_NamedCell parameter is driving the version I saved, but as per the above, you can change that out easily.  (Naturally, in this case they all serve up the same value though!)

Query Dependencies View

Some Observations

As I was playing around with this, I noticed a couple of things that are worth bringing up here.

Yes, these work in the Power BI service!

To test this out, I cooked up a small sample that used a dynamic parameter using the methods outlined above to read the most recent year's data from a SharePoint folder.  I then published it to the Power BI service, added a new file to the server and refreshed the data in Power BI online.  Worked like a charm.

For the record, I haven't tested this, but I don't anticipate that it will work well with Power BI templates, as they will most likely clear the parameters and prompt you for values.  Any data points you wish to be preserved should be left as queries.

The Convert to Parameter function

Assume you created a new query, then typed a value into the formula bar (not a formula, but it could be numeric or text).  This would return a single (scalar) value that is static.  You'd then be able to right click the query in the Queries pane and choose Convert to Parameter.  Unfortunately, if your query returns anything that is dynamic or has multiple data points, this option is greyed out.  That's too bad, as this would be a really cool thing to be able to do.

Avoid the Add/Manage Parameter UI

Unfortunately, adding even a single dynamically-driven parameter renders the Manage Parameter dialog useless to you.  The reason is that as soon as you try to say OK to any parameter in that list (whether modifying or creating a new one), it appears to try to validate the current value of each of the listed parameters:

Add/Manage Parameter UI

This is unfortunate, as it means that you'd need to kick over to a blank query to create any new Parameters or debug any existing ones.

UPDATE:  Thanks to Andrew in the comments, I now know that you can uncheck the Required value when creating your parameter.  If you do that, the M code upon initial creation comes up as:

0 meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=false]

If the Required setting is false, the Manage Parameters dialog can still be used without forcing an update!

The Parameter meta tag

The only part of the Parameter meta tag that is actually required is the following:

meta [IsParameterQuery=true]

Having said that, I got mixed results doing this.  Sometimes the Parameters were not presented in my drop down list.  Editing those queries and restoring the full meta tag to the end resolved that immediately.  I.e.:

meta [IsParameterQuery=true, Type="Any", IsParameterQueryRequired=true]