Using Rich Data Types in Power Query

If you're on Office 365 and don't have Excel's new Rich Data Types yet, you should know that they'll be coming to you soon.  With the ability to create both Stocks and Geographies, they are going to add some exciting new capabilities to Excel, particularly if we want to enrich our data.  In this post, we'll quickly explore what Rich Data Types are, what they add, and how they are treated by Power Query.

What is a Rich Data Type?

Have a look at the following data:

Table of locations for experimenting with Rich Data Types

The challenge with this data is that it is completely text based.  What if we wanted to enrich this with more information like population, latitude or longitude?  The answer is to convert it to Excel’s new Rich Data Type.  To do this:

  • Select the data
  • Go to the Data tab -> Data Type -> Geography

This will then convert the text into “Entities” with a little map icon beside them.  And clicking on that little map icon shows some pretty cool new things:

Example of the Geography Data Type

This is the new geography data type. Unlike the original text entry, this object contains all of the properties you see on the card, adding a whole bunch of power to our original data.

NOTE:  The data on this card comes from a variety of sources such as Wikipedia and WeatherTrends360.  Full attribution can be found at the bottom of the card.

Working with a Rich Data Type

One of the very cool things about this new data type is the ability to expand the enriched data from the object.  To do this:

  • Mouse over the top right of the table
  • Click the Add Column button
  • Check the box(es) next to the columns you want to add

Adding enriched data to the Location table

Shown below, we’ve extracted Latitude, Longitude, Population and Name.

The Location table with enriched data added

Note:  This button just writes the formulas needed to extract the data from the Rich Data Type. We could have easily written formulas to do this ourselves, such as =[@Location].Latitude or =A4.Latitude.

The impacts of this should be pretty clear… even though we started with text, we now have the ability to convert it into a real place and pull further data back from that area!

Rich Data Types and Power Query

The ability to enrich a plain text data source is huge.  One simple example of their impact is that we could add the Lat/Long coordinates to allow proper mapping in Power BI. But how will Power Query read these new Rich Data Types?  Not well as it turns out…

The enriched Locations table has been brought into Power Query but creates an error

Ideally, Power Query would pull in this data and recognize it as a proper record, which would allow you to extract the elements.  And while I’m sure that will happen one day, it won’t be possible when Rich Data Types hit your build of Excel.

The trick to getting at this data today is actually already evident in the image above: create new columns in the original table.  Even though Power Query (in Excel or Power BI) can’t read the Rich Data Type itself, it CAN read the columns you extract via formulas.  It’s a workaround, and one we’d prefer not to have to do, but at least we can get to the enriched data that these new data types give us.
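Just to make the workaround concrete, here's a rough sketch of what the Power Query side might look like.  (The "Locations" table and "Location" column names are just for this example.)

let
    // Pull the Excel table in - the Rich Data Type column itself arrives as errors,
    // but the formula-extracted columns (Latitude, Longitude, etc.) come through as plain values
    Source = Excel.CurrentWorkbook(){[Name = "Locations"]}[Content],
    // Drop the unreadable Rich Data Type column and keep the extracted ones
    RemovedRichColumn = Table.RemoveColumns(Source, {"Location"})
in
    RemovedRichColumn

From there, the extracted columns behave like any other data in Power Query.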

Become a Data Master with Power Query

Ken is really excited to be teaching his popular Master Your Data with Power Query class for the first time in New York City! Join us this spring for a small group hands-on workshop and learn how to become a Data Master with Power Query.

Learn to become a Data Master with Power Query

Join Ken Puls for a live hands-on session in New York, NY on April 17, 2019.

What's so great about Power Query?

If there is one thing you need to learn in Excel today, Power Query is it. With Power Query, you can clean, reshape, and combine your data with ease. No more tedious cutting and pasting between multiple files. No more manually removing garbage rows or adding new columns. And no more repeating the same time-consuming steps whenever you need to refresh the data.

Instead, once you have your data the way you want it, all you have to do is click refresh and it will be ready to be loaded into the next day's/week's/month's/quarter's report. You can even schedule these refreshes to happen automatically!

What will be covered in the workshop?

The day will begin with a quick overview of Excel tables, PivotTables, and what makes "good" data. Next, Ken will show you how to import data from a wide variety of files, including Excel workbooks, CSV and TXT files, databases, and even entire folders. You'll be able to clean, transform, and refresh your data in Power Query with just a few clicks.

Ken will also show you how to append (or stack) data from multiple tables and 7 ways to merge (or join) tables without any VLOOKUPs. You'll be able to pivot data like this:

Pivoting Data with Power Query

and unpivot data like this:

Unpivoting Data with Power Query

But wait, there's more!

Ken will teach you some more advanced techniques using conditional logic. He will also give an overview of best practices for structuring your queries and query folding. Additionally, you'll receive a copy of the course slides to refer back to. Many of these slides contain handy recipes that will lead you step-by-step through the data transformation techniques.

Not only that, you will be able to directly ask a leading Power Query expert to help you with challenges you are currently facing with your own data. That kind of in-person access is invaluable!

How can Power Query help me?

A data wrangler spends the majority of their time just gathering, cleaning, and preparing the data before it can even be used in a report, chart, or other data model. Instead, become a Data Master with Power Query and get hours of your time back. For example, Ken was able to help a workshop attendee automate a workflow in 30 seconds, saving them 6 hours per week!

Power Query is the data preparation tool of the future, not only for Excel but also for Power BI Desktop, Microsoft Flow, and more. Thus, everything you learn in this course is transferable to other technologies - giving you more bang for your buck.

How do I make the case to my boss for sending me to the course?

As Ken mentioned in a previous blog post, the cost of the course can look like a lot up front. This is especially true if you must also pay for travel, hotel, etc. But divide the $499 USD registration fee + any expenses by your hourly rate. You'll see a return on investment pretty quickly with the time you save using Power Query.

However, the real value of the training comes in when you look at what you can do for your company with that extra time. You can now focus on analyzing the data instead of preparing it. See, Power Query turns your data into information. It allows you to identify new opportunities, make better decisions, and add real value to your organization.

Where do I go to become a Data Master with Power Query?

Ken will be leading this full-day workshop on Wednesday, April 17, 2019 at the NYC Seminar and Conference Center in New York City. But this is a small group session so there are only a limited number of spots available. Register today to secure your chance to receive personal guidance from a world-class Power Query expert. Go to the Excelguru website to view the full course description and register online.

One Solution to Challenge 4

Yesterday, I posted a new Power Query Challenge, and in this post I'm going to show my solution to challenge 4.  You can pick up this solution as well as solutions created by the community in this thread of the Excelguru Forum.  And as a quick note - the very first answer posted there is much slicker than what I've written up here... but hopefully some tricks here will still help you up your Power Query game. 😉

Background on the Solution to Challenge 4

The original issue was to create a header row from different rows in the data.  You can read the full reason for this in the original blog post, but simply put, I needed to convert this:

image

To this:

image

On a dynamic basis so that I could easily repoint the data set to this one:

image

And return this:

image

The major wrinkles in creating a Solution to Challenge 4

The biggest issues I had to deal with were these:

  • I couldn't just promote row 1 to headers and rename the first column.  Why?  Because Power Query would have hard coded "admin" in the first set, which would have triggered an error when pointed to Sales
  • I couldn't just rename the annual columns.  Why?  Because the years change, so Column2 is 2015 in one data set and 2016 in the other
  • After promoting the header row, I couldn't declare data types.  Why?  Because the column names get hard coded, meaning that the code would trigger an error when it couldn't find 2015 or 2018 in the data sets.

Fixing these on a static basis is easy; it's making the solution dynamic that is the issue.

Creating the Solution to Challenge 4

So how did I accomplish the goal?  I started with the workbook I posted in the forum, then took these steps.

As described in the original post, I edited the MakeMyHeaders query in order to do the work.  I then:

  • Demoted Headers (Home -> Use First Row as Headers -> Use Headers as First Row)
  • I then right clicked the "Changed Type" step in the Applied Steps area and renamed it to "AllData" (with no space)

image

This basically gives me an easy-to-come-back-to point of reference for later.  And since I did not include a space, it's super easy to type.  (If I left a space in there, it would be #"All Data" instead.)

Next, I needed to create my header row which involved:

  • Keeping the top 2 rows only (Home -> Keep Rows -> Keep Top Rows -> 2)
  • Right clicking and renaming the step "HeaderBase"

image

I then replaced the department name (Admin) with "Name". The trick was to make this dynamic, which involved a couple of steps.

  • Right click Column1 and do a replacement as follows:

image

  • Then, in the formula bar, remove the " characters around both HeaderBase[Column1]{1} and HeaderBase[Column1]{0}

image

So what was that all about?  It's replacing the value in the 2nd row - row {1} - with the value from the first row - row {0}.  And it's completely dynamic!  (For reference, the reason I reduced this to only 2 rows was that if I tried this when all rows were showing, the departments on the data rows would be lost.)

I then removed the top row (Home -> Remove Rows -> Remove Top Rows -> 1), leaving me with this:

image

So far so good.  Now I just needed to add back the original data.  To do that:

  • I went to Home -> Append and appended the MakeMyHeaders query (yes, I appended it to itself)
  • I then modified the formula from:

= Table.Combine({#"Removed Top Rows", #"Removed Top Rows"})

  • To

= Table.Combine({#"Removed Top Rows", AllData})

Which left this:

image

The final cleanup took a few more steps:

  • Promote First Row to Headers
  • Delete the automatically created Changed Type step (so we don't lock down the years in the code)
  • Home -> Remove Rows -> Remove Top Rows -> 2 (to get rid of the rows I used to create the header)

And we're done!

image
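For reference, if you stitch all of those steps together, the finished MakeMyHeaders query looks roughly like this.  (The step names are my own shorthand rather than the exact UI-generated ones, and I've folded the demote and rename steps together for brevity.)

let
    Source = Admin,
    // Demote the headers so the header row becomes data (renamed to AllData for easy reference)
    AllData = Table.DemoteHeaders(Source),
    // Keep only the top 2 rows needed to build the new header
    HeaderBase = Table.FirstN(AllData, 2),
    // Replace the value in the 2nd row {1} with the value from the 1st row {0} - fully dynamic
    ReplacedValue = Table.ReplaceValue(HeaderBase, HeaderBase[Column1]{1}, HeaderBase[Column1]{0}, Replacer.ReplaceValue, {"Column1"}),
    // Remove the top row, leaving just the new header row
    RemovedTopRows = Table.Skip(ReplacedValue, 1),
    // Append the original data back underneath the new header row
    Appended = Table.Combine({RemovedTopRows, AllData}),
    // Promote the header row - with no Changed Type step, so the years never get hard coded
    Promoted = Table.PromoteHeaders(Appended),
    // Get rid of the two rows that were used to build the header
    Result = Table.Skip(Promoted, 2)
in
    Result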

Proving that the Solution to Challenge 4 works

Naturally, loading this query to a table will show that it works.  But to really prove it out:

  • Edit the MakeMyHeaders query
  • Select the Source step
  • Change the formula to =Sales
  • Go to Home -> Close & Load

You'll see that it updates nicely with the updated headers.

Why the solution to Challenge 4 even matters

If you ever decide to combine Excel worksheets, and want to hold on to the worksheet name as well as the data, you'll need this technique!

Power Query Challenge 4

It's time for Power Query Challenge 4!  This one is a tricky little challenge about creating a header row - built from different rows in the data set.

The real world scenario driving Power Query Challenge 4

Have you ever tried to combine Excel files in a folder, and wanted to preserve the worksheet name along with the contents?  If you have, you'll end up looking at data that follows this kind of pattern:

image

Notice that in step 1 we have the sheet name and a table with the contents.  And step 2 shows what happens when we expand all columns from the table.  So what's the issue?

This is the crux of Power Query Challenge 4… we need a header row that looks like this:

image

Easy right?  Not so fast!

The value in the Name column will change for each file in the folder.  In addition, the data in the columns may also have different names.  So you can't hard code anything here…

Sample data for Power Query Challenge 4

Let's be honest, this isn't simple or it wouldn't be a challenge, but we're going to try to keep it simpler by focusing on just the header row issue. (We're going to skip the whole combine-files process and just use some pre-formatted Excel tables that exhibit the problem.)

To build and test your solution I'm providing a file with two different data tables (Admin and Sales) and a query called "MakeMyHeaders" that just refers to the Admin data set right now:

image

The data in the Admin table looks like this:

image

And Sales looks like this:

image

Solving Power Query Challenge 4

To solve this challenge, you should work in the MakeMyHeaders query, transforming the data so that it outputs the result shown here:

image

And, if you did it correctly, you should then be able to edit the MakeMyHeaders query, select the Source step and change the formula to =Sales.  After loading, you should get the output shown here (without any errors):

image

Simple right?  🙂

Posting Your Solution

As mentioned in The Future of Power Query Challenges, we are no longer accepting submissions by email.  And don't post your solution here either.  (Comments are still welcome!)  Instead, we are collecting answers on the Power Query Challenge 4 thread in our forum.  Post your solution there, provide a short description of the approach you took, and have a look at others' submissions.

My solution will come out as a blog post tomorrow.

The Future of Power Query Challenges

Whether by email or in person at conferences, I've received a lot of requests for more Power Query Challenges.  You all seem to love testing your skills, and I really want to keep the Power Query Challenges going.  In order to do so, however, I need to evolve the format a bit to make it sustainable.

As awesome as the Power Query Challenges are, I was a bit unprepared for how much you all would love them, and how much time they would take out of my schedule.  And I'll be honest, that's why you haven't seen any more so far - I just haven't had the cycles to do them.  Does that sound odd considering you're creating the solutions?

Why do the Power Query Challenges need to evolve?

The reality is that it takes me a good few hours to come up with the data set, come up with my own solution, then write up and publish both the challenge and solution posts.  So that on its own is a bit of a time eater, but one that I'm prepared to do, as sharing knowledge is part of what we do here at Excelguru.

The real killer, though, was my original intent to read and review all of the submissions.  That part has already become unmanageable for me.  Each of the first three Power Query Challenges has seen an increase in participation, with over 45 solutions provided by you all last time.  If you consider that it takes about 5 minutes just to open and review each submission, that's a good half day of work.  (And that's if they're easy to understand.)  Then try to group and write them up… I'm into this for probably another day.  I just can't put that much time into a non-chargeable service, as I need to earn funds to pay the awesome team that helps me do what I'm doing.

So the question I've been wrestling with is: how can I keep providing these challenges to you and still sustain the time I need to run my business?  Here are my thoughts…

What's the plan for new Power Query Challenges?

I still plan to curate and write up the challenge.  I still plan to write up my solution.  But rather than have you all email us with your solutions, I'm going to set up a post in the Excelguru Power Query forum with the original data.  If you want to play along, you can download the workbook there, and even upload your solution to that thread, with a short comment of how you approached the task - or what you think is special about your solution.

I see a number of benefits here:

  1. You can get your solution out there much more quickly than waiting for us to curate and post them.
  2. Solutions posted by others are available to you immediately upon submission (rather than waiting for us to share a link later).
  3. You can review and comment on the solutions posted by others (give kudos, ask questions, etc.).
  4. With a brief description, you can focus on the ones you feel are the most interesting to you.
  5. As it is a forum, it actually allows you a place to post challenges of your own, too.

Basically, it allows the challenge process to be a lot more "self-service", and isn't that what things are all about today?

I'm hoping that this might allow more interaction across the entire community around these challenges, without me becoming a roadblock to the process.  Do I anticipate fewer responses?  Yes, I do, but only because people will read the posted solutions before attempting the challenge on their own, or may not share theirs if they see a similar solution.  I'm not sure what we can do about that, or if it's even an issue.

Your Thoughts?

I'm curious as to your thoughts on this approach. Will it work for you?  Do you have concerns about it?  Does it make it less attractive?  Please feel free to leave your thoughts in the comments.

Nuthin’ ain’t nuthin’ in Power Query

There are two kinds of nuthin' in Power Query: null, and blank. I tripped on this issue the other day, and Ken thought it would be a good idea for a blog post.

Let's just call out the two types of nuthin' in Power Query:

  • null is literally "no value" for any data type from text to table.  In other words, the cell is completely empty.
  • A blank also looks like "no value", but it is equivalent to a cell formula of ="" in Excel.  In other words, the cell holds a value that renders as blank.

Why is this important?  It's because, inside Power Query (and indeed many programming languages), null and blank are not equal!  And it turns out that nuthin' matters more than getting the right nuthin' in Power Query!
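If you want to see the difference for yourself, paste this little sketch into a blank query.  In M, null is only equal to itself, so the two rows classify differently:

let
    // One row holds a true null, the other an empty string
    Demo = Table.FromRecords({[Value = null], [Value = ""]}),
    // null and "" take different branches
    Flagged = Table.AddColumn(Demo, "Kind", each
        if [Value] = null then "null"
        else if [Value] = "" then "blank"
        else "something else")
in
    Flagged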

Burned by nuthin' in Power Query

Some time ago I built a set of Excel Power Query transforms which report on data extracted from a client system. The client data had been extracted into Excel files for some time, but there were some problems, so my solution was to extract the client data as CSVs instead.

What I found, unexpectedly, was that blank data values are treated differently by the PQ import functions depending on whether the file being imported is an Excel file or a CSV file!

Here's the rules:

  • For an Excel import, blanks are converted to null - always.
  • For CSVs, blanks are imported as blanks (not nulls). But when a field type is changed in a Power Query step, blanks in numeric and date columns are converted to nulls, while blanks in text columns remain blanks (not nulls).
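To make that concrete, here's a sketch of a typical CSV import.  (The file path and column names are made up for the example.)

let
    // Everything arrives as text, and blank cells arrive as ""
    Source = Csv.Document(File.Contents("C:\Data\Export.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Setting types converts the blanks in the numeric column to nulls,
    // but the blanks in the text column stay as ""
    Typed = Table.TransformColumnTypes(Promoted, {{"Amount", type number}, {"Comment", type text}})
in
    Typed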

Seeing nuthin' in Power Query

What does it matter?  Have a look at the following cases...

1. Excel data with blanks loaded into PQ. Blank cells are imported as null.

2. CSV data with blanks imported to Power Query: Blanks are read as blanks.

3. CSV data with blanks: dates and numbers change to null after type change. Text blanks remain blank

The impacts of nuthin' in Power Query

The wrong nuthin' in Power Query caused me real issues here... my transforms used conditional columns to check for nulls in text columns - a test which fails when the value is a blank rather than a null.

As Ken pointed out, another key issue is that the Fill Down and Fill Up functions in Power Query are used to fill null values. But blanks are not nulls, so the functions do not work as intended in my case.

How to deal? Caveat Emptor! It appears Power Query treats blank data differently in the CSV import and Excel import functions.

So, if you are building transforms based on a consistent source of imported data, there is little impact. But if you need to change the type of data source like I did, then beware. I had to do some detailed testing, and I was looking at re-writing some longstanding and well-tested transforms.

Instead, for my specific situation, I added a value change step to the CSV import in PQ to replace nuthin' (blank) with the keyword null for the whole data set. That allowed me to maintain my downstream logic, but cost me some processing speed.
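For anyone wanting to replicate that, the step itself is a one-liner along these lines (assuming a prior step named Source, and searching every column via Table.ColumnNames):

// Replace blank text values with true nulls across the whole table
ReplacedBlanks = Table.ReplaceValue(Source, "", null, Replacer.ReplaceValue, Table.ColumnNames(Source))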

I hope this alerts PQ users to a potential issue in their transforms. If there are any alternate solutions to the problem let's see them in the comments.

Power Query Recipes Now Available

As you may have heard, we have been working on a set of helpful Power Query Recipes for Excel and Power BI and are excited to announce that they are now officially available!

Are our Power Query Recipes for you?

The Power Query Recipes are targeted at people who are familiar with the Power Query interface (in either Excel or Power BI), and will lead you step-by-step through the process needed to clean up and convert your data from one format into another.  And if you're not already comfortable with Power Query?  Consider joining the Power Query Academy so we can change that!

So what is in the Power Query Recipes package?

There are currently over 30 cards in the set, showing easy-to-follow steps that will deal with a variety of common data issues.  I personally am finding them super useful, and often refer back to them when I'm helping people clean up their data in person or in forums.

We've even marked each of the Power Query Recipes that has a video version in Power Query Academy.  If you are already subscribed to our Academy, simply click the video camera in the bottom corner, and it will take you straight to the appropriate video so that you can see the technique demonstrated in a live setting.

Are there samples of the Power Query Recipes?

Of course there are!

Here's an example of one of my favorites, which lays out how to create a Full Anti Join, something that does not exist in the regular Power Query user interface:

image
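The recipe card walks through the exact steps, but conceptually a Full Anti Join boils down to something like this in M (the Left, Right and Key names are placeholders for the example):

let
    Left  = #table({"Key", "A"}, {{1, "x"}, {2, "y"}}),
    Right = #table({"Key", "B"}, {{2, "p"}, {3, "q"}}),
    // Rows in Left with no match in Right
    LeftOnly  = Table.NestedJoin(Left, {"Key"}, Right, {"Key"}, "Matches", JoinKind.LeftAnti),
    // Rows in Right with no match in Left
    RightOnly = Table.NestedJoin(Right, {"Key"}, Left, {"Key"}, "Matches", JoinKind.LeftAnti),
    // Stack the two unmatched sets together
    Combined  = Table.Combine({Table.RemoveColumns(LeftOnly, {"Matches"}),
                               Table.RemoveColumns(RightOnly, {"Matches"})})
in
    Combined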

And another which shows how to create a Calendar Table on the fly:

image
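Again, the card lays it out step by step, but the core of a dynamic calendar is surprisingly compact in raw M (the start date here is just an example):

let
    StartDate = #date(2018, 1, 1),
    EndDate   = Date.From(DateTime.LocalNow()),
    // One row per day from StartDate through EndDate
    DayCount  = Duration.Days(EndDate - StartDate) + 1,
    Dates     = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
    Calendar  = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    Typed     = Table.TransformColumnTypes(Calendar, {{"Date", type date}})
in
    Typed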

We actually have a sample package available which includes four recipe cards (including the prerequisite card for the calendar recipe shown above), as well as the full table of contents for the current version.

Just some of the patterns included in the full version are:

  • Splitting data into new columns or rows
  • Pivoting, unpivoting, and transposing data
  • Several different ways to merge tables (including the Full Anti Join shown above)
  • Extracting a new column based on values in the prior or next rows
  • Six different ranking methods
  • Creating dynamic calendar tables
  • Adding a random number to all rows

Will there be more Power Query Recipes?

Oh yes!  While there are already over 30 recipes in the set, we have another 20 on our "to-do" list, and are adding to that list regularly.  Our intention is to release new recipes on a regular basis, putting even more Power Query resources at your fingertips.

How do I get my copy of the Power Query Recipes?

There are a few different options:

  1. If you are a Power Query Academy member*, the recipes are already included in your registration.
  2. Pre-order a copy of the new 2nd edition of our M is for Data Monkey book from Amazon by September 30, 2018 and receive a FREE set of the initial recipes.**
  3. Purchase the current stand-alone set of recipes from the Power Query Training web store for $24.95 USD. There is also an option to add a subscription ($2.95 USD every 3 months) so that you get all the new recipe cards as we expand the set.

* Please note that the recipe cards are not included in the free Power Query Academy trial, only with the full course.
**Upon receipt of your Amazon proof of purchase via email at support@powerquery.training, we will provide a coupon code for $24.95 which can be applied to either the stand-alone or subscription purchase.  Act fast though, as this offer ends on October 31, 2018.

Power Query Challenge 3

**Please note that the challenge is now closed, so we are no longer accepting submissions. However, you are still welcome to try it out on your own.

I've got a lot of feedback that you enjoy the Power Query Challenge series we've been running, so it's time for another!  Challenge 3 is just a fun one that was inspired by a conversation I had with Alex J.

Background for Challenge 3

I'm sure you've all seen this before:

If the letters A-Z are worth a value of 1-26 respectively, then:

  • Knowledge = 96%
  • Hardwork = 98%

Of course, the joke is that Attitude is worth 100%.

Your job for Challenge 3

For challenge 3, based on assigning a letter score of 1 for A, 2 for B, 3 for C, etc…:

  • Take a table of words
  • Return the total score using Power Query

So basically… do this:

image

But keep in mind that correctly solving Power Query Challenge 3 requires one very important thing… It needs to work no matter the case of the letters in the original column.
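Since the challenge is now closed, I don't mind sketching the flavour of a solution here.  This little scoring function (my own illustration, not the official answer) handles the case issue by upper-casing everything first:

let
    // Score one word: A=1 .. Z=26, regardless of the original case
    ScoreWord = (word as text) as number =>
        List.Sum(
            List.Transform(
                Text.ToList(Text.Upper(word)),    // "Knowledge" -> {"K","N","O",...}
                each Character.ToNumber(_) - 64   // "A" is 65 in Unicode, so A..Z map to 1..26
            )
        )
in
    ScoreWord

You could then apply it to the table of words with something like Table.AddColumn(Words, "Score", each ScoreWord([Word])).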

Before you submit your answers to Power Query Challenge 3

After our previous challenges, I got feedback that people really want to see the solutions submitted by others.  I haven't shared them only because I didn't specifically say I was going to, and I can't guarantee that there isn't personally identifiable information in them.  That will change here…

Submitting Your Answer to the Power Query Challenge

**The challenge is now closed, so we are no longer accepting submissions.

To submit your answer:

  • Please name your file using your name - or an alter-ego if you prefer.  (Keep in mind that your real name could be in the file properties, and it's up to you to clean that out.)  The reason we need a name is so that anyone reading the solutions can ask questions about your solution.
  • We have to cap this somewhere, so we will allow submissions until the end of Sunday, Sep 16 only.
  • While the challenge is open, you can submit your answer to Rebekah at Excelguru dot ca with the subject Power Query Challenge 3.

After Submissions are closed:

I will post my solution to Power Query Challenge 3, and share a link to a folder of all the submitted solutions.

Please note that for the last challenge we had over 30 submissions.  While I may post some highlights from the solutions, I won't be doing a full write-up of them all due to the volume we expect to see.

Have fun!

Disaster Recovery in Puerto Rico with Power Query

When Ken was at the Microsoft Business Applications Summit a few weeks ago, he met Mr. J.A. Garcia, who has been doing some amazing work with Power Query. We wanted to share his story about how he has been using Power Query to help with disaster recovery efforts in Puerto Rico:

"[In] my line of work there's been two defining moments that have changed the way we look at our tools. The first one was the Zika outbreak and the second one was Hurricane Maria.

The first time I saw Power Query was [as part of] Power BI during the Zika outbreak [in 2016]. One of our clients needed up-to-date information of the Zika outbreak and its effect on healthcare. With the help of a consultant, we started using Power BI and Power Query.

Aedes aegypti mosquito

An Aedes aegypti mosquito, one of the main transmitters of Zika virus.

I began taking courses during that time, and one of them was about Excel. That's when I learned about Get & Transform in Excel 2016.

Any new job that I received, I tried to use Power Query. I taught myself SQL so I could understand better the process of extracting data and how to integrate it into Power Query.

Our job was changing. We could give the tools to our clients that would let them refresh when they needed it the most. No more waiting [on] our area for a data refresh!

Then Hurricane Maria hit Puerto Rico [in September/October 2017]. It was a harsh two weeks of no communication. As soon as I came back from work, I noticed the change in attitude. As a healthcare company, we began doing Public Health.

Hurricane Maria - Disaster Recovery with Power Query

Hurricane Maria is regarded as being the worst natural disaster on record to affect Dominica and Puerto Rico and the deadliest Atlantic hurricane since Hurricane Stan in 2005.

My main job was identifying members with certain serious conditions. I used Power Query and Excel to create processes that obtain information from the assessment done to keep track of the efforts of the company. The clients could refresh the data and see who was missing, fix any data entry errors and more.

I'm very proud of my work, and Power Query in Excel and Power BI has been a large part of my growth. In the present, we have created a tool that refreshes constantly to help identify members with serious conditions. Now in case of any emergency, we'll know who to attend."

~ J.A. Garcia

We were very inspired by how Mr. Garcia began his Power Query journey as part of the disaster recovery efforts after these emergencies, and that he and his team continue to leverage this powerful tool in both Excel and Power BI. Power Query really can help save lives!

Do you have a story to share about your Power Query journey? Maybe it hasn't saved your life literally, but perhaps it has saved you hours of time and effort, a significant amount of money, or even your sanity! Let us know in the comments below or contact us through the Excelguru site.

Power Query Challenge #2 Results

What an overwhelming response to Power Query Challenge #2!  We had 40 submissions, and some with multiple entries in a single submission.  Plainly you all enjoyed this!

Naturally, there were a couple of submissions that involved custom functions, and a couple who wrote manual grouping functions to get things done.  These folks obviously know how M works, so I'm going to focus more on the other entries to show different UI-driven routes to accomplish the goal. 😉  Each of those is included in the workbook that you can download here.

The Base Query

I'm going to start this by creating a base query called "Source Data" which has only 2 steps:

  • Connect to the Data Table
  • Set the data types

This is going to make it easy to demo certain things, but also replicates what a lot of you did anyway.

Most Popular Solutions to Power Query Challenge #2

By far the most popular solution to Power Query Challenge #2 started with one of the following two methods:

Method 1A - Group & Merge

  • Reference the Source Data query
  • Merge Customer & Membership
  • Remove duplicates on the merged column
  • Group by the Customer column and add a Count of Rows

Method 1B - Group & Merge

  • Reference the Source Data query
  • Remove all columns except Customer & Membership
  • Group by the Customer column and add a Count of Distinct Rows

Either of these methods would leave you with something similar to this:

image

Method 1 Completion

No matter which way you sliced the first part, you would then do this to finish it off:

  • Filter the Count column to values greater than 1
  • Merge the filtered table against the original data set:
    • Matching the Customer column
    • Using an Inner join
  • Remove all columns except the new column of tables

image

  • Expand all columns
  • Set the data type of the Data column and you're good

image
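If you prefer to see it in raw M, the whole Method 1B route (including the completion steps) condenses to something like this.  (The step names are mine; SourceData is the base query described above, with Customer and Membership columns.)

let
    Source = SourceData,
    KeptColumns = Table.SelectColumns(Source, {"Customer", "Membership"}),
    // Count the distinct rows per customer
    Grouped = Table.Group(KeptColumns, {"Customer"},
        {{"Count", each Table.RowCount(Table.Distinct(_)), Int64.Type}}),
    // Keep only customers with more than one distinct membership
    MultiProduct = Table.SelectRows(Grouped, each [Count] > 1),
    // Inner join back against the original data
    Merged = Table.NestedJoin(MultiProduct, {"Customer"}, Source, {"Customer"}, "Data", JoinKind.Inner),
    DataOnly = Table.SelectColumns(Merged, {"Data"}),
    // Expand the nested tables back into rows (then set data types to finish)
    Expanded = Table.ExpandTableColumn(DataOnly, "Data", Table.ColumnNames(Source))
in
    Expanded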

Of the 34 entries, this variation showed up in at least 25 of them.  Sometimes it was all done in a single query (referencing prior steps), sometimes in 3 queries, and sometimes it wasn't quite as efficiently done, but ultimately this was the main approach.

A Unique Solution to Power Query Challenge #2

I only had one person submit this solution to Power Query Challenge #2.  Given that it is 100% user interface driven and shows something different, I wanted to show it as well.  I've labelled this one as Pivot & Merge.

Here's the steps:

  • Reference the Source Data query
  • Remove all columns except Customer & Membership
  • Select both columns --> Remove Duplicates
  • Pivot the Customer column (to get a column of products for each customer)

image

  • Demote the headers to first row
  • Transpose the table

And at that point, you have this view:

image

Look familiar?  You can now finish this one using the steps in "Method 1 Completion" above.

Personally, I don't think I'd go this route, only because the Pivot/Transpose could be costly with large amounts of data.  (To be fair, I haven't tested any of these solutions with big data.)  But it is cool to see that there are multiple ways to approach this.

The Double Grouping Solution to Power Query Challenge #2

This is the solution that I cooked up originally, and is actually why I threw this challenge out there.  I was curious how many people would come up with it, and only a couple of people did.  So here's how it works:

  • Reference the Source Data query
  • Stage 1 grouping:
    • Group the data by Customer and Membership
    • Add a column called "Transactions" using the All Rows operation

This leaves you here:

image

Now, you immediately group it again using a different configuration:

  • Group by Customer
  • Add columns as follows:
    • "Products" using the Count Distinct Rows operation
    • "Data" using the All Rows operation

Which leaves you at this stage:

image

It's now similar to what you've seen above, but we have a nested table that contains our original data.  To finish this off, we now need to do this:

  • Filter Products to Greater than 1
  • Expand only the Transactions column from the Data column
  • Right click the Transactions column --> Remove Other Columns
  • Expand all fields from the Transactions column
  • Set the data types for all the columns

And you're there!

image
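For the curious, the double grouping pattern looks something like this in raw M.  (Once again, the step names are mine and SourceData is the base query.)

let
    Source = SourceData,
    // Stage 1: group by Customer and Membership, tucking the detail rows into "Transactions"
    Stage1 = Table.Group(Source, {"Customer", "Membership"},
        {{"Transactions", each _, type table}}),
    // Stage 2: regroup by Customer alone, counting distinct rows and nesting the rest in "Data"
    Stage2 = Table.Group(Stage1, {"Customer"},
        {{"Products", each Table.RowCount(Table.Distinct(_, {"Customer", "Membership"})), Int64.Type},
         {"Data", each _, type table}}),
    // Filter Products to greater than 1
    Filtered = Table.SelectRows(Stage2, each [Products] > 1),
    // Expand only the Transactions column from the Data column
    ExpandedData = Table.ExpandTableColumn(Filtered, "Data", {"Transactions"}),
    RemovedOther = Table.SelectColumns(ExpandedData, {"Transactions"}),
    // Expand all fields from the Transactions column (then set the data types to finish)
    ExpandedAll = Table.ExpandTableColumn(RemovedOther, "Transactions", Table.ColumnNames(Source))
in
    ExpandedAll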

Final Thoughts

Again, there were other solutions submitted for Power Query Challenge #2.  We had:

  • A couple of custom function submissions (of which each was slightly different)
  • A couple of custom grouping solutions (not written through the UI)
  • A couple of solutions that used grouping, then used a custom column to create a table based on the grouped output which filtered to distinct items

If I haven't covered yours here and you feel that I missed something important, please drop it in the comments below!

The part that fascinates me most about this is that we had UI-driven submissions involving merging, transposing and grouping.  Three different methods to get to the same end result.

Thanks for the submissions everyone!