Label Duplicates with Power Query

Recently, a reader commented on a blog post that I wrote back in 2015.  Their question essentially boiled down to working out how to label duplicates with Power Query.  As an additional twist though, they also wanted to ensure that the first naturally occurring data point was never accidentally labelled as the duplicate.  As Power Query often re-sorts data at inopportune times I thought it was worth a look as to how to accomplish this.

The Goal:  Label Duplicates with Power Query

Our original source data is shown in blue columns below, with the green column on the right being the one that we want to add via Power Query.  (The white column on the far left contains row numbers.  They aren’t actually part of our source data at all and are only intended to make it easier to follow the explanation below the image.)

A table with our source data on the left where we want to label duplicates with Power Query as shown in the final column

The important things to notice here are:

  • Row 2 of the table records the initial entry for SKU 510010 (Canadian), with a duplicate on row 12
  • We have an original entry of SKU 510032 on row 15 and a repeat on row 18.

The key thing that we want to ensure as we flag the duplicates in this scenario is that the sort order is always retained as per the original order of the data source.  While you’d think this shouldn’t be hard, the reality is that there are many occasions where Power Query will re-sort your data on the fly, and we cannot let that happen here.

Getting Set to Label Duplicates with Power Query

The way I would approach this task – providing that the data has already been loaded to Power Query – is to do this:

  • Add an Index Column --> From 1
  • Select the SKU column --> Transform --> Group By
  • Configure the “New Column Name” to call it “Data” using the “All Rows” aggregation --> OK

Adding an All Rows aggregation via the grouping dialog

  • Go to Add Column --> Add Custom Column and use the following formula:
    • Table.AddIndexColumn( [Data] , "Instance" , 1 )
  • Right click the Custom column --> Remove Other Columns
  • Expand all columns from the Custom column

Now, if you’ve been following my work at all, you may recognize the data pattern I just used.  It’s called Numbering Grouped Rows, and it is available as one of the Power Query Recipe cards and is also illustrated in Chapter 13 of my Master Your Data for Excel and Power BI book.  The result is a data table that looks like this:

The data points in Power Query with columns added to show the original row number and the instance of each point

As you can see, the Index column preserves the original row numbers of the data set.  In addition, the “Instance” correctly records the order of their appearance in the data set.
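For readers who prefer to see the M behind those clicks, here is a minimal sketch of the Numbering Grouped Rows steps.  The inline sample table, the Brand column and the step names are assumptions made purely for illustration; your own query would start from your real data source, and the user interface may generate slightly different code.

```powerquery
let
    // Hypothetical sample data standing in for the article's source table
    Source = #table(
        type table [SKU = text, Brand = text],
        {
            {"510010", "Canadian"},
            {"510032", "Brand B"},
            {"510010", "Canadian"}
        }
    ),
    // Record the original row order before anything can re-sort the data
    AddedIndex = Table.AddIndexColumn(Source, "Index", 1, 1),
    // Group by SKU, keeping every row of each group in a nested table called "Data"
    Grouped = Table.Group(AddedIndex, {"SKU"}, {{"Data", each _, type table}}),
    // Number the rows inside each nested table, starting from 1
    AddedCustom = Table.AddColumn(
        Grouped,
        "Custom",
        each Table.AddIndexColumn([Data], "Instance", 1, 1)
    ),
    // Keep only the numbered nested tables, then expand all of their columns
    RemovedOthers = Table.SelectColumns(AddedCustom, {"Custom"}),
    Expanded = Table.ExpandTableColumn(
        RemovedOthers,
        "Custom",
        {"SKU", "Brand", "Index", "Instance"}
    )
in
    Expanded
```

In this sample, the second 510010 row keeps its Index of 3 and receives an Instance of 2, which is what lets us tell it apart from the first occurrence later.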

Applying Labels to the Duplicates

This is the easy part:

  • Go to Add Column --> Conditional Column --> name it “Occurrence” and configure it as follows:
    • if the Instance column equals 1 then return the text Original, else return the text Duplicate
  • Sort the Index column --> Sort Ascending
  • Select the Index and Instance columns --> press the DEL key
  • Set the data types of each of the columns

And that’s it. The data points have all been labeled and can now be loaded to the desired destination:

Our final output with duplicates highlighted
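The labelling steps can also be sketched in M.  The small table below is a hypothetical stand-in for the expanded table produced by the grouping steps, and the step names are assumptions; the logic is simply a conditional column, a sort on Index to restore the original order, and a clean-up of the helper columns.

```powerquery
let
    // Hypothetical stand-in for the table that already has Index and Instance
    Expanded = #table(
        type table [SKU = text, Index = Int64.Type, Instance = Int64.Type],
        {
            {"510010", 1, 1},
            {"510010", 3, 2},
            {"510032", 2, 1}
        }
    ),
    // First appearance of a SKU is the original, every later one is a duplicate
    AddedOccurrence = Table.AddColumn(
        Expanded,
        "Occurrence",
        each if [Instance] = 1 then "Original" else "Duplicate",
        type text
    ),
    // Put the rows back into the order of the original data source
    Sorted = Table.Sort(AddedOccurrence, {{"Index", Order.Ascending}}),
    // The helper columns have done their job, so remove them
    Removed = Table.RemoveColumns(Sorted, {"Index", "Instance"})
in
    Removed
```

Sorting on Index before removing it is the important design choice here, as it is the only column that still remembers the source order after the grouping.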

If you'd like to play with this scenario, you can find the completed sample file here.

Learning More

I love data patterns and include a ton of them in Master Your Data for Excel and Power BI and our Power Query Recipe cards.  Both of those resources are also included in our Power Query Academy video course, where you can actually see them performed live.  I have to say - of all the recipes I have - Numbering Grouped Rows is one of my particular favourites.  It has a ton of utility in all kinds of scenarios.

Master Your Data is Now Available!

You read that correctly, the Data Monkey has landed, and Master Your Data for Excel and Power BI is now available in PDF format from your favourite online bookstore!

Master Your Data book cover

You may have noticed that the past few months have been pretty quiet on the Excelguru blog.  A big reason for this is that I essentially went offline to focus on finishing this book.  As many of you know, it’s been pushed back before (more than once), and we wanted to ensure that this would not happen again.  I’m pleased to say that the final version has now gone to the printers, and we’ll see physical copies start being distributed by Mr. Excel and Amazon on November 1, 2021.  The even better news however, is that you can get your hands on a digital copy today, as Master Your Data for Excel and Power BI is now available for sale at Skillwave.Training.

What’s in Master Your Data for Excel and Power BI?

Miguel and I are super proud of what we’ve put together here.  This has been a very long journey to get to this point, and we wanted to make sure that we delivered the best book we possibly could.  With the exception of a few of the paragraphs in the Foreword and Chapter 0, the book has been re-written from scratch, covering much more material than was covered in M is for Data Monkey.  How much more?  M is for Data Monkey was 226 pages long.  Master Your Data for Excel and Power BI clocks in at 369 pages in total.

To give you an idea of the topics we covered in this book, have a quick peek at the table of contents:

  • Chapter 0 - The Data Revolution
  • Chapter 1 - Power Query Fundamentals
  • Chapter 2 - Query Management
  • Chapter 3 - Data Types and Errors
  • Chapter 4 - Moving Queries Between Excel & Power BI
  • Chapter 5 - Importing from Flat Files
  • Chapter 6 - Importing Data from Excel
  • Chapter 7 - Simple Transformation Techniques
  • Chapter 8 - Appending Data
  • Chapter 9 - Combining Files
  • Chapter 10 - Merging Data
  • Chapter 11 - Web Based Data Sources
  • Chapter 12 - Relational Data Sources
  • Chapter 13 - Reshaping Tabular Data
  • Chapter 14 - Conditional Logic in Power Query
  • Chapter 15 - Power Query Values
  • Chapter 16 - Understanding the M Language
  • Chapter 17 - Parameters and Custom Functions
  • Chapter 18 - Date and Time Techniques
  • Chapter 19 - Query Optimization
  • Chapter 20 - Automating Refresh

One of the biggest changes you’ll see up front is that in Master Your Data for Excel and Power BI, we tried to make sure we covered a lot up front about getting started with Power Query, understanding errors and query management.   It wasn’t until we got through those topics that we dove into specific data transformation techniques.  This was a slightly different approach to what we did with our first book.  While query management was covered briefly near the end of M is for Data Monkey, we changed that up this time, as we wanted to make this a practical guide that sets our users up for long term success as they build their solutions.  Mastering your data is about much more than just getting a transformation complete – it is about making sure that you can re-use it in future.

The Tricky Bits (for us)

When we were laying out the book, Miguel and I had a few different goals:

  1. Make this an awesome resource for beginners
  2. Include great material to up the game of intermediate users as well as seasoned pros
  3. Create a useful resource that you’d go back to again and again
  4. Write a book that will survive longer than six months
  5. Deliver the book without delaying it again

Each of these is a challenge in its own way, of course.

The fourth point on the list (writing a book that will survive longer than six months) is not totally in our control.  In fact, less than an hour after I posted on my personal Facebook that the first draft of the book was done and sent to the publisher, Microsoft announced a visual refresh of Office on their blog.  (Fortunately, all screenshots that rely on the Excel ribbon have been updated to reflect the new design, putting that to bed for now!)  Naturally, new features will get added to Power Query, but our hope is that this book will still be accurate and useful for years to come.

With regards to making this an awesome resource for beginners, we have been teaching Power Query to beginners for a long time via our Power Query Academy, as well as in-person courses.  We have taken lessons from those experiences when deciding how to approach the material, balancing the speed and techniques used as we move through the material.  We’re confident that a new user to Power Query will be able to find the material accessible and approachable, and that the material in these pages will change their (data) life forever.

Building a book that also answers the needs of the intermediate and seasoned pro is a lot tougher, especially when trying to keep the material accessible to newer users.  We believe we’ve hit that balance and have added a ton of material to help people get deeper into Power Query.  From deeper explanations of Query Folding and the Lazy Evaluation engine, completely redesigned chapters on the M language and material that deals with Privacy and the ever-irritating Formula Firewall, we are sure that even the most seasoned pro will pick up some tips and knowledge from the material.

My Favourite Chapters

Honestly, I’m very proud of how everything came together, as well as the journey that we lay out for the reader.  But like anything that I put together, there are some chapters that I feel especially good about.

Chapter 2 on Query Management was something that I have wanted to include ever since I started teaching my Dimensional Modeling courses.  That section of the course is always an eye opener and I’ve lost count of how many times people tell me that it has changed the way they approach building their solutions.  It’s awesome to be able to put that information in writing, in a place that all our readers will be able to see it.

Chapter 9 on Combining Files is pretty awesome too.  It follows the steps outlined in my Power Query Recipe cards and will probably be one of the most impactful chapters in the book for many users: especially those who receive files on a monthly basis that need to be cleaned and combined.  The whole Combine Files experience was released to Power Query about six months after M is for Data Monkey went to print, so it’s great to finally be including the modern experience in Master Your Data for Excel and Power BI.

Chapter 10 is one that I’m particularly jazzed about.  My initial page estimate for this chapter was 12 pages… it ended up taking 21.  In that chapter we cover every native join that Power Query provides, plus a discussion on the Full Anti Join, Cartesian Products (cross-joins), Approximate Match joins and comprehensive coverage of Fuzzy Matching.

Chapter 17 on Parameters and Custom Functions is another one that I just love.  Within the last two weeks I’ve actually reached back into that chapter twice on a personal basis to build solutions for clients.  We cover how to manually rebuild the whole Combine Files experience that Power Query does for you, and then build some increasingly complex examples, eventually landing on an improved fnGetParameter function that is built from scratch.  Super cool stuff.

Then there is Chapter 18, which begins with building several varieties of dynamic Calendar tables.  We’re not just talking about a list of dates, we’re talking about the patterns needed to build 12-month calendars with non-standard year ends, 445 calendars and more, including the columns that return the period IDs and dimensional fields for each. (These are the same patterns that we build with Monkey Tools!)  But that’s not all that is nestled in Chapter 18…  I also snuck in some methods to answer some questions that I get frequently from accountants: “How do I allocate my sales/expenses over x periods?”

There is so much material that we ended up getting into this book – some of which has never been seen before… I can’t wait for people to see it.

Getting Your Copy of Master Your Data for Excel and Power BI

If you are an active member of our Power Query Academy, or my Self Service BI Bootcamp, you’ve already got a copy waiting for you in your Skillwave dashboard.  And if you are an alumnus of the Academy (your subscription is no longer active), you’ll be getting an email over the next week to let you know how to claim your copy.

Physical copies won’t be fulfilled until Nov 1, 2021 but can be pre-ordered via either the Mr. Excel store, or Amazon.  But if you’d like to pick up the PDF version, you can do so right now at Skillwave.Training.

Building a SelectQuery Function

For a while now, I have been wanting to have the capability in Power Query to select a query by name PROGRAMMATICALLY.  Why?  Building a SelectQuery Function would allow me to execute one of multiple “Transform” queries depending on a user selection on the Excel sheet. This will help me process log files from multiple vendors which each have different contents and field names.

Here is a fairly simple example with only three input queries (although my true setup actually has seven potential queries to select from):

Illustration of the query chain with three queries that pull from a single data source, and another query that feeds the SelectQuery function to choose which to execute

How the setup is intended to work:

I have a named range called “User_Select” on my sheet that has a Data Validation dropdown with the names of my source queries:

And what I want to do is read the value from the User_Select named range into my query named Selection_Query. This provides a scalar value that matches the query I’d like to execute.
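As a rough sketch, the Selection_Query could look something like the code below.  It assumes that User_Select is a single-cell named range without a header, in which case Power Query exposes its value in a column called Column1; if your range is set up differently, the column name will differ.

```powerquery
let
    // A named range shows up in Excel.CurrentWorkbook() as a small table;
    // for a single cell it has one row and one column named Column1
    Source = Excel.CurrentWorkbook(){[Name = "User_Select"]}[Content],
    // Drill into the first row to return the cell's value as a scalar
    Selection = Source{0}[Column1]
in
    Selection
```

Because the query ends on a drilled-down value rather than a table, any other query that refers to Selection_Query receives plain text such as Query 1.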

The Issues

  • I do not want to land the multiple input queries on sheets (too much data involved).
  • I am not a fan of Power Query Parameters for this approach, as they must be changed from within the Power Query user interface (I don’t really want my users going in there.)
  • I do not want to use “brute force” – I want to do this programmatically so that it is easy to maintain in future.

What is the “brute force” method?

The brute force method is essentially coding a great big IF/THEN statement that contains each possible query.  Looking at the M code, you’d end up with something like:

let
    Source = if Selection_Query = "Query 1" then #"Query 1" else
        if Selection_Query = "Query 2" then #"Query 2" else
        if Selection_Query = "Query 3" then #"Query 3"
        // (and so on, and so on, and shoobie-doobie-doo)
        else null
in
    Source

The problem however is that each time I create a new “Transform” for a new vendor, or retire it from production, I would also need to come back and update my brute force query to reflect these changes.  It would be MUCH simpler if I only had to add/remove the query name from my drop-down list, and not worry about messing with the M code of my Selector query.

Some hope for building a SelectQuery function

On 19 Feb, Gasper Kamensek presented a session at VANPUG’s Power BI track that got me excited. In his presentation, he showed how to programmatically select from some LANDED queries using the Excel.CurrentWorkbook() statement in Power Query:

let
    Source = Excel.CurrentWorkbook(){[Name=Selection_Query]}[Content]
in
    Source

Enter Expression.Evaluate

Now, that worked great for items that had been landed to a worksheet table and got me thinking about this some more.  The challenge I’ve been facing is that I need to select from queries which were NOT landed to a worksheet table and therefore don’t show up via the Excel.CurrentWorkbook() function.  Wondering if this was even possible, I asked my friend Ken Puls. And guess what he Puls-ed out of his bag of tricks?

Source = Expression.Evaluate("some text string", #shared)

Now, I had encountered Expression.Evaluate() in the Power Query M function reference, but it was not clear to me what it was intended to do. But after Ken and I bashed this back and forth a bit… WOW!  Does this ever have potential!

Ken explained that Expression.Evaluate() works very similarly to Excel’s INDIRECT() function - it takes an input and tries to evaluate it at run-time.  Unlike Excel, which seems to just evaluate the provided term against any and all Excel items, Expression.Evaluate() requires you to specify the library you want to use to interpret the code.  And that’s where the #shared parameter comes in, as this parameter provides a record of not only all Power Queries in the solution, but also all of the available Power Query functions.
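To make the role of that second argument a bit more concrete, here is a small sketch that evaluates three text strings.  The step names and the x and y fields are made up for illustration only.

```powerquery
let
    // Operators need no environment at all, so this returns 3
    NoEnvironment = Expression.Evaluate("1 + 2"),
    // Library functions are only visible if the environment supplies them;
    // #shared supplies all of them, so this returns "ABC"
    WithShared = Expression.Evaluate("Text.Upper(""abc"")", #shared),
    // A hand-built record works as an environment too, so this returns 3
    WithRecord = Expression.Evaluate("x + y", [x = 1, y = 2])
in
    [NoEnvironment = NoEnvironment, WithShared = WithShared, WithRecord = WithRecord]
```

Leaving #shared off the second example would raise an error, as Text.Upper would not be known inside the evaluated expression.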

So Ken’s suggestion was to pass the name of the query I wanted in to the Expression.Evaluate function, and evaluate it against the shared library. At that point – he told me – it should give me the results of that query.

Armed with this theory, I was eager to plug it in to my SELECTOR query, which gave me this:

let
Source = Expression.Evaluate(Selection_Query, #shared)
in
Source

AND IT DIDN’T WORK.

Expression.Identifier to the Rescue!

Turns out, it’s not Ken’s fault – I like to name my queries with spaces and leading numbers.  After a little digging, it became apparent that the Expression.Evaluate() needed me to refer to “Query 1” with a pound sign and quotes.

In other words, this DOESN’T work:

=Expression.Evaluate("Query 1", #shared)

But this DOES:

=Expression.Evaluate("#""Query 1""", #shared)

So now I just needed to figure out how to automatically “escape” the query with the #” “ requirement where necessary.  I suppose I could have put those into my Excel drop down, but that would make the list values look kind of ugly, so I went hunting for something a bit more elegant.

After poking around in the M manual, I found Expression.Identifier(“some text”), and guess what it does? It converts the name we see in the Queries & Connections panel into the correct “# and quote” syntax.

So that gives me:

let
    qName = Expression.Identifier(Selection_Query),
    Source = Expression.Evaluate(qName, #shared)
in
    Source

AND IT WORKS!
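If you’d like to see what Expression.Identifier is actually handing over to Expression.Evaluate, a quick test query like this one shows the difference between a name that needs escaping and one that does not (the step names are just for illustration):

```powerquery
let
    // No spaces or special characters, so the name comes back unchanged
    Plain = Expression.Identifier("Selection_Query"),   // Selection_Query
    // The space forces the pound sign and quotes
    Spaced = Expression.Identifier("Query 1")           // #"Query 1"
in
    [Plain = Plain, Spaced = Spaced]
```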

Completing the Solution

To make this as flexible as possible (and allow me to use it in other projects), I decided that building a SelectQuery function was the way to go.  So here’s what I ended up with:

The fxSelectQuery function:

(qName) =>
let
Source = Expression.Evaluate( Expression.Identifier(qName) ,#shared)
in
Source

And at that point, I can invoke whatever query I need by passing the results of the Selection_Query to the fxSelectQuery function like this:

The Output Query:

let
Source = fxSelectQuery(#"Selection_Query")
in
Source

And the end result is that I select the query I want to run from an Excel data validation list, click update, and I’m done.  How cool is that?
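One possible refinement, offered as a sketch rather than as part of the original solution, is to check that the requested name really exists before evaluating it.  The function below is a hypothetical variation on fxSelectQuery; the error reason and message text are invented for this example.

```powerquery
(qName as text) as any =>
let
    // #shared is a record, so we can test whether it has a field with this name
    Exists = Record.HasFields(#shared, qName),
    Result =
        if Exists then
            Expression.Evaluate(Expression.Identifier(qName), #shared)
        else
            // Raise a readable error instead of a cryptic evaluation failure
            error Error.Record(
                "SelectQuery.NotFound",
                "No query or function with this name was found",
                qName
            )
in
    Result
```

This way a typo in the drop-down list should surface as a clear message that includes the offending name.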

One caveat that I should probably mention here is that you must disable the formula firewall in order to use this setup.  You can do this by going to Get Data -> Query Options -> Current Workbook -> Privacy ->  Ignore.

You can download the example file here if you’d like to see the results of building a SelectQuery function in action.

Major redesign at Skillwave.Training

This past weekend we published a major redesign at Skillwave.Training.  Months in the making, this has been a total overhaul to focus on delivering the best online learning experience for our clients.  Check out some of the images from the new site:

Centralized Dashboard

When you log in, you’ll be taken to your Dashboard immediately.  This is the one stop console that will let you access any of your active course subscriptions, review forum posts, download your files, and manage your billing and profile details.  We’ve worked hard to make this dashboard as intuitive and easy to use as possible, and to make it look great on mobile as well.

Re-Designed Course Player

The course player is completely custom built as well.  Of course, you’d expect to see your navigation menu on the left to get to your lessons, but we’ve also added a “Materials” fly out menu on the right where you can access files specific to any given lesson.

The Materials flyout in action in the Skillwave Course Player

Community Forum Overhaul

We said it was a major redesign at Skillwave.Training, and we meant it.  One of our big goals here was to do a better job with the Skillwave help forum and foster a sense of community within it.  Our belief is that learning is great, but there can be another hurdle when trying to convert theory into practice with your own data.  We see the forum experience and Skillwave community as a crucial part of solving this issue, giving students the ability to:

  • Ask questions about the course materials,
  • Get help with applying techniques to their own data,
  • Interact with other people in the same training,
  • Practice applying their skills to other data sets, and
  • Reinforce their knowledge and help others in the process.

Any of our clients who have an active subscription to one of our paid products will find a completely revamped forum experience.  As forum posters ourselves, there were a few very important things that we wanted to give our community a good set of tools for:

  1. Asking. To this end, we’ve made sure that we support topic tags, image and file uploads, code tags and a variety of rich formatting options.  (Our old forum was quite weak in this regard.)
  2. Answering. In addition to the tools above, we’ve added the ability to mark questions as solved. Our forums are searchable based on topic tags, answered status, solved status and more.
  3. Ensuring high quality answers. Our forum is private and monitored by our admin team.  Even if Matt, Miguel or I aren’t the ones answering specific questions, we have a special “Recommended Answer” tag that we can apply to answers.  This serves two purposes for us: the first is providing assurance to the asker that they got a great answer, while the second is providing validation to a poster that they’ve provided a high-quality response.

Course to Question Integration

There’s one more really cool thing though… We also now give you the ability to post a forum question directly from a given lesson and provide links to all other questions that have been posted in this manner.  This serves both askers and answerers as it links directly back to the source of the question.  We’re super proud of this little feature and feel that it sets us apart from other platforms out there.  Not because other platforms don’t offer the ability to ask questions – they do.  But we serve all of that up right inside the lesson page.

A demo of the integration from course player and our forum

Check Out the major redesign at Skillwave.Training

If you haven’t checked out Skillwave.Training yet, you really should.  We’ve got all kinds of great courses related to Excel, Power BI, Power Query and DAX.  You can even try out the platform via our free Power Query Fundamentals course.  You won’t have access to the forums on the free tier, but you’ll be able to experience the rest of our new platform.

As we've just launched the site, we'd love to get your feedback.  For the next month or so, you can do that by clicking the little Feedback widget on the right side of any site page.  Let us know what you think!

The feedback widget in action on Skillwave.Training

New Monkey Tools Features

We're super excited to let you know that we've just released some new Monkey Tools features!  Let's take a quick look as to what is new...

The Table Monkey

This feature was actually released back in December. However, since we announced it at the KSA meetup (which you can see on YouTube), we decided that it needed a personality of its own.  So now, on the Query Monkey menu you'll find the Table Monkey: a monkey who is dedicated to helping you build queries from Excel tables.

The Table Monkey allows creating queries not just from one table, but multiple tables in one shot

Some of the cool features of this Monkey are:

  • It can create multiple "From Table" queries at once.
  • Tables can be excluded with a single click.
  • It can create "Staging" layers for you - as per our Dimensional Modeling course on Skillwave.Training, with custom staging layer names or counts.
  • You can rename the Excel tables by right clicking on the blue boxes that represent the Excel tables.
  • You can rename the Queries by right clicking on the green boxes that represent the data model tables.
  • It allows you to toggle the end query so you can load it to the data model or as a connection.
  • It provides a data typing algorithm that is smarter than Power Query's native algorithm.

Overall, we find this to be super useful. It allows us to create multiple table connections in a few seconds, rather than the minutes it would take us to set things up manually.

This feature is a Pro feature, but is fully functional in our free trial.

Create Query from M Code

The next feature that we included is a nice interface to create a new query from M code.  If you post in forums and need to quickly create a query for testing, you can simply take their code, paste it into the form, give it a name and click create.  Much easier than having to create a new query, edit the code, select everything and then paste:

Using the new Create Query from M Code feature to quickly create a new query

The main benefit of this form is saving you the headache of jumping into the query editor to create your query. Additionally, we also added the ability to indent the code right in the form. So if you're just trying to read it, it can be useful without ever creating a query at all.

We feel that this would be a super useful feature for those helping each other in the community. Thus, this feature falls in to our "Forever Free" category and works at all license levels (including after your trial expires).

Convenience Features - Pivots & Filters

Another one of the new Monkey Tools features that we've added is a Pivots & Filters menu to the Monkey Tools ribbon.  This is purely a convenience feature. It's designed to bring the commands closer to you so that you don't have to do as much tab switching:

The new Pivots & Filters menu allows creating PivotTables, PivotCharts and Slicers and Timelines without leaving the Monkey Tools ribbon

The version on the left is what we are terming the "Classic" view, which shows you the Insert PivotTable button (as well as PivotCharts, Slicers & Timelines).  The view on the right is what your menu will look like once the new Insert PivotTable button rolls out to your Office 365 install.  (If your Monkey Tools menu starts with PivotCharts, then head to our Options screen and uncheck the "Use Legacy PivotTable Menu Buttons" option.)

Bug Fixes

And - of course - like every release we do, we have included a bunch of bug fixes, which apply to all users, including Pro, Trial and Free.

How do you get the new Monkey Tools features?

If you already have Monkey Tools installed, then head in to Monkey Tools -> Options.  If you are running 1.0.7678.28973, then you already have them.  And if not, click Check for Updates Now to update.

Don't have Monkey Tools installed?  You can try the full feature set for free for two weeks before the license reverts to a "free" license.  We think you'll be pleasantly surprised with how useful Monkey Tools is on a free license, and yet how much more it does in the Pro version.

 

More free features in Monkey Tools

Wow, it is hard to believe it is already December.  And looking back at my blog, I realized that I forgot to tell you that we released a few more free features in Monkey Tools over the past month!  In fact, November was a busy development month for us, so I thought it would be a good time to share what we have done.

GetISOWeek Function

One of my friends saw the ability to create a calendar using the Calendar Monkey.  While he was suitably impressed, he did also ask me if it could do something he badly needed, which was to create a column displaying the ISO week that is commonly used in Europe.  Unfortunately, the Calendar Monkey had not learned enough about ISO weeks at that time, so was unable to help. So, we sent a couple of the Monkeys back to school…!

If you are on a trial or free version of Monkey Tools, you will find that the Query Monkey will now allow you to add a custom Power Query function called GetISOWeek to your file.  From there, you can manually call this function via the Invoke Custom Function button, or via writing a formula in the Custom Column dialog within Power Query.  Simply feed the function any date column to get the ISO Week Number, and include “true” for the final (optional) parameter if you prefer the “precise” text version:

Date \ Formula      =fnGetISO( [Date] )     =fnGetISO( [Date], true )
Sun 30 Dec 2007     52                      2007-W52-7
Mon 31 Dec 2007     1                       2008-W01-1
Tue 1 Jan 2008      1                       2008-W01-2
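If you are curious about how an ISO week can be worked out in M, the function below is a minimal sketch of the usual approach, where the Thursday of a date's week decides which year and week it belongs to.  To be clear, this is not the code that Monkey Tools injects; the name fnIsoWeekSketch and its internals are assumptions made for illustration.

```powerquery
// Hypothetical function, saved as a query named fnIsoWeekSketch
(d as date, optional precise as logical) as any =>
let
    // With Day.Monday as the first day, Monday = 0 through Sunday = 6
    DayOfWeek = Date.DayOfWeek(d, Day.Monday),
    // The Thursday of the same week decides which ISO year the week belongs to
    Thursday = Date.AddDays(d, 3 - DayOfWeek),
    IsoYear = Date.Year(Thursday),
    // Count how many full 7-day blocks precede that Thursday within its year
    WeekNumber = Number.IntegerDivide(Date.DayOfYear(Thursday) - 1, 7) + 1,
    // Build the yyyy-Www-d text, where d runs from 1 (Monday) to 7 (Sunday)
    PreciseText =
        Text.From(IsoYear)
        & "-W" & Text.PadStart(Text.From(WeekNumber), 2, "0")
        & "-" & Text.From(DayOfWeek + 1)
in
    if precise = true then PreciseText else WeekNumber
```

Walking Sun 30 Dec 2007 through the steps, the Thursday of that week is 27 Dec 2007, which is day 361 of the year, giving week 52 and the text 2007-W52-7, in line with the first row of the table above.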

Of course, adding a new function in to your workbook is great, but for our Pro users, the Calendar Monkey wanted to make it even easier, and added it as a default column choice.  No fuss, no mess, just choose the ISO date formats you need and let the Calendar Monkey do the rest!

The new ISO Week options displayed on the Calendar Monkey form

Measure Monkey – Basic Explicit Measures

While we are also super proud of our Measure Monkey who will help create Multiple Explicit Measures, we also realize that there are times where you need to create individual measures.  For this reason, we trained another Measure Monkey to do exactly that.

The new Basic Explicit Measures feature shown on the Measure Monkey menu

The Measure Monkey that focuses on Basic Explicit Measures provides you with a no-code experience to create… well… basic explicit aggregations.  (Yes, you could make Implicit versions via drag and drop, but serious modelers far prefer the more customizable and scalable explicit versions.)

This Measure Monkey will help you create these measures without writing a single line of DAX (although it does show you the DAX it has created.)  You will be provided a list of relevant aggregations (go home COUNTA!) and smart default formatting choices.  The Monkey will even capture your preferred defaults to make you even faster next time.

Side by side view of creating a SUM and LASTDATE aggregation with the Basic Explicit Measure Monkey

And, like its brother who builds Multiple Explicit Measures, this Measure Monkey will work for you for free!

Support for Non-English Queries

Did I mention that my friend, whom I referred to above, runs a French version of Excel?  Unfortunately, Monkey Tools had some challenges reading the queries in his model correctly.  While we have always claimed that we only support English versions of Excel, this still bothered us.

One interesting part about being a coder is that MOST coding is written in English. But every now and then, Microsoft localizes something that we did not expect.  So was the case with the underlying Power Query connection name.  To make a long story short, I have now learned that “Query” is “Requête” in French, “Abfrage” in German, and is localized to other words in other languages.  And now that we know?  We have retrained our tool to deal with this challenge.

What this means to you if you are a user of a non-English version of Excel is – while we are not quite ready to say we fully support all non-English versions of Excel – we do believe Monkey Tools should work no matter the localization of your Excel install.  (We do still recommend caution here.  Until we say we OFFICIALLY support all languages, please do try the Trial version before you buy, and let us know if Monkey Tools has any issues reading your queries!)

Feedback Mechanisms

Another question we received from time to time was “How do I give you feedback?” or “How do I report a bug?”  It was enough that we realized that we had done a poor job of giving you a mechanism to do so.  So to that end, we have added the following to the Monkey Tools Help menu:

  • Log a Bug
  • QuerySleuth Indenter Issues (for issues specific to QuerySleuth indentation)
  • Feature Suggestions

Each takes you to a form that you can fill out to get in contact with the dev team.  And yes, we are open to hearing your suggestions!

Various Other Bug Fixes

Of course, no release would be complete without a few bug fixes.  There were a half dozen fixes included in the various November updates (plus another half dozen published last night).  Each was minor and not really worth mentioning on its own, but rest assured that we are trying to fix bugs whenever we find them.

What is the Current Version?

To make sure you have all of the current features, go to Monkey Tools -> Options.  If you are running a version that is less than 1.0.7640.41496, then click Check for Updates Now to update.

And if you don’t have Monkey Tools installed yet… what are you waiting for?  You can try the pro features for free for two weeks, and there are a ton of useful tools even if you don’t elect to purchase a pro license.  Click here to get your copy of Monkey Tools.  And hey… if you decide to upgrade to an Annual Pro license today, you can get 20% off with the code BF20MONKEYTOOLS.

So… What’s Next?

We are working on something cool that will help Excel modelers get started quickly.  And if you want to be one of the first to hear about it and see it in action you should attend the inaugural KSA Excel Power Platform meetup, as I’ll be demoing this new feature.

 

Introducing the Measure Monkey

You know the drill… extract, transform and load your data, relate your tables, then create basic DAX measures.  All work that needs to be done before you can really get started on analyzing your data.  Today we’ve unleashed the Measure Monkey to help speed up that process a bit for you.  (You can think of the Measure Monkey as Quick Measures for Excel.)

If you follow Monkey Tools already, you’ll know that our goal is to help you build better models faster.  We already include helpful functions such as:

  • the ability to inject a query that can automatically switch between local folders and SharePoint folders
  • the ability to manage your queries via our QuerySleuth
  • the ability to build calendar tables on the fly against your data
  • and so much more...

But while we’ve had a nice tool to trace DAX query chains, we haven’t included a lot of DAX functionality to date.  That is changing today.  And oh… before we dive into it, I want to be clear that this feature will be available to ALL users of Monkey Tools.  Yes, even those of you using a Free license!

The Sample Model

Before we dive into this, let’s take a look at a sample data model:

Notice that everything is nicely created and linked (by the way - we created that calendar in a few seconds with Monkey Tools’ Calendar Monkey…) but that there are no DAX Measures on our Sales and Budget tables.  Date and Category are both foreign keys that link each of those tables to the Calendar and Categories tables.  However, we really want explicit measures to sum both the Sales[Amount] and Budget[Amount] columns.

Of course, these measures are easy to write, but what if your model is a bit more complicated and there are a ton of them to do?

Creating Explicit Measures in Bulk with the Measure Monkey

As of version 1.0.7599.31348, you’ll find a new Measure Monkey menu on the Monkey Tools ribbon for this exact purpose:

Step 1A: Which Tables Host The Columns To Aggregate?

When you launch the new feature, you’ll be taken to a screen that looks like this:

This screen is intended to allow you to tell the Measure Monkey which tables hold the columns you need to aggregate.  Our aim in this screen is to give you the highest possible chance of just clicking "Next". That being said, we realize that this may not work for everyone, so we also allow some flexibility here.

In the top left, we pre-select the tables which we believe have the highest chance of needing aggregation: your fact tables.  (Those tables with only ‘many’ sides of relationships attached to them.)  But if we get this wrong for you, you simply need to check the other boxes to include basic aggregations for other tables.  (Ideally, you shouldn’t be aggregating dimensions, but there are – of course – exceptions to every rule.)  You’ll get immediate feedback in the box in the bottom left, as we show all the tables that will be included based on your checkbox selections.

Step 1B:  Tell the Measure Monkey Where to Store Your New Measures

In the top right, we also allow you to tell us where you want to store the measures.  If you have created a specific “Measures” table, we’ll provide that by default.  If you haven’t, we’ll offer to store the measures on the Host Table.  (In other words, all measures created to aggregate columns from the Budget table will be stored on that table, whereas measures that aggregate columns from Sales will be stored on Sales.)

Forgot to set up a new Measures table before doing this?  No worries, click the + to add a measures table on the fly, give it a name, and we’ll create it for you:

There are a couple of Advanced options as well, but we believe most people will want to leave these at their defaults.  So let’s click Next, to go to page 2…

Step 2: Choose Your Aggregations

This page contains a ton of info, but again we’re trying to provide you the biggest chance of clicking “Create” right away. Unfortunately, this is something that we can’t do in the image above…

The reason our Create button is disabled is that we have two measures offered with the name “Sum of Amount”.  The blue one is the first instance, and any subsequent measures with the same name will highlight in red.  So let’s fix those, and choose a default data type format:

It’s all good to go now, except that I want to add a “Transactions” measure that counts the rows of the Sales table.  So I’m going to click the “Add another aggregation” button in the Sales table. Then I choose the name of the table from the drop down list:

That will give me a new row with a “Count Rows of Sales” measure, which I can quickly rename to “Transactions” before clicking “Create”.

During this process, the Measure Monkey will create your measures for you. Plus, if you created a Measures table, it gives you some advice on how to make it an “official” measures table.  You can see the results in my data model here:

That was Easy…

The demo above was obviously a fairly simple model. Yet it cut the time I needed to create these explicit measures down to less than a minute.  Now consider the time savings when you get a bit more complicated:

So how do you get the Measure Monkey menu?

This update to Monkey Tools is available in Monkey Tools 1.0.7599.31348 or higher.  And it will be a “forever free” feature, so you’ll be able to use it on either a Free or Pro license!

To try our free trial, head over to the Monkey Tools product page to download your copy.

If you already have Monkey Tools installed, it will automatically update within a couple of weeks. Alternatively, you can request the update now by going to Monkey Tools -> Options -> Check For Update Now…

Status of our Master Your Data Book

We are rapidly approaching November 1, 2020, which has been the latest release target for our Master Your Data book. This is the long-awaited second edition of M is for Data Monkey, the book that I wrote with Miguel Escobar.  I want to share an update with you as to why the book’s release is about to be pushed back – yet again.

Our Master Your Data book (new edition of M is for Data Monkey)

First, I just want to say that Miguel and I wish we did not have to do this.  We want to get this book into your hands and scheduled a big block of time in the summer to get this done.  Our challenge this time has not been around technology, so much as the same thing that has impacted virtually all of you this year: COVID.

What have we been doing, anyway?

From my side, I can tell you that I was away from home for over 155 nights last year.  I travelled a ton leading in-person training courses and speaking at conferences.  My time back in the office was dedicated to catching up on the things I had missed, and trying to slot in a block of time to write was quite difficult.  And then, in March 2020, everything changed.  We hit a global lockdown, and I watched my business go from 100 to 0 in the span of one week.  Every single training contract cancelled or postponed; income flow gone.

In some ways, you would think COVID would be a blessing there… I mean look at all that free time, right?  But the reality is that I have a team of 5 people at Excelguru, and I need to pay them.  We had to pivot, and pivot fast in order to generate a new income stream to keep people employed, and this has been all-consuming since then.

We brought Matt Allington in as a partner for our Power Query Academy and rebranded the enterprise as Skillwave Training. Ever since then, we have been working on adding new content there.  During that process we have come to realize that our site hasn’t scaled from a one-course site to a multi-course site. Now we’re investing in trying to design and build a better structure for our entire site.  (No, no promises yet on when you are going to see that!)

How are we weathering the storm?

Now, I am pleased to say that all of my clients who cancelled or postponed have come back to us.  But even with the courses I’ve done dozens of times before, the commitment on my side for a one-day in-person course has now become one to two weeks of time. This is because I have to adjust it for an online format, shoot and edit the video, get it all uploaded into our LMS, and so forth.

I joke today that I am working double the hours for a fraction of the earnings… but at least I still have it better than most.  The reality for us – like many – is that 2020 has been a very hard year.  I am just super proud that despite all of these challenges, we’ve been able to scrape by.  I have been able to keep my entire team at their usual salaries.  It is tight, but we are making it work.

Of course, there are two authors for this book. And while Miguel’s story is somewhat different, it is also somewhat the same.  He has been scrambling to deal with his business challenges – working sometimes 20 hours per day – in order to meet his commitments.

Where are we at with the Master Your Data book?

The simple reality here is that we just do not have the time to be able to write the Master Your Data book.  Book writing has never been a lucrative business; we do it for the passion of sharing with all of you.  So we are trying to build our businesses back to the point where we can do that again.

I know what you are thinking: "COVID, Murder Hornets, and now this.  2020 sucks!"  Yes, it sure does.

Having said that, we were asked if we should just cancel the sequel.  Miguel and I both said no to this.  We have made commitments to many people.  We are very proud of M is for Data Monkey and we want to make sure you get the new volume that you have been waiting for.  While the Master Your Data book won’t be released in 2020, we hope that we can at least brighten your 2021.

What can we expect going forward?

Last week Miguel and I sat down (virtually) to work out our future plan.  We are both committed until the end of 2020, but feel that we should be in a good place to start writing by mid-January.  Here is our plan on how it is going to play out:

  • Mid-January (at the latest), we will begin writing.
  • We will deliver all chapters – once written – in draft form to active members of the Power Query Academy. (Yes, you will get a chance to review them before the general public.)
  • We will review comments and edits from the Academy members and incorporate as appropriate.
  • We will deliver chapters to our publisher as they are completed.
  • Our target completion date is March 31, 2021.

After that, there's time for page layout, copy editing, and a bunch of other stuff, but the Master Your Data book should be available from Amazon as of July 1, 2021.

Final Thoughts on our Master Your Data Book

Again, at the end of the day, we really wish that the book was being shipped to Amazon today.  We know that many of you have been waiting for this for a long time, and we very much appreciate your patience and understanding.  And if you have lost faith in us and you cancel your pre-order, we totally understand and accept that too.  We are committed to getting this in your hands; it is just taking us far longer than we had hoped.

Keep safe out there everyone.  We will come through for you.

Use Excel Tables to Filter a Power Query

A question came up in the Excelguru forums today about how to use Excel tables to filter a Power Query.  While Power Query can't read a filter from an Excel table natively, there is a cool little trick that you can use to flow that information through.

Data Background

The data footprint I'm working with looks like this:

3 tables showing the original data, a table of just years, and the final output with all rows

The Data query is a fairly simple staging query, pulling the data from the Excel table on the left, setting data types, and loading as Connection Only.

The YearFilter query is a little more complicated, as it pulls the data, removes duplicates, and then drills down into the Year column (right click the header -> Drill Down), resulting in a unique list of the Years:

The YearFilter results in a unique list of years in Power Query

And finally the Sales Query, which - shown in an indented and 'colourfied' format thanks to Monkey Tools QuerySleuth - looks like this:

The M code of the Sales query shown in Monkey Tools

The important things to notice about this query are:

  • It references the Data query (no new data is added here)
  • The Filtered Rows step filters to include any item that is in the list generated by the YearFilter query
  • The Filtered Rows step had to be adjusted manually to add the List.Contains function
  • The [Year] column refers to the [Year] column of the Sales query (which flows through from the original data)
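For readers who can't see the screenshot, here is a minimal sketch of the pattern described above, written so it can be pasted into a blank query. It is not the exact code from the workbook: the sample rows, the Amount column, and the step names are assumptions for illustration, and the two inline #table steps stand in for the real Data and YearFilter queries.

```powerquery
let
    // Hypothetical stand-in for the Data staging query (sample rows are made up)
    Data = #table(
        type table [Year = Int64.Type, Amount = number],
        {{2018, 100}, {2019, 150}, {2020, 200}}
    ),
    // Hypothetical stand-in for the Excel table that feeds the YearFilter query
    YearTable = #table(type table [Year = Int64.Type], {{2019}, {2020}, {2020}}),
    // Remove duplicates, then drill down into the Year column to get a list
    RemovedDuplicates = Table.Distinct(YearTable, {"Year"}),
    YearFilter = RemovedDuplicates[Year],
    // The manually adjusted Filtered Rows step: keep a row only when
    // its [Year] value appears in the YearFilter list
    FilteredRows = Table.SelectRows(Data, each List.Contains(YearFilter, [Year]))
in
    FilteredRows
```

With this sample data, only the 2019 and 2020 rows survive the filter. In a real workbook the Data and YearFilter steps would be references to the separate queries rather than inline tables.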

So What's the Issue?

We want to use the filter on the YearFilter table in Excel to filter our Power Query.  Unfortunately, that doesn't happen... despite a refresh, all the years are still in the worksheet after setting that filter:

Despite filtering the Excel table, the output isn't filtered

The challenge, when you are attempting to use Excel tables to filter a Power Query, is that Power Query can't read the filter.  In fact, it can't access any of the table's metadata about filters or the visible state of the rows.  It therefore brings in all rows from the table whether they are hidden or not.

Using Excel Tables to Filter a Power Query

The secret here is that we need a way to tell Power Query which rows are visible versus which are hidden.  We can do that by leveraging the AGGREGATE function, since it has the ability to count only visible rows.

The formula I used was =AGGREGATE(3,5,[@Year]) where:

  • 3 indicates the COUNTA() function
  • 5 sets it to ignore Hidden rows
  • [@Year] points to the current row of the Year column

The weird part, if you've never done this before, is that all the visible rows in Excel will always show a 1. But look what happens when you filter to only a couple of years, then edit the YearFilter Query and select the Source step:

Using AGGREGATE, Power Query lets us see the visible and hidden rows in the table

Boom!  We can see which rows are visible (indicated with a 1) and which are hidden (indicated with a 0).  This now becomes a pretty easy fix:

  • Filter the Display column (the one holding the AGGREGATE formula) to 1

And you're done.  The rest of the query will still work, as it drills in to the list of years, so we don't even need to remove this new column.
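As a rough sketch, the adjusted YearFilter query could look something like the code below. The step names and the position of the new filter step (right after Source) are assumptions, and the inline #table is a hypothetical stand-in for what the Source step returns once the Excel table has been filtered to 2019 and 2020.

```powerquery
let
    // Hypothetical stand-in for the Source step. In a real workbook this would
    // read the Excel table, including the Display column with the AGGREGATE formula
    Source = #table(
        type table [Year = Int64.Type, Display = Int64.Type],
        {{2018, 0}, {2019, 1}, {2020, 1}}
    ),
    // Keep only the rows that are visible in Excel (Display = 1)
    FilteredRows = Table.SelectRows(Source, each [Display] = 1),
    // Same as before: remove duplicates, then drill down into Year
    RemovedDuplicates = Table.Distinct(FilteredRows, {"Year"}),
    Year = RemovedDuplicates[Year]
in
    Year
```

With this sample data the query returns a list holding 2019 and 2020. Because the final step drills into the Year column only, the Display column never reaches the Sales query.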

And just like that, we can now use Excel tables to filter a Power Query:

Setting a filter on the Excel table now filters our Power Query

 

Update to Monkey Tools QuerySleuth

We've been kind of quiet here, but we're excited to announce that we've just published an update to the Monkey Tools QuerySleuth feature.  It now contains a "tabbed" experience so that you can easily flip back and forth between queries, "pinning" the ones you want to see and compare.

The Updated QuerySleuth Interface

In this case you'll notice that I pinned the ChitDetails and ChitHeaders queries, then selected the Locations query from the left menu.

An image of the update to Monkey Tools QuerySleuth showing the new tabbed interface indicating two pinned queries and two modified queries

Why does this matter?  Did you notice that the ChitDetails and Locations tab names are both red?  That's because I made changes to both of them to update a data type... I can now hold onto those changes as I flip back and forth between JUST the queries I want to keep in focus.

Updating Multiple Queries

But now, of course, I want to commit my changes and force the data model to update to reflect those changes.  In this image, I'm doing just that, with three queries:

An image of the QuerySleuth prompting the user to ask which queries they want to save and refresh

And due to the selection pointed out by the arrow, each of these queries will not only get saved back to the Power Query engine, but a refresh of each query will be triggered as well.

So how do you get this update to Monkey Tools QuerySleuth?

This update to Monkey Tools QuerySleuth is available in Monkey Tools 1.0.7553.5975 or higher.  And it's available in both the free and Pro versions of the tool.  (Of course, you will still need a Pro version in order to actually save your queries.)

To try our free trial, head over to the Monkey Tools product page to download your copy.

If you already have Monkey Tools installed, it will automatically update within a couple of weeks, or you can request the update now by going to Monkey Tools -> Options -> Check For Update Now…