Update to Monkey Tools QuerySleuth

We've been kind of quiet here, but we're excited to announce that we've just published an update to Monkey Tools QuerySleuth feature.  It now contains an "tabbed" experience so that you can easily flip back and forth between queries, "pinning" the ones you want to see and compare.

The Updated QuerySleuth Interface

In this case you'll notice that I pinned The ChitDetails and ChitHeaders queries, then selected the Locations query from the left menu.

An image of the update to Monkey Tools QuerySleuth showing the new tabbed interface indicating two pinned queries and two modified queries

Why does this matter?  Did you notice that the ChitDetails and Locations tab names are both red?  That's because I made changes to both of them to update a data type... I can now hold onto those changes as I flip back and forth between JUST the queries I want to keep in focus.

Updating Multiple Queries

But now, of course, I want to commit my changes and force the data model to update to reflect those changes.  In this image, I'm doing just that, with three queries:

An image of the QuerySleuth prompting the user to ask which queries they want to save and refresh

And due to the selection pointed out by the arrow, each of these queries will not only get saved back to the Power Query engine, but a refresh of each query will be triggered as well.

So how do you get this update to Monkey Tools QuerySleuth?

This update to Monkey Tools QuerySleuth is available in Monkey Tools 1.0.7553.5975 or higher.  And it's available in both the free and Pro versions of the tool.  (Of course, you will still need a Pro version in order to actually save your queries.)

To try our free trial, head over to the Monkey Tools product page to download your copy.

If you already have Monkey Tools installed, it will automatically update within a couple of weeks, or you can request the update now by going to Monkey Tools -> Options -> Check For Update Now…

Name Worksheets After Queries

Have you ever loaded a Power Query to a worksheet and then changed the name to match the query? It's a shame that there is no option to name worksheets after queries, as this would be handy.

Well, after seeing this request come up in the forums last week, we thought that this would be a great feature to add to the DestinationSleuth in Monkey Tools.  So as of build 1.0.7433.38066... it's done!

How to Name Worksheets After Queries

We wish that we could add this as an option in the Close & Load dialog, but sadly that's not possible.  So we did the next best thing...  Once you've loaded your queries, you simply need to open our DestinationSleuth and:

  1. Select the queries you're after
  2. Click the Rename Sheets button

Using DestinationSleuth to select queries and change the names of their host worksheets

At that point, we'll quickly loop through the host worksheets and rename them to match the query landed to that sheet.

Name Worksheets After Queries While Changing Load Destinations

You might also notice a new checkbox called "Name Sheets After Queries".  This checkbox allows you to name worksheets after queries while changing a load destination to create a new table.  It's also super easy to use:

  1. Select the query (or queries) you wish to change
  2. Choose to change the load destination to a Table
  3. Check the Name Sheets After Queries checkbox
  4. Click the Update Load Destinations button

Using DestinationSleuth to change a load destination from Connection Only to Table, and update the worksheet to the Query name at the same time.

We'll change the Load Destinations, creating the worksheets AND naming them to match the query in one step.

This is also a "Forever Free" Feature

As mentioned in my last post, while the DestinationSleuth's colour highlighting is only available in the trial and pro versions of Monkey Tools, the ability to Change Multiple Load Destinations at Once is a "forever free" feature.  And so is the ability to rename worksheets after queries!  All you need is Monkey Tools version 1.0.7433.38066 or higher, and you'll have that ability at your disposal.

If you haven’t already, head over to the Monkey Tools product page to download a copy

If you already have Monkey Tools installed, it will automatically update within a couple of weeks, or you can request the update now by going to Monkey Tools -> Options -> Check For Update Now…

 

 

Change Multiple Load Destinations at Once

Have you ever built multiple queries in one Power Query session?  You get to choose one load destination, then have to change each of the incorrect ones... one by one.  Have you ever wished you could change multiple load destinations at once?  Well now you can - you just need Monkey Tools!

Check out the New and Improved DestinationSleuth!

Following on the heels of last week's feature updates, we've added functionality to the DestinationSleuth that allows you to change load destinations of any one or more queries.

Naturally, DestinationSleuth still gives you a very visual view of the different query load destinations all in one place, but do you see that new option just to the left of the Exit button?

DestinationSleuth showing the different load destinations in colour

Assume that you've created 5 new queries in the workbook, and load them all to Connection Only.  But now you want to change 4 of them to the data model.  Rather than click each query, change the load destination, and wait, then move to the next one... Now you can just launch DestinationSleuth and:

  1. Select the load destination
  2. Select each of the queries you want to repoint (hold down your CTRL key)
  3. Click Change

DestinationSleuth being using to change multiple load destinations at once

Sure, it still takes time, but at least you can walk away and let it cook, rather than slowly shepherd it through the entire process.

Delete Host Worksheets

You'll also notice a little checkbox called "Delete Host Worksheets".  You know that issue where you accidentally load a table to the worksheet, then change the query to Connection Only?  It leaves the worksheet behind.  Now true, you can always just delete the worksheet (which will actually set the query to Connection Only automatically). But what if you need to change 5 of them?  It's easy with DestinationSleuth:

  1. Select the load destination
  2. Check the "Delete Host Worksheets"
  3. Select each of the queries you want to repoint (hold down your CTRL key)
  4. Click Change

We will repoint your queries AND remove the worksheets that were holding the query results.  (Naturally, you want to be really sure you acutally want to do this, but it's handy if you do.)

This is a "Forever Free" Feature

While the DestinationSleuth's colour highlighting is only available in the trial and pro versions of Monkey Tools, the ability to Change Multiple Load Destinations at Once is a "forever free" feature.  All you need is Monkey Tools version 1.0.7423.41125 or higher, and you'll have that ability at your disposal.

If you haven’t already, head over to the Monkey Tools product page to download a copy

If you already have Monkey Tools installed, it will automatically update within a couple of weeks, or you can request the update now by going to Monkey Tools -> Options -> Check For Update Now…

Appending Columns with Different Names

Will Thompson from the Power BI team threw out a question on Twitter today related to Appending Columns with Different Names:

Let's say I have some CSVs coming in where the column names differ from file to file, but always in the same order. I could use PQ's automatic Column1/Column2 etc. names by skipping the header row and then have them all map to the same columns in AS.  Then build visuals on top of it that'll pick up the fixed column names from AS. Has anyone written a blog or tutorial that covers that sort of scenario? I want to point someone to some guidance…

No worries, Will, I’ve got your back. 🙂

Setting the stage

To be fair, the data source (CSV, AS, Excel or whatever), really isn’t relevant.  It’s all about the process of Appending Columns with Different Names.  What we need to recognize is that we need two types of tables here:

  • Data Tables: These tables have the data, but have different headers.
  • Header Table: This table has the correct headers for the data

And we also need to remember that the Data Table and Header Table columns are always in the same order.

Step 1:  Prep the “Data” tables

Preparing the data tables is pretty easy.  If the different column names are already showing in the headers like this:

A data table with headers

Demote and remove them by going to:

  • Home -> Use First Row as Headers -> Use Headers as First Row.
  • Remove Rows -> Remove Top Rows -> 1

Your table should now have headers called Column1, Column2, etc…

A data table with headers demoted into the first row

Do this for each table, then set each to be a staging query:

  • Excel: Load the query as a Connection Only query
  • PowerBI: Disable the Load of the query

Step 2: Prep the Header Table

Connect to your data, which might look like this:

A data table with headers

Clean it up by going to:

  • Home -> Use Headers as First Row -> Use First Row as Headers.
  • Keep Rows -> Keep Top Rows -> 1

You should now have a 1 row query that shows the names of the column headers.

A single row table with headers demoted to be the only row

It's time to save it:

  • Rename the Query as “Headers”
  • Set it to load as a staging query (as above)

Step 3: Combine the Data

This part is simple:

  • Right click the Header table -> Reference
  • Go to Home -> Append Tables -> Three or more tables
  • Add each of the data tables and click OK
  • Go to Use First Row as Headers
  • Set your data types

And you’re done:

A complete data table with the correct headers

Sample file (Excel) is available here.  But you can import it to Power BI if you prefer that look.

Monkey Tools is Here

We are super excited to announce that we’ve (at last) released the first version of our Monkey Tools software!  Ken has been working on this software on and off for the better part of 8 years now.  But after showing it to a friend in Wellington last year, we decided it was finally time to get serious.  We hired a full-time developer last summer and are finally ready to go live with the initial release!

What is Monkey Tools?

Monkey Tools is an Excel add-in (supported in Excel 2016 and higher) which provides tools for you - as a business intelligence author/reviewer - to:

  • Build models more rapidly
  • Follow recommended practices
  • Document your work
  • Audit files that you receive

It is targeted primarily at modelers and analysts who work primarily in Excel, but also push their models into Power BI.  (Our philosophy at Excelguru is to model in Excel first, then export to Power BI for reporting, sharing and security control.)

Oh, and super important… it installs on your system without requiring admin rights on your PC.  How cool is that?

What does Monkey Tools actually do?

Well… lots!  We’ve collected all the cool features under some themed buttons including:

  • QueryMonkey (for inserting new queries)
  • DestinationSleuth (to provide information on query load destinations)
  • QuerySleuth (helping understand your actual queries)
  • TimeSleuth (to benchmark query load times)
  • PivotSleuth (helping you diagnose Pivot Table field issues)
  • DAXSleuth (tools especially for working with DAX measures)
  • ModelSleuth (reporting on the properties of your queries and data model)

Cute names, right?  The Monkey builds things, and the Sleuths investigate things.  Here’s a high-level view of what they each contain.

QueryMonkey

Query Monkey gives you the ability to insert key queries like:

  • The famous “fnGetParameter” query and table (from Chapter 24 of M is for Data Monkey)
  • A “From Folder” setup that works with local and/or SharePoint hosted files
  • Dynamic calendar tables based on your data (for custom calendars, it even provides the option to insert the "periodicity" columns for Rob Collie's GFITW DAX pattern!)

The QueryMonkey provides a Dynamic Calendar generator

DestinationSleuth

Today, this is simply a viewer to visually indicate the load destinations of your tables (better than just “Connection Only” or “x Rows Loaded”).

The DestinationSleuth user form displays four different load destination types

QuerySleuth

This is a single form, packed with information and features such as:

  • A dependency/precedent tree view layout
  • Full colour display based on load destination
  • Colourful and indented M code
  • The ability to modify the M code and write it back to the editor WITHOUT LOCKING YOUR EXCEL User Interface!

The QuerySleuth shows a query dependency tree as well as indented and colourful M code

TimeSleuth

This feature allows you to time query execution in Excel, and even chart comparisons between them with or without privacy settings enabled.  If you’ve ever wondered which query is slowing down your workbook, or wanted to time test two different approaches, you may find this helpful!

A chart generated by Monkey Tools TimeSleuth user form

PivotSleuth

Have you ever seen that irritating “relationships may be needed” error when building a Power Pivot based Pivot Table, and wondered why?  Pivot Sleuth can tell you…

  • See the real, fully qualified names of the fields used in your Pivot Tables
  • Highlight potential or current issues in Pivot Table configurations
  • Debug cross filtering issues, “relationships may be needed” errors and errors where grand totals are returned for all rows on the Pivot Table

Debugging PivotTable errors with the PivotSleuth

DAXSleuth

We believe that measure dependencies are just as important as query dependencies, and this is the reason we build the DAXSleuth.  This form:

  • Displays a dependency/precedent treeview of your DAX measures
  • Provides a full colour display of Implicit and Explicit measures (with or without children), as well as Calculated Columns
  • Shows your DAX measures with colour highlighting in an indented format
  • Allows you to Indent, Un-Indent, Flatten, Duplicate and even Update measures without leaving the DAXSleuth
  • Exposes all locations a DAX Measure has been used (Pivot Tables, Pivot Charts, OLAP Formulae and Named Ranges), and even allows you to select those objects right from the DAX Sleuth!

Monkey Tools DAXSleuth user form in action

ModelSleuth

Have you ever had to provide documentation for your model?  Or picked up a model from someone else and had to review it?  The ModelSleuth provides reports and utilities such as:

  • A full model summary report showing key statistics about your tables, relationships, columns, measures and queries. (Trial and Free licenses are limited to every other record in this report.)
  • A model memory usage report, complete with how much memory is recoverable (for Excel based data models).
  • An unused columns report (for Excel based data models).
  • A DMV Explorer (for those who want to build their own reports).

Showing the impact of unused columns on memory via Monkey Tools ModelSleuth feature

Monkey Tools Supported File Types

The Monkey Tools add-in is compatible with Excel 2016 or higher, and can read from:

  • Excel files
  • Power BI Desktop files
  • Backup files (that you can export from the Monkey Tools software)

Will Monkey Tools get updates?

Oh yes, we have plans for many more features!

Our intended model is to deliver features (and bug fixes) as we develop them.  That means that there could be periods with no updates as we work on something big, or periods with multiple updates delivered in a single week.  We know that some people love frequent updates and some people don’t, so we let you control how often you get them:

Monkey Tools allows you to control update frequency

The key thing to recognize here is that we are not holding new features for a vNext. They’ll be delivered when they’re ready.

Can I try Monkey Tools before I buy it?

Ken did not become or remain a Microsoft MVP without contributing a large portion of tools and help to the community for free, and that won’t change.  Having said that, we’re paying a developer to work on this product full time and need to recoup those costs.  For that reason, we will always have both a Free version, as well as a Pro version.

Naturally, we want you to look at it, as we're convinced you'll like it.  And that's why we have a two-week trial that provides full access to almost all of the full feature set.  Once your trial expires, your license will automatically revert to a free license.  You’ll still get fixes and new features, they’ll just render in free mode (without colour, without field advice, etc.).  We do believe that you’ll still find the tool useful, just maybe not as useful without a Pro license.

Ready to learn more about pricing options and download the free trial?  Click here!

Unpivot Stacked Sets with Inconsistent Rows

I'm currently hanging out in New Zealand, with a friend who has generously let me stay at his place instead of a hotel.  What I didn't know when he offered a bed though, was that the cost of admission was a solution for a gnarly Power Query issue he was facing: How to unpivot stacked sets with inconsistent rows.

The data Jeff provided me with looked similar to this:

3 sets of data with products on rows and dates on columns

If it were only the first two tables that we were facing, this wouldn't be too difficult.  We cover unpivoting stacked data sets in both our Power Query Academy and our Power Query Recipes, whether they are single or multi column.  But the killer here is the third table... it has more rows than the first two.  So the question becomes how do we unpivot stacked sets with inconsistent rows?

Preparing the Data

Obviously the first piece we need to do is to get the data into Power Query and remove the irrelevant rows and columns.  The steps I went through were these:

  • Pull the data in to Power Query
  • Filter Column1 to remove null values
  • Remove the Total column

Once done, my data looked like this:

Stacked Pivoted Data Sets with 3, 3 and 4 rows

So now the data is nice and clean, but how to you unpivot it?

Separating the Data

The first trick is to find a way to separate this data into individual tables that represent each of the stacked data sets.  The good news here is that there is an indicator that we are looking at a new table.  Every time we see "Products" in Column 1, we know that's a header row and a new table will begin.  So we'll use this to separate the data into blocks, starting a new block each time we see that term.  To do this:

  • Go to Add Column --> Index Column --> From 1
  • Go to Add Column --> Conditional Column and configure it as follows:
    • Name:  Set
    • Formula:  if [Column1] = Products then [Index] else null

Shown below is the image view of the Conditional Column, as well as the results that it will create:

Building a Set column returning the Column1 if Column1 equals Products or null

As you can see, we've pulled out the number from the Index column if - and only if - the value in the first column is "Products".  The reason we want the null is that we can then:

  • Right click the [Set] column --> Fill Down
  • Select the [Index] column --> press DEL

You're now left with a nice table where the Set column shows a unique value for each data group:

Stacked Data with a "Set" column showing a unique value for each set

Grouping Into Tables

With an indicator for each group of data, we can now leverage this to separate the data into the individual data sets.  The method to do this is Grouping.

Select the Set column and then:

  • Go to Transform --> Group By and configure as follows:
    • Group by: Set
    • New Column Name: Stage1
    • Operation: All Rows

Grouping the Set column and adding an aggregation called Stage1 for All Rows

The data will then be grouped by the values in the Set column, and show the original data that was used to generate those groups.  Clicking in the whitespace beside the Table keyword will show each of these rows that were used in the grouping for that data point:

Results of the Grouped table, shown by clicking in the whitespace next to a group

Cleaning up the Grouped Tables

The challenge we have here is that we want to unpivot the data, but we've got some extra data here that will pollute the set: the values in the "Set" column which were added to allow the grouping.  We need to remove that.  To do so:

  • Go to Add Column --> Custom Column and configure it as follows:
    • Name:  Stage2
    • Formula: =Table.RemoveColumns( [Stage1], "Set"

Compare the results to that of the Stage1 column:

The Stage2 data table looks like the Stage1 data table, except the Set column has been removed

Before we can unpivot data, we need to promote that first row to headers... but we need to do it for each column.  No problem, we'll just break out another custom column:

  • Go to Add Column --> Custom Column and configure it as follows:
    • Name:  Stage3
    • Formula: =Table.PromoteHeaders( [Stage2], [PromoteAllScalars=true] )

Wait... what?  How do you figure that out?  I cheated.  I grabbed another table, promoted headers, then looked in the formula bar to figure out the syntax I needed.  The function name and table name were pretty obvious but unfortunately - even with intellisense - that final PromoteAllScalars part doesn't auto-complete.  Even worse, if you don't included it, it essentially just eats the top one row.  Once I had it correct, the results are exactly what I needed:

The Stage3 table now shows the headers promoted

As you can see in the image below, the Stage 3 table contains columns that have headers, as we wanted.  The 3rd table (carrying the identifier of Set 9), shows four rows, while the other tables show 3 rows.  So the data is now separated into tables, but they still have an inconsistent number of rows.

The Set1 group has 3 rows, and Set9 has 4 rows

Unpivot the Data

We have done everything we need to do in order Unpivot Stacked Sets with Inconsistent Rows.  We now just need to unpivot the data.  So let's do it:

  • Go to Add Column --> Custom Column and configure it as follows:
    • Name:  Stage4
    • Formula: =Table.UnpivotOtherColumns( [Stage3], {"Products"}, "Date", "Units" )

An indented version of the formula, as well as the results it produces, is shown here:

Displaying the Unpivot formula and the results for Set1

How do you learn to write this?  Click on one of tables to drill in to it, unpivot the single table, copy the code from the formula bar, then delete the temporary steps to back up.  You may need to do some tweaking, of course, but at least you can easily get the syntax that way.

Now that we have this, we can finish extracting the data:

  • Right click the Stage4 column --> Remove Other Columns
  • Click the Expand icon at the top of the Stage4 column
  • Set the data types
  • Load it to your destination

Sample File

If you'd like to download the sample file, you can do so here.

 

Unfill in Power Query

Recently I received a question on how to Unfill in Power Query.  In other words, we want the opposite of the Fill feature, which fills data into blank cells (cells that contain a null value.)  If we see repeating values, we’d like to keep only the first, then replace all subsequent duplicate values with the null keyword.

Now I’ll be honest that I’d typically never do this.  I’d load the values into a table, then use a PivotTable to show the data the way I want to see it:

A table with repeating values, and a pivottable that suppresses repeating values

But having said this, if you need to have your data look like this…

A table of Animals, Colour and Amount that shows blanks under each repeating animal

… well then why not?

Unfill data with Power Query – Step 1

The first thing we need to do is run our recipe for numbering grouped rows.  (You can find this in our Power Query Recipe Cards, or in our Power Query Academy videos.)

Namely, it looks like this:

  • Sort the data by Animal to order it
  • Group the data by Animal
    • Add a single aggregation called “Data” for All Rows
  • Go to Add Column -> Custom Column and use the following formula
    • =Table.AddIndexColumn([Data],"Row",1,1)
  • Right click the “Custom” column -> Remove Other Columns
  • Expand all columns from the Custom Column

You’ve now got your rows numbered:

A Power Query showing Animal, Amount, Colour and a Row Number where each row with the same animal has a unique value starting from one

Unfill data with Power Query – Step 2

Once you’re in this state, it actually becomes pretty easy:

  • Go to Add Column -> Custom Column and use the following formula
    • = if [Row] = 1 then [Animal] else null
  • Remove the [Animal] column and the [Row] columns
  • Re-order the columns as desired
  • Rename [Custom] to Animal
  • Set the data types

Once done, you’ll notice that we have unfilled the data nicely.

A Power Query showing Animal, Colour and Amount, but only the first instance of a Animal is shown in the Animal column with duplicates showing as null

Final Thoughts

As I mentioned at the outset, this isn’t something I ever anticipate myself doing.  But if you do have a good business use case, I’d be curious to know what it is.  (I assume the asker did – although it came from a comment on an old blog post, so haven’t been able to ask.)  Please share in the comments. ?

The Data Insights 2 Day Master Class

I’m super excited to be presenting a Data Insights 2 Day Master Class in Wellington, NZ with my good friend Matt Allington.  This is the first time we’ll be working together to bring our unique strengths to our participants in a joint session format, and it’s going to be AWESOME!

Ad for the Data Insights Masterclass in Wellington NZ

How is the event going to work?

We think you’ll love this.  We’re going to divide our group in two.  You’ll get a one full day with me on Dimensional Modeling, and one full day with Matt, which focuses on the DAX formula language.  These two components are essential to understand when you want to build truly dynamic, scalable and stable data models, and we're going to cover both in detail.

What is covered in the Dimensional Modeling day?

Ken will be looking deeply at how to structure your data for a successful Excel/Power BI data model.  You’ll learn how your data should be shaped, what the data model expects in its tables, and a variety of techniques and patterns to work around common join problems.  Our goal here is very simple: to teach you everything you need to lay the foundation for a data model that will stand the test of time.

But not only will you lean practical hands on techniques to lay this groundwork, you’ll learn the key terminology at play.  By the time you leave this session you’ll be able to identify things like ‘facts’, ‘dimensions’, ‘relationships’, ‘schemas’, ‘slowly moving dimensions’ and much more.  Armed with this knowledge you will be able to not only design your own models properly, but you’ll be able to understand other materials you reference during your career.

As you might expect from one of the world’s leading voices on Power Query, there’s going to be a heavy focus on Power Query in this course.  But it's Power Query with a purpose: to feed a Power Pivot Data Model.

What is covered in the DAX Formula day?

Matt will take you into the world of DAX formulas, exploring how this incredible language can be used to summarize virtually any statistic you want to know.  He’s one of the world’s experts in the DAX language and will teach you not only what you SHOULD do with DAX, but what you SHOULDN’T.

When Is This?

Soon!  It’s going to be hosted in Wellington, NZ on Feb 24 and 25, 2020.  But the good news is that there are still seats available, and we’d LOVE to see you there with us.

How Much and Where Do I Sign Up?

Great questions!  Head over to ExceleratorBI for all those details.

Remove Dynamic Number of Top Rows

Removing the top five rows from a data set is easy in Power Query, but what do you do when the number of rows changes?  There isn’t a built-in Remove Dynamic Number of Top Rows function.  In this post we’ll look at how to set this up.

Illustration of the issue

Assume you have the following report, and you’re only interested in the Cider sales:

With Cider starting in row 9, we’d need to remove the top 8 rows.  That’s fairly easy.  You just need to:

  • Go to Remove Rows -> Remove Top Rows -> 8
  • Promote headers
  • Do whatever else you need to do to the data

But then you get an updated version of the data set, and it looks like this:

Uh oh.  Best case, if you run the previously generated Power Query script, you’ll end up with the following result:

But more likely, if you promoted the clean header row from the original data set, you’ll get a step level error since the revised data set doesn’t yield a “Cider” column when row 1 (shown above) is promoted to header:

Regardless of which one of these scenarios appears worse to you, I think we can agree that neither one is desired.  So how do we make this work on a dynamic basis?

Solution Architecture

The way I approach this issue is to split the job into 3 queries as follows:

Let’s look at how this works in practice…

Query 1:    Raw Data

The purpose of this query is quite simple:

  • Connect to the Raw Data source
  • Perform any preliminary cleanup
  • Rename the query as “Raw Data” (add something descriptive if you have many data sources
  • Set the query to load as a Connection Only query (disable the load in Power BI)

The key thing to note here is that we’re not doing any work to remove top rows beyond things that we know will ALWAYS occur.  We may want to drop columns and other things to reduce our data set, we just don’t want to touch anything we can’t guarantee will be exactly the same when we get updated data.

In the case of the data sample I showed above, I’m just going to connect to the data set and load it as connection only.  (While I could make an argument that the first 3 rows will always need to go, I will get rid of those when filtering to just the cider header anyway.)

Query 2:    Generate the Dynamic Row Number

The next step is to generate the number that will indicates the dynamic number of top rows we are looking for.  Despite the fact that the row which holds our data is changing, this is actually relatively easy once you know how:

  • Right click the Raw Data query -> Reference
  • Go to Add Column -> Add Index Column -> From 0
  • Filter one of the columns to the data you are looking for
  • Right click the [Index] Column -> Remove Other Columns
  • Go to Home -> Keep Rows -> Keep Top Rows -> 1
  • Right click the value in the cell -> Drill Down
  • Rename the query as “HeaderRows”
  • Set the query to load as a Connection Only query (disable the load in Power BI)

You now have a query that will dynamically pick up the number of rows to be removed from the top of the data set before it encounters the text you are looking for.

Step 3:       Remove Dynamic Number of Top Rows

So now comes the magical part:

  • Right click the Raw Data query -> Reference
  • Go to Home -> Remove Rows -> Remove Top Rows
  • Type in the current number of rows to remove (for this example, we’ll assume it is 8 rows)

The formula bar will now be showing the formula =Table.Skip(Source, x ) where x is the value you typed in:

  • Replace the value with “HeaderRows”

CAUTION!  Power Query is case sensitive.  You must spell and case HeaderRows EXACTLY as you did previously.  And if you separated those two words with a space, you need to escape it with hash marks and quotes:  #"Header Rows"

If you’ve replaced everything correctly, you should see that everything still works:

Does it Work?

Here’s what we see when we point RawData to the second data set I showed earlier:

The sample file for this example can be downloaded here.

Column Quality and Distribution Features are Finally Here

If you've worked with Power Query in Power BI, you've seen Column Quality and Distribution features already.  But I'm super exited to say that I've finally received it in the Insider build of Excel!

What are the Column Quality and Distribution Features??

If you're not familiar with these, let's break them down a bit:

Data Quality Indicators in Column Headers

In its most simplistic form, you'll see that you now get an indicator in the header of each column as shown here:

Column quality in Power Query Headers

Notice that red bar at the top of column C?  That's happening because there is an error in the column!  This is SUPER awesome, as we've been teaching for years that you have to scroll down looking for errors each time you change from text to a numeric data type.  This should really help, as the column header will make it obvious that something went wrong.  All green?  You should be good to go.  See red?  You've got potential problems to deal with.  (To be fair, you may still get errors after loading, as it is working with a data preview, but still, this will be a BIG help.)

The nice thing about this is that it is always on.  Nothing needs to be set up to enable this by default.  (And so far as I know, you can't turn it off - but then why would you want to?)

The Column Quality Feature

This one is enabled by going to the View tab and choosing "Column Quality", resulting in a view that looks like this:

Power Query Column Preview Settings

Which adds 3 new lines to the top of our columns:

Column Quality in Power Query

Now, I don't plan on leaving this on all the time. However, the cool thing here is that I can very quickly identify that 20% of my data rows in column C are errors.  And while I don't have any empty rows, it would show me that at a glance if I did.

The Column Distribution Feature

The idea behind the Column Distribution feature is to give you a quick glance at the distribution of values in your columns.  Is there a single value that shows up more than others, or is the data uniformly distributed?  This can help you quickly identify if the data has any consistency to it or not:

Column Distribution in Power Query

Notice that in Column a, we have 3 distinct values (b, d, e), but only one value that is truly unique (b only occurs once in the entire data set).  Meanwhile, in column b, each value occurs in the data set once and once only, with no values appearing more than this.  On a data set this small, this isn't going to be super insightful. But say you are pulling in data that spans your monthly summaries for the last year. If suddenly you see 13 values where you are expecting 12... that's a good indicator you might want to look a little deeper.

Something else that is worth noting is that you don't have to keep the Column Quality on to get the Column Distribution chart.  These features can be used together or individually.

The less obvious Column Profile Feature

The final checkbox in the Data Preview menu is the Column Profile feature.  This one is very cool as it only activates when you select the entire column.  And when you do so, it takes over the preview window providing most of the information from the previous two settings.

Column Profile in Power Query

We've got even more stats (although I wish it also showed the % of values like in Column Quality), as well as the value distribution chart.

What I love about this is that it can be called up on-demand by selecting the column, but doesn't get in my way when I don't need it.

My Thoughts on this Feature

I've been waiting for this to hit Excel for over a year now, and am SUPER happy to see it finally show up.  Personally, I'll be running with Column Profile (the last option) selected, but not Column Quality or Column Distribution. The reason is pretty simple... I want to minimize the amount of info between me and my data.  The Column Profile feature will give me what I need in an on-demand fashion when I see the data colours change at the top of my column. 🙂

How do I Get These Features?

This feature is so hot and fresh that it showed for Office Insiders this morning (v1910 build 12112.x.)  So you need to get to that version as a minimum to get these features.  Hopefully it will start rolling through the regular update channels next month!