Keep Only Numbers in Power Query

My last blog post was interesting in that I got a few emails about it.  Both Imke Feldman and Bill Szysz sent me better methods, and a blog commenter asked for a slightly different version.  For this post, I’m going to adapt Imke’s technique to show how we can Keep Only Numbers from a string of text (removing all other characters.)

Other Posts on the Subject

Each of these posts will be a targeted to a specific scenario with it’s own idiosyncrasies.  Which you need depends on your specific situation, of course

  • My original post to split off measurements leaving only the numbers (this will only work if there are no numbers in the measurement.)
  • The method in this post (which will remove all numbers – or text – in the input)
  • Bill Sysyz’s method to split off measurements (coming in a future post, but better than my original as it doesn’t break when measurements also include numbers)

In this Post:

In this post, we are going to keep only numbers in our data set.  Actually, we’ll also keep spaces and decimals the first time around, but we could easily modify the function to clear those too.  So for our first go, we’ll convert the data in the left column below, to show as displayed in the right column:


Of course, I started by just pulling the data into Power Query via the From Table command.

How to Keep Only Numbers

Looking at this from a logic point of view, what we want to accomplish is to remove any character that is not a number.  Ideally, we would like to use a function like this in a custom column in order to do so:

=Text.Remove(text as nullable text, removeChars as any)

The first parameter should be pretty easy, we could just feed in the [Quantity] column, but how would we provide all the characters to the last parameter?

Here’s the cool part… removeChars is an “any” datatype… that means we’re not restricted to a single character, we can actually provide a list.  So all we need to do is find a way to create a list of the characters to remove.

This is where Imke’s email to me was really helpful.  She had a step similar to the following in her code:

CharsToRemove = List.Transform({33..45,47,58..126}, each Character.FromNumber(_))

So what does this do?  It actually creates a list of non-contiguous numbers (33-45, 47, 58-126), then transforms each value in the list into it’s alphanumeric equivalent.  A partial set of the results is shown here:


For reference, character 32 is a space, 46 is a period, and 49-57 are the values from 0 through 9 – facts that you can discover by changing the values inside the lists.

In order to use this, I just popped into the Advanced Editor, and pasted the line above right between the “let” and “Source=…” lines.  (Don’t forget to add a comma at the end.)  And with a nice list of values contained the the CharsToRemove step, we can now create the custom column from the Source step:

  • Add Columns –> Add Custom Column
    • Name:  Result
    • Formula:  =Text.Remove([Quantity],CharsToRemove)

And it loads up nicely:


Now, keep in mind here that the purposed of this is to strip all characters except the numbers.  In the case of things like m2 and m3 in this data set, we’re left with a the final value, but that is exactly what the query is designed to do.

The final M code for this solution is:

CharsToRemove = List.Transform({33..45,47,58..126}, each Character.FromNumber(_)),
Source = Excel.CurrentWorkbook(){[Name="RawData"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Quantity", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Result", each Text.Remove([Quantity],CharsToRemove))
#"Added Custom"

Keeping Only Numbers

What if we wanted to also remove any spaces and decimals?  Easy enough, just add those values to the original list in the CharsToRemove step as follows:

CharsToRemove = List.Transform({32,46,33..45,47,58..126}, each Character.FromNumber(_))

And the result:


Removing Numbers Only

Now let’s keep the text and remove the numeric characters from 0-9 only.  To do this we modify the original list values again:

CharsToRemove = List.Transform({48..57}, each Character.FromNumber(_))



And the result:


End Result

This is pretty neat.  Once we recongnize which character code represents each character, we can build a list of those characters to remove, and take care of them all in one shot.  To put it all together, here is a look at the different views all shown in one table:


You can also download the completed file here.

Last PowerQuery.Training Class of 2015!

There will be a regular blog post coming later this week, but we wanted to just throw out a quick heads up that we are currently accepting registrations for the last PowerQuery.Training class of 2015.

Registrations are open now for the class which begins on November 24, 2015.  This will be your last chance of 2015 to get an in depth training class on the best damn tool to hit Excel in 20 years.  (Sorry Power Pivot, but Power Query is going to reach more people overall.)

For more details on why you need to take this amazing live online workshop, check out the details here.

To register, you can follow this link (and click the Register button on the bottom right of the page.)

And don’t forget that when you register you get a free digital copy of our amazing new M is for Data Monkey book too.

Hope to see you there!

Separate Values and Text in Power Query

I recently received a comment on one of my blog posts asking how to separate values and text, especially when there is no common delimiter such as a space separating them.  This is a bit of an interesting one, as there is no obvious function to make this happen.


The scenario the user has here is a list of values with their unit of measure, similar to this:


This issue here is that we don’t really have anything to easily split this up, as there isn’t really a good pattern.  Sometimes there are spaces after the values, sometimes not.  The letters change, and there is non consistency to how many characters the values represent.  So how would you approach this?

You can download the sample workbook here.

My Route

I think that a solution for this type of problem is going to be specific to the data in use.  Looking at the sample data, I figured that I can probably bank on all the numbers being at the beginning of the string, and that I probably won’t see something like square meters expressed as m2.  Of course, if that assumption wasn’t correct, I’d have to come up with another method.

At any rate, the angle I approached this was to build a custom function to remove the leading numeric values.  That should leave me with the text values, which I could then replace in the original string.  Let’s take a look.

Removing Numbers

As we recommend in M is for Data Monkey, the way to build a custom function is to start with a regular query that will let us step through each piece you need to do.

So focussing on doing this through the user interface, here’s how I started this solution.

  • Create new Power Query –> From Other Sources –> Blank Query
  • In the formula bar, I typed in 1.07Kg  (no quotes, just that text and pressed Enter
  • I then right clicked the text in the Power Query window, and choose to convert it to a list


Of course, you can’t do a ton with Lists in the user interface, so I converted it to a table:

  • List Tools –> Transform –> To Table –> OK

To be fair, I could have started by creating a record or a list from scratch (as we show you how to do in M is for Data Monkey,) but I didn’t really need to here in order to get up and running quickly.  Regardless, I’m now sitting in a nice place where I have the entire UI exposed to do what I need (which was my original goal.)


At this point, things become pretty easy:

  • Right click Column1 –> Replace Values –> Replace 0 with nothing
  • Repeat for 1 through 9 and the decimal character

This removed all numbers and decimals, leaving me with just text.  But because I know some of the values had spaces in them as well, I should deal with that:

  • Right click Column1 –> Transform –> Trim


The final thing I did was to drill into the data point there, as I don’t really want to return a table when I convert this into a function.  To do that I needed to:

  • Click the fx on the left of the formula bar
  • Append the following to the text in the formula bar:  [Column1]{0}


Notice that we now have just the data point, not the Column1 header.

Converting the Query to a Function

Now, we’ve got a neat little function that will let me take a data point, sanitize it, and turn it into data point with no leading values.  But how can I repurpose that to use it for every record?  The answer is to turn this query into a custom function, as we describe in  Chapter 22 of M is for Data Monkey.  Here’s how we do it:

  • Go to View –> Advanced Editor
  • Right before the “let” line, add the following:

(Source) =>

  • Go and place two / characters in front of the current Source line in order to comment it out (otherwise it would overwrite the function input)

//Source = “1.07Kg”,

  • Click Done
  • Rename the query to fxRemoveNumbers

That’s it.  We’ve converted it to a function.  Now you can go to Home –> Close & Load to save it and it’s ready for use.  The interesting part here is that creating the logic is the hard part, converting it to a function is deadly easy.

Separate Values and Text

So now let’s use our new function to separate values and text.  Here’s how I did this:

  • Select any cell in the table –> create a new query –> From Table
  • Go to Add Column –> Add Custom column
    • New column name:  Measure
    • Column formula:  fxRemoveNumbers([Quantity])

And we’ve got a nice new column with just the textual values.


Not bad, now we just need to figure out a way to replace the matching text in the Quantity column with nothing…  After checking MSDN’s Power Query formula guide, I found a formula called Text.Replace() that seems it should do just that:

  • Go to Add Column –> Add Custom column
    • New column name:  Value
    • Column formula:  =Text.Replace([Quantity],[Measure],"")




To summarize here, we’re going to look at what is in the Quantity column and replace any instance of the text in the Measure column with the value between the two sets of quotes (i.e. nothing.)  The results are shown below:


Now it’s just a simple matter of doing some cleanup:

  • Right click the Value column –> Change Type –> Decimal Number
  • Right click the Quantity column –> Remove


And there you go.  It’s finished.  We simply need to go to Home –> Close & Load to commit it, and then refresh it any time we need it.

M is for Data Monkey

The book is now available and is packed with good information that will help you solve this issue as well as many others.  Learn more about the book here.

Merge Data Based on Two Columns

This past weekend I attended SQL Saturday in Portland, OR.  While I was there, I attended Reza Rad’s session on Advanced Data Transformations with Power Query.  During that session, Reza showed a cool trick to merge data based on two columns through the user interface… without concatenating the columns first.

The Issue

Assume for a second that we have data that looks like this:


There’s two tables, and we want to join the account name to the transaction.  The problem is that the unique key to join these two tables (which isn’t super obvious here) is a combination of the Acct and Dept fields.  (Elsewhere in the data the same account exists in multiple departments.

To get started, I created two connection only queries, one to each table.

  • Select a cell in the left table (Transactions) –> create a new query –> From Table –> Close & Load To… Connection only
  • Select a cell in the right table (COA) –> create a new query –> From Table –> Close & Load To… Connection only

My Original Approach

Now, with both of those created, I want to merge the data so I get the account name on each row of the Transactions table.  So how…?

Originally I would have edited each query, selected the Acct and Dept columns, and merged the two columns together, probably separating them with a custom delimiter.  (This can be done via the Merge command on the Transform or the Add Column tab.)

Essentially, by concatenating the columns, I end up with a single column that I can use to dictate the matches.

Reza’s presentation showed that this isn’t actually necessary, and I don’t need to merge those columns at all…

Merge Data Based on Two Columns

So here’s how we can get those records from the COA Table into the Transactions table:

  • Right click the Transactions query in the Workbook Queries pane
  • Choose Merge
  • Select the COA query

The data now looks like this, asking for us to select the column(s) we wish to use for the merge:


So here’s the secret:

  • Under Transactions, click the Acct column
  • Hold down the CTRL key
  • Click the Dept column

And Power Query indicates the order of the columns you selected.  It will essentially use this as a temporary concatenated value!


So now do the same to the COA table:


And then complete the merge.  As you can see, you get a new column of data in your query:


of course, we can expand NewColumn to get just the Name field, and everything is working perfectly!


End Thoughts

This is pretty cool, although not super discoverable.  The really nice piece here is that it can save you the work of creating extra columns if you only need them to merge your data.

I should also mention that Reza showed this trick in Power BI Desktop, not Excel.  But because it’s Power Query dealing with the data in both, it works in both.  How cool is that?

Clean WhiteSpace in PowerQuery

The other day as I was working through a model, I once again tripped upon the fact that Power Query’s Text.Trim function doesn’t clean whitespace inside the text string, only at the ends.  For those who are used to Excel’s TRIM function, this is a frustrating inconsistency.

Just to circle on it, here’s the difference:

Source Function Result
Excel =TRIM(“  trim   me  “) “trim me”
Power Query =Text.Trim(“  trim   me  “) “trim   me“

Typically, I’ve just gone through the cycle of replacing a double space with a single space a few times on the same column to deal with this issue.  The issue, of course, is that you need to do this twice if there are 4 spaces, but add more spaces, and you have to do this more times.  Doesn’t seem like a really robust solution.

At any rate, this time I emailed one of my friends on the Power Query team and suggested that they should implement a function to make this a bit easier.

My Suggestion for a Clean Whitespace Function

The gist of my suggestion was to create a new function that would not only trim the whitespace internally, but would also allow you to specify which character you want to clear out.  This way it would work nicely to clean whitespace in the shape of spaces (the usual culprit in my world), but would also allow you to substitute in other characters if needed.  (Maybe you need to replace all instances of repeating 0’s with a single 0.)

It got referred to another friend on the team, (who wishes to remain nameless,) and he responded with some pretty cool code.  I’ve taken that code, broken it down and modified it a bit, and the end result is a slightly different version that can work the same as Excel’s TRIM() function, but adds an optional parameter to make it even more robust.  For lack of a better name, I’m going to call it “PowerTrim”.  (Just trying to do my part to keep the Power in Power Query!) 😉

Here’s the function:

(text as text, optional char_to_trim as text) =>
char = if char_to_trim = null then " " else char_to_trim,
split = Text.Split(text, char),
removeblanks = List.Select(split, each _ <> ""),
result=Text.Combine(removeblanks, char)

And to implement it, you’d take the following steps:

  • Copy the code above
  • Create a new query –> From Other Sources –> Blank Query
  • Change the query name to PowerTrim
  • Go into the Advanced Editor
  • Select all the text and replace it with the code above –> Done

Like this:


How it Works

We’d call this from a custom column, feeding in a column of text, and specifying the character (or even string of characters) we’d like to trim.  The function then works through the following process:

  • It checks to see if the char_to_trim was provided, and uses a space if not
  • It splits the text by that character, resulting in a list:


(This list shows the word “bookkeeper” split by “e”)

It then:

  • Filters out any blank rows
  • Combines the remaining items using the original character to split by

(The original version was actually all rolled up in one line, but I find it easier to debug, step through, examine and play with when it’s separated.)


Here’s some examples of the function in action. I started with a raw table from Excel.  (Create a new query –> From Table)


And added a Custom column by going to Add Column –> Add Custom Column

  • Name:  Trim_null
  • Formula:  =PowerTrim([Text])


Notice that in the first row it trimmed the leading, trailing and internal spaces.  Just like Excel!  (Remember that if you used Power Query’s default Text.Trim() function, you would return “trim   me”, not “trim me”.)

Now, let’s add another and try with an alternate character… like 0.  Again, we go to Add Column –> Add Custom Column:

  • Name:  Trim_0
  • Formula:  =PowerTrim([Text],”0”)


In this case the extraneous zeroes are trimmed out of row 3, leaving only a single one.  Cool stuff.  Now what about the “e”. Let’s see how that one goes.

Once more to Add Column –> Add Custom Column:

  • Name:  Trim_0
  • Formula:  =PowerTrim([Text],”e”)


The first time I looked at this, I thought there was an issue with the function.  But then I remembered in this case we are removing all leading and trailing e’s, as well as replacing any duplicate e’s with a single e.  You can see that this is indeed what happened in both rows 2 and 4.

Final Thoughts

I wish there was a way to get this to easily role into the Text functions category, so that I could call it something like Text.PowerTrim() or even replace the Text.Trim() function with my own.  Unfortunately a query name can’t contain the period character, which kind of sucks.  I guess it’s to to protect you from accidentally overwriting a function, but I’d like the ability to do it intentionally.

Allocate Units Based on Dates Using Power Query

I ran into an interesting wrinkle in a model I’m building, where I need to allocate units based on dates.  The idea here is to allow a user to the number of units to allocate, the start date and the end date.  From there, I wanted to use Power Query to work out how many months have elapsed, and then tell me how many units should be allocated to each year in the period.


Here’s a look at my data (which you can download here):


So the idea here is that I need to come up with a table that shows that data should be allocated as follows:

10-1-2015 8-53-05 AM

So, if we look at the Traditional Single Family, the sales cycle is the 6 months from Aug 2015 through Jan 2016.  With the first 5 months being in 2015 and the final month being in 2016, that means we need to allocate 5/6 of the total units to 2015 and 1/6 to 2016.

Allocate Units Based on Date: Method

My initial thought was to try and find a date difference or duration type function to return a count of months between two dates.  Unfortunately, such a function doesn’t seem to exist.  For that reason, I decided I’d just go ahead and build my own function to do the job.

Step 1: Create a function to return a list of months

To start with, I needed a list of month end dates.  I started a blank query, jumped into the Advanced Editor and built a simple query to provide a hard coded startdate and enddate, then create a list from one to the other:

Source = {Number.From(startdate)..Number.From(enddate)}

That list yielded me a list of date serial numbers, so I then:

  • Went to Transform –> To Table
  • Changed the column’s data type to Date
  • Renamed the column to Date
  • Converted the column to month end dates (Transform –> Date –> Month End)
  • Removed Duplicates (Home –> Remove Duplicates)

The end result is a short table that shows only the month end dates:


Step 2:  Add a Year End date column

I then needed to find a way to count the number of months in each year.  To do that I:

  • Added a year end column (Select the Date column –> Add Column –> Date –> Year –> End of Year)
  • Went to Transform –> Group By and set up the grouping as follows:
    • Group by:  EndOfYear
    • New column name: Months_x_Year
    • Opeartion:  CountRows


Step 3: Modify to list Months in Period

At this point I realized that I had a pretty serious miss in my logic.  If I wanted to apply this as a proportion, I needed to also track the total amount of months in the period (so that I could allocate 5/6 to 2015 and 1/6 to 2016.)

To fix this, I added another level of grouping, but with a twist…

  • I removed the “Group By” column
  • I created an “Original” column, and set the operation to All Rows
  • I created a “Months_Total” column, set to SUM the Months_x_Year column

Here’s the configuration:


And the result:


This is pretty slick, as the grouping returned the total count of months, but also returned the original table.  Of course, when you expand the table using the double headed arrow to the top right of the Original column, it runs the Months_Total row down each row that gets added:


Step 4:  Turn the routine into a function

The next step was to go back into the Advanced Editor, and turn this into a function. That’s actually not hard at all, requiring only three lines to be modified.  The first 4 lines of the function are shown here:

(startdate as date, enddate as date) as table =>

As you can see, I basically added the parameter line at the beginning (using the same variable names for startdate and enddate), then commented out the lines I initially used in order to populate the data I used to build my test case.

Finally, I renamed the function to fnGetAllocationBase, and saved it.

Step 5:  Using the function

To use the function, we basically now just load the original table, then feed the start/end dates in to it.  Here’s how I went through that process:

  • Select the table –> Power Query –> From Table
  • Select the First Month and Last Month columns –> Change Type –> Date
  • Add Column –> Add Custom Column
    • Formula:  =fnGetAllocationBase([First Month],[Last Month])

I now had a new column containing the tables I needed with my allocation basis:


As I didn’t need month granularity for my model, (we’re budgeting on an annual basis,) I’m now able to:

  • Remove the First Month and Last Month columns
  • Expand the columns from the Custom column
  • Add a new custom column with the following details:
    • Name:  Units
    • Formula:  =[Units To Allocate]*[Months_x_Year]/[Months_Total]
  • Remove the Units to Allocate, Months_x_Year and Months_Total columns
  • Set my data types

And the end result is a nice table that will serve my sales model nicely:


Split Power Queries

There was a really cool new feature added in the latest Power Query update:  The ability to split Power Queries.  This is something that has always been possible by editing the M code manually, but this makes it super simple.

Where this is useful

Where can this be super helpful is when you’ve built a nice query to reshape your data.  It runs fine for a bit, then you realize that you need to create a larger data model out of the data.  For example, assume we have this data:


And we run it through these steps:


To come up with this:


All of this is fairly easy as Power Query goes, but now the model needs to grow.  In order to expand it, we also want to create a table of unique Inventory Items and a table of unique Sales people.  Basically we want to de-aggregate the data that should have come in from separate tables in the first place.

Methods to Split Power Queries

Like always, there are a variety of ways to do this.  You could create new queries to load the data from the original table, then cut it down to the columns needed in each case.  But that causes and extra load requirement.

You could manually cut the code up to the step required, create a new blank query, then reference the new query from the previous.  But that takes some knowhow and tinkering that many people won’t be comfortable with.

Starting in Power Query version 2.26 (released today), we have a MUCH easier way.  Let’s assume that we want to split this query right after the Changed Type step, so that we can create an Items table and a Saleperson table in addition to the Transactions query that we already have.

How to Split Power Queries – the easy way

To start, we need to understand the steps and what they give us.  We can step through each step of the query, and find the step that gives us the jumping off point we need.  In my case, this is the Changed Type step.  We then right click the step AFTER Change Type, and choose Extract Previous:


You’ll be prompted to enter a name (I’ll use “Base Load”), and upon clicking OK, you’ll see a couple of things happen:

  • A Base Load query is created
  • The Queries Navigator opens on the left, showing you’ve now got multiple queries
  • The Transactions query (the one I’ve been working on) gets shorter
  • The Transactions query’s Source Step gets updated to #”Base Load”



You can see these changes here:


So the Transactions query still contains all the subsequent steps, but the Source step changed, and the Changed Type step is now in the Base Load query:


The biggest piece of this whole thing, however, is that the Base Load query is still pointing to the raw source table, but the Transactions query now loads from Base Load, NOT the original data source.  So it’s following the staging/loading approach I talk about in this blog post.

Now, how can we use this…?

Making Use of the Split Power Queries

So far, very little is different to the overall goal, except that we load in two stages.  Let’s change that by creating a new query that references the Base Load query:

  • Right click the Base Load query in the Query Navigator (at left)
  • Choose Reference
  • Change the query name to Salespeople
  • Right click the Sold By column –> Remove Other Columns
  • Select the Sold By column –> Home –> Remove Duplicates
  • Right click the Sold By column –> Rename –> SalesPerson

And we’ve now got a nice query that shows our unique list of sales people:


Now let’s build the SalesItems table:

  • Right click the Base Load query in the Query Navigator
  • Choose Reference
  • Change the query name to SalesItems
  • Right click the Inventory Item column –> Remove Other Columns
  • Select the Inventory Item column –> Home –> Remove Duplicates
  • Right click the Inventory Item column –> Rename –> SalesItem

And this table is done now as well:


Loading the Split Power Queries to the Data Model

The final step is to load these to the Data Model.  We’ve actually created three new queries in this session, but we don’t get the liberty of choosing a new destination for one of them.  Instead, we get to choose a single loading style that will be applied to ALL of them.  (If in doubt, I’d suggest that you load queries as Connection Only first, then change them after if you need to pick different destinations.  This will save you waiting while Power Query litters your workbook with extra worksheets and loads the data to them.)

For our purposes here, I’ll load them directly to the Data Model:

  • Home –> Close & Load To…
    • Select Only Create Connection
    • Select Add to the Data Model
    • Click Load

The only side effect here is that that the Base Load query was also loaded to the data model, and I don’t need that.  So I’ll now quickly change that.

  • Go to the Workbook Queries pane –> right click Base Load –> Load To…
  • Uncheck “Add this data to the Data Model” –> Load

And I’ve now got my tables where I need them so that I can relate them and build my solution.


Final Thoughts

This is a fantastic feature, and I was actually going to blog on how to do this the old way, until they surprised me with this update.  I’m a huge fan of the staging/loading approach, and this will certainly make it easier to retrofit a query after it’s already been built.

Performing OR logic tests

When working with Power Query, it’s actually pretty simple to restrict records in a table using an AND criteria.  As filters on different columns are applied in a cumulative fashion, we just keep applying more to filter down the data.  But what about performing OR logic tests?

The issue

Assume we’ve got a list that looks like this:


And we want to preserve any record where the inventory item is Talkative Parrot OR the item was sold by Fred.  So how would you do that?

We’d be tempted to start by filtering the Inventory Item column to just the Talkative Parrot item.  But that would remove our very first row, which shows an item sold by Fred.  And if we started by filtering the Sold By column to Fred, we’d lose the Talkative Parrot in row 2.

Performing OR logic tests – via merging columns

We could merge the Inventory Item column together with the Sold By column, then filter for any item that contains Talkative Parrot or Fred.  While this might work in this situation, it could also lead to cross contamination if there were names like Manfred or Wilfred, or if the name we were searching for appeared in one of our Inventory items (like “Fred the dog”.)

Performing OR logic tests – via list functions

Using list functions is a far better way to make this kind of comparison. In fact, there is a function that is specifically built for this purpose, which works exactly like Excel’s OR() function; List.AnyTrue().

To use this, we’ll set up a new custom column with the following formula:

{[Inventory Item]="Talkative Parrot",

[Sold By]="Fred"}

The formula breaks down as follows:

  • The function name is List.AnyTrue()
  • The function requires a list, which means that you need the {} inside the parenthesis.
  • Each list item (comparison), is nested inside the curly braces, and is separated by commas

Knowing this, we then just need to include the comparisons we want to make, with the field name surrounded in square brackets, and the comparison surrounded in quotes (as we are making a text comparison here.)

And the result, as with Excel’s OR() function, is a list of TRUE/FALSE values:


Of course, we can do better than this, by wrapping the original function in an if statement:

=if List.AnyTrue(
{[Inventory Item]="Talkative Parrot",[Sold By]="Fred"}
then "Meets Criteria!"
else "No Match"

Which returns useful messages instead of just TRUE/FALSE values:


Naturally, you could filter these records or keep them, as your need dictates.

Performing AND logic tests

If you’re simply looking to filter out records based on an AND criteria, then applying successive filters will work.  But what if you only want to tag the records, so that you can use this field in a slicer?  In that case you need to preserve both the matching and non-matching items.

Fortunately, Power Query has a list function for this as well:  =List.AllTrue().

This function is created in the same manner as List.AnyTrue(), where the following would only show a “Meets Criteria!” message where the Inventory Item is Talkative Parrot AND the Sold By field contains “Fred”:

=if List.AllTrue(
{[Inventory Item]="Talkative Parrot",[Sold By]="Fred"}
then "Meets Criteria!"
else "No Match"

A Quick Note

This topic is contained in the forthcoming M is for Data Monkey book.  And it’s not too late to get YOUR name inside!

Get Your Name in M is for Data Monkey

I’m pleased to say that our new book “M is for Data Monkey” is now complete, through the technical edit, and into the copy edit stage.  While there is still more work to go, the book is on schedule for release in November (with Amazon shipping in December, I believe.)

At this point we’re just finalizing the “Thank You” section, and wanted to invite everyone who has supported and pre-ordered the book the chance to get your name in M is for Data Monkey:

How to get your name in M is for Data Monkey

If you’d like to get your name inside, we need you to submit the following information via email to

  • A copy of your receipt for the pre-order of the book - if you haven’t pre-ordered it then you can do so here
  • The name that you want to be displayed in the book – for example, it can be Miguel ‘Mike’ Escobar, Ken ‘Power Query’ Puls, your company name, or whatever you want to see there.

We only have space for a limited amount of names (around 80), so please submit your information as soon as possible!

The deadline to submit this information is September 16th, 2015 at 23h59m US Eastern time, so don’t miss out!

Power Query’s Extract Text Feature

This will be a short post, as today we are leading our second sold out Power Query workshop at  I wanted to make sure I still got something out for my readers today though.  This time I’m looking at a feature that was added in the August Power Query Update: Extract Text.

You can find these commands on both the Transform and the Add Column tabs, with the former just converting your selected column, and the latter creating a new column of results while preserving the original column.


Here’s what’s interesting to me about these functions:

They replicate the LEN(), LEFT() and RIGHT() functions, saving you having to build them manually as I discuss here.  This is handy, and pretty seamless.

Here, you’ve got a table of words, and the second column is generated using Excel’s LEFT() function.  The final column was generated by:

  • Selecting the Word column –> Add Column –> Extract –> First Characters –> 4


Nice and consistent with Excel’s LEFT() function.

Likewise, Last Characters replicates the RIGHT() function by going to:

  • Selecting the Word column –> Add Column –> Extract –> Last Characters –> 4


And Length replicates the LEN() function:

  • Selecting the Word column –> Add Column –> Extract –> Length


The Range function is a user interface implementation of what should be equivalent to the MID() function.  In this case, however, it’s still has the following issues for Excel pros:

  • It is Base 0, meaning that you want to start at the 3rd character of the text string, you need to specify that you want to start at character 2 (Power Query starts counting at 0, not 1)
  • If you provide a value for the “number of characters to return” that is larger than the total number of characters – the starting character, you’ll get an error.  (Unlike the MID function)

So when you try to use Range in place of MID as follows:

  • Selecting the Word column –> Add Column –> Extract –> Range
  • Starting Number: 5
  • Number of Characters: 4

You get this:


Ugh.  And correcting to subtract one from the starting index, you get this:


Better, but still errors.

Honestly, I was hoping the user interface implementation would solve those issues building the more complicated code shown in my blog post on the subject.

So, at the end of the day, it’s awesome, but still doesn’t offer full “Excel parity”.  And if you want that, you’ll need to learn to work with formulas in Power Query.

The good news?  We teach how to do that in our Power Query workshop.  In addition, we’ve just announced a new registration intake.  If you’re interested in learning how to master Power Query, check it out at