This post is a guest article from Marcus Croucher, a finance professional based in Auckland, New Zealand. Marcus instantly fell in love with Power Query after seeing how it can easily transform data in ways Excel finds difficult, and how it can automate repetitive workflows.
I have been using Power Query in a professional capacity for a number of years, but have never fully understood exactly how the each keyword works in Power Query. I did some research around two years ago, but the documentation at the time was quite sparse (and still isn't great) and I did not have enough knowledge about wider programming to fully understand it.
In the meantime, I was looking for a way to use Power Query-like technology on OSX. I ended up learning a lot of Python, which has several libraries that have similar functionality to Power Query (albeit without the amazing graphical interface of Power Query.) Two notable examples are pandas (very popular data analysis library), and petl (a more light-weight and easy to use data processing toolkit).
This general programming knowledge gave me the background to understand some of the underlying concepts behind each, which I will now proceed to attempt to convey to you, the general intermediate to advanced Excel user, so that you can understand and wield the each keyword with confidence. I will use Python as a parallel and will link to some Python articles that expand on the underlying concepts I am trying to explain.
A Deep Dive into How the Each Keyword Works
In this article, I assume that you are familiar with Power Query and how to use it on a practical level. I also assume that you have some (limited) experience with the advanced editor and have seen the underlying M code, and understand the underlying data structures and how to access them.
- Table
- List (columns) – notated as {item one, item two, etc.} and accessed by [column header]
- Record (rows) – notated as [category1: data1, category2: data2, etc] and accessed by {row number}
I suggest going ahead and pasting the code snippets into a blank query in Power Query to see for yourself what is really going on.
When Might You Use the Each Keyword?
The each keyword in Power Query is used by quite a few of the built-in operations. For example, filtering a table based on a certain column features each here:
let
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
filter_to_score_5 = Table.SelectRows(sample_table,
each ([Score] = 5))
in
filter_to_score_5

Using the Each keyword to filter a table based on a certain column
The purpose is quite clear: we want to keep each row where the value in the Score column is equal to 5. However, how does this work? [Score] looks like it is referring to a full column – how can we do a full column equal to a single value? Can I access values from the row above or below the row we want to keep/discard?
Another example of the each keyword in Power Query is creating custom columns. Here, I create a custom column which is the combination of the index and the name columns:
let
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
add_ordered_name_column = Table.AddColumn(
sample_table, "ordered_name",
each Number.ToText([Index]) & " " & [Person])
in
add_ordered_name_column

Using the Each keyword to create a custom column which is the combination of the index and the name columns.
Similar questions apply. It is easy to follow the logic, but how it works and the limitations behind it are somewhat of a mystery.
So What is the Each Keyword in Power Query?
The current documentation has this to say about the keyword:
Each Keyword
The each keyword is used to easily create simple functions. “each ...” is syntactic sugar for a function signature that takes the _ parameter “(_) => ...”
Each is useful when combined with the lookup operator, which is applied by default to _ For example, each [CustomerID] is the same as each [CustomerID], which is the same as () => _[CustomerID]
Still not very clear (unless you have a background in functional programming languages). It would be really nice to get a full understanding, as the each keyword in Power Query is used in a number of places (filtering, custom columns, etc.) and understanding it would give us an understanding of what we can and can't do with it.
Turns out, you need to understand three things to understand each:
- Functions as first class objects
- "_" as a temporary variable (and the associated shortcuts within M language)
- Anonymous functions (each)
Functions as First Class Objects
If you are used to using Excel, you are used to functions (e.g., =SUM() ) being a bit magical. We cannot touch or modify them, and they are supplied fully formed by Excel. You can create custom functions with VBA, but these are far and few between, and still seem like the lessor cousin to the in-built functions.
In Power Query, functions can be thought of just another "object" – or just another type of data. This means they can be:
- Assigned to a variable and/or renamed.
- Just like we can do something like variable = 5, in Power Query, so we can do something like variable = function.
- Fairly easily created
- Used as a parameter to another function (!)
To understand this, we need to distinguish between calling a function (using it in our code) and referring to it (to name it or to use it within another function). Similar to other programming languages, to call a function we use the parentheses at the end like:
function()
If we want to refer to a function we just omit the parentheses like:
function
Let's demonstrate the renaming/reassigning of functions. First I take one of the supplied functions which takes a list (i.e., column) and calculates the sum. Next, I build a sample table, and then take a sum of one of the columns using the function that I had defined at the beginning.
let
sum_column = List.Sum,
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
sum = sum_column(sample_table[Score])
in
sum

Calling a function within another function.
It works!
Let's create a basic function
In Power Query, the syntax to create a function is:
(variable) => body of function
The body of the function is like any other query that returns a value, but instead of being delivered to the main Power Query interface it is delivered to whatever called it.
We also want to assign a name to the function, so let's expand the above template. Here, we create a simple function that takes a number and multiplies it by two.
multiply_by_two = (number) =>
let
output = number * 2
in
output
Looking good, but now we want to use this snippet in a full query. Let's build a (trivial) query that uses this function to multiply a variable by two and output the result:
let
multiply_by_two = (number) =>
let
output = number * 2
in
output,
x = 5,
y = multiply_by_two(x)
in
y

Building a simple query that uses the function we created.
Functions as inputs to other functions
Now that we have explored how functions are the same as any other data type or variable, and we have demonstrated how we can create our own functions, let's look at functions that take other functions as inputs. One example from the official documentation is the filtering function, Table.SelectRows.
About
Returns a table containing only the rows that match a condition.
Table.SelectRows(table as table, condition as function) as table
So the function expects a table (makes sense), and a function! How does this work? According to the documentation, the condition is "the condition to match".
It's not very well documented, but it turns out that this function expects a function to be supplied. It then applies that function to each row (record) of the table, and expects a true or false response from the function. It then uses this response to decide whether to keep or discard the row.
To recap, a record is a data type representing a row. We can access the items from a record by supplying the column name as follows: record[column name].
Let's create a function which we can then supply to Table.SelectRows on our sample data. Note – this is our case study example which I will develop throughout this article.
let
filterer_score_two_plus = (record) =>
let
value = record[Score],
result = value >= 2
in
result,
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
filtered = Table.SelectRows(
sample_table,
filterer_score_two_plus)
in
filtered

Creating the sample_table function.

Using the Table.SelectRows function to filter the table created by the sample_table function.
What have I done here? First I have created a function which takes a record, extracts the value in the Score column and returns true if it is greater or equal to two. I then construct my sample table and apply the Table.SelectRows function on it, supplying my recently constructed function as the second input. As we can see, the output as expected provides a table with all rows with scores greater or equal to two.
Now, only if there was a quicker and easier way to create such functions, as it looks like we might have to build these one-use functions quite a lot...
"_" as a Temporary Variable
The use of _ as a throw-away variable is common across several programming languages, Python included (see point 4 here). Usually, _ is used to name things that are not going to be used again and so it is not worth using up another name.
Here I write a query creating a table, assigning it to a variable called _. Power Query has no problem whatsoever using _ in this way.
let
_ = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"})
in
_
For Python, using _ is just a convention, but it appears that Power Query has expanded the functionality here. Let's say we just want the column of names from the above table. Usually we can do this by selecting the column by name using [column_name] as the selector.
let
_ = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"})
in
_[Person]
It turns out, we can omit the _ in this statement as Power Query will infer that if we just put [Person], the table we are referring to is the one called _. The example below works just as well as the one above:
let
_ = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"})
in
[Person]

If we omit the _ variable in this statement and just use [Person], Power Query infers that the table we are referring to is the one called _.
I wouldn't recommend this as general practice, as it is not well documented or understood behaviour and explicit is usually better than implicit. However, it does provide nice-looking code when used with the
each keyword.
Note that this technique only works for column selections [column_name] rather than row selections {row_number}, as Power Query will interpret {row_number} as a new list.
Applying the _ Variable
With this concept in place, let's revise our filtering query defined above:
let
filterer_score_two_plus = (_) =>
let
result = [Score] >= 2
in
result,
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
filtered = Table.SelectRows(sample_table,
filterer_score_two_plus)
in
filtered
The shortest we can actually get this is pretty close to our final stage. Let's put the function definition right into the Table.SelectRows function, and get rid of the let and the in (only really needed if there are multiple steps in the calculation):
let
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
filtered = Table.SelectRows(sample_table,
(_) => [Score] >= 2)
in
filtered
This is already pretty tight and clean. However, the (_) => is pretty scary if you're not familiar with function definitions, and the function definition symbol "=>" is fairly similar to our greater than symbol ">=".
Anonymous Functions
As we saw above, we end up creating one-off functions to supply to other functions quite frequently in Power Query. It seems silly to go through all of the syntax of creating and naming a function if it won't be used again. There is a concept called anonymous functions, which are functions that are defined but not named. They are used as soon as they are created. In Python, these are known as lambda functions.
We can actually use the each keyword in Power Query to define the function. (Yes, we are finally at the each keyword itself!) Each just minimizes the syntax for creating functions, by providing a default input variable name, "_", and removing the need for the => or anything else. So:
(_) =>
let
result = [Score] >= 2
in
result
can become:
each [Score] >= 2
You can still name this if you like (filterer = each [Score] >= 2), but using the each keyword in Power Query is much more useful if we use it inline. So we come to our final query, which should look fairly familiar to intermediate Power Query users:
let
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
filtered = Table.SelectRows(sample_table,
each [Score] >= 2)
in
filtered
So What have We Discovered?
Now we have demystified the magic of the each keyword in Power Query, but what have we discovered?
- Each itself doesn't actually do that much, slightly simplifying our function definitions.
- Once you understand the concept of supplying functions as inputs to functions, everything becomes a lot clearer.
- The missing piece of the puzzle comes from an understanding of the special _ variable name, which enables us to take a shortcut when selecting columns from tables or records (we can use [column name] instead of _[column name]).
- The use of each hinges on the behaviour of the underlying function. Because we know that Table.SelectRows calls the supplied function on each record of a table and expects a true/false response, we can construct a function that works as we expect it to. Unfortunately, this is not very well documented.
How can we use this information? I can think of a few different ways:
- If we have a complicated add column or filtering step to do, we can separate out the underlying logic into a separate function which can have multiple steps. This removes the complexity from the main body of the code and abstracts it away. It can be easier to read filter_to_current_year_red_cars rather than try to interpret all of the various equivalence statements as you read through the code.
- Now that we understand the context that is delivered (only the current record/row), we can construct functions that can do more interesting things (these usually require an index column). For example, we can filter a table based on a second table, or add a new column that subtracts the current row from the previous row.
Here's an example that:
- Abstracts the logic to a helper function, and
- Adds a column based on the difference for each row from the previous row.
let
row_difference = each
if [raw_index] = 0
then [Score]
else [Score] - add_index[Score]{[raw_index] - 1},
sample_table = Table.FromColumns({
{"Fred", "Mary", "Phil", "Sue"},
{1, 2, 3, 4},
{5, 2, 5, 1}},
{"Person", "Index", "Score"}),
add_index = Table.AddIndexColumn(sample_table,
"raw_index", 0, 1),
add_difference_column = Table.AddColumn(
add_index, "difference",
row_difference)
in
add_difference_column

Adding a column based on the difference for each row from the previous row
So does this help your understanding of the each keyword in Power Query? Has this given you some new ideas on how to structure your queries? Please let me know if you have any questions!