The IF Function In Power Query

In my last post I talked about useful text functions, and how they differed between Excel and Power Query.  Today we’re going to look at another compare/contrast scenario, but this time it’s going to be the IF function in Power Query.

Critical background

The only important thing we need to remember here is that all functions in Power Query, whether text, logic or anything else, are case sensitive.  That may strike you as weird in this one, but we need to remember that “if” is not the same as “IF”, and that Power Query will gag on the latter.

The base scenario

For this example I’m going to work with a table of data that holds a customer number, a boat type and a billing code schema.  While the data has been scrambled, this represents a real structure that we use in my day job.

There’s no real mystery to the first two items, but the billing code schema we designed holds a ton of info.  It’s always 10 characters long, and breaks down like this:

  • Char 1 – Alpha – Indicates the division (G = Golf, F = Fitness, M = Marina)
  • Char 2 – Alpha – Indicates the billing type (D = Dues, P = Pass, A = Annual Moorage, P = Periodic Moorage)
  • Char 3-4 – Numeric – Indicates the number of months of coverage for the product (1-12)
  • Char 5-6 – Numeric - Indicates the start month (and subsequent anniversary) for the customer’s product
  • Char 7-8 – Variable – Slip length (in feet) for a boat in the case of marina customers, or SG, CP or CS for golf (indicating single, couple primary or couple spouse)
  • Char 9 – Text – A variety of single letter codes indicating specific things we want to know. (Will factor in to a future post.)
  • Char 10 – Text – Indicates the payment method (F = Financed, P = Paid up front, C = Comp/Honorary)

So a table of customer data could look like this:

SNAGHTML153bc957

Turning data into more useful data

So great, we’ve got this awesome billing code schema, but it doesn’t really tell me anything when I look at it, as it’s too complicated to read.  I really need to break this into separate pieces, and make useful and readable columns out of it.  So that’s what I’m going to start doing now.

The first step is, of course, to click in the table and go to Power Query –> From Table.

My goal here is to make a column that says “Annual” if the second character is an “A”, or “Periodic” if the second character is a “P”.  To start, I’m just going to reach back to last week’s article and make sure I can identify which character I’m looking at.  So first I’ll click “Add Custom Column”.

I’ll call my new column “Seasonality”, and use a formula to extract just the 2nd character:

SNAGHTML16328557

And with that in place we can now focus in on the important data here:

SNAGHTML163368c2

Writing IF functions in Power Query

Assuming the data was in a table that started in row 2 of the worksheet, either of the following formulas would work to convert “A” to “Annual” or “P” to “Periodic”:

=IF([@Seasonality]="A","Annual","Periodic")
=IF(D2="A","Annual","Periodic")

Easy enough, right?  But look at how the signatures differ from Excel to Power Query:

Excel =IF(test, value_if_true, value_if_false)
Power Query =if test then value_if_true else value_if_false

Notice that there are no parenthesis or commas in the Power Query version, but you need to actually type out the “then” and “else” portions.  So to create the same thing in Power Query, we’d need a new column that uses the formula:

=if [Seasonality]="A" then "Annual" else "Periodic"

Or, as is my preference, we modify the Seasonality column we already built, wrapping the text extraction with the IF function as follows:

=if Text.Range([BillingCode],1,1)="A" then "Annual" else "Periodic"

Once we modify the original formula, our table now correctly shows the different values all the way down:

SNAGHTML1642027d[5]

Observations

Once again, I find this a bit of a departure from regular Excel formulas.  Although it’s not hard to make the transition once you understand it, it would still be nice if the language could leverage the skill set we’ve worked so hard to master.  You could argue that the verboseness of the Power Query IF function is easier to read, but it’s still inconsistent with the formulas we know and love.

I still feel it would be nice if we could have an alternate pointer into the same function so that I could type this in Power Query too:

=IF([Seasonality]="A","Annual","Periodic")

I think that would just make it so much easier to get off the ground running for Excel pros.

I’ll also point out that the error message Power Query gives you when you create an IF function or formula is not exactly helpful:

image

Most Excel pros aren’t going to understand what “Token Eof expected.” means, and I really have to question how it is telling me anything that I need to do to fix the formula.  Hopefully, in future versions of Power Query we get a more helpful message that says something like “It looks like you typed an upper case formula name.  Can I fix that for you?” (Maybe that will come with Intellisense and auto-complete…)

Taking this further

Next blog post we’ll look at how to take this a bit further… extending our conditional logic to look up a corresponding value in a list, avoiding having to nest several IF functions within each other.

5 Very Useful Text Formulas – Power Query Edition

Years ago I published an article on my site called Five Very Useful Functions For Working With Text.  This post is alternative version, but with a twist.  I bring you 5 Very Useful Text Formulas - Power Query Edition.

The article is geared to explaining five functions specific to working with Text in Excel, and are a set of the most under-utilized functions in Excel (in my opinion).  It even lets you try them out live in the web page without even having to open Excel at all.

Now, over the past few months I’ve been working with Power Query, and one of the things that’s been driving me a little crazy is that the formula names in Power Query and not the same as they are in Excel.  I personally think this is a bit of a mistake, and that the formula names in Power Query should have been a bit more congruent with standard Excel formulas (Power Pivot’s DAX functions are similarly named to Excel, so why not Power Query’s M language?)

Some critical background

Before you bang your head against the wall, there are two things that are really different between Excel formulas and Power Query formulas:

Case sensitivity:

While Excel formulas are not case sensitive, Power Query formulas are.  If the Power Query formula signature says “Text.Start” then “TEXT.START” or “text.start” will NOT work for you.

Base 1 vs Base 0

Excel formulas are what we refer to as “Base 1”.  This means that you count starting at 1.  Power Query, on the other hand starts counting at 0, not 1.  The implications of this are that it is very easy to write your formula referring to a number that is out by 1.  To see the effects of this, check the section below on the alternate for the MID function equivalent.

Excel –> Power Query Formula Equivalents

Let’s take a look at how the five functions I illustrated in that original example differ from Excel to Power Query…

LEFT

To get the left x characters, we basically replace LEFT with Text.Start:

Syntax Example Result
Excel =LEFT(text,num_chars) =LEFT(“Excel”,2) Ex
Power Query =Text.Start(text,num_chars) =Text.Start(“Excel”,2) Ex

Easy enough once you recognize it, although I would have preferred that the formula name was consistent.

RIGHT

To get the right x characters we have a similar situation.  The function name needs to change from RIGHT to Text.End:

Syntax Example Result
Excel =RIGHT(text,num_chars) =RIGHT(“Excel”,2) el
Power Query =Text.End(text, num_chars) =Text.End(“Excel”,2) el

Okay, so we’re getting the hang of this now…  Just change the function name and the rest work the same, right?  Um, no.

MID

This one gets a bit weird.  First we replace MID with Text.Range.  Okay, no problem there.  But look at the results when we pass the same parameters:

Syntax Example Result
Excel =MID(text,start,num_chars) =LEFT(“Excel”,2,2) xc
Power Query =Text.Range(text, start,num_chars) =Text.Range(“Excel”,2,2) ce

They differ a little, don’t they?  The issue comes down to that base 0 vs base 1 thing I mentioned above.  Where Excel’s formula language counts the word with E being character 1, Power Query considers that character 0.  So in this case, when we tell Power Query to start returning text at character 2, it pulls back c (E is 0, x is 1, c is 2).  Interestingly though, the last parameter needs to be 2 to pull back 2 characters.

LEN

Getting the length of a text string in Power Query is actually a bit more intuitive than Excel’s native function, only because the function name isn’t trimmed off.  Text.Length is what we need instead of LEN.

Syntax Example Result
Excel =LEN(text) =LEN(“Excel”) 5
Power Query =Text.Length(text) =Text.Length(“Excel”) 5

Notice that the result for this does return five characters, as you’d expect.  So this plainly works as a 1 based result in both Excel and Power Query.

FIND

And finally we come to the FIND function.  This one is again a bit confusing.  We’ve got 3 things to consider here:

  1. The function name changes from FIND to Text.PositionOf
  2. The parameters for the text we want to find and the text we want to search in get flipped around!
  3. The result is 0 based, not 1 based
Syntax Example Result
Excel =FIND(find_text,within_text) =FIND(“xc”,“Excel”) 2
Power Query =Text.PositionOf(text, find_text) =Text.PositionOf(“Excel”,”xc”) 1

So in the case of FIND, we put the “xc” first, and “Excel” second.  But in the Power Query version, it’s completely opposite.  And look at that result… in Excel the x is treated as the 2nd character.  In Power Query it is too, but because it starts counting at 0 we get a 1.

Some thoughts

I find that even after using Power Query for a while now, I still have to look up the formula names from the Power Query formula categories page, both to find the formula name, and also the syntax.  Part of this is due to the fact that there is no auto-complete/syntax help in the Power Query engine (I’m sure it will come eventually), but part is also that my instinct is to type the Excel function name first.  And then, when I do get it right I’m constantly getting tripped up by the base 0 base 1 conversion.

As this is a tool aimed at Excel users, I am a bit disappointed in the formula naming convention.  I could get used to pre-pending Text. to all of my functions, but I really wish the rest was similar to Excel.  Maybe one day the PQ team can give us duplicate handles into the same back end function so that we can write stuff like this IN ADDITION to what already exists (don’t deprecate, just give us alternate routes):

  • =Text.Right(text,characters)

Or how about:

  • =Text.Mid(text,start,characters)

That would be good too, especially if the signature could be tweaked to require a one based parameter for the start character.  That would be consistent with what the Excel pro would expect due to their experience with the MID function.

And how about this:

  • =Text.Find(find_text,within_text)

How much easier would that be to learn if the function not only accepted parameters in the same order as the Excel function, but returned a 1 based result rather than a zero based result (again, similar to the way the current FIND function returns.)

Despite what I’m suggesting here, it does need to be recognized that there is a way to do the job, which is critical.  I just think it could be better designed in order to make it easier for the seasoned Excel pro to learn because they’d be able to port their existing knowledge without having to learn a totally new syntax.

Power Query “Trusted” Locations

My last two blog posts detailed some frustration that I ran into when working with Power Query.  First the issue that I couldn’t trust folders, but got stuck trusting files, and then the issue where I overloaded my credentials area.  I’ve actually got a solution, in a way, for both.

One thing to be aware of here… there are actually two distinct entities that show up in the Data Source Settings dialog: Security Credentials and Data Privacy Settings.  You can see this in the following picture:

image

Notice how Exchange and the Database don’t show a privacy level on the right?  Those are security credentials.  Interestingly here, you can see two entries for the same thing (like a website), where one is the security credential, the other the privacy setting.

Ok, with that out of the way, let’s get on to the solutions…

Clearing out Security Credentials

In my last post I had over 700 security credential entries for ip-api.com that needed to be cleared.  To clear them manually took clicking the item, clicking delete and clicking ok before moving on to the next item.  Even if I could do one click per second (I find this UI slow and sometimes I missed the target), it would still take me 35 minutes.  I really wanted a “remove all” feature.  Sure I’ll lose Exchange, sure I’ll lose my database, but so what.  Small pain vs spending 30-60 minutes clicking Delete.

So I coded something to do it.  The download link is at the end of the post, but one of the features of the tool is to let you clear out your entire list of security credentials with on click:

image

And voila!

image

Now, are you going to use this every day?  Hell no!  But if you screw up like I did, you just may need a weapon like this in your arsenal.  🙂

“Trusted” Folders/Files and URLs

As I mentioned in my last post, Ehren, a developer on the Power Query team messaged me on Twitter to tell me of one solution to the “Trusted Folder” problem.

If you set the privacy level for the folder, it applies to everything within that folder, including files in subfolders

Now that sounded cool.  So I set out to test it out.  First thing I did was went to add a new Folder to my Data Source Settings dialog and… there’s no option to do that. Hmm… okay… so how?

Well, you could go and build a solution that references a folder, just so that you can get a folder in there, I suppose.  But that’s awkward and contrived really.  You have to waste a bunch of time concocting something you hope will work.  And the worst part to me is that I don’t just have to do it on my system, I somehow have to deploy it to other people as well.  Ugh.

Screw it, I’m building a tool.  Here’s the features of what we’ve got:

Clear out all privacy levels

To be fair, this was probably more useful for me while testing, but I included it in case you run into the same issues.  Basically you click this button:

image

And now all the Data Privacy Settings are gone too:

image

Add Folder

So this one is way more useful to solving my issues.  I’m going to click “Add Folder” on the left.  It will let me browse for a folder:

image

And prompt me for the Privacy Level:

image

And once I click OK, it sets it up as a folder in the Data Source Privacy settings.  And yes, it really does trust all files in all subfolders:

image

Add Files

Same thing really, it just allows you to select a specific file.  This is a bit redundant to just declaring them as you go along but hey, once I already wrote the code for the folder this was a snap to add:

image

Add URLs

I love this one.  Easy to use, just provide the URL and the privacy level:

image

And boom!  There you are:

image

And again, this trusts all sub-sites of the main domain.  This was particularly key for me as I’m querying data from this site and have to provide a different URL for each year.

Now, again, on this one, the first time you query the site you’ll get another entry for each URL you touch:

image

This is because you have declared the privacy level, but not the security (unfortunately I can’t get in to modify the security files, for obvious reasons.)  So here’s what I’m going to do to fix this:

  • Select the first URL in the list
  • Click Edit Credential
  • Change the setting to apply to the root domain:

image

  • Click Save

You’ll notice that the first one disappears now.  What actually happened was that it merged the security credentials with the privacy level in one entry.  Cool.  So now I’ll just go back and delete the other two. And it never asks me again on refresh.  🙂

The “Big Red Button”

So when it all goes really wrong, and you want to reset Power Query to a default state, what then?  That’s what this button is for.  It will wipe out all of your security credentials, data privacy settings, reset your Power Query formula bar and more.  It’s like a total factory reset of Power Query.

image

Some More Technical Stuff

I’ve now had this installed on Office 2010, Office 2013 Professional Plus and Office 365 Pro Plus versions of Excel, both 32 and 64 bit.  In addition, it’s been installed on Windows 7 Pro and Windows 8.1 systems as well.

Interesting enough, despite being an .MSI installer file, in my tests it has NOT required admin privileges to install.  (On one machine I installed this first without prompting, then got prompted for admin credentials when I went to update Power Query.)

Some Final Observations

I found this project pretty interesting, and it’s given me some ideas for some more useful tools to work with Power Query and other Power BI add-ins.  While I can’t promise a timeline on delivery, I do plan on adding a bunch of new useful stuff to this add-in and releasing a Pro Tools version at some point.

Before you download and install this, I also want to make something very clear.  THIS IS BETA SOFTWARE.  I’m pretty sure it’s stable, and shouldn’t affect anything else, BUT YOU DOWNLOAD AND INSTALL AT YOUR OWN RISK.

The Installer

You can download the installer from this link.