5 Very Useful Text Formulas – Power Query Edition

Years ago I published an article on my site called Five Very Useful Functions For Working With Text.  This post is alternative version, but with a twist.  I bring you 5 Very Useful Text Formulas - Power Query Edition.

The article is geared to explaining five functions specific to working with Text in Excel, and are a set of the most under-utilized functions in Excel (in my opinion).  It even lets you try them out live in the web page without even having to open Excel at all.

Now, over the past few months I’ve been working with Power Query, and one of the things that’s been driving me a little crazy is that the formula names in Power Query and not the same as they are in Excel.  I personally think this is a bit of a mistake, and that the formula names in Power Query should have been a bit more congruent with standard Excel formulas (Power Pivot’s DAX functions are similarly named to Excel, so why not Power Query’s M language?)

Some critical background

Before you bang your head against the wall, there are two things that are really different between Excel formulas and Power Query formulas:

Case sensitivity:

While Excel formulas are not case sensitive, Power Query formulas are.  If the Power Query formula signature says “Text.Start” then “TEXT.START” or “text.start” will NOT work for you.

Base 1 vs Base 0

Excel formulas are what we refer to as “Base 1”.  This means that you count starting at 1.  Power Query, on the other hand starts counting at 0, not 1.  The implications of this are that it is very easy to write your formula referring to a number that is out by 1.  To see the effects of this, check the section below on the alternate for the MID function equivalent.

Excel –> Power Query Formula Equivalents

Let’s take a look at how the five functions I illustrated in that original example differ from Excel to Power Query…

LEFT

To get the left x characters, we basically replace LEFT with Text.Start:

 Syntax Example Result Excel =LEFT(text,num_chars) =LEFT(“Excel”,2) Ex Power Query =Text.Start(text,num_chars) =Text.Start(“Excel”,2) Ex

Easy enough once you recognize it, although I would have preferred that the formula name was consistent.

RIGHT

To get the right x characters we have a similar situation.  The function name needs to change from RIGHT to Text.End:

 Syntax Example Result Excel =RIGHT(text,num_chars) =RIGHT(“Excel”,2) el Power Query =Text.End(text, num_chars) =Text.End(“Excel”,2) el

Okay, so we’re getting the hang of this now…  Just change the function name and the rest work the same, right?  Um, no.

MID

This one gets a bit weird.  First we replace MID with Text.Range.  Okay, no problem there.  But look at the results when we pass the same parameters:

 Syntax Example Result Excel =MID(text,start,num_chars) =LEFT(“Excel”,2,2) xc Power Query =Text.Range(text, start,num_chars) =Text.Range(“Excel”,2,2) ce

They differ a little, don’t they?  The issue comes down to that base 0 vs base 1 thing I mentioned above.  Where Excel’s formula language counts the word with E being character 1, Power Query considers that character 0.  So in this case, when we tell Power Query to start returning text at character 2, it pulls back c (E is 0, x is 1, c is 2).  Interestingly though, the last parameter needs to be 2 to pull back 2 characters.

LEN

Getting the length of a text string in Power Query is actually a bit more intuitive than Excel’s native function, only because the function name isn’t trimmed off.  Text.Length is what we need instead of LEN.

 Syntax Example Result Excel =LEN(text) =LEN(“Excel”) 5 Power Query =Text.Length(text) =Text.Length(“Excel”) 5

Notice that the result for this does return five characters, as you’d expect.  So this plainly works as a 1 based result in both Excel and Power Query.

FIND

And finally we come to the FIND function.  This one is again a bit confusing.  We’ve got 3 things to consider here:

1. The function name changes from FIND to Text.PositionOf
2. The parameters for the text we want to find and the text we want to search in get flipped around!
3. The result is 0 based, not 1 based
 Syntax Example Result Excel =FIND(find_text,within_text) =FIND(“xc”,“Excel”) 2 Power Query =Text.PositionOf(text, find_text) =Text.PositionOf(“Excel”,”xc”) 1

So in the case of FIND, we put the “xc” first, and “Excel” second.  But in the Power Query version, it’s completely opposite.  And look at that result… in Excel the x is treated as the 2nd character.  In Power Query it is too, but because it starts counting at 0 we get a 1.

Some thoughts

I find that even after using Power Query for a while now, I still have to look up the formula names from the Power Query formula categories page, both to find the formula name, and also the syntax.  Part of this is due to the fact that there is no auto-complete/syntax help in the Power Query engine (I’m sure it will come eventually), but part is also that my instinct is to type the Excel function name first.  And then, when I do get it right I’m constantly getting tripped up by the base 0 base 1 conversion.

As this is a tool aimed at Excel users, I am a bit disappointed in the formula naming convention.  I could get used to pre-pending Text. to all of my functions, but I really wish the rest was similar to Excel.  Maybe one day the PQ team can give us duplicate handles into the same back end function so that we can write stuff like this IN ADDITION to what already exists (don’t deprecate, just give us alternate routes):

• =Text.Right(text,characters)

• =Text.Mid(text,start,characters)

That would be good too, especially if the signature could be tweaked to require a one based parameter for the start character.  That would be consistent with what the Excel pro would expect due to their experience with the MID function.

• =Text.Find(find_text,within_text)

How much easier would that be to learn if the function not only accepted parameters in the same order as the Excel function, but returned a 1 based result rather than a zero based result (again, similar to the way the current FIND function returns.)

Despite what I’m suggesting here, it does need to be recognized that there is a way to do the job, which is critical.  I just think it could be better designed in order to make it easier for the seasoned Excel pro to learn because they’d be able to port their existing knowledge without having to learn a totally new syntax.

This entry was posted in Excel, Office 2013, Power Query by Ken Puls. Bookmark the permalink.

I'm the blog owner... My main website is located at www.excelguru.ca, and is devoted to holding tips and tricks on excel development.

21 thoughts on “5 Very Useful Text Formulas – Power Query Edition”

1. Thanks for the update on commands, very useful!

Having issues adding a new column to a query, to return domain address details for email addresses - usually performed using:

OLD:
=MID(CELL,FIND("@",CELL),40) query

using the new format as below returns error

NEW

* I use 40 as that is typically sufficient to get to end fo domain address (i.e. the @domain.com address)

AM i missing something?

2. Hi Greg,

Unfortunately, the Power Query Text.Range function can't be fed a final parameter that is larger than the number of characters, unlike MID. Give this a go, as it determines the number of characters for the final param:

3. Hi Ken -

Solid post. Thanks for that! I do have a question though. Do you know if it's possible to use the FIND formula to generate a location within a text to split into two columns?

Example -
I have the following information:

Chatsworth, CAFindlay College Prep
Chino Hills, CAChino Hills High School
Katy, TXCypress Lakes High School

I want to split at after the state abbreviation so that I have two columns, state and school. To find the number of characters to move (from the left) to get to that split point, I added a Custom column with following formula: Text.PositionOf([HOMETOWN],",")+4

... and it seems to work fine but now I am stuck. Do you know if I can use that information to split columns at my desired point?

Is there another way to approach this issue?

4. Hi Fred,

Why not just use the "Split Column --> by Delimiter" and feed it a customer delimiter of ", "? Then you could split it again using "Split Column --> By Number of Characters" to get the first two. (Just remember to split by the "left most" in both cases.

5. Ken -

I think that would work for almost all of the instances but how would you suggest I account for the following International examples:

Vaughan, Ont., CANHuntington Prep

Both of these require more characters with your "Split Column --> By Number of Characters" option than the rows with US locations and the CAN option would require a step to account for the extra comma.

My goal is to find a solution that would work for all available scenarios in my data set.

Thanks again

6. Hi Ken...and Fred 🙂
Problem is not as simple as we think... but try this code. I assume that there are no more than 10 words in row. If more, you can change the proper lines (2) of code.
let
Source = Excel.CurrentWorkbook(){[Name="tblSource"]}[Content],
SplitColBySpace = Table.SplitColumn(CustomCol,"Custom",Splitter.SplitTextByDelimiter(" "),{"1", "2", "3", "4", "5", "6", "7", "8", "9", "10"}),
ChType = Table.TransformColumnTypes(SplitColBySpace,{{"1", type text}, {"2", type text}, {"3", type text}, {"4", type text}, {"5", type text}, {"6", type text}, {"7", type text}, {"8", type text}, {"9", type text}, {"10", type text}}),
RemCol = Table.RemoveColumns(ChType,{"SourceData"}),
UnPivotOthCol = Table.UnpivotOtherColumns(RemCol, {"Indeks"}, "Atrybut", "Warto??"),
DuplCol = Table.DuplicateColumn(UnPivotOthCol, "Warto??", "Copy of Warto??"),
ToUpper = Table.TransformColumns(DuplCol,{{"Copy of Warto??", Text.Upper}}),
LIntersect = Table.AddColumn(ToUpper, "Custom", each List.Intersect({Text.ToList([Warto??]),Text.ToList([Copy of Warto??])})),
LCount = Table.AddColumn(LIntersect, "Custom.1", each List.Count([Custom])),
ColCountry = Table.AddColumn(LCount, "Custom.2", each if [Custom.1]=1 then null else Text.Start([Warto??],[Custom.1]-1)),
RemOtherCol = Table.SelectColumns(ColCountry,{"Custom.2", "Indeks"}),
Country = Table.SelectRows(RemOtherCol, each ([Custom.2] null)),
ExpNew = Table.ExpandTableColumn(MerTab, "New", {"Custom.2"}, {"Country"}),
RemColumns = Table.SelectColumns(SplitCol,{"Country", "City", "School"})
in
RemColumns

Regards 🙂

7. Fantastic, thanks for posting this Bill!

8. Thank you very much. Very helpful!

I just happend to see a copy-paste error in the RIGHT examples above:

Power Query =Text.End(text, num_chars) = Text.***Start***(“Excel”,2) el
obviously it should be
Power Query =Text.End(text, num_chars) =Text.***End***(“Excel”,2) el

Thanks again!

9. Thanks Anders. All fixed. 🙂

10. I'm stuck with splitting one column into two. The column contains values such as "2.8kg", "2L", "600 g", "10.35 g". I need to extract the numbers incl decimals into one column and the letter (or text) into another. I can do it in Excel through a macro and a nested formula but would like to do this in Power Query.

11. @Shyla, I'm looking for something simlar. I have cells containing a lot of text. I need to extract a string that always starts with PRB and the following 6 digits. If that string is not found in the cells a null value applies.

12. Hi Wubbe,

Are you saying that you want a specific 6 digits, or any 6 digits? Maybe have a look at the article I linked to for Syla's solution. (Just posted the link as I missed cross referencing it before.)

13. I'm stuck with trimming last two characters from a column. Tried using Text.TrimEnd function but it gives errors. Please assist.

14. To use Text.TrimEnd, you need to provide the list of characters you want to trim. That could get rather complicated. My suggestion would be to pull the left x characters where x is the length of the text string minus 2:

=Text.Start([Column1],Text.Length([Column1])-2)

You'll need to update Column1 to your value, of course.