Power Query Challenge 6

AndVgri · Aug 2, 2019

Maybe you missed the point of the thread? It's not a request for help, but a PowerQuery challenge. The idea is to solve the problem using PowerQuery and it's just for fun

I apologize for my English. But my posted VBA code only cooks source dataset for testing Power Query solutions. And it creates a bit more rows than Ken's row dataset.
If anybody not will able to use the dataset which I tested before publish my result then will him trust my conclusion and time results?
Sorry, if I created a problem for you have any problems by publishing my VBA code.
Regards,

AliGW · Aug 2, 2019

No, sorry - my fault. I missed the point of your post.

Maxim Zelensky · Aug 2, 2019

Hi All, hi Ken!
Nice challenge!

Here is my attempt #1:

Code:

// challenge6_codelet
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"InvoiceID", Int64.Type}, {"ItemID", type text}, {"Quantity", type text}, {"Price", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "tolist", each Table.FromColumns(List.Transform({[ItemID]}&{[Quantity]}&{[Price]},each Text.Split(_, "#(lf)")), {"ItemID","Quantity","Price"})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"ItemID", "Quantity", "Price"}),
    #"Expanded {0}" = Table.ExpandTableColumn(#"Removed Columns", "tolist", {"ItemID", "Quantity", "Price"}, {"ItemID", "Quantity", "Price"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded {0}",{{"ItemID", type number}, {"Quantity", type number}, {"Price", type number}})
in
    #"Changed Type1"

and here is my UI-only attempt - tried it just for fun:

Code:

// challenge6_UI
let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"InvoiceID", Int64.Type}, {"ItemID", type text}, {"Quantity", type text}, {"Price", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","#(lf)",",",Replacer.ReplaceText,{"ItemID", "Quantity", "Price"}),
    #"Inserted Merged Column" = Table.AddColumn(#"Replaced Value", "1", each Text.Combine({[ItemID], [Quantity], [Price]}, "],["), type text),
    #"Added Prefix" = Table.TransformColumns(#"Inserted Merged Column", {{"1", each "[[" & _, type text}}),
    #"Added Suffix" = Table.TransformColumns(#"Added Prefix", {{"1", each _ & "]]", type text}}),
    #"Parsed JSON" = Table.TransformColumns(#"Added Suffix",{{"1", Json.Document}}),
    #"Added Custom" = Table.AddColumn(#"Parsed JSON", "2", each Table.FromColumns([1], {"ItemID", "Quantity", "Price"})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"ItemID", "Quantity", "Price", "1"}),
    #"Expanded {0}1" = Table.ExpandTableColumn(#"Removed Columns", "2", {"ItemID", "Price", "Quantity"}, {"ItemID", "Price", "Quantity"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded {0}1",{{"ItemID", type number}, {"Quantity", type number}, {"Price", type number}})
in
    #"Changed Type1"

Have fun!

cyborgski · Aug 2, 2019

Hi AndVGri,

What do you mean by "Do not use mouse hardcode in M"?

Also, you could have used the Table.Repeat, to replicate Ken's data to generate more lines.

Regards,
Mike

AndVgri · Aug 2, 2019

cyborgski said:
Also, you could have used the Table.Repeat, to replicate Ken's data to generate more lines.
Regards,
Mike

Yes, I did that. But with such dataset Ken's GroupAndSplit works with errors. It does not love repeating rows.
Regards,

AndVgri · Aug 2, 2019

It is Ken's solutions with twice repeat rows in source dataset.

GroupAndSplit has errors. https://drive.google.com/file/d/1YElvjVl-XpXGa1Nq1IEqNI8EBhOXmLQy/view?usp=sharing
Regards,

Ken Puls · Aug 2, 2019

AliGW said:
Thank you for posting. Maybe you missed the point of the thread? It's not a request for help, but a PowerQuery challenge. The idea is to solve the problem using PowerQuery and it's just for fun.

Well ultimately, we want to solve the challenge, but building scalable solutions is definitely a concern. If I build something that works, but takes 10 minutes to refresh on a 100,000 row dataset, it will become unusable. That's why I like doing these... it challenges me too, and not just getting the initial solution.

AliGW · Aug 2, 2019

Ken - I presume you read my later post where I said sorry for misunderstanding???

So which is the best solution here?

Ken Puls · Aug 2, 2019

AliGW said:
Ken - I presume you read my later post where I said sorry for misunderstanding???

Doh! Nope. Missed that there was another page of replies. You'd think I was a forum newbie! Sorry Ali!

AliGW said:
So which is the best solution here?

Honestly, I haven't had time to look at all of them. I want to take a deeper run through, but the List.Zip solutions definitely perform faster that the split and re-merge. I need time to work out more than that. (Unfortunately something that is in pretty short supply around here!)

albertan · Aug 12, 2019

Ken, very nice to see the outcome.
I would appreciate if you can please enlighten on ("#(lf)", QuoteStyle.Csv)
why are we using this type of delimiter when trying to split the data
thanks

AliGW · Aug 13, 2019

#lf is the token that represents a line feed, so Ken is using that as the character to find in the text to split the individual cell contents to rows.

PaulT · Sep 5, 2019

Challenge 6

//Create SplitColumn function:

(Source, ColumnName) =>
let
#"Removed Other Columns" = Table.SelectColumns(Source,{ColumnName}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes(#"Removed Other Columns", {{ColumnName, type text}}, "en-AU"), {{ColumnName, Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), ColumnName),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1)
in
#"Added Index"

//which is used in M below:

let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
QuantCol = SplitColumn(Source, "Quantity"),
PriceCol = SplitColumn(Source, "Price"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes(Source, {{"ItemID", type text}}, "en-AU"), {{"ItemID", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "ItemID"),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1),
#"Removed Other Columns" = Table.SelectColumns(#"Added Index",{"InvoiceID", "ItemID", "Index"}),
#"Merged Queries" = Table.NestedJoin(#"Removed Other Columns",{"Index"},QuantCol,{"Index"},"QuantCol",JoinKind.LeftOuter),
#"Expanded QuantColForJoinTest" = Table.ExpandTableColumn(#"Merged Queries", "QuantCol", {"Quantity"}, {"Quantity"}),
#"Merged Queries2" = Table.NestedJoin(#"Expanded QuantColForJoinTest",{"Index"},PriceCol,{"Index"},"PriceCol",JoinKind.LeftOuter),
#"Expanded QuantColForJoinTest2" = Table.ExpandTableColumn(#"Merged Queries2", "PriceCol", {"Price"}, {"Price"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded QuantColForJoinTest2",{"Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"InvoiceID", type text}, {"ItemID", type text}, {"Quantity", type number}, {"Price", type number}})
in
#"Changed Type1"

p45cal · Apr 23, 2020

Stealing the first 3 lines from sbghide in msg#10 (I'd already got there (with a lot more fuss though)), but with a different flourish:

Code:

let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
mChgTyp = Table.TransformColumnTypes(Source,{{"ItemID", type text}, {"Quantity", type text}, {"Price", type text}, {"InvoiceID", type text}}),
mSplitColDL = Table.TransformColumns(mChgTyp, {{"ItemID", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv)},{"Quantity", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv)},{"Price", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv)}}),
Expandeda = Table.ExpandTableColumn(Table.SelectColumns(Table.AddColumn(mSplitColDL, "a", each Table.FromColumns({[ItemID],[Quantity],[Price]},{"ItemID","Quantity","Price"})),{"InvoiceID", "a"}), "a", {"ItemID", "Quantity", "Price"}, {"ItemID", "Quantity", "Price"})
in
    Expandeda

Power Query Challenge 6

AndVgri

New member

AliGW

Administrator

Maxim Zelensky

New member

cyborgski

New member

AndVgri

New member

AndVgri

New member

Ken Puls

Administrator

AliGW

Administrator

Ken Puls

Administrator

albertan

Member

AliGW

Administrator

PaulT

New member

p45cal

Super Moderator