Power Query Challenge 6

Maybe you missed the point of the thread? It's not a request for help, but a PowerQuery challenge. The idea is to solve the problem using PowerQuery and it's just for fun
I apologize for my English. But my posted VBA code only cooks source dataset for testing Power Query solutions. And it creates a bit more rows than Ken's row dataset.
If anybody not will able to use the dataset which I tested before publish my result then will him trust my conclusion and time results?
Sorry, if I created a problem for you have any problems by publishing my VBA code.
Regards,
 
Hi All, hi Ken!
Nice challenge!

Here is my attempt #1:
Code:
// challenge6_codelet
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"InvoiceID", Int64.Type}, {"ItemID", type text}, {"Quantity", type text}, {"Price", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "tolist", each Table.FromColumns(List.Transform({[ItemID]}&{[Quantity]}&{[Price]},each Text.Split(_, "#(lf)")), {"ItemID","Quantity","Price"})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"ItemID", "Quantity", "Price"}),
    #"Expanded {0}" = Table.ExpandTableColumn(#"Removed Columns", "tolist", {"ItemID", "Quantity", "Price"}, {"ItemID", "Quantity", "Price"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded {0}",{{"ItemID", type number}, {"Quantity", type number}, {"Price", type number}})
in
    #"Changed Type1"
and here is my UI-only attempt - tried it just for fun:
Code:
// challenge6_UI
let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"InvoiceID", Int64.Type}, {"ItemID", type text}, {"Quantity", type text}, {"Price", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","#(lf)",",",Replacer.ReplaceText,{"ItemID", "Quantity", "Price"}),
    #"Inserted Merged Column" = Table.AddColumn(#"Replaced Value", "1", each Text.Combine({[ItemID], [Quantity], [Price]}, "],["), type text),
    #"Added Prefix" = Table.TransformColumns(#"Inserted Merged Column", {{"1", each "[[" & _, type text}}),
    #"Added Suffix" = Table.TransformColumns(#"Added Prefix", {{"1", each _ & "]]", type text}}),
    #"Parsed JSON" = Table.TransformColumns(#"Added Suffix",{{"1", Json.Document}}),
    #"Added Custom" = Table.AddColumn(#"Parsed JSON", "2", each Table.FromColumns([1], {"ItemID", "Quantity", "Price"})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"ItemID", "Quantity", "Price", "1"}),
    #"Expanded {0}1" = Table.ExpandTableColumn(#"Removed Columns", "2", {"ItemID", "Price", "Quantity"}, {"ItemID", "Price", "Quantity"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded {0}1",{{"ItemID", type number}, {"Quantity", type number}, {"Price", type number}})
in
    #"Changed Type1"
Have fun!
 
Hi AndVGri,

What do you mean by "Do not use mouse hardcode in M"?

Also, you could have used the Table.Repeat, to replicate Ken's data to generate more lines.

Regards,
Mike

 

Also, you could have used the Table.Repeat, to replicate Ken's data to generate more lines.
Regards,
Mike
Yes, I did that. But with such dataset Ken's GroupAndSplit works with errors. It does not love repeating rows.
Regards,
 
Thank you for posting. Maybe you missed the point of the thread? It's not a request for help, but a PowerQuery challenge. The idea is to solve the problem using PowerQuery and it's just for fun. :)

Well ultimately, we want to solve the challenge, but building scalable solutions is definitely a concern. If I build something that works, but takes 10 minutes to refresh on a 100,000 row dataset, it will become unusable. That's why I like doing these... it challenges me too, and not just getting the initial solution. ;)
 
Ken - I presume you read my later post where I said sorry for misunderstanding???

So which is the best solution here?
 
Ken - I presume you read my later post where I said sorry for misunderstanding???
Doh! Nope. Missed that there was another page of replies. You'd think I was a forum newbie! Sorry Ali!

So which is the best solution here?
Honestly, I haven't had time to look at all of them. I want to take a deeper run through, but the List.Zip solutions definitely perform faster that the split and re-merge. I need time to work out more than that. (Unfortunately something that is in pretty short supply around here!)
 
Ken, very nice to see the outcome.
I would appreciate if you can please enlighten on ("#(lf)", QuoteStyle.Csv)
why are we using this type of delimiter when trying to split the data
thanks
 
#lf is the token that represents a line feed, so Ken is using that as the character to find in the text to split the individual cell contents to rows.
 
Challenge 6

//Create SplitColumn function:

(Source, ColumnName) =>
let
#"Removed Other Columns" = Table.SelectColumns(Source,{ColumnName}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes(#"Removed Other Columns", {{ColumnName, type text}}, "en-AU"), {{ColumnName, Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), ColumnName),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1)
in
#"Added Index"

//which is used in M below:

let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
QuantCol = SplitColumn(Source, "Quantity"),
PriceCol = SplitColumn(Source, "Price"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes(Source, {{"ItemID", type text}}, "en-AU"), {{"ItemID", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "ItemID"),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1),
#"Removed Other Columns" = Table.SelectColumns(#"Added Index",{"InvoiceID", "ItemID", "Index"}),
#"Merged Queries" = Table.NestedJoin(#"Removed Other Columns",{"Index"},QuantCol,{"Index"},"QuantCol",JoinKind.LeftOuter),
#"Expanded QuantColForJoinTest" = Table.ExpandTableColumn(#"Merged Queries", "QuantCol", {"Quantity"}, {"Quantity"}),
#"Merged Queries2" = Table.NestedJoin(#"Expanded QuantColForJoinTest",{"Index"},PriceCol,{"Index"},"PriceCol",JoinKind.LeftOuter),
#"Expanded QuantColForJoinTest2" = Table.ExpandTableColumn(#"Merged Queries2", "PriceCol", {"Price"}, {"Price"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded QuantColForJoinTest2",{"Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"InvoiceID", type text}, {"ItemID", type text}, {"Quantity", type number}, {"Price", type number}})
in
#"Changed Type1"
 
Stealing the first 3 lines from sbghide in msg#10 (I'd already got there (with a lot more fuss though)), but with a different flourish:
Code:
let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
mChgTyp = Table.TransformColumnTypes(Source,{{"ItemID", type text}, {"Quantity", type text}, {"Price", type text}, {"InvoiceID", type text}}),
mSplitColDL = Table.TransformColumns(mChgTyp, {{"ItemID", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv)},{"Quantity", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv)},{"Price", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv)}}),
Expandeda = Table.ExpandTableColumn(Table.SelectColumns(Table.AddColumn(mSplitColDL, "a", each Table.FromColumns({[ItemID],[Quantity],[Price]},{"ItemID","Quantity","Price"})),{"InvoiceID", "a"}), "a", {"ItemID", "Quantity", "Price"}, {"ItemID", "Quantity", "Price"})
in
    Expandeda
 
Back
Top