Row Count of Merged Table

Nick Burns

Member
Joined
May 24, 2017
Messages
162
Reaction score
0
Points
16
Excel Version(s)
Office 365
I have a self-join table where I'm trying to find the number of Child Jobs to a Parent Job. I guess you can call this a "COUNTIF()".

I've done two approaches in my attached file:

1) Merge the table, expand, Group By all the original columns to get the count of rows
Pros: Fast - as in mere seconds
Cons: Although my sample only shows 5 columns, my actual data has about 25 columns and may expand in the future making modifications to the Group By step necessary

2) Merge the table, add a column that does a Table.RowCount() on the column and removes the merge data
Pros: alleviates the con in scenario 1
Cons: Slow - with my data this takes about 8 - 9 minutes.

Note: my datasource is a CSV file from our ERP system. The attached has sample data as a Table source.

Am I missing something obvious for this or is there a better approach to this?

Thanks,
Nick
 

Attachments

  • ParentChildCount.xlsx
    106.1 KB · Views: 19
Hi Nick,
I'm not sure if ithis is exactly what you want.
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Grouped Rows" = Table.Group(Source, {"Parent"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(Source, {"Parent"}, #"Grouped Rows", {"Parent"},  "Table1", JoinKind.LeftOuter),
    #"Expanded {0}" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Count"}, {"Count"})
in
    #"Expanded {0}"
 
Hi Nick,
I'm not sure if ithis is exactly what you want.
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Grouped Rows" = Table.Group(Source, {"Parent"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(Source, {"Parent"}, #"Grouped Rows", {"Parent"},  "Table1", JoinKind.LeftOuter),
    #"Expanded {0}" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Count"}, {"Count"})
in
    #"Expanded {0}"

I love how there are so many solutions to a problem. As this example shows, there are some more proficient than others that are just "clever". And just because something could be done in one line doesn't mean it's the best.

Your solution is slick, but ooh, I have to ding you some points ... it adds 4 seconds using my real-world data ... I'm kidding! I can live with 4 seconds :)

Thanks - this does get around the GroupBy issue. I did come up with another solution on my GroupBy method: Using a step to gather the columns (Table.ColumnNames) then passing this list to my GroupBy step.

Thank you Bill! :clap2:
 
At the moment I don't understand why the number of columns is relevant for the grouping, because the columns by which the grouping is done won't change. But maybe it will become clearer if you show the coder of your new solution.
 
At the moment I don't understand why the number of columns is relevant for the grouping, because the columns by which the grouping is done won't change. But maybe it will become clearer if you show the coder of your new solution.

The data is coming from our ERP system - the criteria for analysis could change, either by dropping or adding new data (columns). The GroupBy step would break or miss columns if they are hard coded. This was the solution I came up with to alleviate that:

Code:
let    
    [COLOR=#3E3E3E]Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],[/COLOR]
    ColumnList = Table.ColumnNames(Source),
    #"Merged Queries" = Table.NestedJoin(Source, {"Child"}, Source, {"Parent"}, "Source", JoinKind.LeftOuter),
    #"Expanded Source" = Table.ExpandTableColumn(#"Merged Queries", "Source", {"Parent"}, {"Source.Parent"}),
    #"Grouped Rows" = Table.Group(#"Expanded Source", ColumnList, {{"Count", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped Rows"

Contrast to the original:
Code:
let    
    [COLOR=#3E3E3E]Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],[/COLOR]
    #"Merged Queries" = Table.NestedJoin(Source, {"Child"}, Source, {"Parent"}, "Source", JoinKind.LeftOuter),
    #"Expanded Source" = Table.ExpandTableColumn(#"Merged Queries", "Source", {"Parent"}, {"Source.Parent"}),
    #"Grouped Rows" = Table.Group(#"Expanded Source", {"Child", "EMAC", "BILLED", "UNBILLRET", "Parent"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped Rows"

Note: I did go with Bill's solution in the end.
 
Hi Nick,

sometimes it takes a little longer, but now after testing the last code myself, I now understand the point of your question and thus the clever solution.
 
Back
Top