Power Queries: Can I Get a Count for One Name, Across Three Columns?

jdanniel

Member
Joined
Jul 16, 2018
Messages
84
Reaction score
0
Points
6
Excel Version(s)
MS365
I'd like to ask a question about Power Queries, if I may.

My movie collection spreadsheet has three columns for actors.

Here's what I'd like to do, if it is possible:

I'd like to do a Power Query which gives me a total count for each individual actor, regardless of which column their name is in.

For example, I'd like to know how many movies Paul Newman is in, regardless of whether his name is in the Actor 1 column, the Actor 2 column, or the Actor 3 column.
Is this possible with Power Queries? Or do I have to use another Excel feature to do this?

If this can be done, I'd also like to do it for the Genre and Subgenre columns.
Thank you! Jd
 

Attachments

  • Movie Sheet 2019.xlsm
    59.3 KB · Views: 21
I suspect there are more elegant ways to do this...but here's my first pass at it.

I created a named cell (ActorName) to contain the actor to search for.
Then I created this query:
Query Name: CountActorMovies
M-Code:
Code:
let
    ActorName       = Excel.CurrentWorkbook(){[Name="ActorName"]}[Content]{0}[Column1],
    Source          = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    GetMatches      = Table.AddColumn(
        Table.ReplaceValue(Source,null,"|",Replacer.ReplaceValue,{"Actor 1", "Actor 2", "Actor 3"}), 
        "Custom", 
        each Text.Contains(Text.Upper([Actor 1]&"|"&[Actor 2]&"|"&[Actor 3]),Text.Upper(ActorName))),
    KeepMatchedRows = Table.SelectRows(GetMatches, each ([Custom] = true)),
    AddNameCol      = Table.AddColumn(KeepMatchedRows, "Actor", each ActorName),
    Results         = Table.Group(AddNameCol, {"Actor"}, {{"Movie Count", each Table.RowCount(_), type number}})
in
    Results

Does that help?
 

Attachments

  • Movie Sheet 2019.xlsm
    63.5 KB · Views: 22
Code:
let
    ActorName=Excel.CurrentWorkbook(){[Name="ActorName"]}[Content]{0}[ActorName],
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Actor 1", "Actor 2", "Actor 3"}),
    FindText = Table.FindText(#"Removed Other Columns",ActorName),
    #"Calculated Count" = Table.RowCount(FindText)
in
    #"Calculated Count"
 

Attachments

  • Answer.xlsm
    58 KB · Views: 13
My proposal
Code:
let
    ActorName=Excel.CurrentWorkbook(){[Name="ActorName"]}[Content]{0}[Column1],
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Filtered Rows" = Table.SelectRows(Source, each List.Contains({[Actor 1], [Actor 2], [Actor 3]}, ActorName, Comparer.OrdinalIgnoreCase)),
    TheEnd = Table.FromRows({{Text.Proper(ActorName), Table.RowCount(#"Filtered Rows")}}, {"Actor Name", "Movie Count"})
in
    TheEnd
 
Thank you all for replying, and for contributing code. Being a novice, I'm not entirely sure I explained what I wanted to do very well. Either that, or I don't quite fully understand the code. Probably both.

Do the code samples above allow me to have one master actors list, with a count for every actor? That's really what I'm looking for. I do not know if the code samples do this. The code samples, as I see them, permit one actor. Am I correct or incorrect about this?

If you look at my spreadsheet, you'll see a worksheet in it named Director Count. It has a master list of every director in the spreadsheet, along with a count for each director. Granted, the list of directors is in one column, not three. But that's what I was hoping for, with the actors. I'd like one column, with a list of all of the actors in the spreadsheet's three Actors columns. Next to each actors' name should be the count.

I apologize if I do not understand the code well enough to know if it does this or not. Is there a way to combine the three columns into one, for the specific purpose of having one master Actors list and count? I hope I explained myself better this time, and if I am not getting the gist of this code, I apologize. Thank you.
 
Something like this:
Code:
let    
    Source          = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Actor 1", "Actor 2", "Actor 3"}),
    #"Replaced Value" = Table.ReplaceValue(#"Removed Other Columns",null,"|",Replacer.ReplaceValue,{"Actor 1", "Actor 2", "Actor 3"}),
    #"Merged Columns" = Table.CombineColumns(#"Replaced Value",{"Actor 1", "Actor 2", "Actor 3"},Combiner.CombineTextByDelimiter("|", QuoteStyle.None),"Merged"),
    #"Filtered Rows" = Table.SelectRows(#"Merged Columns", each [Merged] <> "|||||"),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows", {{"Merged", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged", type text}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([Merged] <> "")),
    AllActors = Table.Sort(#"Filtered Rows1",{{"Merged", Order.Ascending}}),
    DistinctActors = Table.Distinct(AllActors),
    #"Merged Queries" = Table.NestedJoin(DistinctActors,{"Merged"},AllActors,{"Merged"},"DistinctActors",JoinKind.LeftOuter),
    #"Aggregated DistinctActors" = Table.AggregateTableColumn(#"Merged Queries", "DistinctActors", {{"Merged", List.Count, "Count of Merged"}}),
    #"Sorted Rows" = Table.Sort(#"Aggregated DistinctActors",{{"Merged", Order.Ascending}})
in
    #"Sorted Rows"

This one counts all the "Blank Actors"
Code:
let    
    Source          = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Actor 1", "Actor 2", "Actor 3"}),
    #"Replaced Value" = Table.ReplaceValue(#"Removed Other Columns",null,"|",Replacer.ReplaceValue,{"Actor 1", "Actor 2", "Actor 3"}),
    #"Merged Columns" = Table.CombineColumns(#"Replaced Value",{"Actor 1", "Actor 2", "Actor 3"},Combiner.CombineTextByDelimiter("|", QuoteStyle.None),"Merged"),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Merged Columns", {{"Merged", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged", type text}}),
    AllActors = Table.Sort(#"Changed Type",{{"Merged", Order.Ascending}}),
    DistinctActors = Table.Distinct(AllActors),
    #"Merged Queries" = Table.NestedJoin(DistinctActors,{"Merged"},AllActors,{"Merged"},"DistinctActors",JoinKind.LeftOuter),
    #"Aggregated DistinctActors" = Table.AggregateTableColumn(#"Merged Queries", "DistinctActors", {{"Merged", List.Count, "Count of Merged"}}),
    #"Sorted Rows" = Table.Sort(#"Aggregated DistinctActors",{{"Merged", Order.Ascending}})
in
    #"Sorted Rows"
 
Last edited:
YES! That is exactly what I'm looking for. Thank you so very much. Now, to try and learn some M language! See you in a few years! (kidding) Seriously, thank you. Because of this thread, I discovered what M language is. Been reading a little bit about it this afternoon. Thank you again!
 
Alternatively
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    OneColumnTbl = Table.AddColumn(Source, "Actor", each {[Actor 1],[Actor 2],[Actor 3]})[[Actor]],
    #"Expanded {0}" = Table.ExpandListColumn(OneColumnTbl, "Actor"),
    #"Grouped Rows" = Table.Group(#"Expanded {0}", {"Actor"}, {{"Movie Count", each Table.RowCount(_), type number}}),
    #"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Actor", Order.Ascending}})
in
    #"Sorted Rows"
 
Great!

One last alteration --
This one counts where the are NO actors listed:

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Actor 1", "Actor 2", "Actor 3"}),
    #"Replaced Value" = Table.ReplaceValue(#"Removed Other Columns",null,"|",Replacer.ReplaceValue,{"Actor 1", "Actor 2", "Actor 3"}),
    #"Merged Columns" = Table.CombineColumns(#"Replaced Value",{"Actor 1", "Actor 2", "Actor 3"},Combiner.CombineTextByDelimiter("|", QuoteStyle.None),"Merged"),
    #"Replaced Value1" = Table.ReplaceValue(#"Merged Columns","|||||","<none>",Replacer.ReplaceText,{"Merged"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value1", {{"Merged", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged", type text}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Merged] <> "")),
    AllActors = Table.Sort(#"Filtered Rows",{{"Merged", Order.Ascending}}),
    DistinctActors = Table.Distinct(AllActors),
    #"Merged Queries" = Table.NestedJoin(DistinctActors,{"Merged"},AllActors,{"Merged"},"DistinctActors",JoinKind.LeftOuter),
    #"Aggregated DistinctActors" = Table.AggregateTableColumn(#"Merged Queries", "DistinctActors", {{"Merged", List.Count, "Count of Merged"}}),
    #"Sorted Rows" = Table.Sort(#"Aggregated DistinctActors",{{"Merged", Order.Ascending}})
in
    #"Sorted Rows"

I wanted to add that this can all be done with the UI, however the #"Merged Queries" step needs to be modified to "Self Join" the AllActors and DistinctActors tables.
 
{face palm}
Had I done the GroupBy, my solution is purely UI driven!

Though I'm not sure why Bill's solution show 466 blanks, and mine shows 932 (twice as many)
 
#"Replaced Value" = Table.ReplaceValue(#"Removed Other Columns",null,"|",Replacer.ReplaceValue,{"Actor 1", "Actor 2", "Actor 3"}),
#"Merged Columns" = Table.CombineColumns(#"Replaced Value",{"Actor 1", "Actor 2", "Actor 3"},Combiner.CombineTextByDelimiter("|", QuoteStyle.None),"Merged"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Merged Columns", {{"Merged", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), let

932 because you are using the same sign ("|") in this 3 steps above. Change "|" in #"Replaced Value" to ";" (example only) and everything will be ok :)
 
Here's another approach using the UI. Select and unpivot the three actor columns, then group.


Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Title", type text}, {"Prefix", type any}, {"Director", type text}, {"Actor 1", type text}, {"Actor 2", type text}, {"Actor 3", type text}, {"Genre", type text}, {"Subgenre", type text}, {"Foreign", type text}, {"Silent", type any}, {"DVD #", Int64.Type}, {"Serial #", Int64.Type}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Actor 1", "Actor 2", "Actor 3"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Other Columns", {}, "Attribute", "Value"),
    #"Grouped Rows" = Table.Group(#"Unpivoted Columns", {"Value"}, {{"Count", each Table.RowCount(_), type number}}),
    #"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Value", Order.Ascending}})
in
    #"Sorted Rows"
 
Nick, I'm a little unsure about one thing. I'm not 100% sure I know what you mean by UI driven. I'm assuming you mean UI stands for User Interface. I interpret UI driven as meaning code is not necessary, and that this can be done with what's available in Excel as is. Am I correct? If I'm not, then I'm not sure what you mean. Thanks again for following up here. I appreciate it greatly.
 
Yes, you are correct. Once you load your table into the Power Query editor, you can point and click, and Excel will generate the M code. For example, selecting the three Actor columns, right-clicking and choosing Remove Other Columns will produce this line of code which you'll see in the formula bar and the advanced editor.

#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Actor 1", "Actor 2", "Actor 3"}),

It's kind of like the macro recorder in Excel. You can do a lot with the UI, but eventually you may want to modify the code and/or write it from scratch as many of the experts on this forum can do!


Nick, I'm a little unsure about one thing. I'm not 100% sure I know what you mean by UI driven. I'm assuming you mean UI stands for User Interface. I interpret UI driven as meaning code is not necessary, and that this can be done with what's available in Excel as is. Am I correct? If I'm not, then I'm not sure what you mean. Thanks again for following up here. I appreciate it greatly.
 
So....you're saying what I want to do can be done without hand-coding at all? If that's true, then I can do this without having any knowledge of mashup coding? I can do this strictly with the Power Query Editor's interface?

Okay...so if that's true, then my doing this is just a matter of me figuring out what things I have to select, what things I have to click, and which functions I have to choose?

If so, this is great, because I want to do the same exact thing with the Genre and Subgenre columns. Also, if I decide to add a few more Actors columns, it won't require having to modify the code?

Am I right about this? If I am, then I just have to sit down in front of Excel, and mess with it until I figure out how to do this. This is very encouraging. Thank you, everyone!
 
Yes, you can do a lot with the Power Query interface. Though if you add another actor column, you'll have to modify the query. There are a number of options

You could go into the advanced editor and insert "Actor 4" as so

#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Actor 1", "Actor 2", "Actor 3", "Actor 4"}),

Replacing the list with some code can select any number of columns that begin with "Actor"

= Table.SelectColumns(#"Changed Type", List.Select(Table.ColumnNames(#"Changed Type"), each Text.StartsWith(_, "Actor")))

Or staying within the PQ Editor you can find the step that selects the 3 Actor columns, delete it, then add a step to select the 4 columns. You'll get warnings and errors once you delete, but if done properly you can put Humpty Dumpty back together again!
 
I think I got this. I think I did this with just the UI.

I read Norm's instructions about unpivoting the columns. That was key here. (Thank you!)

So, after loading the "Main" worksheet into the PQ Editor, here's what I did:
1) I selected the columns I wanted to work with.
2) I right-clicked and selected Remove Other Columns.
3) I selected all three Actors columns.
4) I right-clicked and selected Unpivot Only Selected Columns.
5) Two columns then appeared: Attribute, and Value. I sorted the Value column, which has the names of the Actors, and selected that one column.
6) I selected Group By.
7) The Group By dialog box was set to Basic. I grouped by Value, typed in a New Column Name, and voila.

I'm pretty sure that did what I want. I haven't sat down yet to double-check the work. But it looks right.
I then did the exact same thing with the Genre and Subgenre columns. The results look correct at first glance, as well.

So…if what I did is correct, and accurate, this makes things so much easier, it's incredible.
Thank you, everyone! I sincerely appreciate the time and effort you all took to help me. I learned a lot from this conversation. Much respect and gratitude. Jd
 

Attachments

  • UI Only - No Code.xlsm
    68.5 KB · Views: 4
Back
Top