Page 1 of 2 1 2 LastLast
Results 1 to 10 of 12

Thread: Count if in Power BI

  1. #1
    Seeker pinarello's Avatar
    Join Date
    Jun 2019
    Location
    Germany
    Posts
    8
    Articles
    0
    Excel Version
    Office 365

    Count if in Power BI



    Register for a FREE account, and/
    or Log in to avoid these ads!

    Hello,

    on mrexcel.com/forum the question "Countif in power BI" was asked recently. Since I found the question attractive, I tried to find a solution and found one.

    When I applied it to 11,000 lines, the query ran for more than 7 minutes. By using "Table.Buffer" I could reduce the runtime to about 50 seconds, but this is still much too slow.

    To solve the problem, I had to use indexes. I suspect that the long runtime is related to this. Because even in another query in which I work with INDEX, the runtime is also very bad.

    However, I myself have only been working with Power Query since last December and don't know, for example, when and how "Table.Buffer" is best used.

    But the great solutions posted for Challenge 5 make me confident that there is a good and fast solution for this problem as well.
    Attached Files Attached Files

  2. #2
    Acolyte cyborgski's Avatar
    Join Date
    May 2019
    Posts
    26
    Articles
    0
    Excel Version
    Office 365
    Hi pinarello,

    I don't know that indexing is the issue but rather where you reference the prior record in step Userdef_Col.

    [ColA] <> Index_3[ColA]{[#"Index-3"]-1}

    In my experience, this technique is always slow. You might get better performance by merging and expanding to get the prior record.

    Alternatively, you can use a group on ColA, add an index, expand and resort to avoid referencing the prior record. The query below runs in a couple of seconds.

    Regards,
    Mike

    let
    Source = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ColA", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "OriginalOrder", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index", {"ColA"}, {{"Temp", each _, type table [ColA=text, OriginalOrder=number]}}),
    Transform = Table.TransformColumns(#"Grouped Rows",{{"Temp", each Table.AddIndexColumn(_, "ColB", 1, 1), type table}}),
    #"Expanded Temp" = Table.ExpandTableColumn(Transform, "Temp", {"OriginalOrder", "ColB"}, {"OriginalOrder", "ColB"}),
    #"Sorted Rows" = Table.Sort(#"Expanded Temp",{{"OriginalOrder", Order.Ascending}}),
    #"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"OriginalOrder"})
    in
    #"Removed Columns"

  3. #3
    Acolyte cyborgski's Avatar
    Join Date
    May 2019
    Posts
    26
    Articles
    0
    Excel Version
    Office 365
    I have subsequently modified your code to show the self merge technique to reference the prior row. This query runs in seconds too.

    let
    Source = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
    Index_1 = Table.AddIndexColumn(Source, "Index", 0, 1),
    Sorted_Rows = Table.Sort(Index_1,{{"ColA", Order.Ascending}, {"Index", Order.Ascending}}),
    Index_2 = Table.AddIndexColumn(Sorted_Rows, "Index-2", 1, 1),
    Index_3 = Table.AddIndexColumn(Index_2, "Index-3", 0, 1),
    #"Merged Queries" = Table.NestedJoin(Index_3, {"ColA", "Index-3"}, Index_3, {"ColA", "Index-2"}, "Index_3", JoinKind.LeftOuter),
    #"Expanded Index_3" = Table.ExpandTableColumn(#"Merged Queries", "Index_3", {"ColA"}, {"PriorRow.ColA"}),
    AddCountFrom = Table.AddColumn(#"Expanded Index_3", "CountFrom", each if [PriorRow.ColA] = null then [#"Index-2"]
    else null),

    #"Filled Down" = Table.FillDown(AddCountFrom,{"CountFrom"}),
    #"Added Custom" = Table.AddColumn(#"Filled Down", "ColB", each [#"Index-2"] -[CountFrom]+1),
    #"Sorted Rows" = Table.Sort(#"Added Custom",{{"Index", Order.Ascending}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Sorted Rows",{"ColA", "ColB"})
    in
    #"Removed Other Columns"
    Last edited by cyborgski; 2019-06-23 at 07:52 AM. Reason: Boldface wrong line

  4. #4
    Seeker pinarello's Avatar
    Join Date
    Jun 2019
    Location
    Germany
    Posts
    8
    Articles
    0
    Excel Version
    Office 365
    Hello, Mike,

    first of all many thanks for your two fantastic solutions. I took a close look at them and then let them run step by step to understand them.

    With my current state of knowledge, I would never have come up with the idea of solving the question in this way.

    With a data volume of 120,000 table lines, both are almost equally fast with a runtime of 1-2 seconds. And I can't tell which one is faster. If I test with 960,000 table rows, then your first solution takes about 12 seconds and the second about 40 seconds.

    By using "Table.Buffer" I can bring the first one down to 10.5 seconds, while the second one has a negative effect. Whereby I must confess that I am not yet clear, when and how "Table.Buffer" is used most meaningfully.

    The reference book, published in 2018, in German language, that I bought 3 months ago, is helpful at the beginning, but as soon as the questions become more complex, it doesn't help. And the Microsoft M-Reference (English) I can hardly use at the moment, because in my opinion, the examples can hardly be adapted to my own questions.

    In this respect, I think it was a good decision to become active on Excelguru.

    By the way, Pinarello was the brand of my 2nd racing bike, which I rode from 1983-1988. Currently I have been riding a full carbon racing bike with a weight of about 7 kilograms since the end of 2007.


    Best regards,
    Pinarello

  5. #5
    Acolyte NormS's Avatar
    Join Date
    Jul 2017
    Posts
    25
    Articles
    0
    Excel Version
    Excel 2016 ProPlus
    Grouping and adding an index as cyborgski has done is a powerful tool. This blog post of Ken's provides step-by-step instructions.

    https://www.excelguru.ca/blog/2018/0...g-power-query/

    Norm

  6. #6
    Seeker pinarello's Avatar
    Join Date
    Jun 2019
    Location
    Germany
    Posts
    8
    Articles
    0
    Excel Version
    Office 365
    Many thanks to all of you.


    I already gave a very positive feedback yesterday. But my answer still has to be approved by the administrator.

    If my answer yesterday is not released, then I will probably have to reformulate it again.



    Addendum: Funny, today's answer went straight through. Maybe I happened to use a keyword on the index yesterday
    Last edited by pinarello; 2019-06-24 at 01:28 PM.

  7. #7
    Acolyte cyborgski's Avatar
    Join Date
    May 2019
    Posts
    26
    Articles
    0
    Excel Version
    Office 365
    Hi pinarello,


    Glad the post helped you.


    I have found Table.Buffer (and List.Buffer) to be useful when calling the same table over and over again in a column or in List.Generate/List.Accumulate. And I always found .Buffer hurts my NestedJoins.


    You may find better performance by putting a key on your Index3 step to be used in the NestedJoin step. But this is not guarentee of improvement either.


    Index_3 = Table.AddKey(Table.AddIndexColumn(Index_2, "Index-3", 0, 1),"Index-3", true),


    The rule that I follow most in PQ is "if it helps then keep it, otherwise lose it".


    Regards,
    Mike

  8. #8
    Seeker pinarello's Avatar
    Join Date
    Jun 2019
    Location
    Germany
    Posts
    8
    Articles
    0
    Excel Version
    Office 365
    Hi mike,

    when I replace the Index_3 code I get error message: It's not possible convert value "Index-3" into type "List".

    Regards,
    Pinarello

  9. #9
    Acolyte cyborgski's Avatar
    Join Date
    May 2019
    Posts
    26
    Articles
    0
    Excel Version
    Office 365
    Sorry, forgot the braces around "Index-3"
    Index_3 = Table.AddKey(Table.AddIndexColumn(Index_2, "Index-3", 0, 1),{"Index-3"}, true),

  10. #10
    Seeker pinarello's Avatar
    Join Date
    Jun 2019
    Location
    Germany
    Posts
    8
    Articles
    0
    Excel Version
    Office 365
    Using "AddKey" yesterday your solution with the self merge technique for 960,000 lines only took about 11 seconds. For me incomprehensible, this variant needs today between 35 and 40 seconds (several attempts).

    But I noticed something else.

    In my folder I have your 2 queries, the query from citizenbh (mrexcel.com/forum - link to the beginning of the thread), which he posted after my positive experiences here, and my own (slow) query.

    And only the presence of my slow query, even if I don't open and update it, causes the task manager for Excel to permanently display a CPU load of 30 - 50 percent (960,000 input lines).

    Only after I have commented on my query and refreshed it once, Excel now shows a CPU load of 0.1 percent.

    Do you have any idea why this might be?

Page 1 of 2 1 2 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •