aggregate header values which change monthly

wabarker1

New member
Joined
May 19, 2017
Messages
3
Reaction score
0
Points
0
Location
West Chester, Pennsylvaina
Excel Version(s)
365
I have a rather large data dump from our planning tool.

I would like to perform the following tasks....

I would like to delete columns in red (D-M), not needed on a daily basis.

I would like to aggregate columns in purple into 3 columns, but these headers change every month... so i dont know if it can be done with relative references or sum with some type of wildcard. All 108 columns (in purple) can be summarized for my purposes into 3 columns (in blue) and then deleted.

36 Month Total Sales History
36 Month Processed Sales History
36 Month Tot. Validated Forecast

the summary data set is very small... but my recordset can contain about 800K rows in a csv.

the format of the file will remain unchanged, but the column headers will have the dates shift monthly (Columns S-DV)

Help on this would be greatly appreciated!View attachment Consensus_PQ.xlsx
 
See if this works for you

Load data into powerquery
Select red columns and right click remove columns
Choose the blue columns and right click unpivot other columns
Add new custom column with some sort of formula that removes the prefix from the purple columns. I'm using this to look for (space)(dash)(space)
Code:
if Text.PositionOf([Attribute]," - ")<0 then [Attribute] else Text.Range([Attribute],3+Text.PositionOf([Attribute]," - "))
Select that new column and choose Pivot, using with Value columns as the "Value Column"

Full code below. Sample attached View attachment Book3.xlsx

Code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Removed Columns1" = Table.RemoveColumns(Source,{"Planning Hierarchy 7", "Planning Hierarchy 7 Description", "Planning Hierarchy 6", "Planning Hierarchy 6 Description", "Planning Hierarchy 5", "Planning Hierarchy 5 Description", "Planning Hierarchy 4", "Planning Hierarchy 4 Description", "Planning Hierarchy 3", "Planning Hierarchy 3 Description"}),
    #"Unpivoted Other Columns1" = Table.UnpivotOtherColumns(#"Removed Columns1", {"Region Description", "Country", "Country Description", "Planning Hierarchy 2", "Planning Hierarchy 2 Description", "Location", "Label", "Demand Stream"}, "Attribute", "Value"),
    #"Added Custom1" = Table.AddColumn(#"Unpivoted Other Columns1", "Custom.1", each if Text.PositionOf([Attribute]," - ")<0 then [Attribute] else Text.Range([Attribute],3+Text.PositionOf([Attribute]," - "))),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"Attribute"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Custom.1]), "Custom.1", "Value", List.Sum)
in  #"Pivoted Column"
 
It worked... but...

I was able to execute what you provided... but note.. this took a minute or two to refresh...this file only has 60K rows... but the other file has 800K rows...

I was wondering if it is possible to delete the first row, sum the columns based on column numbers... then add in the headers via a merge query?
 
Ok, try this
View attachment Book4.xlsx
Create three lists by filtering column names
Create three custom columns that sum contents of each column on those lists
Delete extra columns

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Removed Columns" = Table.RemoveColumns(Source,{"Planning Hierarchy 7", "Planning Hierarchy 7 Description", "Planning Hierarchy 6", "Planning Hierarchy 6 Description", "Planning Hierarchy 5", "Planning Hierarchy 5 Description", "Planning Hierarchy 4", "Planning Hierarchy 4 Description", "Planning Hierarchy 3", "Planning Hierarchy 3 Description"}),
    TotalSales = List.Select(Table.ColumnNames(#"Removed Columns"), each Text.Contains(_, "Total Sales History")),
    ProcessedSales = List.Select(Table.ColumnNames(#"Removed Columns"), each Text.Contains(_, "Processed Sales History")),
    Validated= List.Select(Table.ColumnNames(#"Removed Columns"), each Text.Contains(_, "Validated Forecast")),
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Total Sales", each List.Sum(Record.FieldValues(Record.SelectFields(_,TotalSales)))),
    #"Added Custom2" = Table.AddColumn(#"Added Custom", "Processed Sales", each List.Sum(Record.FieldValues(Record.SelectFields(_,ProcessedSales)))),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Validated", each List.Sum(Record.FieldValues(Record.SelectFields(_,Validated)))),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom3",List.Combine({TotalSales,ProcessedSales,Validated}))
in
#"Removed Columns1"
 
Back
Top