Results 1 to 7 of 7

Thread: Combining Data from multiple files with inconsistent date headings

  1. #1
    Neophyte simsy's Avatar
    Join Date
    May 2020
    Location
    UK
    Posts
    2
    Articles
    0
    Excel Version
    365

    Post Combining Data from multiple files with inconsistent date headings



    Register for a FREE account, and/
    or Log in to avoid these ads!

    Hi,

    I need to combine the data in around 30 files into a single Excel file for further analysis. I've been doing this with VBA for around the last 6 years, which, while slow, has worked.

    However the number of files I need to combine has grown significantly and my VBA code struggles to finish in a reasonable time.

    Now PQ is part of Excel, it seems to make sense to put all this data in the Data Model for my analysis.

    The problem I have is, the format of the data is very simple (it's a matrix. . not a table). I thought the very useful Unpivot command would be the answer, but I can't make it work.

    If you look at the attached file, you see the simple format that I'm trying to convert in to a table. . . . but the table headings ( week commencing dates ) can all have different start dates.

    Could somebody please give me a brief guide of the best way to approach this problem, I think I need to do an independent Unpivot for each file, but that's purely me describing what I think is the solution.

    Many thanks

    Ex Inconsistent Column Headings.xlsx

  2. #2
    Conjurer alansidman's Avatar
    Join Date
    Oct 2018
    Location
    Steamboat Springs
    Posts
    235
    Articles
    0
    Excel Version
    2019
    Using your example of three files, mock up what you want the end result to look like. This will help significantly in determining how to combine them. How will duplicate dates be handled?
    Last edited by alansidman; 2020-05-28 at 10:33 PM.

  3. #3
    Super Moderator Bob Phillips's Avatar
    Join Date
    Mar 2011
    Posts
    1,782
    Articles
    0
    Excel Version
    O365
    Pull all the files in from the folder, and run a function on them to unpivot, then pivot them back.

    Code:
    let
        Source = Folder.Files("D:\_projects\_8 Community\_forums\ExcelGuru\ExcelGuru 10655 - Combine Files Various Dates"),
        convert = (input as text) as table =>
        let
            Source = Excel.Workbook(File.Contents(input), null, true){[Item="Sheet1",Kind="Sheet"]}[Data],
            headers.Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
            dates.Unpivot = Table.UnpivotOtherColumns(headers.Promote, {"Location", "Operator", "Work Type"}, "Date", "Value")
        in
            dates.Unpivot,
        file.AddCol = Table.AddColumn(Source, "Custom", each convert([Folder Path]&[Name])),
        extras.RemoveCol = Table.RemoveColumns(file.AddCol,{"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
        file.Expand = Table.ExpandTableColumn(extras.RemoveCol, "Custom", {"Location", "Operator", "Work Type", "Date", "Value"}, {"Location", "Operator", "Work Type", "Date", "Value"}),
        #"Renamed Columns" = Table.RenameColumns(file.Expand,{{"Date", "DateString"}}),
        data.Type = Table.TransformColumnTypes(#"Renamed Columns",{{"DateString", type date}, {"Location", type text}, {"Operator", type text}, {"Work Type", type text}, {"Value", type number}}),
        date.Format = Table.AddColumn(data.Type, "Date", each Date.ToText([DateString], "dd-MMM-yyyy")),
        datestring.RemoveCol = Table.RemoveColumns(date.Format,{"DateString"}),
        dates.Pivot = Table.Pivot(datestring.RemoveCol, List.Distinct(datestring.RemoveCol[Date]), "Date", "Value", List.Sum)
    in
        dates.Pivot

  4. #4
    Neophyte simsy's Avatar
    Join Date
    May 2020
    Location
    UK
    Posts
    2
    Articles
    0
    Excel Version
    365
    Thanks Bob, that works a treat

  5. #5
    Acolyte Dico's Avatar
    Join Date
    May 2020
    Posts
    26
    Articles
    0
    Excel Version
    2019 Pro Plus
    Quote Originally Posted by Bob Phillips View Post
    Pull all the files in from the folder, and run a function on them to unpivot, then pivot them back.

    Code:
    let
        Source = Folder.Files("D:\_projects\_8 Community\_forums\ExcelGuru\ExcelGuru 10655 - Combine Files Various Dates"),
        convert = (input as text) as table =>
        let
            Source = Excel.Workbook(File.Contents(input), null, true){[Item="Sheet1",Kind="Sheet"]}[Data],
            headers.Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
            dates.Unpivot = Table.UnpivotOtherColumns(headers.Promote, {"Location", "Operator", "Work Type"}, "Date", "Value")
        in
            dates.Unpivot,
        file.AddCol = Table.AddColumn(Source, "Custom", each convert([Folder Path]&[Name])),
        extras.RemoveCol = Table.RemoveColumns(file.AddCol,{"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
        file.Expand = Table.ExpandTableColumn(extras.RemoveCol, "Custom", {"Location", "Operator", "Work Type", "Date", "Value"}, {"Location", "Operator", "Work Type", "Date", "Value"}),
        #"Renamed Columns" = Table.RenameColumns(file.Expand,{{"Date", "DateString"}}),
        data.Type = Table.TransformColumnTypes(#"Renamed Columns",{{"DateString", type date}, {"Location", type text}, {"Operator", type text}, {"Work Type", type text}, {"Value", type number}}),
        date.Format = Table.AddColumn(data.Type, "Date", each Date.ToText([DateString], "dd-MMM-yyyy")),
        datestring.RemoveCol = Table.RemoveColumns(date.Format,{"DateString"}),
        dates.Pivot = Table.Pivot(datestring.RemoveCol, List.Distinct(datestring.RemoveCol[Date]), "Date", "Value", List.Sum)
    in
        dates.Pivot

    Hi Bob Philipps,
    That's a very interesting code. I did a test on an example and it works very well.
    I'm trying to understand the steps created automatically using the interface and those created manually.
    Does the code in red correspond to a manual entry?

    Thank you in advance

  6. #6
    Super Moderator Bob Phillips's Avatar
    Join Date
    Mar 2011
    Posts
    1,782
    Articles
    0
    Excel Version
    O365
    Quote Originally Posted by Dico View Post
    I'm trying to understand the steps created automatically using the interface and those created manually.
    Does the code in red correspond to a manual entry?
    Yes and no. I first created a query that read in one file, promoted the headers and unpivoted it, all via the interface, and then I turned that into a generic function (function signature, changed the file name to the parameter input), and embedded that manually into the query that grabbed all of the files.

  7. #7
    Acolyte Dico's Avatar
    Join Date
    May 2020
    Posts
    26
    Articles
    0
    Excel Version
    2019 Pro Plus
    Thanks Bob for your response,
    I think I understand now how you did it,
    it's actually easier than writing manually some code steps.

    Have a nice day.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •