Results 1 to 3 of 3

Thread: Merging multiple XML files

  1. #1
    Conjurer WizzardOfOz's Avatar
    Join Date
    Sep 2013
    Location
    Australia
    Posts
    184
    Articles
    0

    Merging multiple XML files



    Register for a FREE account, and/
    or Log in to avoid these ads!

    I've downloaded several large XML files (about 40 M each), each year is treated separate in the the files i.e. each node in the XML files has the structure of:

    <report>
    <Year>2000<Year>
    rest of xml follows

    The files change infrequently, (maybe twice a year). I need to statistically analyse the files. Should I combine the files into one large file (about 500 MB) or simply loop over each year?
    Is below the best way to join the files and what am I doing wrong that it is not saving the file?

    Thanks


    Code:
    Sub JoinXML()
    Const mPath As String = "C:\tmp\data_"
    
    
    Dim XDoc As MSXML2.DOMDocument60, XCombined As MSXML2.DOMDocument60
    Dim Level1 As MSXML2.IXMLDOMNode, RootElement As MSXML2.IXMLDOMNode
    Dim yr As Long
    
    
        Set XCombined = New MSXML2.DOMDocument60        'the new file
        Set RootElement = XCombined.createElement("data")    ' the root
        
        Set XDoc = New MSXML2.DOMDocument60             'the existing year
        
        For yr = 1999 To 2013
            XDoc.async = False
            XDoc.validateOnParse = False
            
            XDoc.Load (mPath & yr & ".xml")
            For Each Level1 In XDoc.DocumentElement.ChildNodes
                If Level1.NodeType = NODE_ELEMENT Then Call RootElement.appendChild(Level1)
            Next Level1
        Next yr
        XCombined.Save (mPath & ".xml")
        Application.StatusBar = " "
            
        Set XDoc = Nothing
        Set XCombined = Nothing
    
    
    End Sub

  2. #2
    Conjurer WizzardOfOz's Avatar
    Join Date
    Sep 2013
    Location
    Australia
    Posts
    184
    Articles
    0
    Ok solved my own coding problem, after set RootElement = XCombined.createElement need to add the line
    Xcombined.appendchild RootElement
    Would have thought that would be defined in the set statement.

    Anyway back to the original question, should one combine multiple XML files or loop over each file?

  3. #3
    Conjurer WizzardOfOz's Avatar
    Join Date
    Sep 2013
    Location
    Australia
    Posts
    184
    Articles
    0
    I combined the 15 files into one large file and then ran a test to select records that matched two conditions.
    Using 15 yearly files took just under one minute (00:00:54) compared to one large file that took more than double the time (00:02:24).
    Looking at the task manager, it seems like the slow step is reading the file into memory and it is much faster reading 15 files smaller files than one large file.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •