Big error in Power Query during web import.

pinarello

Member
Joined
Jun 21, 2019
Messages
214
Reaction score
4
Points
18
Location
Germany
Excel Version(s)
Office 365
For example, when I look at the website https://www.oddsportal.com/matches/hockey/20210907/, the sorting by time shows me the leagues (separate row) and below that the games of this league at this time.

If I now import "Table 0" of this website using Power Query, then the first line with the text: "World "Club Friendly" is interpreted as a header line and all following lines with identical content are automatically deleted. So all friendlies starting at later times are now assigned to the league above.

For this case, there should be the import option "Table without headers", where all rows are imported without Power Query automatically deleting rows.
 
When I link to that website, this is the M-Code generated
Code:
let
    Source = Web.Page(Web.Contents("https://www.oddsportal.com/matches/hockey/20210907/")),
    Data0 = Source{0}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Data0,{{"World»Club Friendly", type text}, {"World»Club Friendly2", type text}, {"World»Club Friendly3", type text}, {"1", Int64.Type}, {"X", type text}, {"2", Int64.Type}, {"B's", type text}})
in
    #"Changed Type"

There is no "Promote First Row as Headers" step as the Table being imported already has the columns labeled. Power Query isn't deleting anything.
If you delete the Changed Type and Navigation steps and click on the whitespace next to the Table value in row 1 you'll see the column names.
If you don't like those headers, I would use the step Use Headers as First Row - this will demote the headers and rename the columns 1 - n.
Then name the columns as needed.
From there start your transforms.
 
Last edited:
Hello Nick,

thank you very much for your efforts, but I think you didn't understand the problem properly.

Since the data is imported from the website sorted by time, it is a coincidence which league for a sport and a day is displayed first. And exactly the first line is interpreted by Power Query as header. And all following lines, whose content is identical with the automatically interpreted header line, are automatically deleted by Power Query, before a user has the chance to prevent this.

If you look at the website, you will see the match Liptovsky Mikulas (Svk) - Znojmo (Cze) starting at 15:00 / 3:00 p.m.
under World >> Club Friednly. But after import by Power Query this match is shown under Russia >> VHL.
 
In the navigator preview, the web view shows the "World >> Club Friendly" on for the match at 15:00 (3:00 p.m.). But already in the preview of the table view, it is displayed incorrectly.

So for me it is clearly an error of Power Query.

import oddsportal.- Navigator - web view.jpgimport oddsportal.- Navigator - table view.jpg
 
Back
Top