Web scraping tennis rankings

I decided to build a web scraper to parse data from livetulokset.com. My first goal is to scrape every player’s name and ranking.

Environment: ASUS ROG G55VW, Xubuntu 18.04, Anaconda, Python 3.7, working in Jupyter Notebook.

I imported the libraries I needed and tested whether I managed to fetch the correct HTML page.
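The original code for this step is not shown here, so the following is only a minimal sketch of fetching a page and checking what came back. It uses the standard library's urllib.request as a stand-in for whatever libraries were actually imported, and RANKINGS_URL and fetch_html are hypothetical names, since the exact rankings page address is not given.

```python
import urllib.request

# Hypothetical placeholder: the exact rankings page URL is not given in the post.
RANKINGS_URL = "https://www.livetulokset.com/"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    # Send a browser-like User-Agent instead of the default Python one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# In the notebook (needs network access):
# html = fetch_html(RANKINGS_URL)
# print(html[:500])  # eyeball the start of the page to confirm it is the right one
```

Printing only the first few hundred characters is usually enough to tell whether the right page was fetched or an error page came back instead.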

I inspected the element I needed.

Then I managed to grab the element I wanted. Next I need to loop through every one of them.
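Looping through every matching element could look roughly like the sketch below. It is an assumption-heavy example: it uses the standard library's html.parser rather than the libraries from the original code, the class names "rank" and "name" are made up, and sample_html is invented markup, not the real page.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while the parser is inside a matching element
        self.parts = []     # text fragments of the current match
        self.results = []   # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1          # nested tag inside a match
        elif self.target_class in classes:
            self.depth = 1           # a new match starts here
            self.parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:          # the matching element just closed
            self.results.append("".join(self.parts).strip())

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def collect_text(html, target_class):
    """Return the text of all elements with the given class, in page order."""
    parser = ClassTextCollector(target_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Invented markup with hypothetical class names, standing in for the real page.
sample_html = """
<div class="row"><span class="rank">1.</span><span class="name">Player A</span></div>
<div class="row"><span class="rank">2.</span><span class="name">Player B</span></div>
<div class="row"><span class="rank">3.</span><span class="name">Player C</span></div>
"""

rankings = collect_text(sample_html, "rank")
print(rankings)
```

The same helper can be called again with a different class name to collect another column, for example collect_text(sample_html, "name") for the players' names.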

I double-checked that the results were what I wanted, because there were duplicated numbers, as seen in the image below. But those players are tied on points, so it’s not a parsing error.
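A quick way to list which ranking numbers repeat, as a small sketch: the values below are sample data, not the scraped results, and the variable names are only illustrative.

```python
from collections import Counter

# Sample values standing in for the scraped ranking column (not real data).
rankings = ["1.", "2.", "3.", "3.", "5."]

# Count how often each ranking appears and keep only the repeated ones.
counts = Counter(rankings)
repeated = {rank: n for rank, n in counts.items() if n > 1}
print(repeated)
```

Each repeated number can then be compared against the page by hand to confirm that it is a genuine tie.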

Next, the players’ names are needed.

Let’s try to convert this to JSON.

Now at least the first lines are converted into JSON format, but I would expect it to show every ranking converted.
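Why the original code stopped after the first lines is not known from the text alone. As a general sketch with sample data, json.dumps serializes an entire list in one call, while passing a single element only serializes that element:

```python
import json

# Sample values standing in for the scraped rankings (not real data).
rankings = ["1.", "2.", "3."]

# Serializing one element gives a single JSON string...
first_only = json.dumps(rankings[0])

# ...while passing the list itself serializes every ranking as a JSON array.
all_rankings = json.dumps(rankings)

print(first_only)
print(all_rankings)
```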

I added a few lines to the end to save these values to a file. And we can see that it doesn’t grab every player, just the #1, Djoco.

Well, I really didn’t expect a result like this. No idea how to fix this. It might be pretty close to the wanted outcome. The image below shows the output in the wanted format, but only for the #1 player. I WANT THEM ALL.

FINALLY, after way too many hours of research, I found the right search words. Now both the ranking and the corresponding name are in the same JSON object.
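Pairing the two lists into one object per player can be sketched with zip and a list comprehension. The key names "ranking" and "name" and the sample values are assumptions for illustration, not the exact code or data from this project.

```python
import json

# Sample lists standing in for the scraped columns (not real data).
rankings = ["1.", "2.", "3."]
names = ["Player A", "Player B", "Player C"]

# zip pairs the items by position; the comprehension builds one dict per player.
players = [{"ranking": r, "name": n} for r, n in zip(rankings, names)]

# ensure_ascii=False keeps accented letters in names readable.
players_json = json.dumps(players, indent=2, ensure_ascii=False)
print(players_json)
```

Note that zip stops at the shorter list, so if one column has fewer items than the other, the extra entries are silently dropped.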

Now it SHOULD be easier to add more lists and then convert a bigger database into charts or something like that! At the moment this code writes files like this:
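The actual output file is not reproduced here. As a rough sketch of how such a file might be written and read back, assuming the same hypothetical "ranking" and "name" keys, sample data, and a made-up file name rankings.json:

```python
import json
from pathlib import Path


def save_players(players, path):
    """Write the whole list of player objects to a JSON file in one call."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(players, f, indent=2, ensure_ascii=False)


def load_players(path):
    """Read the list of player objects back from the JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Sample data standing in for the scraped results (not real data).
players = [
    {"ranking": "1.", "name": "Player A"},
    {"ranking": "2.", "name": "Player B"},
]

output_path = Path("rankings.json")  # hypothetical file name
save_players(players, output_path)
print(output_path.read_text(encoding="utf-8"))
```

Reading the file back with load_players is a cheap check that every player, and not just the first one, made it into the file.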

To Be Continued

Better way to create json file from multiple lists?
two Lists to Json Format in python
Building a Web Scraper from start to finish
Python JSON
How do I save results of a “for” loop into a single variable?
5.1.3. List Comprehensions
