Kaggle - Complete Leaderboard Download

I am trying to download the full leaderboard table for an individual Kaggle competition. I have used the Kaggle API and also downloaded the table via the 'Raw Data' output, but the table data is incomplete.

Specifically, the downloaded table does not contain the '# of Entries' column or the member details (where these are available for a competition).

I have also tried scraping the table (based on code available here), but the code does not find any table on the page:

from bs4 import BeautifulSoup
import requests
import pandas as pd
import re

# Site URL
url = "https://www.kaggle.com/c/jane-street-market-prediction/leaderboard"

# Make a GET request to fetch the raw HTML content
html_content = requests.get(url).text

# Parse HTML code for the entire site
soup = BeautifulSoup(html_content, "lxml")
#print(soup.prettify()) # print the parsed data of html

# The following line will generate a list of HTML content for each table
leaderboard = soup.find_all('table', attrs={"class": "competition-leaderboard__table"})
print("Number of tables on site:", len(leaderboard))

It would be great if someone could help out with this. Thanks in advance!

asked Oct 30 '25 21:10 by beta


1 Answer

You can try the Meta Kaggle dataset. It includes files with team membership data and with the submissions made by each team in each competition.

P.S. Parsing competition web pages is indeed hard - I've spent hours trying to get info that way.
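
For the Meta Kaggle route suggested above, here is a minimal sketch of how the missing columns might be rebuilt by joining the dataset's tables. It is not taken from the Meta Kaggle documentation: the function name build_leaderboard is hypothetical, and the column names (Id, Slug, CompetitionId, TeamName, TeamId, UserId) are assumptions about the CSV layout that should be checked against the files you actually download. It uses only the standard library and reads each table once, so the large submissions file can be streamed.

```python
from collections import Counter, defaultdict


def build_leaderboard(competitions, teams, memberships, submissions, slug):
    """Join Meta Kaggle-style rows into one table for a single competition.

    Every argument except `slug` is an iterable of dicts, for example a
    csv.DictReader. All column names used here are ASSUMPTIONS about the
    Meta Kaggle CSV layout; verify them against the real files.
    """
    # Find the competition id from its URL slug.
    comp_id = next((c["Id"] for c in competitions if c["Slug"] == slug), None)
    if comp_id is None:
        raise ValueError(f"competition slug not found: {slug!r}")

    # Keep only the teams that entered this competition.
    comp_teams = {t["Id"]: t for t in teams if t["CompetitionId"] == comp_id}

    # '# of Entries': count submissions per team.
    entries = Counter(
        s["TeamId"] for s in submissions if s["TeamId"] in comp_teams
    )

    # Member details: collect user ids per team.
    members = defaultdict(list)
    for m in memberships:
        if m["TeamId"] in comp_teams:
            members[m["TeamId"]].append(m["UserId"])

    # One output row per team, in the order the teams were read.
    return [
        {
            "TeamName": team["TeamName"],
            "Entries": entries[team_id],
            "MemberUserIds": members.get(team_id, []),
        }
        for team_id, team in comp_teams.items()
    ]
```

To use it with real files, each table can be passed as csv.DictReader(open(path, newline="", encoding="utf-8")), with slug set to the last part of the competition URL (jane-street-market-prediction in the question). The user ids can then be matched against the dataset's users table if names are needed, again assuming that table is keyed by the same id.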

answered Nov 01 '25 11:11 by JohnM


