Updates from OGB
Please update your package to 1.3.1 (April 8th, 2021).
April 8th, 2021: Package updated to 1.3.1.
- Thanks to the DGL Team, all the LSC data is now hosted on AWS. This significantly improves the download speed around the globe! The underlying LSC data stays exactly the same.
March 15th, 2021: OGB-LSC at KDD Cup 2021 started!
- We are organizing a machine learning challenge on large-scale graph data.
- Please update your package to 1.3.0, through which the OGB-LSC datasets are accessible.
Feb 28th, 2021: Package updated to
- Fixed downloading bug of https (expired certificate) by switching to http.
Feb 24th, 2021: Package updated to
- ogbg-code has been deprecated due to leakage of the prediction target (i.e., the method name) in the input AST.
- ogbg-code2 has been introduced to fix the issue: the method name and its recursive definition in the AST are replaced with a special token.
Dec 29th, 2020: Package updated to
- ogbl-wikikg has been deprecated due to a bug in the negative samples in the test/validation sets.
- ogbl-wikikg2 has been introduced to fix the issue.
Oct 13th, 2020: Rules for the experimental protocol clarified.
Oct 11th, 2020: Call for dataset contribution.
We opened the dataset contribution from our community. If you have interesting graph datasets, we are looking forward to hearing from you (details here)!
Sep 12th, 2020: Package updated to
- Made ogbn-papers100M data loading more tractable by using compressed binary files (fix issue).
- Introduced DatasetSaver module for external contributors.
- Made dataset object compatible to DGL v0.5 (not backward compatible for heterogeneous graph datasets).
Aug 11th, 2020: Package updated to
We changed the evaluation metric of ogbg-molpcba from PRC-AUC to Average Precision (AP). AP has been shown to be more appropriate for summarizing the non-convex nature of the Precision-Recall curve [1]. The leaderboard and our paper have been updated accordingly.
[1] Jesse Davis and Mark Goadrich. The relationship between Precision-Recall and ROC curves. In International Conference on Machine Learning (ICML), pp. 233–240, 2006.
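To illustrate why the choice of summary matters, here is a minimal pure-Python sketch that computes both AP (a step-wise sum) and a trapezoidal area under the Precision-Recall curve (linear interpolation between points). The function names are hypothetical and not part of the OGB package, and the sketch assumes binary labels, at least one positive, and no tied scores.

```python
def precision_recall_points(y_true, y_score):
    """Return (recall, precision) pairs, one per example, in order of decreasing score."""
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    n_pos = sum(y_true)
    points = []
    tp = 0
    for rank, i in enumerate(order, start=1):
        tp += y_true[i]
        points.append((tp / n_pos, tp / rank))
    return points


def average_precision(y_true, y_score):
    """AP: sum of (recall_n - recall_{n-1}) * precision_n over all thresholds."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in precision_recall_points(y_true, y_score):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def trapezoidal_pr_auc(y_true, y_score):
    """Area under the PR curve with linear interpolation between adjacent points."""
    points = precision_recall_points(y_true, y_score)
    # Assumption: the curve starts at recall 0 with the precision of the first point.
    prev_recall, prev_precision = 0.0, points[0][1]
    area = 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(labels, scores))
print(trapezoidal_pr_auc(labels, scores))
```

On this toy input the two summaries disagree, which is the kind of discrepancy the metric change is meant to avoid.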
July 25th, 2020: Leaderboard policy updated.
For leaderboard submissions, we additionally require reporting validation performance and tuned hyper-parameters. Our goal is to encourage fair model selection by preventing the development of models that are over-tuned to our public test sets. Please refer to here for more details. We thank the community for the great suggestion. If you have previously made leaderboard submissions, please send us these two pieces of information.
June 27th, 2020: Leaderboard policy updated.
- Additional information is required for leaderboard submission (thanks to the suggestion from Google group discussion and Github issue).
- The package version requirement has been added for each dataset.
- To make sure all the leaderboard submissions use the same datasets and evaluators, we have added the package version requirement for each dataset. It can be checked at both dataset pages (e.g., here) and leaderboard pages (e.g., here).
- We highly recommend always using the newest package version. Our data loader only downloads and processes the modified datasets.
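A per-dataset version requirement can be checked locally before running experiments. The following is a small sketch, assuming Python 3.8+ and plain `X.Y.Z` version strings; the helper names `parse_version`, `meets_requirement`, and `check_ogb` are hypothetical and not part of the OGB package.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def meets_requirement(installed, required):
    """True if the installed version is at least the required version."""
    return parse_version(installed) >= parse_version(required)


def check_ogb(required):
    """Return True if the installed `ogb` distribution satisfies `required`."""
    try:
        installed = version("ogb")
    except PackageNotFoundError:
        # The package is not installed at all.
        return False
    return meets_requirement(installed, required)


print(meets_requirement("1.3.1", "1.3.0"))
```

Comparing integer tuples rather than raw strings avoids ranking "1.10.0" below "1.9.0".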
June 26th, 2020: Package updated to
- [Bug fix] The ogbn-mag dataset has been changed to exclude duplicated edges (fix issue).
- [Bug fix] The Evaluator for ogbl-ddi has been changed to use Hits@50 and Hits@20, respectively.
- [Bug fix] The DGL data loader for the two heterogeneous graph datasets (ogbl-biokg) has been fixed (fix issue).
- Baseline performance on ogbl-ppa has been updated.
- Our arXiv paper has been updated accordingly.
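For readers unfamiliar with Hits@K, here is a minimal sketch of one common definition for link prediction: the fraction of positive edges scored strictly higher than the K-th highest-scoring negative edge. This is an illustration, not the package's Evaluator; the function name is hypothetical, and returning 1.0 when fewer than K negatives exist is an assumption of this sketch.

```python
def hits_at_k(pos_scores, neg_scores, k):
    """Fraction of positives scored strictly above the k-th best negative."""
    if len(neg_scores) < k:
        # Assumption: with fewer than k negatives, every positive counts as a hit.
        return 1.0
    kth_neg = sorted(neg_scores, reverse=True)[k - 1]
    hits = sum(1 for score in pos_scores if score > kth_neg)
    return hits / len(pos_scores)


pos = [0.9, 0.5, 0.2]
neg = [0.8, 0.6, 0.3, 0.1]
print(hits_at_k(pos, neg, 2))
print(hits_at_k(pos, neg, 4))
```

A larger K lowers the threshold a positive edge has to clear, so the metric is non-decreasing in K for fixed scores.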
June 11th, 2020: Second major release of OGB.
- 5 new datasets (ogbg-code) and their benchmark experiments have been added.
- Our arXiv paper has been updated accordingly.
- Our package has been updated to 1.2.0, which includes the new datasets. No change has been applied to the existing datasets.
- Baseline performance on ogbl-citation has been improved.