Updates from OGB

Please update your package to 1.3.1 (April 8th, 2021).

April 8th, 2021: Package updated to 1.3.1.
  • Thanks to the DGL Team, all the LSC data is now hosted on AWS. This significantly improves the download speed around the globe! The underlying LSC data stays exactly the same.

March 15th, 2021: OGB-LSC at KDD Cup 2021 started!
  • We are organizing a machine learning challenge on large-scale graph data.
  • Please update your package to 1.3.0, through which the OGB-LSC datasets are accessible.

Feb 28th, 2021: Package updated to 1.2.6.
  • Fixed an HTTPS download bug (expired certificate) by switching to HTTP.

Feb 24th, 2021: Package updated to 1.2.5.
  • ogbg-code has been deprecated because the prediction target (i.e., the method name) leaked into the input AST.
  • ogbg-code2 has been introduced to fix the issue: the method name and its recursive definition in the AST are replaced with a special token _mask_.
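
To make the leakage fix concrete, here is a minimal sketch of masking a method name in a Python AST. It is not the preprocessing used to build ogbg-code2; the helper names (`mask_method_name`, `_MaskName`) are hypothetical, and it assumes the source string starts with a single function definition.

```python
import ast

MASK = "_mask_"  # special token named in the release note


class _MaskName(ast.NodeTransformer):
    """Hypothetical helper: rewrite every occurrence of one identifier."""

    def __init__(self, target):
        self.target = target

    def visit_FunctionDef(self, node):
        # Mask the definition itself, then descend into the body.
        if node.name == self.target:
            node.name = MASK
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        # Mask plain references, e.g. a recursive call `fact(n - 1)`.
        if node.id == self.target:
            node.id = MASK
        return node

    def visit_Attribute(self, node):
        # Mask attribute-style references, e.g. `self.fact(n - 1)`.
        if node.attr == self.target:
            node.attr = MASK
        self.generic_visit(node)
        return node


def mask_method_name(source):
    """Parse `source` and return its AST with the method name masked."""
    tree = ast.parse(source)
    func = tree.body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected source to start with a function definition")
    return _MaskName(func.name).visit(tree)
```

After masking, the original name no longer appears anywhere in `ast.dump(tree)`, so a model that reads the AST cannot copy the target from its input.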

Dec 29th, 2020: Package updated to 1.2.4.
  • ogbl-citation and ogbl-wikikg have been deprecated due to a bug in the negative samples of the test/validation sets.
  • ogbl-citation2 and ogbl-wikikg2 have been introduced to fix the issue.

Oct 13th, 2020: Rules for the experimental protocol clarified.

Oct 11th, 2020: Call for dataset contribution.

We have opened dataset contributions from the community. If you have interesting graph datasets, we look forward to hearing from you (details here)!


Sep 25th, 2020: OGB paper accepted to NeurIPS.

Sep 12th, 2020: Package updated to 1.2.3.
  • Made ogbn-papers100M data loading more tractable by using compressed binary files (fix issue).
  • Introduced DatasetSaver module for external contributors.
  • Made the dataset object compatible with DGL v0.5 (not backward compatible for heterogeneous graph datasets).

Aug 11th, 2020: Package updated to 1.2.2.

We changed the evaluation metric of ogbg-molpcba from PRC-AUC to Average Precision (AP). AP has been shown to be a more appropriate summary of the Precision-Recall curve given its non-convex nature [1]. The leaderboard and our paper have been updated accordingly.

[1] Jesse Davis and Mark Goadrich. The relationship between Precision-Recall and ROC curves. In International Conference on Machine Learning (ICML), pp. 233–240, 2006.
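
For readers unfamiliar with the metric, the following is a minimal sketch of Average Precision for a single binary task, written in plain Python. It is not the OGB evaluator. It uses the standard step-wise definition AP = Σ_k (R_k − R_{k−1}) · P_k over examples ranked by descending score, and it assumes there are no tied scores. The function name `average_precision` is chosen here for illustration.

```python
def average_precision(y_true, y_score):
    """Step-wise AP for one binary task (assumes no tied scores)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("AP is undefined when there are no positive labels")

    # Rank examples from highest to lowest predicted score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)

    true_positives = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            # Recall rises by 1/n_pos here; weight it by precision at this rank.
            total += true_positives / rank
    return total / n_pos


# Positives are ranked 1st and 3rd, so AP = (1/1 + 2/3) / 2.
score = average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because the sum uses the precision at each recall step directly, no linear interpolation between points of the Precision-Recall curve is involved.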


July 25th, 2020: Leaderboard policy updated.

Leaderboard submissions must now also report validation performance and the tuned hyper-parameters. Our goal is to encourage fair model selection by preventing the development of models that are over-tuned to our public test sets. Please refer to here for more details. We thank the community for the great suggestion. If you have previously made leaderboard submissions, please send us these two pieces of information.
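
As an illustration of the intended procedure, the sketch below picks a configuration by validation score only and then reports the test score of that same configuration. The record layout (`params`, `valid`, `test`), the function name, and all numbers are hypothetical, and it assumes a metric where higher is better.

```python
def select_by_validation(runs):
    """Return the run with the best validation score (higher is better)."""
    return max(runs, key=lambda run: run["valid"])


# Hypothetical tuning results; the values are made up for illustration.
runs = [
    {"params": {"lr": 0.01, "hidden": 128}, "valid": 0.71, "test": 0.69},
    {"params": {"lr": 0.001, "hidden": 256}, "valid": 0.74, "test": 0.70},
    {"params": {"lr": 0.0001, "hidden": 256}, "valid": 0.72, "test": 0.73},
]

chosen = select_by_validation(runs)
# Report chosen["params"], chosen["valid"], and chosen["test"] together,
# even though another run happens to score higher on the test set.
```

Selecting the third run because of its test score would be exactly the over-tuning to the public test set that the policy is meant to prevent.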


June 27th, 2020: Leaderboard policy updated.
  1. Additional information is required for leaderboard submissions (thanks to the suggestions from the Google group discussion and the GitHub issue).
    • We additionally require reporting the hardware used and #parameters in the leaderboard submission. Please refer to here for more details.
    • If you have previously made leaderboard submissions, please send us these two pieces of information.
  2. The package version requirement has been added for each dataset.
    • To make sure all the leaderboard submissions use the same datasets and evaluators, we have added the package version requirement for each dataset. It can be checked at both dataset pages (e.g., here) and leaderboard pages (e.g., here).
    • We highly recommend always using the newest package version. After an update, our data loader re-downloads and re-processes only the datasets that were modified.
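
A version requirement like the one described above can be checked before running experiments. The sketch below is one way to do it with the standard library only; the function names are hypothetical, and the parser assumes plain dotted numeric versions such as 1.2.4 (no pre-release suffixes).

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.2.4' into (1, 2, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def meets_requirement(installed, required):
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def installed_version(package="ogb"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example: a dataset page that asks for package version >= 1.2.4.
current = installed_version("ogb")
if current is None:
    print("ogb is not installed")
elif not meets_requirement(current, "1.2.4"):
    print(f"ogb {current} is too old; please upgrade")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating 1.10.0 as older than 1.9.9.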

June 26th, 2020: Package updated to 1.2.1.

June 11th, 2020: Second major release of OGB.

May 4th, 2020: First major release of OGB.
  • Package updated to 1.1.1.
  • Paper uploaded to arXiv.

May 1st, 2020: Package updated to 1.1.0.