
Downloading the Data

  • Option 1: Direct download from the tasks page, in JSON lines format.

  • Option 2: Using the datasets library:

    from datasets import load_dataset

    z_scrolls_datasets = ["gov_report", "summ_screen_fd", "qmsum", "squality",
                          "qasper", "narrative_qa", "quality", "musique",
                          "space_digest", "book_sum_sort"]
    data = [load_dataset("tau/zero_scrolls", dataset) for dataset in z_scrolls_datasets]

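For files obtained via Option 1, reading the JSON lines format takes only the standard json module. Below is a minimal sketch; the filename qasper.jsonl is a hypothetical example, not a guaranteed file name from the download.

```python
import json

def read_jsonl(path):
    """Read a JSON-lines file: one JSON object per non-empty line."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                examples.append(json.loads(line))
    return examples

# Hypothetical usage, assuming a downloaded task file:
# examples = read_jsonl("qasper.jsonl")
```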
Running Experiments

  • The code used to run our experiments is available on GitHub.


Making a Leaderboard Submission

  • Create a comma-separated values (CSV) file with the headers (Task, ID, Prediction), where each row represents one output.
    For example:

    Task,ID,Prediction
    qasper,8941956c4b67e2436bbaf372a120f358f50c377b,"English, German, French"
    qasper,5b63fb32633223fa4ee214979860349242a11451,"sentiment classifiers"
    ...
    quality,72790_5QFDYSRE_4,"C"
    ...
    summ_screen_fd,fd_Gilmore_Girls_01x13,"Rory's charity rummage sale is a disaste..."
    ...

We recommend using our conversion script to produce the CSV file from JSON prediction files to avoid discrepancies.

As input, it expects, for every task, a predictions JSON file mapping each ID to its textual prediction, e.g.:

    {
        "example_id1": "prediction1",
        "example_id2": "prediction2",
        ...
    }
  • Log in to the website (using your Google account is recommended).
     

  • Upload your CSV file via the submission page.
     

  • Within a few minutes, check your email for a confirmation message that your submission has been received.
     

  • Results will be sent by email within 24 hours. Valid public submissions will immediately appear on the leaderboard.


  • Each user is limited to 5 submissions per week and a total of 10 submissions per month.


If you need any help, please reach out to scrolls-benchmark-contact@googlegroups.com
