Scratch
Working with Intermediate Data in Code Ocean
The /scratch folder is a dedicated folder mounted to the Capsule that ensures large intermediate data can be easily used in Code Ocean. It functions differently during Cloud Workstation sessions and Reproducible Runs. In both cases, the /scratch folder is mounted EFS storage that is practically unlimited in size.
During a Cloud Workstation session, working with a large volume of data can significantly affect the performance of the Capsule. Disk space is limited, and copying large volumes of data back and forth between the Cloud Workstation and the Capsule is time-consuming and not recommended.
For Cloud Workstation (CW) sessions, the /scratch folder is a mounted drive whose contents persist throughout the lifetime of the Capsule. Files written to /scratch during a CW session remain visible in the Capsule IDE after the session is Shut Down and are available in all subsequent sessions unless deleted by the user. These files are not available during a Reproducible Run.
The /scratch folder is a convenient location to store large data, either from the Capsule IDE or during a CW session.
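Because files written to the Cloud Workstation /scratch persist until the user deletes them, it can help to check occasionally what is stored there. The following is a minimal sketch, not a Code Ocean utility: the function name scratch_usage is hypothetical, and the only assumption about the platform is that the folder is mounted at /scratch as described above.

```python
from pathlib import Path


def scratch_usage(root="/scratch"):
    """Return (relative_path, size_in_bytes) pairs for files under root, largest first.

    `root` defaults to /scratch (the mount point described in this page);
    the function name and return shape are illustrative assumptions.
    """
    root = Path(root)
    sizes = []
    # Walk the whole tree and record the size of every regular file.
    for path in root.rglob("*"):
        if path.is_file():
            sizes.append((str(path.relative_to(root)), path.stat().st_size))
    # Largest files first, so cleanup candidates appear at the top.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes
```

Calling scratch_usage() from a CW session terminal or notebook returns the list for /scratch; passing another path makes it easy to try the function on a small test folder first.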
For Reproducible Runs, the /scratch folder functions as a temporary folder that is empty at the start of the run and is emptied before the end of the run. The Capsule workspace (i.e., the core files excluding Data Assets) is limited to 5 GB, so a Reproducible Run will fail if new files created during the run push it past this limit. The /scratch folder can be used during a run to safely create files or work with intermediate data of any size. Since the folder is emptied before the end of each run, any results must be moved to the results folder.
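The pattern described above (write intermediate data to /scratch, then move only the final output to the results folder) might look like the sketch below. It is an illustration under stated assumptions rather than a prescribed workflow: the results folder is assumed to be mounted at /results, and the function name run_with_scratch, the file names, and the toy computation are all hypothetical.

```python
import shutil
from pathlib import Path


def run_with_scratch(scratch_dir="/scratch", results_dir="/results"):
    """Write intermediate data to scratch, then move the final output to results.

    Assumptions: scratch is mounted at /scratch and the results folder
    at /results; all file names here are placeholders.
    """
    scratch = Path(scratch_dir)
    results = Path(results_dir)
    results.mkdir(parents=True, exist_ok=True)

    # Step 1: create intermediate data in scratch, outside the size-limited workspace.
    intermediate = scratch / "intermediate.txt"
    with intermediate.open("w") as fh:
        for i in range(1000):
            fh.write(f"{i * i}\n")

    # Step 2: reduce the intermediate data to a small final output.
    total = 0
    with intermediate.open() as fh:
        for line in fh:
            total += int(line)
    summary = scratch / "summary.txt"
    summary.write_text(f"sum_of_squares={total}\n")

    # Step 3: move only the final output to the results folder, because
    # scratch is emptied before the end of the run.
    final_path = results / summary.name
    shutil.move(str(summary), str(final_path))
    return final_path
```

The intermediate file is left in scratch on purpose: it is discarded with the rest of the folder when the run ends, so only the summary needs to be kept.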
The Reproducible Run /scratch is not the same folder as the Cloud Workstation /scratch. Because the Reproducible Run scratch is emptied before the end of each run, its contents will not be visible in the Capsule IDE.