To launch an example:
    # Recommended: YAML + CLI
    sky launch examples/<name>.yaml

    # Advanced: programmatic API
    python examples/<name>.py
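As a rough illustration of the programmatic path, the sketch below loads a task YAML and launches it through the `sky` Python package. It assumes SkyPilot is installed and cloud credentials are configured. The helper name `launch_example` and the default cluster name are hypothetical, and the value returned by `sky.launch` differs between SkyPilot releases, so treat this as a starting point rather than the layout of any script in `examples/`.

```python
def launch_example(yaml_path, cluster_name="example-cluster"):
    """Launch the task described by a YAML file (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require SkyPilot.
    import sky

    # Build a Task object from the YAML spec.
    task = sky.Task.from_yaml(yaml_path)

    # Provision (or reuse) the named cluster and run the task on it.
    return sky.launch(task, cluster_name=cluster_name)
```

Calling `launch_example("examples/<name>.yaml")` would then be roughly the API counterpart of the CLI command.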
Machine learning examples:
dvc_pipeline.yaml
: Use DVC to easily run ML pipelines on the cloud and version-control the results in git and cloud buckets. An existing DVC remote and DVC pipeline are prerequisites. A detailed tutorial is available here.
huggingface_glue_imdb_app.yaml
: Use Huggingface Transformers to finetune a pretrained BERT model.
resnet_distributed_torch.yaml
: Run Distributed PyTorch (DDP) training of ResNet50 on 2 nodes.
detectron2_app.yaml
: Run Detectron2 on a V100 GPU.
TPU examples:
tpu/tpu_app.yaml
: Train on a TPU node on GCP. Finetune BERT on Amazon Reviews for sentiment analysis.
tpu/tpuvm_mnist.yaml
: Train on a TPU VM on GCP. Train on MNIST in Flax (based on JAX).
resnet_app.py
: ResNet50 training on GPUs, adapted from tensorflow/tpu. The training data is currently a public, "fake_imagenet" dataset (gs://cloud-tpu-test-datasets/fake_imagenet, 70GB).
resnet_distributed_tf_app.py
: Distributed training variant of the above, via TensorFlow Distributed.
huggingface_glue_imdb_grid_search_app.py
: Grid search: run many trials concurrently on the same VM.
...and many more.
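The grid-search idea (many trials running concurrently on one machine) can be sketched with the standard library alone. This is not the contents of `huggingface_glue_imdb_grid_search_app.py`: the learning rates, the `run_trial` and `run_all` helpers, and the placeholder command that stands in for a training run are all assumptions made for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical hyperparameter values; not taken from the example file.
LEARNING_RATES = [1e-5, 3e-5, 5e-5]


def run_trial(lr):
    """Run one trial as a subprocess and return its stdout."""
    # Placeholder command; a real trial would invoke a training script here.
    cmd = [sys.executable, "-c", f"print('trial lr={lr}')"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_all(max_parallel=3):
    """Run all trials, at most `max_parallel` at a time, on this machine."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in the same order as LEARNING_RATES.
        return list(pool.map(run_trial, LEARNING_RATES))


if __name__ == "__main__":
    for line in run_all():
        print(line)
```

Threads are enough here because each trial does its work in a separate process; `max_parallel` would typically be chosen so that concurrent trials fit in the machine's GPU and host memory.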
General examples:
detectron2_docker.yaml
: Using Docker to run Detectron2 on GPUs.
using_file_mounts.yaml
: Using file_mounts to upload local/cloud paths to a cluster.
multi_hostname.yaml
: Run a command on multiple nodes.
env_check.yaml
: Using environment variables in the run commands.
multi_echo.py
: Launch and schedule hundreds of bash commands on the clouds, with configurable resources. Similar to grid search.
...and many more.
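To make the "hundreds of bash commands" idea concrete, the sketch below only builds the command strings for a small parameter grid. It is an illustration under assumptions, not the logic of `multi_echo.py`: the `build_commands` helper and the `size` and `seed` parameters are hypothetical, and handing each command to a cluster with its own resources is left out.

```python
import itertools


def build_commands(sizes, seeds):
    """Return one bash command string per (size, seed) combination."""
    return [
        f"echo size={size} seed={seed}"
        for size, seed in itertools.product(sizes, seeds)
    ]


# 3 sizes x 100 seeds gives 300 independent commands.
commands = build_commands(sizes=[1, 2, 4], seeds=range(100))
print(len(commands))
```

Each string could then become the run command of its own task, which is what makes this pattern resemble grid search.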