Posted On: Nov 29, 2023
Amazon SageMaker notebook jobs allow data scientists to run their notebooks on demand or on a schedule with a few clicks in Amazon SageMaker Studio, a web-based IDE for machine learning (ML). Today, we’re excited to announce that you can programmatically run notebooks as jobs using APIs provided by SageMaker Pipelines, SageMaker's ML workflow orchestration service. Furthermore, you can create a multi-step ML workflow with multiple dependent notebooks using these APIs.
Data scientists can use SageMaker notebook jobs for use cases such as running long-running notebooks, generating recurring reports, and scaling up from preparing small sample datasets to working with petabyte-scale data. When moving these notebooks to production, customers need API support for programmatically executing notebooks as part of CI/CD workflows. This launch introduces the notebook job as a built-in step type when building pipelines using Amazon SageMaker Pipelines. Customers can use this notebook job step to run notebooks as jobs with just a few lines of code using the Amazon SageMaker Python SDK. Customers can also stitch multiple dependent notebooks together to create a workflow in the form of a directed acyclic graph (DAG). Customers can then run these notebook jobs or DAGs, and manage and visualize them using Amazon SageMaker Studio.
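As a rough illustration of the notebook job step, the sketch below chains two notebooks into a small DAG with the SageMaker Python SDK. It is a minimal example under stated assumptions rather than a reference implementation: the notebook paths, step names, pipeline name, and the `build_notebook_pipeline` helper are hypothetical, and the role ARN and image URI must be supplied by the caller.

```python
def build_notebook_pipeline(role_arn, image_uri, kernel_name="python3"):
    """Build a two-step pipeline in which the training notebook runs
    only after the preprocessing notebook has finished."""
    # Imported inside the function so this snippet can be loaded even
    # where the SageMaker Python SDK is not installed.
    from sagemaker.workflow.notebook_job_step import NotebookJobStep
    from sagemaker.workflow.pipeline import Pipeline

    # Hypothetical notebook path and step name, for illustration only.
    preprocess = NotebookJobStep(
        name="preprocess",
        input_notebook="notebooks/preprocess.ipynb",
        image_uri=image_uri,
        kernel_name=kernel_name,
        role=role_arn,
    )

    # Second notebook job; also a hypothetical path and name.
    train = NotebookJobStep(
        name="train",
        input_notebook="notebooks/train.ipynb",
        image_uri=image_uri,
        kernel_name=kernel_name,
        role=role_arn,
    )

    # Declare the DAG edge: "train" depends on "preprocess".
    train.add_depends_on([preprocess])

    # Creating the Pipeline object needs a configured AWS region/session.
    return Pipeline(name="notebook-dag-example", steps=[preprocess, train])
```

With AWS credentials and a region configured, calling `pipeline.upsert(role_arn=...)` on the returned object followed by `pipeline.start()` should register and run the workflow, which can then be viewed in SageMaker Studio.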
This feature is generally available in all AWS commercial regions where Amazon SageMaker Studio is available. To learn more, see the SageMaker Studio developer guide or the feature blog.