Polyaxon v1.6: Offline Mode for ML Tracking, Timeline, File Initializer, & Operator Concurrent Reconciles

Today, we are pleased to announce the v1.6 release of our MLOps platform, a stable version which brings several new visualization features, scalable operator concurrency enhancements, and fixes.

Offline mode for ML tracking

Users often need to test or debug their scripts before running them in-cluster. Polyaxon already checks for the environment variable POLYAXON_NO_OP, which disables all client and tracking calls.

The offline mode comes as an additional alternative that allows users to log experiment data on the local disk without requiring or connecting to a Polyaxon server.

Users can trigger all calls and see the corresponding results. This is particularly useful for debugging situations that are only triggered in-cluster, or for building additional functionality on top of the logic provided by Polyaxon. Additionally, Polyaxon provides CLI commands to view, list, and sync the metadata and artifacts to a Polyaxon instance.

Triggering the offline mode is easy: set the environment variable POLYAXON_IS_OFFLINE, or pass the flag is_offline directly to the tracking Run or to tracking.init.
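
As a rough sketch of these two options, the snippet below sets the environment variable before any tracking code runs, and separately shows the flag being passed to tracking.init. The helper names (enable_offline_mode, start_offline_run) and the value "true" are assumptions made for illustration; the post only names the variable and the flag.

```python
import os


def enable_offline_mode(env=None):
    """Set the variable named in this post; the value "true" is an assumption."""
    env = os.environ if env is None else env
    env["POLYAXON_IS_OFFLINE"] = "true"
    return env


def start_offline_run(**init_kwargs):
    """Alternative to the env var: pass the is_offline flag to tracking.init."""
    # Imported lazily so the rest of this sketch works without polyaxon installed.
    from polyaxon import tracking

    tracking.init(is_offline=True, **init_kwargs)
    return tracking


# Option 1: export the variable for the current process and its children.
enable_offline_mode()
```

Calling start_offline_run() instead is the second option; it needs the polyaxon package installed.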

After running offline experiments, users can list all offline runs using polyaxon ops list --offline and view the content of an offline experiment using polyaxon ops get -uid UUID --offline.


Finally, if an experiment is promising, users can sync it to the server using polyaxon ops sync -uid UUID [-c/--clean], or use polyaxon ops sync -a -c to sync all offline experiments and clean the local folder.
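
For scripted workflows, the commands above can be assembled from Python. This is only a sketch: the function names are hypothetical, the flags are copied from the commands quoted in this post, and run_cli requires the polyaxon CLI to be on the PATH.

```python
import subprocess


def ops_list_offline():
    """Command line that lists all offline runs."""
    return ["polyaxon", "ops", "list", "--offline"]


def ops_get_offline(uuid):
    """Command line that shows the content of one offline run."""
    return ["polyaxon", "ops", "get", "-uid", uuid, "--offline"]


def ops_sync(uuid=None, clean=False):
    """Sync one run when uuid is given, otherwise all offline runs (-a)."""
    args = ["polyaxon", "ops", "sync"]
    args += ["-uid", uuid] if uuid else ["-a"]
    if clean:
        args.append("-c")  # also clean the local folder after syncing
    return args


def run_cli(args):
    """Execute one of the command lines above; raises if the CLI fails."""
    return subprocess.run(args, check=True)
```

For example, run_cli(ops_sync(clean=True)) would sync every offline run and then clean the local folder.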

Improved manual management of runs

Most of the experiment tracking that users perform is in-cluster and only concerns the run being executed by the pod running on Kubernetes. However, several users also launch local experiments within notebook sessions, both in-cluster and outside the cluster, and they generally want to track those experiments as well.

The tracking Run and the module-level init now have a new flag called is_new that initializes a new run and creates a new instance. For example, a user running a notebook in-cluster can instrument several runs within that notebook:

from polyaxon import tracking

# First run
tracking.init(..., is_new=True)
tracking.log_metrics(metric1=0.1, metric2=3.0, ...)
# Stopping the current run
tracking.end()

# Second run
tracking.init(..., is_new=True)
tracking.log_metrics(metric1=0.2, metric2=4.2, ...)
# Stopping the current run
tracking.end()

File initializer

Users often need to perform an operation that only triggers some bash logic or a tiny Python script, or they might need to create a file based on the inputs of a polyaxonfile and feed it to their main logic. Polyaxon now has a new initializer: file. This initializer creates a file, such as a here script (or here-doc), that can be used by subsequent init containers or by the main container.

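version: 1.1
kind: component
run:
  kind: job
  init:
    - file:
        # script.py is an assumed name; the original listing omits the filename
        filename: script.py
        content: |
          print("Hello World")
  container:
    image: polyaxon/polyaxon-quick-start
    workingDir: '{{ globals.artifacts_path }}'
    command: [python3, -u, script.py]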

The new file initializer has several use cases, and it will eventually become the way to initialize Dockerfiles, replacing the current dockerfile initializer.
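
To make the idea concrete, here is a local analogy in Python of what a file initializer followed by a main container amounts to: write inline content to a file in a working directory, then run a command there. This is not how Polyaxon implements the initializer; run_with_file_init and the file name script.py are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_file_init(content, filename, command):
    """Write content to filename in a fresh directory, then run command there."""
    with tempfile.TemporaryDirectory() as workdir:
        # "init" step: materialize the inline content as a file.
        (Path(workdir) / filename).write_text(content)
        # "main" step: run the command with that directory as the working dir.
        result = subprocess.run(
            command, cwd=workdir, capture_output=True, text=True, check=True
        )
    return result.stdout


output = run_with_file_init(
    content='print("Hello World")\n',
    filename="script.py",
    command=[sys.executable, "-u", "script.py"],
)
print(output, end="")
```

The same pattern covers the other use cases mentioned above, such as generating a small config or shell script from the inputs before the main logic starts.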

Please check out the new intro section, which goes through a couple of examples of how to use this new initializer.

Timeline view

If you are running DAGs, hyperparameter tuning jobs, or schedules, you can take advantage of the new view, which shows an aggregated table of all dependent runs of that pipeline and a Gantt chart that visualizes when each run started and finished.

  • Timeline for a matrix definition


  • Timeline for a DAG definition


  • Timeline for a cron schedule
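
For readers who want to build a similar view from their own data, the sketch below computes the two quantities a Gantt chart plots for each run: its offset from the earliest start and its duration. The run records and field names are hypothetical and do not reflect Polyaxon's API.

```python
from datetime import datetime


def timeline_rows(runs):
    """Return (name, offset_seconds, duration_seconds) per run, ordered by start."""
    origin = min(r["started_at"] for r in runs)
    rows = []
    for r in sorted(runs, key=lambda r: r["started_at"]):
        offset = (r["started_at"] - origin).total_seconds()
        duration = (r["finished_at"] - r["started_at"]).total_seconds()
        rows.append((r["name"], offset, duration))
    return rows


# Hypothetical runs belonging to one pipeline.
runs = [
    {
        "name": "train",
        "started_at": datetime(2021, 2, 1, 10, 5),
        "finished_at": datetime(2021, 2, 1, 10, 35),
    },
    {
        "name": "prepare",
        "started_at": datetime(2021, 2, 1, 10, 0),
        "finished_at": datetime(2021, 2, 1, 10, 5),
    },
]
rows = timeline_rows(runs)
```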


Improved client with more artifacts interfaces

This version adds high-level artifact methods that were missing from previous versions:

  • delete_artifact: To delete a single artifact.
  • delete_artifacts: To delete a directory.
  • upload_artifacts_dir: To upload a directory.
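
A minimal sketch of how these three methods might be combined. Only the method names come from the list above; the keyword arguments (path, dirpath) and the helper names are assumptions, so check the client reference for the exact signatures. A recording stand-in is used so the sketch runs without a Polyaxon server.

```python
def replace_remote_dir(client, local_dir, remote_dir):
    """Delete remote_dir from the run's artifacts, then upload local_dir there."""
    client.delete_artifacts(path=remote_dir)
    client.upload_artifacts_dir(dirpath=local_dir, path=remote_dir)


def drop_file(client, remote_file):
    """Delete a single artifact file."""
    client.delete_artifact(path=remote_file)


class RecordingClient:
    """Stand-in that records calls; swap in a real run client in practice."""

    def __init__(self):
        self.calls = []

    def delete_artifact(self, path):
        self.calls.append(("delete_artifact", path))

    def delete_artifacts(self, path):
        self.calls.append(("delete_artifacts", path))

    def upload_artifacts_dir(self, dirpath, path=""):
        self.calls.append(("upload_artifacts_dir", dirpath, path))


client = RecordingClient()
replace_remote_dir(client, "./outputs", "outputs")
drop_file(client, "outputs/debug.txt")
```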

Logs storage improvement

Logs are now stored in a similar way to metric events, i.e. as pipe-separated values. This should reduce file sizes significantly, and the files are also more readable when used outside of the Polyaxon UI or CLI.
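
The sketch below illustrates why pipe-separated values tend to be smaller than a format that repeats the key names on every record, and how such a file can be read with the standard csv module. The column layout and sample records are hypothetical; they are not Polyaxon's actual log schema.

```python
import csv
import io
import json

FIELDS = ["timestamp", "container", "value"]  # hypothetical column layout

records = [
    {"timestamp": "2021-02-01T10:00:00", "container": "main", "value": "epoch 1 loss=0.52"},
    {"timestamp": "2021-02-01T10:00:05", "container": "main", "value": "epoch 2 loss=0.41"},
]


def to_pipe_separated(rows):
    """Serialize rows with a single header line and '|' as the delimiter."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="|", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def from_pipe_separated(text):
    """Parse pipe-separated text back into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter="|")]


as_json_lines = "\n".join(json.dumps(r) for r in records) + "\n"
as_pipe_separated = to_pipe_separated(records)
print(len(as_json_lines), len(as_pipe_separated))
```

The key names appear once in the header instead of once per record, so the saving grows with the number of log lines.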

New versions for TFJob and MPIJob

All Kubeflow charts were updated to point to their latest versions. The UI is also now smarter at showing the logs of the main containers, even when the containers follow a different naming convention from the one used by Polyaxon.

Scalable operator reconciles

If you are running a high number of concurrent operations with the default deployment options, you might notice slower job termination or status reporting.

The deployment configs (for Polyaxon CE and Polyaxon Agent) now have a new field, maxConcurrentReconciles, that you can use to perform more concurrent reconciliations. The operator also has additional logic to detect issues and report correct states back to the API:

  maxConcurrentReconciles: 10
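
To illustrate what a limit on concurrent reconciles means, here is a conceptual Python sketch of a bounded worker pool: at most max_concurrent_reconciles operations are processed at once, and the rest wait in a queue. This is not the operator's code; reconcile_all and the simulated work are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def reconcile_all(operations, max_concurrent_reconciles):
    """Process operations with a bounded pool; return (results, peak concurrency)."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def reconcile(op):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stands in for the work of reconciling one operation
        with lock:
            active -= 1
        return op

    with ThreadPoolExecutor(max_workers=max_concurrent_reconciles) as pool:
        results = list(pool.map(reconcile, operations))
    return results, peak


results, peak = reconcile_all(range(50), max_concurrent_reconciles=10)
print(f"processed {len(results)} operations, peak concurrency {peak}")
```

Raising the limit shortens the queue when many operations change state at once, at the cost of more simultaneous work for the operator and the Kubernetes API.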

The operator was also updated to pull lower-level conditions and report them when an operation has issues.


Comparison table opened indicator

The comparison table has a new indicator when a run is opened in flyout mode.


Learn More about Polyaxon

This blog post only covers a few of the features that we have shipped since our last product update; there are several other features and fixes worth checking out. To learn more about all the features, fixes, and enhancements, please visit the release notes.

Polyaxon continues to grow quickly and keeps improving and providing the simplest machine learning layer on Kubernetes. We hope that these updates will improve your workflows and increase your productivity, and again, thank you for your continued feedback and support.
