[N] Determined Deep Learning Training Platform

Thank you for mentioning Polyaxon (https://github.com/polyaxon/polyaxon/). We have been open-source from day one, and we are very committed to the open-source community.

From a quick look at the Determined documentation, there seems to be both overlap and differences. I will try to summarize my understanding. I did not do a deep dive, so please correct me if I am wrong:

  • ML/DL workload:

    • Determined: It seems a bit more involved and opinionated, since it provides its own Python interfaces for TensorFlow and PyTorch.
    • Polyaxon: It is agnostic to what the user is running inside their containers. We have a client to help users with tracking if they opt to use the built-in tracking system, but you can run any container based on any language. This allows users to leverage all types of DL and ML libraries and frameworks.
  • Tracking:

    • Both platforms provide built-in callbacks and interfaces for tracking metrics.
    • Polyaxon can also track other metadata: artifacts, images, audio, video, PR/ROC curves, custom curves.
  • Table comparison, dashboards, and visualizations:

    • It seems that both have a table for comparing runs, but I cannot comment on the details. Polyaxon has a flexible table for comparing runs, with the possibility to filter and sort by several columns (params, metrics, commits, docker images, ...).
    • Some screenshots show that Determined has visualizations. Polyaxon also has a visualization system that users can extend using Plotly/Bokeh/Vega.
  • Notebooks:

    • Both platforms can schedule and orchestrate notebooks on CPUs/GPUs. Polyaxon can also schedule on TPUs; I am not sure whether the Determined documentation mentions TPUs.
  • TensorBoards:

    • Both platforms can schedule TensorBoards. I am not sure how deep the TensorBoard integration is in Determined, but in Polyaxon you can schedule TensorBoards for single runs, multiple runs, or a log dir on PVCs, NFS, GlusterFS, S3, GCS, Azure Blob Storage, ... You can also schedule performance-based TensorBoards (a TensorBoard for runs with a specific metric performance) and TensorBoards based on a tag or a condition.
  • Hyperparameter tuning:

    • Both platforms support random search and grid search.
    • Both platforms have a version of Hyperband.
    • Determined has an implementation of population-based training.
    • Polyaxon has Bayesian optimization and Hyperopt support as well.
    • Several users have asked for the possibility to use their own suggestion and optimization research and methods, so we introduced an iterative interface that allows users to schedule their own search and suggestion methods for creating hyperparameter spaces.
  • Distributed learning:

    • Determined has its own distributed learning interface.
    • Polyaxon has integrations with the MPI-operator, PytorchOperator, TFOperator, and Horovod.
  • On-prem and cloud support:

    • Both platforms can be deployed on on-prem infrastructure.
    • Both platforms can be deployed on AWS and Google Cloud.
    • Polyaxon can also be deployed on Azure and on any Kubernetes cluster.
    • Besides notebooks and TensorBoards, Polyaxon is adding the possibility to schedule Streamlit, Voila, and any custom service. We recently published a blog post about all the new UI/UX changes and features (https://medium.com/polyaxon/polyaxons-new-user-experience-1563f9a0959b).
    • We are also testing a native integration with Spark and Dask with some customers.
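
To make the hyperparameter tuning point a bit more concrete, here is a minimal plain-Python sketch of what grid search, random search, and a user-supplied iterative suggestion loop amount to. This is not the Polyaxon or Determined API; the search space, the function names (`grid_search`, `random_search`, `iterate`), and the parameters are all made up for illustration.

```python
import itertools
import random

# Hypothetical search space, for illustration only.
space = {
    "lr": [0.001, 0.01, 0.1],
    "batch_size": [32, 64],
}


def grid_search(space):
    """Yield every combination of the listed values."""
    keys = sorted(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_search(space, n_trials, seed=0):
    """Yield n_trials configs, sampling each parameter independently."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        yield {k: rng.choice(v) for k, v in space.items()}


def iterate(suggest, evaluate, n_iterations):
    """Generic loop: `suggest` sees past (params, score) pairs and
    proposes the next config; `evaluate` scores it."""
    history = []
    for _ in range(n_iterations):
        params = suggest(history)
        history.append((params, evaluate(params)))
    return history
```

Grid search here enumerates all 3 x 2 combinations, random search draws a fixed number of samples, and `iterate` is roughly the shape of a "bring your own suggestion method" loop: any callable that looks at the history and returns the next set of parameters can be plugged in.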