Hacker News | ospillinger's comments

Hey Aaron, I work on Cortex which is a tool for continuously deploying models as HTTP endpoints on AWS. Under the hood we use Kubernetes instead of Lambda to avoid cold starts, enable more flexibility with customizing compute and memory usage (e.g. running inference on GPUs), and support spot instances. Could you clarify your comment regarding editing of config files? Is it still a problem if the configuration is declarative and tracked in git? I'd love to hear your feedback! (GitHub: https://github.com/cortexlabs/cortex | website: https://cortex.dev/)


Sure, I'm thinking about the development lifecycle in terms of what actions data scientists have to take to get a model deployed. Any time the process has a branch (i.e., you need to change this file whenever something elsewhere changes), I know I'm going to forget to do it.

If we were to use Cortex, we would likely wrap the creation of cortex.yaml in a function and call it when we're saving our models. We do something similar right now and store the metadata in JSON files for later deployment. I love tracking config in git too.
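A minimal sketch of that wrapper, assuming the model metadata was saved as JSON; the config keys emitted here are illustrative, not Cortex's actual schema:

```python
import json
from pathlib import Path


def write_cortex_config(meta_path: str, out_path: str = "cortex.yaml") -> str:
    """Generate a deployment config from saved model metadata.

    The keys below are illustrative placeholders, not Cortex's real schema.
    """
    meta = json.loads(Path(meta_path).read_text())
    config = (
        "- name: {name}\n"
        "  model: {model_path}\n"
        "  compute:\n"
        "    cpu: {cpu}\n"
    ).format(
        name=meta["name"],
        model_path=meta["model_path"],
        cpu=meta.get("cpu", "200m"),
    )
    Path(out_path).write_text(config)
    return config
```

Calling something like this from the model-saving code keeps the config in lockstep with the saved artifacts, so the "branch" in the process disappears.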


That makes sense. Programmatically updating cortex.yaml is a common use case, especially when you're thinking about continuous deployment. We also have a Python client which can replace the cortex.yaml file (https://www.cortex.dev/deployments/python-client).


From the MLflow Models docs: "An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools—for example, real-time serving through a REST API or batch inference on Apache Spark. The format defines a convention that lets you save a model in different “flavors” that can be understood by different downstream tools."

Cortex is what they are referring to as a downstream tool for real-time serving through a REST API. In other words, MLflow helps with model management and packaging, whereas Cortex is a platform for running real-time inference at scale. We are working on supporting more model packaging formats and I think it's a good idea to support the MLflow format as well.


Each model is loaded into a Docker container, along with any Python packages and request handling code. The cluster runs on EKS in your AWS account. Cortex takes the declarative configuration from 'cortex.yaml' and applies it every time you run 'cortex deploy', so the containers don’t change unless you run 'cortex deploy' again with updated configuration. This post goes into more detail about some of our design decisions: https://towardsdatascience.com/inference-at-scale-49bc222b3a...
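For reference, a minimal sketch of what that declarative configuration might look like (the field names are illustrative, not necessarily Cortex's actual schema):

```yaml
# cortex.yaml (illustrative sketch; see cortex.dev for the real schema)
- name: sentiment-classifier
  model: s3://my-bucket/models/sentiment.onnx
  compute:
    cpu: 1
    gpu: 0
  autoscaling:
    min_replicas: 1
    max_replicas: 10
```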


Thank you!


Yes, Cortex uses ONNX Runtime (https://github.com/microsoft/onnxruntime) under the hood, so any model that can be exported to ONNX can be deployed.


Is it only able to handle ONNX models? That's a pretty massive limitation compared to a hosted SageMaker endpoint.


Contributor here - Cortex supports TensorFlow SavedModels in addition to ONNX. PyTorch support is on the roadmap. Do you have specific frameworks in mind that you would like Cortex to support?


Perfect. Nothing in particular other than TF.


My understanding is that Seldon and Kubeflow are more geared towards infrastructure engineers. Our goal is to hide the infrastructure tooling so that Kubernetes, Docker, or AWS expertise isn’t required. Cortex installs with one command, models are deployed with minimal declarative configuration, autoscaling works by default, and you don’t need to build Docker images or manage a registry.


Thanks!

I bet you could get Cortex running on Kubeflow pretty easily since it's all K8s anyway.


Good idea, I definitely think it's doable.


Thanks for the feedback! We aren't trying to invent another infrastructure provisioning language, and I agree that Terraform would be the right choice if that were the case. Our YAML is more similar to the configuration of deployment tools like Netlify or CircleCI. We use CloudFormation and Kubernetes under the hood, but our goal is to provide a much higher-level abstraction for data scientists / ML engineers.


Not entirely. The abstractions are different between infrastructure deployment and the configuration YAML of CircleCI.

The declaration of deployment state is a very BIG and hard problem that has had millions of collective man-hours spent on it over decades. I urge you not to think of it as simple configuration.

In fact, it is so hard that AWS had to build a new framework on top of TypeScript (the CDK), versus the CloudFormation templates it already had.

https://docs.aws.amazon.com/cdk/latest/guide/home.html

What you are building makes sense - I would drop CloudFormation and surface Terraform all the way to the top.

So the way to use your tool is to install and use a new Terraform "provider".


The Terraform provider idea is interesting, I'll think about it more carefully. Almost all of our deployment configuration under the hood is done with Kubernetes (which is focused on the declaration of deployment state). We modeled our configuration after Kubernetes for that reason, and we want to go beyond low-level infrastructure configuration by allowing users to configure prediction tracking, model retraining thresholds, and other more ML specific features using the same declarative paradigm and in the same configuration files.
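As a purely hypothetical sketch of that direction (none of these keys is implied to exist today), the ML-specific features could live alongside the deployment fields in the same declarative file:

```yaml
# Hypothetical extension; these keys are illustrative, not a shipped feature.
- name: sentiment-classifier
  model: s3://my-bucket/models/sentiment.onnx
  monitoring:
    track_predictions: true
    alert_on_drift: true
  retraining:
    accuracy_threshold: 0.92
```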


Terraform has first-class support for K8s - https://www.terraform.io/docs/providers/kubernetes/index.htm... In fact, I would say that's what Terraform was built around, so there's nothing more maintained than the k8s provider.

In addition, it has EKS (https://www.terraform.io/docs/providers/aws/r/eks_cluster.ht...) and GKE (https://www.terraform.io/docs/providers/google/r/container_c...) resources, in case you are so inclined.
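For example, a trimmed sketch of wiring the EKS resource and the Kubernetes provider together (the arguments are placeholders, not a complete working module):

```hcl
# Illustrative only; real modules need IAM roles, VPC wiring, and auth data sources.
resource "aws_eks_cluster" "cortex" {
  name     = "cortex"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

provider "kubernetes" {
  host  = aws_eks_cluster.cortex.endpoint
  token = data.aws_eks_cluster_auth.cortex.token
}
```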


Terraform slightly predates Kubernetes, May 21 vs. June 6, 2014. They were developed independently.

Links at: https://landscape.cncf.io/category=automation-configuration,...


Well, CDK actually produces CloudFormation templates. Sorry, but I always feel the urge to jump in when people claim Terraform should be used instead of CloudFormation out of personal preference. If you are AWS-native and already using CloudFormation, I see no reason to switch. CloudFormation provides a ton of functionality out of the box, and Amazon handles it for you. Rollbacks alone are a huge reason one might want to use it over Terraform.


Well, the reason to switch from CDK to Terraform is that your infrastructure management becomes a lot more cloud-agnostic.

That's basically what Terraform is for anyway. If you wanted, you could have scripted it yourself using the AWS SDK in Python or something.

If that's not a concern, then I suppose CDK is as good as any (probably better, since it's in TypeScript... but then so is Pulumi).


We solve a similar problem to SageMaker but we are focused on developer experience and flexibility.

- Deployments are defined with declarative configuration and no custom Docker images are required (although you can use your own if you want)

- You have full access to the instances, autoscaling groups, security groups, etc

- Less tied to AWS (GCP support is in the works)

- We are working on higher level features like prediction monitoring, alerting, and model retraining

- It's open source and free vs SageMaker's ~40% markup

