When talking to people who haven’t deployed ML models, I keep hearing a lot of misperceptions about ML models in production. Here are a few of them.
(1/6)
1. Deploying ML models is hard
Deploying a model for friends to play with is easy. Export trained model, create an endpoint, build a simple app. 30 mins.
Deploying it reliably is hard. Serving 1000s of requests with ms latency is hard. Keeping it up all the time is hard.
(2/6)
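To make the "easy" path in point 1 concrete, here is a minimal sketch using only Python's standard library. Everything in it is an assumption for illustration: the "trained model" is a made-up linear scorer exported as JSON, and `load_model`, `predict`, `handle_request`, `PredictHandler` and `serve` are hypothetical names, not part of any real serving stack.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "exported" model: a linear scorer with made-up weights.
MODEL_JSON = json.dumps({"weights": [0.5, -0.25], "bias": 1.0})


def load_model(serialized):
    """Load the exported model artifact (here just a JSON string)."""
    return json.loads(serialized)


def predict(model, features):
    """Score one example: dot(weights, features) + bias."""
    return sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]


MODEL = load_model(MODEL_JSON)


def handle_request(body):
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(body)
    score = predict(MODEL, payload["features"])
    return json.dumps({"prediction": score}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    """Minimal POST endpoint: no auth, batching, retries or monitoring."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = handle_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the toy server; call this manually, it blocks forever."""
    HTTPServer(("localhost", port), PredictHandler).serve_forever()
```

Calling `serve()` and POSTing `{"features": [2, 4]}` to `localhost:8000` is enough for friends to play with. None of the hard parts (latency under load, uptime, rollbacks) are addressed here, which is the point of the tweet.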
2. You only have a few ML models in production
Booking, eBay have 100s of models in prod. Google has 10000s. An app has multiple features, each might have one or multiple models for different data slices.
You can also serve combos of several models' outputs, like an ensemble.
(3/6)
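A rough sketch of how one feature can end up with several models: one set of models per data slice, with their outputs averaged as a simple ensemble. The slice keys, the stand-in model functions and the names `ensemble_predict` and `predict_for_slice` are all hypothetical, and unweighted averaging is just one of many ways to combine outputs.

```python
def ensemble_predict(models, features):
    """Average the scores of several models (simple unweighted ensemble)."""
    scores = [model(features) for model in models]
    return sum(scores) / len(scores)


# Hypothetical stand-ins for real trained models; each returns a fixed score.
def model_us_a(features):
    return 0.25


def model_us_b(features):
    return 0.75


def model_default(features):
    return 0.125


# One feature, several models: a dedicated ensemble for the "US" slice,
# and a single fallback model for every other slice.
MODELS_BY_SLICE = {"US": [model_us_a, model_us_b]}
DEFAULT_MODELS = [model_default]


def predict_for_slice(slice_key, features):
    """Route the request to the models for its data slice, then ensemble."""
    models = MODELS_BY_SLICE.get(slice_key, DEFAULT_MODELS)
    return ensemble_predict(models, features)
```

Multiply this by every feature in an app and every slice worth its own model, and the count of models in prod grows quickly.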
3. If nothing happens, model performance remains the same
ML models perform best right after training. In prod, ML systems degrade quickly bc of concept drift.
Tip: train models on data generated 6 months ago & test on current data to see how much worse they get.
(4/6)
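The tip in point 3 can be tried on toy data first. The sketch below is an assumption-heavy illustration: the data is synthetic, the "drift" is a decision boundary that moves from 0.0 to 1.0, and the "model" is a single learned threshold. `make_data`, `fit_threshold` and `accuracy` are hypothetical helper names. With real data you would swap in your own old and current datasets.

```python
import random


def make_data(rng, n, boundary):
    """Synthetic 1-D data: label is 1 when x exceeds the boundary."""
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    return [(x, int(x > boundary)) for x in xs]


def fit_threshold(data):
    """Learn a threshold halfway between the two classes (separable data)."""
    highest_negative = max(x for x, y in data if y == 0)
    lowest_positive = min(x for x, y in data if y == 1)
    return (highest_negative + lowest_positive) / 2.0


def accuracy(threshold, data):
    """Fraction of examples where the thresholded prediction matches the label."""
    return sum(int(x > threshold) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable

# "6 months ago": the true boundary sits at 0.0.
old_train = make_data(rng, 500, boundary=0.0)
old_holdout = make_data(rng, 500, boundary=0.0)
# "Today": the relationship between x and the label has shifted.
current = make_data(rng, 500, boundary=1.0)

threshold = fit_threshold(old_train)
acc_old = accuracy(threshold, old_holdout)
acc_current = accuracy(threshold, current)

print(f"accuracy on old holdout:  {acc_old:.3f}")
print(f"accuracy on current data: {acc_current:.3f}")
```

The gap between the two numbers is a rough estimate of how much the model has degraded since training.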
4. You won’t need to update your models as much
One mind-boggling fact about DevOps: Etsy deploys 50 times/day. Netflix 1000s of times/day. AWS every 11.7 seconds.
MLOps isn’t an exception. For online ML systems, you want to update them as fast as humanly possible.
(5/6)