1 Real-time Machine Learning
TODO bullets
This book contains practical advice on how to use ML models together with real-time systems. In simple terms, this means connecting a previously trained ML model to regular software and performing inference in real time. See the two examples below.
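As a minimal sketch of that idea, the snippet below loads a model that was trained and serialized elsewhere and scores one instance per incoming request, inside the serving path. The model file name, feature names, scikit-learn-style model, and choice of Flask are assumptions made for illustration, not a prescription.

```python
# A minimal sketch, assuming a scikit-learn-style model was trained offline and
# serialized to "churn_model.pkl"; feature names and the endpoint are illustrative.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the previously trained model once, at startup.
with open("churn_model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/score", methods=["POST"])
def score():
    # The caller sends one instance's features and waits for the answer:
    # inference happens in real time, as part of handling the request.
    payload = request.get_json()
    prediction = model.predict([[payload["age"], payload["tenure_months"]]])[0]
    return jsonify({"prediction": float(prediction)})

if __name__ == "__main__":
    app.run(port=8080)
```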


The correct term is real-time, which is why it’s the one we’ll use; online means something else…
It could well be that, in the future, most systems will be RTML systems, but there will always be use-cases that cannot be driven by ML because they don’t support probabilistic decisioning. This is in fact a major reason why RTML projects fail, and we’ll explain it in more detail in the next chapter.
TODO link text: in this chapter, we will…..
1.1 Traditional vs ML-enabled systems
Well, that looks just like any other system to me. Isn’t this just software engineering? What’s so different about ML-enabled RT software?
- ML-enabled systems have different failure modes, and they can fail silently.
- A model outputting garbage is arguably worse than software that’s out of service, because it will keep making bad business decisions.
- ML-enabled systems need a very different monitoring setup from regular software (a minimal example of such a check follows this list).
- The performance of ML-enabled systems decays over time.
- ML-enabled systems are non-deterministic and data-dependent by construction, so the usual testing strategies don’t work.
- The skills involved in training and maintaining an RT model are different from those in traditional software teams (mathematics, statistics, etc.).
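To make the monitoring point concrete, here is a minimal sketch of a check that regular software wouldn’t need: comparing the distribution of recent model scores against a baseline captured at training time, using a rough population stability index (PSI). The bucket count, the 0.2 threshold, and the synthetic score samples are illustrative assumptions.

```python
# A minimal drift-monitoring sketch, assuming the system logs its model scores.
# Bucket count, threshold, and the synthetic data are illustrative assumptions.
import numpy as np

def population_stability_index(baseline, recent, buckets=10):
    """Rough PSI between two score samples; higher means more drift."""
    # Bucket edges come from the baseline's quantiles.
    edges = np.quantile(baseline, np.linspace(0.0, 1.0, buckets + 1))
    # Clip both samples into the baseline range so every value lands in a bucket.
    baseline = np.clip(baseline, edges[0], edges[-1])
    recent = np.clip(recent, edges[0], edges[-1])
    base_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    recent_frac = np.histogram(recent, bins=edges)[0] / len(recent)
    base_frac = np.clip(base_frac, 1e-6, None)    # avoid log(0) and division by zero
    recent_frac = np.clip(recent_frac, 1e-6, None)
    return float(np.sum((recent_frac - base_frac) * np.log(recent_frac / base_frac)))

# Example: scores logged at training time vs. scores from the last hour.
rng = np.random.default_rng(42)
baseline_scores = rng.beta(2, 5, size=10_000)
recent_scores = rng.beta(2, 3, size=1_000)  # deliberately shifted
psi = population_stability_index(baseline_scores, recent_scores)
if psi > 0.2:  # a commonly quoted rule-of-thumb threshold
    print(f"ALERT: score distribution has drifted (PSI={psi:.2f})")
```

Note that nothing here breaks in the traditional sense: the service stays up and keeps answering, which is exactly why this kind of failure is silent.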
TODO figure out how to add a link to deterministic heuristics here and how they are sometimes the signal
Understood, but where does RT enter the picture? Aren’t all applications of ML like this? No; see the next section.
1.2 Real-time vs Batch ML
TODO explain the usual RTML flow
Well, isn’t all production ML like this? How else is it done?
- “Ad-hoc” ML operation: input datasets are built and scored when needed, and the output is manually sent to whichever team will use it to make decisions.
- Batch production ML: input datasets are automatically built and scored every day, week, or month (a minimal sketch of this pattern follows this list).
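For contrast with the real-time handler shown earlier, here is a minimal sketch of the batch production pattern, assuming a daily schedule; the file paths, column names, and classifier-style model are illustrative assumptions.

```python
# A minimal batch-scoring sketch: build the input dataset, score it in bulk,
# and hand the output to the consuming team. Paths and columns are assumed.
import pickle
from datetime import date

import pandas as pd

def run_daily_scoring():
    # 1. Build the input dataset (here simplified to reading a prepared extract).
    customers = pd.read_csv("daily_customer_extract.csv")

    # 2. Score every row with the previously trained model.
    with open("churn_model.pkl", "rb") as f:
        model = pickle.load(f)
    customers["churn_score"] = model.predict_proba(
        customers[["age", "tenure_months"]]
    )[:, 1]

    # 3. Write the scored dataset somewhere the decision-making team can pick it up.
    customers.to_csv(f"churn_scores_{date.today().isoformat()}.csv", index=False)

if __name__ == "__main__":
    # In production this would be triggered by a scheduler (cron, Airflow, ...)
    # every day, week, or month rather than run by hand.
    run_daily_scoring()
```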
Well, why doesn’t everyone just use real-time ML, then? It looks much better.
- It’s more expensive and riskier, and sometimes the use-case doesn’t justify the cost (e.g. too few instances being scored…).
- Sometimes the use-case is not that well defined (who will use the scores, what the decision-layer policy will look like, etc.), so a batch model is a good MVP until the use-case is solid enough to justify an RT ML model.
- Sometimes it just doesn’t make sense because the customer doesn’t need a real-time answer. E.g. a decision about whether or not to lend 1 billion dollars to a big company takes weeks, maybe months, to make; it makes little sense for this to be an RT ML flow.
- Modeling teams are sometimes far away from “engineering” teams (sometimes in different business units), so it’s easier to have a flow where datasets get scored and then “passed on” to the engineering teams from time to time.
  - Differences in mindsets, jargon, incentives, etc.
  - Modeling teams don’t always want to cede “power” to engineering teams.