Agile Machine Learning: From Theory to Production
Later this year, Sumanas and I will be co-presenting a talk about researching Machine Learning in an agile development environment at the JAXLondon conference. This is a high-level overview of some of the topics we will be presenting (we will also try to get some cool ML demos in there too, just to make things a bit more interesting).

So what's the problem?
Artificial Intelligence (AI) and Machine Learning (ML) are all the rage right now - at Google's recent I/O 2017 event they really put ML front and center, with plans to bake it into all their products, and lots of other large companies are positioning themselves as Machine Learning companies as a natural progression from Big Data.

According to a recent Narrative Science survey, 38% of enterprises surveyed were already using AI, with 62% expecting to be using it by 2018. So it is understandable that many companies might feel pressure to invest in an AI strategy before fully understanding what they are aiming to achieve, let alone how it might fit into a traditional engineering delivery team.
We have spent the last 12 months taking a new product to market, trying to go from a simple idea to a production ML system. Along the way we have had to integrate open-ended academic research tasks with our existing agile development process and project planning, as well as work out how to deliver the ML system to production in a repeatable, robust way, with all the considerations expected of a normal software project.
Here are a few things you might consider when planning your ML roadmap (and topics we will cover in more detail in the JAXLondon session in October):
Machine Learning != your product
Machine Learning is a powerful tool to enhance a product - whether by reducing the cost of human curation, or by powering a better voice/natural-language interface - however, Machine Learning shouldn't be considered the selling point of the product. Think of the end result Product First: is there a market for the product regardless of whether it is powered by ML or by human oversight? Consider whether it makes sense to build a fully non-ML version of the product first, to prove the market fit and start delivering value to customers.

Start small and build
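One low-risk way to start small is to let the model handle only the cases it is confident about, while everything else continues through the existing human process. A minimal sketch of that routing idea - all names here are hypothetical, not from any particular library:

```python
# Sketch: automate only high-confidence cases, keep humans for the rest.
# ToyModel is a stand-in for whatever classifier you actually train.

def route(item, model, threshold=0.95):
    """Return an automated answer only when the model is confident;
    otherwise fall back to the existing human process."""
    label, confidence = model.predict(item)
    if confidence >= threshold:
        return ("auto", label)
    return ("human", None)  # queue for manual handling, as before


class ToyModel:
    """Stand-in model: confident only on short inputs."""
    def predict(self, item):
        return ("ok", 0.99 if len(item) < 5 else 0.60)


if __name__ == "__main__":
    model = ToyModel()
    for text in ["hi", "a much longer ticket"]:
        print(text, "->", route(text, model))
```

Raising or lowering the threshold is then a product decision: start conservative, measure how often the automated path is right, and expand coverage from there.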
The Lean Startup principles of MVP and fast iterations still apply here. Following on from the point above, if you start with a non-ML product and can then leverage ML techniques to get even a small increase in performance (better recommendations, reduced human effort and cost, an improved user experience), that starts adding value straight away - replacing a human process with ML for just 5% of cases can begin to realise cost benefits. From a small start you can prove the value being added whilst also getting the ML infrastructure tested and proven.

Tie into development sprint cycles
You may be hiring a new R&D team, or you may be using members of your existing engineering team. Either way, it helps to have them working in similar development sprint cycles (if you work in sprints). It allows both sides to understand what is happening and how work is progressing: product and engineering changes and issues might usefully inform the direction of R&D, and likewise there may be data features or feedback from the R&D team that could be easily engineered and would make things simpler for research. Whilst research is ongoing, and often time consuming, having fortnightly (or whatever the sprint length is) checkpoints where ideas can be discussed and demoed is good for the whole team's understanding, as well as being a positive motivator.

Don't forget Clean Code!
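One habit that pays for itself early is keeping a small, reusable harness that runs every candidate solution against the same data and metric, so different research ideas stay directly comparable. A minimal sketch - all names are hypothetical:

```python
# Sketch of a reusable experiment harness: every candidate runs against
# the same data and metric, so results are comparable across ideas.

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def run_experiment(name, predict_fn, data, labels, metric=accuracy):
    """Run one candidate solution and return a named, comparable score."""
    predictions = [predict_fn(x) for x in data]
    return {"experiment": name, "score": metric(predictions, labels)}


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    labels = ["odd", "even", "odd", "even"]

    # Two throwaway "solutions" standing in for real models.
    baseline = lambda x: "odd"                                # always guess "odd"
    parity = lambda x: "even" if x % 2 == 0 else "odd"        # actually checks parity

    print(run_experiment("baseline", baseline, data, labels))  # score 0.5
    print(run_experiment("parity", parity, data, labels))      # score 1.0
```

The point isn't the toy metric - it's that once this shape exists, each new idea is a new `predict_fn`, and benchmarking it costs one line instead of another one-off script.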
Whilst experimenting and researching different ideas it can be pretty easy to fall into hacking mode, rattling out rough scripts to prove an initial concept or idea - and there is definitely a place for this - but as your team progresses it pays to invest in good coding principles. One-off scripts can reasonably be hacked out, but as the team works across several ideas, having code that is reusable and sensibly organised with proper separation of concerns makes future research easier, as well as reducing the cost when it comes to productionisation. Machinery that makes experiments easily testable (and makes it easy to benchmark different solutions) is worth investing in from the start.

Recommended Reading
While the interwebs are awash with Machine Learning articles, tutorials and click-baitey links guaranteed to reduce the error of your model in 3 quick steps, the following is a small list of resources that we think are worth browsing.

- http://colah.github.io/ - Math-lite blog covering core ML & Deep Learning concepts, run by a Research Scientist at Google Brain
- Deep Visualisation Toolbox - Video (~4 mins) showing how a deep net teaches itself ‘features’ about the dataset
- http://playground.tensorflow.org/ - Play with a neural network right in the browser. A good resource to get a feel for how simple networks learn using a point-and-click interface.
- Course on ML taught at UBC by Nando de Freitas
For the academically inclined, the following is a list of papers, both recent and not so recent:
- AlexNet Paper - First paper about a Deep Net showing state-of-the-art performance
- Dropout Layers - A simple way to prevent a NN from overfitting
- Adversarial Training Paper - Intentionally inducing worst-case perturbations
- Deep Residual Networks - Deep Residual Learning for Image Recognition
- Data Augmentation Paper - Unsupervised feature learning by data augmentation
- DenseNets Paper - Densely connected NN layers
During the session at JAXLondon later this year, we will go into more detail on these ideas, as well as others, including technical and architectural considerations for building and deploying an ML stack.