Applying novel research methods to production systems can be messy. When no one has done it before, you have to experiment, try things out, change tactics, abandon early attempts. This can leave you with tools that don't interoperate, duplicated infrastructure, a confusing backlog of tasks with no priorities and no overall direction, and engineers facing surprise rewrites.
Anybody who's been around software development in the past two decades is familiar with the standard approach for not getting buried by these kinds of challenges: Agile methodology, which increases the rate of iteration and builds flexibility into the process.
At Hop, much of the software engineering work we do is in support of clients' machine learning research projects. A client team might consist of several researchers working semi-independently, wrangling data, furiously changing model architectures, training models, and writing all the software infrastructure needed to perform those tasks. As software engineering goes, supporting research projects is especially challenging. Sometimes there's no fixed end goal that can be decomposed into stories or tickets; the goal is continued, flexible improvement of the research process. And the researchers themselves can complicate things -- while they're brilliant at working with data and at coaxing models to behave, they may have idiosyncratic habits when it comes to writing code and don't often have the luxury of spending time on engineering best practices.
The principles in the Manifesto for Agile Software Development are still relevant to engineers like us in research environments, but benefit from a second look. Below, we examine some of the twelve principles laid out in the manifesto, reflecting our experience at Hop working on a broad range of research-oriented projects.
Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
In a research context, we need to be even more flexible about pivoting than this principle suggests. Research involves numerous unknown unknowns, and while changing course when something isn't working is not necessarily more important than in the general case, it happens more often. The other key nuance is that researchers need to be opportunistic. Often, there are highly divergent paths to solving a given problem, and if a novel approach presents itself, researchers need to be ready to completely scrap a previous one. Changes in requirements can be frustrating for their engineer counterparts, but they are easier to handle if you are expecting them and build in affordances for big changes.
Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
Some would argue that feedback cycles need to be shorter in general, but this is especially true in the research context. The other complication is that there can be multiple cycles at play.
First, we have researchers deploying changes to production based on user data; these should ship quickly in order to stay responsive to changes in user behavior. Then we have the broader, high-level research cycle, which is slower but should be punctuated with regular demonstrations (more on that later).
Lastly, we have engineers shipping code and tools to support the research effort. The longer a researcher is left working on a problem in isolation, the more they will build up ad-hoc tooling that may need to be rebuilt later. This can also establish workflows and processes that are hard to change down the road. This is not to say that researchers aren't capable of building flexible, scalable systems, only that their primary objective is to produce research insights -- having them focus on, say, building APIs and infrastructure is a misallocation of resources, at best. And without some intentionality, there is little chance one researcher's tools will align with the ad-hoc tools built up by other researchers on the team. Iterating on tooling with very short cycles can help you stay ahead of all this.
Business people and developers must work together daily throughout the project.
Here, we consider the researcher as the "business person" or stakeholder. What is especially important is keeping the engineer informed about the research direction as much as possible. This means giving them insight into the broader research program, not just the day-to-day tactical objectives. There are two reasons for this.
First, the more context the engineer has for the research space, the better they can plan where to invest their efforts. Deciding when to build a robust, flexible system versus when to put something quick and dirty together for a one-off experiment is a big challenge, and extra context helps you spend your engineering budget more efficiently. For example, if you see a lot of exploratory data work down the road, it could be worth building a really powerful visualization tool. Or you might decide to deploy distributed computation tooling like Metaflow or Ray, but that only makes sense if you expect to run more than one or two large jobs.
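To make that trade-off concrete, here is a minimal sketch of the kind of fan-out a tool like Ray makes cheap -- worthwhile with dozens of experiments, overkill for one or two. The train_and_score experiment and its parameter grid are hypothetical stand-ins, not anything from a real project.

```python
import random
import ray

ray.init()  # start a local Ray runtime (or connect to an existing cluster)

@ray.remote
def train_and_score(learning_rate: float) -> tuple[float, float]:
    # Stand-in for a real training run; a real version would train a
    # model and return its validation score.
    score = random.random()
    return learning_rate, score

# Fan the experiments out in parallel across the cluster; this is where
# a scheduler starts paying for itself.
futures = [train_and_score.remote(lr) for lr in (1e-4, 3e-4, 1e-3, 3e-3)]
results = ray.get(futures)

best_lr, best_score = max(results, key=lambda r: r[1])
print(f"best learning rate: {best_lr} (score {best_score:.3f})")
```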
Second, engineers can provide valuable input into the research road map, mainly by lending expertise on what is feasible with respect to engineering effort. An engineer may be able to identify easier paths to the same insights, or warn against approaches that will be costly. For example, an engineer who truly groks the research objectives can flag issues with data schemas ("we'll never be able to query individual records in real time") or suggest alternatives that approximate the objective ("but we could if precomputing a running average is close enough").
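As an illustration of that last suggestion, here is a sketch of what precomputing a running average might look like, assuming a hypothetical stream of per-user values; every name in it is invented for the example. Each record is folded into a per-key aggregate on ingest, so queries never touch individual records.

```python
from collections import defaultdict

class RunningAverage:
    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> None:
        # Incremental mean update: no need to store or rescan raw records.
        self.count += 1
        self.mean += (value - self.mean) / self.count

# One precomputed aggregate per user, updated as records arrive.
averages: defaultdict[str, RunningAverage] = defaultdict(RunningAverage)

def ingest(user_id: str, value: float) -> None:
    averages[user_id].update(value)

def query_average(user_id: str) -> float:
    # O(1) lookup against the aggregate -- the approximation that
    # replaces real-time queries over individual records.
    return averages[user_id].mean

ingest("user-1", 10.0)
ingest("user-1", 20.0)
print(query_average("user-1"))  # 15.0
```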
Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
This is mainly a reminder that there is some really cool research being done today, and giving engineers just a small taste of what's going on can be highly motivating. We're not building systems in a vacuum. If a research team’s work has a cool demo, make sure you leverage that demo to get everyone involved super pumped. The engineers will need motivation when they are slogging it out, wrangling messy data and refactoring interfaces to write unit tests.
Working software is the primary measure of progress.
In research contexts, progress is measured differently than in product contexts. Not every positive result can be shipped to production, but one highly effective way to demarcate progress is with demonstrations. Regular demos are a great way to validate that systems run end-to-end and to exercise code paths. On very open-ended projects, they are especially useful for figuring out which areas to bolster and which to jettison.
On the other hand, non-working software that documents negative results can be just as valuable. Often, it makes sense to leave bits of code from previous experiments in place to show what was tried and which directions were abandoned. This code doesn't need to be maintained as your system evolves; it just needs to provide an archaeological trail to follow if you ever need to revisit old ideas. Perhaps your data costs change by an order of magnitude and it's suddenly worth the effort to squeeze all you can out of what you have — good thing you still have that notebook where you first tried it out!
Simplicity--the art of maximizing the amount of work not done--is essential.
This is particularly important in research projects because opportunity costs are high and much work is thrown out or left unmaintained. Once insights are reached via experimentation, only sometimes do we keep the original code working or port it into a production system. This means that lots of research code is not worth the investment of rigorous engineering effort. Knowing where to invest and where to defer is its own topic (discussed above), but getting that trade-off right can free up capacity for additional experiments or for building better production systems.
Building software for research teams is still software engineering, but the way we make decisions about trade-offs can differ significantly. Revisiting the Agile Manifesto principles from a research perspective can help you navigate this unique problem space much more effectively.
— Mark Flanagan, ML Engineer @ Hop