So the first post, outlining some of the cautions around microservices, did not scare you off. You still want to learn how to migrate the architecture that is costing you sleep into something that resembles microservices. The next step is to get something into production and test the waters a bit.
Find Candidates
It is likely that your current system already contains some candidates that would work well as independent services. It was probably a historical “accident” that they are packaged and deployed with the current monolith. Or perhaps they are already separate services, managed outside the “normal” release process. Of the possible choices, lean towards the ones with the fewest external dependencies, as they will be the easiest to move.
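One way to make this choice less of a gut call is to write down each candidate with its known external dependencies and sort the list. The sketch below is only an illustration: the `Candidate` type, the service names, and the dependency lists are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A service that might be split out (hypothetical structure)."""
    name: str
    external_deps: list = field(default_factory=list)


def rank_candidates(candidates):
    """Order candidates from fewest to most external dependencies.

    Ties are broken by name so the ordering is deterministic.
    """
    return sorted(candidates, key=lambda c: (len(c.external_deps), c.name))


# Made-up inventory for illustration.
inventory = [
    Candidate("invoicing", ["database", "queue", "payment-gateway"]),
    Candidate("pdf-renderer"),
    Candidate("thumbnailer", ["file-mount"]),
]

ranked_names = [c.name for c in rank_candidates(inventory)]
```

Dependency count is only a first filter. How much damage a failure would cause matters just as much.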
Now, these probably were not developed with a “cloud native” or 12-Factor App mentality. That is okay for now. What we are doing here is simply validating our approach and initial tech choices. We don’t want to spend a lot of time building up the best system we could possibly imagine. Systems like that never end up being used. We need to take something that is currently being used, but has clear boundaries, and is relatively low-impact. We need to uncover the pitfalls that we cannot see. So while your invoicing system may fit as a well-disconnected system, it is not low-impact, so don’t choose that as the first thing to move.
Establish the Pipeline
After you have found a candidate, you need to get the delivery pipeline in place. Figure out all the tech choices available to you and pick the ones that best suit your needs. This is very situation-dependent, so I cannot give more specific advice on tech choices here. There are other places to go for that.
Remember, the goal at this point is to get something into production. If you are lucky enough to have found a service that has no external dependencies (no database, no queue, nothing like that), that is best. You can deploy that sucker into “production” in a way that lets you poke at it without anything actually using it. You can work through most of the kinks early. If you are not so lucky, and the service needs some external dependency, like a database, there are some tricks to achieve a similar result.
We know there are going to be pitfalls and bugs. This first attempt is likely to fail in spectacular ways. The trick is to limit the blast radius, so failures don’t actually impact the production system that is in use. That may mean running a new instance of the database from a recent backup, and “teeing” data to the new service to check results. But the main goal is still to get the service deployed on our new ecosystem, with our new delivery pipeline, and move production traffic to it.
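To make the “teeing” idea concrete, here is a minimal sketch, assuming both the old and the new service can be called as plain functions. The names `tee_call`, `old_service`, `new_service`, and `record` are hypothetical. The caller always receives the answer from the existing service, and anything the new service gets wrong is only recorded.

```python
def tee_call(request, primary, shadow, report):
    """Serve `request` from `primary` and mirror it to `shadow`.

    The caller always gets the primary result. A shadow error or a
    differing shadow result is passed to `report` and otherwise ignored.
    """
    result = primary(request)
    try:
        shadow_result = shadow(request)
    except Exception as exc:  # the new service must never break callers
        report(request, result, exc)
        return result
    if shadow_result != result:
        report(request, result, shadow_result)
    return result


# Stand-in handlers: the "new" service disagrees on negative input.
def old_service(n):
    return abs(n) * 2


def new_service(n):
    return n * 2


mismatches = []


def record(request, expected, actual):
    mismatches.append((request, expected, actual))


results = [tee_call(n, old_service, new_service, record) for n in (1, -3)]
```

In this sketch the shadow call is synchronous, so it adds latency to every request; a real setup would more likely mirror traffic asynchronously. The shadow side also needs its own copy of any data store, which is where the database restored from a recent backup comes in.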
Repeat
Repeat the process: find new “edge” services that are relatively easy to move, and start moving them. As you clear away the easiest ones, other services will start to appear less daunting than they were at the start. You (and your organization and teams) will become more at ease with the new process. This will also give you the confidence to start tackling some of the harder services.
Gotchas
A few gotchas are worth calling out here. First is licensing. Are there any license issues around moving this service? The legacy service may have been living on its own old Solaris box because it was licensed per CPU of the physical machine it was running on. Moving to a containerized or virtualized environment may mean that you are suddenly out of compliance, to the tune of hundreds of thousands of dollars.
Second, hidden dependencies and lack of knowledge. Researching these outsider services takes a bit of software archaeology. Nobody in the company may remember what the thing is, what it does, where the code lives, etc. You will need to become an expert in this dusty old thing. You will need to figure out what it does and how it does it. It is common for these old services to require a file mount somewhere, or even just some form of persistent disk. That is an easy dependency to overlook, and may be missed through all of testing, until the thing has been running in prod for a month and an automated update causes your service to restart, losing the ephemeral disk.
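Once you have discovered a storage dependency like this, one cheap safeguard is to have the service refuse to start when the directory it needs is missing. The following is a rough sketch under that assumption; `storage_problems` and the `require_mount` flag are names invented for this example. Note that being a mount point is only a hint that a volume is attached. It does not prove the data will survive a restart.

```python
import os
import tempfile


def storage_problems(path, require_mount=False):
    """Return a list of human-readable problems with `path` (empty if none)."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]
    problems = []
    if not os.access(path, os.W_OK):
        problems.append(f"{path} is not writable")
    if require_mount and not os.path.ismount(path):
        problems.append(f"{path} is not a mount point")
    return problems


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    ok = storage_problems(tmp)
    missing = storage_problems(os.path.join(tmp, "no-such-dir"))
```

A service could run a check like this at startup and exit with an error if the list is not empty, which turns a silent data loss a month later into a loud failure on the first deploy.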
Lastly, watch out for heavy load. Check whether the service was separated out for scalability reasons. If so, it is not a good service to take out for your first test drive. Look for something that has low usage. Also watch out for bursty loads. Some services may appear to be relatively unused, but will see major spikes at the end of the month or fiscal quarter.
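A quick way to spot bursty services is to compare the busiest day against a typical day. This is a minimal sketch, assuming you can export daily request counts from your logs or metrics; the function name `burst_ratio` and the numbers are invented. The sample window has to be long enough to include a month end or quarter end, otherwise the spike will not show up.

```python
from statistics import median


def burst_ratio(daily_requests):
    """Ratio of the busiest day to the median day."""
    if not daily_requests:
        raise ValueError("need at least one day of data")
    typical = median(daily_requests)
    peak = max(daily_requests)
    if typical == 0:
        # No typical traffic at all: any peak is an unbounded burst.
        return float("inf") if peak > 0 else 1.0
    return peak / typical


# Invented numbers: quiet all month, then a month-end spike.
month = [10] * 29 + [500]
steady = [10, 12, 11, 9, 10]
```

Here `burst_ratio(month)` comes out far above `burst_ratio(steady)`, even though both services look equally quiet on an average day.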