New Grad SWE looking for advice on how to tackle a confusing codebase

Personally, the biggest point of confusion for me is figuring out how components interact, so when I start a job I start with the high-level questions.

The first question I ask is: what are we doing? Let's say my team's primary operation is to take 5 kinds of requests and produce results for all of them. I find out what those requests are and what data they contain, and then I find out what the expected output is.

Let's say the workflow is lengthy and consists of multiple components. I would ask my teammates to draw a high-level diagram of all the components responsible for turning an input request into the response we send to our customers.

I then try to find out how tasks are picked up by component B from component A. Is there a daemon that kickstarts it? Is there a cron job running in the background?
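
To make that hand-off question concrete, here is a rough sketch of one common shape it takes: component A drops work on a queue and component B polls for it. Everything here is made up for illustration (the `task_queue`, the function names, the `req-1` IDs), and a real system might use a message broker, a database table, or a cron-triggered script instead.

```python
import queue

# Hypothetical hand-off point between the two components.
task_queue = queue.Queue()


def component_a_submit(request_id):
    """Component A finishes its step and hands the task off."""
    task_queue.put({"request_id": request_id, "stage": "ready_for_b"})


def component_b_poll_once():
    """One polling pass, the kind of thing a daemon loop or cron job might call."""
    processed = []
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to pick up on this pass
        processed.append(task["request_id"])
    return processed


component_a_submit("req-1")
component_a_submit("req-2")
handled = component_b_poll_once()
```

Once you know which of these mechanisms your team actually uses, you know where to look when a task seems to vanish between two components.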

Then I start looking at the meta implementation details. Is there some kind of state machine that manages a set of actions depending on the state? Why is job X sent to component A and not to component B? And so on.
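
If you have not run into a state machine in production code before, this is roughly what to look for: a fixed set of states and a table saying which event moves a job from one state to another. The states, events, and `next_state` helper below are all hypothetical, just a minimal sketch and not how any particular codebase does it.

```python
from enum import Enum


class JobState(Enum):
    RECEIVED = "received"
    VALIDATED = "validated"
    PROCESSED = "processed"
    FAILED = "failed"


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (JobState.RECEIVED, "validate_ok"): JobState.VALIDATED,
    (JobState.RECEIVED, "validate_failed"): JobState.FAILED,
    (JobState.VALIDATED, "process_ok"): JobState.PROCESSED,
    (JobState.VALIDATED, "process_failed"): JobState.FAILED,
}


def next_state(state, event):
    """Look up the next state, rejecting transitions the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}")
```

In a real codebase the table is often spread across several files or hidden in if/else chains, so writing it out like this for yourself is a decent way to check your understanding.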

Once I understand this, I start examining the individual components. What is the input of component A, and what output does it produce? I then trace the actions performed by the component. This is usually the straightforward bit because all the information you need for this part is in one place. Sure, the code may be badly written, but given enough time you can figure it out by yourself.
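
One cheap way to do the input/output part on your own machine is to wrap the component's entry point and record what goes in and what comes out. This is only a sketch: `trace_io` and the toy `component_a` are names I made up, and in a real codebase you might reach for a debugger or the existing logging instead.

```python
import functools


def trace_io(func):
    """Record the arguments and result of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append({"args": args, "kwargs": kwargs, "result": result})
        return result

    wrapper.calls = []  # inspect this list after running a request through
    return wrapper


@trace_io
def component_a(request):
    # Stand-in for the real component: sums the items in a request.
    return {"id": request["id"], "total": sum(request["items"])}


component_a({"id": "req-1", "items": [1, 2, 3]})
```

After pushing a sample request through, `component_a.calls` holds the recorded inputs and outputs, which you can compare against what your teammates told you to expect.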

One thing to understand is that there are a few things you simply will not be able to figure out on your own, so you need to get your teammates to help fill in the gaps. It is especially hard if there is some obscure daemon or system process running in the background that does mysterious things, or if you do deployments per region and certain components only work in certain regions.

/r/cscareerquestions Thread