Lessons from six months doing financial research
Six months ago, I started a job doing financial research. Here is what I have learned in this time.
Some caveats: My role also involves other tasks, so the number of hours spent on research is much smaller than 6 * 20 * 10. Also, I am very junior, so my responsibilities tend to lean more towards figuring out why a particular thing doesn’t work than towards solving markets.
Motivation
I find myself procrastinating a lot. In the past, this was usually because I was having to do things which I thought were pointless. This is different though, since I genuinely want to do a good job and I find the work interesting.
It took me an embarrassingly long time to start trying to solve this problem instead of just feeling shame about it and doing nothing.
I tend to procrastinate research for one of two reasons: Either I don’t know what to work on or I get frustrated by the work that I am doing.
Not knowing what to work on
Sometimes when I am not being productive, I notice that I don’t know what specifically I would be doing if I was. In some cases noticing is enough for my brain to shift gears. I realize what I need to do and get unstuck immediately. At other times I try to do that but I hit a wall. The felt sense is one of trying to push through viscous fog. Then my task becomes breaking the task down into pieces that I know how to approach. I write “Plan work” down on my todo list. Determining that this needs doing was valuable and writing it down stops me from getting distracted and forgetting it again.
Why is this a bigger problem for me in research than in software dev? Part of it, I think, is just experience. I’ve been coding for 10 years now, but doing research only for 6 months. The individual constituent tasks (e.g. “get these data from the database and arrange them in that shape”) take longer because the software ecosystem is new to me. But I think that breaking a complex task into small chunks is a skill that I have developed for software dev but not for research. I am used to it being there and get confused and frustrated when it isn’t.
Another reason could be that requirements in research change much quicker than they do in dev, sometimes many times in a single hour. This makes keeping track of what to do harder.
And a third reason, although I am less sure about this, is that it is often easy to formulate a question but it takes a lot of implementation work – model training, data wrangling, etc. – to get an answer.
So how have I started to manage these problems? The first thing is just to accept that research tasks take a long time and that estimating this cost is a skill which will develop over time. Secondly, just writing out very explicit lists of tasks on a low level of abstraction often helps me get unstuck.
Getting frustrated
Sometimes the path is (relatively) clear and I want to get something done, but I still find myself unable to. Sometimes the things that should work don’t, or I waste half an hour on an obscure error. Or there is too much lag in my remote desktop connection. Irritation increases to the point where I’m too annoyed to work on anything.
This has often resulted in me going on a crusade to implement an infrastructure fix, or a refactor, or an automated test, to make sure that nobody ever faces the same problem again. This doesn’t feel like it’s getting me closer to my research goal, but is probably a good thing and, in any case, helps by soothing my mind. On the other hand, swapping out a gnarly research task for a small engineering project with an immediate payoff is often just procrastination in disguise.
Sometimes I cannot address the cause of my frustration directly since it is composed of many small annoyances of the shape “I didn’t know matplotlib had this weird quirk” or “I was running a different version of the package than I thought.” The solution here is often just not to do this silly thing again or to remember to check for that kind of bug next time. I imagine a lot of this will go away once I have more experience with research and the software ecosystem.
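One cheap way to check for the "different version than I thought" kind of bug is to print the installed versions at the top of a notebook. This is only a minimal sketch using the standard library's importlib.metadata; the helper name report_versions and the package names are made up for illustration.

```python
from importlib import metadata


def report_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The package names are only examples; list whatever the notebook depends on.
versions = report_versions(["matplotlib", "pandas", "surely-not-installed-pkg"])
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in the first cell leaves a record of the environment next to the results, which makes a version mismatch easier to spot later.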
Sometimes the answer is just to go for a walk or to work on something else for a bit.
Related work
The framework Alex Vermer describes in his blog post seems like it could be valuable, but I have not experimented with it myself.
Coding (mostly focusing on jupyter notebooks)
Code for research is very different from code for software engineering. In SE, making something more general than it needs to be ‘just in case’ is often a bad idea. In research we want to be able to poke at everything from all possible angles, with as little friction as possible. This is something that feels important and that I think about a lot, but I haven’t yet figured out how to navigate this tradeoff well.
I do, however, have some thoughts on how to structure jupyter notebooks:
- The web interface is a pretty terrible editor, but a decent viewer for plots and other outputs. I usually end up writing a lot of my notebook code in vim and switching to jupyter lab when I need to work on it with another team member, or when I go from the development stage to the looking-and-poking-at-data stage[1]. I have heard that VS code does both well.
- For each notebook, there should be a specific question which it is trying to answer.
- The pains of code duplication sneak up on you really quickly. Instead of applying the rule of 3, I stick to the rule of 2: When I need code from another notebook, I copy it to a tools.py or similar and refactor the source and destination notebooks to use that, instead of duplicating it even once. I think out-of-order execution of notebook cells and the erratic path a research session can take make notebooks much more messy than regular code. Accordingly, I’m more aggressive with refactoring.
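To make the rule of 2 concrete, here is a rough sketch of what such a shared module might look like. The function long_to_wide and the sample rows are hypothetical, not taken from any real project; the autoreload magics are standard IPython and are shown as comments because they are not plain Python.

```python
# tools.py: a hypothetical shared module imported by several notebooks.
from collections import defaultdict


def long_to_wide(rows):
    """Rearrange (date, ticker, value) records into {date: {ticker: value}}."""
    wide = defaultdict(dict)
    for date, ticker, value in rows:
        wide[date][ticker] = value
    return dict(wide)


# In each notebook, instead of pasting the function into a cell:
#   %load_ext autoreload
#   %autoreload 2
#   from tools import long_to_wide

# Made-up sample data, just to show the shape of the result.
rows = [
    ("2024-01-02", "AAA", 1.0),
    ("2024-01-02", "BBB", 2.0),
    ("2024-01-03", "AAA", 1.5),
]
wide = long_to_wide(rows)
```

With autoreload enabled, edits to tools.py are picked up by running notebooks without restarting the kernel, so the refactor does not add much friction.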
Talk to others!
Most of this post has crystallized from conversations with colleagues. Research is hard and everyone knows it. I was surprised how much people could empathize with my specific problems, despite my lack of experience. If at all possible, ask other researchers for advice, both specific and generic.
[1] I alternate between those; it’s not a waterfall, of course. The line is very fuzzy too, and the need for me to use two tools is mostly there because the vim experience isn’t very polished.