Silico Blog | Improving your model

Building things is hard

"First make it work. Then, make it work better." I came across this quote and it has stuck with me as a simple way to communicate a useful process for building complex things.

It's also important in that it implies that getting something to work as you expect is hard. It is.

A gif of Homer Simpson attempting to build a bbq, holding up the perfect picture of the BBQ over the one he built, slowly removing it to find his version is just a pile of poorly constructed pieces, stuck in cement.

What makes it hard to build complex things?

Well, many things - especially in the broader context of programming complex systems. There is (depending on who you ask) no silver bullet technology or approach to successfully building complex systems.

Much has been, and will be, written about this. But if we restrict the problem to programs that a single person will write and maintain, which is more common in science and personal work, two things stand out to me:

1. What is the problem you're trying to solve?

First, in the early stages of a project, it's generally true that you're unaware of deficiencies in your practical and theoretical understanding of the problem. And that is assuming you have any real idea of what the problem is.

This is usually true even if you're well versed in the conceptual dimensions of your project, and probably more so in scientific or analytical programming, where you'll be working on things you haven't encountered before.

2. There are things you only learn by building

Second, building complex things is hard because it's difficult to reason out a practical solution before exploring the problem through an attempt to build it in part or in (god forbid) whole.

This is problematic for a number of reasons, but largely because by the time you've learned important lessons, you've also got yourself a bunch of maybe-working, maybe-useful code that may not fit the problem as you now understand it. It can be hard and expensive to work your way to where you need to be once you've started.

Understanding the problem

Iterative improvement is a process, but it's really about understanding the problem you're trying to solve. And not just understanding in the sense of "I think I have some knowledge of the abstract concepts" (as might be found in a textbook), but rather understanding the deep details of a working, useful solution.

Every now and then I come across the idea that the hard part of programming is not writing code (though that is hard too). Rather, the hard part is developing a deep enough familiarity with the problem to understand why you should do anything at all.

They say...
It's been said that by the time you really understand what you're trying to do, writing code is mostly a matter of typing. ... mostly.

What if I understand everything already?

You can try not iterating, and attempt instead to understand everything about the problem before you start writing code. However, experience writing non-trivial software usually teaches you that this is very difficult.

Good luck
There are contexts in which a lot of up-front work is necessary. If you don't know you're in one, you're probably not. And if you're in one and you don't know it, you're in pretty big trouble right now.

Build to understand

One of the best approaches to managing the risks from points 1 and 2 above is to build a solution through iteration, improving your understanding of the problem and the solution concurrently.

A key part of this is recognizing that your project and code are constantly changing, and building in a way that makes it easy for the software to change throughout its lifecycle.

Accept your limits
Another way to think about this is that your first real task is to find a way to tackle your own ignorance, and one of the more effective ways to do that is to find out what you don't know as quickly as possible. Start with an understanding of some piece of the project, build it, see if you know what you think you know, and go from there. Hubris and software don't really mix. Unless you're looking for funding. Then they seem to.

There are nearly 6 million caveats to this approach, but in the current context - programming simple models and primarily for the purpose of learning - it'll be very useful.

Programming as a tool for science

It bears repeating that when it comes to writing code for science, it might be tempting to think of the work as two steps: 1) understand all the science and 2) program it. However, it's very useful to think of coding as a tool for understanding the problem, as well as something you use to produce a result.

In practice this means that your code and your understanding will, ideally, evolve together, and that you need to think about how to get the most out of this feedback loop.

Education is sequential, doing is not

Related to the idea of iteration: most formal learning in science and programming (textbooks, blogs, courses, schools) is sequential in nature. It goes from less detail and complexity to more.

This leaves the misleading impression that the work to create those things is also fundamentally sequential. It's mostly not, and would more often be described (charitably) as a mess of a process.

The work of writing science code is mostly iterative, a constant back and forth. It takes real effort to structure knowledge as a clear sequence of steps. But since that finished sequence is the starting point for new learners, it's not always obvious to them how it all came to be.

Next up, improving actual code.

Making it work better

Series: Earth, Mars and models