Challenges of Code Documentation

Here's an interesting problem.

There are numerous situations where code contains a lot of important information, and this code can also change very frequently.

Let's say that we need to relay this important information to somebody who is not intimately familiar with the codebase. For example:

  • Researchers who depend on the conventions and transformations in some data pumping project.
  • New users being introduced to a project through articles with a lot of samples.
  • Managers who need to know certain business constants and rules.
  • Third-party developers who have to integrate with some API and need access to the latest samples, restrictions and constraints.

Needless to say, important pieces of code could be scattered across multiple projects, adding friction for people who need to look at them quickly.

We want to keep this friction to a minimum. That increases the chances that a question gets resolved by looking at the documentation, instead of wasting time and potentially pulling somebody else into the quest for answers. Saved time essentially translates into reduced expenses and faster reaction by the organization (and thus an improved ability to compete in the market).

There might also be important contextual information about this code. It might or might not be valuable to a given party, but developers would want to write it down somewhere (letting them forget the details and free up brain RAM for other tasks). Comments usually help here, but they have to stay with the code and are limited to plain text (no graphs, images, tables or even bold).

One common way of relaying this information (in some specific context) is to document the code in external docs that include the latest snippets. However, code tends to change a lot. This is especially true in fast-paced environments with tight feedback loops and low-friction development (and deployment).

So we have a problem here:

  • we either need to spend time and concentration on updating the documentation after every significant code change (e.g. a few times a day);
  • or we have to accept that the documentation is out of date and essentially useless;
  • or we have to include links like: "for the actual details look in the method DoomsdayMachine.RefreshWorld() and any other methods it might call". We'll also need to remember to update the links should the class be renamed or moved.

One logical solution is to have auto-generated documentation that is compiled from some text and automatically links to the code sources. It also has to survive refactoring and class renames.
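To make the "link to the code, not to a line number" idea concrete, here is a minimal sketch. It is written in Python purely for illustration (the projects I have in mind are .NET), and the function name find_snippet and the sample sources are hypothetical. The documentation refers to a snippet by its qualified name, and the snippet is looked up in the parsed source at build time.

```python
import ast
import textwrap

# Node types that count as named definitions we can point the docs at.
DEFINITIONS = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def find_snippet(source, qualified_name):
    """Return the source text of a definition such as 'DoomsdayMachine.refresh_world'.

    Hypothetical helper: it walks the syntax tree by name, so the result does
    not depend on where in the file the definition currently lives.
    """
    node = ast.parse(source)
    for part in qualified_name.split("."):
        matches = [
            child for child in ast.iter_child_nodes(node)
            if isinstance(child, DEFINITIONS) and child.name == part
        ]
        if not matches:
            # A rename or removal is reported loudly instead of going stale.
            raise LookupError(
                f"cannot resolve {qualified_name!r}: no definition named {part!r}"
            )
        node = matches[0]
    segment = ast.get_source_segment(source, node, padded=True)
    return textwrap.dedent(segment)


# Two versions of the same (made-up) file: the second has extra lines on top
# and an extra method in front of the one the documentation refers to.
SOURCE_V1 = '''
class DoomsdayMachine:
    def refresh_world(self):
        return "refreshed"
'''

SOURCE_V2 = '''
import os  # unrelated additions shift every line number

class DoomsdayMachine:
    def helper(self):
        pass

    def refresh_world(self):
        return "refreshed"
'''

snippet_v1 = find_snippet(SOURCE_V1, "DoomsdayMachine.refresh_world")
snippet_v2 = find_snippet(SOURCE_V2, "DoomsdayMachine.refresh_world")
```

Note the limitation: binding by name survives moving code around, but not a rename. What the sketch does guarantee is that a rename shows up as an error when the documentation is built, rather than as a silently outdated page. Surviving renames would need something more stable than a name, such as an explicit marker attached to the code.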

I know that Lokad researchers use LaTeX with some scripts for such tasks. However, LaTeX looks like a bit of an overkill here, plus I'm not sure it can bind to MSIL-level markers within .NET code while providing common publishing functionality.

Ideally, it would work like this:

  • The project has documentation files stored and versioned side by side with the sources (ideally in the same solution).
  • These documentation files are expressive enough to contain graphs, images, tables and all the other nice publishing things, while referencing code blocks in the project.
  • Editing the documentation is WYSIWYG-friendly, while the underlying document format is friendly to version control (and to reviewing changes).
  • Changing the original code (e.g. adding a few lines at the beginning of a file, or moving a method around) does not break the documentation.
  • Whenever needed (or continuously on the integration server), these separate doc files are assembled and rendered to the desired publishing format (e.g. online docs or PDF).
  • Any document-level compilation problems are detected immediately (i.e. when building the documentation).
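The workflow above could be sketched roughly like this. Everything here is an assumption made for illustration: the "# doc:begin" / "# doc:end" comment markers, the {{snippet:name}} placeholder syntax and the function names are invented, and Python stands in for whatever language the real project uses. The code carries named markers, the doc file references them by name, and the build fails if a reference cannot be resolved.

```python
import re
import textwrap

# Hypothetical marker and placeholder syntax, invented for this sketch.
BEGIN = re.compile(r"^\s*# doc:begin ([\w-]+)\s*$")
END = re.compile(r"^\s*# doc:end ([\w-]+)\s*$")
PLACEHOLDER = re.compile(r"\{\{snippet:([\w-]+)\}\}")


def collect_snippets(source):
    """Map each marker name to the (dedented) code between its markers."""
    snippets = {}
    open_regions = {}
    for line in source.splitlines():
        begin, end = BEGIN.match(line), END.match(line)
        if begin:
            open_regions[begin.group(1)] = []
        elif end:
            name = end.group(1)
            if name not in open_regions:
                raise ValueError(f"doc:end without doc:begin: {name}")
            snippets[name] = textwrap.dedent("\n".join(open_regions.pop(name)))
        else:
            # A plain code line belongs to every region that is currently open.
            for body in open_regions.values():
                body.append(line)
    if open_regions:
        raise ValueError("unclosed regions: " + ", ".join(sorted(open_regions)))
    return snippets


def build_document(template, snippets):
    """Fill placeholders; fail the build if any reference is unresolved."""
    missing = sorted({n for n in PLACEHOLDER.findall(template) if n not in snippets})
    if missing:
        raise ValueError("unresolved snippet references: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: snippets[match.group(1)], template)


# Made-up source file and doc file.
SOURCE = '''
TAX_RATE = 0.2

def price_with_tax(net):
    # doc:begin tax-rule
    gross = net * (1 + TAX_RATE)
    # doc:end tax-rule
    return gross
'''

TEMPLATE = "Gross price is computed as:\n{{snippet:tax-rule}}\n"

rendered = build_document(TEMPLATE, collect_snippets(SOURCE))
```

Because the markers travel with the code, adding lines above them or moving the method to another place in the file does not change the rendered output, and even renaming the method leaves the reference intact. The price is a bit of marker noise in the sources.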

Does anybody have similar problems, and how do you solve them? What do you think?
