Challenges of Code Documentation
Friday, July 30, 2010 at 11:52
Here's an interesting problem.
There are numerous situations in which code contains a lot of important information. This code can also change very frequently.
Let's say that we need to relay this important information to somebody who is not intimately familiar with the codebase. For example:
- Researchers depending on the conventions and transformations in some data pumping project.
- New users being introduced to a project via articles with a lot of samples.
- Managers requiring knowledge of some business constants and rules.
- Third-party developers who have to integrate with some API and need access to the latest samples, restrictions and constraints.
Needless to say, important code pieces could be scattered across multiple projects, adding friction for people who need to look at them quickly.
We want to keep this friction to a minimum! This way we increase the chances that a question can be resolved by looking at the documentation, instead of wasting time and potentially involving somebody else in the quest for answers. Saved time essentially translates into reduced expenses and a faster-reacting organization (resulting in an improved ability to compete in the market).
There might also be some important contextual information about this code. It might or might not be valuable to a given party, but developers would want to write it down somewhere (enabling them to forget the details and free up Brain RAM for other tasks). Comments usually help here, but they have to stay with the code and are limited to plain text (no graphs, images, tables or even bold).
One common way of relaying this information (in some specific context) is to document the code in external docs, while including the latest snippets. However, the code tends to change a lot. This is especially true for fast-paced environments with tight feedback loops and low-friction development (and deployments).
So we have got ourselves a problem here:
- we either need to waste time and concentration on updating the documentation after every significant code change (e.g. a few times a day);
- or we have to accept the fact that the documentation is out-of-date and essentially useless;
- or we have to include links like: "for the actual details look in the method DoomsdayMachine.RefreshWorld() and any other methods it might call". We'll also need to remember to update the links, should the class be renamed or moved.
One logical solution is to have auto-generated documentation that could be compiled from some text, while automatically linking to the code sources. And it has to survive refactoring and class renames.
I know that Lokad researchers use LaTeX with some scripts for such tasks. However, the whole LaTeX thing looks like a bit of overkill here, plus I'm not sure it can bind to MSIL-level markers within the .NET code while providing common publishing functionality.
Ideally this would work like this:
- Project has documentation files stored and versioned side-by-side with the sources (ideally in the same solution).
- These documentation files are expressive enough to contain graphs, images, tables and all the other nice publishing things, while referencing some code blocks in the project.
- Editing the documentation would be WYSIWYG-friendly, while the original document format would be friendly to version control (and to reviewing changes).
- Changing the original code (e.g. adding a few lines at the beginning of the file, or moving a method around) should not break the documentation.
- Whenever needed (or continuously on the integration server) these separate doc files are assembled and rendered to the desired publishing format (e.g. online docs or PDF).
- Any document-level compilation problems are detected immediately (i.e. when building documentation).
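To make the wish list above a bit more concrete, here is a minimal sketch in Python of one way the "reference code by name, fail the build on a broken reference" part could work. Everything in it is an assumption made for illustration: the `// doc:begin name` and `// doc:end name` comment markers, the `{{snippet: file#name}}` directive, and the names `extract_regions`, `render` and `DocBuildError` are all hypothetical, not part of any existing tool. It binds to named comment markers in the source text, not to MSIL-level markers, so it survives moved lines and methods but not a renamed file.

```python
import re

# Hypothetical marker syntax, invented for this sketch.
BEGIN = re.compile(r"^\s*//\s*doc:begin\s+(\S+)\s*$")
END = re.compile(r"^\s*//\s*doc:end\s+(\S+)\s*$")
INCLUDE = re.compile(r"\{\{snippet:\s*([^#\s}]+)#([^\s}]+)\s*\}\}")


class DocBuildError(Exception):
    """Raised when a document references a snippet that cannot be resolved."""


def extract_regions(source_text):
    """Return {region_name: code} for every marked region in a source file."""
    regions = {}
    open_name = None
    buffer = []
    for line in source_text.splitlines():
        begin = BEGIN.match(line)
        end = END.match(line)
        if begin:
            if open_name is not None:
                raise DocBuildError(f"nested region inside {open_name!r}")
            open_name = begin.group(1)
            buffer = []
        elif end:
            if end.group(1) != open_name:
                raise DocBuildError(f"unexpected end of {end.group(1)!r}")
            regions[open_name] = "\n".join(buffer)
            open_name = None
        elif open_name is not None:
            buffer.append(line)
    if open_name is not None:
        raise DocBuildError(f"region {open_name!r} is never closed")
    return regions


def render(doc_text, sources):
    """Replace each {{snippet: file#region}} with the current code.

    `sources` maps a file name to its text. Any unresolved reference
    raises DocBuildError, so the problem shows up at doc build time.
    """
    def substitute(match):
        path, name = match.group(1), match.group(2)
        if path not in sources:
            raise DocBuildError(f"unknown source file: {path}")
        regions = extract_regions(sources[path])
        if name not in regions:
            raise DocBuildError(f"no region {name!r} in {path}")
        return regions[name]

    return INCLUDE.sub(substitute, doc_text)


if __name__ == "__main__":
    source = (
        "class DoomsdayMachine {\n"
        "    // doc:begin refresh\n"
        "    void RefreshWorld() { }\n"
        "    // doc:end refresh\n"
        "}\n"
    )
    doc = "Details:\n{{snippet: DoomsdayMachine.cs#refresh}}\n"
    print(render(doc, {"DoomsdayMachine.cs": source}))
```

Because the snippet is looked up by name, adding lines at the top of the file or moving the method does not change the rendered output. A step like this could run on the integration server before the text is handed to whatever tool produces the final HTML or PDF.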
Does anybody have similar problems and ways of solving them? What do you think?
Reader Comments (4)
I concur with the opinion that LaTeX is overkill :-)
So far, the best available solution seems to be auto-generating documentation from XML comment markup. Yet this approach has significant limitations:
- editing basic HTML without WYSIWYG is a pain.
- no graphics allowed.
- documentation structure is rigidly bound to the code structure.
Then, I think part of the problem lies in the way we access the documentation. I am no big fan of PDF, except when the document is really formatted as an eBook with the possibility to print. HTML is somewhat OK, but going back and forth between Visual Studio and the browser is tedious.
To some extent, I think a better solution would be a Visual Studio extension that renders (in real time) the documentation of any browsed item, either local code (with the possibility to edit it) or 3rd party libraries (docs being embedded into the XML file alongside the assembly).
Innovasys have a product called Document! X that may help.
LaTeX - I studied Maths many years ago and we were shown LaTeX. I just thought I'd stick with paper and pen ;)
Take a look at Image Insertion extension for VS 2010
http://visualstudiogallery.msdn.microsoft.com/en-us/793d16d0-235a-439a-91df-4ce7c721df12
The Image Insertion sample lets you insert images directly in line with your code to help you visualize aspects of your code. The editor extension supports drag and drop from the solution explorer and automatically sizes images to the space available pushing aside the text to make sure everything is readable.
Then it is probably possible to auto-generate documentation from the XML with tools like SandCastle, but this will require some additional coding.
Thanks for the hint. That's a nice idea. Have you tried it with ReSharper?