This is a description of a programming and code formatting style for Mathematica programs that I call "munging style code layout".
Any self-contained part of a Mathematica program is a Mathematica expression. As the oft-quoted idiom says, in Mathematica "everything is an expression". A Mathematica expression looks either like function[arg1, arg2, ...] ("functions", composite expressions) or simply like foobar ("symbols", atomic expressions). This is true for any part of a Mathematica program at any depth.
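A quick way to see this for oneself is to ask Mathematica for the internal form of an expression. The sketch below uses only the built-in functions FullForm, Head and AtomQ; the symbols a, b, c, f, x, y and foobar are placeholders, assumed to have no values assigned.

```mathematica
FullForm[a + b*c]   (* Plus[a, Times[b, c]] : infix operators are nested function[args] too *)
Head[f[x, y]]       (* f : the head of a composite expression *)
AtomQ[foobar]       (* True : a bare symbol is an atomic expression *)
AtomQ[f[x, y]]      (* False : a composite expression is not atomic *)
```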
Any Mathematica function is essentially a nested structure of the above building blocks. Complex Mathematica programs tend to have deeply nested expressions like
f4[arg41, f3[arg31, arg32, f2[arg21, f1[input], ...], ...], arg43, ...] (* comment explaining what f1, f2, f3, f4 do *)
Imagine a complex Mathematica program having hundreds of such expressions, sometimes nested inside one another. Code formatted this way is hard to read or understand because one constantly has to jump into and back out of the brackets [ ... ]. In effect, the reader has to keep a stack data structure in their head and navigate it frequently.
Over time I have gradually settled on a particular formatting style for complex Mathematica functions with such deeply nested structure. I think it helps me write and understand my Mathematica programs. I call it munging style code layout. It is nothing more than putting small parts of the code onto separate lines and using Composition to chain them up:
Composition[
f4[arg41, #, arg43, ...]&, (* comment for what f4 does *)
f3[arg31, arg32, #, ...]&, (* comment for what f3 does *)
f2[arg21, #, ...]&, (* comment for what f2 does *)
f1 (* comment for what f1 does *)
][input]
I call it munging because, overall, it operates on a compact initial input argument, typically represented by a Mathematica symbol, and processes it little by little over multiple steps before returning the final desired output. Note that Composition applies its last argument first, so the steps run from the bottom line up.
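As a concrete sketch of the layout, here is a small pipeline built only from built-in functions. The input Range[10] and the three steps are made up for illustration; they are not taken from any real program.

```mathematica
(* nested form: has to be read from the inside out *)
Total[Select[Range[10], PrimeQ]^2]

(* munging form: same computation, one step per line, run bottom to top *)
Composition[
  Total,                  (* add up the squares *)
  #^2 &,                  (* square each element (Power threads over lists) *)
  Select[#, PrimeQ] &     (* keep only the primes *)
][Range[10]]
```

Both forms evaluate to 87: the primes up to 10 are {2, 3, 5, 7}, their squares are {4, 9, 25, 49}, and those sum to 87.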
This way of laying out the code has a few advantages:
- The program body spreads out over multiple lines naturally, following the logic of each step. The code is still logically nested, but no longer visually nested.
- Each munging step can be commented at the end of its line, so comments are easier to write and read. One line of code is structurally and logically simple, so each line's comment will be short too. As a result, the comments for the complete program should also be easier to read and understand.
- It is easier to write and debug code. A typical scenario is to write the first simple step:
Composition[
f1[arg11, #, arg13]& (* comment for what f1 does *)
][input]
Run it and verify the first step does what it should do, then add a second step, and comment as you code:
Composition[
f2[arg21, #, ...]&, (* comment for what f2 does *)
f1[arg11, #, arg13]& (* comment for what f1 does *)
][input]
Notice that when writing the body of f2[...], one does not need to edit around f1[...], but works on a separate new line. This may not sound like a big deal but, in my personal experience, it eliminates many chances of messing up f1[...] while typing f2[...]. If one decides that the f2 just typed shouldn't stay, one can simply select the whole line of f2 (typically a keyboard-only or a mouse-only operation, depending on the programmer's habits and the editor) and delete it, without worrying about damaging any part of f1[...] or needing to copy f1[...] out to safety first. So there is an ergonomic advantage to putting f1[...] and f2[...] on separate lines.
You write your code in small chunks, and manage the chunks as composable pieces of the overall logic.
In addition, one can easily print out the intermediate value between two steps by simply inserting a NOP step:
Composition[
f2[arg21, #, ...]&, (* comment for what f2 does *)
(Print@#; #)&, (* NOP step to print out intermediate value for debugging *)
f1 (* comment for what f1 does *)
][input]
Once more steps have been added, you might find that the first few steps aren't perfect. You can then insert more NOP steps to print out more intermediate values:
Composition[
...,
f4[arg41, #, arg43, ...]&, (* comment for what f4 does *)
(Print@#; #)&,
f3[arg31, arg32, #, ...]&, (* comment for what f3 does *)
(Print@#; #)&,
f2[arg21, #, ...]&, (* comment for what f2 does *)
f1 (* comment for what f1 does *)
][input]
Notice again that inserting these NOP steps doesn't require editing the lines of actual code (f1[...]&, f2[...]&, ...). So there is a natural and convenient separation of debugging code and real code.
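To make the debugging step concrete, here is the same kind of pass-through step applied to a small made-up pipeline. The helper name tap is hypothetical (it is just the (Print@#; #)& step given a name), and the input Range[10] and the three real steps are only for illustration.

```mathematica
tap = (Print[#]; #) &;    (* hypothetical helper: print the value, then pass it on unchanged *)

Composition[
  Total,                  (* add up the squares *)
  tap,                    (* debugging only: prints {4, 9, 25, 49} *)
  #^2 &,                  (* square each element *)
  tap,                    (* debugging only: prints {2, 3, 5, 7} *)
  Select[#, PrimeQ] &     (* keep only the primes *)
][Range[10]]
```

Deleting the two tap lines leaves the real steps untouched, and the result is 87 either way. Newer versions of Mathematica also have a built-in Echo that should be able to play the same role, since it prints its argument and returns it.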
All of this may appear to be too trivial a matter to document. But I have found Mathematica code harder to format than code in most other programming languages, partly because of the character of the Mathematica language (functional and symbolic) and partly because neither the Mathematica notebook nor Wolfram Workbench provides sophisticated and robust automatic code formatting/indentation. This little formatting rule seems to help me write better Mathematica code, and sometimes to write it faster and more easily.