Stencil and Function Basics
NDSL derives its power from the ability to accelerate and dynamically compile code. To do this, NDSL has two constructs which are used to denote and contain acceleratable code: "stencils" and "functions". Conceptually, the two are similar, but their uses are different.
Stencils
Stencils are the primary method of creating parralelizable code with NDSL, and indeed in NDSL everything must begin with a stencil.
Within a 3-dimensional domain, NDSL evaluates computations in two parts. If we assume an (X, Y, Z) coordinate system as a reference, NDSL separates computations in the horizontal (X, Y) plane from the vertical (Z) column.
In the horizontal plane, computations are always executed in parallel, which means that there is no assumed calculation order within the plane. This concept is the foundation of of NDSL's performance capabilities, and cannot be altered.
In the vertical column, comptuations are performed by an iteration policy that is declared within the stencil. This is done to enable the implementation of more scientific patterns using NDSL. We will discuss this in more detail shortly.
Basic Stencil Example
To demonstrate how to implement a NDSL stencil, let's step through an example that copies the values of one field into another field. First, we import several packages:
Next, we define our stencil template:
Note that there is no return statement here. Stencils modify fields in place may not contain a return statement, and therefore must have all inputs and outputs passed into it at call.
This stencil template has a number of important features and keywords. Let's start with the inputs:
in_field
and out_field
. These are both declared to be type FloatField
. This notation is used
in traditional Python, and may be familiar to you as optional "type hinting". For stencils
(and functions) these type hints are required, and the code will not execute if the supplied type
does not match the declared type.
Looking into the stencil code, we can see the two most important keywords in NDSL.
The statement with computation(PARALLEL)
signals that all three dimensions can be executed in
parallel. The statement with interval(...)
signals that the computation should apply to all
vertical levels over which the stencil is executed. We will discuss both of these in more detail
later.
Now we set up our class:
Here is where we actually build our stencil (lines 14-17). Behind the scenes the system is
compiling code based on a number of considerations, most important of which is the target
architecture: CPU or GPU. We will discuss more build options in a future section, but you will
always have to specify func
(to determine what stencil is being built), and compute_dims
(we will discuss restricting the compute_dims
in a later guide).
Finally we can run the program:
Temporary Fields
NDSL has the ability to generate temporary quantities within a stencil. All temporary quantities (defined as variables which are used within the stencil but not passed to the stencil at call) are initalized as a field of dimensions equal to the full model domain. These fields can be used just as any other field can be used.
Offsets
Within a stencil, points are referenced relative to each other. For example, the statement
field[0, 0, 1]
implies that you want an offset of positive one along the K axis (Z dimension
in our example). Offsets can only occur at read; NDSL does not allow writing with an offset.
Additionally, it is not possible to write to a field when you read from it with an offset in the
same statement (e.g. field = field[0, 0, 1]
is illegal).
With this knowledge, we can now create a stencil that copies data from the level above:
Note that the interval is restricted to prevent the computaiton from occuring on the top
level of out_field
, as looking "up" from the top level would look into the extra point included
for interface calculations. Since this computation is not being performed on the interface (both
quantities are created using Z_DIM
, not Z_INTERFACE_DIM
), that row of data will be zero, and
accessing it here will have unintented consequences in subsequent calculations.
Intervals and Iteration Policies
The statement with interval()
controls the subset of the K axis over which the stencil is
executed, and is controlled using traditional Python indexing (e.g. interval(0, 10)
,
interval(1, -1)
). The argument None
can be used to say "go to end of the domain" (e.g.
interval(10, None)
). The argument ...
signals compute at all levels, equivalent to 0, None
.
NDSL has three possible iterations policies: PARALLEL
, FORWARD
, andBAKWARD
.
Choosing PARALLEL
states that all three dimensions can be executed in parallel. In this state,
each point in the domain is computed independently in the fastest possible order. This means that
fields are being written in random order, and therefore any operations which depends on data at
other points in a field which is being written (e.g. out = out[0, 0, 1]
) cannot use
PARALLEL
.
The FORWARD
and BACKWARD
options can be considered non-parallel options for the K axis. These
options require that all calculations for a perticular K level are computed before moving on to the
next K level. This is often significantly slower than PARALLEL
, but ensures that each kernels
has the information between execution of each index along the K axis. FORWARD
executes from the
first argument in the with interval()
statement to the second (e.g.
with computaiton(FORWARD), interval(0, 10)
begins at 0 and ends at 9), while BACKWARD
does the
opposite (begins at 9, ends at 0).
FORWARD
and BACKWARD
are useful for more complex situations where data is being read with an
offset and written in the same stencil:
In this example, in_field
is being read in but also modified. It is therefore necessary to use
FORWARD
to ensure that there is not a situation where in_field is doubled at a level n
before it is read at level n - 1
.
Using offsets often requires a careful consideration of iteration policy.
Flow Control
A number of traditional Python flow control keywords can be used within NDSL stencils. These are:
if
elif
else
while
for
Conditionals
NDSL allows conditionals to be used inside of a stencil in the same way they would be used in traditional Python.
Loops
Loops are legal with NDSL stencils, but must have definite limits (e.g. no while True
)
NEED TO UNDERSTAND THE NEW FOR LOOP FEATURE THEN WRITE SOMETHING ABOUT IT
Functions
Functions in NDSL are used similar to traditional Python functions. They can be used to make code visually more appealing, and are inlined at execution.
Functions follow all the same rules as stencils, a but have a number of important quirks. Functions:
- cannot contain the keywords computaiton
or interval
(they rely on the host stencil for this info)
- cannot be called outside of a stencil
- must have a single return statement, but can return multiple values
- are not tied to a single stencil, and may be reused across any number of stencils
- do not need to be constructed with a stencil factory (they are built with the stencil)
Below is an example of a NDSL function, called from within a stencil:
Builtin Functions
NDSL has a number of builtin functions:
abs
: absolute valuemin
: minimummax
: maximummod
: modulosin
: sinecos
: cosinetan
: tangentsinh
: hyperbolic sinecosh
: hyperbolic cosinetanh
: hyperbolic tangentasin
: arc sineacos
: arc cosineatan
: arc tangentasinh
: inverse hyperbolic sineacosh
: inverse hyperbolic cosineatanh
: inverse hyperbolic tangentsqrt
: square rootexp
: inverse hyperbolic sinelog
: natural loglog10
: base 10 loggamma
: gamma functioncbrt
: cubic rootisfinite
: determine if number is finiteisinf
: determine if number is infiniteisnan
: determine if number is nanfloor
: round down to nearest integerceil
: round up to nearest integertrunc
: truncateround
: round to nearest integererf
: error functionerfc
: complementary error function
NOTE. EVENTUALLY ADD LINKS TO FULL DOCSTRING DOCUMENTATION FOR EACH FUNCTION ONCE THEY EXIST
Stencils vs Functions
Every computaiton block must have an outermost layer of a stencil. The stencil signals that parallel computation is possible, and defines the iteration policy and interval. From this point the following logic applies:
-
It is not possible to call one stencil from within another, as that would create a situation where there are two layers of parallelization. If you want to reuse code across multiple stencils, it should be put into a function.
-
For similar reasons, it is not possible to call a stencil from within a function.
-
As long as you are originating from a stencil, it is possible to call one function from within another function.
Basic Stencil & Function Tips and Tricks
Multiple stencils can be used sequantially with no impact on performanc - NDSL will combine these stencils during compilation.
Similarly, a single stencil can have any number of with computation()
and with interval()
statments. They do not need to appear in pairs, but often do (in which case they can be
concisely stated as with computation(), interval()
). Remember that all numerical calculations must
be inside of these two statements (either directly in a stencil, or indirectly via a function call).
When writing code in NDSL, it is generally best to prioritze readability and make your code as approachable as possible. NDSL will find ways to optimize it and make it as fast as possible.
Looking Backwards to Move Forward
This guide introduced the basic principles of stencils and functions. We have discussed how and when to use each, and highlighted features that make their use more flexible, and discussed limitations which act as guardrails against unfavorable outcomes.
With this knowledge, it is possible to use NDSL and obtain much of the designed performance; however, development may at times seem tedious and implementing more complicated patterns may be challenging. Future sections will introduce more complex ideas and discuss quality of life features which aim to alleviates these issues.