10. C Porting Guide¶
This short document contains a collection of tips and tricks for porting simple numerical C code to Futhark. Futhark’s sequential fragment is powerful enough to permit a rather straightforward translation of sequential C code that does not rely on pointer mutation. Additionally, we provide hints on how to recognise C coding patterns that are symptoms of C’s weak type system, and how better to organise it in Futhark.
One intended audience of this document is a programmer who needs to translate a benchmark application written in C, or needs to use a simple numerical algorithm that is already available in the form of C source code.
10.1. Where This Guide Falls Short¶
Some C code makes use of unstructured returns and nonlocal exits
(return
inside loops, for example). These are not easy to express
in Futhark, and will require massaging the control flow a bit. C code
that uses goto
is likewise not easy to port.
10.2. Types¶
Futhark provides scalar types that match the ones commonly used in C:
u8
/u16
/u32
/u64
for the unsigned integers,
i8
/i16
/i32
/i64
for the signed, and f32
/f64
for
float
and double
respectively. In contrast to C, Futhark does
not automatically promote types in expressions - you will have to
manually make sure that both operands to e.g. a multiplication are of
the exact same type. This means that you will need to understand
exactly which types a given expression in original C program operates
on, which generally boils down to converting the type of the
(type-wise) smaller operand to that of the larger. Note that the
Futhark bool
type is not considered a number.
10.3. Operators¶
Most of the C operators can be found in Futhark with their usual
names. Note however that the Futhark /
and %
operators for
integers round towards negative infinity, whereas their counterparts
in C round towards zero. You can write //
and %%
if you want
the C behaviour. There is no difference if both operands are
non-zero, but //
and %%
may be slightly faster. For unsigned
numbers, they are exactly the same.
10.4. Variable Mutation¶
As a sequential language, most C programs quite obviously rely heavily
on mutating variables. However, in many programs, this is done in a
static manner without indirection through pointers (except for arrays;
see below), which is conceptually similar to just declaring a new
variable of the same name that shadows the old one. If this is the
case, a C assignment can generally be translated to just a
let
-binding. As an example, let us consider the following
function for computing the modular multiplicative inverse of a 16-bit
unsigned integer (part of the IDEA encryption algorithm):
static uint16_t ideaInv(uint16_t a) {
uint32_t b;
uint32_t q;
uint32_t r;
int32_t t;
int32_t u;
int32_t v;
b = 0x10001;
u = 0;
v = 1;
while(a > 0)
{
q = b / a;
r = b % a;
b = a;
a = r;
t = v;
v = u - q * v;
u = t;
}
if(u < 0)
u += 0x10001;
return u;
}
Each iteration of the loop mutates the variables a
, b
, v
,
and u
in ways that are visible to the following iteration.
Conversely, the “mutations” of q
, r
, and t
are not truly
mutations, and the variable declarations could be moved inside the
loop if we wished. Presumably, the C programmer left them outside for
aesthetic reasons. When translating such code, it is important to
determine exactly how much true mutation is going on, and how much
is just reuse of variable space. This can usually be done by checking
whether a variable is read before it is written in any given
iteration - if not, then it is not true mutation. The variables that
change value from one iteration of the loop to the next will need to
be maintained as merge parameters of the Futhark do
-loop.
The Futhark program resulting from a straightforward port looks as follows:
let main(a: u16): u32 =
let b = 0x10001u32
let u = 0i32
let v = 1i32 in
let (_,_,u,_) = loop ((a,b,u,v)) while a > 0u16 do
let q = b / u32.u16(a)
let r = b % u32.u16(a)
let b = u32.u16(a)
let a = u16.u32(r)
let t = v
let v = u - i32.u32 (q) * v
let u = t in
(a,b,u,v)
in u32.i32(if u < 0 then u + 0x10001 else u)
Note the heavy use of type conversion and type suffixes for constants.
This is necessary due to Futhark’s lack of implicit conversions. Note
also the conspicuous way in which the do
-loop is written - the
result of a loop iteration consists of variables whose names are
identical to those of the merge parameters.
This program can still be massaged to make it more idiomatic Futhark -
for example, the variable t
only serves to store the old value of
v
that is otherwise clobbered. This can be written more elegantly
by simply inlining the expressions in the result part of the loop
body.
10.5. Arrays¶
Dynamically sized multidimensional arrays are somewhat awkward in C, and are often simulated via single-dimensional arrays with explicitly calculated indices:
a[i * M + j] = foo;
This indicates a two-dimensional array a
whose inner dimension
is of size M
. We can usually look at where a
is allocated to
figure out what the size of the outer dimension must be:
a = malloc(N * M * sizeof(int));
We see clearly that a
is a two-dimensional integer array of size
N
times M
- or of type [N][M]i32
in Futhark. Thus, the update
expression above would be translated as:
let a[i,j] = foo in
...
C programs usually first allocate an array, then enter a loop to
provide its initial values. This is not possible in Futhark -
consider whether you can write it as a replicate
, an iota
, or
a map
. In the worst case, use replicate
to obtain an array of
the desired size, then use a do
-loop with in-place updates to
initialise it (but note that this will run stricly sequentially).