futhark-opencl
SYNOPSIS
futhark opencl [options…] <program.fut>
DESCRIPTION
futhark opencl translates a Futhark program to C code invoking
OpenCL kernels, and either compiles that C code with a C compiler to
an executable binary program, or produces a .h and .c file that can be
linked with other code. The standard Futhark optimisation pipeline is
used.
futhark opencl uses -lOpenCL to link (-framework OpenCL on macOS). If
using --library, you will need to do the same when linking the final
binary.
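As an illustration of the linking rule above, here is a minimal Python sketch of how a build script might choose the OpenCL link flags and assemble a link command. The helper names and the file names main.c and prog.c are hypothetical, and any other flags your project needs are left out.

```python
import sys

def opencl_link_flags(platform=sys.platform):
    """Return the OpenCL linker flags described above for a platform string."""
    # sys.platform is "darwin" on macOS; everything else gets -lOpenCL.
    if platform == "darwin":
        return ["-framework", "OpenCL"]
    return ["-lOpenCL"]

def link_command(sources, output, platform=sys.platform, cc="cc"):
    """Build an argument list that could be passed to subprocess.run()."""
    return [cc, "-o", output] + list(sources) + opencl_link_flags(platform)

# Hypothetical file names: prog.c stands for the file produced with --library.
print(" ".join(link_command(["main.c", "prog.c"], "main", platform="linux")))
```

The command is only printed here, not executed, so the sketch can be adapted without a compiler present.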
OPTIONS
Accepts the same options as futhark-c.
ENVIRONMENT VARIABLES
CC
The C compiler used to compile the program. Defaults to cc if unset.
CFLAGS
Space-separated list of options passed to the C compiler. Defaults to -O -std=c99 if unset.
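The documented defaults can be pictured with the following Python sketch. It is not the compiler's own implementation; the function name is hypothetical, and it assumes that only an unset variable (not an empty one) triggers the default.

```python
import os

def c_compiler_invocation(env=None):
    """Return [compiler, flag, ...] following the CC and CFLAGS defaults above."""
    env = os.environ if env is None else env
    cc = env.get("CC", "cc")                       # default compiler: cc
    cflags = env.get("CFLAGS", "-O -std=c99")      # default flags
    return [cc] + cflags.split()                   # CFLAGS is space-separated

print(c_compiler_invocation({}))
print(c_compiler_invocation({"CC": "clang", "CFLAGS": "-O3 -std=c99"}))
```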
EXECUTABLE OPTIONS
Generated executables accept the same options as those generated by
futhark-c. For the -t option, the time taken to perform device setup
or teardown, including writing the input or reading the result, is not
included in the measurement. In particular, this means that timing
starts after all kernels have been compiled and data has been copied
to the device buffers but before setting any kernel arguments. Timing
stops after the kernels are done running, but before data has been
read from the buffers or the buffers have been released.
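The timing window described above can be sketched as follows. This is a schematic model in Python, not the generated code: the phase names are paraphrased from the paragraph, and run_once is a hypothetical helper that only records the order of phases.

```python
import time

def run_once(do_phase):
    """Run the phases in the documented order; return seconds inside the -t window."""
    # Not timed: device setup.
    do_phase("compile kernels")
    do_phase("copy input to device buffers")
    start = time.perf_counter()   # timing starts here
    do_phase("set kernel arguments")
    do_phase("run kernels until done")
    stop = time.perf_counter()    # timing stops here
    # Not timed: teardown.
    do_phase("read result from device buffers")
    do_phase("release device buffers")
    return stop - start

executed = []
elapsed = run_once(executed.append)
print(executed)
```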
The following additional options are accepted.
- -h, --help
Print help text to standard output and exit.
- --build-option=OPT
Add an additional build option to the string passed to clBuildProgram(). Refer to the OpenCL documentation for which options are supported. Be careful - some options can easily result in invalid results.
- --default-group-size=INT
The default size of OpenCL workgroups that are launched. Capped to the hardware limit if necessary.
- --default-num-groups=INT
The default number of OpenCL workgroups that are launched.
- --default-threshold=INT
The default parallelism threshold used for comparisons when selecting between code versions generated by incremental flattening. Intuitively, the amount of parallelism needed to saturate the GPU.
- --default-tile-size=INT
The default tile size used when performing two-dimensional tiling (the workgroup size will be the square of the tile size).
- -d, --device=NAME
Use the first OpenCL device whose name contains the given string. The special string #k, where k is an integer, can be used to pick the k-th device, numbered from zero. If used in conjunction with -p, only the devices from matching platforms are considered.
- --dump-opencl=FILE
Don’t run the program, but instead dump the embedded OpenCL program to the indicated file. Useful if you want to see what is actually being executed.
- --dump-opencl-binary=FILE
Don’t run the program, but instead dump the compiled version of the embedded OpenCL program to the indicated file. On NVIDIA platforms, this will be PTX code.
- --load-opencl=FILE
Instead of using the embedded OpenCL program, load it from the indicated file.
- --load-opencl-binary=FILE
Load an OpenCL binary from the indicated file.
- -n, --no-print-result
Do not print the program result.
- -p, --platform=NAME
Use the first OpenCL platform whose name contains the given string. The special string #k, where k is an integer, can be used to pick the k-th platform, numbered from zero.
- -P, --profile
Gather profiling data while executing and print out a summary at the end. When -r is used, only the last run will be profiled. Implied by -D.
- --param=ASSIGNMENT
Set a tuning parameter to the given value. ASSIGNMENT must be of the form NAME=INT. Use --print-params to see which names are available.
- --print-params
Print all tuning parameters that can be set with --param or --tuning.
- --tuning=FILE
Read size=value assignments from the given file.
- --list-devices
List all OpenCL devices and platforms available on the system.
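The selection rules for -d and -p can be modelled with a short Python sketch. It is one reading of the descriptions above, not the runtime's actual code: it assumes case-sensitive substring matching and assumes that #k counts devices among the platforms that match -p. The function names and the platform and device names are hypothetical.

```python
def matches(name, pattern, index):
    """True if PATTERN selects this entry: '#k' picks index k, else substring match."""
    if pattern.startswith("#") and pattern[1:].isdigit():
        return index == int(pattern[1:])
    return pattern in name

def select_device(platforms, device="", platform=None):
    """platforms is a list of (platform_name, [device_names]) pairs."""
    candidates = []
    for p_idx, (p_name, devices) in enumerate(platforms):
        # With -p, only devices from matching platforms are considered.
        if platform is not None and not matches(p_name, platform, p_idx):
            continue
        candidates.extend((p_name, d_name) for d_name in devices)
    # Pick the first candidate whose name matches, numbering from zero.
    for d_idx, (p_name, d_name) in enumerate(candidates):
        if matches(d_name, device, d_idx):
            return (p_name, d_name)
    return None

# Hypothetical system with two platforms and three devices.
system = [
    ("Intel(R) OpenCL", ["Intel(R) UHD Graphics"]),
    ("NVIDIA CUDA", ["NVIDIA GeForce RTX 3080", "NVIDIA GeForce GTX 1080"]),
]
print(select_device(system, device="GeForce"))
print(select_device(system, device="#1", platform="NVIDIA"))
```

On a real system, --list-devices shows the names that such patterns are matched against.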