futhark-opencl
SYNOPSIS
futhark opencl [options…] infile
DESCRIPTION
futhark opencl translates a Futhark program to C code invoking
OpenCL kernels, and either compiles that C code with gcc(1) to an
executable binary program, or produces a .h and .c file that
can be linked with other code. The standard Futhark optimisation
pipeline is used, and GCC is invoked with -O, -lm, and -std=c99.
The resulting program will otherwise behave exactly as
one compiled with futhark c.
futhark opencl uses -lOpenCL to link (-framework OpenCL on
macOS). If using --library, you will need to do the same when
linking the final binary.
OPTIONS
-h | Print help text to standard output and exit. |
--library | Generate a library instead of an executable. Appends .c/.h to the name indicated by the -o option to determine output file names. |
-o outfile | Where to write the result. If the source program is named foo.fut, this defaults to foo. |
--safe | Ignore unsafe in the program and perform safety checks unconditionally. |
-v verbose | Enable debugging output. If compilation fails due to a compiler error, the result of the last successful compiler step will be printed to standard error. |
-V | Print version information on standard output and exit. |
-W | Do not print any warnings. |
--Werror | Treat warnings as errors. |
EXECUTABLE OPTIONS
Generated executables accept the same options as those generated by
futhark-c. For the -t option, the time taken to perform
device setup or teardown, including writing the input or reading the
result, is not included in the measurement. In particular, this means
that timing starts after all kernels have been compiled and data has
been copied to the device buffers, but before any kernel arguments
are set. Timing stops after the kernels are done running, but before
data has been read from the buffers or the buffers have been released.
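The boundaries of the timed region can be pictured with a small Python sketch. It is purely illustrative: the phase names are labels taken from the description above, not functions of the generated program, and no OpenCL calls are made.

```python
import time

def timed_run(log):
    """Record each phase in `log` and return the time spent in the timed region."""
    log.append("compile kernels")            # untimed: device setup
    log.append("copy input to device")       # untimed: writing the input
    start = time.perf_counter()              # timing starts here
    log.append("set kernel arguments")       # timed
    log.append("run kernels to completion")  # timed
    elapsed = time.perf_counter() - start    # timing stops here
    log.append("read result from device")    # untimed: reading the result
    log.append("release buffers")            # untimed: teardown
    return elapsed

phases = []
elapsed = timed_run(phases)
print(f"timed region: {elapsed:.6f} s")
```

Only the two phases between the start and stop marks contribute to the figure reported for -t.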
The following additional options are accepted.
--build-option=OPT | Add an additional build option to the string passed to clBuildProgram(). Refer to the OpenCL documentation for which options are supported. Be careful: some options can easily result in invalid results. |
--default-group-size=INT | The default size of OpenCL workgroups that are launched. Capped to the hardware limit if necessary. |
--default-num-groups=INT | The default number of OpenCL workgroups that are launched. |
--default-threshold=INT | The default parallelism threshold used for comparisons when selecting between code versions generated by incremental flattening. Intuitively, the amount of parallelism needed to saturate the GPU. |
--default-tile-size=INT | The default tile size used when performing two-dimensional tiling (the workgroup size will be the square of the tile size). |
-d, --device=NAME | Use the first OpenCL device whose name contains the given string. The special string #k, where k is an integer, can be used to pick the k-th device, numbered from zero. If used in conjunction with -p, only the devices from matching platforms are considered. |
--dump-opencl=FILE | Don’t run the program, but instead dump the embedded OpenCL program to the indicated file. Useful if you want to see what is actually being executed. |
--dump-opencl-binary=FILE | Don’t run the program, but instead dump the compiled version of the embedded OpenCL program to the indicated file. On NVIDIA platforms, this will be PTX code. |
--load-opencl=FILE | Instead of using the embedded OpenCL program, load it from the indicated file. |
--load-opencl-binary=FILE | Load an OpenCL binary from the indicated file. |
-p, --platform=NAME | Use the first OpenCL platform whose name contains the given string. The special string #k, where k is an integer, can be used to pick the k-th platform, numbered from zero. |
--print-sizes | Print all sizes that can be set with --size or --tuning. |
-P, --profile | Gather profiling data while executing and print out a summary at the end. When -r is used, only the last run will be profiled. Implied by -D. |
--size=NAME=INT | Set a configurable run-time parameter to the given value. Use --print-sizes to see which are available. |
--tuning=FILE | Read size=value assignments from the given file. |
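The selection rules described for -d and -p can be sketched in Python as follows. This is an illustration of the documented behaviour, not the code the generated program uses. The function names and the platform and device names are hypothetical, and counting #k for devices across all devices of the matching platforms is an assumption.

```python
def matches(spec, index, name):
    """True if the entry `name` at zero-based position `index` satisfies `spec`.

    A spec of the form '#k' selects by position; any other spec is a
    substring test on the name (the empty spec matches everything).
    """
    if spec.startswith("#") and spec[1:].isdigit():
        return index == int(spec[1:])
    return spec in name

def select_device(platforms, device_spec="", platform_spec=""):
    """Pick a (platform, device) pair from a list of (platform_name, [device_names])."""
    # Only devices from matching platforms are considered.
    candidates = [
        (pname, dname)
        for pi, (pname, devices) in enumerate(platforms)
        if matches(platform_spec, pi, pname)
        for dname in devices
    ]
    # Take the first candidate device that satisfies the device spec.
    for di, (pname, dname) in enumerate(candidates):
        if matches(device_spec, di, dname):
            return pname, dname
    return None

# Hypothetical hardware, for illustration only.
platforms = [
    ("NVIDIA CUDA", ["GeForce RTX 2080", "Tesla K40"]),
    ("Intel(R) OpenCL", ["Intel(R) UHD Graphics", "Intel(R) Core(TM) CPU"]),
]
print(select_device(platforms, device_spec="Tesla"))
print(select_device(platforms, device_spec="#1", platform_spec="Intel"))
```

Under these assumptions, the first call corresponds to -d Tesla and the second to -p Intel -d '#1'. Quoting the #k form on a real command line avoids the shell treating # as the start of a comment.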