.. role:: ref(emphasis)

.. _futhark-opencl(1):

==============
futhark-opencl
==============

SYNOPSIS
========

futhark opencl [options...]

DESCRIPTION
===========

``futhark opencl`` translates a Futhark program to C code invoking
OpenCL kernels, and either compiles that C code with a C compiler to
an executable binary program, or produces a ``.h`` and ``.c`` file
that can be linked with other code. The standard Futhark optimisation
pipeline is used.

``futhark opencl`` uses ``-lOpenCL`` to link (``-framework OpenCL`` on
macOS). If using ``--library``, you will need to do the same when
linking the final binary.

The GPU terminology used is derived from CUDA nomenclature
(e.g. "thread block" instead of "workgroup"), but OpenCL nomenclature
is also supported for compatibility.

OPTIONS
=======

Accepts the same options as :ref:`futhark-c(1)`.

ENVIRONMENT VARIABLES
=====================

``CC``
  The C compiler used to compile the program. Defaults to ``cc`` if
  unset.

``CFLAGS``
  Space-separated list of options passed to the C compiler. Defaults
  to ``-O -std=c99`` if unset.

EXECUTABLE OPTIONS
==================

Generated executables accept the same options as those generated by
:ref:`futhark-c(1)`. For the ``-t`` option, the time taken to perform
device setup or teardown, including writing the input or reading the
result, is not included in the measurement. In particular, this means
that timing starts after all kernels have been compiled and data has
been copied to the device buffers, but before setting any kernel
arguments. Timing stops after the kernels are done running, but
before data has been read from the buffers or the buffers have been
released.

The following additional options are accepted.

--build-option=OPT

  Add an additional build option to the string passed to
  ``clBuildProgram()``. Refer to the OpenCL documentation for which
  options are supported. Be careful: some options can easily produce
  invalid results.

--default-thread-block-size=INT, --default-group-size=INT

  The default size of thread blocks that are launched. Capped to the
  hardware limit if necessary.

--default-num-thread-blocks=INT, --default-num-groups=INT

  The default number of thread blocks that are launched.

--default-threshold=INT

  The default parallelism threshold used for comparisons when
  selecting between code versions generated by incremental
  flattening. Intuitively, the amount of parallelism needed to
  saturate the GPU.

--default-tile-size=INT

  The default tile size used when performing two-dimensional tiling
  (the workgroup size will be the square of the tile size).

-d, --device=NAME

  Use the first OpenCL device whose name contains the given string.
  The special string ``#k``, where ``k`` is an integer, can be used to
  pick the *k*-th device, numbered from zero. If used in conjunction
  with ``-p``, only the devices from matching platforms are
  considered.

--dump-opencl=FILE

  Don't run the program, but instead dump the embedded OpenCL program
  to the indicated file. Useful if you want to see what is actually
  being executed.

--dump-opencl-binary=FILE

  Don't run the program, but instead dump the compiled version of the
  embedded OpenCL program to the indicated file. On NVIDIA platforms,
  this will be PTX code.

--load-opencl=FILE

  Instead of using the embedded OpenCL program, load it from the
  indicated file.

--load-opencl-binary=FILE

  Load an OpenCL binary from the indicated file.

-p, --platform=NAME

  Use the first OpenCL platform whose name contains the given string.
  The special string ``#k``, where ``k`` is an integer, can be used to
  pick the *k*-th platform, numbered from zero.

--list-devices

  List all OpenCL devices and platforms available on the system.

SEE ALSO
========

:ref:`futhark-test(1)`, :ref:`futhark-cuda(1)`, :ref:`futhark-c(1)`
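
EXAMPLES
========

The name-matching rule described for ``-d`` and ``-p`` can be
illustrated with a small model. The following Python sketch is not
the code used by generated executables; the function ``pick`` and the
device names are hypothetical, and case-sensitive substring matching
is an assumption. It only mirrors the documented behaviour: ``#k``
selects the *k*-th entry counting from zero, and any other string
selects the first entry whose name contains it.

```python
def pick(names, spec):
    """Return the index of the entry selected by spec, or None if none matches.

    Hypothetical model of the documented -d/-p rule, not Futhark's own code.
    """
    # "#k" selects the k-th entry, numbered from zero.
    if spec.startswith("#") and spec[1:].isdigit():
        k = int(spec[1:])
        return k if k < len(names) else None
    # Otherwise pick the first entry whose name contains the string
    # (assumed here to be a case-sensitive comparison).
    for i, name in enumerate(names):
        if spec in name:
            return i
    return None


# Hypothetical device names, for illustration only.
devices = [
    "Intel(R) UHD Graphics",
    "NVIDIA GeForce RTX 3080",
    "NVIDIA GeForce GTX 1080",
]

print(pick(devices, "NVIDIA"))  # 1: first name containing "NVIDIA"
print(pick(devices, "#2"))      # 2: third entry, counting from zero
print(pick(devices, "AMD"))     # None: nothing matches
```

Under this model, a substring that matches several devices always
resolves to the earliest one, so ``#k`` is the way to reach a later
device with a similar name.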