Blaze Execution
===============

 * [Blaze Function Use Cases](blazefunc-usecases.md)
 * [Blaze NumPy-like API](blaze-numpy-api.md)
 * [Elementwise Reductions](elwise-reduction-ufuncs.md)
 * [Blaze AIR](blaze-air.md)
 * [Deferred CKernel Interface](deferred-ckernel-interface.md)
 * [CKernel](ckernel-interface.md)


The blaze execution system takes blaze expressions, which are built up by
applying blaze functions to blaze arrays. Blaze arrays can describe data
or other (sub-)expressions. The goal of the blaze execution system is to
determine a suitable evaluation strategy compatible with the input data
sources. This may include assembling something like an SQL query, composing
existing compiled code, or JIT-compiling kernel implementations written in
flypy.

The expressions themselves are simple nodes in a DAG. Each node is typed as
soon as it is created, which allows early error detection.
Nodes are created through function application:

    blaze.add(a, b) # returns a new blaze array with a deferred expression
                    # Add(a.expr, b.expr)
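
To make the idea of typed, deferred nodes concrete, here is a minimal sketch.
It is not blaze's implementation: `Node`, `array` and `add` are hypothetical
names, and types are reduced to a dtype string plus a shape tuple. It only
illustrates that building a node computes nothing, while a type mismatch is
reported at construction time.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """One node of the expression DAG (hypothetical, not blaze's class)."""
    op: str                      # e.g. "array" or "add"
    dtype: str                   # result type, fixed at construction
    shape: Tuple[int, ...]       # result shape, fixed at construction
    args: Tuple["Node", ...] = ()


def array(dtype, shape):
    """Leaf node standing in for a concrete data source."""
    return Node("array", dtype, tuple(shape))


def add(a, b):
    """Build a deferred Add node; type errors surface here, not at eval."""
    if a.dtype != b.dtype:
        raise TypeError(f"add: dtype mismatch {a.dtype} vs {b.dtype}")
    if a.shape != b.shape:
        raise TypeError(f"add: shape mismatch {a.shape} vs {b.shape}")
    return Node("add", a.dtype, a.shape, (a, b))


x = array("float64", (3,))
y = array("float64", (3,))
expr = add(add(x, y), x)         # nothing is computed yet

try:
    add(x, array("int32", (3,)))
    early_error = None
except TypeError as exc:
    early_error = str(exc)       # raised while building the graph
```

Because `x` appears twice in `expr`, the structure is a DAG rather than a
tree: both uses refer to the same node object.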

High-level expressions may be sent over a network or passed on to the execution
engine to determine a suitable evaluation plan. The evaluation plan may
include C ABI constructs to compose code from low level libraries, JIT-compiled
code, assembled queries, etc.

This document describes the design of the execution system and how it goes
from expressions to execution.

At the highest level we have blaze expressions. In between we have an
intermediate representation of the expression along with execution details.
At the lowest level we have a set of kernels linked together by their
arguments. This process works as follows:

 * operations over arrays are lazy and accumulate a directed acyclic graph
 * `blaze.eval()` starts the execution process, which involves conversion to
   AIR and executing the result
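
The two steps above can be sketched in a few lines of Python. Everything here
is an assumption made for illustration: the tuple-based expression form, the
`lower` and `evaluate` helpers, and the flat instruction list standing in for
AIR are not blaze's actual data structures.

```python
import operator


def leaf(values):
    """Expression leaf holding concrete data."""
    return ("leaf", list(values))


def lazy(op_name, left, right):
    """Record an operation without executing it."""
    return (op_name, left, right)


def lower(expr, program=None):
    """Flatten an expression into a list of instructions (toy stand-in for AIR).

    Returns the program and the index of the instruction holding the result.
    """
    if program is None:
        program = []
    if expr[0] == "leaf":
        program.append(("const", expr[1]))
    else:
        op_name, left, right = expr
        _, left_index = lower(left, program)
        _, right_index = lower(right, program)
        program.append((op_name, left_index, right_index))
    return program, len(program) - 1


KERNELS = {"add": operator.add, "mul": operator.mul}


def evaluate(expr):
    """Lower the expression, then run each instruction with its kernel."""
    program, result_index = lower(expr)
    values = []
    for instr in program:
        if instr[0] == "const":
            values.append(instr[1])
        else:
            op_name, left_index, right_index = instr
            kernel = KERNELS[op_name]
            values.append([kernel(p, q)
                           for p, q in zip(values[left_index], values[right_index])])
    return values[result_index]


a = leaf([1, 2, 3])
b = leaf([10, 20, 30])
expr = lazy("mul", lazy("add", a, b), b)   # builds a graph only
result = evaluate(expr)                    # [110, 440, 990]
```

This sketch lowers a shared sub-expression once per use; a real lowering step
would presumably keep shared nodes shared.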


Blaze Expressions and Blaze Functions
-------------------------------------

 * [Blaze Function Use Cases](blazefunc-usecases.md)

Blaze expressions are generated by applying blaze functions.
Blaze functions are the user-facing representation of functionality
in blaze. They are analogous to the `ufunc`/`gufunc` of numpy, but more
general, since inputs can be pattern matched more flexibly. Matching happens
through datashape type signatures, to which arguments must conform.

These functions may be *implemented* using kernels of various sorts for various
backends.

We support open-ended extension through overloading at the blaze function level.
There are two forms of overloading at play:

 * function overloading
 * kernel overloading

The former overloads a logical blaze function, for instance for typing
purposes. The latter allows kernel (implementation) overloading,
where kernels are associated logically with a blaze function for a
certain implementation kind. Implementation kinds include flypy, SQL,
ckernel, and so forth.
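
The two forms of overloading can be pictured as two lookup tables attached to
one function object. The sketch below is hypothetical: `BlazeFunction`,
`overload`, `implement` and `find_kernel` are illustrative names, signatures
are matched by exact type names rather than by datashape pattern matching, and
the `"python"` implementation kind is a stand-in chosen so the example runs
without any backend.

```python
class BlazeFunction:
    """Hypothetical stand-in for a blaze function with two overload tables."""

    def __init__(self, name):
        self.name = name
        self.signatures = {}   # argument types -> result type
        self.kernels = {}      # (impl kind, argument types) -> implementation

    def overload(self, argtypes, restype):
        """Function overloading: declare a typing signature."""
        self.signatures[tuple(argtypes)] = restype

    def implement(self, impl_kind, argtypes, kernel):
        """Kernel overloading: attach an implementation of a given kind."""
        argtypes = tuple(argtypes)
        if argtypes not in self.signatures:
            raise TypeError(f"{self.name}: no signature for {argtypes}")
        self.kernels[(impl_kind, argtypes)] = kernel

    def result_type(self, argtypes):
        return self.signatures[tuple(argtypes)]

    def find_kernel(self, impl_kind, argtypes):
        return self.kernels[(impl_kind, tuple(argtypes))]


add = BlazeFunction("add")
add.overload(["int32", "int32"], "int32")
add.overload(["float64", "float64"], "float64")

# Two kernels for the same logical signature, one per implementation kind.
add.implement("python", ["int32", "int32"], lambda x, y: x + y)
add.implement("sql", ["int32", "int32"], lambda x, y: f"({x} + {y})")

py_kernel = add.find_kernel("python", ["int32", "int32"])
sql_kernel = add.find_kernel("sql", ["int32", "int32"])
```

Keeping typing and implementation in separate tables means an expression can
be typed even when no kernel exists yet for the backend that will eventually
run it.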


Blaze Expression Lowering to Blaze AIR
--------------------------------------

The highest level of the blaze execution system takes the interface
provided to users of blaze, which includes blaze functions and data
descriptions from concrete input arrays, and lowers it to blaze AIR.


Blaze AIR JIT Compilation
-------------------------

 * [Blaze AIR Documentation](blaze-air.md)

Once we are in blaze AIR (Array Intermediate Representation), we apply
successive passes in a pipeline to reduce the expressions to kernels which
can be applied to their arguments. This is further described in the link
above.
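
As a rough picture of "successive passes in a pipeline", the sketch below
runs a list of pass functions over a toy instruction list. The pass names
(`annotate_types`, `select_kernels`) and the dictionary-based instructions are
assumptions for illustration only; the actual passes are the ones described in
the AIR documentation.

```python
import operator


def annotate_types(program):
    """Pass 1 (hypothetical): attach a result type to every instruction."""
    return [dict(instr, dtype="float64") for instr in program]


def select_kernels(program):
    """Pass 2 (hypothetical): replace symbolic op names with callables."""
    table = {"add": operator.add, "mul": operator.mul}
    return [dict(instr, kernel=table[instr["op"]]) for instr in program]


PIPELINE = [annotate_types, select_kernels]


def run_pipeline(program, passes=PIPELINE):
    """Apply each pass in order; every pass returns a new program."""
    for apply_pass in passes:
        program = apply_pass(program)
    return program


program = [{"op": "add", "args": (0, 1)}, {"op": "mul", "args": (2, 1)}]
lowered = run_pipeline(program)
```

Each pass returns new instructions instead of mutating its input, so the
original `program` stays available for inspection after lowering.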


The Deferred CKernel Interface
------------------------------

 * [Deferred CKernel Interface Documentation](deferred-ckernel-interface.md)

At the lowest level are primitive C ABI interfaces designed to be
interoperable across any library boundaries, including between systems
using different standard libraries. The higher level systems use these
low level interfaces as JIT compilation targets and as a way to
import implementation kernels from outside of blaze.

One fundamental aspect of both blaze and dynd is deferred execution.
The deferred ckernel supports deferred and cached execution at the low
level, just one small step above the ckernel interface.
This object provides a simple interface for building a ckernel whose
structure has already been determined up to the dynd type level, so that
only dynd metadata and a kernel type (i.e. single or strided) are needed
to build the ckernel.
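
The split between "type already known" and "layout supplied later" can be
sketched in Python with closures. This is not the C interface itself:
`DeferredAdd`, `instantiate` and the use of element strides as a stand-in for
dynd metadata are all assumptions made for illustration.

```python
class DeferredAdd:
    """Hypothetical deferred kernel: types are fixed, layout is not."""

    def __init__(self, dtype):
        self.dtype = dtype           # known when the deferred kernel is created

    def instantiate(self, kernel_type, strides=None):
        """Bind layout details and return a ready-to-call kernel."""
        if kernel_type == "single":
            def single(dst, a, b):
                dst[0] = a[0] + b[0]
            return single
        if kernel_type == "strided":
            dst_s, a_s, b_s = strides   # element strides, stand-in for metadata

            def strided(dst, a, b, count):
                for i in range(count):
                    dst[i * dst_s] = a[i * a_s] + b[i * b_s]
            return strided
        raise ValueError(f"unknown kernel type: {kernel_type}")


deferred = DeferredAdd("int32")

single = deferred.instantiate("single")
out1 = [0]
single(out1, [2], [3])

# Stride 2 reads every other element of `a`; stride 0 broadcasts `b`.
strided = deferred.instantiate("strided", strides=(1, 2, 0))
out3 = [0, 0, 0]
strided(out3, [1, 9, 2, 9, 3], [10], 3)
```

The same deferred object can be instantiated many times with different
layouts, which is what makes it suitable for caching.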

One of the use cases driving the deferred dynamic kernel is to provide
individual kernels to blaze and dynd function dispatch. While many of
the functions provided by blaze will be JIT-compiled LLVM bitcode, there
also needs to be a way to expose functions to blaze from external systems
which know little or nothing about blaze and dynd.


The CKernel Builder
-------------------

 * [CKernel Documentation](ckernel-interface.md)

The lowest level execution interface in blaze is the ckernel.
Any time an operation gets executed in blaze, it is first reduced
down into a ckernel, either via JIT compilation or assembling together
other ckernels, and then executed as a ckernel.

When constructing a ckernel from JIT compilation or deferred ckernels,
the `ckernel_builder` object is used. This is a small C struct with a
static buffer which can hold small ckernels, and API functions for
dynamically growing the ckernel buffer for larger ones.
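
The small-buffer idea behind such a builder can be shown with a Python
analogue. The real `ckernel_builder` is a C struct; the class below, its
`STATIC_CAPACITY` of 16 bytes, the doubling growth policy and the method
names are all assumptions chosen only to illustrate the behaviour.

```python
class CKernelBuilder:
    """Hypothetical Python analogue of a small-buffer kernel builder."""

    STATIC_CAPACITY = 16   # assumed size; the real struct's size is not given here

    def __init__(self):
        self.data = bytearray(self.STATIC_CAPACITY)   # plays the static buffer
        self.size = 0
        self.on_heap = False   # True once the static buffer was outgrown

    def ensure_capacity(self, required):
        """Grow the buffer (at least doubling) when `required` bytes do not fit."""
        if required > len(self.data):
            new_capacity = max(required, 2 * len(self.data))
            self.data.extend(bytes(new_capacity - len(self.data)))
            self.on_heap = True

    def append(self, payload):
        """Append one kernel's bytes and return the offset where it was placed."""
        offset = self.size
        self.ensure_capacity(offset + len(payload))
        self.data[offset:offset + len(payload)] = payload
        self.size = offset + len(payload)
        return offset


builder = CKernelBuilder()
first = builder.append(b"\x01" * 8)     # fits in the static buffer
fits_statically = not builder.on_heap
second = builder.append(b"\x02" * 24)   # forces the buffer to grow
```

Small kernels never trigger an allocation, while composite kernels built from
several parts simply extend the same contiguous buffer.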

At the ckernel level, all information about types and possible variations
in memory layout has been baked into the code and data that make
up the ckernel. All that is left is the ability to call the kernel function
and to free the resources associated with the ckernel. This means that
code using a ckernel can be quite simple: it needs to know the ckernel's
function prototype and to have data pointers that it knows conform to the
types baked into the ckernel, and then it can execute it.
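
From the caller's point of view, a ckernel therefore offers only two
operations: call and free. The Python class below is a hypothetical analogue
of that contract, not the C layout; `CKernel`, `scale_kernel` and the
`resources` dictionary are illustrative names, and lists stand in for data
pointers.

```python
class CKernel:
    """Hypothetical ckernel stand-in: a call entry point plus a destructor."""

    def __init__(self, function, resources):
        self._function = function
        self._resources = resources
        self.freed = False

    def __call__(self, dst, src):
        if self.freed:
            raise RuntimeError("ckernel used after free")
        self._function(dst, src, self._resources)

    def free(self):
        """Release whatever the kernel owns; it must not be called afterwards."""
        self._resources = None
        self.freed = True


def scale_kernel(dst, src, resources):
    # Type and layout decisions are already made: one element in, one out.
    dst[0] = src[0] * resources["factor"]


ck = CKernel(scale_kernel, {"factor": 3})
out = [0]
ck(out, [7])       # the caller only supplies conforming data
ck.free()          # and releases the kernel when done
```

The caller never inspects types or strides here; any mismatch between the
data and what the kernel expects is the caller's responsibility.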
