ACL Pack API Reference

Version: 1.0.3
Platform: Android arm64-v8a · Linux aarch64
Language: C++17
Delivery: Static library (libacl.a) + Headers

Operator Catalog
Getting Started
Error Codes
Type Definitions
Analysis
Arithmetic
Color Conversion
Filter
Geometric
Feature Detection
Transform
Math
Drawing
Contour Analysis
Utilities
Namespace Availability Matrix

Operator Catalog

A one-page overview of every operator ACL Pack ships, grouped by category. The tier suffix indicates which license unlocks the call:

No suffix — Starter tier
[Pro] — requires Pro or Business license
[Business] — requires Business license

Per-operator signatures and supported types live in the category sections below.

Category	Operators
Filter	GaussianBlur, BoxFilter, Filter2D, SepFilter2D, Sobel, Scharr, Laplacian, Canny, MedianFilter, BilateralFilter `[Pro]`, NLMeansDenoising `[Pro]`, GuidedFilter `[Pro]`, UnsharpMask `[Pro]`, StackBlur, GaborFilter `[Pro]`, Erode / Dilate, EdgePreservingFilter `[Business]`, MergeMertens `[Business]`, Tonemap `[Business]`
Color	RGB2Gray, BGR↔RGB / BGRA / RGBA (10 channel-swap variants), RGB↔HSV `[Pro]`, RGB↔Lab `[Pro]`, RGB↔YUV (NV21 / YV12 / YUV444, BT601 / 709 / 2020), Bayer demosaic, GammaTransform
Geometric	Resize (NEAREST / LINEAR2D / AREA_AVG / CUBIC4x4), Rotate (0 / 180 / CW90 / CCW90 / FLIP_V / FLIP_H / XPOSE), PyrDown / PyrUp / buildPyramid, NEON resizeYUV / rotateYUV
Arithmetic	AddImg, AbsDiff, AddWeighted, AlphaImgFusion, Multiply, Threshold, AdaptiveThreshold `[Pro]`, Bitwise AND / NOT / XOR, LUT, ConvertScaleAbs, InRange, Normalize, Phase `[Pro]`, Magnitude `[Pro]`, LinearTransform2x2 `[Business]`
Analysis	Integral, Histogram, BlockAverage, EqualizeHist, CopyMakeBorder, CLAHE `[Pro]`, HistMatch `[Pro]`, MinMaxLoc, Mean, Count, MatchTemplate `[Pro]`, Moments `[Pro]`, FindContours `[Pro]`, ExtractBlockPixels `[Business]`, DistanceTransform `[Business]`, ConnectedComponentLabeling / connectedComponent_8n_dfs `[Business]`
Feature	FAST, Harris, Shi-Tomasi (+ Detect variant), ORB (detect + detectAndCompute), HOG, HoughLines / HoughLinesP / HoughCircles, OpticalFlowLK, bfMatch / bfMatchBinary / bfKnnMatch / bfKnnMatchBinary — all `[Pro]`; SIFT, SURF `[Business]`
Transform	WarpAffine, WarpPerspective, Remap (CPP only), GetAffineTransform, GetPerspectiveTransform, GetRotationMatrix2D, FindHomography `[Pro]`, yuvRemap `[Business]`
Math (NEON)	DFT (dft1d / dft2d / dftReal1d / idftReal1d), mulSpectrums, getOptimalDFTSize — all `[Pro]`
Draw (CPP)	drawLine, drawRect, drawCircle, putText (u8 only)
Contour (CPP)	contourArea, arcLength, boundingRect, convexHull, approxPolyDP, minAreaRect, fitEllipse — all `[Pro]`

Commercial Tier And Type Policy

The customer-facing tier names are Starter, Pro, and Business. Older labels such as Core, Advanced, and Full are not used by the commercial headers or package metadata.

Tier	Operator availability	Image pixel types admitted by the commercial header
Starter	Starter operators	`uint8_t`
Pro	Starter + Pro operators	`uint8_t`, `uint16_t`
Business	Starter + Pro + Business operators	`uint8_t`, `uint16_t`, `float`

Each operator's Types table describes the implementation-level template support. In commercial packages, the actually callable type set is the intersection of that table and the tier type policy above. Unsupported combinations are rejected by the delivered <acl/api.h> at compile time, often through explicitly deleted template specializations. For example, blockAverage<uint16_t> and blockAverage<float> are deleted in Starter; blockAverage<float> is deleted in Pro; Business exposes all three listed pixel types.
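
The compile-time rejection described above can be pictured with a small standalone sketch. It is illustrative only: `SKETCH_TIER` and `tierGatedCopy` are hypothetical names that do not appear in `<acl/api.h>`, and the delivered header may implement the policy differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tier switch for this sketch only: 0 = Starter, 1 = Pro, 2 = Business.
#define SKETCH_TIER 1

// Generic template standing in for an operator implemented for every pixel type.
template <class T>
int tierGatedCopy(const T* src, T* dst, int count) {
    if (src == nullptr || dst == nullptr || count <= 0) return -2;  // same value as ACL_ERR_INVAL
    for (int i = 0; i < count; ++i) dst[i] = src[i];                // placeholder body
    return 0;                                                       // same value as ACL_OK
}

// Tier policy: delete the specializations the tier does not admit.
#if SKETCH_TIER < 1
template <> int tierGatedCopy<uint16_t>(const uint16_t*, uint16_t*, int) = delete;
#endif
#if SKETCH_TIER < 2
template <> int tierGatedCopy<float>(const float*, float*, int) = delete;
#endif
```

With `SKETCH_TIER` set to 1 (Pro in this sketch), `tierGatedCopy<uint8_t>` and `tierGatedCopy<uint16_t>` compile, while any call to `tierGatedCopy<float>` is rejected by the compiler because that specialization is deleted.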

The Trial package is separate from the generic acl::* / acl::neon::* API surface. It exposes only two fixed-parameter wrappers under acl::trial: resizeBilinear2xDown_cpp and resizeBilinear2xDown_neon. Trial users should not call the generic namespaces documented for paid tiers.

cpp

namespace acl::trial {
int resizeBilinear2xDown_cpp(const uint8_t* srcImage, uint8_t* dstImage);
int resizeBilinear2xDown_neon(const uint8_t* srcImage, uint8_t* dstImage);
}

These Trial wrappers use the fixed Trial input size 1920x1280; the 2x downscale wrapper writes 960x640.

Getting Started

Initialization

cpp

#include <acl/acl.h>

// Initialize with license file
int result = acl::init("/path/to/license.dat");

// Check the result
if (result == 0) {
    // Success — all operators available
} else {
    // Failure — see error codes below
}

// Get library version
const char* ver = acl::version();  // "1.0.3"

License Initialization

cpp

namespace acl {
    int init(const char* licensePath, JNIEnv* env = nullptr, jobject context = nullptr);
    const char* version();   // returns "1.0.3"
}

Parameter	Type	Description
`licensePath`	`const char*`	Absolute path to `license.dat` on the device
`env`	`JNIEnv*`	JNI environment, optional. Reserved in the ABI for future use; the current implementation does not read it. Pass `nullptr` for pure native calls
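`context`	`jobject`	Android Context, optional. Like `env`, it is reserved in the ABI and not read by the current implementation; pass `nullptr` for pure native calls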

init() reads the license file and verifies its integrity and tier. Call it once at process start, before invoking any operator; calling it again later in the same process returns the same status without re-reading the file. Within the scope of your purchase agreement, the version you received continues to work.

Return values:

0 — success
-1001 — License invalid (file missing, signature corrupted, tampered with, or init() not called)
-1005 — Tier mismatch: license.tier does not match the compiled library tier (at init stage), or the operator is not in the current tier (at runtime)
-1006 — Resolution does not match the Trial fixed size (1920×1280)

Architecture

ACL Pack provides two parallel implementations:

acl::{module}::* — Portable C++ scalar implementation. Many operators are templated for the standard image pixel types (uint8_t / uint16_t / float; see each operator's description for exact combinations). All scalar operators sit directly under acl::{module}:: (no cpp segment). All declarations ship in a single header — <acl/api.h>.
acl::neon::{module}::* — ARM NEON hand-vectorized implementation. Most operators target uint8_t first; uint16_t and float support is operator-dependent and may use scalar fallback. Typical speedup is 2-25× over the scalar layer, peaking at 50×+.

The two API signatures are almost identical. On Android arm64-v8a the neon:: version is recommended; if a given entry point is only provided in scalar (the docs will note this explicitly), use the corresponding acl::{module}:: entry point.

short / int16_t, int, int64_t, and double are supported in specific auxiliary roles such as gradient outputs, labels, integral accumulators, transform matrices, moments, and parameters. They are not general image pixel input types. <acl/typeDef.h> defines shared public structs/enums; it is not a guarantee that every enum value or datatype is implemented by every operator.

Conventions

Image data is passed as raw pointers (const T* input, T* output)
stride is in bytes (not pixels). When 0 is passed it is auto-computed as width × channels × sizeof(T) (requires contiguous memory)
cn is the channel count (1 = grayscale, 3 = RGB, 4 = RGBA)
Returns 0 (ACL_OK) on success; negative values are error codes
Memory is allocated and freed by the caller; the library has zero implicit allocation
Most operators do not support inplace (src == dst) and require a separately allocated output buffer; the few that support inplace are noted at the operator
Operator calls require acl::init() to have returned 0; otherwise they return -1001 without performing any computation
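
The byte-stride convention is a common source of off-by-a-factor bugs with 16-bit and float images, so a short sketch may help. The helpers `resolveStride` and `rowPtr` are hypothetical and not part of ACL Pack; they only illustrate the arithmetic stated in the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Resolve a byte stride as the convention describes: 0 means tightly packed rows.
template <class T>
int resolveStride(int strideBytes, int width, int cn) {
    return strideBytes != 0 ? strideBytes
                            : width * cn * static_cast<int>(sizeof(T));
}

// Address of row y. The stride is in bytes, so the offset is applied through a byte pointer.
template <class T>
const T* rowPtr(const T* image, int y, int strideBytes) {
    const auto* bytes = reinterpret_cast<const uint8_t*>(image);
    return reinterpret_cast<const T*>(bytes + static_cast<std::ptrdiff_t>(y) * strideBytes);
}
```

For a 640-pixel-wide, 3-channel `uint16_t` image, the auto-computed stride is 640 × 3 × 2 = 3840 bytes, not 1920 elements.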

Error Codes

cpp

#include <acl/err.h>

Code	Macro	Description
`0`	`ACL_OK`	Operation completed successfully
`-1`	`ACL_ERR_GENERIC`	Unclassified failure
`-2`	`ACL_ERR_INVAL`	Invalid parameter (null ptr, zero size, out-of-range enum)
`-3`	`ACL_ERR_NOMEM`	Out of memory / allocation failed
`-4`	`ACL_ERR_NOSUP`	Unsupported type / parameter combination
`-5`	`ACL_ERR_IO`	File / port open or I/O failure
`-1001`	`ACL_ERR_LICENSE_INVALID`	License file missing, corrupt, tampered with, or `acl::init()` has not yet succeeded
`-1005`	`ACL_ERR_NOT_LICENSED`	Tier mismatch: `license.tier` does not match the compiled library tier (detected by `acl::init`), or the requested operator is not available in the current tier (detected at call site)
`-1006`	`ACL_ERR_RESOLUTION_LIMIT`	Resolution does not match the Trial fixed size (1920×1280)

-1002, -1003, and -1004 are reserved in the ABI but never returned at runtime. All macros are defined in <acl/err.h>. See License Guide for detail on how -1001 / -1005 are raised from the license layer.
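
For logging, it can be convenient to turn a status code into text. The following is a minimal sketch with a hypothetical helper, `describeAclStatus`, which is not part of the library. It uses the numeric values from the table so that it compiles on its own; real code would include `<acl/err.h>` and switch on the `ACL_*` macros instead.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: maps the documented status codes to short descriptions.
const char* describeAclStatus(int code) {
    switch (code) {
        case 0:     return "ok";
        case -1:    return "generic failure";
        case -2:    return "invalid parameter";
        case -3:    return "out of memory";
        case -4:    return "unsupported type or parameter combination";
        case -5:    return "I/O failure";
        case -1001: return "license invalid or acl::init() has not succeeded";
        case -1005: return "not licensed for this tier or operator";
        case -1006: return "resolution does not match the Trial fixed size";
        default:    return code < 0 ? "unknown error" : "not an error code";
    }
}
```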

Type Definitions

cpp

#include <acl/typeDef.h>

Enums

RotateOrient

cpp

enum class RotateOrient {
    ROT_0,      // No rotation (copy)
    ROT_180,    // 180-degree rotation
    ROT_CW_90,  // Clockwise 90 degrees
    ROT_CCW_90, // Counter-clockwise 90 degrees
    FLIP_V,     // Vertical mirror (flip top-bottom)
    FLIP_H,     // Horizontal mirror (flip left-right)
    XPOSE       // Matrix transpose
};

InterpMode

cpp

enum class InterpMode {
    NEAREST,    // Nearest neighbor
    LINEAR2D,   // Bilinear interpolation
    AREA_AVG,   // Area-average (for downscaling)
    CUBIC4x4    // Bicubic (4x4 neighborhood)
};

BorderType

Border handling modes — how to fill out-of-bounds pixels when the kernel extends past the image. Example sequence abcdefgh (input):

cpp

enum class BorderType {
    BORDER_CONSTANT,    // 'iiiiii|abcdefgh|iiiiii' — fill with the constant parameter
    BORDER_REPLICATE,   // 'aaaaaa|abcdefgh|hhhhhh' — replicate edge pixels
    BORDER_REFLECT,     // 'fedcba|abcdefgh|hgfedc' — reflection including the edge pixel
    BORDER_WRAP,        // 'cdefgh|abcdefgh|abcdef' — wrap-around
    BORDER_REFLECT_101, // 'gfedcb|abcdefgh|gfedcb' — reflection excluding the edge pixel
    BORDER_DEFAULT = BORDER_REFLECT_101
};
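
The patterns in the comments above amount to a mapping from an out-of-range coordinate to a valid index. The sketch below reproduces that mapping for the four non-constant modes. `SketchBorder` and `borderIndex` are hypothetical names used only for illustration; this is not the library's internal code.

```cpp
#include <cassert>

// Hypothetical local enum mirroring the non-constant modes of acl::BorderType.
enum class SketchBorder { REPLICATE, REFLECT, WRAP, REFLECT_101 };

// Map any coordinate p onto [0, len). BORDER_CONSTANT needs no mapping: it writes the fill value.
int borderIndex(int p, int len, SketchBorder mode) {
    assert(len > 0);
    if (len == 1) return 0;  // every mode degenerates to the single pixel
    switch (mode) {
        case SketchBorder::REPLICATE:
            return p < 0 ? 0 : (p >= len ? len - 1 : p);
        case SketchBorder::WRAP: {
            int m = p % len;
            return m < 0 ? m + len : m;
        }
        case SketchBorder::REFLECT: {      // period 2*len: abcdefgh|hgfedcba
            const int period = 2 * len;
            int m = p % period;
            if (m < 0) m += period;
            return m < len ? m : period - 1 - m;
        }
        case SketchBorder::REFLECT_101: {  // period 2*len-2: abcdefgh|gfedcb
            const int period = 2 * len - 2;
            int m = p % period;
            if (m < 0) m += period;
            return m < len ? m : period - m;
        }
    }
    return 0;
}
```

Indexing the string "abcdefgh" with `borderIndex(p, 8, mode)` for p from -6 to 13 reproduces the padded sequences shown in the enum comments.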

BayerPattern

cpp

enum class BayerPattern { RGGB, GRBG, GBRG, BGGR };

ColorCvtGrayMode

cpp

enum class ColorCvtGrayMode {
    GRAY_LUMA,     // BT.601 luma (0.299R + 0.587G + 0.114B)
    GRAY_MAX,      // Per-pixel max of R, G, B
    GRAY_MIN,      // Per-pixel min of R, G, B
    GRAY_AVG,      // Simple average (R + G + B) / 3
    GRAY_WEIGHTED  // User-supplied weights (cR * R + cG * G + cB * B)
};

ThreshMode

cpp

enum class ThreshMode {
    THRESH_BINARY,     // dst = (src > thresh) ? maxVal : 0
    THRESH_BINARY_INV, // dst = (src > thresh) ? 0 : maxVal
    THRESH_TRUNC,      // dst = (src > thresh) ? thresh : src
    THRESH_TOZERO,     // dst = (src > thresh) ? src : 0
    THRESH_TOZERO_INV, // dst = (src > thresh) ? 0 : src
    THRESH_OTSU        // Automatic threshold (Otsu's method)
};
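
The per-pixel rules in the comments above can be written out as a scalar reference for `uint8_t`. This is a sketch with hypothetical names (`SketchThresh`, `applyThresh`), not the library's threshold operator. `THRESH_OTSU` is left out because it selects the threshold value automatically rather than defining a different per-pixel rule.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical local enum mirroring the five fixed-threshold modes.
enum class SketchThresh { BINARY, BINARY_INV, TRUNC, TOZERO, TOZERO_INV };

// Apply one mode to one pixel. The comparison is strict: src == thresh counts as "not above".
uint8_t applyThresh(uint8_t src, uint8_t thresh, uint8_t maxVal, SketchThresh mode) {
    const bool above = src > thresh;
    switch (mode) {
        case SketchThresh::BINARY:     return above ? maxVal : 0;
        case SketchThresh::BINARY_INV: return above ? 0 : maxVal;
        case SketchThresh::TRUNC:      return above ? thresh : src;
        case SketchThresh::TOZERO:     return above ? src : 0;
        case SketchThresh::TOZERO_INV: return above ? 0 : src;
    }
    return src;
}
```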

YUVEncodeStandard

cpp

enum class YUVEncodeStandard {
    STD_BT601,    // ITU-R BT.601 (SDTV)
    STD_BT709,    // ITU-R BT.709 (HDTV)
    STD_BT2020,   // ITU-R BT.2020 (UHDTV)
    STD_CUSTOM    // Caller-supplied 3x3 conversion matrix
};

NormType

cpp

enum class NormType { NORM_INF, NORM_L1, NORM_L2, NORM_MINMAX };

AdaptiveThreshMethod

cpp

enum class AdaptiveThreshMethod {
    ADAPTIVE_THRESH_MEAN_C,     // Mean within block
    ADAPTIVE_THRESH_GAUSSIAN_C  // Gaussian-weighted mean within block
};

MorphOp

cpp

enum class MorphOp {
    ERODE,   // Erosion (take minimum over kernel coverage)
    DILATE   // Dilation (take maximum over kernel coverage)
};

ValueRange

Used to specify the output value range (e.g. by normalize):

cpp

enum class ValueRange {
    STD_NEG1_TO_POS1,    // Result falls in [-1, 1]
    UNIT_INTERVAL,       // Result falls in [0, 1]
    NATIVE_FULL_SCALE    // Result falls in the full positive range of the output type (e.g. u8 → [0, 255])
};

TemplateMatchMethod

cpp

enum class TemplateMatchMethod {
    TM_SQDIFF, TM_SQDIFF_NORMED,
    TM_CCORR,  TM_CCORR_NORMED,
    TM_CCOEFF, TM_CCOEFF_NORMED
};

DftFlags

cpp

enum DftFlags { DFT_FORWARD = 0, DFT_INVERSE = 1, DFT_SCALE = 2 };

Data Structures (Geometry / Hough / Features / Color / Contour)

These structs are passed to operators as parameter bundles or returned as result containers. Grouped by purpose:

Geometry primitives — Point2f, Point2i, Size2f, RotatedRect (used by minAreaRect / fitEllipse / findContours)
Hough results — Vec2f, Vec3f, Vec4i (output formats for houghLines / houghCircles / houghLinesP)
Features & Matching — KeyPoint, KeyPointORB, KeyPointExt, DMatch, HOGParams (Harris / FAST / ORB / SIFT / SURF / HOG / bfMatch / bfKnnMatch)
Color conversion — YUVConvertParams (a single bundle that drives every rgb2YUV / yuv2RGB operator)
Contour & Moments — HierarchyEntry, Moments (output of findContours / moments)

Common top-level acl:: namespace (<acl/typeDef.h>):

cpp

namespace acl {
    // 2D floating-point point (e.g. RotatedRect center)
    struct Point2f {
        float x, y;
        Point2f();
        Point2f(float x, float y);
    };

    // 2D floating-point size (e.g. RotatedRect size)
    struct Size2f {
        float width, height;
        Size2f();
        Size2f(float w, float h);
    };

    // Rotated rectangle — used by minAreaRect / fitEllipse
    struct RotatedRect {
        Point2f center;
        Size2f  size;       // Note: the order of width/height is determined by minAreaRect; the reader should not assume a size ordering
        float   angle;      // Rotation angle (degrees)
        RotatedRect();
        RotatedRect(Point2f c, Size2f s, float a);
    };

    // Descriptor match result — used by bfMatch / bfKnnMatch
    struct DMatch {
        int   queryIdx;     // Query descriptor index (default -1)
        int   trainIdx;     // Train descriptor index (default -1)
        float distance;     // Descriptor distance
        DMatch();
        DMatch(int q, int t, float d);
        bool operator<(const DMatch&) const;   // Sorted by distance
    };

    // Small vector types — used by houghLines / houghCircles / houghLinesP
    struct Vec2f { float val[2]; };   // (rho, theta) — houghLines
    struct Vec3f { float val[3]; };   // (cx, cy, radius) — houghCircles
    struct Vec4i { int   val[4]; };   // (x1, y1, x2, y2) — houghLinesP line segment

    // YUV ↔ RGB conversion parameter bundle — used by every rgb2YUV / yuv2RGB operator
    struct YUVConvertParams {
        YUVEncodeStandard yuv_std        = YUVEncodeStandard::STD_BT601;
        bool              yuv444_fmt     = true;   // true = 4:4:4 packed, false = 4:4:4 planar (only for the 4:4:4 ops)
        bool              nv21_fmt       = true;   // true = NV21 (V before U), false = NV12 (U before V) (only for the NV-series ops)
        bool              rgb_fmt        = true;   // true = RGB, false = BGR
        int               bit_depth      = 0;      // 0 = auto-detect from the pixel type (u8 = 8, u16 = 16)
        int               shift_right    = 0;      // optional output bit-shift (used by 16-bit pipelines)
        int               shift_left     = 0;      // optional input bit-shift (used by 16-bit pipelines)
        bool              yuv_full_range = true;   // true = full range (0–255), false = limited range (16–235)
        bool              rgb_full_range = true;   // true = full range, false = limited range
    };
}

The defaults match the common BT.601 8-bit full-range RGB pipeline. Pass an empty {} to take all defaults, or override only the fields that differ from your pipeline. See the Color Conversion section for examples.

The remaining public structs and enums are declared in the same top-level acl:: namespace (there are no sub-namespaces):

cpp

namespace acl {
    // ── Geometry ────────────────────────────────────────────
    struct Point2i { int x, y; };

    // ── Contour / Moments ───────────────────────────────────
    struct HierarchyEntry { int next, prev, first_child, parent; };   // default -1
    struct Moments { double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03; };
    enum DistanceType { DIST_L1 = 1, DIST_L2 = 2, DIST_LINF = 3 };
    enum ContourRetrMode {
        CONTOUR_RETR_EXTERNAL = 0,
        CONTOUR_RETR_LIST     = 1,
        CONTOUR_RETR_CCOMP    = 2,
        CONTOUR_RETR_TREE     = 3
    };
    enum ContourApproxMethod {
        CONTOUR_CHAIN_APPROX_NONE   = 1,
        CONTOUR_CHAIN_APPROX_SIMPLE = 2
    };

    // ── Features ────────────────────────────────────────────
    struct KeyPoint { int x, y; float response; };
    struct KeyPointORB { float x, y, response, scale, angle; uint8_t descriptor[32]; };
    struct KeyPointExt { float x, y, response, scale, angle; float descriptor[128]; };
    struct HOGParams {
        int cellSize, blockSize, nbins, blockStride;
        // Defaults: cellSize=8, blockSize=2, nbins=9, blockStride=1
    };

    // ── Filter mode tags ────────────────────────────────────
    enum EdgePreservingType {
        EDGE_PRESERVING_RECURSIVE = 1,  // Fast
        EDGE_PRESERVING_NORMCONV  = 2   // High quality
    };
}

Analysis

Namespace: acl::analysis (CPP) / acl::neon::analysis (NEON)

integral

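Integral image (Summed Area Table). Because the output carries a sentinel row and column, entry (x, y) holds the sum of all source pixels strictly above and to the left of it: I(x, y) = ∑ src(i, j), 0 ≤ i < x, 0 ≤ j < y, for 0 ≤ x ≤ width and 0 ≤ y ≤ height. Used for O(1) rectangular region sums (boxFilter, Haar features, etc.).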

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`SrcType`	`uint8_t`, `uint16_t`, `float`	—
`IntegralType`	`int32_t`, `int64_t`, `double`	`sizeof(IntegralType) ≥ sizeof(SrcType)`

CPP Version

cpp

template<class SrcType, class IntegralType>
int integral(
    const SrcType* srcImage, IntegralType* integral,
    int width, int height,
    int srcStride = 0);

Parameter	Type	Meaning	Default
`srcImage`	`const SrcType*`	Input image (single channel)	non-null
`integral`	`IntegralType*`	Output integral image, size `(width+1) × (height+1)`	non-null
`width`, `height`	`int`	Input image size	> 0
`srcStride`	`int`	Bytes per row	`0` = auto

Row 0 and column 0 of integral are always 0 (implementation sentinel row and column to simplify boundary queries).

NEON Version

cpp

template<class SrcType, class IntegralType>
int integral(
    const SrcType* srcImage, IntegralType* integralImage,
    int width, int height,
    int srcStride = 0);

Example

cpp

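#include <cstdint>
#include <vector>

// Heap-allocate the buffers: about 2 MB of pixels plus about 8.3 MB of sums
// would overflow a typical thread stack if declared as local arrays.
std::vector<uint8_t> srcImage(1920 * 1080);
std::vector<int32_t> integ((1920 + 1) * (1080 + 1));

int rc = acl::neon::analysis::integral<uint8_t, int32_t>(
    srcImage.data(), integ.data(), 1920, 1080);
// rc == 0 on success; negative values are error codes

// O(1) sum over the inclusive rectangle (x0,y0)-(x1,y1)
auto rectSum = [&](int x0, int y0, int x1, int y1) {
    const int W = 1920 + 1;  // row length of the integral image, in elements
    return integ[(y1+1)*W + (x1+1)]
         - integ[(y1+1)*W + x0]
         - integ[y0*W + (x1+1)]
         + integ[y0*W + x0];
};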
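
For reference, the sentinel layout that the rectangle query above relies on can be reproduced in plain scalar code. `integralSketch` is a hypothetical helper for tightly packed `uint8_t` input. It is not the library implementation, only a way to check the indexing on a small image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference for the (width+1) x (height+1) layout:
// out[y*W + x] = sum of src over rows < y and columns < x, so row 0 and column 0 stay 0.
std::vector<int64_t> integralSketch(const std::vector<uint8_t>& src, int width, int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const int W = width + 1;
    std::vector<int64_t> out(static_cast<std::size_t>(W) * (height + 1), 0);
    for (int y = 0; y < height; ++y) {
        int64_t rowSum = 0;  // running sum of the current source row
        for (int x = 0; x < width; ++x) {
            rowSum += src[y * width + x];
            out[(y + 1) * W + (x + 1)] = out[y * W + (x + 1)] + rowSum;
        }
    }
    return out;
}
```

On the 3×2 image {1, 2, 3 / 4, 5, 6}, the bottom-right entry is 21, and the inclusive rectangle (1,0)-(2,1) evaluates to 21 - 5 - 0 + 0 = 16, which matches 2 + 3 + 5 + 6.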

histogram

Compute the pixel-value histogram.

Tier: Starter+
Channels: 1ch (packed via runtime hcn × vcn parameters when needed)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`{uint8_t, uint16_t}`	—
`HT`	`{int, long}` (histogram bin count type)	—

CPP Version

cpp

template<class ST, class HT>
int histogram(
    const ST* srcImage, HT* hist,
    int width, int height,
    int srcStride, int histLen,
    int hcn = 1, int vcn = 1);

Parameter	Type	Meaning	Default
`srcImage`	`const ST*`	Input image	non-null
`hist`	`HT*`	Output histogram (length `histLen`, zeroed by the caller)	non-null
`srcStride`	`int`	Bytes per row	`0` = auto
`histLen`	`int`	Number of histogram bins (u8 → `256`, u16 → `65536`)	—
`hcn`, `vcn`	`int`	Horizontal / vertical channel packing	`1`, `1`

NEON Version (`uint8_t` only, fixed `histLen = 256`)

cpp

int histogram(
    const uint8_t* srcImage, int* hist,
    int width, int height,
    int srcStride = 0);

hist has a fixed length of 256 int bins; for non-uint8_t or non-256 bins, use the CPP version.

histMatch

Histogram matching (also known as histogram specification): remaps the pixel values of src so that its histogram approximates the histogram of ref.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`uint8_t, uint16_t`	all params must be the same type
`DT`	`uint8_t, uint16_t`	all params must be the same type
`RT`	`uint8_t, uint16_t`	all params must be the same type

CPP Signature

cpp

template<class ST, class DT, class RT>
int histMatch(
    const ST* srcImage, DT* dstImage, const RT* refImage,
    int width, int height,
    int srcStride, int dstStride, int refStride,
    int srcHistLen, int refHistLen,
    double MATCH_TH = 0.0,
    int hcn = 1, int vcn = 1);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	Input / output image	non-null
`refImage`	`const RT*`	Reference image (its histogram is used as the target distribution)	non-null
`srcHistLen`, `refHistLen`	`int`	Source / reference bin counts	u8: `256`
`MATCH_TH`	`double`	Match tolerance threshold `[0, 1]`	`0.0`
`hcn`, `vcn`	`int`	Horizontal / vertical channel packing	`1`, `1`

equalizeHist

Histogram equalization.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t, uint16_t`	—
`T` (NEON)	`uint8_t`	The NEON entry point accepts `uint8_t` only

CPP Version

cpp

template<class T>
int equalizeHist(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0);

NEON Version (`uint8_t` only)

cpp

int equalizeHist(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0);
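
As background, histogram equalization for 8-bit data is commonly expressed as a 256-entry lookup table built from the cumulative histogram. The sketch below shows that common formulation. `equalizeLutSketch` is a hypothetical helper, and the exact rounding and edge-case handling inside ACL Pack are not documented here, so treat it as an approximation of the idea rather than a bit-exact reference.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Common CDF formulation: lut[v] = round((cdf[v] - cdfMin) / (total - cdfMin) * 255),
// where cdfMin is the count in the lowest occupied bin.
std::array<uint8_t, 256> equalizeLutSketch(const std::array<int, 256>& hist) {
    std::array<uint8_t, 256> lut{};
    long long total = 0;
    for (int h : hist) total += h;
    int first = 0;  // lowest occupied bin
    while (first < 256 && hist[first] == 0) ++first;
    if (first == 256 || hist[first] == total) {
        // Empty or single-valued image: nothing to spread, return the identity table.
        for (int v = 0; v < 256; ++v) lut[v] = static_cast<uint8_t>(v);
        return lut;
    }
    const long long cdfMin = hist[first];
    const double scale = 255.0 / static_cast<double>(total - cdfMin);
    long long cdf = 0;
    for (int v = first; v < 256; ++v) {
        cdf += hist[v];
        lut[v] = static_cast<uint8_t>(std::lround(static_cast<double>(cdf - cdfMin) * scale));
    }
    return lut;
}
```

Applying the table is then a per-pixel lookup, `dst[i] = lut[src[i]]`.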

clahe

Contrast Limited Adaptive Histogram Equalization. The image is divided into tilesX × tilesY tiles; each tile's histogram is clipped according to clipLimit and then equalized, and the per-tile mappings are bilinearly interpolated across tile boundaries. This avoids the over-amplified contrast that global equalization can produce.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int clahe(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    double clipLimit = 40.0,
    int tilesX = 8, int tilesY = 8);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const uint8_t*` / `uint8_t*`	Input / output image	non-null
`width`, `height`	`int`	Image size	must satisfy `width ≥ tilesX`, `height ≥ tilesY`
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`clipLimit`	`double`	Contrast upper bound (higher = stronger contrast)	`40.0` (OpenCV default)
`tilesX`, `tilesY`	`int`	Horizontal / vertical tile count	`8` × `8`

minMaxLoc

Find the minimum / maximum value in the image and their locations.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP / NEON Signature (identical)

cpp

template<class T>
int minMaxLoc(
    const T* srcImage, int width, int height, int srcStride,
    T* minVal, T* maxVal,
    int* minLocX, int* minLocY,
    int* maxLocX, int* maxLocY);

Parameter	Type	Meaning	Default
`srcImage`	`const T*`	Input image	non-null
`srcStride`	`int`	Bytes per row	`0` = auto
`minVal`, `maxVal`	`T*`	Output min / max values (may be `nullptr`)	—
`minLocX`, `minLocY`, `maxLocX`, `maxLocY`	`int*`	Output corresponding coordinates (may be `nullptr`)	—

When any output pointer is nullptr, the corresponding result is skipped.

moments

Spatial moments of orders 0 to 3, computed on a single-channel image. The output is a Moments struct holding the 10 raw moments as double: m00, m10, m01, m20, m11, m02, m30, m21, m12, m03.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t, uint16_t, float`	—

CPP Signature

cpp

struct Moments {
    double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03;
};

template<class T>
int moments(
    const T* srcImage, int width, int height, int srcStride,
    Moments& m,
    bool binaryImage = false);

Parameter	Type	Meaning	Default
`srcImage`	`const T*`	Input image	non-null
`m`	`Moments&`	Output moments struct	filled by the function
`binaryImage`	`bool`	`true` = treat all non-zero pixels as 1 (binary evaluation)	`false`
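
A raw moment is m_pq = ∑ x^p · y^q · I(x, y) over all pixels, and the centroid follows as (m10 / m00, m01 / m00). The sketch below accumulates only the three low-order moments needed for that, on tightly packed `uint8_t` data. `MomentsSketch` and `lowOrderMoments` are hypothetical names, and the code is a scalar illustration rather than the library routine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduced struct: only the moments needed for area and centroid.
struct MomentsSketch { double m00 = 0, m10 = 0, m01 = 0; };

// Accumulate m00, m10, m01. With binaryImage = true every non-zero pixel weighs 1.
MomentsSketch lowOrderMoments(const uint8_t* img, int width, int height, bool binaryImage) {
    MomentsSketch m;
    if (img == nullptr || width <= 0 || height <= 0) return m;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double v = img[y * width + x];
            if (binaryImage) v = (v != 0.0) ? 1.0 : 0.0;
            m.m00 += v;
            m.m10 += x * v;
            m.m01 += y * v;
        }
    }
    return m;
}
```

The centroid is only defined when m00 is non-zero, so callers should check that before dividing.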

copyMakeBorder

Add a border around the image, supporting multiple border modes. Typical use: padding before convolution.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

BorderType: `BORDER_CONSTANT` / `BORDER_REPLICATE` / `BORDER_REFLECT` / `BORDER_WRAP` / `BORDER_REFLECT_101` (`BORDER_DEFAULT` is an alias for `BORDER_REFLECT_101`).

CPP Version

cpp

template<class T>
int copyMakeBorder(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int channelNum,
    int srcStride, int dstStride,
    int top, int bottom, int left, int right,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`	`const T*`	Input image	non-null
`dstImage`	`T*`	Output image (size `(srcWidth + left + right) × (srcHeight + top + bottom)`)	non-null
`channelNum`	`int`	Channel count	—
`top`, `bottom`, `left`, `right`	`int`	Padding width in each of the four directions	≥ 0
`constant`	`const T*`	`BORDER_CONSTANT` fill-value array (length `channelNum`)	`nullptr`
`bt`	`acl::BorderType`	Border-handling mode	`BORDER_REFLECT_101`

NEON Version

cpp

template<class T>
int copyMakeBorder(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int channelNum,
    int srcStride = 0, int dstStride = 0,
    int top = 0, int bottom = 0, int left = 0, int right = 0,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

count

Count the number of pixels satisfying a condition.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Signature (3 entry points)

cpp

// == threshold
template<class T>
int countEQ(const T* srcImage, int width, int height, int stride, const T& threshold);

// <= threshold
template<class T>
int countLET(const T* srcImage, int width, int height, int stride, const T& threshold);

// < threshold
template<class T>
int countLT(const T* srcImage, int width, int height, int stride, const T& threshold);

The return value is the count (not an error code); srcImage == nullptr returns 0.

mean

Per-pixel mean of two or N images: dst[i] = (A[i] + B[i] + …) / N.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`AT`	`uint8_t, uint16_t, float`	—
`BT`	`uint8_t, uint16_t, float`	—
`DT`	`uint8_t, uint16_t, float`	—

CPP Signature (two overloads: 2-image + N-image)

cpp

// 2-image
template<class AT, class BT, class DT>
int mean(
    const AT* src1Image, const BT* src2Image, DT* dstImage,
    int width, int height, int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

// N-image
template<class ST, class DT>
int mean(
    const ST* const* srcImages, int srcNum, DT* dstImage,
    int width, int height, int cn = 1,
    int srcStride = 0, int dstStride = 0);

matchTemplate

Template matching (6 similarity metrics). FFT-accelerated; suitable for large image × small template.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`{uint8_t, float}`	—
`T` (NEON)	`uint8_t`	The NEON entry point accepts `uint8_t` only

TemplateMatchMethod:

TM_SQDIFF — sum of squared differences
TM_SQDIFF_NORMED — normalized squared differences
TM_CCORR — cross-correlation
TM_CCORR_NORMED — normalized cross-correlation
TM_CCOEFF — correlation coefficient
TM_CCOEFF_NORMED — normalized correlation coefficient

CPP Version

cpp

template<class T>
int matchTemplate(
    const T* srcImage, int srcW, int srcH, int srcStride,
    const T* templ, int templW, int templH, int templStride,
    float* result, int resultStride,
    acl::TemplateMatchMethod tm = acl::TemplateMatchMethod::TM_SQDIFF);

NEON Version (`uint8_t` only)

cpp

template<class T>
int matchTemplate(
    const T* srcImage, int srcW, int srcH, int srcStride,
    const T* templ, int templW, int templH, int templStride,
    float* result, int resultStride,
    acl::TemplateMatchMethod tm = acl::TemplateMatchMethod::TM_SQDIFF);

Parameter	Type	Meaning	Default
`srcImage`, `templ`	`const T*`	Search image, template image	non-null, `templW ≤ srcW`, `templH ≤ srcH`
`result`	`float*`	Output score map, size `(srcW - templW + 1) × (srcH - templH + 1)`	non-null
`*Stride`	`int`	Bytes per row (result counted as `float`)	`0` = auto
`tm`	`acl::TemplateMatchMethod`	Similarity metric	`TM_SQDIFF`
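
To make the result-map size and the meaning of a score concrete, here is a naive scalar version of `TM_SQDIFF` on tightly packed `uint8_t` data. `sqdiffSketch` is a hypothetical helper; it uses direct summation rather than the FFT acceleration mentioned above, so it is only suitable for checking small cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive TM_SQDIFF: result(x, y) = sum over (i, j) of (src(x+i, y+j) - templ(i, j))^2.
// The result map has (srcW - templW + 1) x (srcH - templH + 1) entries, row-major.
std::vector<float> sqdiffSketch(const std::vector<uint8_t>& src, int srcW, int srcH,
                                const std::vector<uint8_t>& templ, int templW, int templH) {
    assert(templW > 0 && templH > 0 && templW <= srcW && templH <= srcH);
    const int outW = srcW - templW + 1;
    const int outH = srcH - templH + 1;
    std::vector<float> result(static_cast<std::size_t>(outW) * outH, 0.0f);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            float sum = 0.0f;
            for (int j = 0; j < templH; ++j) {
                for (int i = 0; i < templW; ++i) {
                    const float d = static_cast<float>(src[(y + j) * srcW + (x + i)])
                                  - static_cast<float>(templ[j * templW + i]);
                    sum += d * d;
                }
            }
            result[y * outW + x] = sum;
        }
    }
    return result;
}
```

For `TM_SQDIFF` the best match is the minimum of the score map; a score of 0 means the template matches that position exactly.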

connectedComponent_8n_dfs

8-connected component labeling (DFS). Takes a binary image as input; outputs the pixel coordinate list for each connected component.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t`	—
`LabelType`	`int`	—

CPP Signature

cpp

template<class T>
int connectedComponent_8n_dfs(
    T* binaryImage, int width, int height, int stride,
    std::vector<std::vector<std::pair<int, int>>>& regions,
    int minArea, int maxArea,
    int frontFlag = 255, int backFlag = 0);

Parameter	Type	Meaning
`binaryImage`	`T*`	Input binary image (the algorithm may overwrite pixels as markers)
`regions`	`vector<vector<pair<int, int>>>&`	Output list of `(x, y)` pixel coordinates for each connected component
`minArea`, `maxArea`	`int`	Filter: only retain components with `area ∈ [minArea, maxArea]`
`frontFlag`, `backFlag`	`int`	Foreground / background pixel value used as DFS markers (defaults `255` / `0`)
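
The general technique, 8-connected labeling by depth-first search with an area filter, can be sketched as follows. `components8Sketch` is a hypothetical helper, not the library function: it treats any non-zero pixel as foreground, uses an explicit stack instead of recursion, and works on a private copy of the image, whereas the documented operator may overwrite the caller's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Region = std::vector<std::pair<int, int>>;  // (x, y) pixels of one component

// img is taken by value so visited pixels can be cleared without touching the caller's data.
std::vector<Region> components8Sketch(std::vector<uint8_t> img, int width, int height,
                                      int minArea, int maxArea) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);
    std::vector<Region> regions;
    std::vector<std::pair<int, int>> stack;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (img[y * width + x] == 0) continue;  // background or already visited
            Region region;
            img[y * width + x] = 0;                 // mark as visited
            stack.push_back({x, y});
            while (!stack.empty()) {
                const auto [cx, cy] = stack.back();
                stack.pop_back();
                region.push_back({cx, cy});
                for (int dy = -1; dy <= 1; ++dy) {  // scan the 8 neighbours
                    for (int dx = -1; dx <= 1; ++dx) {
                        const int nx = cx + dx, ny = cy + dy;
                        if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                        if (img[ny * width + nx] == 0) continue;
                        img[ny * width + nx] = 0;
                        stack.push_back({nx, ny});
                    }
                }
            }
            const int area = static_cast<int>(region.size());
            if (area >= minArea && area <= maxArea) regions.push_back(std::move(region));
        }
    }
    return regions;
}
```

Marking a pixel as visited when it is pushed, rather than when it is popped, keeps each pixel on the stack at most once.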

connectedComponentLabeling

Connected component labeling with a label image (union-find). Outputs a label image (per-pixel label) + a label list sorted by area.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`DataInType`	`uint8_t, uint16_t`	—
`LabelType`	`int`	—

CPP Signature

cpp

template<class DataInType, class LabelType>
int connectedComponentLabeling(
    DataInType* dataIn, LabelType* label,
    std::vector<std::pair<LabelType, int>>& sortLabelHist,
    DataInType threshold,
    int topAreaCnt, int minArea,
    int width, int height,
    int inStride = 0, int labelStride = 0);

Parameter	Type	Meaning
`dataIn`	`DataInType*`	Input image (binarized via `threshold`)
`label`	`LabelType*`	Output label image
`sortLabelHist`	`vector<pair<LabelType, int>>&`	Output `(label, area)` list sorted by descending area
`threshold`	`DataInType`	Input binarization threshold
`topAreaCnt`	`int`	Only retain the `topAreaCnt` components with the largest area

findContours

Find contours in a binary image (Suzuki-Abe algorithm, OpenCV compatible).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP Signature

cpp

enum ContourRetrMode {
    CONTOUR_RETR_EXTERNAL = 0,  // only the outermost layer
    CONTOUR_RETR_LIST     = 1,  // all contours, no hierarchy
    CONTOUR_RETR_CCOMP    = 2,  // two layers (outer + inner holes)
    CONTOUR_RETR_TREE     = 3   // full hierarchy tree
};

enum ContourApproxMethod {
    CONTOUR_CHAIN_APPROX_NONE   = 1,  // retain all contour points
    CONTOUR_CHAIN_APPROX_SIMPLE = 2   // compress intermediate points on horizontal / vertical / diagonal segments
};

struct Point2i { int x, y; };
struct HierarchyEntry { int next, prev, first_child, parent; };

int findContours(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<std::vector<Point2i>>& contours,
    std::vector<HierarchyEntry>* hierarchy = nullptr,
    ContourRetrMode mode = CONTOUR_RETR_LIST,
    ContourApproxMethod method = CONTOUR_CHAIN_APPROX_SIMPLE,
    int offsetX = 0, int offsetY = 0);

Parameter	Type	Meaning
`contours`	`vector<vector<Point2i>>&`	Output contours; each contour is an array of `Point2i`
`hierarchy`	`vector<HierarchyEntry>*`	(optional) hierarchy info `[next, prev, first_child, parent]`
`offsetX`, `offsetY`	`int`	Offset added to all contour point coordinates
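
The hierarchy output is a set of index links, which is easiest to understand by walking it. The sketch below counts the direct children of one contour. It assumes, as in OpenCV's hierarchy convention, that `next` and `prev` link sibling contours at the same nesting level and that -1 means "none"; this section does not spell that out, so verify it against your data. `HierarchyEntrySketch` and `countChildren` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local mirror of acl::HierarchyEntry so the sketch compiles on its own.
struct HierarchyEntrySketch { int next = -1, prev = -1, first_child = -1, parent = -1; };

// Count direct children of contour idx: follow first_child, then the sibling chain via next.
int countChildren(const std::vector<HierarchyEntrySketch>& h, int idx) {
    assert(idx >= 0 && static_cast<std::size_t>(idx) < h.size());
    int n = 0;
    for (int c = h[idx].first_child; c != -1; c = h[c].next) ++n;
    return n;
}
```

Under the same assumption, top-level contours are the entries whose `parent` is -1.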

distanceTransform

Distance transform — each pixel outputs the distance to its nearest 0 pixel (L1 / L2 / L∞).

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`uint8_t`	—
Output	`float`	—

DistanceType: DIST_L1 (Manhattan) / DIST_L2 (Euclidean, exact algorithm) / DIST_LINF (chessboard)

CPP Signature (2 entry points: float or u8 output)

cpp

// float output (high precision)
int distanceTransform(
    const uint8_t* srcImage, float* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    DistanceType distType = DIST_L2);

// u8 output (normalized to 0-255, suitable for visualization)
int distanceTransformU8(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    DistanceType distType = DIST_L2);

DIST_L2 uses the Felzenszwalb-Huttenlocher exact Euclidean algorithm (not an approximation).
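
For orientation, the core of the Felzenszwalb-Huttenlocher method is a 1D pass that computes the lower envelope of parabolas. The sketch below is a textbook version of that pass, not ACL Pack's implementation; `edt1dSquared` is a hypothetical name, and the input must use a large finite value (for example `1e20f`) rather than infinity for non-zero pixels.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// 1D squared Euclidean distance transform.
// f[i] is 0 at "zero" pixels and a large finite value elsewhere.
std::vector<float> edt1dSquared(const std::vector<float>& f) {
    const int n = static_cast<int>(f.size());
    std::vector<float> d(static_cast<std::size_t>(n));
    if (n == 0) return d;
    std::vector<int> v(static_cast<std::size_t>(n));        // parabola positions in the envelope
    std::vector<float> z(static_cast<std::size_t>(n) + 1);  // boundaries between parabolas
    const float inf = std::numeric_limits<float>::infinity();
    int k = 0;
    v[0] = 0; z[0] = -inf; z[1] = inf;
    for (int q = 1; q < n; ++q) {
        float s;
        while (true) {
            const int p = v[k];
            // Abscissa where the parabolas rooted at p and q intersect.
            s = ((f[q] + float(q) * q) - (f[p] + float(p) * p)) / (2.0f * (q - p));
            if (s > z[k]) break;
            --k;  // parabola p is hidden by q; drop it
        }
        ++k;
        v[k] = q; z[k] = s; z[k + 1] = inf;
    }
    k = 0;
    for (int q = 0; q < n; ++q) {
        while (z[k + 1] < q) ++k;
        const float dx = float(q - v[k]);
        d[q] = dx * dx + f[v[k]];
    }
    return d;
}
```

A 2D transform runs this pass over every column, then over every row of the intermediate result, and takes the square root at the end.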

blockAverage

Take the mean over each U × V block as output (image downsampling). U = V = 2 corresponds to 2×2 averaging.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

Commercial package type availability:

Tier	Callable `T` in delivered `<acl/api.h>`
Starter	`uint8_t`
Pro	`uint8_t`, `uint16_t`
Business	`uint8_t`, `uint16_t`, `float`

CPP / NEON Signature (identical)

cpp

template<class T>
int blockAverage(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int U, int V,
    int srcStride = 0, int dstStride = 0,
    bool round = true,
    int hcn = 1, int vcn = 1);

Parameter	Type	Meaning	Default
`U`, `V`	`int`	Block horizontal / vertical size	—
`round`	`bool`	`true` = round to nearest, `false` = truncate	`true`
`hcn`, `vcn`	`int`	Horizontal / vertical channel packing	`1`, `1`
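
The effect of `round` is easiest to see in a scalar reference. The following sketch is not the library code: `blockAverageRef` is a hypothetical helper for a tightly packed 1-channel `uint8_t` image, and it assumes the output is (width / U) × (height / V) with incomplete trailing blocks ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference U x V block mean for a tightly packed 1-channel u8 image.
std::vector<uint8_t> blockAverageRef(const std::vector<uint8_t>& src,
                                     int width, int height,
                                     int U, int V, bool round = true) {
    const int outW = width / U, outH = height / V;
    const int area = U * V;
    std::vector<uint8_t> dst(static_cast<std::size_t>(outW) * outH);
    for (int by = 0; by < outH; ++by) {
        for (int bx = 0; bx < outW; ++bx) {
            int sum = 0;
            for (int y = 0; y < V; ++y)
                for (int x = 0; x < U; ++x)
                    sum += src[static_cast<std::size_t>(by * V + y) * width + (bx * U + x)];
            // round = true adds half the block area before the integer divide.
            dst[static_cast<std::size_t>(by) * outW + bx] =
                static_cast<uint8_t>((sum + (round ? area / 2 : 0)) / area);
        }
    }
    return dst;
}
```

For a 2×2 block holding 1, 2, 2, 2 (sum 7), rounding yields 2 and truncation yields 1.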

extractBlockPixels

Extract the pixel at (u, v) from every U × V block (i.e. downsample while specifying the sampling point).

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`uint8_t, uint16_t, float`	—
`DT`	`uint8_t, uint16_t, float`	—

CPP Signature

cpp

template<class ST, class DT>
int extractBlockPixels(
    const ST* srcImage, DT* dstImage,
    int srcWidth, int srcHeight,
    int srcStride, int dstStride,
    int U, int V, int u, int v);

Parameter	Type	Meaning
`U`, `V`	`int`	Block size
`u`, `v`	`int`	Pixel position sampled from each block (`0 ≤ u < U`, `0 ≤ v < V`); auto-clamped
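
A scalar sketch of the sampling rule, under stated assumptions: `extractBlockPixelsRef` is a hypothetical name, `u` is taken as the horizontal offset and `v` as the vertical offset inside each block, the image is tightly packed, and out-of-range offsets are clamped as the table describes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pick the pixel at offset (u, v) inside every U x V block.
std::vector<uint8_t> extractBlockPixelsRef(const std::vector<uint8_t>& src,
                                           int width, int height,
                                           int U, int V, int u, int v) {
    u = std::min(std::max(u, 0), U - 1);  // clamp to 0 <= u < U
    v = std::min(std::max(v, 0), V - 1);  // clamp to 0 <= v < V
    const int outW = width / U, outH = height / V;
    std::vector<uint8_t> dst(static_cast<std::size_t>(outW) * outH);
    for (int by = 0; by < outH; ++by)
        for (int bx = 0; bx < outW; ++bx)
            dst[static_cast<std::size_t>(by) * outW + bx] =
                src[static_cast<std::size_t>(by * V + v) * width + (bx * U + u)];
    return dst;
}
```

With U = V = 2 on a Bayer frame, each (u, v) pair would select one of the four color sites.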

Arithmetic

Namespace: acl::arithmetic (CPP) / acl::neon::arithmetic (NEON)

addImg

Per-pixel sum of two images dst = src1 + src2; input operand types and output type are decoupled (AT / BT → DT).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (dst == src1 or dst == src2)
Types:

Template parameter	Allowed types	Constraint
`AT, BT, DT` (CPP)	`{uint8_t, uint16_t, float}`	—
`T` (NEON)	`{uint8_t, uint16_t}`	NEON-only

Variant entry points (CPP):

Entry point	Purpose
`addImg`	Sum of two images, no saturation
`addImgClamp`	Sum of two images, result clamped to `[minValue, maxValue]`
`add_Imgs`	N-image accumulation (`srcImages[0] + srcImages[1] + …`)
`add_ImgsClamp`	N-image accumulation + clamp
`add_ImgsManual`	N-image accumulation with manually specified intermediate accumulation type `ACC_T`
`add_ImgsManualClamp`	N-image accumulation + manual ACC_T + clamp

CPP Version

cpp

template<class AT, class BT, class DT>
int addImg(
    const AT* src1Image, const BT* src2Image, DT* dstImage,
    int width, int height, int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

template<class AT, class BT, class DT>
int addImgClamp(
    const AT* src1Image, const BT* src2Image, DT* dstImage,
    int width, int height, int cn,
    int src1Stride, int src2Stride, int dstStride,
    DT minValue, DT maxValue);

template<class ST, class DT>
int add_Imgs(
    const ST* const* srcImages, int srcNum, DT* dstImage,
    int width, int height, int cn = 1,
    int srcStride = 0, int dstStride = 0);

Parameter	Type	Meaning	Default
`src1Image`, `src2Image`	`const AT*`, `const BT*`	Input images	non-null
`dstImage`	`DT*`	Output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1
`*Stride`	`int`	Bytes per row	`0` = auto
`minValue`, `maxValue`	`DT`	(clamp variants only) output lower / upper bounds	—

NEON Version

cpp

template<class T>
int addImg(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

NEON provides only addImg itself (a single template across uint8_t / float, etc., with the same type for both inputs and the output); the N-image accumulation, clamp, and manual ACC_T variants are available only in the CPP version.

Example

cpp

uint8_t a[1920*1080], b[1920*1080], dstImage[1920*1080];

// Sum of two images (u8)
acl::neon::arithmetic::addImg(a, b, dstImage, 1920, 1080);

// Saturating clamp to [0, 200] (CPP)
acl::arithmetic::addImgClamp<uint8_t, uint8_t, uint8_t>(
    a, b, dstImage, 1920, 1080, 1, 0, 0, 0,
    /*minValue=*/0, /*maxValue=*/200);

// Accumulate 10 images into u16 to avoid u8 overflow
const uint8_t* imgs[10] = { /* ... */ };
uint16_t acc[1920*1080];
acl::arithmetic::add_Imgs<uint8_t, uint16_t>(
    imgs, 10, acc, 1920, 1080);
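
Why the accumulator type matters can be shown with a scalar model. The sketch below is not the library code: `accumulateRef` is a hypothetical helper, and it uses plain wraparound on overflow, which may differ from what `add_Imgs` does with a destination that is too narrow.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum N equally sized images element by element into a DT accumulator.
template <class ST, class DT>
std::vector<DT> accumulateRef(const std::vector<std::vector<ST>>& images) {
    std::vector<DT> acc(images.empty() ? 0 : images[0].size(), DT(0));
    for (const auto& img : images)
        for (std::size_t i = 0; i < acc.size(); ++i)
            acc[i] = static_cast<DT>(acc[i] + img[i]);  // wraps if DT is too narrow
    return acc;
}
```

Ten pixels of value 200 sum to 2000, which fits in `uint16_t` but wraps to 208 in `uint8_t`. A `uint16_t` accumulator holds up to 257 full-scale `uint8_t` images, since 257 × 255 = 65535.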

absDiff

Per-pixel absolute difference of two images dst = |src1 - src2|.

Tier: Starter+
Channels: any
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	same type for both inputs and the output

cpp

template<class T>
int absDiff(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

Available as both acl::arithmetic::absDiff (CPP) and acl::neon::arithmetic::absDiff (NEON) — identical signature, different namespace. NEON path supports uint8_t / float / double; uint16_t falls through to scalar.

Example

cpp

uint8_t a[1920*1080], b[1920*1080], diff[1920*1080];

// Frame differencing (motion detection)
acl::neon::arithmetic::absDiff<uint8_t>(a, b, diff, 1920, 1080, 1);

addWeighted

Weighted sum dst = alpha*src1 + beta*src2 + gamma. Typical uses: image transitions, exposure fusion.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`ST1`, `ST2`, `DT` (CPP)	`uint8_t, uint16_t, float`	—
`T` (NEON)	`uint8_t, uint16_t, float`	same type for inputs and output

CPP Version

cpp

template<class ST1, class ST2, class DT>
int addWeighted(
    const ST1* src1, const ST2* src2, DT* dstImage,
    int width, int height, int cn,
    int src1Stride, int src2Stride, int dstStride,
    double alpha, double beta, double gamma);

NEON Version (`T` ∈ `{uint8_t, uint16_t, float}`, same type for inputs and output)

cpp

template<class T>
int addWeighted(
    const T* src1, const T* src2, T* dstImage,
    int width, int height,
    int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0,
    double alpha = 1.0, double beta = 1.0, double gamma = 0.0);

Example

cpp

uint8_t a[1920*1080*3], b[1920*1080*3], dstImage[1920*1080*3];

// 50% blend: dstImage = 0.5 * a + 0.5 * b
acl::neon::arithmetic::addWeighted(
    a, b, dstImage, 1920, 1080, 3, 0, 0, 0, 0.5, 0.5, 0.0);
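
As a mental model for the formula, the scalar sketch below evaluates the weighted sum per element for `uint8_t` data. `addWeightedRef` is a hypothetical helper, and the round-to-nearest plus saturation step is an assumption about the output conversion, not a documented guarantee.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// dst[i] = saturate_u8(round(alpha * a[i] + beta * b[i] + gamma))
std::vector<uint8_t> addWeightedRef(const std::vector<uint8_t>& a,
                                    const std::vector<uint8_t>& b,
                                    double alpha, double beta, double gamma) {
    std::vector<uint8_t> dst(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const long v = std::lround(alpha * a[i] + beta * b[i] + gamma);
        dst[i] = static_cast<uint8_t>(std::min(std::max(v, 0L), 255L));
    }
    return dst;
}
```

Setting beta = 1 - alpha and gamma = 0 gives the same arithmetic as the alpha blend C = alpha * A + (1 - alpha) * B.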

alphaImgFusion

Alpha blending C = alpha * A + (1 - alpha) * B.

Tier: Starter+
Channels: 1ch (the width*cn parameter is in bytes)
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	same type for both inputs and the output

CPP / NEON Signature (identical)

cpp

template<class T>
int alphaImgFusion(
    const T* A, const T* B, T* C,
    int width, int height,
    int AStride, int BStride, int CStride,
    float alpha);

Parameter	Type	Meaning	Default
`A`, `B`	`const T*`	The two input images	non-null
`C`	`T*`	Output blended image	non-null
`AStride`, `BStride`, `CStride`	`int`	Bytes per row	`0` = auto
`alpha`	`float`	Weight of A, `[0, 1]`	—

mul

Per-pixel multiplication C = A * B. When saturateCast = true the result saturates to the CT type (e.g. uint8_t is clamped to [0, 255]).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`AT, BT, CT` (CPP)	`{uint8_t, uint16_t, float}`	—
`T` (NEON)	`{uint8_t, uint16_t}`	NEON-only

CPP Version

cpp

template<class AT, class BT, class CT>
int mul(
    const AT* A, const BT* B, CT* C,
    int width, int height, int cn,
    int AStride, int BStride, int CStride,
    bool saturateCast = false);

NEON Version (`uint8_t` / `uint16_t` / `float`, same type for input and output)

cpp

template<class T>
int mul(
    const T* A, const T* B, T* C,
    int width, int height, int cn,
    int AStride, int BStride, int CStride,
    bool saturateCast = false);

NEON requires AT == BT == CT (i.e. T); for heterogeneous types, use the CPP version.

threshold

Fixed-threshold binarization / truncation.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—
ThreshMode: `THRESH_BINARY` / `THRESH_BINARY_INV` / `THRESH_TRUNC` / `THRESH_TOZERO` / `THRESH_TOZERO_INV` / `THRESH_OTSU` (u8 only)

CPP Version

cpp

template<class ST, class DT>
int threshold(
    const ST* srcImage, DT* dstImage,
    int width, int height,
    ST threshold,
    DT maxVal = 255, DT minVal = 0,
    int srcStride = 0, int dstStride = 0,
    acl::ThreshMode tm = acl::ThreshMode::THRESH_BINARY);

THRESH_OTSU mode automatically computes the optimal threshold (the passed-in threshold parameter is ignored); only uint8_t / uint16_t are supported.

NEON Version (`uint8_t` only for input and output)

cpp

template<class T>
int threshold(
    const T* srcImage, T* dstImage,
    int width, int height,
    T threshold,
    T maxVal = 255, T minVal = 0,
    int srcStride = 0, int dstStride = 0,
    acl::ThreshMode tm = acl::ThreshMode::THRESH_BINARY);

Example

cpp

uint8_t gray[1920*1080], bin[1920*1080];

// Fixed-threshold binarization
acl::neon::arithmetic::threshold<uint8_t>(
    gray, bin, 1920, 1080, /*threshold=*/128,
    /*maxVal=*/255, /*minVal=*/0, 0, 0,
    acl::ThreshMode::THRESH_BINARY);

// Otsu automatic threshold (CPP only)
acl::arithmetic::threshold<uint8_t, uint8_t>(
    gray, bin, 1920, 1080, /*threshold=*/0,
    /*maxVal=*/255, /*minVal=*/0, 0, 0,
    acl::ThreshMode::THRESH_OTSU);   // threshold parameter is ignored
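
For readers unfamiliar with Otsu's method, the sketch below shows the standard histogram search for the threshold that maximizes between-class variance. It is a textbook version with a hypothetical name (`otsuThresholdRef`); how the library breaks ties or which side of the threshold a pixel equal to it falls on is not specified here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns t maximizing between-class variance, with class 0 = {v <= t}.
int otsuThresholdRef(const std::vector<uint8_t>& pixels) {
    std::array<std::size_t, 256> hist{};
    for (uint8_t p : pixels) ++hist[p];

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * static_cast<double>(hist[v]);

    double w0 = 0.0, sum0 = 0.0, bestVar = -1.0;
    int bestT = 0;
    for (int t = 0; t < 256; ++t) {
        w0 += static_cast<double>(hist[t]);
        sum0 += t * static_cast<double>(hist[t]);
        if (w0 == 0.0) continue;        // class 0 still empty
        const double w1 = total - w0;
        if (w1 == 0.0) break;           // class 1 empty from here on
        const double m0 = sum0 / w0;
        const double m1 = (sumAll - sum0) / w1;
        const double var = w0 * w1 * (m0 - m1) * (m0 - m1);
        if (var > bestVar) { bestVar = var; bestT = t; }
    }
    return bestT;
}
```

On a bimodal image the returned value falls between the two modes; this sketch keeps the first maximum when several thresholds score equally.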

adaptiveThreshold

Adaptive threshold — each pixel's threshold = local_mean(src, blockSize) - C (or Gaussian-weighted mean).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—
Method: `ADAPTIVE_THRESH_MEAN_C` / `ADAPTIVE_THRESH_GAUSSIAN_C`

CPP Version

cpp

template<class T>
int adaptiveThreshold(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    T maxVal,
    int blockSize,
    double C,
    acl::AdaptiveThreshMethod am = acl::AdaptiveThreshMethod::ADAPTIVE_THRESH_MEAN_C);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	Input / output image	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`maxVal`	`T`	Output value for pixels above threshold	typically `255`
`blockSize`	`int`	Local window size (must be odd and ≥ 3)	typically `11` / `25`
`C`	`double`	Threshold adjustment constant	typically `2` ~ `10`
`am`	`acl::AdaptiveThreshMethod`	`ADAPTIVE_THRESH_MEAN_C` / `ADAPTIVE_THRESH_GAUSSIAN_C`	`MEAN_C`

NEON Version (`uint8_t` only)

cpp

template<class T>
int adaptiveThreshold(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    T maxVal,
    int blockSize,
    double C,
    acl::AdaptiveThreshMethod am = acl::AdaptiveThreshMethod::ADAPTIVE_THRESH_MEAN_C);

bitwise

Bitwise operations AND / NOT / XOR.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`{uint8_t, uint16_t}`	—
`T` (NEON)	`uint8_t`	NEON-only

No bitwiseOr entry point is provided in the current version; OR can be composed via De Morgan's law as NOT(AND(NOT a, NOT b)).
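
The composition can be sketched as follows. To keep the example self-contained, `andRef` and `notRef` are local stand-ins for the element-wise behaviour of bitwiseAnd and bitwiseNot, and `bitwiseOrComposed` is a hypothetical helper, not part of the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for the element-wise behaviour of bitwiseAnd / bitwiseNot.
static void andRef(const uint8_t* a, const uint8_t* b, uint8_t* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = static_cast<uint8_t>(a[i] & b[i]);
}
static void notRef(const uint8_t* src, uint8_t* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = static_cast<uint8_t>(~src[i]);
}

// De Morgan: a | b == ~(~a & ~b)
std::vector<uint8_t> bitwiseOrComposed(const std::vector<uint8_t>& a,
                                       const std::vector<uint8_t>& b) {
    const std::size_t n = a.size();
    std::vector<uint8_t> na(n), nb(n), dst(n);
    notRef(a.data(), na.data(), n);
    notRef(b.data(), nb.data(), n);
    andRef(na.data(), nb.data(), dst.data(), n);
    notRef(dst.data(), dst.data(), n);  // final NOT, done in place
    return dst;
}
```

With the real entry points this costs three NOT passes and one AND pass, plus two temporary buffers unless the inputs may be overwritten in place.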

CPP / NEON Signature (identical)

cpp

// AND
template<class T>
int bitwiseAnd(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

// NOT
template<class T>
int bitwiseNot(
    const T* srcImage, T* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0);

// XOR
template<class T>
int bitwiseXor(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

lut

Lookup-table transform dst[i] = table[src[i]]. Commonly used for gamma correction, color mapping, curve adjustments.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`ST` (CPP)	`uint8_t`	tested combinations
`DT` (CPP)	`uint8_t, uint16_t, float`	tested combinations
`T` (NEON)	`uint8_t`	NEON-only

CPP Version: 1ch, `ST → DT`

cpp

template<class ST, class DT>
int lut(
    const ST* srcImage, DT* dstImage,
    const DT* table,
    int width, int height,
    int srcStride = 0, int dstStride = 0);

table length must cover all possible values of ST (uint8_t → 256 entries, uint16_t → 65536 entries).

NEON Version: 1ch / 3ch / 4ch, `uint8_t` only

cpp

int lut(
    const uint8_t* srcImage, uint8_t* dstImage,
    const uint8_t* table,
    int width, int height,
    int cn = 1,
    int srcStride = 0, int dstStride = 0);

The NEON version additionally supports multi-channel (applies the same 256-entry table to all channels).

Example

cpp

uint8_t srcImage[1920*1080*3], dstImage[1920*1080*3];

// Gamma 2.2 encoding LUT (requires <cmath>)
uint8_t gamma_lut[256];
for (int i = 0; i < 256; ++i)
    gamma_lut[i] = (uint8_t)(std::pow(i / 255.0, 1.0 / 2.2) * 255.0 + 0.5);  // round to nearest

// 3-channel gamma correction (NEON)
acl::neon::arithmetic::lut(srcImage, dstImage, gamma_lut, 1920, 1080, 3);

convertScaleAbs

Scalar multiplication + offset + absolute value + convert to u8: dst = saturate_cast<u8>(|alpha*src + beta|).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (requires ST == uint8_t)
Types:

Template parameter	Allowed types	Constraint
`ST`	`uint8_t, int16_t, uint16_t, float`	output fixed to `uint8_t`

CPP / NEON Signature (identical)

cpp

template<class ST>
int convertScaleAbs(
    const ST* srcImage, uint8_t* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    double alpha, double beta);

Typical pairing: visualize an int16_t gradient from Sobel / Scharr by calling convertScaleAbs<int16_t>.
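
A per-element sketch of the formula, for an `int16_t` gradient value. `convertScaleAbsRef` is a hypothetical helper, and rounding to nearest before saturation is an assumption about the conversion step.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// dst = saturate_u8(|alpha * v + beta|), rounded to nearest.
uint8_t convertScaleAbsRef(int16_t v, double alpha, double beta) {
    const double magnitude = std::fabs(alpha * v + beta);
    return static_cast<uint8_t>(std::min(std::lround(magnitude), 255L));
}
```

A gradient of -300 with alpha = 0.5 maps to 150, while -1000 with alpha = 1 saturates at 255, which is why a scale below 1 is commonly chosen for display.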

inRange

Range check: output 255 where every channel satisfies low[c] ≤ src[c] ≤ high[c], otherwise 0. Commonly used for HSV color segmentation.

Tier: Starter+
Channels: input 1ch / 3ch / 4ch (per-channel evaluation), output 1ch binary mask
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP / NEON Signature (identical)

cpp

template<class T>
int inRange(
    const T* srcImage, uint8_t* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    const T* low, const T* high);

Parameter	Type	Meaning
`srcImage`	`const T*`	Input image (multi-channel, interleaved)
`dstImage`	`uint8_t*`	Output 1ch binary mask
`low`, `high`	`const T*`	Lower / upper bound arrays of length `cn`

Example

cpp

uint8_t hsv[1920*1080*3], mask[1920*1080];

// Extract red: H∈[0,10] S∈[100,255] V∈[100,255]
uint8_t low[3]  = {0, 100, 100};
uint8_t high[3] = {10, 255, 255};
acl::neon::arithmetic::inRange(hsv, mask, 1920, 1080, 3, 0, 0, low, high);
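
The per-pixel rule can be written as a short scalar reference. `inRangeRef` is a hypothetical helper for tightly packed, interleaved `uint8_t` data; it only illustrates the semantics described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// mask[p] = 255 if every channel c of pixel p lies in [low[c], high[c]], else 0.
std::vector<uint8_t> inRangeRef(const std::vector<uint8_t>& src, int cn,
                                const uint8_t* low, const uint8_t* high) {
    const std::size_t pixels = src.size() / static_cast<std::size_t>(cn);
    std::vector<uint8_t> mask(pixels);
    for (std::size_t p = 0; p < pixels; ++p) {
        bool inside = true;
        for (int c = 0; c < cn; ++c) {
            const uint8_t v = src[p * cn + c];
            inside = inside && low[c] <= v && v <= high[c];
        }
        mask[p] = inside ? 255 : 0;
    }
    return mask;
}
```

A single out-of-range channel is enough to reject a pixel, so the mask is the intersection of the per-channel ranges.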

normalize

Normalization — linearly map pixel values to the target range.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—
NormType: `NORM_MINMAX` (maps to `[alpha, beta]`)

CPP / NEON Signature (identical)

cpp

template<class T>
int normalize(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    acl::NormType nt = acl::NormType::NORM_MINMAX,
    double alpha = 0.0, double beta = 255.0);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	Input / output image	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	required (`0` = auto)
`nt`	`acl::NormType`	Normalization mode	`NORM_MINMAX`
`alpha`	`double`	Target lower bound	`0.0`
`beta`	`double`	Target upper bound (for `NORM_MINMAX` only)	`255.0`
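
For `NORM_MINMAX`, the mapping is dst = alpha + (src - min) × (beta - alpha) / (max - min). A scalar sketch for `float` data follows; `normalizeMinMaxRef` is a hypothetical helper, and mapping a flat image (max == min) to `alpha` is this sketch's choice, not documented library behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Linearly stretch [min(src), max(src)] onto [alpha, beta].
std::vector<float> normalizeMinMaxRef(const std::vector<float>& src,
                                      double alpha = 0.0, double beta = 255.0) {
    std::vector<float> dst(src.size());
    if (src.empty()) return dst;
    const auto mm = std::minmax_element(src.begin(), src.end());
    const double mn = *mm.first, mx = *mm.second;
    // A flat image has no range to stretch; this sketch maps it to alpha.
    const double scale = (mx > mn) ? (beta - alpha) / (mx - mn) : 0.0;
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = static_cast<float>(alpha + (src[i] - mn) * scale);
    return dst;
}
```

Input values 10, 20, 30 with the default range map to 0, 127.5, 255.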

linearTransform2x2

2×2 pixel-block linear transform: the caller provides a 2×2 coefficient matrix [[v00, v01], [v10, v11]], applied to each 2×2 pixel block; commonly used for Bayer color correction, etc.

Tier: Business
Channels: 1ch (input is interpreted in a 2×2 block structure)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—
`DT`	`uint8_t, uint16_t, float`	—

CPP Signature

cpp

template<class T, class DT>
int linearTransform2x2(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int minValue, int maxValue,
    const DT& v00, const DT& v01, const DT& v10, const DT& v11);

Parameter	Type	Meaning
`minValue`, `maxValue`	`int`	Output clamp lower / upper bounds
`v00`, `v01`, `v10`, `v11`	`DT`	2×2 matrix coefficients

phaseMagnitude

Compute phase angle and magnitude from Sobel / Scharr gradients (dx, dy).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`GT` (CPP)	`{int16_t, int32_t}` (template also accepts `float`)	—
`T` (NEON)	`int16_t`	NEON-only

CPP Version: two independent entry points

cpp

// Phase angle (radians or degrees)
template<class GT>
int phase(
    const GT* dx, const GT* dy, float* angle,
    int width, int height,
    int dxStride, int dyStride, int angleStride,
    bool angleInDegrees = true);

// Magnitude (L2 norm)
template<class GT>
int magnitude(
    const GT* dx, const GT* dy, float* mag,
    int width, int height,
    int dxStride, int dyStride, int magStride);

GT ∈ {short, int, float}. Output is fixed to float.

NEON Version (`short` input only, non-templated)

cpp

int magnitude(
    const short* dx, const short* dy, float* mag,
    int width, int height,
    int dxStride, int dyStride, int magStride);

int phase(
    const short* dx, const short* dy, float* angle,
    int width, int height,
    int dxStride, int dyStride, int angleStride,
    bool angleInDegrees = true);

Typical pairing: sobel3x3<short> outputs a short gradient that feeds directly into NEON magnitude / phase.

Example

cpp

uint8_t srcImage[1920*1080];
short dx[1920*1080], dy[1920*1080];
float mag[1920*1080], angle[1920*1080];

// Sobel → gradient magnitude + phase
acl::neon::filter::sobel3x3<uint8_t, short>(srcImage, dx, dy, 1920, 1080);
acl::neon::arithmetic::magnitude(dx, dy, mag, 1920, 1080, 0, 0, 0);
acl::neon::arithmetic::phase(dx, dy, angle, 1920, 1080, 0, 0, 0, /*angleInDegrees=*/true);
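
A per-element reference for the two outputs. `magnitudeRef` and `phaseRef` are hypothetical helpers; reporting the angle in [0, 360) degrees (or [0, 2π) radians) is an assumption borrowed from the usual OpenCV convention, not something this reference states.

```cpp
#include <cmath>

constexpr float kPiRef = 3.14159265358979323846f;

// L2 norm of the gradient vector (dx, dy).
float magnitudeRef(short dx, short dy) {
    return std::sqrt(static_cast<float>(dx) * dx + static_cast<float>(dy) * dy);
}

// Gradient direction; assumed range is [0, 360) degrees or [0, 2*pi) radians.
float phaseRef(short dx, short dy, bool angleInDegrees = true) {
    float a = std::atan2(static_cast<float>(dy), static_cast<float>(dx));  // (-pi, pi]
    if (a < 0.0f) a += 2.0f * kPiRef;                                      // shift to [0, 2*pi)
    return angleInDegrees ? a * 180.0f / kPiRef : a;
}
```

For (dx, dy) = (3, 4) the magnitude is 5; a purely vertical gradient (0, 1) gives 90 degrees under this convention.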

Color Conversion

Namespace: acl::cvtcolor (CPP) / acl::neon::cvtcolor (NEON)

RGB2Gray / RGBA2Gray

RGB(A) → grayscale image. Supports 5 grayscale strategies (luma BT.601, max, min, average, weighted).

Tier: Starter+
Channels: input 3ch (RGB) or 4ch (RGBA), output 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::cvtcolor::RGB2Gray` / `RGBA2Gray`

cpp

template<class T>
int RGB2Gray(
    const T* rgbImage, T* grayImage,
    int width, int height,
    int rgbStride = 0, int grayStride = 0,
    acl::ColorCvtGrayMode mode = acl::ColorCvtGrayMode::GRAY_LUMA,
    float cR = 0.299f, float cG = 0.587f, float cB = 0.114f);

template<class T>
int RGBA2Gray(
    const T* rgbaImage, T* grayImage,
    int width, int height,
    int rgbaStride = 0, int grayStride = 0,
    acl::ColorCvtGrayMode mode = acl::ColorCvtGrayMode::GRAY_LUMA,
    float cR = 0.299f, float cG = 0.587f, float cB = 0.114f);

Parameter	Type	Meaning	Default
`rgbImage` / `rgbaImage`	`const T*`	Input RGB / RGBA image	non-null
`grayImage`	`T*`	Output grayscale image	non-null
`width`, `height`	`int`	Image size	> 0
`rgbStride`/`rgbaStride`, `grayStride`	`int`	Bytes per row	`0` = auto
`mode`	`acl::ColorCvtGrayMode`	Grayscale strategy	`GRAY_LUMA`
`cR`, `cG`, `cB`	`float`	Weights used only in `GRAY_WEIGHTED` mode	BT.601 defaults

mode values (see acl::ColorCvtGrayMode):

GRAY_LUMA (default) — BT.601 luma: 0.299R + 0.587G + 0.114B
GRAY_WEIGHTED — uses custom weights cR/cG/cB
GRAY_MIN / GRAY_MAX / GRAY_AVG — take min / max / mean across channels
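
The five strategies reduce to one small per-pixel function. The sketch below uses a local enum `GrayModeRef` and a hypothetical helper `toGrayRef` rather than the library types, and it assumes round-to-nearest for the floating-point modes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

enum class GrayModeRef { Luma, Weighted, Min, Max, Avg };

// Per-pixel grayscale value for one RGB triple.
uint8_t toGrayRef(uint8_t r, uint8_t g, uint8_t b, GrayModeRef mode,
                  float cR = 0.299f, float cG = 0.587f, float cB = 0.114f) {
    float v = 0.0f;
    switch (mode) {
        case GrayModeRef::Luma:     v = 0.299f * r + 0.587f * g + 0.114f * b; break;
        case GrayModeRef::Weighted: v = cR * r + cG * g + cB * b; break;
        case GrayModeRef::Min:      v = std::min(r, std::min(g, b)); break;
        case GrayModeRef::Max:      v = std::max(r, std::max(g, b)); break;
        case GrayModeRef::Avg:      v = (r + g + b) / 3.0f; break;
    }
    const long rounded = std::lround(v);
    return static_cast<uint8_t>(std::min(std::max(rounded, 0L), 255L));
}
```

Pure red (255, 0, 0) gives 76 in luma mode, 0 in min mode, 255 in max mode, and 85 in average mode.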

NEON Version: `acl::neon::cvtcolor::RGB2Gray` / `RGBA2Gray` (`uint8_t` only)

cpp

int RGB2Gray(   // or RGBA2Gray
    const uint8_t* rgbImage, uint8_t* grayImage,
    int width, int height,
    int rgbStride = 0, int grayStride = 0,
    acl::ColorCvtGrayMode mode = acl::ColorCvtGrayMode::GRAY_LUMA,
    float cR = 0.299f, float cG = 0.587f, float cB = 0.114f);

The element type is fixed to uint8_t; other parameter semantics match the CPP version.

Example

cpp

uint8_t rgb[1920*1080*3], gray[1920*1080];

// BT.601 luma grayscale (default)
acl::neon::cvtcolor::RGB2Gray(rgb, gray, 1920, 1080);

// Channel-wise max mode (u16)
uint16_t src16[1920*1080*3], gray16[1920*1080];
acl::cvtcolor::RGB2Gray<uint16_t>(
    src16, gray16, 1920, 1080,
    /*rgbStride=*/0, /*grayStride=*/0,
    acl::ColorCvtGrayMode::GRAY_MAX);

Channel Swap (BGR / RGB / RGBA interchange)

Channel-order swap (3ch ↔ 3ch, 3ch → 4ch, 4ch → 3ch). One template entry point channelSwap<Mode>() covers every direction; only the Mode tag changes.

Tier: Starter+
Channels: 3ch ↔ 3ch / 3ch → 4ch / 4ch → 3ch
Inplace: not supported (when channel counts differ)
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Version (identical signatures, `uint8_t` only)

All channel-swap directions are dispatched through a single channelSwap<Mode>() template — the Mode template parameter is an empty tag struct selecting the conversion direction. The 5 tag structs cover all 10 useful directions (each tag handles a pair of equivalent swaps).

cpp

template<class Mode>
int channelSwap(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height, int srcStride = 0, int dstStride = 0);

// Mode tags (declared in typeDef.h, namespace acl::):
//   3ch ↔ 3ch:   BGR2RGB    (also covers RGB → BGR — same byte layout)
//   3ch → 4ch:   BGR2BGRA   (also covers RGB → RGBA)
//                BGR2RGBA   (also covers RGB → BGRA — swap R/B + add alpha)
//   4ch → 3ch:   BGRA2BGR   (also covers RGBA → RGB)
//                BGRA2RGB   (also covers RGBA → BGR — swap R/B + drop alpha)

CPP lives in acl::cvtcolor::, NEON in acl::neon::cvtcolor::, with identical signatures.
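
One way such tag dispatch can work is sketched below. The `Ref*` tag structs and `channelSwapRef` are hypothetical stand-ins, not the `acl::` declarations, and filling the added alpha channel with 255 is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tags: each empty struct carries compile-time facts only.
struct RefBGR2RGB {
    static constexpr int srcCn = 3, dstCn = 3;
    static constexpr bool swapRB = true;
};
struct RefBGR2RGBA {
    static constexpr int srcCn = 3, dstCn = 4;
    static constexpr bool swapRB = true;
};
struct RefBGRA2BGR {
    static constexpr int srcCn = 4, dstCn = 3;
    static constexpr bool swapRB = false;
};

// One template body serves every direction; the tag selects the layout.
template <class Mode>
std::vector<uint8_t> channelSwapRef(const std::vector<uint8_t>& src) {
    const std::size_t pixels = src.size() / Mode::srcCn;
    std::vector<uint8_t> dst(pixels * Mode::dstCn);
    for (std::size_t p = 0; p < pixels; ++p) {
        const uint8_t* s = &src[p * Mode::srcCn];
        uint8_t* d = &dst[p * Mode::dstCn];
        d[0] = Mode::swapRB ? s[2] : s[0];
        d[1] = s[1];
        d[2] = Mode::swapRB ? s[0] : s[2];
        if (Mode::dstCn == 4) d[3] = 255;  // assumption: opaque alpha
    }
    return dst;
}
```

Because the tag is a template argument, the branch on `swapRB` and `dstCn` is resolved at compile time, and the same tag serves the mirrored direction (for example RGB → BGR).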

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const uint8_t*` / `uint8_t*`	Input / output image	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto

Example

cpp

uint8_t bgr[1920*1080*3], rgba[1920*1080*4];

// BGR → RGBA (NEON) — pick the Mode tag that matches the direction
acl::neon::cvtcolor::channelSwap<acl::BGR2RGBA>(bgr, rgba, 1920, 1080);

RGB ↔ YUV (fixed-point)

Conversion between RGB/RGBA and YUV (NV21 / YV12 / YUV444), using integer fixed-point arithmetic. Supports BT.601 / BT.709 / BT.2020 + full-range / limited-range combinations via the unified YUVConvertParams struct.

Tier: Starter+
Channels: 3ch RGB ↔ NV21 / NV12 / YV12 / YUV444
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`RGB_T`	`uint8_t, uint16_t`	—
`YUV_T`	`uint8_t, uint16_t`	—

NV21 and NV12 share the same entry point; switch via YUVConvertParams::nv21_fmt (true = NV21, false = NV12).

CPP Version: `acl::cvtcolor::rgb*2*_fixed` (RGB → YUV, 6 entry points)

NV21 / NV12 (Y plane + interleaved UV plane)

cpp

template<class RGB_T, class YUV_T>
int rgb2NV21_fixed(
    const RGB_T* rgbImage, YUV_T* dstYImage, YUV_T* dstUVImage,
    int width, int height,
    int rgbStride = 0, int yStride = 0, int uvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T>
int rgba2NV21_fixed(
    const RGB_T* rgbaImage, YUV_T* dstYImage, YUV_T* dstUVImage,
    int width, int height,
    int rgbaStride = 0, int yStride = 0, int uvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YV12 (three independent planes Y / U / V; U/V are each width/2 × height/2)

cpp

template<class RGB_T, class YUV_T>
int rgb2YV12_fixed(
    const RGB_T* rgbImage,
    YUV_T* dstYImage, YUV_T* dstUImage, YUV_T* dstVImage,
    int width, int height,
    int rgbStride = 0, int yStride = 0, int uStride = 0, int vStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T>
int rgba2YV12_fixed(
    const RGB_T* rgbaImage,
    YUV_T* dstYImage, YUV_T* dstUImage, YUV_T* dstVImage,
    int width, int height,
    int rgbaStride = 0, int yStride = 0, int uStride = 0, int vStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YUV444 (single-plane interleaved)

cpp

template<class RGB_T, class YUV_T>
int rgb2YUV444_fixed(
    const RGB_T* rgbImage, YUV_T* dstYUVImage,
    int width, int height,
    int rgbStride = 0, int yuvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T>
int rgba2YUV444_fixed(
    const RGB_T* rgbaImage, YUV_T* dstYUVImage,
    int width, int height,
    int rgbaStride = 0, int yuvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

CPP Version: `acl::cvtcolor::*2RGB_fixed` (YUV → RGB, 6 entry points)

NV21 / NV12 → RGB / RGBA

cpp

template<class YUV_T, class RGB_T>
int nv212RGB_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUVImage, RGB_T* rgbImage,
    int width, int height,
    int yStride = 0, int uvStride = 0, int rgbStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class YUV_T, class RGB_T>
int nv212RGBA_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUVImage, RGB_T* rgbaImage,
    int width, int height,
    int yStride = 0, int uvStride = 0, int rgbaStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YV12 → RGB / RGBA (three input planes)

cpp

template<class YUV_T, class RGB_T>
int yv122RGB_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUImage, const YUV_T* srcVImage,
    RGB_T* rgbImage,
    int width, int height,
    int yStride = 0, int uStride = 0, int vStride = 0, int rgbStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class YUV_T, class RGB_T>
int yv122RGBA_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUImage, const YUV_T* srcVImage,
    RGB_T* rgbaImage,
    int width, int height,
    int yStride = 0, int uStride = 0, int vStride = 0, int rgbaStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YUV444 → RGB / RGBA (single input plane)

cpp

template<class YUV_T, class RGB_T>
int yuv4442RGB_fixed(
    const YUV_T* srcYUVImage, RGB_T* rgbImage,
    int width, int height,
    int yuvStride = 0, int rgbStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class YUV_T, class RGB_T>
int yuv4442RGBA_fixed(
    const YUV_T* srcYUVImage, RGB_T* rgbaImage,
    int width, int height,
    int yuvStride = 0, int rgbaStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

Parameter	Meaning
`rgbImage` / `rgbaImage`	RGB / RGBA plane, 3 or 4 elements per pixel
`dstYImage`, `dstUVImage` / `dstUImage`, `dstVImage` / `dstYUVImage`	YUV output planes (NV merges UV, YV12 has separate U/V, YUV444 is interleaved)
`width`, `height`	Image size (NV/YV12 require even values)
`*Stride`	Bytes per row, `0` = auto
`cvMatrix`	Custom 3×3 conversion matrix (used only when `p.yuv_std = YUVEncodeStandard::STD_CUSTOM`)
`p`	`YUVConvertParams` — selects standard, channel order, bit depth, range. Defaults to BT.601 8-bit full-range RGB.

NEON Version: `acl::neon::cvtcolor::*_fixed` (`uint8_t` only)

cpp

template<class RGB_T, class YUV_T>
int rgb2NV21_fixed(
    const RGB_T* rgbImage, YUV_T* dstYImage, YUV_T* dstUVImage,
    int width, int height,
    int rgbStride = 0, int yStride = 0, int uvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

// Corresponds to 12 NEON entry points (signatures match the CPP versions):
//   rgb2NV21_fixed / rgba2NV21_fixed / rgb2YV12_fixed / rgba2YV12_fixed
//   rgb2YUV444_fixed / rgba2YUV444_fixed
//   nv212RGB_fixed / nv212RGBA_fixed / yv122RGB_fixed / yv122RGBA_fixed
//   yuv4442RGB_fixed / yuv4442RGBA_fixed

NEON RGB/YUV input/output types are uint8_t only; all other parameters (including YUVConvertParams) match the CPP version exactly.

Example

cpp

uint8_t rgb[1920*1080*3];
uint8_t y[1920*1080], uv[1920*540];   // UV plane is interleaved: width × height/2 elements

// BT.601 full-range RGB → NV21 (default params)
acl::neon::cvtcolor::rgb2NV21_fixed<uint8_t, uint8_t>(
    rgb, y, uv, 1920, 1080);

// BT.709 limited-range NV12 (override defaults)
acl::YUVConvertParams p;
p.yuv_std         = acl::YUVEncodeStandard::STD_BT709;
p.nv21_fmt        = false;   // NV12
p.yuv_full_range  = false;   // limited range [16, 235/240]
acl::cvtcolor::rgb2NV21_fixed<uint8_t, uint8_t>(
    rgb, y, uv, 1920, 1080,
    /*rgbStride=*/0, /*yStride=*/0, /*uvStride=*/0,
    /*cvMatrix=*/nullptr, p);
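
Buffer sizes for the semi-planar and planar layouts follow from 4:2:0 subsampling. The helpers below (`YuvPlaneSizes`, `nvPlaneSizes`, `yv12PlaneSizes`) are hypothetical and count elements, not bytes; they assume even dimensions and tightly packed rows (stride 0 = auto).

```cpp
#include <cstddef>

struct YuvPlaneSizes {
    std::size_t y;       // elements in the Y plane
    std::size_t chroma;  // elements in each chroma plane
};

// NV21 / NV12: one Y plane plus one interleaved chroma plane
// (V,U pairs for NV21, U,V pairs for NV12).
constexpr YuvPlaneSizes nvPlaneSizes(int width, int height) {
    return { static_cast<std::size_t>(width) * height,
             static_cast<std::size_t>(width / 2) * (height / 2) * 2 };
}

// YV12: one Y plane plus separate U and V planes of equal size.
constexpr YuvPlaneSizes yv12PlaneSizes(int width, int height) {
    return { static_cast<std::size_t>(width) * height,
             static_cast<std::size_t>(width / 2) * (height / 2) };
}
```

For 1920 × 1080 this gives a 2,073,600-element Y plane, a 1,036,800-element interleaved chroma plane for NV21 / NV12, and 518,400 elements each for the U and V planes of YV12. Multiply by `sizeof(YUV_T)` for byte counts.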

RGB ↔ YUV (float, CPP only)

Floating-point implementation (higher precision, slightly slower than the fixed-point path; suitable for precision-sensitive scenarios). NEON does not provide this path.

Tier: Starter+
Channels: 3ch RGB ↔ NV21 / NV12 / YV12 / YUV444
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`RGB_T`	`uint8_t, uint16_t, float`	—
`YUV_T`	`uint8_t, uint16_t, float`	—

CPP Signature (one-to-one correspondence with the `_fixed` version; only the `_fixed` suffix is removed)

Signatures, parameters, and the trailing YUVConvertParams& p are identical to the fixed-point versions; the float variants (no suffix) prioritise numerical accuracy, while the _fixed variants prioritise throughput. 12 entry points:

RGB → YUV

cpp

template<class RGB_T, class YUV_T>
int rgb2NV21(const RGB_T* rgbImage, YUV_T* dstYImage, YUV_T* dstUVImage,
             int width, int height,
             int rgbStride = 0, int yStride = 0, int uvStride = 0,
             const float* cvMatrix = nullptr,
             const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T> int rgba2NV21(/* same as above with rgba prefix */);

template<class RGB_T, class YUV_T>
int rgb2YV12(const RGB_T* rgbImage,
             YUV_T* dstYImage, YUV_T* dstUImage, YUV_T* dstVImage,
             int width, int height,
             int rgbStride = 0, int yStride = 0, int uStride = 0, int vStride = 0,
             const float* cvMatrix = nullptr,
             const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T> int rgba2YV12(/* same as above */);

template<class RGB_T, class YUV_T>
int rgb2YUV444(const RGB_T* rgbImage, YUV_T* dstYUVImage,
               int width, int height,
               int rgbStride = 0, int yuvStride = 0,
               const float* cvMatrix = nullptr,
               const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T> int rgba2YUV444(/* same as above */);

YUV → RGB

cpp

template<class YUV_T, class RGB_T> int nv212RGB (...);   // signatures mirror nv212RGB_fixed
template<class YUV_T, class RGB_T> int nv212RGBA(...);
template<class YUV_T, class RGB_T> int yv122RGB (...);
template<class YUV_T, class RGB_T> int yv122RGBA(...);
template<class YUV_T, class RGB_T> int yuv4442RGB (...);
template<class YUV_T, class RGB_T> int yuv4442RGBA(...);

Parameter lists and defaults are completely identical to the corresponding _fixed entry points; only the function name drops _fixed.

Example

cpp

// float RGB → NV21 (float precision)
float rgb_f[1920*1080*3];
float y_f[1920*1080], uv_f[1920*540];   // UV plane is interleaved: width × height/2 elements

acl::cvtcolor::rgb2NV21<float, float>(
    rgb_f, y_f, uv_f, 1920, 1080);
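
When sizing the destination buffers for the calls above, it can help to compute the plane sizes explicitly. The sketch below assumes the standard tightly packed 4:2:0 layouts (NV21 with one interleaved VU plane, YV12 with separate V and U planes) and even dimensions. PlaneSizes, nv21PlaneSizes, and yv12PlaneSizes are hypothetical helpers for illustration, not part of ACL Pack.

```cpp
#include <cstddef>

// Hypothetical helpers (not ACL Pack API): minimum element counts for
// tightly packed 4:2:0 planes, assuming even width and height.
struct PlaneSizes {
    std::size_t y;       // luma samples
    std::size_t chroma;  // samples in EACH chroma buffer
};

// NV21: one Y plane plus one interleaved VU plane.
inline PlaneSizes nv21PlaneSizes(int width, int height) {
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    // Each chroma row holds w/2 (V,U) pairs = w samples; there are h/2 rows.
    return { w * h, w * (h / 2) };
}

// YV12: one Y plane plus separate V and U planes.
inline PlaneSizes yv12PlaneSizes(int width, int height) {
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    // Each chroma plane is subsampled by 2 in both directions.
    return { w * h, (w / 2) * (h / 2) };
}
```

Multiply the element counts by sizeof(YUV_T) to get byte sizes; non-zero strides would increase them.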

Bayer Demosaic

Bayer raw image → RGB / RGBA. Supports 4 Bayer patterns (RGGB / GRBG / GBRG / BGGR).

Tier: Starter+
Channels: 1ch Bayer → 3ch RGB / 4ch RGBA
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`, `DT` (CPP)	`uint8_t, uint16_t`	—
NEON (non-templated)	`uint8_t`	NEON-only

CPP Version: `acl::cvtcolor::bayer2RGB` / `bayer2RGBA`

cpp

template<class ST, class DT>
int bayer2RGB(
    const ST* bayerImage, DT* rgbImage,
    int width, int height,
    int bayerStride, int rgbStride,
    int borderMode = 1,
    int bayerDataBit = 8,
    int RGBDataBit = 8,
    acl::BayerPattern pattern = acl::BayerPattern::GBRG);

// RGBA output (4ch):
template<class ST, class DT>
int bayer2RGBA(
    const ST* bayerImage, DT* rgbaImage,
    int width, int height,
    int bayerStride, int rgbaStride,
    int borderMode = 1,
    int bayerDataBit = 8,
    int RGBDataBit = 8,
    acl::BayerPattern pattern = acl::BayerPattern::GBRG);

Parameter	Type	Meaning	Default
`bayerImage`	`const ST*`	Input Bayer raw image	non-null
`rgbImage` / `rgbaImage`	`DT*`	Output RGB / RGBA image	non-null
`width`, `height`	`int`	Image size	> 0, even
`bayerStride`, `rgbStride` / `rgbaStride`	`int`	Bytes per row	`0` = auto
`borderMode`	`int`	`0` = replicate the inner ring; `1` = reflect_101	`1`
`bayerDataBit`	`int`	Effective Bayer data bits	`8`
`RGBDataBit`	`int`	Effective RGB data bits	`8`
`pattern`	`acl::BayerPattern`	Bayer mosaic pattern (`RGGB` / `GRBG` / `GBRG` / `BGGR`)	`GBRG`

NEON Version: `acl::neon::cvtcolor::bayer2RGB` (`uint8_t` only, 3ch only)

cpp

int bayer2RGB(
    const uint8_t* bayerImage, uint8_t* rgbImage,
    int width, int height,
    acl::BayerPattern pattern,
    int bayerStride = 0, int rgbStride = 0);

NEON has no bayer2RGBA entry point, and borderMode is fixed to reflect_101.

Example

cpp

uint8_t bayer[1920*1080], rgb[1920*1080*3];

// NEON (runtime pattern)
acl::neon::cvtcolor::bayer2RGB(bayer, rgb, 1920, 1080,
    acl::BayerPattern::RGGB);

// CPP (u16 bayer → u8 RGB, RGGB pattern)
uint16_t bayer16[1920*1080];
acl::cvtcolor::bayer2RGB<uint16_t, uint8_t>(
    bayer16, rgb, 1920, 1080,
    /*bayerStride=*/0, /*rgbStride=*/0,
    /*borderMode=*/1, /*bayerDataBit=*/10, /*RGBDataBit=*/8,
    acl::BayerPattern::RGGB);
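
To make the four pattern names concrete, the sketch below maps a pixel coordinate to the colour sampled at that site. It assumes the usual convention that the pattern name lists the 2×2 tile in row-major order (top-left, top-right, bottom-left, bottom-right). The Pattern and Color enums and bayerColorAt are hypothetical stand-ins for illustration; the library's own enum is acl::BayerPattern.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only.
enum class Pattern { RGGB, GRBG, GBRG, BGGR };
enum class Color : std::uint8_t { R, G, B };

// Returns which colour the sensor samples at pixel (x, y), assuming the
// pattern name spells out the 2x2 tile in row-major order.
inline Color bayerColorAt(Pattern p, int x, int y) {
    static constexpr Color kTiles[4][4] = {
        { Color::R, Color::G, Color::G, Color::B },  // RGGB
        { Color::G, Color::R, Color::B, Color::G },  // GRBG
        { Color::G, Color::B, Color::R, Color::G },  // GBRG
        { Color::B, Color::G, Color::G, Color::R },  // BGGR
    };
    // Index inside the 2x2 tile: bit 1 = row parity, bit 0 = column parity.
    const int idx = ((y & 1) << 1) | (x & 1);
    return kTiles[static_cast<int>(p)][idx];
}
```

Because the tile repeats every two pixels in each direction, even width and height guarantee that the image contains only complete tiles.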

RGB ↔ HSV

Conversion between RGB/BGR and HSV.

Tier: Pro+
Channels: 3ch ↔ 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

// CPP and NEON have identical names and signatures; only the namespace differs
int bgr2HSV(const uint8_t* bgrImage, uint8_t* hsvImage,
            int width, int height, int srcStride = 0, int dstStride = 0);
int rgb2HSV(const uint8_t* rgbImage, uint8_t* hsvImage, ...);
int hsv2BGR(const uint8_t* hsvImage, uint8_t* bgrImage, ...);   // CPP only

NEON provides only the two entry points bgr2HSV / rgb2HSV; there is no hsv2BGR. Use the CPP version for the reverse conversion.

Parameter	Type	Meaning	Default
`bgrImage` / `rgbImage` / `hsvImage`	`const uint8_t*` / `uint8_t*`	input / output images	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto

HSV encoding: H ∈ [0, 180) (degrees / 2, OpenCV-compatible); S, V ∈ [0, 255].

Example

cpp

uint8_t rgb[1920*1080*3], hsv[1920*1080*3];

acl::neon::cvtcolor::rgb2HSV(rgb, hsv, 1920, 1080);
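
For checking results, a scalar reference of the 8-bit HSV encoding can be handy. The sketch below assumes the OpenCV 8-bit convention (H stored as degrees / 2, S and V scaled to 0-255); Hsv8 and rgbToHsv8 are hypothetical names, and the library's rounding may differ by one count.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Hsv8 { std::uint8_t h, s, v; };  // hypothetical helper type

// Scalar reference for one pixel, assuming OpenCV-style 8-bit HSV.
inline Hsv8 rgbToHsv8(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const int maxC  = std::max({int(r), int(g), int(b)});
    const int minC  = std::min({int(r), int(g), int(b)});
    const int delta = maxC - minC;

    float hueDeg = 0.0f;  // hue in degrees, 0 for grey pixels
    if (delta != 0) {
        if (maxC == r)      hueDeg = 60.0f * (g - b) / delta;
        else if (maxC == g) hueDeg = 120.0f + 60.0f * (b - r) / delta;
        else                hueDeg = 240.0f + 60.0f * (r - g) / delta;
        if (hueDeg < 0.0f) hueDeg += 360.0f;
    }
    const float sat = (maxC == 0) ? 0.0f : 255.0f * delta / maxC;

    int h = static_cast<int>(std::lround(hueDeg / 2.0f));
    if (h >= 180) h -= 180;  // a hue that rounds up to 360 degrees wraps to 0
    return { static_cast<std::uint8_t>(h),
             static_cast<std::uint8_t>(std::lround(sat)),
             static_cast<std::uint8_t>(maxC) };
}
```

Pure red, green, and blue map to H = 0, 60, and 120 respectively under this encoding.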

RGB ↔ Lab

Conversion between RGB/BGR and CIE Lab.

Tier: Pro+
Channels: 3ch ↔ 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int bgr2Lab(const uint8_t* bgrImage, uint8_t* labImage,
            int width, int height, int srcStride = 0, int dstStride = 0);
int rgb2Lab(const uint8_t* rgbImage, uint8_t* labImage, ...);
int lab2BGR(const uint8_t* labImage, uint8_t* bgrImage, ...);   // CPP only

NEON provides only bgr2Lab / rgb2Lab; there is no lab2BGR.

Lab encoding: L ∈ [0, 255] (L* range 0-100 scaled to 0-255); a, b ∈ [0, 255] (offset so that 128 represents zero).

Example

cpp

uint8_t rgb[1920*1080*3], lab[1920*1080*3];
acl::neon::cvtcolor::rgb2Lab(rgb, lab, 1920, 1080);
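
The 8-bit Lab layout described above can be expressed as a small packing helper. This is a sketch of the stated mapping only (L* scaled by 255/100, a* and b* offset by 128); Lab8, packLab8, and unpackLstar are hypothetical names, and the rounding mode is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Lab8 { std::uint8_t L, a, b; };  // hypothetical helper type

inline std::uint8_t clampToU8(float v) {
    return static_cast<std::uint8_t>(std::lround(std::clamp(v, 0.0f, 255.0f)));
}

// Packs floating-point CIE L*a*b* into the 8-bit layout:
// L* in [0, 100] -> [0, 255]; a*, b* shifted so that 0 maps to 128.
inline Lab8 packLab8(float Lstar, float astar, float bstar) {
    return { clampToU8(Lstar * 255.0f / 100.0f),
             clampToU8(astar + 128.0f),
             clampToU8(bstar + 128.0f) };
}

// Recovers L* from the stored 8-bit value.
inline float unpackLstar(std::uint8_t L) { return L * 100.0f / 255.0f; }
```

Values of a* or b* outside [-128, 127] saturate in this sketch.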

gammaTransform

Gamma transform: dst = A * base * (src/base)^gamma; used for display gamma correction, exposure compression, etc.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

Template parameter	Allowed types	Constraint
`T`	`{uint8_t, uint16_t}`	—
`GT` (computation)	`{float, double}`	—

CPP Signature

cpp

template<class T, class GT = float>
int gammaTransform(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    const GT& gamma,
    int normalizeBase,
    bool if_round = false,
    const GT& A = GT{1});

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	required (`0` = auto)
`gamma`	`GT`	Gamma exponent	`2.2` (display) / `1/2.2` (inverse gamma)
`normalizeBase`	`int`	Base value used to normalize to [0, 1]	u8: `255` / u16: `1023`, etc.
`if_round`	`bool`	Whether to round the result (otherwise floor)	`false`
`A`	`GT`	Linear scaling coefficient	`1`

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Display gamma correction (2.2)
acl::cvtcolor::gammaTransform<uint8_t, float>(
    srcImage, dstImage, 1920, 1080, 0, 0, 2.2f, 255);
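
A scalar reference for the formula dst = A * base * (src/base)^gamma makes the roles of normalizeBase, if_round, and A easier to see. The sketch below is a per-pixel illustration for uint8_t; gammaPixel is a hypothetical name, and saturating to [0, 255] is an assumption about out-of-range results.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Per-pixel reference of dst = A * base * (src / base)^gamma.
// Saturation to the uint8_t range is an assumption of this sketch.
inline std::uint8_t gammaPixel(std::uint8_t src, double gamma, int base,
                               bool round = false, double A = 1.0) {
    const double v = A * base * std::pow(static_cast<double>(src) / base, gamma);
    const double q = round ? std::floor(v + 0.5) : std::floor(v);
    return static_cast<std::uint8_t>(std::clamp(q, 0.0, 255.0));
}
```

For src = 128 and gamma = 2.2 the exact value is about 55.98, so flooring yields 55 and rounding yields 56; this is the difference that if_round controls.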

Filter

Namespace: acl::filter (cpp) / acl::neon::filter (NEON)

gaussianBlur

Gaussian blur (low-pass filter), accelerated with a separable kernel (row kernel × column kernel).

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`uint8_t, uint16_t, float`	—
`DT`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::gaussianBlur`

cpp

template<class ST, class DT>
int gaussianBlur(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    int kRadiusX, int kRadiusY,
    double sigmaX = 0.0, double sigmaY = 0.0,
    ST* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	input / output; `dstImage` must be pre-allocated	non-null
`width`, `height`	`int`	Image size (pixels)	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`kRadiusX`, `kRadiusY`	`int`	Kernel radius; kernel size = 2r+1	≥ 1
`sigmaX`, `sigmaY`	`double`	Gaussian sigma	`0` = auto (`σ = 0.15·kSize + 0.35`)
`constant`	`ST*`	`BORDER_CONSTANT` fill-value pointer	`nullptr`
`bt`	`acl::BorderType`	Border handling	`BORDER_REFLECT_101`
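
The automatic sigma rule in the table can be sketched as follows, together with the 1D kernel it implies. autoSigma and gaussianKernel1D are hypothetical helpers; the sketch samples the Gaussian directly and normalises it, which may differ from the coefficients the library uses internally (for example for fixed small kernels).

```cpp
#include <cmath>
#include <vector>

// sigma = 0.15 * kSize + 0.35, as stated for sigma = 0 (auto).
inline double autoSigma(int kSize) { return 0.15 * kSize + 0.35; }

// Builds a normalised 1D Gaussian kernel of size 2 * radius + 1.
// A non-positive sigma selects the automatic value.
inline std::vector<double> gaussianKernel1D(int radius, double sigma = 0.0) {
    const int kSize = 2 * radius + 1;
    if (sigma <= 0.0) sigma = autoSigma(kSize);

    std::vector<double> k(kSize);
    double sum = 0.0;
    for (int i = 0; i < kSize; ++i) {
        const double x = i - radius;  // signed distance from the centre tap
        k[i] = std::exp(-(x * x) / (2.0 * sigma * sigma));
        sum += k[i];
    }
    for (double& w : k) w /= sum;  // taps sum to 1, preserving brightness
    return k;
}
```

With kRadiusX = kRadiusY = 2 and sigma = 0, this rule gives sigma = 1.1 for the 5×5 kernel.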

NEON Version: `acl::neon::filter::gaussianBlur*` (`uint8_t` only)

The NEON layer provides fixed-kernel and generic entry points:

Entry point	Kernel	Channels	Tier	sigma configurable
`gaussianBlur3x3`	3×3	1	Starter+	—
`gaussianBlur3x3_3ch`	3×3	3	Starter+	—
`gaussianBlur5x5`	5×5	1	Starter+	—
`gaussianBlur11x11`	11×11	1	Starter+	—
`gaussianBlur5x5_3ch`	5×5	3	Starter+	—
`gaussianBlur` (generic)	any 2r+1	1	Starter+	✅

Trial package note: Trial does not include gaussianBlur. Use the resize wrappers acl::trial::resizeBilinear2xDown_cpp(const uint8_t*, uint8_t*) or acl::trial::resizeBilinear2xDown_neon(const uint8_t*, uint8_t*) for the Trial demo surface.

Fixed-kernel signature (3x3 / 5x5 / 11x11 / 3x3_3ch / 5x5_3ch share the same signature):

cpp

int gaussianBlur3x3(   // or gaussianBlur5x5 / gaussianBlur11x11 / gaussianBlur3x3_3ch / gaussianBlur5x5_3ch
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    uint8_t constant = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Generic signature (supports arbitrary radius / sigma):

cpp

int gaussianBlur(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadiusX, int kRadiusY,
    double sigmaX = 0.0, double sigmaY = 0.0,
    int constant = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Smart dispatch: when kSize ∈ {3, 5, 11} and sigma = 0, the generic version delivers the same throughput as the corresponding fixed-kernel variant; other (kSize, sigma) combinations fall back to the dynamic sepFilter2D performance profile.

Example

cpp

#include <acl/acl.h>
#include <acl/api.h>
acl::init("license.dat");

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Case 1: 3×3 fixed kernel (Starter+ paid API)
acl::neon::filter::gaussianBlur3x3(srcImage, dstImage, 1920, 1080);

// Case 2: 5×5 + custom sigma (Starter+)
acl::neon::filter::gaussianBlur(srcImage, dstImage, 1920, 1080, 0, 0, 2, 2, 1.5, 1.5);

boxFilter

Box (mean) filter; all pixels within the kernel are summed with equal weight. Normalization is optional: with isNormalize = true the sum is divided by the kernel area (kSize²), which gives the standard mean filter.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST, DT` (CPP)	`{uint8_t, uint16_t, float}`	—
`DT` (NEON)	`{uint8_t, uint32_t}` (src is `uint8_t`)	NEON-only

CPP Version: `acl::filter::boxFilter`

cpp

template<class ST, class DT>
int boxFilter(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    int kRadius,
    bool isNormalize = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	input / output	non-null
`width`, `height`	`int`	Image size (pixels)	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`kRadius`	`int`	Kernel radius (kSize = 2r+1)	≥ 1
`isNormalize`	`bool`	`true` → mean (divide by `kSize²`); `false` → sum	`true`
`bt`	`acl::BorderType`	Border handling	`BORDER_REFLECT_101`

NEON Version: `acl::neon::filter::boxFilter*` (`uint8_t` input only)

Entry point	Kernel	Channels	Tier
`boxFilter3x3`	3×3	1	Starter+
`boxFilter5x5`	5×5	1	Starter+
`boxFilter` (generic)	any 2r+1	1	Starter+

There is no NEON entry point for 3 channels yet; use the CPP version acl::filter::boxFilter<uint8_t,uint8_t>(..., cn=3).

Fixed-kernel signature (boxFilter3x3 / boxFilter5x5):

cpp

template<class DT>
int boxFilter3x3(   // or boxFilter5x5
    const uint8_t* srcImage, DT* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int constant = 0,
    bool isNormalize = true,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Generic signature:

cpp

template<class DT>
int boxFilter(
    const uint8_t* srcImage, DT* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadius, int constant,
    bool isNormalize = true,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

DT is uint8_t for normalized output or uint32_t for un-normalized sums (wide enough to avoid overflow). cn is a runtime parameter (1 or 3).

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];
uint32_t dstSum[1920*1080];

// 3×3 mean filter
acl::neon::filter::boxFilter3x3<uint8_t>(srcImage, dstImage, 1920, 1080, 0, 0);

// 5×5 sum (not normalized, u32 output)
acl::neon::filter::boxFilter5x5<uint32_t>(
    srcImage, dstSum, 1920, 1080, 0, 0,
    /*constant=*/0, /*isNormalize=*/false, /*cn=*/1,
    acl::BorderType::BORDER_REPLICATE);
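
The need for a wider output type when isNormalize = false can be seen with a small reference. The sketch below computes the un-normalized box sum at one pixel using the reflect-101 index rule (the edge pixel is not repeated); reflect101 and boxSumAt are hypothetical helpers and describe the border rule as commonly defined, not the library's internal code.

```cpp
#include <cstdint>
#include <vector>

// Reflect-101 index mapping: ... c b | a b c ... | b a ...
inline int reflect101(int i, int n) {
    if (n == 1) return 0;
    while (i < 0 || i >= n) i = (i < 0) ? -i : 2 * (n - 1) - i;
    return i;
}

// Un-normalized (2r+1) x (2r+1) box sum centred on (x, y), single channel,
// tightly packed rows.
inline std::uint32_t boxSumAt(const std::vector<std::uint8_t>& img,
                              int width, int height, int x, int y, int r) {
    std::uint32_t sum = 0;
    for (int dy = -r; dy <= r; ++dy) {
        const int yy = reflect101(y + dy, height);
        for (int dx = -r; dx <= r; ++dx) {
            const int xx = reflect101(x + dx, width);
            sum += img[static_cast<std::size_t>(yy) * width + xx];
        }
    }
    return sum;
}
```

A 5×5 window over saturated pixels sums to 25 × 255 = 6375, which does not fit in uint8_t; dividing by 25 brings it back to 255.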

filter2D

Generic 2D convolution (arbitrary kernel). Internally detects separable kernels; if separable, automatically converts to sepFilter2D for speedup.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`uint8_t, uint16_t, float`	—
`DT`	`uint8_t, uint16_t, float`	—
`KT`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::filter2D`

cpp

template<class ST, class DT, class KT>
int filter2D(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    const KT* kernel, int kRadiusX, int kRadiusY,
    const ST* constant = nullptr,
    bool isNormalize = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`kernel`	`const KT*`	Kernel data, row-major, size `(2rX+1)×(2rY+1)`	non-null
`kRadiusX`, `kRadiusY`	`int`	Kernel radius	≥ 1
`constant`	`const ST*`	`BORDER_CONSTANT` fill-value pointer	`nullptr`
`isNormalize`	`bool`	When `true`, output is divided by sum of kernel elements	`true`
`bt`	`acl::BorderType`	Border handling	`BORDER_REFLECT_101`

NEON Version: `acl::neon::filter::filter2D` (`uint8_t` input only)

cpp

template<class DT, class KT>
int filter2D(
    const uint8_t* srcImage, DT* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    const KT* kernel, int kRadiusX, int kRadiusY,
    int constant = 0,
    bool isNormalize = true,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

DT / KT: typically (uint8_t, float) or (int32_t, int32_t).

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// 5×5 Laplacian-based sharpening kernel (coefficients sum to 1, so no normalization)
int K[25] = { 0, 0,-1, 0, 0,
              0,-1,-2,-1, 0,
             -1,-2,17,-2,-1,
              0,-1,-2,-1, 0,
              0, 0,-1, 0, 0 };
acl::filter::filter2D<uint8_t, uint8_t, int>(
    srcImage, dstImage, 1920, 1080, 1, 0, 0, K, 2, 2,
    nullptr, /*isNormalize=*/false, acl::BorderType::BORDER_REPLICATE);

sepFilter2D

Separable 2D convolution: first convolves along rows with kernelX, then along columns with kernelY. Faster than filter2D for a k×k kernel: O(k²) → O(2k) multiply-adds per pixel.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`uint8_t, uint16_t, float`	—
`DT`	`uint8_t, uint16_t, float`	—
`KT`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::sepFilter2D`

cpp

template<class ST, class DT, class KT>
int sepFilter2D(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    const KT* kernelX, const KT* kernelY,
    int kRadiusX, int kRadiusY,
    const ST* constant = nullptr,
    bool isNormalize = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`kernelX`, `kernelY`	`const KT*`	Row / column 1D kernels, lengths `2rX+1` / `2rY+1` respectively	non-null
`kRadiusX`, `kRadiusY`	`int`	Kernel radius	≥ 1
`constant`	`const ST*`	`BORDER_CONSTANT` fill-value pointer	`nullptr`
`isNormalize`	`bool`	If `true` divide by kernel sum	`true`
`bt`	`acl::BorderType`	Border-handling mode	`BORDER_REFLECT_101`

NEON Version: `acl::neon::filter::sepFilter2D` (`uint8_t` only, 1ch / 3ch)

cpp

template<class KT>
int sepFilter2D(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    const KT* kernelX, const KT* kernelY,
    int kRadiusX, int kRadiusY,
    int constant = 0,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

cn = 1 or 3 (channel count selected at runtime). KT: usually float.

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

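// Gaussian 5×5 decomposed into [1,4,6,4,1]/16 × [1,4,6,4,1]/16.
// The taps are pre-divided by 16 because this overload has no isNormalize flag.
float kx[5] = {1/16.f, 4/16.f, 6/16.f, 4/16.f, 1/16.f};
float ky[5] = {1/16.f, 4/16.f, 6/16.f, 4/16.f, 1/16.f};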
acl::neon::filter::sepFilter2D<float>(
    srcImage, dstImage, 1920, 1080, 0, 0, kx, ky, 2, 2, /*constant=*/0, /*cn=*/1);
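
Separability means the 2D kernel is the outer product of the column and row kernels. The sketch below rebuilds the 5×5 kernel from the binomial taps [1,4,6,4,1]/16 and compares the per-pixel cost of the direct and separable forms. kBinomial5, outerProduct, directCost, and separableCost are hypothetical names used only for illustration.

```cpp
#include <array>

using Kernel1D = std::array<float, 5>;
using Kernel2D = std::array<std::array<float, 5>, 5>;

// Binomial approximation of a Gaussian, already normalised (taps sum to 1).
constexpr Kernel1D kBinomial5 = {1 / 16.f, 4 / 16.f, 6 / 16.f, 4 / 16.f, 1 / 16.f};

// K[y][x] = col[y] * row[x]: the 2D kernel that the two 1D passes implement.
inline Kernel2D outerProduct(const Kernel1D& col, const Kernel1D& row) {
    Kernel2D k{};
    for (int y = 0; y < 5; ++y)
        for (int x = 0; x < 5; ++x)
            k[y][x] = col[y] * row[x];
    return k;
}

// Multiply-adds per output pixel for a kSize x kSize kernel.
constexpr int directCost(int kSize)    { return kSize * kSize; }
constexpr int separableCost(int kSize) { return 2 * kSize; }
```

For a 5×5 kernel the separable form needs 10 multiply-adds per pixel instead of 25, and the gap widens as the kernel grows.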

sobel3x3

3×3 Sobel edge-detection operator. The runtime flag isGradX selects whether to compute Gx (horizontal gradient) or Gy (vertical gradient).

Tier: Starter+
Channels: 1ch (call separately per channel or use filter2D)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`{uint8_t, uint16_t, float}`	—
`DT`	typically `int16_t` / `int32_t` / `float`	—
Output type: `int16_t` (short) (since gradient values may be negative)

CPP Version: `acl::filter::sobel3x3`

cpp

template<class ST, class DT>
int sobel3x3(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    const ST* constant = nullptr,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`constant`	`const ST*`	`BORDER_CONSTANT` fill value	`nullptr`
`isGradX`	`bool`	`true`: Gx (horizontal); `false`: Gy (vertical)	`true`
`bt`	`acl::BorderType`	Border-handling mode	`BORDER_REFLECT_101`

Computing both Gx and Gy requires two separate calls.

NEON Version: `acl::neon::filter::sobel3x3` (`uint8_t → int16_t` only, 1ch)

cpp

int sobel3x3(
    const uint8_t* srcImage, int16_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    int constant = 0,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Example

cpp

uint8_t src_u8[1920*1080];   // grayscale input
int16_t gx[1920*1080], gy[1920*1080];
acl::neon::filter::sobel3x3(src_u8, gx, 1920, 1080, 0, 0, 0, /*isGradX=*/true);   // Gx
acl::neon::filter::sobel3x3(src_u8, gy, 1920, 1080, 0, 0, 0, /*isGradX=*/false);  // Gy
// Gradient magnitude can be composed via acl::arithmetic::phaseMagnitude
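
A scalar reference of the 3×3 Sobel response shows why the output is a signed 16-bit type. The sketch below uses the conventional kernels (right minus left for Gx, bottom minus top for Gy); sobelGx and sobelGy are hypothetical helpers, and the sign convention is an assumption that should be checked against the library's output.

```cpp
#include <cstdint>

// p holds a 3x3 neighbourhood in row-major order:
// p[0..2] top row, p[3..5] middle row, p[6..8] bottom row.
inline std::int16_t sobelGx(const std::uint8_t p[9]) {
    return static_cast<std::int16_t>(
        (p[2] + 2 * p[5] + p[8]) - (p[0] + 2 * p[3] + p[6]));
}

inline std::int16_t sobelGy(const std::uint8_t p[9]) {
    return static_cast<std::int16_t>(
        (p[6] + 2 * p[7] + p[8]) - (p[0] + 2 * p[1] + p[2]));
}
```

A full-contrast vertical edge gives |Gx| = 4 × 255 = 1020, which exceeds the uint8_t range and can be negative, hence int16_t.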

scharr

3×3 Scharr edge-detection operator. Offers better rotational symmetry than Sobel and slightly higher numerical precision.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`{uint8_t, uint16_t, float}`	—
`DT`	typically `int16_t` / `int32_t` / `float`	—
Output type: `int16_t`

CPP Version: `acl::filter::scharr`

cpp

template<class ST, class DT>
int scharr(
    const ST* srcImage, DT* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    const ST* constant = nullptr,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter semantics are the same as sobel3x3, except that there is no cn parameter and the kernel coefficients are Scharr: [-3, -10, -3; 0, 0, 0; 3, 10, 3] for Gy, and its transpose for Gx.

NEON Version: `acl::neon::filter::scharr` (`uint8_t → int16_t` only)

cpp

int scharr(
    const uint8_t* srcImage, int16_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Example

cpp

uint8_t src_u8[1920*1080];   // grayscale input
int16_t gx[1920*1080];
acl::neon::filter::scharr(src_u8, gx, 1920, 1080, 0, 0, /*isGradX=*/true);   // Gx (Scharr)

laplacian

Laplacian operator (second-order gradient), used for edge detection or sharpening. Internally performs two convolutions with a 3×3 or larger kernel.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`ST`	`{uint8_t, uint16_t, float}`	—
`DT`	typically `int16_t` / `int32_t` / `float`	—
Output type: `int16_t` (second-order gradient may be negative)

CPP Version: `acl::filter::laplacian`

cpp

template<class ST, class DT>
int laplacian(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    int ksize = 1, double scale = 1.0, double delta = 0.0,
    const ST* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const ST*` / `DT*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`ksize`	`int`	Kernel aperture size	`1` (= 3×3 standard Laplacian)
`scale`	`double`	Output scaling factor	`1.0`
`delta`	`double`	Output offset	`0.0`
`constant`	`const ST*`	`BORDER_CONSTANT` fill value	`nullptr`
`bt`	`acl::BorderType`	Border-handling mode	`BORDER_REFLECT_101`

NEON Version: `acl::neon::filter::laplacian` (`uint8_t → int16_t` only)

cpp

int laplacian(
    const uint8_t* srcImage, int16_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    int ksize = 1, int constant = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

The NEON version omits scale / delta (fixed at 1.0 / 0.0). For scaling, use the CPP version.

Example

cpp

uint8_t srcImage[1920*1080];
int16_t dstImage[1920*1080];
acl::neon::filter::laplacian(srcImage, dstImage, 1920, 1080);

canny

Canny edge detection. Typical pipeline: Gaussian blur → gradient → non-maximum suppression → double-threshold linking.

Tier: Starter+
Channels: 1ch (grayscale input)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`Src_T`	`uint8_t`, `uint16_t`	CPP backend
`Src_T`	`uint8_t`	NEON backend

NEON Version: `acl::neon::filter::canny` (`uint8_t` only)

cpp

int canny(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int low_thresh, int high_thresh,
    int aperture_size = 3,
    int srcStride = 0, int dstStride = 0,
    bool l2GradFlag = true);

Parameter	Type	Meaning	Default
`srcImage`	`const uint8_t*`	Input grayscale image	non-null
`dstImage`	`uint8_t*`	Output binary edge image (0 / 255)	non-null
`width`, `height`	`int`	Image size	> 0
`low_thresh`, `high_thresh`	`int`	Low / high double thresholds	`low < high`
`aperture_size`	`int`	Sobel kernel aperture	`3` (only 3 is supported)
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`l2GradFlag`	`bool`	`true`: L2 Euclidean gradient; `false`: L1 gradient (faster)	`true`

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];
acl::neon::filter::canny(srcImage, dstImage, 1920, 1080, 50, 150);
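
The l2GradFlag switch selects between two gradient-magnitude definitions. The sketch below shows both for a single pixel; gradMagL1 and gradMagL2 are hypothetical helpers, and how the library scales the thresholds against each norm is not specified here.

```cpp
#include <cmath>
#include <cstdlib>

// L1 norm: cheaper, no square root.
inline int gradMagL1(int gx, int gy) { return std::abs(gx) + std::abs(gy); }

// L2 (Euclidean) norm: rotation-invariant magnitude.
inline double gradMagL2(int gx, int gy) {
    return std::sqrt(static_cast<double>(gx) * gx + static_cast<double>(gy) * gy);
}
```

Since |gx| + |gy| is never smaller than the Euclidean magnitude, the same low_thresh / high_thresh pair tends to accept more pixels under L1 than under L2, so thresholds may need retuning when the flag changes.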

morphology (erode / dilate)

Basic morphological operators: erosion (erode) takes the minimum over the kernel footprint and dilation (dilate) takes the maximum. Uses the van Herk / Gil-Werman algorithm, which is O(N) in the number of pixels, so runtime is independent of kernel size.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—
Kernel shape: square (determined by `radius`; actual kernel size = 2r+1)

CPP Version: `acl::filter::erode` / `acl::filter::dilate`

cpp

template<class T>
int erode(
    const T* srcImage, T* dstImage,
    int width, int height, int cn, int radius,
    int srcStride = 0, int dstStride = 0);

template<class T>
int dilate(
    const T* srcImage, T* dstImage,
    int width, int height, int cn, int radius,
    int srcStride = 0, int dstStride = 0);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1 or 3
`radius`	`int`	Structuring-element radius (square kernel, size = 2r+1)	≥ 1
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto

NEON Version: `acl::neon::filter::erode` / `acl::neon::filter::dilate` (`uint8_t` only)

cpp

int erode(   // or dilate
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height, int cn, int radius,
    int srcStride = 0, int dstStride = 0);

Parameter semantics match the CPP version.

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// 3×3 erosion (radius=1)
acl::filter::erode<uint8_t>(srcImage, dstImage, 1920, 1080, 1, 1);

// 11×11 dilation (radius=5; the O(N) algorithm is unaffected by size)
acl::filter::dilate<uint8_t>(srcImage, dstImage, 1920, 1080, 1, 5);
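
The van Herk / Gil-Werman idea can be illustrated in one dimension: the signal is cut into blocks of the kernel size, running maxima are computed forwards and backwards within each block, and every window maximum is then the larger of two lookups. The sketch below is a textbook 1D dilation, not the library's implementation; dilate1D is a hypothetical name, and out-of-range samples are treated as 0 (an assumption about border handling).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// 1D dilation (running maximum) with window 2 * radius + 1.
// Cost per sample is constant regardless of radius.
inline std::vector<std::uint8_t> dilate1D(const std::vector<std::uint8_t>& src,
                                          int radius) {
    const int n = static_cast<int>(src.size());
    const int k = 2 * radius + 1;
    // Pad by radius on both sides, then up to a multiple of k.
    const int len = ((n + 2 * radius + k - 1) / k) * k;
    std::vector<std::uint8_t> p(len, 0);  // 0 is neutral for max on uint8_t
    std::copy(src.begin(), src.end(), p.begin() + radius);

    std::vector<std::uint8_t> g(len), h(len);
    for (int i = 0; i < len; ++i)        // prefix max inside each block
        g[i] = (i % k == 0) ? p[i] : std::max(g[i - 1], p[i]);
    for (int i = len - 1; i >= 0; --i)   // suffix max inside each block
        h[i] = (i % k == k - 1) ? p[i] : std::max(h[i + 1], p[i]);

    std::vector<std::uint8_t> dst(n);
    for (int i = 0; i < n; ++i)          // window = padded [i, i + k - 1]
        dst[i] = std::max(h[i], g[i + k - 1]);
    return dst;
}
```

Erosion follows the same pattern with std::min and a padding value of 255; a 2D square kernel is obtained by running the 1D pass over rows and then over columns.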

medianFilter

3×3 median filter — outputs the median of the 9 pixels within the kernel. Typical use: salt-and-pepper noise removal.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: supported (src == dst OK)
Types:

Template parameter	Allowed types	Constraint
`DT`	`uint8_t, uint16_t, float`	—
Kernel size: 3×3 only (5×5 and larger are not implemented)

CPP Version: `acl::filter::medianFilter3x3`

cpp

template<class DT>
int medianFilter3x3(
    const DT* srcImage, DT* dstImage,
    int width, int height,
    int cn = 1,
    int srcStride = 0, int dstStride = 0,
    DT borderValue = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const DT*` / `DT*`	input / output (inplace supported)	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	`1`
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`borderValue`	`DT`	`BORDER_CONSTANT` fill value	`0`
`bt`	`acl::BorderType`	Border-handling mode	`BORDER_REFLECT_101`

NEON Version: `acl::neon::filter::medianFilter3x3` / `medianFilter3x3_3ch` (`uint8_t` only)

Entry point	Channels	Tier
`medianFilter3x3`	1	Starter+
`medianFilter3x3_3ch`	3	Starter+

cpp

int medianFilter3x3(   // or medianFilter3x3_3ch
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    uint8_t borderValue = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Example

cpp

uint8_t srcImage[1920*1080];

// Salt-and-pepper denoise (in-place)
acl::neon::filter::medianFilter3x3(srcImage, srcImage, 1920, 1080);
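
The per-pixel operation is simply the middle value of the nine samples in the window. The sketch below shows that selection with the standard library; median9 is a hypothetical helper, and an optimised implementation would typically use a sorting network instead.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Returns the median (5th smallest) of a 3x3 window.
// The window is taken by value because nth_element reorders it.
inline std::uint8_t median9(std::array<std::uint8_t, 9> window) {
    std::nth_element(window.begin(), window.begin() + 4, window.end());
    return window[4];
}
```

A single outlier, such as a salt pixel of 255 in a flat region, never reaches the middle position, which is why the filter removes isolated impulses without blurring as much as a mean filter.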

bilateralFilter

Edge-preserving smoothing — bilateral filter; simultaneously considers spatial distance and pixel-value difference, denoising while preserving edges.

Tier: Pro+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t, uint16_t, float`	—
`T` (NEON)	`uint8_t`	NEON-only

CPP Version: `acl::filter::bilateralFilter`

cpp

template<class T = uint8_t>
int bilateralFilter(
    const T* srcImage, T* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    int d, double sigmaColor, double sigmaSpace,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	required
`d`	`int`	Filter radius (kernel = 2d+1)	≥ 1
`sigmaColor`	`double`	Standard deviation in color space	typically 10-100
`sigmaSpace`	`double`	Standard deviation in coordinate space	typically 10-100
`constant`	`const T*`	`BORDER_CONSTANT` fill value	`nullptr`
`bt`	`acl::BorderType`	Border handling	`BORDER_REFLECT_101`

NEON Version: `acl::neon::filter::bilateralFilter` (`uint8_t` only, 1ch / 3ch)

cpp

int bilateralFilter(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int d, double sigmaColor, double sigmaSpace,
    int constant = 0,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

cn = 1 or 3 (channel count, runtime).

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Edge-preserving denoising, d=5, sigmaColor=sigmaSpace=30
acl::neon::filter::bilateralFilter(
    srcImage, dstImage, 1920, 1080, 0, 0, 5, 30.0, 30.0);
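
How sigmaSpace and sigmaColor interact is easiest to see in the per-neighbour weight. The sketch below uses the textbook bilateral weight, a product of a spatial Gaussian and a range Gaussian; bilateralWeight is a hypothetical helper, and the library may use lookup tables or a different normalisation internally.

```cpp
#include <cmath>

// Weight of a neighbour at offset (dx, dy) whose intensity differs from the
// centre pixel by intensityDiff.
inline double bilateralWeight(int dx, int dy, double intensityDiff,
                              double sigmaSpace, double sigmaColor) {
    const double spatial = (dx * dx + dy * dy) / (2.0 * sigmaSpace * sigmaSpace);
    const double range   = (intensityDiff * intensityDiff) /
                           (2.0 * sigmaColor * sigmaColor);
    return std::exp(-(spatial + range));
}
```

The output pixel is the weighted average of its neighbours using these weights. A neighbour across a strong edge has a large intensityDiff and therefore a weight near zero, which is what preserves the edge.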

nlMeansDenoising

Non-Local Means denoising — searches for similar patches within the entire search window and forms a weighted average, denoising while preserving details.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t, uint16_t, float`	—
`T` (NEON)	`uint8_t`	NEON-only

CPP Version: `acl::filter::nlMeansDenoising`

cpp

template<class T = uint8_t>
int nlMeansDenoising(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    float h,
    int patchRadius = 3,
    int searchRadius = 10);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	required
`h`	`float`	Denoising strength (larger = smoother; typically 5-15)	required
`patchRadius`	`int`	Patch radius (patch = 2r+1)	`3` (= 7×7)
`searchRadius`	`int`	Search-window radius	`10` (= 21×21)

NEON Version: `acl::neon::filter::nlMeansDenoising` (`uint8_t` only)

cpp

int nlMeansDenoising(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    float h,
    int patchRadius = 3,
    int searchRadius = 10);

Non-templated; the signature matches the CPP version with the template <T> removed.

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Light denoising (h=10, patch 7×7, search 21×21 — defaults)
acl::neon::filter::nlMeansDenoising(srcImage, dstImage, 1920, 1080, 0, 0, 10.0f);

// Stronger denoising + larger search window
acl::neon::filter::nlMeansDenoising(srcImage, dstImage, 1920, 1080, 0, 0, 15.0f, 3, 15);

guidedFilter

Guided filter — edge-aware smoothing of the input image based on the guide image (guideImage). O(N) complexity (does not grow with kernel size).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::guidedFilter`

cpp

template<class T>
int guidedFilter(
    const T* guideImage,
    const T* srcImage,
    T* dstImage,
    int width, int height,
    int guideStride, int srcStride, int dstStride,
    int radius, double eps);

Parameter	Type	Meaning	Default
`guideImage`	`const T*`	Guide image (often identical to srcImage; another image also works)	non-null
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`guideStride`, `srcStride`, `dstStride`	`int`	Respective bytes per row	required
`radius`	`int`	Window radius (window size = 2r+1)	≥ 1
`eps`	`double`	Regularization parameter	typically `0.01` (integer images `1.0`-`100.0`)

NEON Version: `acl::neon::filter::guidedFilter` (`uint8_t` only)

cpp

int guidedFilter(
    const uint8_t* guideImage,
    const uint8_t* srcImage,
    uint8_t* dstImage,
    int width, int height,
    int guideStride, int srcStride, int dstStride,
    int radius, double eps);

Example

cpp

// Use the source image itself as the guide; radius=8, eps=1000
acl::filter::guidedFilter<uint8_t>(
    srcImage, srcImage, dstImage, 1920, 1080, 1920, 1920, 1920, 8, 1000.0);

stackBlur

Approximate Gaussian blur with O(1) cost per pixel (independent of kernel size), suitable for large kernels. The result is close to, but not identical to, a true Gaussian; use it where an exact Gaussian is not required (e.g. blurred UI backgrounds).

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::stackBlur`

cpp

template<class T>
int stackBlur(
    const T* srcImage, T* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    int kSizeX = 3, int kSizeY = 3);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`kSizeX`, `kSizeY`	`int`	Horizontal / vertical kernel size (must be odd)	`3`

NEON Version: `acl::neon::filter::stackBlur` (`uint8_t` only)

cpp

int stackBlur(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    int kSizeX = 3, int kSizeY = 3);

Example

cpp

// 21×21 blur (well beyond Gaussian 11×11; the O(1) algorithm does not degrade)
acl::filter::stackBlur<uint8_t>(srcImage, dstImage, 1920, 1080, 1, 0, 0, 21, 21);
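
In the classic stack blur formulation, the effective 1D kernel is triangular rather than Gaussian, which explains the slightly different look. The sketch below generates those weights for a given odd kernel size; stackBlurWeights and stackBlurDivisor are hypothetical helpers, and it is an assumption that ACL Pack follows the classic formulation.

```cpp
#include <vector>

// Triangular weights 1, 2, ..., r+1, ..., 2, 1 for kSize = 2r + 1 (odd).
inline std::vector<int> stackBlurWeights(int kSize) {
    const int r = kSize / 2;
    std::vector<int> w(kSize);
    for (int i = 0; i < kSize; ++i) {
        const int dist = (i < r) ? r - i : i - r;  // distance from the centre
        w[i] = r + 1 - dist;
    }
    return w;
}

// Sum of the weights, used to normalise the result: (r + 1)^2.
inline int stackBlurDivisor(int kSize) {
    const int r = kSize / 2;
    return (r + 1) * (r + 1);
}
```

The O(1) cost comes from updating the weighted sum incrementally as the window slides, instead of re-evaluating all kSize taps for every pixel.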

unsharpMask

Unsharp mask: sharpens by adding the difference between the source and a Gaussian-blurred copy back to the source. amount scales that difference and so controls the sharpening strength.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::unsharpMask`

cpp

template<class T>
int unsharpMask(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadius, double sigma, float amount);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	required
`kRadius`	`int`	Gaussian kernel radius	≥ 1
`sigma`	`double`	Gaussian sigma (controls blur strength)	typically `1.0` ~ `2.0`
`amount`	`float`	Sharpening strength	typically `0.5` ~ `2.0` (`1.0` = original strength)

NEON Version: `acl::neon::filter::unsharpMask` (`uint8_t` only)

cpp

int unsharpMask(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadius, double sigma, float amount);

Example

cpp

// Light sharpening (radius=2, sigma=1.5, amount=1.2); strides are required, so pass 1920 bytes per row
acl::filter::unsharpMask<uint8_t>(
    srcImage, dstImage, 1920, 1080, /*srcStride=*/1920, /*dstStride=*/1920, 2, 1.5, 1.2f);
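
To make the role of `amount` concrete, here is a minimal per-pixel sketch. It assumes the usual unsharp-mask formula, dst = src + amount × (src − blurred), saturated to the 8-bit range; the exact rounding and scaling ACL uses are not specified here, and `unsharpPixel` is a hypothetical helper rather than an ACL Pack API.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical per-pixel reference (not an ACL Pack API).
// Assumption: dst = src + amount * (src - blurred), saturated to [0, 255].
uint8_t unsharpPixel(uint8_t src, uint8_t blurred, float amount) {
    const float detail = static_cast<float>(src) - static_cast<float>(blurred);
    const float sharpened = static_cast<float>(src) + amount * detail;
    const long rounded = std::lround(sharpened);
    // Saturate instead of wrapping around on overflow or underflow.
    return static_cast<uint8_t>(std::min(255L, std::max(0L, rounded)));
}
```

Flat regions (src equal to blurred) pass through unchanged; only pixels that differ from their blurred neighbourhood are pushed further away from it.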

gaborFilter

Gabor filter: convolves the image with a sinusoid modulated by a Gaussian envelope; used for texture analysis and orientation-sensitive edge detection.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::filter::gaborFilter`

cpp

template<class T>
int gaborFilter(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int ksize, double sigma, double theta,
    double lambd, double gamma, double psi);

Parameter	Type	Meaning	Typical value
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`width`, `height`	`int`	Image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	required
`ksize`	`int`	Kernel size (typically odd)	`21` / `31`
`sigma`	`double`	Gaussian envelope sigma	`4.0`-`8.0`
`theta`	`double`	Direction (radians)	`0` (horizontal) ~ `π`
`lambd`	`double`	Sinusoidal wavelength	`10.0`
`gamma`	`double`	Spatial aspect ratio	`0.5`
`psi`	`double`	Phase offset	`0`

NEON Version: `acl::neon::filter::gaborFilter` (`uint8_t` only)

cpp

int gaborFilter(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int ksize, double sigma, double theta,
    double lambd, double gamma, double psi);

Example

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Horizontal-direction Gabor kernel; strides are required, so pass 1920 bytes per row
acl::neon::filter::gaborFilter(srcImage, dstImage, 1920, 1080,
    /*srcStride=*/1920, /*dstStride=*/1920,
    21, 4.0, 0.0, 10.0, 0.5, 0.0);
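
The way the five shape parameters interact is easier to see in the kernel formula. The sketch below evaluates a real Gabor kernel at one offset using the common formulation (the same one OpenCV's `getGaborKernel` documents). Whether ACL uses exactly this formula is an assumption; `gaborKernelValue` is a hypothetical helper.

```cpp
#include <cmath>

// Hypothetical helper (not an ACL Pack API): value of a real Gabor kernel at
// offset (x, y) from the kernel centre, using the common formulation
//   g = exp(-(x'^2 + gamma^2 * y'^2) / (2 * sigma^2)) * cos(2*pi*x'/lambd + psi)
// with x' = x cos(theta) + y sin(theta), y' = -x sin(theta) + y cos(theta).
double gaborKernelValue(int x, int y, double sigma, double theta,
                        double lambd, double gamma, double psi) {
    const double kPi = 3.14159265358979323846;
    const double xr =  x * std::cos(theta) + y * std::sin(theta);
    const double yr = -x * std::sin(theta) + y * std::cos(theta);
    // Gaussian envelope; gamma < 1 stretches it along the rotated y axis.
    const double envelope =
        std::exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma));
    // Sinusoidal carrier along the rotated x axis.
    const double carrier = std::cos(2.0 * kPi * xr / lambd + psi);
    return envelope * carrier;
}
```

With `theta = 0` and `lambd = 10` the carrier flips sign every 5 pixels along x, so the kernel responds most strongly to stripes of that period.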

edgePreservingFilter / detailEnhance

Edge-preserving filtering: O(N) edge-preserving smoothing based on the recursive domain transform (Gastal & Oliveira, SIGGRAPH 2011).
edgePreservingFilter smooths while preserving edges; detailEnhance applies the same filter the other way round, boosting fine detail instead of removing it.

Tier: Business
Channels: 3ch (RGB)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP Signature

cpp

int edgePreservingFilter(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float sigmaS = 60.0f, float sigmaR = 0.4f,
    int numIter = 3);

int detailEnhance(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float sigmaS = 10.0f, float sigmaR = 0.15f);

Parameter	Type	Meaning	Default
`sigmaS`	`float`	Spatial sigma (larger → larger smoothing range)	60.0 (edge-preserving) / 10.0 (detail)
`sigmaR`	`float`	Color-value sigma	0.4 / 0.15
`numIter`	`int`	Number of iterations (edge-preserving)	3

Example

cpp

uint8_t rgbSrc[1920*1080*3], rgbDst[1920*1080*3];

// Cartoon-style effect
acl::filter::edgePreservingFilter(rgbSrc, rgbDst, 1920, 1080, 0, 0, 60.0f, 0.4f, 3);

// Detail enhancement
acl::filter::detailEnhance(rgbSrc, rgbDst, 1920, 1080);

tonemap

HDR tone mapping: compresses a float HDR image into the uint8_t LDR display range. Three classic operators are provided: linear, Reinhard, and Drago.

Tier: Business
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`float`	—
Output	`uint8_t`	—

CPP Signature

cpp

// Linear exposure (simple gamma)
int tonemapLinear(
    const float* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float gamma = 2.2f, float exposure = 1.0f);

// Reinhard (global key control)
int tonemapReinhard(
    const float* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float gamma = 2.2f,
    float key = 0.18f,
    float lWhite = 0.0f);

// Drago (logarithmic mapping, suitable for high dynamic range)
int tonemapDrago(
    const float* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float gamma = 2.2f,
    float saturation = 1.0f,
    float bias = 0.85f);

Parameter	Meaning	Typical value
`gamma`	Output gamma correction	`2.2`
`exposure` (Linear)	Exposure multiplier	`1.0`
`key` (Reinhard)	Mean-luminance key	`0.18`
`lWhite` (Reinhard)	White point (`0` = use the image maximum)	`0.0`
`saturation` (Drago)	Saturation	`1.0`
`bias` (Drago)	Bias	`0.85`

Example

cpp

float hdr[1920*1080];
uint8_t ldr[1920*1080];

// Simplest: linear + gamma
acl::filter::tonemapLinear(hdr, ldr, 1920, 1080);

// Use Reinhard when the scene is too bright
acl::filter::tonemapReinhard(hdr, ldr, 1920, 1080, 0, 0, 2.2f, 0.18f);
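
The two building blocks behind these operators can be sketched per pixel: a compression curve followed by gamma encoding. The code below uses the extended Reinhard curve from the literature and a plain power-law gamma. It is an illustration under those assumptions, not ACL's implementation; `reinhardCurve` and `toDisplay` are hypothetical helpers, and the `lWhite = 0` auto mode is not modelled.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helpers (not ACL Pack APIs).

// Extended Reinhard curve: maps luminance L >= 0 towards [0, 1], reaching
// exactly 1 at L == lWhite. Assumes lWhite > 0.
float reinhardCurve(float L, float lWhite) {
    return L * (1.0f + L / (lWhite * lWhite)) / (1.0f + L);
}

// Gamma-encode a linear value and quantise it to 8 bits.
// Values outside [0, 1] are clamped first.
uint8_t toDisplay(float linear, float gamma) {
    const float clamped = std::min(1.0f, std::max(0.0f, linear));
    const float encoded = std::pow(clamped, 1.0f / gamma);
    return static_cast<uint8_t>(std::lround(encoded * 255.0f));
}
```

A linear operator only clamps, so everything above 1.0 after exposure scaling saturates to 255; the Reinhard curve instead compresses highlights smoothly up to the white point.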

mergeMertens

Multi-exposure image fusion (Mertens et al. 2007) — fuses multiple images at different exposures into a single balanced-exposure output. A simpler alternative to the HDR + tonemap pipeline.

Tier: Business
Channels: 3ch (RGB)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP Signature

cpp

int mergeMertens(
    const uint8_t** images, int numImages,
    uint8_t* dstImage,
    int width, int height,
    const int* strides = nullptr,
    int dstStride = 0,
    float wContrast = 1.0f,
    float wSaturation = 1.0f,
    float wExposure = 1.0f);

Parameter	Meaning	Typical value
`images`	Input image pointer array (3ch RGB)	`numImages` images
`numImages`	Number of input images	typically `3` (underexposed / normal / overexposed)
`strides`	Per-image stride array; when `nullptr`, all treated as `width*3`	optional
`wContrast`	Contrast weight	`1.0`
`wSaturation`	Saturation weight	`1.0`
`wExposure`	Exposure-quality weight	`1.0`

Example

cpp

uint8_t under[1920*1080*3], normal[1920*1080*3], over[1920*1080*3];
uint8_t fused[1920*1080*3];

const uint8_t* imgs[3] = { under, normal, over };
acl::filter::mergeMertens(imgs, 3, fused, 1920, 1080);

Geometric

Namespace: acl::geometric (CPP) / acl::neon::geometric (NEON)

resize

Image resize, supporting 4 interpolation modes (NEAREST / LINEAR2D / CUBIC4x4 / AREA_AVG).

Tier: Starter+
Channels: 1ch / 3ch / 4ch (runtime via hcn / cn)
Inplace: supported only when srcWidth == dstWidth && srcHeight == dstHeight (degenerate copy / crop case)
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::geometric::resize`

cpp

template<class T, class OT = float>
int resize(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    int hcn = 1, int vcn = 1,
    acl::InterpMode im = acl::InterpMode::LINEAR2D);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`srcWidth`, `srcHeight`	`int`	Source image size	> 0
`dstWidth`, `dstHeight`	`int`	Destination image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`hcn`	`int`	Horizontal channel count (grayscale 1 / Bayer 2 / RGB 3 / RGBA 4)	`1`
`vcn`	`int`	Vertical channel count	`1`
`im`	`acl::InterpMode`	`NEAREST` / `LINEAR2D` (default) / `CUBIC4x4` / `AREA_AVG`	`LINEAR2D`

Template parameters:

T: input / output element type
OT: intermediate type used during interpolation (`float` by default; use `double` for higher precision)

NEON Version: `acl::neon::geometric::resize` (`uint8_t` / `uint16_t`)

cpp

template<class T>
int resize(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::InterpMode im = acl::InterpMode::LINEAR2D,
    int cn = 1,
    int shiftRight = 0);

Parameters:

T — uint8_t / uint16_t
im — same as the CPP version
cn — channel count (1 / 3 / 4)
shiftRight — number of bits to right-shift the data for P010 format (set when 10-bit is in the high bits of 16-bit)
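
As background for `shiftRight`: P010 stores each 10-bit sample in the high 10 bits of a 16-bit word, so recovering the sample value takes a right shift by 6. The sketch below shows that arithmetic on a single word; `p010ToTenBit` is a hypothetical helper, and how ACL applies the shift internally is not specified here.

```cpp
#include <cstdint>

// Hypothetical helper (not an ACL Pack API): recover the 10-bit sample from a
// P010-style 16-bit word, where the payload sits in the high 10 bits.
uint16_t p010ToTenBit(uint16_t raw) {
    const int kShift = 6;  // 16 - 10: the low 6 bits are padding
    return static_cast<uint16_t>(raw >> kShift);
}
```

For example, the largest P010 word `0xFFC0` becomes 1023, the top of the 10-bit range.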

Example

cpp

uint8_t srcImage[1920*1080], dstImage[960*540];

// NEON LINEAR2D 1ch, 2× downscale
acl::neon::geometric::resize<uint8_t>(
    srcImage, dstImage, 1920, 1080, 960, 540);

// CPP LINEAR2D 3ch upscale
uint8_t rgbSrc[640*480*3], rgbDst[1280*960*3];
acl::geometric::resize<uint8_t>(
    rgbSrc, rgbDst, 640, 480, 1280, 960,
    /*srcStride=*/0, /*dstStride=*/0,
    /*hcn=*/3, /*vcn=*/1,
    acl::InterpMode::LINEAR2D);

// float CUBIC4x4 (CPP only)
float fSrc[640*480], fDst[1280*960];
acl::geometric::resize<float>(
    fSrc, fDst, 640, 480, 1280, 960,
    0, 0, 1, 1, acl::InterpMode::CUBIC4x4);

rotate

Image rotation / flip / transpose. Supports ROT_0 / ROT_180 / ROT_CW_90 / ROT_CCW_90 / FLIP_H / FLIP_V / XPOSE.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported for ROT_180 / FLIP_H / FLIP_V (srcImage == dstImage)
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t, uint16_t, float`	—
`T` (NEON)	`uint8_t`	—

Channel / type support matrix

Entry point	Channels	Types
CPP `rotate<T>`	any (packed via runtime `blockW` / `blockH`)	`T` any (`uint8_t` / `uint16_t` / `float` / …)
NEON `rotate<T>`	1ch only	`uint8_t` only
NEON `rotateNV`	NV21 / NV12 (Y + UV)	`uint8_t`
NEON `rotateYV12`	YV12 / I420 (Y + U + V)	`uint8_t`
NEON `rotateYUV444`	YUV444 (3ch interleaved)	`uint8_t`

Recommended approach for RGB / RGBA rotation:
There is no dedicated NEON rotate entry for RGB / RGBA. Pure rotation is essentially memory movement, and the CPP path with blockW=3 or blockW=4 already uses memcpy plus inlined reads/writes, so its performance is equivalent to a NEON path.
For multi-channel images, use CPP rotate<uint8_t> with blockW=3 (RGB) or blockW=4 (RGBA); see the example below.

CPP Version: `acl::geometric::rotate`

cpp

template<class T>
int rotate(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90,
    int blockW = 1, int blockH = 1);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`srcWidth`, `srcHeight`	`int`	Source image size, counted in pixel blocks (not pixels)	must be divisible by `blockW/H`
`dstWidth`, `dstHeight`	`int`	Destination size, same as above	same as above
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`ori`	`acl::RotateOrient`	Rotation direction	`ROT_CW_90`
`blockW`, `blockH`	`int`	Pixel-block size (unit). `1,1` handles single-channel scalar; `blockW=3` packs one RGB pixel as an indivisible unit; `blockW=4` packs RGBA.	`1, 1`

Note that srcWidth / dstWidth are block counts, not element counts. A grayscale image 1920 pixels wide uses srcWidth=1920; an RGB image 1920 pixels wide also uses srcWidth=1920 with blockW=3 (not 5760).

NEON Version: `acl::neon::geometric::rotate` (`uint8_t` only, 1ch)

cpp

template<class T>
int rotate(
    const T* srcImage, T* dstImage,
    int srcW, int srcH,
    int srcStride = 0, int dstStride = 0,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90);

The NEON signature has no dstWidth / dstHeight; the output size is derived from ori (ROT_CW_90 / ROT_CCW_90 / XPOSE swap width and height; all other orientations leave them unchanged).

Example

cpp

// 1) 1ch u8 goes through NEON
uint8_t srcImage[1920*1080], dstImage[1080*1920];
acl::neon::geometric::rotate<uint8_t>(
    srcImage, dstImage, 1920, 1080,
    /*srcStride=*/0, /*dstStride=*/0,
    acl::RotateOrient::ROT_CW_90);

// 2) RGB (3ch) goes through CPP; blockW=3 packs the pixel
uint8_t rgbSrc[1920*1080*3], rgbDst[1080*1920*3];
acl::geometric::rotate<uint8_t>(
    rgbSrc, rgbDst,
    /*srcWidth=*/1920, /*srcHeight=*/1080,   // block count, not 1920*3
    /*dstWidth=*/1080, /*dstHeight=*/1920,
    /*srcStride=*/1920*3, /*dstStride=*/1080*3,
    acl::RotateOrient::ROT_CW_90,
    /*blockW=*/3, /*blockH=*/1);

// 3) float 1ch 180° in-place
float img[512*512];
acl::geometric::rotate<float>(
    img, img, 512, 512, 512, 512,
    0, 0,
    acl::RotateOrient::ROT_180);

// 4) YUV NV21 rotation uses the dedicated NEON entry (see the "YUV rotate" section)
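
The block semantics can be checked against a small reference. The sketch below rotates a tightly packed image 90° clockwise, treating each pixel as an indivisible run of `blockW` elements. It is a readable reference for the index mapping only, assuming tightly packed rows and `blockH = 1`; `rotateCw90Ref` is a hypothetical helper, not an ACL Pack API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference (not an ACL Pack API): rotate a tightly packed image
// 90 degrees clockwise. width and height are counted in blocks (pixels);
// blockW is the number of elements per pixel (1 = gray, 3 = RGB, 4 = RGBA).
std::vector<uint8_t> rotateCw90Ref(const std::vector<uint8_t>& src,
                                   int width, int height, int blockW) {
    std::vector<uint8_t> dst(src.size());
    const int dstWidth = height;  // CW 90 swaps width and height
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int dx = height - 1 - y;  // source row y becomes a column
            const int dy = x;               // source column x becomes a row
            const std::size_t s =
                (static_cast<std::size_t>(y) * width + x) * blockW;
            const std::size_t d =
                (static_cast<std::size_t>(dy) * dstWidth + dx) * blockW;
            // Copy the whole pixel so its channels stay together.
            for (int c = 0; c < blockW; ++c) dst[d + c] = src[s + c];
        }
    }
    return dst;
}
```

Because the channel order inside each block is untouched, RGB stays RGB after rotation; only the block positions change.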

pyrDown

2× downsample (one step down the Gaussian pyramid): applies 5×5 Gaussian smoothing, then keeps every second row and column.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t, uint16_t, float`	—

CPP Version: `acl::geometric::pyrDown`

cpp

template<class T>
int pyrDown(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int cn,
    int srcStride = 0, int dstStride = 0,
    int dstWidth = 0, int dstHeight = 0,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const T*` / `T*`	input / output	non-null
`srcWidth`, `srcHeight`	`int`	Source image size	≥ 2
`cn`	`int`	Channel count	1 or 3
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`dstWidth`, `dstHeight`	`int`	Destination size (`(src+1)/2` when `0`)	OpenCV-compatible: `|dstW*2 - srcW| ≤ 2`
`constant`	`const T*`	`BORDER_CONSTANT` fill value	`nullptr`
`bt`	`acl::BorderType`	Border-handling mode	`BORDER_REFLECT_101`

NEON Version: `acl::neon::geometric::pyrDown` (`uint8_t` only)

cpp

template<class T>
int pyrDown(...)   // signature is completely identical to the CPP version

The type is fixed to uint8_t; parameter semantics match the CPP version.

pyrUp

2× upsample (Gaussian pyramid, going up): interpolated upscaling followed by 5×5 Gaussian smoothing.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t` (only `uint8_t` is currently supported)	—
`T` (NEON)	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

template<class T>
int pyrUp(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int cn,
    int srcStride = 0, int dstStride = 0,
    int dstWidth = 0, int dstHeight = 0,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter semantics are the same as pyrDown; default output is srcWidth*2 × srcHeight*2.

buildPyramid

Build a multi-level Gaussian pyramid by applying pyrDown repeatedly: pyramid[0] = pyrDown(srcImage), pyramid[1] = pyrDown(pyramid[0]), ...

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T` (CPP)	`uint8_t` (only `uint8_t` is currently supported)	—
`T` (NEON)	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

template<class T>
int buildPyramid(
    const T* srcImage,
    T** pyramid,
    int* widths, int* heights,
    int srcWidth, int srcHeight,
    int cn, int numLevels,
    int srcStride = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter	Type	Meaning	Default
`srcImage`	`const T*`	Input image (full-resolution base; `pyramid[0]` is already its half-size reduction)	non-null
`pyramid`	`T**`	Output pointer array (`numLevels` entries, pre-allocated by the caller)	non-null
`widths`, `heights`	`int*`	Output sizes at each level	non-null
`srcWidth`, `srcHeight`	`int`	Source image size	≥ 2
`cn`	`int`	Channel count	1 or 3
`numLevels`	`int`	Number of levels	≥ 1
`srcStride`	`int`	Source bytes per row	`0` = auto

Example

cpp

uint8_t srcImage[1920*1080];
uint8_t l0[960*540], l1[480*270], l2[240*135];
uint8_t* pyr[3] = { l0, l1, l2 };
int widths[3], heights[3];

acl::neon::geometric::buildPyramid<uint8_t>(
    srcImage, pyr, widths, heights, 1920, 1080, 1, 3);
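
Since the caller pre-allocates every level, it helps to compute the level sizes up front. The sketch below applies the documented pyrDown default of `(src + 1) / 2` once per level; `pyramidLevelSizes` is a hypothetical helper, and it assumes buildPyramid uses that default at every level.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not an ACL Pack API): per-level (width, height) pairs
// produced by repeated pyrDown, using the documented default (src + 1) / 2.
std::vector<std::pair<int, int>> pyramidLevelSizes(int srcWidth, int srcHeight,
                                                   int numLevels) {
    std::vector<std::pair<int, int>> sizes;
    int w = srcWidth, h = srcHeight;
    for (int level = 0; level < numLevels; ++level) {
        w = (w + 1) / 2;  // rounds up for odd sizes
        h = (h + 1) / 2;
        sizes.emplace_back(w, h);
    }
    return sizes;
}
```

For 1920×1080 and three levels this yields 960×540, 480×270 and 240×135, matching the buffers in the example above; multiply by `cn` to get the element count per level.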

YUV resize (NV21 / NV12 / YV12 / YUV444)

Native YUV resize that avoids a YUV ↔ RGB round trip. Each channel is resized independently; for NV formats the interleaved UV plane is split, resized per channel, and merged again so that U and V samples are never mixed.

Tier: Starter+
Channels: Y plane (1ch) + UV plane (interleaved or separate)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

NEON Signature

cpp

// NV21 / NV12 — full-resolution Y; UV plane at 1/4 resolution
int resizeNV(
    const uint8_t* srcYImage, const uint8_t* srcUVImage,
    uint8_t* dstYImage, uint8_t* dstUVImage,
    int srcW, int srcH, int dstW, int dstH,
    int srcYStride = 0, int srcUVStride = 0,
    int dstYStride = 0, int dstUVStride = 0,
    bool nv21Fmt = true);

// YV12 / I420 — full-resolution Y; U / V each at 1/4 resolution
int resizeYV12(
    const uint8_t* srcYImage, const uint8_t* srcUImage, const uint8_t* srcVImage,
    uint8_t* dstYImage, uint8_t* dstUImage, uint8_t* dstVImage,
    int srcW, int srcH, int dstW, int dstH,
    int srcYStride = 0, int srcUVStride = 0,
    int dstYStride = 0, int dstUVStride = 0);

// YUV444 — 3 channels interleaved, no sub-sampling
int resizeYUV444(
    const uint8_t* srcImage, uint8_t* dstImage,
    int srcW, int srcH, int dstW, int dstH,
    int srcStride = 0, int dstStride = 0);

nv21Fmt = true → NV21 (V/U order); false → NV12 (U/V order).
NV / YV12 require input sizes to be even.
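
Buffer sizes for the 4:2:0 layouts follow directly from the sub-sampling. The sketch below computes tightly packed plane sizes in bytes for 8-bit data, assuming even dimensions and no row padding; `Yuv420PlaneSizes` and `yuv420PlaneSizes` are hypothetical names, not ACL Pack APIs.

```cpp
#include <cstddef>

// Hypothetical helper (not an ACL Pack API): byte sizes of tightly packed
// 8-bit 4:2:0 planes for an even width x height image.
struct Yuv420PlaneSizes {
    std::size_t y;              // full-resolution luma plane
    std::size_t uvInterleaved;  // NV21 / NV12: single plane of V/U or U/V pairs
    std::size_t uOrV;           // YV12 / I420: each separate chroma plane
};

Yuv420PlaneSizes yuv420PlaneSizes(int width, int height) {
    Yuv420PlaneSizes s;
    s.y = static_cast<std::size_t>(width) * height;
    // Chroma is halved in both directions: a quarter of the luma samples.
    s.uOrV = static_cast<std::size_t>(width / 2) * (height / 2);
    // The interleaved plane carries both U and V, so it is twice that.
    s.uvInterleaved = 2 * s.uOrV;
    return s;
}
```

For a 1920×1080 frame the Y plane needs 2,073,600 bytes and the NV21 UV plane 1,036,800 bytes, that is, half the Y plane.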

YUV rotate (NV21 / NV12 / YV12 / YUV444)

Native YUV rotation; avoids the YUV ↔ RGB conversion.

Tier: Starter+
Channels: Y plane (1ch) + UV plane (interleaved or separate)
Inplace: supported for ROT_180 / FLIP_H / FLIP_V
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

NEON Signature

cpp

// NV21 / NV12
int rotateNV(
    const uint8_t* srcYImage, const uint8_t* srcUVImage,
    uint8_t* dstYImage, uint8_t* dstUVImage,
    int srcW, int srcH,
    int* dstW = nullptr, int* dstH = nullptr,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90,
    bool nv21Fmt = true);

// YV12 / I420
int rotateYV12(
    const uint8_t* srcYImage, const uint8_t* srcUImage, const uint8_t* srcVImage,
    uint8_t* dstYImage, uint8_t* dstUImage, uint8_t* dstVImage,
    int srcW, int srcH,
    int* dstW = nullptr, int* dstH = nullptr,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90);

// YUV444 (single-plane interleaved, 3ch)
int rotateYUV444(
    const uint8_t* srcImage, uint8_t* dstImage,
    int srcW, int srcH,
    int* dstW = nullptr, int* dstH = nullptr,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90);

dstW / dstH are optional output parameters — the caller passes in pointers and the function fills in the rotated destination size (for ROT_CW_90 / ROT_CCW_90 / XPOSE, this is a width/height swap). Pass nullptr if this info is not needed.

Example

cpp

// 1920×1280 NV21: the Y plane is W*H bytes, the interleaved VU plane is W*H/2 bytes
uint8_t srcY[1920*1280], srcUV[1920*1280/2];
uint8_t dstY[1280*1920], dstUV[1280*1920/2];

// NV21 clockwise 90°
int dstW = 0, dstH = 0;
acl::neon::geometric::rotateNV(
    srcY, srcUV, dstY, dstUV, 1920, 1280,
    &dstW, &dstH,
    acl::RotateOrient::ROT_CW_90,
    /*nv21Fmt=*/true);
// dstW == 1280, dstH == 1920

Feature Detection

Namespace: acl::feature (CPP) / acl::neon::feature (NEON)

Common data structures (in the acl:: namespace):

cpp

struct KeyPoint      { int x, y; float response; };
struct KeyPointORB   { float x, y, response, scale, angle; uint8_t descriptor[32]; };
struct KeyPointExt   { float x, y, response, scale, angle; float descriptor[128]; };
struct Point2f       { float x, y; };
struct DMatch        { int queryIdx, trainIdx; float distance; };
struct Vec2f         { float val[2]; };    // (rho, theta)
struct Vec3f         { float val[3]; };    // (cx, cy, radius)
struct Vec4i         { int val[4]; };      // (x1, y1, x2, y2)

FAST Corner Detection

FAST corner detection (Bresenham 16-pixel circle comparison).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int fastCornerDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int threshold = 20,
    bool nonmaxSuppress = true,
    int type = 9);

Parameter	Type	Meaning	Default
`srcImage`	`const uint8_t*`	Input grayscale image	non-null
`srcStride`	`int`	Bytes per row	`0` = `width`
`keypoints`	`vector<KeyPoint>&`	Output corners	—
`threshold`	`int`	Grayscale difference threshold	`20`
`nonmaxSuppress`	`bool`	Whether to perform NMS	`true`
`type`	`int`	FAST-N (`9` or `12`; N is the number of consecutive pixels)	`9`
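
The meaning of `threshold` and `type` can be illustrated with the segment test at the heart of FAST: a pixel is a corner candidate if N contiguous pixels on the 16-pixel ring are all brighter than centre + threshold or all darker than centre − threshold. The sketch below implements only that test on a pre-sampled ring; `fastSegmentTest` is a hypothetical helper, and ACL's ring sampling, early-exit checks and NMS are not shown.

```cpp
#include <array>

// Hypothetical helper (not an ACL Pack API): FAST-N segment test on the 16
// ring pixels, given in circular order. Returns true if at least n contiguous
// ring pixels are all brighter than center + threshold, or all darker than
// center - threshold.
bool fastSegmentTest(const std::array<int, 16>& ring, int center,
                     int threshold, int n) {
    for (int pass = 0; pass < 2; ++pass) {
        const bool wantBrighter = (pass == 0);
        int run = 0;
        // Walk 16 + n - 1 positions so runs that wrap past index 15 count.
        for (int i = 0; i < 16 + n - 1; ++i) {
            const int v = ring[i % 16];
            const bool hit = wantBrighter ? (v > center + threshold)
                                          : (v < center - threshold);
            run = hit ? run + 1 : 0;
            if (run >= n) return true;
        }
    }
    return false;
}
```

With `type = 12` the same ring must contain a longer contiguous arc, so fewer and sharper corners pass than with `type = 9`.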

Harris Corner Detection

Harris corner detection. Response R = det(M) - k * trace(M)^2, where M is the structure tensor.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

// Detect corners (including NMS)
int harrisCornerDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int blockSize = 2,
    float k = 0.04f,
    float threshold = 1e6f);

// Emit the Harris response map only (caller does the threshold / NMS) — CPP only
int harrisResponse(
    const uint8_t* srcImage, float* dstImage,
    int width, int height, int srcStride,
    int blockSize = 2, float k = 0.04f);

Parameter	Type	Meaning	Default
`blockSize`	`int`	Structure-tensor neighborhood radius	`2`
`k`	`float`	Harris free parameter	`0.04`
`threshold`	`float`	Minimum response threshold	`1e6`
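
The response formula above can be evaluated directly once the structure-tensor entries are summed over the neighborhood. The sketch below does only that final step; `harrisResponseAt` is a hypothetical helper, and gradient computation, window weighting and any scaling that ACL applies (which affects what `threshold` values are sensible) are not modelled.

```cpp
// Hypothetical helper (not an ACL Pack API): Harris response from the summed
// structure-tensor entries M = [sxx sxy; sxy syy].
float harrisResponseAt(float sxx, float syy, float sxy, float k) {
    const float det = sxx * syy - sxy * sxy;
    const float trace = sxx + syy;
    return det - k * trace * trace;  // R = det(M) - k * trace(M)^2
}
```

Two strong, independent gradient directions give a large positive R (corner); one dominant direction gives a negative R (edge); a flat patch gives R near zero.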

Shi-Tomasi Corner Detection

Shi-Tomasi corners (Good Features to Track): response = min(λ1, λ2) (the structure tensor's eigenvalues).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int shiTomasiCornerDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int maxCorners = 500,
    float qualityLevel = 0.01f,
    float minDistance = 10.0f,
    int blockSize = 2);

// Alias (same semantics as shiTomasiCornerDetect)
int shiTomasiDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& corners,
    int maxCorners = 500,
    float qualityLevel = 0.01f,
    float minDistance = 10.0f,
    int blockSize = 2);

// Emit the min-eigenvalue response map only — CPP only
int minEigenValResponse(
    const uint8_t* srcImage, float* dstImage,
    int width, int height, int srcStride,
    int blockSize = 2);

Parameter	Type	Meaning	Default
`maxCorners`	`int`	Upper bound on returned corners (`0` = no limit)	`500`
`qualityLevel`	`float`	Minimum quality ratio relative to the strongest response	`0.01`
`minDistance`	`float`	Minimum Euclidean distance between adjacent corners	`10.0`
`blockSize`	`int`	Structure-tensor neighborhood radius	`2`
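
For a 2×2 symmetric structure tensor the smaller eigenvalue has a closed form, which is the score the Shi-Tomasi criterion thresholds. The sketch below computes it for given tensor entries; `minEigenvalue2x2` is a hypothetical helper, and how ACL accumulates the tensor is not shown.

```cpp
#include <cmath>

// Hypothetical helper (not an ACL Pack API): smaller eigenvalue of the
// symmetric matrix M = [a b; b c].
float minEigenvalue2x2(float a, float b, float c) {
    const float halfTrace = 0.5f * (a + c);
    const float halfDiff = 0.5f * (a - c);
    // Eigenvalues are halfTrace +/- sqrt(halfDiff^2 + b^2); take the minus.
    return halfTrace - std::sqrt(halfDiff * halfDiff + b * b);
}
```

With `qualityLevel = 0.01`, a candidate is kept only if this value is at least 1% of the largest value found in the image.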

ORB (Oriented FAST and Rotated BRIEF)

ORB detection + 256-bit rBRIEF descriptors.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

// Detect + compute descriptors
int orbDetectAndCompute(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPointORB>& keypoints,
    int maxKeypoints = 500,
    float scaleFactor = 1.2f,
    int nLevels = 8,
    int fastThreshold = 20);

// Detection only (discard descriptors, output KeyPoint)
int orbDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int maxKeypoints = 500,
    float scaleFactor = 1.2f,
    int nLevels = 8,
    int fastThreshold = 20);

// Hamming distance between two 256-bit descriptors (0-256)
int orbHammingDistance(const uint8_t desc1[32], const uint8_t desc2[32]);

Parameter	Type	Meaning	Default
`maxKeypoints`	`int`	Target keypoint count	`500`
`scaleFactor`	`float`	Inter-level pyramid scale factor (> `1.0`)	`1.2`
`nLevels`	`int`	Number of pyramid levels	`8`
`fastThreshold`	`int`	FAST internal threshold	`20`

width / height must be ≥ 32.
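
For reference, the Hamming distance between two 256-bit descriptors is the number of differing bits across their 32 bytes. The portable sketch below shows the computation; `hammingDistance256` is a hypothetical stand-in for illustration, not the implementation behind `orbHammingDistance`.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper (not an ACL Pack API): number of differing bits between
// two 32-byte (256-bit) binary descriptors. Result is in [0, 256].
int hammingDistance256(const uint8_t a[32], const uint8_t b[32]) {
    int distance = 0;
    for (int i = 0; i < 32; ++i) {
        // XOR leaves a 1 wherever the two bytes disagree; count those bits.
        distance += static_cast<int>(std::bitset<8>(a[i] ^ b[i]).count());
    }
    return distance;
}
```

Identical descriptors give 0 and fully complementary ones give 256; unrelated descriptors tend to land near the middle of that range.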

SIFT (Scale-Invariant Feature Transform)

SIFT scale-invariant features + 128-D descriptors.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int siftDetectAndCompute(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPointExt>& keypoints,
    int nOctaves = 0,
    int nScalesPerOctave = 3,
    float contrastThresh = 0.04f,
    float edgeThresh = 10.0f,
    float sigma = 1.6f);

// Detection only
int siftDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int nOctaves = 0, int nScalesPerOctave = 3,
    float contrastThresh = 0.04f, float edgeThresh = 10.0f,
    float sigma = 1.6f);

Parameter	Type	Meaning	Default
`nOctaves`	`int`	Number of pyramid octaves (`0` = auto `log2(min(w,h)) - 2`)	`0`
`nScalesPerOctave`	`int`	Scales per octave	`3`
`contrastThresh`	`float`	DoG extremum contrast threshold	`0.04`
`edgeThresh`	`float`	Edge-response rejection threshold	`10.0`
`sigma`	`float`	Initial Gaussian sigma	`1.6`

width / height must be ≥ 16.

SURF (Speeded-Up Robust Features)

SURF accelerated features, built on an integral image and the Hessian determinant. Descriptors are 64-D and are stored in the first 64 elements of the 128-element KeyPointExt::descriptor array.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int surfDetectAndCompute(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPointExt>& keypoints,
    float hessianThresh = 100.0f,
    int nOctaves = 4,
    int nOctaveLayers = 3);

// Detection only
int surfDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    float hessianThresh = 100.0f,
    int nOctaves = 4, int nOctaveLayers = 3);

width / height must be ≥ 24.

HOG (Histogram of Oriented Gradients)

HOG descriptor — histogram of gradient orientations; commonly used for pedestrian detection and classification features.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

struct HOGParams {
    int cellSize;    // default 8
    int blockSize;   // default 2 (block = blockSize × blockSize cells)
    int nbins;       // default 9
    int blockStride; // default 1 (in units of cells)
};

int computeHOG(
    const uint8_t* srcImage, int width, int height, int srcStride,
    float* descriptors, int& descriptorSize,
    const HOGParams& params = HOGParams());

Parameter	Type	Meaning	Default
`descriptors`	`float*`	Output descriptor array (pre-allocated by the caller)	non-null
`descriptorSize`	`int&`	Returns the actual number of floats written	—
`params`	`const HOGParams&`	HOG parameters	default-constructed

Output size = blocksX * blocksY * (blockSize * blockSize * nbins), where blocksX / blocksY are the number of block positions horizontally / vertically. You can compute this in advance from the parameters, or call once and read descriptorSize.
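
A sketch of the size computation follows, so the descriptor buffer can be allocated before the call. It assumes the conventional dense layout: cells tile the image without overlap, partial cells at the right and bottom edges are dropped, and blocks slide by `blockStride` cells. `HogLayout` and `hogDescriptorLength` are hypothetical names that mirror the `HOGParams` fields; verify the result against the `descriptorSize` the library reports.

```cpp
// Hypothetical mirror of the HOGParams fields (not an ACL Pack type).
struct HogLayout {
    int cellSize;     // pixels per cell side
    int blockSize;    // cells per block side
    int nbins;        // orientation bins per cell
    int blockStride;  // block step, in cells
};

// Hypothetical helper: number of floats in the descriptor.
// Assumption: partial cells at the image border are discarded.
int hogDescriptorLength(int width, int height, const HogLayout& p) {
    const int cellsX = width / p.cellSize;
    const int cellsY = height / p.cellSize;
    if (cellsX < p.blockSize || cellsY < p.blockSize) return 0;
    const int blocksX = (cellsX - p.blockSize) / p.blockStride + 1;
    const int blocksY = (cellsY - p.blockSize) / p.blockStride + 1;
    return blocksX * blocksY * (p.blockSize * p.blockSize * p.nbins);
}
```

Under these assumptions a 64×128 window with the default parameters gives 7 × 15 blocks of 36 values each, the familiar 3780-float pedestrian descriptor.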

houghLines / houghLinesP

Standard Hough line detection (houghLines outputs (rho, theta)) and probabilistic Hough (houghLinesP outputs line-segment endpoints).

Tier: Pro+
Channels: 1ch binary edge map (typically produced by canny)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

// Standard Hough
int houghLines(
    const uint8_t* edgeImage, int width, int height, int stride,
    std::vector<acl::Vec2f>& lines,
    float rho,
    float theta,
    int threshold);

// Probabilistic Hough (returns line segments)
int houghLinesP(
    const uint8_t* edgeImage, int width, int height, int stride,
    std::vector<acl::Vec4i>& lines,
    float rho,
    float theta,
    int threshold,
    double minLineLength,
    double maxLineGap);

Parameter	Type	Meaning	Recommended
`edgeImage`	`const uint8_t*`	Input binary edge image (non-zero = edge)	non-null
`rho`	`float`	Distance resolution (pixels)	`1.0`
`theta`	`float`	Angle resolution (radians)	`M_PI/180`
`threshold`	`int`	Accumulator vote threshold	standard `100` / probabilistic `50`
`minLineLength`	`double`	(probabilistic only) minimum line length	`0`
`maxLineGap`	`double`	(probabilistic only) maximum gap between two points on the same line	`10`

Output:

houghLines → Vec2f(rho, theta)
houghLinesP → Vec4i(x1, y1, x2, y2)
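
A (rho, theta) pair from houghLines usually has to be turned into two endpoints before it can be drawn. The sketch below does that, assuming the common normal form x·cos(theta) + y·sin(theta) = rho with the origin at the top-left pixel. That convention matches OpenCV but is an assumption for ACL; `Segment` and `polarLineToSegment` are hypothetical names.

```cpp
#include <cmath>

// Hypothetical type and helper (not ACL Pack APIs).
struct Segment { float x1, y1, x2, y2; };

// Convert a line in normal form x*cos(theta) + y*sin(theta) = rho into a
// segment of length 2 * halfLength centred on the point closest to the origin.
Segment polarLineToSegment(float rho, float theta, float halfLength) {
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    const float x0 = rho * c;  // foot of the normal from the origin
    const float y0 = rho * s;
    // The line direction (-s, c) is perpendicular to the normal (c, s).
    return { x0 - halfLength * s, y0 + halfLength * c,
             x0 + halfLength * s, y0 - halfLength * c };
}
```

Choose `halfLength` larger than the image diagonal and clip to the image when drawing, since the standard transform reports infinite lines rather than segments.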

houghCircles

Hough circle detection (21HT gradient method).

Tier: Pro+
Channels: 1ch grayscale image (internally performs Canny + gradient, no pre-binarization needed)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int houghCircles(
    const uint8_t* grayImage, int width, int height, int stride,
    std::vector<acl::Vec3f>& circles,
    float dp = 1.0f,
    float minDist = 20.0f,
    float param1 = 100.0f,
    float param2 = 100.0f,
    int minRadius = 0,
    int maxRadius = 0);

Parameter	Type	Meaning	Default
`grayImage`	`const uint8_t*`	Input grayscale image	non-null
`circles`	`vector<Vec3f>&`	Output circles `(cx, cy, radius)`	—
`dp`	`float`	Accumulator-to-image resolution ratio	`1.0`
`minDist`	`float`	Minimum distance between adjacent circle centers	`20.0`
`param1`	`float`	Canny upper threshold (lower threshold auto = `param1 / 2`)	`100.0`
`param2`	`float`	Accumulator threshold for circle-center detection	`100.0`
`minRadius`	`int`	Minimum radius	`0`
`maxRadius`	`int`	Maximum radius (`0` = `max(width, height)`)	`0`

opticalFlowLK

Sparse pyramid Lucas-Kanade optical flow.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

CPP / NEON Signature (identical)

cpp

int opticalFlowLK(
    const uint8_t* prevImage, const uint8_t* nextImage,
    int width, int height, int stride,
    const acl::Point2f* prevPts, acl::Point2f* nextPts,
    uint8_t* status, float* error,
    int numPoints,
    int winSize = 21,
    int maxLevel = 3,
    int maxIter = 30,
    float epsilon = 0.01f);

Parameter	Type	Meaning	Default
`prevImage`, `nextImage`	`const uint8_t*`	Previous / next frames	non-null
`prevPts`	`const Point2f*`	Points in the previous frame to be tracked	non-null
`nextPts`	`Point2f*`	Tracked points in the next frame (caller pre-allocates `numPoints`)	non-null
`status`	`uint8_t*`	Per-point status (`1` = tracking succeeded, `0` = lost; pre-allocated for `numPoints`)	non-null
`error`	`float*`	Per-point tracking error (may be `nullptr`; when non-null, pre-allocated for `numPoints`)	nullable
`numPoints`	`int`	Number of tracked points	—
`winSize`	`int`	Search window size	`21`
`maxLevel`	`int`	Maximum pyramid level	`3`
`maxIter`	`int`	Max iterations per level	`30`
`epsilon`	`float`	Convergence threshold	`0.01`
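
After a call, entries of `nextPts` whose `status` is `0` should be discarded before the points are reused. The sketch below shows that filtering step on plain vectors. `Point2f` here is a local mirror of `acl::Point2f`, and `keepTracked` is a hypothetical helper, not an ACL Pack API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local mirror of acl::Point2f so the sketch is self-contained.
struct Point2f { float x, y; };

// Hypothetical helper: keep only the points whose status flag is 1
// (tracking succeeded), preserving their order.
std::vector<Point2f> keepTracked(const std::vector<Point2f>& nextPts,
                                 const std::vector<uint8_t>& status) {
    std::vector<Point2f> kept;
    for (std::size_t i = 0; i < nextPts.size() && i < status.size(); ++i) {
        if (status[i] == 1) kept.push_back(nextPts[i]);
    }
    return kept;
}
```

Feeding only the surviving points into the next frame keeps lost tracks from accumulating; when `error` is requested, the same loop is a natural place to reject high-error tracks too.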

descriptorMatch (bfMatch / bfMatchBinary / bfKnnMatch / bfKnnMatchBinary)

Brute-force descriptor matching: bfMatch* returns 1 nearest neighbor per query; bfKnn* returns K nearest neighbors. The float version uses L2 distance; the Binary version uses Hamming distance.

Tier: Pro+
Channels: N/A (descriptor vectors)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	binary descriptors (e.g. ORB)
`T`	`float`	real-valued descriptors (e.g. SIFT/SURF)

CPP / NEON Signature (identical)

cpp

// float descriptors, L2 distance, 1-NN
int bfMatch(
    const float* queryDescs, int queryCount, int descDim,
    const float* trainDescs, int trainCount,
    std::vector<acl::DMatch>& matches);

// Binary descriptors (e.g. ORB), Hamming distance, 1-NN
int bfMatchBinary(
    const uint8_t* queryDescs, int queryCount, int descBytes,
    const uint8_t* trainDescs, int trainCount,
    std::vector<acl::DMatch>& matches);

// float descriptors, L2, K-NN
int bfKnnMatch(
    const float* queryDescs, int queryCount, int descDim,
    const float* trainDescs, int trainCount,
    std::vector<std::vector<acl::DMatch>>& matches,
    int k = 2);

// Binary descriptors, Hamming, K-NN
int bfKnnMatchBinary(
    const uint8_t* queryDescs, int queryCount, int descBytes,
    const uint8_t* trainDescs, int trainCount,
    std::vector<std::vector<acl::DMatch>>& matches,
    int k = 2);

Parameter	Type	Meaning
`queryDescs`, `trainDescs`	`const float*` / `const uint8_t*`	Row-major descriptors; `queryCount` (or `trainCount`) rows of `descDim` floats or `descBytes` bytes each
`descDim`	`int`	float descriptor dimension (SIFT `128`; the SURF implementation uses `64`)
`descBytes`	`int`	Binary descriptor byte count (ORB `32`)
`matches`	`vector<DMatch>&` or `vector<vector<DMatch>>&`	Output matches; each `DMatch` contains `queryIdx, trainIdx, distance`
`k`	`int`	K for K-NN; each query yields `min(k, trainCount)` matches
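
As a point of reference for what the Binary variants compute, the sketch below is a plain scalar brute-force 1-NN search using Hamming distance. It is not the library's implementation: `RefMatch`, `hammingDistance`, and `refMatchBinary` are hypothetical names, and `RefMatch` only mirrors the `queryIdx, trainIdx, distance` fields listed in the table above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for acl::DMatch (field names taken from the table above).
struct RefMatch { int queryIdx; int trainIdx; float distance; };

// Number of differing bits between two descriptors of descBytes bytes.
inline int hammingDistance(const uint8_t* a, const uint8_t* b, int descBytes) {
    int bits = 0;
    for (int i = 0; i < descBytes; ++i) {
        uint8_t x = static_cast<uint8_t>(a[i] ^ b[i]);
        while (x) { x &= static_cast<uint8_t>(x - 1); ++bits; }  // clear lowest set bit
    }
    return bits;
}

// Scalar reference: for each query row, find the train row with the smallest Hamming distance.
inline std::vector<RefMatch> refMatchBinary(
    const uint8_t* queryDescs, int queryCount, int descBytes,
    const uint8_t* trainDescs, int trainCount) {
    assert(queryDescs && trainDescs && descBytes > 0);
    std::vector<RefMatch> matches;
    if (trainCount <= 0) return matches;
    for (int q = 0; q < queryCount; ++q) {
        int best = std::numeric_limits<int>::max();
        int bestIdx = -1;
        for (int t = 0; t < trainCount; ++t) {
            const int d = hammingDistance(queryDescs + q * descBytes,
                                          trainDescs + t * descBytes, descBytes);
            if (d < best) { best = d; bestIdx = t; }
        }
        matches.push_back({q, bestIdx, static_cast<float>(best)});
    }
    return matches;
}
```

A scalar reference like this can be handy for validating results on a small descriptor set before switching to the accelerated path.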

Example

cpp

uint8_t srcImage[1920*1080];
std::vector<acl::KeyPointORB> kps;

// 1) ORB detection + description
acl::neon::feature::orbDetectAndCompute(
    srcImage, 1920, 1080, 0, kps);

// 2) ORB matching between two images (Binary + Hamming)
std::vector<uint8_t> qDescs(kps.size() * 32), tDescs(/*...*/);
for (size_t i = 0; i < kps.size(); ++i)
    memcpy(qDescs.data() + i*32, kps[i].descriptor, 32);

std::vector<acl::DMatch> matches;
acl::neon::feature::bfMatchBinary(
    qDescs.data(), (int)kps.size(), 32,
    tDescs.data(), /*trainCount=*/200,
    matches);

Transform

Namespace: acl::transform (CPP) / acl::neon::transform (NEON)

Transform is split into two categories:

Matrix computation (compute the transform matrix from point pairs): getRotationMatrix2D / getAffineTransform / getPerspectiveTransform / findHomography
Applying the transform (resample the image using the matrix): warpAffine / warpPerspective / remap / yuvRemap

getRotationMatrix2D

Construct a 2×3 affine matrix from rotation around (cx, cy) + scaling.

Tier: Starter+
Channels: N/A (matrix builder)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`double`	—

CPP Signature

cpp

int getRotationMatrix2D(
    double cx, double cy,
    double angle, double scale,
    double* M);

Parameter	Type	Meaning	Default
`cx`, `cy`	`double`	Rotation center coordinates	—
`angle`	`double`	Rotation angle (degrees, counter-clockwise positive)	—
`scale`	`double`	Scaling factor	—
`M`	`double*`	Output 2×3 matrix, row-major, 6 doubles	non-null
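
For orientation, here is a minimal sketch of how such a matrix is commonly built. It assumes the widely used convention shown in the comment; whether ACL uses exactly this sign convention is not confirmed by this reference, and `refRotationMatrix2D` and `applyAffine` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical reference, assuming the common convention
//   [  a  b  (1-a)*cx - b*cy ]
//   [ -b  a  b*cx + (1-a)*cy ]   with a = scale*cos(angle), b = scale*sin(angle).
// ACL's exact sign convention is an assumption here.
inline void refRotationMatrix2D(double cx, double cy,
                                double angleDeg, double scale, double* M) {
    assert(M != nullptr);
    const double kPi = 3.14159265358979323846;
    const double rad = angleDeg * kPi / 180.0;
    const double a = scale * std::cos(rad);
    const double b = scale * std::sin(rad);
    M[0] = a;  M[1] = b; M[2] = (1.0 - a) * cx - b * cy;
    M[3] = -b; M[4] = a; M[5] = b * cx + (1.0 - a) * cy;
}

// Apply a row-major 2x3 matrix to a point.
inline void applyAffine(const double* M, double x, double y, double& ox, double& oy) {
    ox = M[0] * x + M[1] * y + M[2];
    oy = M[3] * x + M[4] * y + M[5];
}
```

Under this convention the rotation center is a fixed point of the transform, which makes a quick sanity check for any matrix the builder returns.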

getAffineTransform

Compute a 2×3 affine transform matrix from 3 point pairs.

Tier: Starter+
Channels: N/A (matrix builder)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`double`	—

CPP Signature

cpp

int getAffineTransform(
    const double* srcImage, const double* dstImage,
    double* M);

Parameter	Type	Meaning
`srcImage`	`const double*`	3 source points `(x0,y0, x1,y1, x2,y2)`, 6 doubles total
`dstImage`	`const double*`	3 destination points (same format as above)
`M`	`double*`	Output 2×3 matrix, row-major
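
Three non-collinear point pairs determine the six matrix entries exactly. The sketch below solves the two 3×3 linear systems with Cramer's rule, assuming `M` maps source points to destination points (`dst = M · (x, y, 1)ᵀ`). `det3` and `refAffineFrom3Points` are hypothetical names, and the return codes are illustrative rather than ACL's error codes.

```cpp
#include <cassert>
#include <cmath>

// Determinant of a 3x3 matrix given as three rows.
inline double det3(const double r0[3], const double r1[3], const double r2[3]) {
    return r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
         - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
         + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]);
}

// Hypothetical reference: returns 0 on success, -1 if the source points are collinear.
inline int refAffineFrom3Points(const double* src, const double* dst, double* M) {
    assert(src && dst && M);
    const double A[3][3] = { {src[0], src[1], 1.0},
                             {src[2], src[3], 1.0},
                             {src[4], src[5], 1.0} };
    const double det = det3(A[0], A[1], A[2]);
    if (std::fabs(det) < 1e-12) return -1;
    for (int row = 0; row < 2; ++row) {        // row 0 solves for x', row 1 for y'
        for (int col = 0; col < 3; ++col) {    // Cramer's rule: replace column `col` by the targets
            double B[3][3];
            for (int i = 0; i < 3; ++i)
                for (int j = 0; j < 3; ++j)
                    B[i][j] = (j == col) ? dst[2 * i + row] : A[i][j];
            M[3 * row + col] = det3(B[0], B[1], B[2]) / det;
        }
    }
    return 0;
}
```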

getPerspectiveTransform

Compute a 3×3 perspective transform matrix from 4 point pairs.

Tier: Starter+
Channels: N/A (matrix builder)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`double`	—

CPP / NEON Signature (identical)

cpp

int getPerspectiveTransform(
    const double* srcImage, const double* dstImage,
    double* M);

Parameter	Type	Meaning
`srcImage`	`const double*`	4 source points, 8 doubles total
`dstImage`	`const double*`	4 destination points
`M`	`double*`	Output 3×3 matrix, row-major, 9 doubles (normalized so that `M[8] = 1`)

findHomography

Compute a 3×3 homography from N ≥ 4 point pairs; supports least-squares or RANSAC.

Tier: Pro+
Channels: N/A (point sets)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`double`	—

CPP / NEON Signature (identical)

cpp

int findHomography(
    const double* srcPts, const double* dstPts,
    int numPts, double* H,
    int method = 0,
    double ransacThreshold = 3.0);

Parameter	Type	Meaning	Default
`srcPts`	`const double*`	N source points `(x0,y0, x1,y1, …)`, `2*N` doubles total	non-null
`dstPts`	`const double*`	N destination points	non-null
`numPts`	`int`	Number of point pairs, must be ≥ 4	—
`H`	`double*`	Output 3×3 homography, row-major, normalized `H[8] = 1`	non-null
`method`	`int`	`0` = least-squares DLT; `1` = RANSAC	`0`
`ransacThreshold`	`double`	RANSAC inlier distance threshold	`3.0`

When numPts == 4, getPerspectiveTransform is used automatically (exact solution).

dltHomography

Low-level / algorithm-customization API. Most users should call findHomography; use this only if you need direct access to the linear DLT solver and will handle outlier rejection yourself.

Solve a 3×3 homography from numPts ≥ 4 point pairs using a single normalized Direct Linear Transform (DLT) least-squares pass — no RANSAC, no robust filtering.

Tier: Pro+
Channels: N/A (point sets)
Inplace: not supported

CPP Signature (`acl::transform` only)

This helper is not declared under acl::neon::transform; NEON packages expose findHomography, getPerspectiveTransform, warpAffine, and warpPerspective, but not the DLT helper.

cpp

int dltHomography(
    const double* srcPts, const double* dstPts,
    int numPts, double* H);

Parameter	Type	Meaning
`srcPts`	`const double*`	N source points `(x0,y0, x1,y1, …)`, `2*N` doubles
`dstPts`	`const double*`	N destination points, same layout
`numPts`	`int`	Number of point pairs, must be ≥ 4
`H`	`double*`	Output 3×3 homography, row-major, normalized `H[8] = 1`

Returns: 0 on success, non-zero on failure (degenerate point configuration, numPts < 4).

homographyError

Low-level / algorithm-customization API. Mostly useful when implementing custom outlier rejection or scoring loops on top of dltHomography.

Project a single point through a homography and return the squared Euclidean distance to its observed correspondence — i.e. the per-correspondence reprojection error used inside RANSAC scoring.

Tier: Pro+
Channels: N/A (scalar math)
Inplace: N/A

CPP Signature (`acl::transform` only)

This helper is not declared under acl::neon::transform; call acl::transform::homographyError instead (Pro tier or above).

cpp

double homographyError(
    const double* H,
    double sx, double sy,
    double dx, double dy);

Parameter	Type	Meaning
`H`	`const double*`	Row-major 3×3 homography
`sx`, `sy`	`double`	Source point coordinates
`dx`, `dy`	`double`	Observed destination point coordinates

Returns: Squared Euclidean distance between H · (sx, sy, 1) (normalized to z=1) and (dx, dy).
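
The return value described above can be written out in a few lines. This is a sketch of the formula, not the library source; `refHomographyError` and `countInliers` are hypothetical names, and comparing the squared error against a squared pixel threshold is an assumption about how a custom scoring loop might use it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical reference for the formula above: project (sx, sy) through H,
// divide by the homogeneous coordinate, and return the squared distance to (dx, dy).
inline double refHomographyError(const double* H,
                                 double sx, double sy, double dx, double dy) {
    assert(H != nullptr);
    const double w = H[6] * sx + H[7] * sy + H[8];
    assert(std::fabs(w) > 0.0);  // a point mapped to infinity has no finite error
    const double px = (H[0] * sx + H[1] * sy + H[2]) / w;
    const double py = (H[3] * sx + H[4] * sy + H[5]) / w;
    return (px - dx) * (px - dx) + (py - dy) * (py - dy);
}

// Count correspondences whose reprojection error is within `threshold` pixels.
inline int countInliers(const double* H, const double* srcPts, const double* dstPts,
                        int numPts, double threshold) {
    int inliers = 0;
    for (int i = 0; i < numPts; ++i) {
        const double e = refHomographyError(H, srcPts[2 * i], srcPts[2 * i + 1],
                                            dstPts[2 * i], dstPts[2 * i + 1]);
        if (e <= threshold * threshold) ++inliers;  // compare squared values
    }
    return inliers;
}
```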

warpAffine

Apply a 2×3 affine transform to an image. Uses inverse mapping: destination pixel (x', y') looks up source coordinates (x, y) = M * (x', y', 1)^T.

Tier: Starter+
Channels: 1ch / 3ch / 4ch (via runtime cn parameter)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`IMG_T`	`{uint8_t, uint16_t, float}`	—
`MAT_T`	`{float, double}`	—

CPP / NEON Signature (identical)

cpp

template<class IMG_T, class MAT_T = double>
int warpAffine(
    const IMG_T* srcImage, IMG_T* dstImage,
    const MAT_T* M,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D,
    int cn = 1);

Parameter	Type	Meaning	Default
`srcImage`, `dstImage`	`const IMG_T*` / `IMG_T*`	Input / output image buffers	non-null
`M`	`const MAT_T*`	2×3 affine matrix, row-major	non-null
`srcWidth`, `srcHeight`	`int`	Source image size	> 0
`dstWidth`, `dstHeight`	`int`	Destination image size	> 0
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto
`interpMode`	`acl::InterpMode`	`NEAREST` / `LINEAR2D`	`LINEAR2D`
`cn`	`int`	Channel count (1 / 3 / 4)	`1`
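
To make the inverse-mapping description concrete, the following scalar sketch walks every destination pixel and samples the source with NEAREST interpolation. It is illustrative only: `refWarpAffineNearest` is a hypothetical name, rows are assumed tightly packed, only 1-channel `uint8_t` is handled, and writing `0` for out-of-range samples is an assumption, since border behavior is not specified here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar reference: NEAREST sampling, 1 channel, uint8_t, packed rows.
// Destination pixels whose source coordinate falls outside the image are set to 0;
// ACL's actual border behavior is not specified here.
inline void refWarpAffineNearest(const uint8_t* src, uint8_t* dst, const double* M,
                                 int srcWidth, int srcHeight,
                                 int dstWidth, int dstHeight) {
    assert(src && dst && M && src != dst);  // not an in-place operation
    for (int y = 0; y < dstHeight; ++y) {
        for (int x = 0; x < dstWidth; ++x) {
            // Inverse mapping: M takes destination coordinates to source coordinates.
            const int sx = static_cast<int>(std::lround(M[0] * x + M[1] * y + M[2]));
            const int sy = static_cast<int>(std::lround(M[3] * x + M[4] * y + M[5]));
            const bool inside = sx >= 0 && sx < srcWidth && sy >= 0 && sy < srcHeight;
            dst[y * dstWidth + x] = inside ? src[sy * srcWidth + sx] : 0;
        }
    }
}
```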

warpPerspective

Apply a 3×3 perspective transform to an image. Inverse mapping; homogeneous coordinates are divided by w.

Tier: Starter+
Channels: 1ch / 3ch / 4ch (via runtime cn parameter)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`IMG_T`	`{uint8_t, uint16_t, float}`	—
`MAT_T`	`{float, double}`	—

CPP / NEON Signature (identical)

cpp

template<class IMG_T, class MAT_T = double>
int warpPerspective(
    const IMG_T* srcImage, IMG_T* dstImage,
    const MAT_T* M,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D,
    int cn = 1);

Runtime parameters are the same as warpAffine; M is a 3×3 matrix (9 MAT_T).

Example (warpAffine + warpPerspective chain)

cpp

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// 1) Build a 30° rotation matrix around the image center
double M_affine[6];
acl::transform::getRotationMatrix2D(960.0, 540.0, 30.0, 1.0, M_affine);

// 2) Apply affine (NEON-accelerated, LINEAR2D)
acl::neon::transform::warpAffine<uint8_t, double>(
    srcImage, dstImage, M_affine, 1920, 1080, 1920, 1080,
    0, 0, acl::InterpMode::LINEAR2D, 1);

// 3) 4-point perspective matrix; this warps srcImage again and overwrites dstImage
double srcPts[8] = { 0,0,  1920,0,  1920,1080,  0,1080 };
double dstPts[8] = { 100,50,  1820,0,  1920,1080,  50,1050 };
double H[9];
acl::transform::getPerspectiveTransform(srcPts, dstPts, H);
acl::neon::transform::warpPerspective<uint8_t, double>(
    srcImage, dstImage, H, 1920, 1080, 1920, 1080,
    0, 0, acl::InterpMode::LINEAR2D, 1);

remap

Generic pixel remap: dst(x, y) = src(mapX(x, y), mapY(x, y)). Can be used to implement fisheye correction, arbitrary distortion correction, etc.

Tier: Starter+
Channels: 1ch / 3ch / 4ch (via runtime cn parameter)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`IMG_T`	`{uint8_t, uint16_t, float}`	—
`MAP_T`	`{float, double}`	—

CPP Signature

cpp

template<class IMG_T, class MAP_T>
int remap(
    const IMG_T* srcImage, IMG_T* dstImage,
    const MAP_T* mapX, const MAP_T* mapY,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    int mapStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D,
    int cn = 1);

Parameter	Type	Meaning	Default
`mapX`, `mapY`	`const MAP_T*`	Source coordinate maps, each of size `dstWidth × dstHeight`: for every destination pixel, `mapX` holds the source x and `mapY` the source y	non-null
`mapStride`	`int`	Map bytes per row (shared by `mapX` and `mapY`)	`0` = auto
`interpMode`	`acl::InterpMode`	Interpolation mode	`LINEAR2D`
`cn`	`int`	Channel count	`1`
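
As an illustration of how the two maps are laid out, the sketch below fills `mapX` / `mapY` for a horizontal mirror and applies them with a scalar NEAREST lookup. `buildMirrorMaps` and `refRemapNearest` are hypothetical helpers; packed rows, 1-channel `uint8_t` data, and `0` for out-of-range reads are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill mapX / mapY so that dst(x, y) = src(width - 1 - x, y), i.e. a horizontal mirror.
inline void buildMirrorMaps(int width, int height,
                            std::vector<float>& mapX, std::vector<float>& mapY) {
    assert(width > 0 && height > 0);
    mapX.resize(static_cast<std::size_t>(width) * height);
    mapY.resize(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            mapX[i] = static_cast<float>(width - 1 - x);
            mapY[i] = static_cast<float>(y);
        }
    }
}

// Hypothetical scalar reference for dst(x, y) = src(mapX(x, y), mapY(x, y)):
// NEAREST sampling, 1 channel, packed rows, out-of-range reads produce 0.
inline void refRemapNearest(const uint8_t* src, uint8_t* dst,
                            const float* mapX, const float* mapY,
                            int srcWidth, int srcHeight, int dstWidth, int dstHeight) {
    assert(src && dst && mapX && mapY);
    for (int i = 0; i < dstWidth * dstHeight; ++i) {
        const long sx = std::lround(mapX[i]);
        const long sy = std::lround(mapY[i]);
        const bool inside = sx >= 0 && sx < srcWidth && sy >= 0 && sy < srcHeight;
        dst[i] = inside ? src[sy * srcWidth + sx] : 0;
    }
}
```

The same two arrays could then be passed as `mapX` and `mapY` to `remap` with `MAP_T = float`.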

yuvRemap

Remap an NV21 / NV12 image directly while preserving the Y / UV plane structure (equivalent to remap followed by restoration of the YUV sampling relationship).

Tier: Business
Channels: NV21 / NV12 (Y plane + interleaved UV plane)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`IMG_T`	`uint8_t`	—
`MAP_T`	`{float, double}`	—

CPP Signature

cpp

template<class IMG_T, class MAP_T>
int yuvRemap(
    const IMG_T* srcYImage, const IMG_T* srcUVImage,
    IMG_T* dstYImage, IMG_T* dstUVImage,
    const MAP_T* mapX, const MAP_T* mapY,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0, int mapStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D);

Parameter	Type	Meaning
`srcYImage`, `srcUVImage`	`const IMG_T*`	Source NV21 / NV12 planes
`dstYImage`, `dstUVImage`	`IMG_T*`	Destination NV21 / NV12 planes
`mapX`, `mapY`	`const MAP_T*`	Source coordinate map based on the Y plane size (UV is automatically sampled at half resolution)

Math

Namespace: acl::neon::math

Discrete Fourier Transform (DFT / IDFT / complex spectrum multiplication). matchTemplate also uses this API set internally.

Tier: Pro+
NEON only (there is no corresponding standalone CPP API)

DftFlags (see acl::DftFlags):

Flag	Value	Meaning
`DFT_FORWARD`	`0`	Forward transform (default)
`DFT_INVERSE`	`1`	Inverse transform
`DFT_SCALE`	`2`	Divide the result by `N` for normalization

Flags can be OR-combined, e.g. DFT_INVERSE | DFT_SCALE.

dft1d

1D complex → complex DFT.

Tier: Pro+
Channels: N/A (1-D signal)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`float`	—

cpp

int dft1d(
    const float* srcRe, const float* srcIm,
    float* dstRe, float* dstIm,
    int n,
    int flags = acl::DFT_FORWARD);

Parameter	Type	Meaning	Default
`srcRe`	`const float*`	Input real part (`n` elements)	non-null
`srcIm`	`const float*`	Input imaginary part (`n` elements; may be `nullptr` for real-valued input)	nullable
`dstRe`, `dstIm`	`float*`	Output real / imaginary parts (`n` elements each)	non-null
`n`	`int`	Transform length	> 0
`flags`	`int`	`DFT_FORWARD` / `DFT_INVERSE`, optionally OR'd with `DFT_SCALE`	`DFT_FORWARD`
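
The effect of the flags is easiest to see in a naive O(n²) reference transform. The sketch below is not how the library computes the DFT; it only mirrors the flag semantics from the DftFlags table. `refDft1d` and the `k*` constants are hypothetical local names, and the negative exponent sign for the forward direction is the usual convention, assumed here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Local mirrors of the flag values from the DftFlags table (not the library's enum).
constexpr int kForward = 0, kInverse = 1, kScale = 2;

// Naive O(n^2) reference DFT on split real / imaginary arrays.
// Outputs must not alias the inputs.
inline void refDft1d(const std::vector<float>& srcRe, const std::vector<float>& srcIm,
                     std::vector<float>& dstRe, std::vector<float>& dstIm,
                     int flags = kForward) {
    const int n = static_cast<int>(srcRe.size());
    assert(n > 0 && srcIm.size() == srcRe.size());
    const double kPi = 3.14159265358979323846;
    const double sign = (flags & kInverse) ? 1.0 : -1.0;   // inverse flips the exponent sign
    const double norm = (flags & kScale) ? 1.0 / n : 1.0;  // scale flag divides by N
    dstRe.assign(n, 0.0f);
    dstIm.assign(n, 0.0f);
    for (int k = 0; k < n; ++k) {
        double accRe = 0.0, accIm = 0.0;
        for (int t = 0; t < n; ++t) {
            const double ang = sign * 2.0 * kPi * k * t / n;
            const double c = std::cos(ang), s = std::sin(ang);
            accRe += srcRe[t] * c - srcIm[t] * s;
            accIm += srcRe[t] * s + srcIm[t] * c;
        }
        dstRe[k] = static_cast<float>(accRe * norm);
        dstIm[k] = static_cast<float>(accIm * norm);
    }
}
```

A forward pass followed by an inverse pass with the scale flag set recovers the input up to rounding, which is the round trip that the inverse and scale flags are meant to support together.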

dftReal1d

1D real → complex forward DFT (output is a half-spectrum, n/2 + 1 complex coefficients, exploiting conjugate symmetry).

Tier: Pro+
Channels: N/A (1-D signal)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`float`	—

cpp

int dftReal1d(
    const float* srcImage,
    float* dstRe, float* dstIm,
    int n);

Parameter	Type	Meaning	Default
`srcImage`	`const float*`	Input real-valued array (`n` elements)	non-null
`dstRe`, `dstIm`	`float*`	Output real / imaginary parts (`n/2 + 1` elements each)	non-null
`n`	`int`	Input length	even and a power of 2

idftReal1d

1D complex (CCS half-spectrum) → real inverse DFT. This is the inverse counterpart of dftReal1d: input is n/2 + 1 complex coefficients, output is n real values.

Tier: Pro+
Channels: N/A (1-D signal)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`float`	—

cpp

int idftReal1d(
    const float* srcRe, const float* srcIm,
    float* dstImage,
    int n);

Parameter	Type	Meaning	Default
`srcRe`, `srcIm`	`const float*`	Input real / imaginary parts (`n/2 + 1` elements each, CCS format)	non-null
`dstImage`	`float*`	Output real array (`n` elements)	non-null
`n`	`int`	Output length	even and a power of 2

dft2d

2D complex → complex DFT (a row-wise pass of 1D FFTs followed by a column-wise pass).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
`T`	`float`	—

cpp

int dft2d(
    const float* srcRe, const float* srcIm,
    float* dstRe, float* dstIm,
    int width, int height,
    int flags = acl::DFT_FORWARD);

Parameter	Type	Meaning	Default
`srcRe`	`const float*`	Input real part (`width * height`, row-major)	non-null
`srcIm`	`const float*`	Input imaginary part (same; may be `nullptr` for real input)	nullable
`dstRe`, `dstIm`	`float*`	Output real / imaginary parts (same size)	non-null
`width`, `height`	`int`	Columns / rows	> 0
`flags`	`int`	Same as `dft1d`	`DFT_FORWARD`

mulSpectrums

Per-element complex multiplication: C = A * B or C = A * conj(B). Commonly used for frequency-domain cross-correlation / convolution.

Tier: Pro+
Channels: N/A (complex spectra)
Inplace: supported (aRe / aIm may equal cRe / cIm)
Types:

Template parameter	Allowed types	Constraint
`T`	`float`	—

cpp

int mulSpectrums(
    const float* aRe, const float* aIm,
    const float* bRe, const float* bIm,
    float* cRe, float* cIm,
    int n,
    bool conjB = false);

Parameter	Type	Meaning	Default
`aRe`, `aIm`	`const float*`	Real / imaginary parts of complex array A	non-null
`bRe`, `bIm`	`const float*`	Real / imaginary parts of complex array B	non-null
`cRe`, `cIm`	`float*`	Real / imaginary parts of the output product C	non-null
`n`	`int`	Number of complex elements	> 0
`conjB`	`bool`	`true` = take the conjugate of B before multiplying	`false`
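
Written out per element, the operation is ordinary complex multiplication on split real / imaginary arrays. The sketch below is a scalar reference under that reading; `refMulSpectrums` is a hypothetical name, not the NEON routine.

```cpp
#include <cassert>

// Hypothetical scalar reference: C = A * B, or C = A * conj(B) when conjB is true.
// Each product is computed into temporaries first, so c may alias a (or b).
inline void refMulSpectrums(const float* aRe, const float* aIm,
                            const float* bRe, const float* bIm,
                            float* cRe, float* cIm, int n, bool conjB = false) {
    assert(aRe && aIm && bRe && bIm && cRe && cIm && n > 0);
    for (int i = 0; i < n; ++i) {
        const float br = bRe[i];
        const float bi = conjB ? -bIm[i] : bIm[i];
        const float re = aRe[i] * br - aIm[i] * bi;
        const float im = aRe[i] * bi + aIm[i] * br;
        cRe[i] = re;
        cIm[i] = im;
    }
}
```

Multiplying by the conjugate (`conjB = true`) corresponds to cross-correlation in the spatial domain, while the plain product corresponds to convolution.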

Example

cpp

int n = 1024;   // input length: even and a power of 2
std::vector<float> srcImage(n, 0.0f), re(n), im(n);
std::vector<float> back(n);

// 1) real → half-spectrum
acl::neon::math::dftReal1d(srcImage.data(), re.data(), im.data(), n);

// 2) Frequency-domain processing (example: pass-through)

// 3) half-spectrum → real restoration
acl::neon::math::idftReal1d(re.data(), im.data(), back.data(), n);

Drawing

Namespace: acl::draw

Draw lines / rectangles / circles / text on an image. All entry points are non-templated, uint8_t only, supporting 1ch / 3ch / 4ch.

Tier: Starter+
Channels: 1 / 3 / 4
Inplace: drawn directly on the source image (img is both input and output)

drawLine

Draw a line segment using the Bresenham algorithm.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

cpp

int drawLine(
    uint8_t* img,
    int width, int height, int cn, int stride,
    int x0, int y0, int x1, int y1,
    const uint8_t* color,
    int thickness = 1);

Parameter	Type	Meaning	Default
`img`	`uint8_t*`	Destination image (drawn in place)	non-null
`width`, `height`	`int`	Image size	> 0
`cn`	`int`	Channel count	`1` / `3` / `4`
`stride`	`int`	Bytes per row	`0` = `width * cn`
`x0`, `y0`, `x1`, `y1`	`int`	Line segment start / end points	—
`color`	`const uint8_t*`	Color-value array of length `cn`	non-null
`thickness`	`int`	Line width (pixels)	`1`
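
For readers unfamiliar with Bresenham's algorithm, here is a compact integer-only sketch for a 1-pixel-wide line on a 1-channel image. `refDrawLine` is a hypothetical name; thickness handling and multi-channel color are left out, and skipping out-of-image pixels is an assumption about clipping.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical reference: 1-pixel-wide Bresenham line on a 1-channel image.
// Pixels outside the image are skipped.
inline void refDrawLine(uint8_t* img, int width, int height, int stride,
                        int x0, int y0, int x1, int y1, uint8_t value) {
    assert(img && width > 0 && height > 0);
    if (stride == 0) stride = width;  // "0 = auto" for a 1-channel image
    const int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    const int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;  // combined error term for both axes
    while (true) {
        if (x0 >= 0 && x0 < width && y0 >= 0 && y0 < height)
            img[y0 * stride + x0] = value;
        if (x0 == x1 && y0 == y1) break;
        const int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  // step along x
        if (e2 <= dx) { err += dx; y0 += sy; }  // step along y
    }
}
```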

drawRect

Draw a rectangle (outline or filled).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

cpp

int drawRect(
    uint8_t* img,
    int imgW, int imgH, int cn, int stride,
    int x, int y, int w, int h,
    const uint8_t* color,
    int thickness = 1);

Parameter	Type	Meaning	Default
`x`, `y`	`int`	Rectangle top-left	—
`w`, `h`	`int`	Rectangle width / height	—
`thickness`	`int`	Line width; `-1` = filled rectangle	`1`

Other parameters are the same as drawLine.

drawCircle

Draw a circle using the midpoint circle algorithm (outline or filled).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

cpp

int drawCircle(
    uint8_t* img,
    int width, int height, int cn, int stride,
    int cx, int cy, int radius,
    const uint8_t* color,
    int thickness = 1);

Parameter	Type	Meaning	Default
`cx`, `cy`	`int`	Circle center	—
`radius`	`int`	Radius	≥ 0
`thickness`	`int`	Line width; `-1` = filled circle	`1`

putText

Render ASCII text using the built-in 8×16 bitmap font. Supports newlines \n; the font size scales linearly via scale.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

Template parameter	Allowed types	Constraint
`T`	`uint8_t`	—

cpp

int putText(
    uint8_t* img,
    int width, int height, int cn, int stride,
    const char* text,
    int x, int y,
    const uint8_t* color,
    int scale = 1);

Parameter	Type	Meaning	Default
`text`	`const char*`	`\0`-terminated ASCII string	non-null
`x`, `y`	`int`	Text top-left coordinates	—
`scale`	`int`	Font-size multiplier (`1` = 8×16, `2` = 16×32 …)	`1`

Example

cpp

// ~6 MB frame: allocate on the heap rather than the stack
std::vector<uint8_t> img(1920 * 1080 * 3, 0);
uint8_t red[3]  = { 255, 0, 0 };
uint8_t blue[3] = { 0, 0, 255 };

acl::draw::drawLine(img.data(), 1920, 1080, 3, 0, 100, 100, 500, 500, red, 2);
acl::draw::drawRect(img.data(), 1920, 1080, 3, 0, 200, 200, 300, 150, blue, -1);
acl::draw::drawCircle(img.data(), 1920, 1080, 3, 0, 960, 540, 80, red, 3);
acl::draw::putText(img.data(), 1920, 1080, 3, 0, "Hello\nWorld", 50, 50, blue, 2);

Contour Analysis

Namespace: acl::contour

Contour geometry analysis. Input is std::vector<acl::Point2i> (typically produced by findContours).

Tier: Pro+
NEON: none (geometric computation, pure CPP)

Related structs

cpp

struct acl::Point2i      { int x, y; };                 // input point type
struct acl::Rect         { int x, y, width, height; };
struct acl::Point2f      { float x, y; };
struct acl::Size2f       { float width, height; };
struct acl::RotatedRect  { Point2f center; Size2f size; float angle; };

contourArea

Compute polygon area via the Shoelace formula.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`std::vector<acl::Point2i>`	—
Output	`double`	—

cpp

double contourArea(
    const std::vector<acl::Point2i>& contour,
    bool oriented = false);

Parameter	Type	Meaning	Default
`contour`	point array	Contour (at least 3 points, otherwise returns `0`)	—
`oriented`	`bool`	`true` = signed area (positive = counter-clockwise, negative = clockwise); `false` = absolute value	`false`

The return value is the area (not an error code).
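
The Shoelace formula itself is short enough to show in full. The sketch below uses a hypothetical `Pt2i` stand-in for `acl::Point2i` and a hypothetical `refContourArea`; which traversal direction yields a positive signed area depends on whether the y axis points up or down, so treat the sign as an assumption to verify against the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt2i { int x, y; };  // hypothetical stand-in for acl::Point2i

// Shoelace formula: area = 1/2 * sum(x_i * y_{i+1} - x_{i+1} * y_i), indices wrapping around.
inline double refContourArea(const std::vector<Pt2i>& contour, bool oriented = false) {
    const std::size_t n = contour.size();
    if (n < 3) return 0.0;
    long long twice = 0;  // twice the signed area, exact in integer arithmetic
    for (std::size_t i = 0; i < n; ++i) {
        const Pt2i& p = contour[i];
        const Pt2i& q = contour[(i + 1) % n];
        twice += static_cast<long long>(p.x) * q.y - static_cast<long long>(q.x) * p.y;
    }
    const double area = 0.5 * static_cast<double>(twice);
    return oriented ? area : std::fabs(area);
}
```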

arcLength

Contour perimeter (sum of Euclidean distances between consecutive points).

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`std::vector<acl::Point2i>`	—
Output	`double`	—

cpp

double arcLength(
    const std::vector<acl::Point2i>& contour,
    bool closed = true);

Parameter	Type	Meaning	Default
`closed`	`bool`	`true` = closed (including last point → first point); `false` = open	`true`

The return value is the length (not an error code).

boundingRect

Compute the axis-aligned bounding rectangle of a contour.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`std::vector<acl::Point2i>`	—
Output	`acl::Rect`	—

cpp

acl::Rect boundingRect(
    const std::vector<acl::Point2i>& contour);

Returns a Rect (not an error code). An empty contour returns Rect(0, 0, 0, 0).

convexHull

Compute the convex hull using Andrew's monotone chain algorithm.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
input/output	`std::vector<acl::Point2i>`	—

cpp

int convexHull(
    const std::vector<acl::Point2i>& points,
    std::vector<acl::Point2i>& hull);

The output hull is arranged in counter-clockwise order.
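
A textbook version of Andrew's monotone chain is sketched below for reference. It is not the library source: `HullPt`, `cross`, and `refConvexHull` are hypothetical names, collinear points are dropped (ACL's handling of them is not stated here), and "counter-clockwise" is meant in a y-up coordinate frame.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct HullPt { int x, y; };  // hypothetical stand-in for acl::Point2i

// Cross product of (a - o) and (b - o); > 0 means a left turn o -> a -> b.
inline long long cross(const HullPt& o, const HullPt& a, const HullPt& b) {
    return static_cast<long long>(a.x - o.x) * (b.y - o.y)
         - static_cast<long long>(a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain; returns hull vertices in counter-clockwise order
// (for a y-up coordinate frame), without collinear points.
inline std::vector<HullPt> refConvexHull(std::vector<HullPt> pts) {
    std::sort(pts.begin(), pts.end(), [](const HullPt& a, const HullPt& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    pts.erase(std::unique(pts.begin(), pts.end(),
                          [](const HullPt& a, const HullPt& b) {
                              return a.x == b.x && a.y == b.y;
                          }),
              pts.end());
    const std::size_t n = pts.size();
    if (n < 3) return pts;
    std::vector<HullPt> hull(2 * n);
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {                  // lower hull
        while (k >= 2 && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    for (std::size_t i = n - 1, t = k + 1; i > 0; --i) {   // upper hull
        while (k >= t && cross(hull[k - 2], hull[k - 1], pts[i - 1]) <= 0) --k;
        hull[k++] = pts[i - 1];
    }
    hull.resize(k - 1);  // the last point repeats the first
    return hull;
}
```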

approxPolyDP

Simplify a polygon (reduce the number of vertices) using the Douglas-Peucker algorithm.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
input/output	`std::vector<acl::Point2i>`	—

cpp

int approxPolyDP(
    const std::vector<acl::Point2i>& curve,
    std::vector<acl::Point2i>& approx,
    double epsilon,
    bool closed = true);

Parameter	Type	Meaning	Default
`epsilon`	`double`	Maximum distance between the original curve and the simplified curve	—
`closed`	`bool`	Treat as a closed contour	`true`

minAreaRect

Compute the minimum-area rotated bounding rectangle using rotating calipers.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`std::vector<acl::Point2i>`	—
Output	`acl::RotatedRect`	—

cpp

int minAreaRect(
    const std::vector<acl::Point2i>& points,
    acl::RotatedRect& result);

At least 3 points are needed for a non-degenerate result: 0 points returns an error, and 1-2 points return a degenerate RotatedRect.

fitEllipse

Fit an ellipse via Direct Least Squares; outputs the ellipse's rotated-rectangle description.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

Template parameter	Allowed types	Constraint
Input	`std::vector<acl::Point2i>`	≥ 5 points
Output	`acl::RotatedRect`	—

cpp

int fitEllipse(
    const std::vector<acl::Point2i>& points,
    acl::RotatedRect& result);

The input requires at least 5 points.

Utilities

cropRect

Copy a rectangular region between two buffers; supports inplace stride reduction.

Namespace: acl::memory

cpp

template<class T>
int cropRect(
    const T* srcBuffer, T* dstBuffer,
    int copyWidth, int copyHeight,
    int srcStride = 0, int dstStride = 0,
    int srcLeft = 0, int srcTop = 0,
    int dstLeft = 0, int dstTop = 0);

Parameter	Type	Meaning	Default
`copyWidth`, `copyHeight`	`int`	Size of the region to copy	—
`srcLeft`, `srcTop`	`int`	Starting position in the source buffer	`0`
`dstLeft`, `dstTop`	`int`	Starting position in the destination buffer	`0`
`srcStride`, `dstStride`	`int`	Bytes per row	`0` = auto

Supports inplace stride reduction when srcBuffer == dstBuffer (compresses in place when srcStride ≥ dstStride).
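
Why the in-place case is safe when srcStride ≥ dstStride can be seen in a small sketch: rows are moved top to bottom, so each destination row ends before any not-yet-read source row begins. `refCompactRows` is a hypothetical helper working on raw bytes, not the `cropRect` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical reference for the in-place case: repack rows of `rowBytes` payload
// from a padded layout (srcStride bytes per row) to a tighter one (dstStride bytes per row).
// Rows are processed top to bottom, so with srcStride >= dstStride a row is never
// overwritten before it has been read; memmove handles overlap within a row.
inline void refCompactRows(uint8_t* buffer, int rowBytes, int rows,
                           int srcStride, int dstStride) {
    assert(buffer && rowBytes > 0 && rows > 0);
    assert(srcStride >= dstStride && dstStride >= rowBytes);
    for (int y = 0; y < rows; ++y) {
        std::memmove(buffer + static_cast<std::size_t>(y) * dstStride,
                     buffer + static_cast<std::size_t>(y) * srcStride,
                     static_cast<std::size_t>(rowBytes));
    }
}
```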

Namespace Availability Matrix

Per-category distribution across acl::neon::* and the scalar CPP path (acl::{module}::*). When an operator exists in both, the signatures are identical unless the operator's own section calls out a difference.

Category	Both `neon::` + CPP	CPP only
Analysis	integral, histogram, equalizeHist, clahe, minMaxLoc, copyMakeBorder, matchTemplate, blockAverage	histMatch, moments, count, mean, connectedComponent_8n_dfs, connectedComponentLabeling, findContours, distanceTransform, extractBlockPixels
Arithmetic	addImg, absDiff, addWeighted, alphaImgFusion, mul, threshold, adaptiveThreshold, bitwise (And/Not/Xor), lut, convertScaleAbs, inRange, normalize, phaseMagnitude	linearTransform2x2
Color Conversion	RGB2Gray / RGBA2Gray, channelSwap (5 Mode tags), bayer2RGB, rgb2YUV_fixed family, yuv2RGB_fixed family, rgb2HSV / bgr2HSV, rgb2Lab / bgr2Lab	hsv2BGR, lab2BGR, rgb2YUV_float family, yuv2RGB_float family, bayer2RGBA, gammaTransform
Filter	gaussianBlur, boxFilter, filter2D, sepFilter2D, sobel3x3, scharr, laplacian, morphology (erode/dilate), canny, medianFilter3x3, bilateralFilter, nlMeansDenoising, guidedFilter, unsharpMask, stackBlur, gaborFilter	edgePreservingFilter, detailEnhance, tonemap (Linear/Reinhard/Drago), mergeMertens
Geometric	resize, rotate (NEON 1ch u8 only), pyrDown, pyrUp, buildPyramid	—
Geometric (NEON only)	resizeNV, resizeYV12, resizeYUV444, rotateNV, rotateYV12, rotateYUV444	—
Feature Detection	FAST, Harris (detect), Shi-Tomasi (detect) / shiTomasiDetect, ORB (detect + describe), SIFT, SURF, HOG, houghLines / houghLinesP, houghCircles, opticalFlowLK, descriptorMatch (bfMatch / bfMatchBinary and K-NN variants)	harrisResponse, minEigenValResponse
Transform	getPerspectiveTransform, findHomography, warpAffine, warpPerspective	getRotationMatrix2D, getAffineTransform, dltHomography, homographyError, remap, yuvRemap
Math	—	—
Math (NEON only)	dft1d / dft2d / dftReal1d / idftReal1d / mulSpectrums	—
Drawing	—	drawLine, drawRect, drawCircle, putText
Contour Analysis	—	contourArea, arcLength, boundingRect, convexHull, approxPolyDP, minAreaRect, fitEllipse
Utilities	—	cropRect

On Android arm64-v8a, prefer the NEON variant whenever the operator exists under acl::neon::. Concrete speedups over OpenCV vary by operator and image size — see the Performance Whitepaper.
