ACL Pack API Reference

Version: 1.0.3
Platform: Android arm64-v8a · Linux aarch64
Language: C++17
Delivery: Static library (libacl.a) + Headers

Table of Contents

  1. Operator Catalog
  2. Getting Started
  3. Error Codes
  4. Type Definitions
  5. Analysis
  6. Arithmetic
  7. Color Conversion
  8. Filter
  9. Geometric
  10. Feature Detection
  11. Transform
  12. Math
  13. Drawing
  14. Contour Analysis
  15. Utilities
  16. Namespace Availability Matrix

Operator Catalog

A one-page overview of every operator that ACL Pack ships, grouped by category. The tier suffix shows which license unlocks the call:

  • No suffix — Starter tier
  • [Pro] — requires Pro or Business license
  • [Business] — requires Business license

Per-operator signatures and supported types live in the category sections below.

Category | Operators
Filter | GaussianBlur, BoxFilter, Filter2D, SepFilter2D, Sobel, Scharr, Laplacian, Canny, MedianFilter, BilateralFilter [Pro], NLMeansDenoising [Pro], GuidedFilter [Pro], UnsharpMask [Pro], StackBlur, GaborFilter [Pro], Erode / Dilate, EdgePreservingFilter [Business], MergeMertens [Business], Tonemap [Business]
Color | RGB2Gray, BGR↔RGB / BGRA / RGBA (10 channel-swap variants), RGB↔HSV [Pro], RGB↔Lab [Pro], RGB↔YUV (NV21 / YV12 / YUV444, BT601 / 709 / 2020), Bayer demosaic, GammaTransform
Geometric | Resize (NEAREST / LINEAR2D / AREA_AVG / CUBIC4x4), Rotate (0 / 180 / CW90 / CCW90 / FLIP_V / FLIP_H / XPOSE), PyrDown / PyrUp / buildPyramid, NEON resizeYUV / rotateYUV
Arithmetic | AddImg, AbsDiff, AddWeighted, AlphaImgFusion, Multiply, Threshold, AdaptiveThreshold [Pro], Bitwise AND / NOT / XOR, LUT, ConvertScaleAbs, InRange, Normalize, Phase [Pro], Magnitude [Pro], LinearTransform2x2 [Business]
Analysis | Integral, Histogram, BlockAverage, EqualizeHist, CopyMakeBorder, CLAHE [Pro], HistMatch [Pro], MinMaxLoc, Mean, Count, MatchTemplate [Pro], Moments [Pro], FindContours [Pro], ExtractBlockPixels [Business], DistanceTransform [Business], ConnectedComponentLabeling / connectedComponent_8n_dfs [Business]
Feature | FAST, Harris, Shi-Tomasi (+ Detect variant), ORB (detect + detectAndCompute), HOG, HoughLines / HoughLinesP / HoughCircles, OpticalFlowLK, bfMatch / bfMatchBinary / bfKnnMatch / bfKnnMatchBinary (all [Pro]); SIFT, SURF [Business]
Transform | WarpAffine, WarpPerspective, Remap (CPP only), GetAffineTransform, GetPerspectiveTransform, GetRotationMatrix2D, FindHomography [Pro], yuvRemap [Business]
Math (NEON) | DFT (dft1d / dft2d / dftReal1d / idftReal1d), mulSpectrums, getOptimalDFTSize (all [Pro])
Draw (CPP) | drawLine, drawRect, drawCircle, putText (u8 only)
Contour (CPP) | contourArea, arcLength, boundingRect, convexHull, approxPolyDP, minAreaRect, fitEllipse (all [Pro])

Commercial Tier And Type Policy

The customer-facing tier names are Starter, Pro, and Business. Older labels such as Core, Advanced, and Full are not used by the commercial headers or package metadata.

Tier | Operator availability | Image pixel types admitted by the commercial header
Starter | Starter operators | uint8_t
Pro | Starter + Pro operators | uint8_t, uint16_t
Business | Starter + Pro + Business operators | uint8_t, uint16_t, float

Each operator's Types table describes the implementation-level template support. In commercial packages, the actually callable type set is the intersection of that table and the tier type policy above. Unsupported combinations are rejected by the delivered <acl/api.h> at compile time, often through explicitly deleted template specializations. For example, blockAverage<uint16_t> and blockAverage<float> are deleted in Starter; blockAverage<float> is deleted in Pro; Business exposes all three listed pixel types.
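
The compile-time rejection described above can be pictured with a small stand-alone sketch. It does not reproduce the delivered <acl/api.h>: the name blockAverageSketch and its placeholder body are hypothetical, and only the use of a deleted explicit specialization mirrors the text.

```cpp
#include <cstdint>

// Hypothetical stand-in for a tier-restricted operator declaration.
// The primary template accepts any pixel type.
template <class T>
int blockAverageSketch(const T* src, T* dst) {
    if (src == nullptr || dst == nullptr) return -2;  // same value as ACL_ERR_INVAL
    dst[0] = src[0];                                  // placeholder body, not the real operator
    return 0;
}

// A Pro-like header could delete the float specialization. A call to
// blockAverageSketch<float>(...) then fails to compile instead of
// failing at run time.
template <>
int blockAverageSketch<float>(const float* src, float* dst) = delete;
```

With this pattern, uint8_t and uint16_t calls compile and run, while a float call is reported by the compiler as a use of a deleted function.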

The Trial package is separate from the generic acl::* / acl::neon::* API surface. It exposes only two fixed-parameter wrappers under acl::trial: resizeBilinear2xDown_cpp and resizeBilinear2xDown_neon. Trial users should not call the generic namespaces documented for paid tiers.

cpp
namespace acl::trial {
int resizeBilinear2xDown_cpp(const uint8_t* srcImage, uint8_t* dstImage);
int resizeBilinear2xDown_neon(const uint8_t* srcImage, uint8_t* dstImage);
}

These Trial wrappers use the fixed Trial input size 1920x1280; the 2x downscale wrapper writes 960x640.

Getting Started

Initialization

cpp
#include <acl/acl.h>

// Initialize with license file
int result = acl::init("/path/to/license.dat");

// Check the result
if (result == 0) {
    // Success — all operators available
} else {
    // Failure — see error codes below
}

// Get library version
const char* ver = acl::version();  // "1.0.3"

License Initialization

cpp
namespace acl {
    int init(const char* licensePath, JNIEnv* env = nullptr, jobject context = nullptr);
    const char* version();   // returns "1.0.3"
}

Parameter | Type | Description
licensePath | const char* | Absolute path to license.dat on the device
env | JNIEnv* | JNI environment, optional. Reserved in the ABI for future use; the current implementation does not read it. Pass nullptr for pure native calls
context | jobject | Android Context, same as env

init() reads the license file and verifies its integrity and tier. The version you received continues to work within the scope of your purchase agreement. Call init() once at process start, before invoking any operator; a later call in the same process returns the same status without re-reading the file.

Return values:

  • 0 — success
  • -1001 — License invalid (file missing, signature corrupted, tampered with, or init() not called)
  • -1005 — Tier mismatch: license.tier does not match the compiled library tier (at init stage), or the operator is not in the current tier (at runtime)
  • -1006 — Resolution does not match the Trial fixed size (1920×1280)

Architecture

ACL Pack provides two parallel implementations:

  • acl::{module}::* — Portable C++ scalar implementation. Many operators are templated for the standard image pixel types (uint8_t / uint16_t / float; see each operator's description for exact combinations). All scalar operators sit directly under acl::{module}:: (no cpp segment). All declarations ship in a single header — <acl/api.h>.
  • acl::neon::{module}::* — ARM NEON hand-vectorized implementation. Most operators target uint8_t first; uint16_t and float support is operator-dependent and may use scalar fallback. Typical speedup is 2-25× over the scalar layer, peaking at 50×+.

The two API signatures are almost identical. On Android arm64-v8a the neon:: version is recommended; if a given entry point is only provided in scalar (the docs will note this explicitly), use the corresponding acl::{module}:: entry point.

short / int16_t, int, int64_t, and double are supported in specific auxiliary roles such as gradient outputs, labels, integral accumulators, transform matrices, moments, and parameters. They are not general image pixel input types. <acl/typeDef.h> defines shared public structs/enums; it is not a guarantee that every enum value or datatype is implemented by every operator.

Conventions

  • Image data is passed as raw pointers (const T* input, T* output)
  • stride is in bytes (not pixels). When 0 is passed it is auto-computed as width × channels × sizeof(T) (requires contiguous memory)
  • cn is the channel count (1 = grayscale, 3 = RGB, 4 = RGBA)
  • Returns 0 (ACL_OK) on success; negative values are error codes
  • Memory is allocated and freed by the caller; the library has zero implicit allocation
  • Most operators do not support inplace (src == dst) and require a separately allocated output buffer; the few that support inplace are noted at the operator
  • Operator calls require acl::init() to have returned 0; otherwise they return -1001 without performing any computation
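
The stride convention can be sketched as follows. Both helper names (effectiveStrideBytes, pixelPtr) are hypothetical and not part of the library; the sketch only restates the rule that a stride of 0 means width × channels × sizeof(T) bytes per row, and shows how a byte stride is used to address an interleaved pixel.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: resolves the row stride in bytes.
// A stride of 0 means tightly packed rows.
template <class T>
std::size_t effectiveStrideBytes(int width, int cn, int stride) {
    if (stride != 0) return static_cast<std::size_t>(stride);
    return static_cast<std::size_t>(width) * cn * sizeof(T);
}

// Hypothetical helper: address of channel c of pixel (x, y) in an
// interleaved image. The row offset is applied in bytes, the column
// offset in elements.
template <class T>
const T* pixelPtr(const T* image, int x, int y, int c, int cn,
                  std::size_t strideBytes) {
    const auto* row = reinterpret_cast<const unsigned char*>(image) +
                      static_cast<std::size_t>(y) * strideBytes;
    return reinterpret_cast<const T*>(row) + x * cn + c;
}
```

For a 1920-pixel-wide RGB uint8_t image, effectiveStrideBytes<uint8_t>(1920, 3, 0) gives 5760 bytes; a padded buffer passes its real row pitch instead of 0.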

Error Codes

cpp
#include <acl/err.h>

Code | Macro | Description
0 | ACL_OK | Operation completed successfully
-1 | ACL_ERR_GENERIC | Unclassified failure
-2 | ACL_ERR_INVAL | Invalid parameter (null ptr, zero size, out-of-range enum)
-3 | ACL_ERR_NOMEM | Out of memory / allocation failed
-4 | ACL_ERR_NOSUP | Unsupported type / parameter combination
-5 | ACL_ERR_IO | File / port open or I/O failure
-1001 | ACL_ERR_LICENSE_INVALID | License file missing, corrupt, tampered with, or acl::init() has not yet succeeded
-1005 | ACL_ERR_NOT_LICENSED | Tier mismatch: license.tier does not match the compiled library tier (detected by acl::init), or the requested operator is not available in the current tier (detected at call site)
-1006 | ACL_ERR_RESOLUTION_LIMIT | Resolution does not match the Trial fixed size (1920×1280)

-1002, -1003, and -1004 are reserved in the ABI but never returned at runtime. All macros are defined in <acl/err.h>. See License Guide for detail on how -1001 / -1005 are raised from the license layer.
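
For logging, the numeric codes can be turned into their macro names with a small lookup. The function aclErrorName is a hypothetical helper, not something <acl/err.h> is known to provide; it only encodes the values listed in the table.

```cpp
#include <string>

// Hypothetical helper (not part of <acl/err.h>): maps a return code
// to the macro name listed in the error-code table.
inline std::string aclErrorName(int code) {
    switch (code) {
        case 0:     return "ACL_OK";
        case -1:    return "ACL_ERR_GENERIC";
        case -2:    return "ACL_ERR_INVAL";
        case -3:    return "ACL_ERR_NOMEM";
        case -4:    return "ACL_ERR_NOSUP";
        case -5:    return "ACL_ERR_IO";
        case -1001: return "ACL_ERR_LICENSE_INVALID";
        case -1005: return "ACL_ERR_NOT_LICENSED";
        case -1006: return "ACL_ERR_RESOLUTION_LIMIT";
        case -1002:
        case -1003:
        case -1004: return "reserved";
        default:    return "unknown";
    }
}
```

A caller might log aclErrorName(rc) whenever an operator returns a negative value.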

Type Definitions

cpp
#include <acl/typeDef.h>

Enums

RotateOrient

cpp
enum class RotateOrient {
    ROT_0,      // No rotation (copy)
    ROT_180,    // 180-degree rotation
    ROT_CW_90,  // Clockwise 90 degrees
    ROT_CCW_90, // Counter-clockwise 90 degrees
    FLIP_V,     // Vertical mirror (flip top-bottom)
    FLIP_H,     // Horizontal mirror (flip left-right)
    XPOSE       // Matrix transpose
};

InterpMode

cpp
enum class InterpMode {
    NEAREST,    // Nearest neighbor
    LINEAR2D,   // Bilinear interpolation
    AREA_AVG,   // Area-average (for downscaling)
    CUBIC4x4    // Bicubic (4x4 neighborhood)
};

BorderType

Border handling modes — how to fill out-of-bounds pixels when the kernel extends past the image. Example sequence abcdefgh (input):

cpp
enum class BorderType {
    BORDER_CONSTANT,    // 'iiiiii|abcdefgh|iiiiii' — fill with the constant parameter
    BORDER_REPLICATE,   // 'aaaaaa|abcdefgh|hhhhhh' — replicate edge pixels
    BORDER_REFLECT,     // 'fedcba|abcdefgh|hgfedc' — reflection including the edge pixel
    BORDER_WRAP,        // 'cdefgh|abcdefgh|abcdef' — wrap-around
    BORDER_REFLECT_101, // 'gfedcb|abcdefgh|gfedcb' — reflection excluding the edge pixel
    BORDER_DEFAULT = BORDER_REFLECT_101
};
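
The padding patterns in the comments can be reproduced with a small index-mapping sketch. The names BorderSketch, borderIndex, and padRow are hypothetical; BORDER_CONSTANT is left out because it fills with a value rather than mapping to a source index. The sketch assumes the padding is no wider than the row.

```cpp
#include <string>

enum class BorderSketch { Replicate, Reflect, Wrap, Reflect101 };

// Maps coordinate p onto [0, len). Assumes len > 1 and that p lies
// within one row length of the valid range.
int borderIndex(int p, int len, BorderSketch mode) {
    if (p >= 0 && p < len) return p;
    switch (mode) {
        case BorderSketch::Replicate:  return p < 0 ? 0 : len - 1;
        case BorderSketch::Reflect:    return p < 0 ? -p - 1 : 2 * len - 1 - p;
        case BorderSketch::Wrap:       return p < 0 ? p + len : p - len;
        case BorderSketch::Reflect101: return p < 0 ? -p : 2 * len - 2 - p;
    }
    return 0;
}

// Builds the padded row so it can be compared with the enum comments.
std::string padRow(const std::string& row, int pad, BorderSketch mode) {
    std::string out;
    const int len = static_cast<int>(row.size());
    for (int p = -pad; p < len + pad; ++p)
        out += row[borderIndex(p, len, mode)];
    return out;
}
```

Calling padRow("abcdefgh", 6, BorderSketch::Reflect101) yields the row "gfedcb" + "abcdefgh" + "gfedcb", matching the BORDER_REFLECT_101 comment.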

BayerPattern

cpp
enum class BayerPattern { RGGB, GRBG, GBRG, BGGR };

ColorCvtGrayMode

cpp
enum class ColorCvtGrayMode {
    GRAY_LUMA,     // BT.601 luma (0.299R + 0.587G + 0.114B)
    GRAY_MAX,      // Per-pixel max of R, G, B
    GRAY_MIN,      // Per-pixel min of R, G, B
    GRAY_AVG,      // Simple average (R + G + B) / 3
    GRAY_WEIGHTED  // User-supplied weights (cR * R + cG * G + cB * B)
};

ThreshMode

cpp
enum class ThreshMode {
    THRESH_BINARY,     // dst = (src > thresh) ? maxVal : 0
    THRESH_BINARY_INV, // dst = (src > thresh) ? 0 : maxVal
    THRESH_TRUNC,      // dst = (src > thresh) ? thresh : src
    THRESH_TOZERO,     // dst = (src > thresh) ? src : 0
    THRESH_TOZERO_INV, // dst = (src > thresh) ? 0 : src
    THRESH_OTSU        // Automatic threshold (Otsu's method)
};
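
The five fixed-threshold formulas in the comments translate directly into a per-pixel function. This is a sketch for uint8_t only; ThreshSketch and thresholdPixel are hypothetical names, and THRESH_OTSU is omitted because the enum comment does not give its formula.

```cpp
#include <cstdint>

enum class ThreshSketch { Binary, BinaryInv, Trunc, ToZero, ToZeroInv };

// Applies one of the fixed-threshold formulas to a single pixel.
inline uint8_t thresholdPixel(uint8_t src, uint8_t thresh, uint8_t maxVal,
                              ThreshSketch mode) {
    const bool above = src > thresh;
    switch (mode) {
        case ThreshSketch::Binary:    return above ? maxVal : 0;
        case ThreshSketch::BinaryInv: return above ? 0 : maxVal;
        case ThreshSketch::Trunc:     return above ? thresh : src;
        case ThreshSketch::ToZero:    return above ? src : 0;
        case ThreshSketch::ToZeroInv: return above ? 0 : src;
    }
    return src;
}
```

Note that the comparison is strict: a pixel equal to thresh is treated as "not above" in every mode.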

YUVEncodeStandard

cpp
enum class YUVEncodeStandard {
    STD_BT601,    // ITU-R BT.601 (SDTV)
    STD_BT709,    // ITU-R BT.709 (HDTV)
    STD_BT2020,   // ITU-R BT.2020 (UHDTV)
    STD_CUSTOM    // Caller-supplied 3x3 conversion matrix
};

NormType

cpp
enum class NormType { NORM_INF, NORM_L1, NORM_L2, NORM_MINMAX };

AdaptiveThreshMethod

cpp
enum class AdaptiveThreshMethod {
    ADAPTIVE_THRESH_MEAN_C,     // Mean within block
    ADAPTIVE_THRESH_GAUSSIAN_C  // Gaussian-weighted mean within block
};

MorphOp

cpp
enum class MorphOp {
    ERODE,   // Erosion (take minimum over kernel coverage)
    DILATE   // Dilation (take maximum over kernel coverage)
};

ValueRange

Used to specify the output value range (e.g. by normalize):

cpp
enum class ValueRange {
    STD_NEG1_TO_POS1,    // Result falls in [-1, 1]
    UNIT_INTERVAL,       // Result falls in [0, 1]
    NATIVE_FULL_SCALE    // Result falls in the full positive range of the output type (e.g. u8 → [0, 255])
};

TemplateMatchMethod

cpp
enum class TemplateMatchMethod {
    TM_SQDIFF, TM_SQDIFF_NORMED,
    TM_CCORR,  TM_CCORR_NORMED,
    TM_CCOEFF, TM_CCOEFF_NORMED
};

DftFlags

cpp
enum DftFlags { DFT_FORWARD = 0, DFT_INVERSE = 1, DFT_SCALE = 2 };

Data Structures (Geometry / Hough / Features / Color / Contour)

These structs are passed to operators as parameter bundles or returned as result containers. Grouped by purpose:

  • Geometry primitives: Point2f, Point2i, Size2f, RotatedRect (used by minAreaRect / fitEllipse / findContours)
  • Hough results: Vec2f, Vec3f, Vec4i (output formats for houghLines / houghCircles / houghLinesP)
  • Features & Matching: KeyPoint, KeyPointORB, KeyPointExt, DMatch, HOGParams (Harris / FAST / ORB / SIFT / SURF / HOG / bfMatch / bfKnnMatch)
  • Color conversion: YUVConvertParams (a single bundle that drives every rgb2YUV / yuv2RGB operator)
  • Contour & Moments: HierarchyEntry, Moments (output of findContours / moments)

Common top-level acl:: namespace (<acl/typeDef.h>):

cpp
namespace acl {
    // 2D floating-point point (e.g. RotatedRect center)
    struct Point2f {
        float x, y;
        Point2f();
        Point2f(float x, float y);
    };

    // 2D floating-point size (e.g. RotatedRect size)
    struct Size2f {
        float width, height;
        Size2f();
        Size2f(float w, float h);
    };

    // Rotated rectangle — used by minAreaRect / fitEllipse
    struct RotatedRect {
        Point2f center;
        Size2f  size;       // Note: the order of width/height is determined by minAreaRect; the reader should not assume a size ordering
        float   angle;      // Rotation angle (degrees)
        RotatedRect();
        RotatedRect(Point2f c, Size2f s, float a);
    };

    // Descriptor match result — used by bfMatch / bfKnnMatch
    struct DMatch {
        int   queryIdx;     // Query descriptor index (default -1)
        int   trainIdx;     // Train descriptor index (default -1)
        float distance;     // Descriptor distance
        DMatch();
        DMatch(int q, int t, float d);
        bool operator<(const DMatch&) const;   // Sorted by distance
    };

    // Small vector types — used by houghLines / houghCircles / houghLinesP
    struct Vec2f { float val[2]; };   // (rho, theta) — houghLines
    struct Vec3f { float val[3]; };   // (cx, cy, radius) — houghCircles
    struct Vec4i { int   val[4]; };   // (x1, y1, x2, y2) — houghLinesP line segment

    // YUV ↔ RGB conversion parameter bundle — used by every rgb2YUV / yuv2RGB operator
    struct YUVConvertParams {
        YUVEncodeStandard yuv_std        = YUVEncodeStandard::STD_BT601;
        bool              yuv444_fmt     = true;   // true = 4:4:4 packed, false = 4:4:4 planar (only for the 4:4:4 ops)
        bool              nv21_fmt       = true;   // true = NV21 (V before U), false = NV12 (U before V) (only for the NV-series ops)
        bool              rgb_fmt        = true;   // true = RGB, false = BGR
        int               bit_depth      = 0;      // 0 = auto-detect from the pixel type (u8 = 8, u16 = 16)
        int               shift_right    = 0;      // optional output bit-shift (used by 16-bit pipelines)
        int               shift_left     = 0;      // optional input bit-shift (used by 16-bit pipelines)
        bool              yuv_full_range = true;   // true = full range (0–255), false = limited range (16–235)
        bool              rgb_full_range = true;   // true = full range, false = limited range
    };
}

The defaults match the common BT.601 8-bit full-range RGB pipeline. Pass an empty {} to take all defaults, or override only the fields that differ from your pipeline. See the Color Conversion section for examples.

All public structs and enums live in the single top-level acl:: namespace (no sub-namespaces):

cpp
namespace acl {
    // ── Geometry ────────────────────────────────────────────
    struct Point2i { int x, y; };

    // ── Contour / Moments ───────────────────────────────────
    struct HierarchyEntry { int next, prev, first_child, parent; };   // default -1
    struct Moments { double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03; };
    enum DistanceType { DIST_L1 = 1, DIST_L2 = 2, DIST_LINF = 3 };
    enum ContourRetrMode {
        CONTOUR_RETR_EXTERNAL = 0,
        CONTOUR_RETR_LIST     = 1,
        CONTOUR_RETR_CCOMP    = 2,
        CONTOUR_RETR_TREE     = 3
    };
    enum ContourApproxMethod {
        CONTOUR_CHAIN_APPROX_NONE   = 1,
        CONTOUR_CHAIN_APPROX_SIMPLE = 2
    };

    // ── Features ────────────────────────────────────────────
    struct KeyPoint { int x, y; float response; };
    struct KeyPointORB { float x, y, response, scale, angle; uint8_t descriptor[32]; };
    struct KeyPointExt { float x, y, response, scale, angle; float descriptor[128]; };
    struct HOGParams {
        int cellSize, blockSize, nbins, blockStride;
        // Defaults: cellSize=8, blockSize=2, nbins=9, blockStride=1
    };

    // ── Filter mode tags ────────────────────────────────────
    enum EdgePreservingType {
        EDGE_PRESERVING_RECURSIVE = 1,  // Fast
        EDGE_PRESERVING_NORMCONV  = 2   // High quality
    };
}

Analysis

Namespace: acl::analysis (CPP) / acl::neon::analysis (NEON)

integral

Integral image (summed-area table). Because the output carries a leading zero row and column, I(x, y) = ∑ src(i, j) over 0 ≤ i < x, 0 ≤ j < y. Used for O(1) rectangular region sums (boxFilter, Haar features, etc.).

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
SrcType | uint8_t, uint16_t, float |
IntegralType | int32_t, int64_t, double | sizeof(IntegralType) ≥ sizeof(SrcType)

CPP Version

cpp
template<class SrcType, class IntegralType>
int integral(
    const SrcType* srcImage, IntegralType* integral,
    int width, int height,
    int srcStride = 0);

Parameter | Type | Meaning | Default
srcImage | const SrcType* | Input image (single channel) | non-null
integral | IntegralType* | Output integral image, size (width+1) × (height+1) | non-null
width, height | int | Input image size | > 0
srcStride | int | Bytes per row | 0 = auto

Row 0 and column 0 of integral are always 0 (implementation sentinel row and column to simplify boundary queries).
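
The sentinel layout is easiest to see in a scalar reference. The sketch below is not the library's implementation; integralSketch and rectSumSketch are hypothetical names, rows are assumed tightly packed, and int64_t is used as the accumulator type.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference: the output has (width+1) x (height+1) elements,
// and row 0 and column 0 stay at zero.
std::vector<int64_t> integralSketch(const std::vector<uint8_t>& src,
                                    int width, int height) {
    const int W = width + 1;
    std::vector<int64_t> out(static_cast<std::size_t>(W) * (height + 1), 0);
    for (int y = 0; y < height; ++y) {
        int64_t rowSum = 0;  // running sum of the current source row
        for (int x = 0; x < width; ++x) {
            rowSum += src[y * width + x];
            out[(y + 1) * W + (x + 1)] = out[y * W + (x + 1)] + rowSum;
        }
    }
    return out;
}

// Sum over the inclusive rectangle (x0,y0)-(x1,y1) with four lookups.
int64_t rectSumSketch(const std::vector<int64_t>& integ, int width,
                      int x0, int y0, int x1, int y1) {
    const int W = width + 1;
    return integ[(y1 + 1) * W + (x1 + 1)] - integ[(y1 + 1) * W + x0]
         - integ[y0 * W + (x1 + 1)] + integ[y0 * W + x0];
}
```

Thanks to the zero row and column, rectSumSketch needs no special case for rectangles that touch the top or left image edge.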


NEON Version

cpp
template<class SrcType, class IntegralType>
int integral(
    const SrcType* srcImage, IntegralType* integralImage,
    int width, int height,
    int srcStride = 0);

Example

cpp
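// Needs <cstdint> and <vector>. Heap-allocate the buffers: about 2 MB + 8 MB
// as local arrays can overflow the thread stack.
std::vector<uint8_t> srcImage(1920 * 1080);
std::vector<int32_t> integ((1920 + 1) * (1080 + 1));

int rc = acl::neon::analysis::integral<uint8_t, int32_t>(
    srcImage.data(), integ.data(), 1920, 1080);
// integ holds valid data only when rc == 0 (ACL_OK)

// O(1) sum over the inclusive rectangle (x0,y0)-(x1,y1)
auto rectSum = [&](int x0, int y0, int x1, int y1) {
    const int W = 1920 + 1;   // integral row length in elements
    return integ[(y1+1)*W + (x1+1)]
         - integ[(y1+1)*W + x0]
         - integ[y0*W + (x1+1)]
         + integ[y0*W + x0];
};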

histogram

Compute the pixel-value histogram.

Tier: Starter+
Channels: 1ch (packed via runtime hcn × vcn parameters when needed)
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
ST | {uint8_t, uint16_t} |
HT | {int, long} (histogram bin count type) |

CPP Version

cpp
template<class ST, class HT>
int histogram(
    const ST* srcImage, HT* hist,
    int width, int height,
    int srcStride, int histLen,
    int hcn = 1, int vcn = 1);

Parameter | Type | Meaning | Default
srcImage | const ST* | Input image | non-null
hist | HT* | Output histogram (length histLen, zeroed by the caller) | non-null
srcStride | int | Bytes per row | 0 = auto
histLen | int | Number of histogram bins (u8 → 256, u16 → 65536) |
hcn, vcn | int | Horizontal / vertical channel packing | 1, 1

NEON Version (uint8_t only, fixed histLen = 256)

cpp
int histogram(
    const uint8_t* srcImage, int* hist,
    int width, int height,
    int srcStride = 0);

hist has a fixed length of 256 int bins; for non-uint8_t or non-256 bins, use the CPP version.
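
As a point of comparison for the NEON entry point, a scalar 256-bin histogram over a strided uint8_t image looks roughly like this. The function histogramSketch is a hypothetical reference, not the library's code; unlike the library call it allocates and zeroes its own output.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar sketch: 256-bin histogram of a single-channel uint8_t image.
// strideBytes == 0 is treated as tightly packed rows.
std::vector<int> histogramSketch(const uint8_t* src, int width, int height,
                                 int strideBytes = 0) {
    const std::size_t stride = strideBytes != 0 ? strideBytes : width;
    std::vector<int> hist(256, 0);
    for (int y = 0; y < height; ++y) {
        const uint8_t* row = src + y * stride;  // skip any row padding
        for (int x = 0; x < width; ++x) ++hist[row[x]];
    }
    return hist;
}
```

Only the first width bytes of each row are counted, so padding bytes at the end of a row never reach the bins.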


histMatch

Histogram matching (also called histogram specification): adjusts the pixel distribution of src so that it matches the histogram of ref.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
ST | uint8_t, uint16_t | all params must be the same type
DT | uint8_t, uint16_t | all params must be the same type
RT | uint8_t, uint16_t | all params must be the same type

CPP Signature

cpp
template<class ST, class DT, class RT>
int histMatch(
    const ST* srcImage, DT* dstImage, const RT* refImage,
    int width, int height,
    int srcStride, int dstStride, int refStride,
    int srcHistLen, int refHistLen,
    double MATCH_TH = 0.0,
    int hcn = 1, int vcn = 1);

Parameter | Type | Meaning | Default
srcImage, dstImage | const ST* / DT* | Input / output | non-null
refImage | const RT* | Reference image (its histogram is used as the target distribution) | non-null
srcHistLen, refHistLen | int | Source / reference bin counts | u8: 256
MATCH_TH | double | Match tolerance threshold [0, 1] | 0.0
hcn, vcn | int | Horizontal / vertical channel packing | 1, 1

equalizeHist

Histogram equalization.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

Template parameter | Allowed types | Constraint
T (CPP) | uint8_t, uint16_t |
T (NEON) | uint8_t | NEON-only

CPP Version

cpp
template<class T>
int equalizeHist(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0);

NEON Version (uint8_t only)

cpp
int equalizeHist(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0);
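
For orientation, the textbook form of 8-bit histogram equalization builds a lookup table from the cumulative histogram. The sketch below uses that textbook formula; whether the library rounds or handles constant images the same way is an assumption, and equalizeLutSketch is a hypothetical name.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the equalization lookup table for a set of 8-bit pixels:
// lut[v] = round(255 * (cdf(v) - cdfMin) / (total - cdfMin)).
std::array<uint8_t, 256> equalizeLutSketch(const std::vector<uint8_t>& pixels) {
    std::array<std::size_t, 256> hist{};
    for (uint8_t p : pixels) ++hist[p];

    std::array<uint8_t, 256> lut{};
    const std::size_t total = pixels.size();
    std::size_t cdf = 0, cdfMin = 0;
    bool seen = false;  // becomes true at the first occupied level
    for (int v = 0; v < 256; ++v) {
        cdf += hist[v];
        if (!seen) {
            if (hist[v] == 0) continue;  // unused low levels stay 0
            cdfMin = cdf;
            seen = true;
        }
        if (total == cdfMin) {           // constant image: keep the value
            lut[v] = static_cast<uint8_t>(v);
            continue;
        }
        lut[v] = static_cast<uint8_t>(
            std::lround(255.0 * (cdf - cdfMin) / (total - cdfMin)));
    }
    return lut;
}
```

Applying the table is then a single pass, dst[i] = lut[src[i]], which is also why the operation can run in place.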

clahe

Contrast Limited Adaptive Histogram Equalization. The image is divided into tilesX × tilesY tiles; each tile is equalized with its histogram clipped at clipLimit, and the per-tile results are blended by bilinear interpolation. This avoids the excessive contrast that global equalization can produce.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T | uint8_t |

CPP / NEON Signature (identical)

cpp
int clahe(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    double clipLimit = 40.0,
    int tilesX = 8, int tilesY = 8);

Parameter | Type | Meaning | Default
srcImage, dstImage | const uint8_t* / uint8_t* | Input / output | non-null
width, height | int | Image size | must satisfy width ≥ tilesX, height ≥ tilesY
srcStride, dstStride | int | Bytes per row | 0 = auto
clipLimit | double | Contrast upper bound (higher = stronger contrast) | 40.0 (OpenCV default)
tilesX, tilesY | int | Horizontal / vertical tile count | 8 × 8

minMaxLoc

Find the minimum / maximum value in the image and their locations.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T | uint8_t, uint16_t, float |

CPP / NEON Signature (identical)

cpp
template<class T>
int minMaxLoc(
    const T* srcImage, int width, int height, int srcStride,
    T* minVal, T* maxVal,
    int* minLocX, int* minLocY,
    int* maxLocX, int* maxLocY);

Parameter | Type | Meaning | Default
srcImage | const T* | Input image | non-null
srcStride | int | Bytes per row | 0 = auto
minVal, maxVal | T* | Output min / max values (may be nullptr) |
minLocX, minLocY, maxLocX, maxLocY | int* | Output corresponding coordinates (may be nullptr) |

When any output pointer is nullptr, the corresponding result is skipped.
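
The optional-output behavior can be sketched with a scalar reference. This is not the library's code: minMaxLocSketch is a hypothetical name, rows are assumed tightly packed, and keeping the first occurrence on ties is an assumption.

```cpp
// Scalar sketch with optional outputs: any null output pointer is skipped.
template <class T>
int minMaxLocSketch(const T* src, int width, int height,
                    T* minVal, T* maxVal,
                    int* minLocX, int* minLocY,
                    int* maxLocX, int* maxLocY) {
    if (src == nullptr || width <= 0 || height <= 0) return -2;  // ACL_ERR_INVAL value
    int minIdx = 0, maxIdx = 0;
    for (int i = 1; i < width * height; ++i) {
        if (src[i] < src[minIdx]) minIdx = i;  // strict: first occurrence wins
        if (src[i] > src[maxIdx]) maxIdx = i;
    }
    if (minVal)  *minVal  = src[minIdx];
    if (maxVal)  *maxVal  = src[maxIdx];
    if (minLocX) *minLocX = minIdx % width;
    if (minLocY) *minLocY = minIdx / width;
    if (maxLocX) *maxLocX = maxIdx % width;
    if (maxLocY) *maxLocY = maxIdx / width;
    return 0;
}
```

A caller that only needs the peak position passes nullptr for the minimum outputs and for maxVal.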


moments

Spatial moments of orders 0 to 3 for a single-channel image. Outputs a Moments struct containing the 10 raw moments as double: m00, m10, m01, m20, m11, m02, m30, m21, m12, m03.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T (CPP) | uint8_t, uint16_t, float |

CPP Signature

cpp
struct Moments {
    double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03;
};

template<class T>
int moments(
    const T* srcImage, int width, int height, int srcStride,
    Moments& m,
    bool binaryImage = false);

Parameter | Type | Meaning | Default
srcImage | const T* | Input image | non-null
m | Moments& | Output moments struct | filled by the function
binaryImage | bool | true = treat all non-zero pixels as 1 (binary evaluation) | false
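
A common use of the raw moments is the intensity centroid, (m10 / m00, m01 / m00). The sketch below computes only the three moments needed for that, using the standard definition m_pq = ∑ x^p · y^q · src(x, y); MomentsSketch and rawMomentsSketch are hypothetical names and rows are assumed tightly packed.

```cpp
#include <cstdint>

struct MomentsSketch { double m00 = 0, m10 = 0, m01 = 0; };

// Accumulates m00, m10, m01 over a single-channel uint8_t image.
// With binaryImage == true every non-zero pixel counts as 1.
MomentsSketch rawMomentsSketch(const uint8_t* src, int width, int height,
                               bool binaryImage) {
    MomentsSketch m;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double v = src[y * width + x];
            if (binaryImage && v != 0) v = 1;
            m.m00 += v;
            m.m10 += x * v;
            m.m01 += y * v;
        }
    }
    return m;
}
```

When m00 is non-zero, the centroid is (m.m10 / m.m00, m.m01 / m.m00); an all-zero image has no defined centroid and must be checked by the caller.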

copyMakeBorder

Add a border around the image, supporting multiple border modes. Typical use: padding before convolution.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T | uint8_t, uint16_t, float |

BorderType: BORDER_REPLICATE / BORDER_REFLECT / BORDER_REFLECT_101 / BORDER_WRAP / BORDER_CONSTANT, etc.

CPP Version

cpp
template<class T>
int copyMakeBorder(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int channelNum,
    int srcStride, int dstStride,
    int top, int bottom, int left, int right,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter | Type | Meaning | Default
srcImage | const T* | Input image | non-null
dstImage | T* | Output image (size (srcWidth + left + right) × (srcHeight + top + bottom)) | non-null
channelNum | int | Channel count |
top, bottom, left, right | int | Padding width in each of the four directions | ≥ 0
constant | const T* | BORDER_CONSTANT fill-value array (length channelNum) | nullptr
bt | acl::BorderType | Border-handling mode | BORDER_REFLECT_101

NEON Version

cpp
template<class T>
int copyMakeBorder(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int channelNum,
    int srcStride = 0, int dstStride = 0,
    int top = 0, int bottom = 0, int left = 0, int right = 0,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

count

Count the number of pixels satisfying a condition.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T | uint8_t, uint16_t, float |

CPP Signature (3 entry points)

cpp
// == threshold
template<class T>
int countEQ(const T* srcImage, int width, int height, int stride, const T& threshold);

// <= threshold
template<class T>
int countLET(const T* srcImage, int width, int height, int stride, const T& threshold);

// < threshold
template<class T>
int countLT(const T* srcImage, int width, int height, int stride, const T& threshold);

The return value is the count (not an error code); srcImage == nullptr returns 0.
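
The three comparisons differ only in the predicate, which a scalar sketch makes explicit. The names below (countIfSketch and the three wrappers) are hypothetical and rows are assumed tightly packed; the nullptr behavior follows the note that a null input yields a count of 0.

```cpp
#include <cstdint>

// Shared loop: counts the pixels for which pred(pixel) is true.
template <class T, class Pred>
int countIfSketch(const T* src, int width, int height, Pred pred) {
    if (src == nullptr) return 0;  // null input yields a count of 0
    int n = 0;
    for (int i = 0; i < width * height; ++i)
        if (pred(src[i])) ++n;
    return n;
}

template <class T>
int countEQSketch(const T* src, int width, int height, const T& threshold) {
    return countIfSketch(src, width, height,
                         [&](const T& p) { return p == threshold; });
}

template <class T>
int countLETSketch(const T* src, int width, int height, const T& threshold) {
    return countIfSketch(src, width, height,
                         [&](const T& p) { return p <= threshold; });
}

template <class T>
int countLTSketch(const T* src, int width, int height, const T& threshold) {
    return countIfSketch(src, width, height,
                         [&](const T& p) { return p < threshold; });
}
```

Because the result is a count, a return value of 0 can mean either "no pixel matched" or "null input", so callers should validate the pointer themselves.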


mean

Per-pixel mean of two or N images: dst[i] = (A[i] + B[i] + …) / N.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
AT | uint8_t, uint16_t, float |
BT | uint8_t, uint16_t, float |
DT | uint8_t, uint16_t, float |

CPP Signature (two overloads: 2-image + N-image)

cpp
// 2-image
template<class AT, class BT, class DT>
int mean(
    const AT* src1Image, const BT* src2Image, DT* dstImage,
    int width, int height, int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

// N-image
template<class ST, class DT>
int mean(
    const ST* const* srcImages, int srcNum, DT* dstImage,
    int width, int height, int cn = 1,
    int srcStride = 0, int dstStride = 0);
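
The N-image overload takes an array of image pointers. A scalar sketch of that shape for uint8_t follows; meanOfImagesSketch is a hypothetical name, and rounding to nearest is an assumption, since the rounding rule of the library is not stated here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-pixel mean over N tightly packed uint8_t images of equal size.
std::vector<uint8_t> meanOfImagesSketch(
        const std::vector<const uint8_t*>& srcImages, std::size_t pixelCount) {
    std::vector<uint8_t> dst(pixelCount, 0);
    const unsigned n = static_cast<unsigned>(srcImages.size());
    if (n == 0) return dst;
    for (std::size_t i = 0; i < pixelCount; ++i) {
        unsigned sum = 0;  // wide accumulator, so 255 * N does not overflow 8 bits
        for (const uint8_t* img : srcImages) sum += img[i];
        dst[i] = static_cast<uint8_t>((sum + n / 2) / n);  // round to nearest (assumed)
    }
    return dst;
}
```

The accumulator is wider than the pixel type, which is the reason a mean operator can accept inputs near the top of the range without saturating.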

matchTemplate

Template matching (6 similarity metrics). FFT-accelerated; suitable for large image × small template.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T (CPP) | {uint8_t, float} |
T (NEON) | uint8_t | NEON-only

TemplateMatchMethod:

  • TM_SQDIFF — sum of squared differences
  • TM_SQDIFF_NORMED — normalized squared differences
  • TM_CCORR — cross-correlation
  • TM_CCORR_NORMED — normalized cross-correlation
  • TM_CCOEFF — correlation coefficient
  • TM_CCOEFF_NORMED — normalized correlation coefficient

CPP Version

cpp
template<class T>
int matchTemplate(
    const T* srcImage, int srcW, int srcH, int srcStride,
    const T* templ, int templW, int templH, int templStride,
    float* result, int resultStride,
    acl::TemplateMatchMethod tm = acl::TemplateMatchMethod::TM_SQDIFF);

NEON Version (uint8_t only)

cpp
template<class T>
int matchTemplate(
    const T* srcImage, int srcW, int srcH, int srcStride,
    const T* templ, int templW, int templH, int templStride,
    float* result, int resultStride,
    acl::TemplateMatchMethod tm = acl::TemplateMatchMethod::TM_SQDIFF);

Parameter | Type | Meaning | Default
srcImage, templ | const T* | Search image, template image | non-null, templW ≤ srcW, templH ≤ srcH
result | float* | Output score map, size (srcW - templW + 1) × (srcH - templH + 1) | non-null
*Stride | int | Bytes per row (result counted as float) | 0 = auto
tm | acl::TemplateMatchMethod | Similarity metric | TM_SQDIFF
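
The size of the score map and the meaning of TM_SQDIFF can be checked against a brute-force reference. The sketch below is a direct double loop, not the FFT-accelerated path the library describes; sqdiffSketch is a hypothetical name and all buffers are assumed tightly packed.

```cpp
#include <cstdint>
#include <vector>

// Brute-force sum of squared differences. The result has
// (srcW - templW + 1) x (srcH - templH + 1) entries; lower is a better match.
std::vector<float> sqdiffSketch(const std::vector<uint8_t>& src, int srcW, int srcH,
                                const std::vector<uint8_t>& templ, int templW, int templH) {
    const int resW = srcW - templW + 1;
    const int resH = srcH - templH + 1;
    std::vector<float> result(resW * resH, 0.0f);
    for (int ry = 0; ry < resH; ++ry) {
        for (int rx = 0; rx < resW; ++rx) {
            float acc = 0.0f;
            for (int ty = 0; ty < templH; ++ty) {
                for (int tx = 0; tx < templW; ++tx) {
                    const float d =
                        static_cast<float>(src[(ry + ty) * srcW + (rx + tx)]) -
                        static_cast<float>(templ[ty * templW + tx]);
                    acc += d * d;
                }
            }
            result[ry * resW + rx] = acc;
        }
    }
    return result;
}
```

The best match for TM_SQDIFF is the position of the minimum score, so the output is typically passed to a min/max search.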

connectedComponent_8n_dfs

8-connected component labeling (DFS). Takes a binary image as input; outputs the pixel coordinate list for each connected component.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T | uint8_t, uint16_t |
LabelType | int |

CPP Signature

cpp
template<class T>
int connectedComponent_8n_dfs(
    T* binaryImage, int width, int height, int stride,
    std::vector<std::vector<std::pair<int, int>>>& regions,
    int minArea, int maxArea,
    int frontFlag = 255, int backFlag = 0);

Parameter | Type | Meaning
binaryImage | T* | Input binary image (the algorithm may overwrite pixels as markers)
regions | vector<vector<pair<int, int>>>& | Output list of (x, y) pixel coordinates for each connected component
minArea, maxArea | int | Filter: only retain components with area ∈ [minArea, maxArea]
frontFlag, backFlag | int | Foreground / background pixel value used as DFS markers (defaults 255 / 0)

connectedComponentLabeling

Connected component labeling with a label image (union-find). Outputs a label image (per-pixel label) + a label list sorted by area.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
DataInType | uint8_t, uint16_t |
LabelType | int |

CPP Signature

cpp
template<class DataInType, class LabelType>
int connectedComponentLabeling(
    DataInType* dataIn, LabelType* label,
    std::vector<std::pair<LabelType, int>>& sortLabelHist,
    DataInType threshold,
    int topAreaCnt, int minArea,
    int width, int height,
    int inStride = 0, int labelStride = 0);

Parameter | Type | Meaning
dataIn | DataInType* | Input image (binarized via threshold)
label | LabelType* | Output label image
sortLabelHist | vector<pair<LabelType, int>>& | Output (label, area) list sorted by descending area
threshold | DataInType | Input binarization threshold
topAreaCnt | int | Only retain the topAreaCnt components with the largest area

findContours

Find contours in a binary image (Suzuki-Abe algorithm, OpenCV compatible).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
T | uint8_t |

CPP Signature

cpp
enum ContourRetrMode {
    CONTOUR_RETR_EXTERNAL = 0,  // only the outermost layer
    CONTOUR_RETR_LIST     = 1,  // all contours, no hierarchy
    CONTOUR_RETR_CCOMP    = 2,  // two layers (outer + inner holes)
    CONTOUR_RETR_TREE     = 3   // full hierarchy tree
};

enum ContourApproxMethod {
    CONTOUR_CHAIN_APPROX_NONE   = 1,  // retain all contour points
    CONTOUR_CHAIN_APPROX_SIMPLE = 2   // compress intermediate points on horizontal / vertical / diagonal segments
};

struct Point2i { int x, y; };
struct HierarchyEntry { int next, prev, first_child, parent; };

int findContours(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<std::vector<Point2i>>& contours,
    std::vector<HierarchyEntry>* hierarchy = nullptr,
    ContourRetrMode mode = CONTOUR_RETR_LIST,
    ContourApproxMethod method = CONTOUR_CHAIN_APPROX_SIMPLE,
    int offsetX = 0, int offsetY = 0);

Parameter | Type | Meaning
contours | vector<vector<Point2i>>& | Output contours; each contour is an array of Point2i
hierarchy | vector<HierarchyEntry>* | (optional) hierarchy info [next, prev, first_child, parent]
offsetX, offsetY | int | Offset added to all contour point coordinates

distanceTransform

Distance transform — each pixel outputs the distance to its nearest 0 pixel (L1 / L2 / L∞).

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

Template parameter | Allowed types | Constraint
Input | uint8_t |
Output | float |

DistanceType: DIST_L1 (Manhattan) / DIST_L2 (Euclidean, exact algorithm) / DIST_LINF (chessboard)


CPP Signature (2 entry points: float or u8 output)

cpp
// float output (high precision)
int distanceTransform(
    const uint8_t* srcImage, float* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    DistanceType distType = DIST_L2);

// u8 output (normalized to 0-255, suitable for visualization)
int distanceTransformU8(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    DistanceType distType = DIST_L2);

DIST_L2 uses the Felzenszwalb-Huttenlocher exact Euclidean algorithm (not an approximation).


blockAverage

Downsample by averaging: each U × V block of the source is replaced by its mean. U = V = 2 corresponds to 2×2 averaging.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

Commercial package type availability:

| Tier | Callable T in delivered <acl/api.h> |
|---|---|
| Starter | uint8_t |
| Pro | uint8_t, uint16_t |
| Business | uint8_t, uint16_t, float |

CPP / NEON Signature (identical)

cpp
template<class T>
int blockAverage(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int U, int V,
    int srcStride = 0, int dstStride = 0,
    bool round = true,
    int hcn = 1, int vcn = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| U, V | int | Block horizontal / vertical size | |
| round | bool | true = round to nearest, false = truncate | true |
| hcn, vcn | int | Horizontal / vertical channel packing | 1, 1 |
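
The effect of the round flag is easiest to see in a scalar reference. The sketch below is not the library implementation; `refBlockAverage` is a hypothetical helper, and it assumes 1ch uint8_t data, U as the horizontal and V as the vertical block size, and that partial blocks at the right and bottom edges are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function).
std::vector<uint8_t> refBlockAverage(const std::vector<uint8_t>& src,
                                     int srcWidth, int srcHeight,
                                     int U, int V, bool round = true) {
    assert(U > 0 && V > 0 && srcWidth >= U && srcHeight >= V);
    assert(src.size() == static_cast<std::size_t>(srcWidth) * srcHeight);
    const int dstWidth = srcWidth / U;
    const int dstHeight = srcHeight / V;
    const uint32_t n = static_cast<uint32_t>(U) * static_cast<uint32_t>(V);
    std::vector<uint8_t> dst(static_cast<std::size_t>(dstWidth) * dstHeight);
    for (int by = 0; by < dstHeight; ++by) {
        for (int bx = 0; bx < dstWidth; ++bx) {
            uint32_t sum = 0;
            for (int v = 0; v < V; ++v)
                for (int u = 0; u < U; ++u)
                    sum += src[static_cast<std::size_t>(by * V + v) * srcWidth + bx * U + u];
            // round = true adds half the divisor before the integer division;
            // round = false truncates.
            dst[static_cast<std::size_t>(by) * dstWidth + bx] =
                static_cast<uint8_t>((sum + (round ? n / 2 : 0)) / n);
        }
    }
    return dst;
}
```

For a 2×2 block holding 1, 2, 3, 4 (sum 10), rounding yields 3 and truncation yields 2.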

extractBlockPixels

Extract the pixel at (u, v) from every U × V block (i.e. downsample while specifying the sampling point).

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |

CPP Signature

cpp
template<class ST, class DT>
int extractBlockPixels(
    const ST* srcImage, DT* dstImage,
    int srcWidth, int srcHeight,
    int srcStride, int dstStride,
    int U, int V, int u, int v);

| Parameter | Type | Meaning |
|---|---|---|
| U, V | int | Block size |
| u, v | int | Pixel position sampled from each block (0 ≤ u < U, 0 ≤ v < V); out-of-range values are clamped |
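
As an illustration of the sampling rule, the following scalar sketch picks pixel (u, v) from every block. `refExtractBlockPixels` is a hypothetical name, not a library entry point, and the clamping of u and v to the block is an assumption based on the "auto-clamped" remark above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function).
template <class ST, class DT>
std::vector<DT> refExtractBlockPixels(const std::vector<ST>& src,
                                      int srcWidth, int srcHeight,
                                      int U, int V, int u, int v) {
    assert(U > 0 && V > 0 && srcWidth >= U && srcHeight >= V);
    assert(src.size() == static_cast<std::size_t>(srcWidth) * srcHeight);
    // Assumed clamping: keep the sample position inside the block.
    u = std::clamp(u, 0, U - 1);
    v = std::clamp(v, 0, V - 1);
    const int dstWidth = srcWidth / U;
    const int dstHeight = srcHeight / V;
    std::vector<DT> dst(static_cast<std::size_t>(dstWidth) * dstHeight);
    for (int by = 0; by < dstHeight; ++by)
        for (int bx = 0; bx < dstWidth; ++bx)
            dst[static_cast<std::size_t>(by) * dstWidth + bx] = static_cast<DT>(
                src[static_cast<std::size_t>(by * V + v) * srcWidth + bx * U + u]);
    return dst;
}
```

With U = V = 2 on a Bayer mosaic, the four (u, v) choices would each pull out one color plane, which is one plausible use of the operator.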

Arithmetic

Namespace: acl::arithmetic (CPP) / acl::neon::arithmetic (NEON)

addImg

Per-pixel sum of two images dst = src1 + src2; input operand types and output type are decoupled (AT / BT → DT).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (dst == src1 or dst == src2)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| AT, BT, DT (CPP) | uint8_t, uint16_t, float | |
| T (NEON) | uint8_t, uint16_t | NEON-only |

Variant entry points (CPP):

| Entry point | Purpose |
|---|---|
| addImg | Sum of two images, no saturation |
| addImgClamp | Sum of two images, result clamped to [minValue, maxValue] |
| add_Imgs | N-image accumulation (srcImages[0] + srcImages[1] + …) |
| add_ImgsClamp | N-image accumulation + clamp |
| add_ImgsManual | N-image accumulation with manually specified intermediate accumulation type ACC_T |
| add_ImgsManualClamp | N-image accumulation + manual ACC_T + clamp |

CPP Version

cpp
template<class AT, class BT, class DT>
int addImg(
    const AT* src1Image, const BT* src2Image, DT* dstImage,
    int width, int height, int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

template<class AT, class BT, class DT>
int addImgClamp(
    const AT* src1Image, const BT* src2Image, DT* dstImage,
    int width, int height, int cn,
    int src1Stride, int src2Stride, int dstStride,
    DT minValue, DT maxValue);

template<class ST, class DT>
int add_Imgs(
    const ST* const* srcImages, int srcNum, DT* dstImage,
    int width, int height, int cn = 1,
    int srcStride = 0, int dstStride = 0);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| src1Image, src2Image | const AT*, const BT* | Inputs | non-null |
| dstImage | DT* | Output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 |
| *Stride | int | Bytes per row | 0 = auto |
| minValue, maxValue | DT | (clamp variant only) output lower / upper bounds | |

NEON Version

cpp
template<class T>
int addImg(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

NEON provides only addImg itself (a single template covering uint8_t / float, etc., with the same type for both inputs and the output); the N-image accumulation, clamp, and manual ACC_T variants are available only in the CPP version.


Example

cpp
uint8_t a[1920*1080], b[1920*1080], dstImage[1920*1080];

// Sum of two images (u8)
acl::neon::arithmetic::addImg(a, b, dstImage, 1920, 1080);

// Saturating clamp to [0, 200] (CPP)
acl::arithmetic::addImgClamp<uint8_t, uint8_t, uint8_t>(
    a, b, dstImage, 1920, 1080, 1, 0, 0, 0,
    /*minValue=*/0, /*maxValue=*/200);

// Accumulate 10 images into u16 to avoid u8 overflow
const uint8_t* imgs[10] = { /* ... */ };
uint16_t acc[1920*1080];
acl::arithmetic::add_Imgs<uint8_t, uint16_t>(
    imgs, 10, acc, 1920, 1080);

absDiff

Per-pixel absolute difference of two images dst = |src1 - src2|.

Tier: Starter+
Channels: any
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST1 | uint8_t, uint16_t, float | |
| ST2 | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |

cpp
template<class T>
int absDiff(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

Available as both acl::arithmetic::absDiff (CPP) and acl::neon::arithmetic::absDiff (NEON): identical signature, different namespace. The NEON path supports uint8_t / float / double; uint16_t falls back to the scalar implementation.


Example

cpp
uint8_t a[1920*1080], b[1920*1080], diff[1920*1080];

// Frame differencing (motion detection)
acl::neon::arithmetic::absDiff<uint8_t>(a, b, diff, 1920, 1080, 1);

addWeighted

Weighted sum dst = alpha*src1 + beta*src2 + gamma. Typical uses: image transitions, exposure fusion.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP Version

cpp
template<class ST1, class ST2, class DT>
int addWeighted(
    const ST1* src1, const ST2* src2, DT* dstImage,
    int width, int height, int cn,
    int src1Stride, int src2Stride, int dstStride,
    double alpha, double beta, double gamma);

NEON Version (T ∈ {uint8_t, uint16_t, float}, same type for inputs and output)

cpp
template<class T>
int addWeighted(
    const T* src1, const T* src2, T* dstImage,
    int width, int height,
    int cn = 1,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0,
    double alpha = 1.0, double beta = 1.0, double gamma = 0.0);

Example

cpp
uint8_t a[1920*1080*3], b[1920*1080*3], dstImage[1920*1080*3];

// 50% blend: dstImage = 0.5 * a + 0.5 * b
acl::neon::arithmetic::addWeighted(
    a, b, dstImage, 1920, 1080, 3, 0, 0, 0, 0.5, 0.5, 0.0);
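
The formula dst = alpha*src1 + beta*src2 + gamma can be written as a scalar reference for uint8_t data. This is a sketch under assumptions: `refAddWeighted` is a hypothetical helper, and rounding to nearest followed by saturation to [0, 255] is assumed here, since the rounding and overflow behavior is not spelled out above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function).
// Works on interleaved data of any channel count, since every element
// is processed independently.
std::vector<uint8_t> refAddWeighted(const std::vector<uint8_t>& a,
                                    const std::vector<uint8_t>& b,
                                    double alpha, double beta, double gamma) {
    assert(a.size() == b.size());
    std::vector<uint8_t> dst(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double v = alpha * a[i] + beta * b[i] + gamma;
        // Assumed behavior: round to nearest, then saturate to the u8 range.
        dst[i] = static_cast<uint8_t>(std::clamp(std::lround(v), 0L, 255L));
    }
    return dst;
}
```

For a cross-fade, alpha would typically run from 1 to 0 over the transition with beta = 1 - alpha and gamma = 0.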

alphaImgFusion

Alpha blending C = alpha * A + (1 - alpha) * B.

Tier: Starter+
Channels: 1ch (the width*cn parameter is in bytes)
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST1 | uint8_t, uint16_t, float | |
| ST2 | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |

CPP / NEON Signature (identical)

cpp
template<class T>
int alphaImgFusion(
    const T* A, const T* B, T* C,
    int width, int height,
    int AStride, int BStride, int CStride,
    float alpha);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| A, B | const T* | The two input images | non-null |
| C | T* | Output blended image | non-null |
| AStride, BStride, CStride | int | Bytes per row | 0 = auto |
| alpha | float | Weight of A, [0, 1] | |

mul

Per-pixel multiplication C = A * B. When saturateCast = true the result saturates to the CT type (e.g. uint8_t is clamped to [0, 255]).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| AT, BT, CT (CPP) | uint8_t, uint16_t, float | |
| T (NEON) | uint8_t, uint16_t | NEON-only |

CPP Version

cpp
template<class AT, class BT, class CT>
int mul(
    const AT* A, const BT* B, CT* C,
    int width, int height, int cn,
    int AStride, int BStride, int CStride,
    bool saturateCast = false);

NEON Version (uint8_t / uint16_t / float, same type for input and output)

cpp
template<class T>
int mul(
    const T* A, const T* B, T* C,
    int width, int height, int cn,
    int AStride, int BStride, int CStride,
    bool saturateCast = false);

NEON requires AT == BT == CT (i.e. T); for heterogeneous types, use the CPP version.


threshold

Fixed-threshold binarization / truncation.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

ThreshMode: THRESH_BINARY / THRESH_BINARY_INV / THRESH_TRUNC / THRESH_TOZERO / THRESH_TOZERO_INV / THRESH_OTSU (u8 only)

CPP Version

cpp
template<class ST, class DT>
int threshold(
    const ST* srcImage, DT* dstImage,
    int width, int height,
    ST threshold,
    DT maxVal = 255, DT minVal = 0,
    int srcStride = 0, int dstStride = 0,
    acl::ThreshMode tm = acl::ThreshMode::THRESH_BINARY);

THRESH_OTSU mode automatically computes the optimal threshold (the passed-in threshold parameter is ignored); only uint8_t / uint16_t are supported.


NEON Version (uint8_t only for input and output)

cpp
template<class T>
int threshold(
    const T* srcImage, T* dstImage,
    int width, int height,
    T threshold,
    T maxVal = 255, T minVal = 0,
    int srcStride = 0, int dstStride = 0,
    acl::ThreshMode tm = acl::ThreshMode::THRESH_BINARY);

Example

cpp
uint8_t gray[1920*1080], bin[1920*1080];

// Fixed-threshold binarization
acl::neon::arithmetic::threshold<uint8_t>(
    gray, bin, 1920, 1080, /*threshold=*/128,
    /*maxVal=*/255, /*minVal=*/0, 0, 0,
    acl::ThreshMode::THRESH_BINARY);

// Otsu automatic threshold (CPP only)
acl::arithmetic::threshold<uint8_t, uint8_t>(
    gray, bin, 1920, 1080, /*threshold=*/0,
    /*maxVal=*/255, /*minVal=*/0, 0, 0,
    acl::ThreshMode::THRESH_OTSU);   // threshold parameter is ignored
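
For readers unfamiliar with how an Otsu threshold is chosen, the sketch below shows the textbook method: build a 256-bin histogram and pick the threshold that maximizes the between-class variance. It is not taken from the library source; `refOtsuThreshold` and `refThreshBinary` are hypothetical helpers, and the strict comparison `src > t` for THRESH_BINARY is an assumption borrowed from the common OpenCV convention.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper: textbook Otsu on 8-bit data.
int refOtsuThreshold(const std::vector<uint8_t>& gray) {
    assert(!gray.empty());
    std::array<std::uint64_t, 256> hist{};
    for (uint8_t p : gray) ++hist[p];

    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += static_cast<double>(i) * hist[i];

    const double total = static_cast<double>(gray.size());
    double wBg = 0.0, sumBg = 0.0, bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        wBg += static_cast<double>(hist[t]);        // pixels with value <= t
        sumBg += static_cast<double>(t) * hist[t];
        const double wFg = total - wBg;             // pixels with value > t
        if (wBg == 0.0) continue;
        if (wFg == 0.0) break;
        const double diff = sumBg / wBg - (sumAll - sumBg) / wFg;
        const double var = wBg * wFg * diff * diff; // between-class variance (unnormalized)
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}

// Hypothetical reference for THRESH_BINARY with an assumed strict comparison.
std::vector<uint8_t> refThreshBinary(const std::vector<uint8_t>& gray, int t,
                                     uint8_t maxVal = 255, uint8_t minVal = 0) {
    std::vector<uint8_t> dst(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
        dst[i] = gray[i] > t ? maxVal : minVal;
    return dst;
}
```

On a cleanly bimodal image any threshold between the two modes gives the same variance, so implementations may legitimately return different values in that gap.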

adaptiveThreshold

Adaptive threshold — each pixel's threshold = local_mean(src, blockSize) - C (or Gaussian-weighted mean).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

Method: ADAPTIVE_THRESH_MEAN_C / ADAPTIVE_THRESH_GAUSSIAN_C

CPP Version

cpp
template<class T>
int adaptiveThreshold(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    T maxVal,
    int blockSize,
    double C,
    acl::AdaptiveThreshMethod am = acl::AdaptiveThreshMethod::ADAPTIVE_THRESH_MEAN_C);

| Parameter | Type | Meaning | Default / typical value |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| maxVal | T | Output value for pixels above threshold | typically 255 |
| blockSize | int | Local window size (must be odd and ≥ 3) | typically 11 / 25 |
| C | double | Threshold adjustment constant | typically 2 ~ 10 |
| am | acl::AdaptiveThreshMethod | ADAPTIVE_THRESH_MEAN_C / ADAPTIVE_THRESH_GAUSSIAN_C | MEAN_C |

NEON Version (uint8_t only)

cpp
template<class T>
int adaptiveThreshold(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    T maxVal,
    int blockSize,
    double C,
    acl::AdaptiveThreshMethod am = acl::AdaptiveThreshMethod::ADAPTIVE_THRESH_MEAN_C);
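
The rule "threshold = local_mean(src, blockSize) - C" can be sketched as a naive scalar loop for the MEAN_C case. `refAdaptiveThresholdMean` is a hypothetical helper; replicated borders, the strict comparison, and writing 0 for pixels at or below the threshold are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function), MEAN_C only.
std::vector<uint8_t> refAdaptiveThresholdMean(const std::vector<uint8_t>& src,
                                              int width, int height,
                                              uint8_t maxVal, int blockSize, double C) {
    assert(blockSize >= 3 && blockSize % 2 == 1);
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const int r = blockSize / 2;
    std::vector<uint8_t> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double sum = 0.0;
            for (int dy = -r; dy <= r; ++dy)
                for (int dx = -r; dx <= r; ++dx) {
                    // Assumed border handling: replicate the edge pixel.
                    const int yy = std::clamp(y + dy, 0, height - 1);
                    const int xx = std::clamp(x + dx, 0, width - 1);
                    sum += src[static_cast<std::size_t>(yy) * width + xx];
                }
            const double thresh = sum / (blockSize * blockSize) - C;
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            dst[i] = src[i] > thresh ? maxVal : 0;
        }
    }
    return dst;
}
```

A positive C pushes the threshold below the local mean, so flat regions come out as maxVal and only pixels clearly darker than their neighborhood map to 0.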

bitwise

Bitwise operations AND / NOT / XOR.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T (CPP) | uint8_t, uint16_t | |
| T (NEON) | uint8_t | NEON-only |

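The current version provides no bitwiseOr entry point. OR can be composed from the existing calls via De Morgan's law, a | b = ~(~a & ~b): bitwiseNot on each input, bitwiseAnd on the two results, then bitwiseNot on the output.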


CPP / NEON Signature (identical)

cpp
// AND
template<class T>
int bitwiseAnd(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);

// NOT
template<class T>
int bitwiseNot(
    const T* srcImage, T* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0);

// XOR
template<class T>
int bitwiseXor(
    const T* src1, const T* src2, T* dstImage,
    int width, int height, int cn,
    int src1Stride = 0, int src2Stride = 0, int dstStride = 0);
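
Since OR is not provided directly, here is a small sketch of composing it from NOT and AND via De Morgan's law. The `ref*` helpers are hypothetical stand-ins that mirror what bitwiseNot and bitwiseAnd compute per element; they are not the library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<uint8_t>;

// Hypothetical stand-in for bitwiseNot.
Image refBitwiseNot(const Image& a) {
    Image out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = static_cast<uint8_t>(~a[i]);
    return out;
}

// Hypothetical stand-in for bitwiseAnd.
Image refBitwiseAnd(const Image& a, const Image& b) {
    assert(a.size() == b.size());
    Image out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = static_cast<uint8_t>(a[i] & b[i]);
    return out;
}

// a | b == ~(~a & ~b)
Image refBitwiseOr(const Image& a, const Image& b) {
    return refBitwiseNot(refBitwiseAnd(refBitwiseNot(a), refBitwiseNot(b)));
}
```

With the real entry points the same composition takes four calls and two temporary buffers for the inverted inputs; whether the inputs may be inverted in place instead depends on whether the caller still needs them.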

lut

Lookup-table transform dst[i] = table[src[i]]. Commonly used for gamma correction, color mapping, curve adjustments.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST, DT (CPP) | ST ∈ {uint8_t}, DT ∈ {uint8_t, uint16_t, float} | tested combinations |
| T (NEON) | uint8_t | NEON-only |

CPP Version: 1ch, ST → DT

cpp
template<class ST, class DT>
int lut(
    const ST* srcImage, DT* dstImage,
    const DT* table,
    int width, int height,
    int srcStride = 0, int dstStride = 0);

table length must cover all possible values of ST (uint8_t → 256 entries, uint16_t → 65536 entries).


NEON Version: 1ch / 3ch / 4ch, uint8_t only

cpp
int lut(
    const uint8_t* srcImage, uint8_t* dstImage,
    const uint8_t* table,
    int width, int height,
    int cn = 1,
    int srcStride = 0, int dstStride = 0);

The NEON version additionally supports multi-channel (applies the same 256-entry table to all channels).


Example

cpp
uint8_t srcImage[1920*1080*3], dstImage[1920*1080*3];

// Gamma-encoding LUT (exponent 1/2.2); std::pow requires <cmath>
uint8_t gamma_lut[256];
for (int i = 0; i < 256; ++i)
    gamma_lut[i] = static_cast<uint8_t>(std::pow(i / 255.0, 1.0 / 2.2) * 255.0 + 0.5);  // +0.5 rounds to nearest

// 3-channel gamma correction (NEON)
acl::neon::arithmetic::lut(srcImage, dstImage, gamma_lut, 1920, 1080, 3);

convertScaleAbs

Scalar multiplication + offset + absolute value + convert to u8: dst = saturate_cast<u8>(|alpha*src + beta|).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (requires ST == uint8_t)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, int16_t, uint16_t, float | output fixed to uint8_t |

CPP / NEON Signature (identical)

cpp
template<class ST>
int convertScaleAbs(
    const ST* srcImage, uint8_t* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    double alpha, double beta);

Typical pairing: visualize an int16_t gradient from Sobel / Scharr by calling convertScaleAbs<int16_t>.
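
The expression dst = saturate_cast<u8>(|alpha*src + beta|) can be checked against a scalar reference such as the one below. `refConvertScaleAbs` is a hypothetical helper; rounding to nearest before saturation is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function).
template <class ST>
std::vector<uint8_t> refConvertScaleAbs(const std::vector<ST>& src,
                                        double alpha, double beta) {
    std::vector<uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        // Scale and offset, take the absolute value, round, saturate to 255.
        const double v = std::fabs(alpha * static_cast<double>(src[i]) + beta);
        dst[i] = static_cast<uint8_t>(std::min(std::lround(v), 255L));
    }
    return dst;
}
```

For a signed gradient, alpha = 0.5 keeps responses up to ±510 inside the displayable range, and both edge polarities show up as bright pixels because of the absolute value.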


inRange

Range check: if each channel simultaneously satisfies low[c] ≤ src[c] ≤ high[c], output 255, otherwise 0. Commonly used for HSV color segmentation.

Tier: Starter+
Channels: input 1ch / 3ch / 4ch (per-channel evaluation), output 1ch binary mask
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP / NEON Signature (identical)

cpp
template<class T>
int inRange(
    const T* srcImage, uint8_t* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    const T* low, const T* high);

| Parameter | Type | Meaning |
|---|---|---|
| srcImage | const T* | Input image (multi-channel, interleaved) |
| dstImage | uint8_t* | Output 1ch binary mask |
| low, high | const T* | Lower / upper bound arrays of length cn |

Example

cpp
uint8_t hsv[1920*1080*3], mask[1920*1080];

// Extract red: H∈[0,10] S∈[100,255] V∈[100,255]
uint8_t low[3]  = {0, 100, 100};
uint8_t high[3] = {10, 255, 255};
acl::neon::arithmetic::inRange(hsv, mask, 1920, 1080, 3, 0, 0, low, high);
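
The per-channel test can be expressed as a short scalar reference. `refInRange` is a hypothetical helper written for illustration; it follows the rule stated above (every channel must lie inside its inclusive bounds for the mask pixel to be 255).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function).
template <class T>
std::vector<uint8_t> refInRange(const std::vector<T>& src, int cn,
                                const std::vector<T>& low,
                                const std::vector<T>& high) {
    assert(cn > 0);
    assert(low.size() == static_cast<std::size_t>(cn));
    assert(high.size() == static_cast<std::size_t>(cn));
    assert(src.size() % static_cast<std::size_t>(cn) == 0);
    const std::size_t pixels = src.size() / static_cast<std::size_t>(cn);
    std::vector<uint8_t> mask(pixels);
    for (std::size_t p = 0; p < pixels; ++p) {
        bool inside = true;
        for (int c = 0; c < cn; ++c) {
            const T v = src[p * cn + c];
            inside = inside && low[c] <= v && v <= high[c];
        }
        mask[p] = inside ? 255 : 0;
    }
    return mask;
}
```

Because the bounds are inclusive, a channel that should be ignored can be given the full range of T (for example 0 and 255 for uint8_t).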

normalize

Normalization — linearly map pixel values to the target range.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

NormType: NORM_MINMAX (maps to [alpha, beta])

CPP / NEON Signature (identical)

cpp
template<class T>
int normalize(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    acl::NormType nt = acl::NormType::NORM_MINMAX,
    double alpha = 0.0, double beta = 255.0);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | required (0 = auto) |
| nt | acl::NormType | Normalization mode | NORM_MINMAX |
| alpha | double | Target lower bound | 0.0 |
| beta | double | Target upper bound (for NORM_MINMAX only) | 255.0 |
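
For NORM_MINMAX, the linear mapping is dst = (src - min) * (beta - alpha) / (max - min) + alpha, where min and max are taken over the whole image. The sketch below implements that for uint8_t; `refNormalizeMinMax` is a hypothetical helper, and mapping a constant image to alpha as well as rounding to nearest are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not an ACL Pack function).
std::vector<uint8_t> refNormalizeMinMax(const std::vector<uint8_t>& src,
                                        double alpha = 0.0, double beta = 255.0) {
    assert(!src.empty());
    const auto [mnIt, mxIt] = std::minmax_element(src.begin(), src.end());
    const double mn = *mnIt;
    const double mx = *mxIt;
    // Assumed handling of a constant image: scale 0, so every pixel maps to alpha.
    const double scale = (mx > mn) ? (beta - alpha) / (mx - mn) : 0.0;
    std::vector<uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        const double v = (src[i] - mn) * scale + alpha;
        dst[i] = static_cast<uint8_t>(std::clamp(std::lround(v), 0L, 255L));
    }
    return dst;
}
```

This is the usual way to stretch a low-contrast image, or to bring a float result such as a distance map into a displayable range before converting it.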

linearTransform2x2

2×2 pixel-block linear transform: the caller provides a 2×2 coefficient matrix [[v00, v01], [v10, v11]], applied to each 2×2 pixel block; commonly used for Bayer color correction, etc.

Tier: Business
Channels: 1ch (input is interpreted in a 2×2 block structure)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |

CPP Signature

cpp
template<class T, class DT>
int linearTransform2x2(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int minValue, int maxValue,
    const DT& v00, const DT& v01, const DT& v10, const DT& v11);

| Parameter | Type | Meaning |
|---|---|---|
| minValue, maxValue | int | Output clamp lower / upper bounds |
| v00, v01, v10, v11 | DT | 2×2 matrix coefficients |

phaseMagnitude

Compute phase angle and magnitude from Sobel / Scharr gradients (dx, dy).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| GT (CPP) | int16_t, int32_t | template also accepts float |
| T (NEON) | int16_t | NEON-only |

CPP Version: two independent entry points

cpp
// Phase angle (radians or degrees)
template<class GT>
int phase(
    const GT* dx, const GT* dy, float* angle,
    int width, int height,
    int dxStride, int dyStride, int angleStride,
    bool angleInDegrees = true);

// Magnitude (L2 norm)
template<class GT>
int magnitude(
    const GT* dx, const GT* dy, float* mag,
    int width, int height,
    int dxStride, int dyStride, int magStride);

GT ∈ {short, int, float}. Output is fixed to float.


NEON Version (short input only, non-templated)

cpp
int magnitude(
    const short* dx, const short* dy, float* mag,
    int width, int height,
    int dxStride, int dyStride, int magStride);

int phase(
    const short* dx, const short* dy, float* angle,
    int width, int height,
    int dxStride, int dyStride, int angleStride,
    bool angleInDegrees = true);

Typical pairing: sobel3x3<short> outputs a short gradient that feeds directly into NEON magnitude / phase.


Example

cpp
uint8_t srcImage[1920*1080];
short dx[1920*1080], dy[1920*1080];
float mag[1920*1080], angle[1920*1080];

// Sobel → gradient magnitude + phase
acl::neon::filter::sobel3x3<uint8_t, short>(srcImage, dx, dy, 1920, 1080);
acl::neon::arithmetic::magnitude(dx, dy, mag, 1920, 1080, 0, 0, 0);
acl::neon::arithmetic::phase(dx, dy, angle, 1920, 1080, 0, 0, 0, /*degrees=*/true);
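
Per pixel, magnitude is the L2 norm of (dx, dy) and phase is the angle of that vector. The sketch below shows both for a single gradient sample; `refMagnitude` and `refPhase` are hypothetical helpers, and mapping the angle to [0, 360) degrees (or [0, 2π) radians) is an assumption taken from the common OpenCV convention, not something this section states.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical reference helper: L2 norm of one gradient sample.
float refMagnitude(short dx, short dy) {
    const float fx = static_cast<float>(dx);
    const float fy = static_cast<float>(dy);
    return std::sqrt(fx * fx + fy * fy);
}

// Hypothetical reference helper: angle of one gradient sample.
// Assumed range: [0, 360) degrees or [0, 2*pi) radians.
float refPhase(short dx, short dy, bool angleInDegrees = true) {
    const float twoPi = 6.28318530718f;
    float a = std::atan2(static_cast<float>(dy), static_cast<float>(dx)); // (-pi, pi]
    if (a < 0.0f) a += twoPi;
    return angleInDegrees ? a * (360.0f / twoPi) : a;
}
```

A gradient of (3, 4) has magnitude 5, and a purely vertical gradient (0, 1) has a phase of 90 degrees under this convention.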

Color Conversion

Namespace: acl::cvtcolor (CPP) / acl::neon::cvtcolor (NEON)

RGB2Gray / RGBA2Gray

RGB(A) → grayscale image. Supports 5 grayscale strategies (luma BT.601, max, min, average, weighted).

Tier: Starter+
Channels: input 3ch (RGB) or 4ch (RGBA), output 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP Version: acl::cvtcolor::RGB2Gray / RGBA2Gray

cpp
template<class T>
int RGB2Gray(
    const T* rgbImage, T* grayImage,
    int width, int height,
    int rgbStride = 0, int grayStride = 0,
    acl::ColorCvtGrayMode mode = acl::ColorCvtGrayMode::GRAY_LUMA,
    float cR = 0.299f, float cG = 0.587f, float cB = 0.114f);

template<class T>
int RGBA2Gray(
    const T* rgbaImage, T* grayImage,
    int width, int height,
    int rgbaStride = 0, int grayStride = 0,
    acl::ColorCvtGrayMode mode = acl::ColorCvtGrayMode::GRAY_LUMA,
    float cR = 0.299f, float cG = 0.587f, float cB = 0.114f);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| rgbImage / rgbaImage | const T* | Input RGB / RGBA image | non-null |
| grayImage | T* | Output grayscale image | non-null |
| width, height | int | Image size | > 0 |
| rgbStride / rgbaStride, grayStride | int | Bytes per row | 0 = auto |
| mode | acl::ColorCvtGrayMode | Grayscale strategy | GRAY_LUMA |
| cR, cG, cB | float | Weights used only in GRAY_WEIGHTED mode | BT.601 defaults |

mode values (see acl::ColorCvtGrayMode):

  • GRAY_LUMA (default) — BT.601 luma: 0.299R + 0.587G + 0.114B
  • GRAY_WEIGHTED — uses custom weights cR/cG/cB
  • GRAY_MIN / GRAY_MAX / GRAY_AVG — take min / max / mean across channels

NEON Version: acl::neon::cvtcolor::RGB2Gray / RGBA2Gray (uint8_t only)

cpp
int RGB2Gray(   // or RGBA2Gray
    const uint8_t* rgbImage, uint8_t* grayImage,
    int width, int height,
    int rgbStride = 0, int grayStride = 0,
    acl::ColorCvtGrayMode mode = acl::ColorCvtGrayMode::GRAY_LUMA,
    float cR = 0.299f, float cG = 0.587f, float cB = 0.114f);

The element type is fixed to uint8_t; other parameter semantics match the CPP version.


Example

cpp
uint8_t rgb[1920*1080*3], gray[1920*1080];

// BT.601 luma grayscale (default)
acl::neon::cvtcolor::RGB2Gray(rgb, gray, 1920, 1080);

// Channel-wise max mode (u16)
uint16_t src16[1920*1080*3], gray16[1920*1080];
acl::cvtcolor::RGB2Gray<uint16_t>(
    src16, gray16, 1920, 1080,
    /*rgbStride=*/0, /*grayStride=*/0,
    acl::ColorCvtGrayMode::GRAY_MAX);
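
The five strategies differ only in how one RGB triple is reduced to a single value. The sketch below shows that reduction for one uint8_t pixel; `RefGrayMode` and `refGrayPixel` are hypothetical names that mirror, but are not, acl::ColorCvtGrayMode, and rounding to nearest is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical mirror of the five grayscale strategies.
enum class RefGrayMode { LUMA, WEIGHTED, MIN, MAX, AVG };

// Hypothetical reference helper: reduce one RGB pixel to gray.
uint8_t refGrayPixel(uint8_t r, uint8_t g, uint8_t b, RefGrayMode mode,
                     float cR = 0.299f, float cG = 0.587f, float cB = 0.114f) {
    float v = 0.0f;
    switch (mode) {
        case RefGrayMode::LUMA:     v = 0.299f * r + 0.587f * g + 0.114f * b; break; // BT.601
        case RefGrayMode::WEIGHTED: v = cR * r + cG * g + cB * b; break;             // custom weights
        case RefGrayMode::MIN:      v = std::min({r, g, b}); break;
        case RefGrayMode::MAX:      v = std::max({r, g, b}); break;
        case RefGrayMode::AVG:      v = (r + g + b) / 3.0f; break;
    }
    return static_cast<uint8_t>(std::clamp(std::lround(v), 0L, 255L));
}
```

Pure red (255, 0, 0) makes the difference visible: luma gives 76, the channel average gives 85, max gives 255 and min gives 0.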

Channel Swap (BGR / RGB / RGBA interchange)

Channel-order swap (3ch ↔ 3ch, 3ch ↔ 4ch, 4ch ↔ 3ch). One template entry point channelSwap<Mode>() covers every direction; only the Mode tag changes.

Tier: Starter+
Channels: 3ch ↔ 3ch / 3ch ↔ 4ch / 4ch ↔ 3ch
Inplace: not supported (when channel counts differ)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

CPP / NEON Version (identical signatures, uint8_t only)

All channel-swap directions are dispatched through a single channelSwap<Mode>() template — the Mode template parameter is an empty tag struct selecting the conversion direction. The 5 tag structs cover all 10 useful directions (each tag handles a pair of equivalent swaps).

cpp
template<class Mode>
int channelSwap(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height, int srcStride = 0, int dstStride = 0);

// Mode tags (declared in typeDef.h, namespace acl::):
//   3ch ↔ 3ch:   BGR2RGB    (also covers RGB → BGR — same byte layout)
//   3ch → 4ch:   BGR2BGRA   (also covers RGB → RGBA)
//                BGR2RGBA   (also covers RGB → BGRA — swap R/B + add alpha)
//   4ch → 3ch:   BGRA2BGR   (also covers RGBA → RGB)
//                BGRA2RGB   (also covers RGBA → BGR — swap R/B + drop alpha)

CPP lives in acl::cvtcolor::, NEON in acl::neon::cvtcolor::, with identical signatures.

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const uint8_t* / uint8_t* | input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |

Example

cpp
uint8_t bgr[1920*1080*3], rgba[1920*1080*4];

// BGR → RGBA (NEON) — pick the Mode tag that matches the direction
acl::neon::cvtcolor::channelSwap<acl::BGR2RGBA>(bgr, rgba, 1920, 1080);

RGB ↔ YUV (fixed-point)

Conversion between RGB/RGBA and YUV (NV21 / YV12 / YUV444), using integer fixed-point arithmetic. Supports BT.601 / BT.709 / BT.2020 + full-range / limited-range combinations via the unified YUVConvertParams struct.

Tier: Starter+
Channels: 3ch RGB / 4ch RGBA ↔ NV21 / NV12 / YV12 / YUV444
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| RGB_T | uint8_t, uint16_t | |
| YUV_T | uint8_t, uint16_t | |

NV21 and NV12 share the same entry point; switch via YUVConvertParams::nv21_fmt (true = NV21, false = NV12).


CPP Version: acl::cvtcolor::rgb*2*_fixed (RGB → YUV, 6 entry points)

NV21 / NV12 (Y plane + interleaved UV plane)

cpp
template<class RGB_T, class YUV_T>
int rgb2NV21_fixed(
    const RGB_T* rgbImage, YUV_T* dstYImage, YUV_T* dstUVImage,
    int width, int height,
    int rgbStride = 0, int yStride = 0, int uvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T>
int rgba2NV21_fixed(
    const RGB_T* rgbaImage, YUV_T* dstYImage, YUV_T* dstUVImage,
    int width, int height,
    int rgbaStride = 0, int yStride = 0, int uvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YV12 (three independent planes Y / U / V; U/V are each width/2 × height/2)

cpp
template<class RGB_T, class YUV_T>
int rgb2YV12_fixed(
    const RGB_T* rgbImage,
    YUV_T* dstYImage, YUV_T* dstUImage, YUV_T* dstVImage,
    int width, int height,
    int rgbStride = 0, int yStride = 0, int uStride = 0, int vStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T>
int rgba2YV12_fixed(
    const RGB_T* rgbaImage,
    YUV_T* dstYImage, YUV_T* dstUImage, YUV_T* dstVImage,
    int width, int height,
    int rgbaStride = 0, int yStride = 0, int uStride = 0, int vStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YUV444 (single-plane interleaved)

cpp
template<class RGB_T, class YUV_T>
int rgb2YUV444_fixed(
    const RGB_T* rgbImage, YUV_T* dstYUVImage,
    int width, int height,
    int rgbStride = 0, int yuvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T>
int rgba2YUV444_fixed(
    const RGB_T* rgbaImage, YUV_T* dstYUVImage,
    int width, int height,
    int rgbaStride = 0, int yuvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

CPP Version: acl::cvtcolor::*2RGB_fixed (YUV → RGB, 6 entry points)

NV21 / NV12 → RGB / RGBA

cpp
template<class YUV_T, class RGB_T>
int nv212RGB_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUVImage, RGB_T* rgbImage,
    int width, int height,
    int yStride = 0, int uvStride = 0, int rgbStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class YUV_T, class RGB_T>
int nv212RGBA_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUVImage, RGB_T* rgbaImage,
    int width, int height,
    int yStride = 0, int uvStride = 0, int rgbaStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YV12 → RGB / RGBA (three input planes)

cpp
template<class YUV_T, class RGB_T>
int yv122RGB_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUImage, const YUV_T* srcVImage,
    RGB_T* rgbImage,
    int width, int height,
    int yStride = 0, int uStride = 0, int vStride = 0, int rgbStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class YUV_T, class RGB_T>
int yv122RGBA_fixed(
    const YUV_T* srcYImage, const YUV_T* srcUImage, const YUV_T* srcVImage,
    RGB_T* rgbaImage,
    int width, int height,
    int yStride = 0, int uStride = 0, int vStride = 0, int rgbaStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

YUV444 → RGB / RGBA (single input plane)

cpp
template<class YUV_T, class RGB_T>
int yuv4442RGB_fixed(
    const YUV_T* srcYUVImage, RGB_T* rgbImage,
    int width, int height,
    int yuvStride = 0, int rgbStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

template<class YUV_T, class RGB_T>
int yuv4442RGBA_fixed(
    const YUV_T* srcYUVImage, RGB_T* rgbaImage,
    int width, int height,
    int yuvStride = 0, int rgbaStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

| Parameter | Meaning |
|---|---|
| rgbImage / rgbaImage | RGB / RGBA plane, 3 or 4 bytes/pixel |
| dstYImage, dstUVImage / dstUImage, dstVImage / dstYUVImage | YUV output planes (NV merges UV, YV12 has separate U/V, YUV444 is interleaved) |
| width, height | Image size (NV/YV12 require even values) |
| *Stride | Bytes per row, 0 = auto |
| cvMatrix | Custom 3×3 conversion matrix (used only when p.yuv_std = YUVEncodeStandard::STD_CUSTOM) |
| p | YUVConvertParams: selects standard, channel order, bit depth, range. Defaults to BT.601 8-bit full-range RGB. |

NEON Version: acl::neon::cvtcolor::rgb*2*_fixed (uint8_t only)

cpp
template<class RGB_T, class YUV_T>
int rgb2NV21_fixed(
    const RGB_T* rgbImage, YUV_T* dstYImage, YUV_T* dstUVImage,
    int width, int height,
    int rgbStride = 0, int yStride = 0, int uvStride = 0,
    const float* cvMatrix = nullptr,
    const acl::YUVConvertParams& p = {});

// Corresponds to 12 NEON entry points (signatures match the CPP versions):
//   rgb2NV21_fixed / rgba2NV21_fixed / rgb2YV12_fixed / rgba2YV12_fixed
//   rgb2YUV444_fixed / rgba2YUV444_fixed
//   nv212RGB_fixed / nv212RGBA_fixed / yv122RGB_fixed / yv122RGBA_fixed
//   yuv4442RGB_fixed / yuv4442RGBA_fixed

NEON RGB/YUV input/output types are uint8_t only; all other parameters (including YUVConvertParams) match the CPP version exactly.


Example

cpp
uint8_t rgb[1920*1080*3];
uint8_t y[1920*1080], uv[1920*540];   // NV21/NV12 chroma plane: width bytes per row (interleaved pairs), height/2 rows

// BT.601 full-range RGB → NV21 (default params)
acl::neon::cvtcolor::rgb2NV21_fixed<uint8_t, uint8_t>(
    rgb, y, uv, 1920, 1080);

// BT.709 limited-range NV12 (override defaults)
acl::YUVConvertParams p;
p.yuv_std         = acl::YUVEncodeStandard::STD_BT709;
p.nv21_fmt        = false;   // NV12
p.yuv_full_range  = false;   // limited range [16, 235/240]
acl::cvtcolor::rgb2NV21_fixed<uint8_t, uint8_t>(
    rgb, y, uv, 1920, 1080,
    /*rgbStride=*/0, /*yStride=*/0, /*uvStride=*/0,
    /*cvMatrix=*/nullptr, p);
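
As background for the default parameter set, the sketch below converts one pixel with the widely used full-range BT.601 (JFIF-style) equations and computes the size of the interleaved NV21/NV12 chroma plane. It is a floating-point illustration only: `RefYuv`, `refRgbToYuvBt601Full` and `refNv21ChromaBytes` are hypothetical names, the coefficients are the standard published ones rather than values read from the library, and the fixed-point path may differ from it by a least-significant bit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

struct RefYuv { uint8_t y, u, v; };

// Round to nearest and saturate to the 8-bit range.
inline uint8_t refClampU8(double v) {
    return static_cast<uint8_t>(std::clamp(std::lround(v), 0L, 255L));
}

// Hypothetical reference: full-range BT.601 RGB -> YUV for one 8-bit pixel.
RefYuv refRgbToYuvBt601Full(uint8_t r, uint8_t g, uint8_t b) {
    return {
        refClampU8( 0.299    * r + 0.587    * g + 0.114    * b),
        refClampU8(-0.168736 * r - 0.331264 * g + 0.5      * b + 128.0),
        refClampU8( 0.5      * r - 0.418688 * g - 0.081312 * b + 128.0)
    };
}

// Bytes in the interleaved chroma plane of an 8-bit NV21/NV12 image:
// (width / 2) two-byte pairs per row, height / 2 rows.
std::size_t refNv21ChromaBytes(int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    return static_cast<std::size_t>(width) * static_cast<std::size_t>(height / 2);
}
```

For a 1920 × 1080 frame the chroma plane is 1920 × 540 bytes, half the size of the luma plane, which is why a full NV21 frame takes 1.5 bytes per pixel.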

RGB ↔ YUV (float, CPP only)

Floating-point implementation (higher precision, slightly slower than the fixed-point path; suitable for precision-sensitive scenarios). NEON does not provide this path.

Tier: Starter+
Channels: 3ch RGB / 4ch RGBA ↔ NV21 / NV12 / YV12 / YUV444
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| RGB_T | uint8_t, uint16_t, float | |
| YUV_T | uint8_t, uint16_t, float | |

CPP Signature (one-to-one correspondence with the _fixed version; only the _fixed suffix is removed)

Signatures, parameters, and the trailing YUVConvertParams& p are identical to the fixed-point version. The float variants (no suffix) prioritise numerical accuracy; the _fixed variants prioritise throughput. 12 entry points:

RGB → YUV

cpp
template<class RGB_T, class YUV_T>
int rgb2NV21(const RGB_T* rgbImage, YUV_T* dstYImage, YUV_T* dstUVImage,
             int width, int height,
             int rgbStride = 0, int yStride = 0, int uvStride = 0,
             const float* cvMatrix = nullptr,
             const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T> int rgba2NV21(/* same as above with rgba prefix */);

template<class RGB_T, class YUV_T>
int rgb2YV12(const RGB_T* rgbImage,
             YUV_T* dstYImage, YUV_T* dstUImage, YUV_T* dstVImage,
             int width, int height,
             int rgbStride = 0, int yStride = 0, int uStride = 0, int vStride = 0,
             const float* cvMatrix = nullptr,
             const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T> int rgba2YV12(/* same as above */);

template<class RGB_T, class YUV_T>
int rgb2YUV444(const RGB_T* rgbImage, YUV_T* dstYUVImage,
               int width, int height,
               int rgbStride = 0, int yuvStride = 0,
               const float* cvMatrix = nullptr,
               const acl::YUVConvertParams& p = {});

template<class RGB_T, class YUV_T> int rgba2YUV444(/* same as above */);

YUV → RGB

cpp
template<class YUV_T, class RGB_T> int nv212RGB (...);   // signatures mirror nv212RGB_fixed
template<class YUV_T, class RGB_T> int nv212RGBA(...);
template<class YUV_T, class RGB_T> int yv122RGB (...);
template<class YUV_T, class RGB_T> int yv122RGBA(...);
template<class YUV_T, class RGB_T> int yuv4442RGB (...);
template<class YUV_T, class RGB_T> int yuv4442RGBA(...);

Parameter lists and defaults are completely identical to the corresponding _fixed entry points; only the function name drops _fixed.


Example

cpp
// float RGB → NV21 (float precision)
float rgb_f[1920*1080*3];
float y_f[1920*1080], uv_f[1920*540];   // chroma plane: width elements per row (interleaved pairs), height/2 rows

acl::cvtcolor::rgb2NV21<float, float>(
    rgb_f, y_f, uv_f, 1920, 1080);

Bayer Demosaic

Bayer raw image → RGB / RGBA. Supports 4 Bayer patterns (RGGB / GRBG / GBRG / BGGR).

Tier: Starter+
Channels: 1ch Bayer → 3ch RGB / 4ch RGBA
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T (CPP) | uint8_t, uint16_t | |
| T (NEON) | uint8_t | NEON-only |

CPP Version: acl::cvtcolor::bayer2RGB / bayer2RGBA

cpp
template<class ST, class DT>
int bayer2RGB(
    const ST* bayerImage, DT* rgbImage,
    int width, int height,
    int bayerStride, int rgbStride,
    int borderMode = 1,
    int bayerDataBit = 8,
    int RGBDataBit = 8,
    acl::BayerPattern pattern = acl::BayerPattern::GBRG);

// RGBA output (4ch):
template<class ST, class DT>
int bayer2RGBA(
    const ST* bayerImage, DT* rgbaImage,
    int width, int height,
    int bayerStride, int rgbaStride,
    int borderMode = 1,
    int bayerDataBit = 8,
    int RGBDataBit = 8,
    acl::BayerPattern pattern = acl::BayerPattern::GBRG);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| bayerImage | const ST* | Input Bayer raw image | non-null |
| rgbImage / rgbaImage | DT* | Output RGB / RGBA image | non-null |
| width, height | int | Image size | > 0, even |
| bayerStride, rgbStride / rgbaStride | int | Bytes per row | 0 = auto |
| borderMode | int | 0 = replicate the inner ring; 1 = reflect_101 | 1 |
| bayerDataBit | int | Effective Bayer data bits | 8 |
| RGBDataBit | int | Effective RGB data bits | 8 |
| pattern | acl::BayerPattern | Bayer mosaic pattern (RGGB / GRBG / GBRG / BGGR) | GBRG |

NEON Version: acl::neon::cvtcolor::bayer2RGB (uint8_t only, 3ch only)

cpp
int bayer2RGB(
    const uint8_t* bayerImage, uint8_t* rgbImage,
    int width, int height,
    acl::BayerPattern pattern,
    int bayerStride = 0, int rgbStride = 0);

NEON has no bayer2RGBA entry point, and borderMode is fixed to reflect_101.


Example

cpp
uint8_t bayer[1920*1080], rgb[1920*1080*3];

// NEON (runtime pattern)
acl::neon::cvtcolor::bayer2RGB(bayer, rgb, 1920, 1080,
    acl::BayerPattern::RGGB);

// CPP (u16 bayer → u8 RGB, RGGB pattern)
uint16_t bayer16[1920*1080];
acl::cvtcolor::bayer2RGB<uint16_t, uint8_t>(
    bayer16, rgb, 1920, 1080,
    /*bayerStride=*/0, /*rgbStride=*/0,
    /*borderMode=*/1, /*bayerDataBit=*/10, /*RGBDataBit=*/8,
    acl::BayerPattern::RGGB);
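
The four pattern names can be read as the colours of the top-left 2×2 tile in row-major order. The sketch below illustrates that reading only; the `Pattern` enum and `bayerColorAt` are hypothetical helpers, not part of ACL Pack, and it assumes ACL follows this conventional naming.

```cpp
#include <cassert>

// Hypothetical stand-in for acl::BayerPattern, for illustration only.
enum class Pattern { RGGB, GRBG, GBRG, BGGR };

// Returns 'R', 'G' or 'B' for the colour sampled at pixel (x, y), assuming the
// pattern name lists the top-left 2x2 tile in row-major order.
inline char bayerColorAt(Pattern p, int x, int y) {
    assert(x >= 0 && y >= 0);
    static const char* const kTiles[] = { "RGGB", "GRBG", "GBRG", "BGGR" };
    const char* tile = kTiles[static_cast<int>(p)];
    return tile[(y & 1) * 2 + (x & 1)];   // the tile repeats every 2 pixels
}
```

Under this reading, passing the wrong pattern swaps red and blue or shifts the green checkerboard, which is usually visible as a strong colour cast.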

RGB ↔ HSV

Conversion between RGB/BGR and HSV.

Tier: Pro+
Channels: 3ch ↔ 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
// CPP and NEON have identical names and signatures; only the namespace differs
int bgr2HSV(const uint8_t* bgrImage, uint8_t* hsvImage,
            int width, int height, int srcStride = 0, int dstStride = 0);
int rgb2HSV(const uint8_t* rgbImage, uint8_t* hsvImage, ...);
int hsv2BGR(const uint8_t* hsvImage, uint8_t* bgrImage, ...);   // CPP only

NEON provides only the two entry points bgr2HSV / rgb2HSV; there is no hsv2BGR. Use the CPP version for the reverse conversion.

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| bgrImage / rgbImage / hsvImage | const uint8_t* / uint8_t* | Input / output planes | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |

HSV encoding: H ∈ [0, 180) (hue in degrees divided by 2, OpenCV-compatible); S, V ∈ [0, 255].
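
To make the encoding concrete, here is a minimal scalar sketch of the usual 8-bit RGB → HSV formula, with hue stored as degrees / 2 so that it fits in one byte. `Hsv8` and `rgbToHsv8` are hypothetical names, and ACL's exact rounding is not documented here, so treat this as a reference for the value ranges rather than a bit-exact model.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Hsv8 { uint8_t h, s, v; };   // hypothetical helper type

inline Hsv8 rgbToHsv8(uint8_t r, uint8_t g, uint8_t b) {
    const int vmax = std::max({r, g, b});
    const int vmin = std::min({r, g, b});
    const int delta = vmax - vmin;

    float hDeg = 0.0f;                       // hue in degrees, [0, 360)
    if (delta != 0) {
        if (vmax == r)      hDeg = 60.0f * (g - b) / delta;
        else if (vmax == g) hDeg = 120.0f + 60.0f * (b - r) / delta;
        else                hDeg = 240.0f + 60.0f * (r - g) / delta;
        if (hDeg < 0.0f) hDeg += 360.0f;
    }
    // Halve the hue so it fits in a byte; 360 degrees wraps back to 0.
    const int h = static_cast<int>(std::lround(hDeg / 2.0f)) % 180;
    const int s = (vmax == 0) ? 0
                : static_cast<int>(std::lround(255.0f * delta / vmax));
    return { static_cast<uint8_t>(h), static_cast<uint8_t>(s),
             static_cast<uint8_t>(vmax) };
}
```

In this formula pure red, green and blue map to H = 0, 60 and 120 respectively, and any gray pixel has S = 0.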


Example

cpp
uint8_t rgb[1920*1080*3], hsv[1920*1080*3];

acl::neon::cvtcolor::rgb2HSV(rgb, hsv, 1920, 1080);

RGB ↔ Lab

Conversion between RGB/BGR and CIE Lab.

Tier: Pro+
Channels: 3ch ↔ 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int bgr2Lab(const uint8_t* bgrImage, uint8_t* labImage,
            int width, int height, int srcStride = 0, int dstStride = 0);
int rgb2Lab(const uint8_t* rgbImage, uint8_t* labImage, ...);
int lab2BGR(const uint8_t* labImage, uint8_t* bgrImage, ...);   // CPP only

NEON provides only bgr2Lab / rgb2Lab; there is no lab2BGR.

Lab encoding: L ∈ [0, 255] (mapping for L* 0-100), a, b ∈ [0, 255] (centered on 128).


Example

cpp
uint8_t rgb[1920*1080*3], lab[1920*1080*3];
acl::neon::cvtcolor::rgb2Lab(rgb, lab, 1920, 1080);

gammaTransform

Gamma transform: dst = A * base * (src/base)^gamma; used for display gamma correction, exposure compression, etc.

Tier: Starter+
Channels: 1ch
Inplace: supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t | |
| GT | float, double | computation type |

CPP Signature

cpp
template<class T, class GT = float>
int gammaTransform(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    const GT& gamma,
    int normalizeBase,
    bool if_round = false,
    const GT& A = GT{1});

| Parameter | Type | Meaning | Default / typical value |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | required (0 = auto) |
| gamma | GT | Gamma exponent | 2.2 (display) / 1/2.2 (inverse gamma) |
| normalizeBase | int | Base value used to normalize to [0, 1] | maximum code value: 255 for u8; 2^bits − 1 for u16 (e.g. 1023 for 10-bit data) |
| if_round | bool | Whether to round the result (otherwise floor) | false |
| A | GT | Linear scaling coefficient | 1 |

Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Display gamma correction (2.2)
acl::cvtcolor::gammaTransform<uint8_t, float>(
    srcImage, dstImage, 1920, 1080, 0, 0, 2.2f, 255);
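
The formula dst = A * base * (src/base)^gamma can be checked per pixel with a few lines of scalar code. `gammaPixel` below is a hypothetical reference helper for uint8_t data, written from the formula and the if_round description above; the final clamp to [0, 255] is an assumption, since the saturation behaviour is not stated.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Scalar reference for one uint8_t pixel: dst = A * base * (src / base)^gamma.
inline uint8_t gammaPixel(uint8_t src, double gamma, int base,
                          bool ifRound = false, double A = 1.0) {
    assert(base > 0);
    double v = A * base * std::pow(static_cast<double>(src) / base, gamma);
    v = ifRound ? std::floor(v + 0.5) : std::floor(v);   // round or truncate
    return static_cast<uint8_t>(std::clamp(v, 0.0, 255.0));  // assumed saturation
}
```

With gamma > 1 the curve darkens mid-tones while leaving 0 and base fixed; with gamma < 1 it brightens them.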

Filter

Namespace: acl::filter (cpp) / acl::neon::filter (NEON)

gaussianBlur

Gaussian blur (low-pass filter), accelerated with a separable kernel (row kernel × column kernel).

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |

CPP Version: acl::filter::gaussianBlur

cpp
template<class ST, class DT>
int gaussianBlur(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    int kRadiusX, int kRadiusY,
    double sigmaX = 0.0, double sigmaY = 0.0,
    ST* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const ST* / DT* | Input / output; dstImage must be pre-allocated | non-null |
| width, height | int | Image size (pixels) | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| kRadiusX, kRadiusY | int | Kernel radius; kernel size = 2r+1 | ≥ 1 |
| sigmaX, sigmaY | double | Gaussian sigma | 0 = auto (σ = 0.15·kSize + 0.35) |
| constant | ST* | BORDER_CONSTANT fill-value pointer | nullptr |
| bt | acl::BorderType | Border handling | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::gaussianBlur* (uint8_t only)

The NEON layer provides fixed-kernel and generic entry points:

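| Entry point | Kernel | Channels | Tier | sigma configurable |
|---|---|---|---|---|
| gaussianBlur3x3 | 3×3 | 1 | Starter+ | no |
| gaussianBlur3x3_3ch | 3×3 | 3 | Starter+ | no |
| gaussianBlur5x5 | 5×5 | 1 | Starter+ | no |
| gaussianBlur11x11 | 11×11 | 1 | Starter+ | no |
| gaussianBlur (generic) | any 2r+1 | 1 | Starter+ | yes |
| gaussianBlur5x5_3ch | 5×5 | 3 | Starter+ | no |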

Trial package note: Trial does not include gaussianBlur. Use the resize wrappers acl::trial::resizeBilinear2xDown_cpp(const uint8_t*, uint8_t*) or acl::trial::resizeBilinear2xDown_neon(const uint8_t*, uint8_t*) for the Trial demo surface.

Fixed-kernel signature (3x3 / 5x5 / 11x11 / 3x3_3ch / 5x5_3ch share the same signature):

cpp
int gaussianBlur3x3(   // or gaussianBlur5x5 / gaussianBlur11x11 / gaussianBlur3x3_3ch / gaussianBlur5x5_3ch
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    uint8_t constant = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Generic signature (supports arbitrary radius / sigma):

cpp
int gaussianBlur(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadiusX, int kRadiusY,
    double sigmaX = 0.0, double sigmaY = 0.0,
    int constant = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Smart dispatch: when kSize ∈ {3, 5, 11} and sigma = 0, the generic version delivers the same throughput as the corresponding fixed-kernel variant; other (kSize, sigma) combinations fall back to the dynamic sepFilter2D performance profile.
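
The automatic sigma rule from the parameter table (σ = 0.15·kSize + 0.35 when sigma = 0) and the resulting 1D kernel can be reproduced as follows. This is a minimal sketch: `autoSigma` and `gaussianKernel1D` are hypothetical helpers, and the library's internal fixed-point coefficients may differ from these floating-point weights.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// sigma used when the caller passes 0: 0.15 * kSize + 0.35, kSize = 2r + 1.
inline double autoSigma(int kRadius) {
    assert(kRadius >= 1);
    const int kSize = 2 * kRadius + 1;
    return 0.15 * kSize + 0.35;
}

// Normalized 1D Gaussian weights for one axis of the separable kernel.
inline std::vector<double> gaussianKernel1D(int kRadius, double sigma = 0.0) {
    if (sigma <= 0.0) sigma = autoSigma(kRadius);
    std::vector<double> k(2 * kRadius + 1);
    double sum = 0.0;
    for (int i = -kRadius; i <= kRadius; ++i) {
        k[i + kRadius] = std::exp(-(i * i) / (2.0 * sigma * sigma));
        sum += k[i + kRadius];
    }
    for (double& w : k) w /= sum;   // weights sum to 1, preserving brightness
    return k;
}
```

For a 3×3 kernel the rule gives σ = 0.8, and for 5×5 it gives σ = 1.1.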


Example

cpp
#include <acl/acl.h>
#include <acl/api.h>
acl::init("license.dat");

uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Case 1: 3×3 fixed kernel (Starter+ paid API)
acl::neon::filter::gaussianBlur3x3(srcImage, dstImage, 1920, 1080);

// Case 2: 5×5 + custom sigma (Starter+)
acl::neon::filter::gaussianBlur(srcImage, dstImage, 1920, 1080, 0, 0, 2, 2, 1.5, 1.5);

boxFilter

Box (mean) filter: all pixels within the kernel are summed with equal weight. Normalization is optional: with isNormalize = true the sum is divided by the kernel area (kSize²), which gives the standard mean filter.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST, DT (CPP) | uint8_t, uint16_t, float | |
| DT (NEON) | uint8_t, int (src is uint8_t) | NEON-only |

CPP Version: acl::filter::boxFilter

cpp
template<class ST, class DT>
int boxFilter(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    int kRadius,
    bool isNormalize = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const ST* / DT* | Input / output | non-null |
| width, height | int | Image size (pixels) | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| kRadius | int | Kernel radius (kSize = 2r+1) | ≥ 1 |
| isNormalize | bool | true → mean (divide by kSize²); false → sum | true |
| bt | acl::BorderType | Border handling | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::boxFilter* (uint8_t only)

| Entry point | Kernel | Channels | Tier |
|---|---|---|---|
| boxFilter3x3 | 3×3 | 1 | Starter+ |
| boxFilter5x5 | 5×5 | 1 | Starter+ |
| boxFilter (generic) | any 2r+1 | 1 | Starter+ |

There is no NEON entry point for 3 channels yet; use the CPP version acl::filter::boxFilter<uint8_t,uint8_t>(..., cn=3).

Fixed-kernel signature (boxFilter3x3 / boxFilter5x5):

cpp
template<class DT>
int boxFilter3x3(   // or boxFilter5x5
    const uint8_t* srcImage, DT* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int constant = 0,
    bool isNormalize = true,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Generic signature:

cpp
template<class DT>
int boxFilter(
    const uint8_t* srcImage, DT* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadius, int constant,
    bool isNormalize = true,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

DT supports uint8_t (normalize) / uint32_t (no normalize, overflow-safe). cn is a runtime parameter (1 or 3).


Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];
uint32_t dstSum[1920*1080];

// 3×3 mean filter
acl::neon::filter::boxFilter3x3<uint8_t>(srcImage, dstImage, 1920, 1080, 0, 0);

// 5×5 sum (not normalized, u32 output)
acl::neon::filter::boxFilter5x5<uint32_t>(
    srcImage, dstSum, 1920, 1080, 0, 0,
    /*constant=*/0, /*isNormalize=*/false, /*cn=*/1,
    acl::BorderType::BORDER_REPLICATE);

filter2D

Generic 2D convolution (arbitrary kernel). Internally detects separable kernels; if separable, automatically converts to sepFilter2D for speedup.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |
| KT | uint8_t, uint16_t, float | |

CPP Version: acl::filter::filter2D

cpp
template<class ST, class DT, class KT>
int filter2D(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    const KT* kernel, int kRadiusX, int kRadiusY,
    const ST* constant = nullptr,
    bool isNormalize = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const ST* / DT* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| kernel | const KT* | Kernel data, row-major, size (2rX+1)×(2rY+1) | non-null |
| kRadiusX, kRadiusY | int | Kernel radius | ≥ 1 |
| constant | const ST* | BORDER_CONSTANT fill-value pointer | nullptr |
| isNormalize | bool | When true, output is divided by the sum of the kernel elements | true |
| bt | acl::BorderType | Border handling | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::filter2D (uint8_t input only)

cpp
template<class DT, class KT>
int filter2D(
    const uint8_t* srcImage, DT* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    const KT* kernel, int kRadiusX, int kRadiusY,
    int constant = 0,
    bool isNormalize = true,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

DT / KT: typically (uint8_t, float) or (int32_t, int32_t).


Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// 5×5 Laplacian-style sharpening kernel (coefficients sum to 1, so no normalization).
// KT = float: the kernel has negative coefficients, which the unsigned KT types cannot hold.
float K[25] = { 0, 0,-1, 0, 0,
                0,-1,-2,-1, 0,
               -1,-2,17,-2,-1,
                0,-1,-2,-1, 0,
                0, 0,-1, 0, 0 };
acl::filter::filter2D<uint8_t, uint8_t, float>(
    srcImage, dstImage, 1920, 1080, 1, 0, 0, K, 2, 2,
    nullptr, /*isNormalize=*/false, acl::BorderType::BORDER_REPLICATE);

sepFilter2D

Separable 2D convolution: first convolves along rows with kernelX, then along columns with kernelY. Faster than filter2D: for a k×k kernel the cost per pixel drops from O(k²) to O(2k).

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | uint8_t, uint16_t, float | |
| KT | uint8_t, uint16_t, float | |

CPP Version: acl::filter::sepFilter2D

cpp
template<class ST, class DT, class KT>
int sepFilter2D(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    const KT* kernelX, const KT* kernelY,
    int kRadiusX, int kRadiusY,
    const ST* constant = nullptr,
    bool isNormalize = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const ST* / DT* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| kernelX, kernelY | const KT* | Row / column 1D kernels, lengths 2rX+1 / 2rY+1 respectively | non-null |
| kRadiusX, kRadiusY | int | Kernel radius | ≥ 1 |
| constant | const ST* | BORDER_CONSTANT fill-value pointer | nullptr |
| isNormalize | bool | If true, divide by the kernel sum | true |
| bt | acl::BorderType | Border-handling mode | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::sepFilter2D (uint8_t only, 1ch / 3ch)

cpp
template<class KT>
int sepFilter2D(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    const KT* kernelX, const KT* kernelY,
    int kRadiusX, int kRadiusY,
    int constant = 0,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

cn = 1 or 3 (channel count selected at runtime). KT: usually float.


Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

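// Gaussian 5×5 decomposed into [1,4,6,4,1]/16 (rows) × [1,4,6,4,1]/16 (columns).
// The NEON signature has no isNormalize flag, so the kernels are passed pre-normalized.
float kx[5] = {1/16.f, 4/16.f, 6/16.f, 4/16.f, 1/16.f};
float ky[5] = {1/16.f, 4/16.f, 6/16.f, 4/16.f, 1/16.f};
acl::neon::filter::sepFilter2D<float>(
    srcImage, dstImage, 1920, 1080, 0, 0, kx, ky, 2, 2, /*constant=*/0, /*cn=*/1);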
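
A kernel is separable when it equals the outer product of a column kernel and a row kernel. The sketch below builds that 2D kernel and counts the multiply-adds per output pixel for both approaches; `outerProduct`, `macsFull` and `macsSeparable` are hypothetical helpers for illustration, not ACL entry points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 2D kernel equivalent to applying kx along rows, then ky along columns.
inline std::vector<float> outerProduct(const std::vector<float>& kx,
                                       const std::vector<float>& ky) {
    assert(!kx.empty() && !ky.empty());
    std::vector<float> k2d(kx.size() * ky.size());
    for (std::size_t y = 0; y < ky.size(); ++y)
        for (std::size_t x = 0; x < kx.size(); ++x)
            k2d[y * kx.size() + x] = ky[y] * kx[x];
    return k2d;
}

// Multiply-adds per output pixel.
inline int macsFull(int kw, int kh)      { return kw * kh; }   // one 2D pass
inline int macsSeparable(int kw, int kh) { return kw + kh; }   // two 1D passes
```

For the 5×5 Gaussian above this is 25 multiply-adds per pixel for the full kernel versus 10 for the two 1D passes, and the gap widens quadratically with kernel size.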

sobel3x3

3×3 Sobel edge-detection operator. The runtime flag isGradX selects whether to compute Gx (horizontal gradient) or Gy (vertical gradient).

Tier: Starter+
Channels: 1ch (call separately per channel or use filter2D)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | typically int16_t / int32_t / float | |

Output type: int16_t (short), since gradient values may be negative.

CPP Version: acl::filter::sobel3x3

cpp
template<class ST, class DT>
int sobel3x3(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    const ST* constant = nullptr,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const ST* / DT* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| constant | const ST* | BORDER_CONSTANT fill value | nullptr |
| isGradX | bool | true: Gx (horizontal); false: Gy (vertical) | true |
| bt | acl::BorderType | Border-handling mode | BORDER_REFLECT_101 |

Each call computes one direction; to obtain both Gx and Gy, make two separate calls.


NEON Version: acl::neon::filter::sobel3x3 (uint8_t → int16_t only, 1ch)

cpp
int sobel3x3(
    const uint8_t* srcImage, int16_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    int constant = 0,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Example

cpp
uint8_t src_u8[1920*1080];                 // grayscale input
int16_t gx[1920*1080], gy[1920*1080];
acl::neon::filter::sobel3x3(src_u8, gx, 1920, 1080, 0, 0, 0, /*isGradX=*/true);   // Gx
acl::neon::filter::sobel3x3(src_u8, gy, 1920, 1080, 0, 0, 0, /*isGradX=*/false);  // Gy
// Gradient magnitude can be composed via acl::arithmetic::phaseMagnitude
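
The need for a signed 16-bit output follows from the kernel weights: on uint8_t input the 3×3 Sobel response ranges from −1020 to +1020. The scalar reference below evaluates one interior pixel. `sobelAt` is a hypothetical helper, and the sign convention (right minus left for Gx, bottom minus top for Gy) is an assumption that may differ from ACL's.

```cpp
#include <cassert>
#include <cstdint>

// Reference 3x3 Sobel response at one interior pixel (no border handling).
// Gx uses [-1 0 1; -2 0 2; -1 0 1]; Gy uses the transposed kernel.
inline int16_t sobelAt(const uint8_t* img, int stride, int x, int y, bool isGradX) {
    assert(x >= 1 && y >= 1);
    auto p = [&](int dx, int dy) {
        return static_cast<int>(img[(y + dy) * stride + (x + dx)]);
    };
    const int g = isGradX
        ? (p(1, -1) + 2 * p(1, 0) + p(1, 1)) - (p(-1, -1) + 2 * p(-1, 0) + p(-1, 1))
        : (p(-1, 1) + 2 * p(0, 1) + p(1, 1)) - (p(-1, -1) + 2 * p(0, -1) + p(1, -1));
    return static_cast<int16_t>(g);   // |g| <= 4 * 255 = 1020, fits in int16_t
}
```

A vertical step edge from 0 to 255 gives the extreme Gx of 1020 and a Gy of 0, which is why uint8_t cannot hold the result.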

scharr

3×3 Scharr edge-detection operator. Offers better rotational symmetry than Sobel and slightly higher numerical precision.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | typically int16_t / int32_t / float | |

Output type: int16_t

CPP Version: acl::filter::scharr

cpp
template<class ST, class DT>
int scharr(
    const ST* srcImage, DT* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    const ST* constant = nullptr,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

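Parameter semantics are the same as sobel3x3, except that scharr takes no cn parameter. Only the kernel coefficients differ: the vertical-gradient (Gy) Scharr kernel is [-3, -10, -3; 0, 0, 0; 3, 10, 3], and the Gx kernel is its transpose.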


NEON Version: acl::neon::filter::scharr (uint8_t → int16_t only)

cpp
int scharr(
    const uint8_t* srcImage, int16_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    bool isGradX = true,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Example

cpp
uint8_t src_u8[1920*1080];   // grayscale input
int16_t gx[1920*1080];
acl::neon::filter::scharr(src_u8, gx, 1920, 1080, 0, 0, /*isGradX=*/true);   // Gx (Scharr)

laplacian

Laplacian operator (second-order gradient), used for edge detection or sharpening. Internally performs two convolutions with a 3×3 or larger kernel.

Tier: Starter+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| ST | uint8_t, uint16_t, float | |
| DT | typically int16_t / int32_t / float | |

Output type: int16_t (the second-order gradient may be negative)

CPP Version: acl::filter::laplacian

cpp
template<class ST, class DT>
int laplacian(
    const ST* srcImage, DT* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    int ksize = 1, double scale = 1.0, double delta = 0.0,
    const ST* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const ST* / DT* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| ksize | int | Kernel aperture size | 1 (= 3×3 standard Laplacian) |
| scale | double | Output scaling factor | 1.0 |
| delta | double | Output offset | 0.0 |
| constant | const ST* | BORDER_CONSTANT fill value | nullptr |
| bt | acl::BorderType | Border-handling mode | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::laplacian (uint8_t → int16_t only)

cpp
int laplacian(
    const uint8_t* srcImage, int16_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    int ksize = 1, int constant = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

The NEON version omits scale / delta (fixed at 1.0 / 0.0). For scaling, use the CPP version.


Example

cpp
uint8_t srcImage[1920*1080];
int16_t dstImage[1920*1080];
acl::neon::filter::laplacian(srcImage, dstImage, 1920, 1080);

canny

Canny edge detection. Typical pipeline: Gaussian blur → gradient → non-maximum suppression → double-threshold linking.

Tier: Starter+
Channels: 1ch (grayscale input)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| Src_T | uint8_t, uint16_t | CPP backend |
| Src_T | uint8_t | NEON backend |

NEON Version: acl::neon::filter::canny (uint8_t only)

cpp
int canny(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int low_thresh, int high_thresh,
    int aperture_size = 3,
    int srcStride = 0, int dstStride = 0,
    bool l2GradFlag = true);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage | const uint8_t* | Input grayscale image | non-null |
| dstImage | uint8_t* | Output binary edge image (0 / 255) | non-null |
| width, height | int | Image size | > 0 |
| low_thresh, high_thresh | int | Low / high double thresholds | low < high |
| aperture_size | int | Sobel kernel aperture | 3 (only 3 is supported) |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| l2GradFlag | bool | true: L2 Euclidean gradient; false: L1 gradient (faster) | true |

Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];
acl::neon::filter::canny(srcImage, dstImage, 1920, 1080, 50, 150);

morphology (erode / dilate)

Basic morphological operators: erosion (erode, the minimum over the kernel footprint) and dilation (dilate, the maximum). Uses the van Herk / Gil-Werman algorithm, which is O(N) in the number of pixels: the cost per pixel is independent of kernel size.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

Kernel shape: square (determined by radius; actual kernel size = 2r+1)

CPP Version: acl::filter::erode / acl::filter::dilate

cpp
template<class T>
int erode(
    const T* srcImage, T* dstImage,
    int width, int height, int cn, int radius,
    int srcStride = 0, int dstStride = 0);

template<class T>
int dilate(
    const T* srcImage, T* dstImage,
    int width, int height, int cn, int radius,
    int srcStride = 0, int dstStride = 0);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 or 3 |
| radius | int | Structuring-element radius (square kernel, size = 2r+1) | ≥ 1 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |

NEON Version: acl::neon::filter::erode / acl::neon::filter::dilate (uint8_t only)

cpp
int erode(   // or dilate
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height, int cn, int radius,
    int srcStride = 0, int dstStride = 0);

Parameter semantics match the CPP version.


Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// 3×3 erosion (radius=1)
acl::filter::erode<uint8_t>(srcImage, dstImage, 1920, 1080, 1, 1);

// 11×11 dilation (radius=5; the O(N) algorithm is unaffected by size)
acl::filter::dilate<uint8_t>(srcImage, dstImage, 1920, 1080, 1, 5);
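
To show why the runtime does not grow with the radius, here is a minimal 1D sketch of the van Herk / Gil-Werman running minimum, the building block a square erosion applies once per row and once per column. `erode1D` is a hypothetical helper; padding the border with 255 (so out-of-range samples never win the minimum) is an assumption of this sketch, not ACL's documented border policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// 1D erosion (running minimum) with window 2r+1, about three comparisons per sample.
inline std::vector<uint8_t> erode1D(const std::vector<uint8_t>& src, int r) {
    assert(r >= 0);
    const int n = static_cast<int>(src.size());
    const int k = 2 * r + 1;
    // Pad r samples on each side, then round up to a whole number of k-blocks.
    const int total = ((n + 2 * r + k - 1) / k) * k;
    std::vector<uint8_t> p(total, 255);
    std::copy(src.begin(), src.end(), p.begin() + r);

    std::vector<uint8_t> g(total), h(total);
    for (int i = 0; i < total; ++i)        // prefix minima inside each block
        g[i] = (i % k == 0) ? p[i] : std::min(g[i - 1], p[i]);
    for (int i = total - 1; i >= 0; --i)   // suffix minima inside each block
        h[i] = (i % k == k - 1) ? p[i] : std::min(h[i + 1], p[i]);

    std::vector<uint8_t> dst(n);
    for (int i = 0; i < n; ++i)            // window [i, i+2r] spans at most 2 blocks
        dst[i] = std::min(h[i], g[i + 2 * r]);
    return dst;
}
```

Each output needs one prefix lookup, one suffix lookup and one comparison regardless of r; dilation is the same scheme with max in place of min and 0 as the padding value.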

medianFilter

3×3 median filter — outputs the median of the 9 pixels within the kernel. Typical use: salt-and-pepper noise removal.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: supported (src == dst OK)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| DT | uint8_t, uint16_t, float | |

Kernel size: 3×3 only (5×5 and larger are not implemented)

CPP Version: acl::filter::medianFilter3x3

cpp
template<class DT>
int medianFilter3x3(
    const DT* srcImage, DT* dstImage,
    int width, int height,
    int cn = 1,
    int srcStride = 0, int dstStride = 0,
    DT borderValue = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const DT* / DT* | Input / output (inplace supported) | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| borderValue | DT | BORDER_CONSTANT fill value | 0 |
| bt | acl::BorderType | Border-handling mode | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::medianFilter3x3 / medianFilter3x3_3ch (uint8_t only)

| Entry point | Channels | Tier |
|---|---|---|
| medianFilter3x3 | 1 | Starter+ |
| medianFilter3x3_3ch | 3 | Starter+ |

cpp
int medianFilter3x3(   // or medianFilter3x3_3ch
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    uint8_t borderValue = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Example

cpp
uint8_t srcImage[1920*1080];

// Salt-and-pepper denoise (in-place)
acl::neon::filter::medianFilter3x3(srcImage, srcImage, 1920, 1080);
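
For reference, the value a 3×3 median produces at one interior pixel can be computed with a partial sort. `median3x3At` below is a hypothetical scalar helper with no border handling; optimized implementations typically use a fixed sorting network instead of `std::nth_element`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Median of the 3x3 neighbourhood around interior pixel (x, y).
inline uint8_t median3x3At(const uint8_t* img, int stride, int x, int y) {
    assert(x >= 1 && y >= 1);
    std::array<uint8_t, 9> w{};
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            w[n++] = img[(y + dy) * stride + (x + dx)];
    // Partial sort: after this call w[4] holds the 5th smallest value.
    std::nth_element(w.begin(), w.begin() + 4, w.end());
    return w[4];
}
```

A single outlier (a "salt" pixel of 255 in a flat region of 10) is discarded entirely, whereas a mean filter would smear it into its neighbours.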

bilateralFilter

Edge-preserving smoothing — bilateral filter; simultaneously considers spatial distance and pixel-value difference, denoising while preserving edges.

Tier: Pro+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T (CPP) | uint8_t, uint16_t, float | |
| T (NEON) | uint8_t | NEON-only |

CPP Version: acl::filter::bilateralFilter

cpp
template<class T = uint8_t>
int bilateralFilter(
    const T* srcImage, T* dstImage,
    int width, int height, int cn,
    int srcStride, int dstStride,
    int d, double sigmaColor, double sigmaSpace,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | required |
| d | int | Filter radius (kernel = 2d+1) | ≥ 1 |
| sigmaColor | double | Standard deviation in color space | typically 10-100 |
| sigmaSpace | double | Standard deviation in coordinate space | typically 10-100 |
| constant | const T* | BORDER_CONSTANT fill value | nullptr |
| bt | acl::BorderType | Border handling | BORDER_REFLECT_101 |

NEON Version: acl::neon::filter::bilateralFilter (uint8_t only, 1ch / 3ch)

cpp
int bilateralFilter(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int d, double sigmaColor, double sigmaSpace,
    int constant = 0,
    int cn = 1,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

cn = 1 or 3 (channel count, runtime).


Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Edge-preserving denoising, d=5, sigmaColor=sigmaSpace=30
acl::neon::filter::bilateralFilter(
    srcImage, dstImage, 1920, 1080, 0, 0, 5, 30.0, 30.0);
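
The roles of sigmaSpace and sigmaColor are easiest to see in the per-neighbour weight of the textbook bilateral filter, which is the product of a spatial Gaussian and an intensity-difference Gaussian. `bilateralWeight` is a hypothetical helper showing that standard formulation; whether ACL uses exactly this form (for example lookup tables or a different colour distance for 3ch) is not stated here.

```cpp
#include <cassert>
#include <cmath>

// Weight of a neighbour at offset (dx, dy) whose intensity differs by dI
// from the centre pixel: exp(-dist^2 / 2*sigmaSpace^2 - dI^2 / 2*sigmaColor^2).
inline double bilateralWeight(int dx, int dy, double dI,
                              double sigmaSpace, double sigmaColor) {
    assert(sigmaSpace > 0.0 && sigmaColor > 0.0);
    const double spatial = (dx * dx + dy * dy) / (2.0 * sigmaSpace * sigmaSpace);
    const double range   = (dI * dI) / (2.0 * sigmaColor * sigmaColor);
    return std::exp(-(spatial + range));
}
```

The output pixel is the weighted average of its neighbours, normalized by the sum of weights. Neighbours across a strong edge have a large dI and therefore a weight near zero, which is what preserves the edge.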

nlMeansDenoising

Non-Local Means denoising — searches for similar patches within the entire search window and forms a weighted average, denoising while preserving details.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T (CPP) | uint8_t, uint16_t, float | |
| T (NEON) | uint8_t | NEON-only |

CPP Version: acl::filter::nlMeansDenoising

cpp
template<class T = uint8_t>
int nlMeansDenoising(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    float h,
    int patchRadius = 3,
    int searchRadius = 10);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | required |
| h | float | Denoising strength (larger = smoother; typically 5-15) | required |
| patchRadius | int | Patch radius (patch = 2r+1) | 3 (= 7×7) |
| searchRadius | int | Search-window radius | 10 (= 21×21) |

NEON Version: acl::neon::filter::nlMeansDenoising (uint8_t only)

cpp
int nlMeansDenoising(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    float h,
    int patchRadius = 3,
    int searchRadius = 10);

Non-templated; the signature matches the CPP version with the template <T> removed.


Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Light denoising (h=10, patch 7×7, search 21×21 — defaults)
acl::neon::filter::nlMeansDenoising(srcImage, dstImage, 1920, 1080, 0, 0, 10.0f);

// Stronger denoising + larger search window
acl::neon::filter::nlMeansDenoising(srcImage, dstImage, 1920, 1080, 0, 0, 15.0f, 3, 15);

guidedFilter

Guided filter — edge-aware smoothing of the input image based on the guide image (guideImage). O(N) complexity (does not grow with kernel size).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP Version: acl::filter::guidedFilter

cpp
template<class T>
int guidedFilter(
    const T* guideImage,
    const T* srcImage,
    T* dstImage,
    int width, int height,
    int guideStride, int srcStride, int dstStride,
    int radius, double eps);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| guideImage | const T* | Guide image (often identical to srcImage; another image also works) | non-null |
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| guideStride, srcStride, dstStride | int | Respective bytes per row | required |
| radius | int | Window radius (window size = 2r+1) | ≥ 1 |
| eps | double | Regularization parameter | typically 0.01 (integer images 1.0-100.0) |

NEON Version: acl::neon::filter::guidedFilter (uint8_t only)

cpp
int guidedFilter(
    const uint8_t* guideImage,
    const uint8_t* srcImage,
    uint8_t* dstImage,
    int width, int height,
    int guideStride, int srcStride, int dstStride,
    int radius, double eps);

Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Use the source image itself as the guide; radius=8, eps=1000
acl::filter::guidedFilter<uint8_t>(
    srcImage, srcImage, dstImage, 1920, 1080, 1920, 1920, 1920, 8, 1000.0);

stackBlur

Approximate Gaussian blur with O(1) cost per pixel (independent of kernel size), which makes it suitable for large kernels. The result is close to, but not identical to, a true Gaussian, so use it where an exact Gaussian is not required (e.g. blurred UI backgrounds).

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP Version: acl::filter::stackBlur

cpp
template<class T>
int stackBlur(
    const T* srcImage, T* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    int kSizeX = 3, int kSizeY = 3);

| Parameter | Type | Meaning | Default / constraint |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| kSizeX, kSizeY | int | Horizontal / vertical kernel size (must be odd) | 3 |

NEON Version: acl::neon::filter::stackBlur (uint8_t only)

cpp
int stackBlur(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height, int cn,
    int srcStride = 0, int dstStride = 0,
    int kSizeX = 3, int kSizeY = 3);

Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// 21×21 blur (well beyond Gaussian 11×11; the O(1) algorithm does not degrade)
acl::filter::stackBlur<uint8_t>(srcImage, dstImage, 1920, 1080, 1, 0, 0, 21, 21);

unsharpMask

Unsharp Mask — subtracts a blurred image from the source to produce a sharpening enhancement. amount controls the sharpening strength.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP Version: acl::filter::unsharpMask

cpp
template<class T>
int unsharpMask(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadius, double sigma, float amount);

| Parameter | Type | Meaning | Default / typical value |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | required |
| kRadius | int | Gaussian kernel radius | ≥ 1 |
| sigma | double | Gaussian sigma (controls blur strength) | typically 1.0 ~ 2.0 |
| amount | float | Sharpening strength | typically 0.5 ~ 2.0 (1.0 = original strength) |

NEON Version: acl::neon::filter::unsharpMask (uint8_t only)

cpp
int unsharpMask(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int kRadius, double sigma, float amount);

Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Light sharpening (radius=2, sigma=1.5, amount=1.2)
acl::filter::unsharpMask<uint8_t>(
    srcImage, dstImage, 1920, 1080, 0, 0, 2, 1.5, 1.2f);
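
The classic unsharp-mask formula adds a scaled copy of the detail layer (source minus blurred) back to the source: dst = src + amount · (src − blurred). The per-pixel sketch below assumes ACL follows this standard form with saturation to the uint8_t range; `unsharpPixel` is a hypothetical helper and the blurred value is taken as an input rather than computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// dst = clamp(src + amount * (src - blurred)), for one uint8_t pixel.
inline uint8_t unsharpPixel(uint8_t src, uint8_t blurred, float amount) {
    assert(amount >= 0.0f);
    const float detail = static_cast<float>(src) - static_cast<float>(blurred);
    const float v = static_cast<float>(src) + amount * detail;
    return static_cast<uint8_t>(std::lround(std::clamp(v, 0.0f, 255.0f)));
}
```

In flat regions the detail term is zero and the pixel is unchanged; near edges the overshoot on both sides is what makes the edge look crisper, and a large amount produces visible halos.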

gaborFilter

Gabor filter — sinusoidal × Gaussian-modulated kernel used for texture analysis and direction-sensitive edge detection.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t, uint16_t, float | |

CPP Version: acl::filter::gaborFilter

cpp
template<class T>
int gaborFilter(
    const T* srcImage, T* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int ksize, double sigma, double theta,
    double lambd, double gamma, double psi);

| Parameter | Type | Meaning | Typical value / constraint |
|---|---|---|---|
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| width, height | int | Image size | > 0 |
| srcStride, dstStride | int | Bytes per row | required |
| ksize | int | Kernel size (typically odd) | 21 / 31 |
| sigma | double | Gaussian envelope sigma | 4.0-8.0 |
| theta | double | Direction (radians) | 0 (horizontal) ~ π |
| lambd | double | Sinusoidal wavelength | 10.0 |
| gamma | double | Spatial aspect ratio | 0.5 |
| psi | double | Phase offset | 0 |

NEON Version: acl::neon::filter::gaborFilter (uint8_t only)

cpp
int gaborFilter(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride, int dstStride,
    int ksize, double sigma, double theta,
    double lambd, double gamma, double psi);

Example

cpp
uint8_t srcImage[1920*1080], dstImage[1920*1080];

// Horizontal-direction Gabor kernel
acl::neon::filter::gaborFilter(srcImage, dstImage, 1920, 1080, 0, 0,
    21, 4.0, 0.0, 10.0, 0.5, 0.0);
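
The six kernel parameters map onto the standard real-valued Gabor function: a Gaussian envelope (sigma, gamma) multiplied by a cosine carrier (lambd, psi), both evaluated in coordinates rotated by theta. The sketch below builds that kernel. `gaborKernel` is a hypothetical helper, and details such as axis orientation or kernel normalization inside ACL may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Row-major ksize x ksize Gabor kernel:
//   g(x, y) = exp(-(x'^2 + gamma^2 * y'^2) / (2 sigma^2)) * cos(2 pi x' / lambd + psi)
//   x' =  x cos(theta) + y sin(theta),  y' = -x sin(theta) + y cos(theta)
inline std::vector<double> gaborKernel(int ksize, double sigma, double theta,
                                       double lambd, double gamma, double psi) {
    assert(ksize > 0 && ksize % 2 == 1 && sigma > 0.0 && lambd > 0.0);
    const double kPi = 3.14159265358979323846;
    const int r = ksize / 2;
    std::vector<double> k(static_cast<std::size_t>(ksize) * ksize);
    for (int y = -r; y <= r; ++y) {
        for (int x = -r; x <= r; ++x) {
            const double xr =  x * std::cos(theta) + y * std::sin(theta);
            const double yr = -x * std::sin(theta) + y * std::cos(theta);
            const double envelope =
                std::exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma));
            const double carrier = std::cos(2.0 * kPi * xr / lambd + psi);
            k[(y + r) * ksize + (x + r)] = envelope * carrier;
        }
    }
    return k;
}
```

A bank of such kernels at several theta values (for example 0, π/4, π/2, 3π/4) is the usual way to obtain direction-sensitive responses for texture analysis.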

edgePreservingFilter / detailEnhance

Edge-preserving filtering — O(N) edge-preserving smoothing based on recursive domain transforms (Gastal & Oliveira, SIGGRAPH 2011).
edgePreservingFilter smooths while preserving edges; detailEnhance uses it in reverse to enhance details.

Tier: Business
Channels: 3ch (RGB)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

CPP Signature

cpp
int edgePreservingFilter(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float sigmaS = 60.0f, float sigmaR = 0.4f,
    int numIter = 3);

int detailEnhance(
    const uint8_t* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float sigmaS = 10.0f, float sigmaR = 0.15f);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| sigmaS | float | Spatial sigma (larger → larger smoothing range) | 60.0 (edge-preserving) / 10.0 (detail) |
| sigmaR | float | Color-value sigma | 0.4 / 0.15 |
| numIter | int | Number of iterations (edge-preserving) | 3 |

Example

cpp
uint8_t rgbSrc[1920*1080*3], rgbDst[1920*1080*3];

// Cartoon-style effect
acl::filter::edgePreservingFilter(rgbSrc, rgbDst, 1920, 1080, 0, 0, 60.0f, 0.4f, 3);

// Detail enhancement
acl::filter::detailEnhance(rgbSrc, rgbDst, 1920, 1080);

tonemap

HDR tone mapping — compresses a float HDR image into the uint8_t LDR display space. Provides 3 classic algorithms.

Tier: Business
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| Input | float | |
| Output | uint8_t | |

CPP Signature

cpp
// Linear exposure (simple gamma)
int tonemapLinear(
    const float* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float gamma = 2.2f, float exposure = 1.0f);

// Reinhard (global key control)
int tonemapReinhard(
    const float* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float gamma = 2.2f,
    float key = 0.18f,
    float lWhite = 0.0f);

// Drago (logarithmic mapping, suitable for high dynamic range)
int tonemapDrago(
    const float* srcImage, uint8_t* dstImage,
    int width, int height,
    int srcStride = 0, int dstStride = 0,
    float gamma = 2.2f,
    float saturation = 1.0f,
    float bias = 0.85f);

| Parameter | Meaning | Typical value |
| --- | --- | --- |
| gamma | Output gamma correction | 2.2 |
| exposure (Linear) | Exposure multiplier | 1.0 |
| key (Reinhard) | Mean-luminance key | 0.18 |
| lWhite (Reinhard) | White point (0 = use the image maximum) | 0.0 |
| saturation (Drago) | Saturation | 1.0 |
| bias (Drago) | Bias | 0.85 |

Example

cpp
float hdr[1920*1080];
uint8_t ldr[1920*1080];

// Simplest: linear + gamma
acl::filter::tonemapLinear(hdr, ldr, 1920, 1080);

// Use Reinhard when the scene is too bright
acl::filter::tonemapReinhard(hdr, ldr, 1920, 1080, 0, 0, 2.2f, 0.18f);
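
For orientation, here is a per-pixel sketch of the two simplest curves. Both functions are hypothetical reference helpers, not ACL Pack entry points, and the exact clamping and rounding inside the library are assumptions. The Reinhard curve is the textbook extended form, applied to luminance that has already been scaled by key / mean luminance.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Linear exposure followed by gamma encoding, quantized to 8 bits.
// Hypothetical reference, assuming input is clamped to [0, 1] after exposure.
std::uint8_t tonemapLinearPixel(float hdr, float gamma = 2.2f, float exposure = 1.0f) {
    assert(gamma > 0.0f);
    const float scaled = std::clamp(hdr * exposure, 0.0f, 1.0f);
    const float encoded = std::pow(scaled, 1.0f / gamma);
    return static_cast<std::uint8_t>(std::lround(encoded * 255.0f));
}

// Extended Reinhard curve: L is key-scaled luminance, lWhite is the
// luminance that should map to pure white.
float reinhardCurve(float L, float lWhite) {
    assert(L >= 0.0f && lWhite > 0.0f);
    return L * (1.0f + L / (lWhite * lWhite)) / (1.0f + L);
}
```

In this formulation, luminance equal to lWhite maps to 1.0, and a very large lWhite reduces the curve to L / (1 + L), which compresses highlights most strongly.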

mergeMertens

Multi-exposure image fusion (Mertens et al. 2007) — fuses multiple images at different exposures into a single balanced-exposure output. A simpler alternative to the HDR + tonemap pipeline.

Tier: Business
Channels: 3ch (RGB)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP Signature

cpp
int mergeMertens(
    const uint8_t** images, int numImages,
    uint8_t* dstImage,
    int width, int height,
    const int* strides = nullptr,
    int dstStride = 0,
    float wContrast = 1.0f,
    float wSaturation = 1.0f,
    float wExposure = 1.0f);

| Parameter | Meaning | Typical value |
| --- | --- | --- |
| images | Input image pointer array (3ch RGB) | numImages images |
| numImages | Number of input images | typically 3 (underexposed / normal / overexposed) |
| strides | Per-image stride array; when nullptr, every image uses width*3 | optional |
| wContrast | Contrast weight | 1.0 |
| wSaturation | Saturation weight | 1.0 |
| wExposure | Exposure-quality weight | 1.0 |

Example

cpp
uint8_t under[1920*1080*3], normal[1920*1080*3], over[1920*1080*3];
uint8_t fused[1920*1080*3];

const uint8_t* imgs[3] = { under, normal, over };
acl::filter::mergeMertens(imgs, 3, fused, 1920, 1080);

Geometric

Namespace: acl::geometric (CPP) / acl::neon::geometric (NEON)

resize

Image resize, supporting 4 interpolation modes (NEAREST / LINEAR2D / CUBIC4x4 / AREA_AVG).

Tier: Starter+
Channels: 1ch / 3ch / 4ch (runtime via hcn / cn)
Inplace: supported only when srcWidth == dstWidth && srcHeight == dstHeight (degenerate copy / crop case)
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t, uint16_t, float | |

CPP Version: acl::geometric::resize

cpp
template<class T, class OT = float>
int resize(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    int hcn = 1, int vcn = 1,
    acl::InterpMode im = acl::InterpMode::LINEAR2D);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| srcWidth, srcHeight | int | Source image size | > 0 |
| dstWidth, dstHeight | int | Destination image size | > 0 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| hcn | int | Horizontal channel count (grayscale 1 / Bayer 2 / RGB 3 / RGBA 4) | 1 |
| vcn | int | Vertical channel count | 1 |
| im | acl::InterpMode | NEAREST / LINEAR2D / CUBIC4x4 / AREA_AVG | LINEAR2D |

Template parameters:

  • T — input / output element type
  • OT — intermediate interpolation type (float default, double high-precision)

NEON Version: acl::neon::geometric::resize (uint8_t / uint16_t)

cpp
template<class T>
int resize(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::InterpMode im = acl::InterpMode::LINEAR2D,
    int cn = 1,
    int shiftRight = 0);

Parameters:

  • T: uint8_t / uint16_t
  • im: same as the CPP version
  • cn: channel count (1 / 3 / 4)
  • shiftRight: number of bits to right-shift the data for P010 format (set when the 10-bit value sits in the high bits of a 16-bit word)

Example

cpp
uint8_t srcImage[1920*1080], dstImage[960*540];

// NEON LINEAR2D 1ch, 2× downscale
acl::neon::geometric::resize<uint8_t>(
    srcImage, dstImage, 1920, 1080, 960, 540);

// CPP LINEAR2D 3ch upscale
uint8_t rgbSrc[640*480*3], rgbDst[1280*960*3];
acl::geometric::resize<uint8_t>(
    rgbSrc, rgbDst, 640, 480, 1280, 960,
    /*srcStride=*/0, /*dstStride=*/0,
    /*hcn=*/3, /*vcn=*/1,
    acl::InterpMode::LINEAR2D);

// float CUBIC4x4 (CPP only)
float fSrc[640*480], fDst[1280*960];
acl::geometric::resize<float>(
    fSrc, fDst, 640, 480, 1280, 960,
    0, 0, 1, 1, acl::InterpMode::CUBIC4x4);

rotate

Image rotation / flip / transpose. Supports ROT_0 / ROT_180 / ROT_CW_90 / ROT_CCW_90 / FLIP_H / FLIP_V / XPOSE.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported for ROT_180 / FLIP_H / FLIP_V (srcImage == dstImage)
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T (CPP) | uint8_t, uint16_t, float | |
| T (NEON) | uint8_t only | |

Channel / type support matrix

| Entry point | Channels | Types |
| --- | --- | --- |
| CPP rotate<T> | any (packed via runtime blockW / blockH) | any T (uint8_t / uint16_t / float / …) |
| NEON rotate<T> | 1ch only | uint8_t only |
| NEON rotateNV | NV21 / NV12 (Y + UV) | uint8_t |
| NEON rotateYV12 | YV12 / I420 (Y + U + V) | uint8_t |
| NEON rotateYUV444 | YUV444 (3ch interleaved) | uint8_t |

Recommended approach for RGB / RGBA rotation:

  • NEON-accelerated path: there is no NEON rotate entry for RGB / RGBA. Pure rotation is essentially memory movement, and the CPP blockW=3/4 path already uses memcpy plus inlined reads/writes, so its RGB / RGBA performance is comparable to a NEON implementation.
  • Multi-channel images: use CPP rotate<uint8_t> with blockW=3 (RGB) or blockW=4 (RGBA); see the example below.

CPP Version: acl::geometric::rotate

cpp
template<class T>
int rotate(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90,
    int blockW = 1, int blockH = 1);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| srcWidth, srcHeight | int | Source image size, counted in pixel blocks (not pixels) | must be divisible by blockW/H |
| dstWidth, dstHeight | int | Destination size, same as above | same as above |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| ori | acl::RotateOrient | Rotation direction | ROT_CW_90 |
| blockW, blockH | int | Pixel-block size (unit). 1,1 handles single-channel scalar; blockW=3 packs one RGB pixel as an indivisible unit; blockW=4 packs RGBA. | 1, 1 |

Note that srcWidth / dstWidth are given in block count: a grayscale image 1920 wide → srcWidth=1920; RGB 1920 wide → srcWidth=1920 (block count is still 1920, not 5760) with blockW=3.


NEON Version: acl::neon::geometric::rotate (uint8_t only, 1ch)

cpp
template<class T>
int rotate(
    const T* srcImage, T* dstImage,
    int srcW, int srcH,
    int srcStride = 0, int dstStride = 0,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90);

Signature differs slightly: no dstWidth/dstHeight; the output size is derived from ori (CW/CCW/XPOSE → swap width and height, otherwise → unchanged).


Example

cpp
// 1) 1ch u8 goes through NEON
uint8_t srcImage[1920*1080], dstImage[1080*1920];
acl::neon::geometric::rotate<uint8_t>(
    srcImage, dstImage, 1920, 1080,
    /*srcStride=*/0, /*dstStride=*/0,
    acl::RotateOrient::ROT_CW_90);

// 2) RGB (3ch) goes through CPP; blockW=3 packs the pixel
uint8_t rgbSrc[1920*1080*3], rgbDst[1080*1920*3];
acl::geometric::rotate<uint8_t>(
    rgbSrc, rgbDst,
    /*srcWidth=*/1920, /*srcHeight=*/1080,   // block count, not 1920*3
    /*dstWidth=*/1080, /*dstHeight=*/1920,
    /*srcStride=*/1920*3, /*dstStride=*/1080*3,
    acl::RotateOrient::ROT_CW_90,
    /*blockW=*/3, /*blockH=*/1);

// 3) float 1ch 180° in-place
float img[512*512];
acl::geometric::rotate<float>(
    img, img, 512, 512, 512, 512,
    0, 0,
    acl::RotateOrient::ROT_180);

// 4) YUV NV21 rotation uses the dedicated NEON entry (see the "YUV rotate" section)
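
The block semantics can be illustrated with a plain reference implementation of ROT_CW_90: each block (1 byte for grayscale, 3 for RGB) is moved as an indivisible unit, and width and height swap. rotateCw90Blocks is a hypothetical helper for illustration only; it assumes tightly packed rows and is not how ACL Pack is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reference clockwise 90-degree rotation over packed pixel blocks.
// srcW / srcH are counted in blocks; blockBytes is the size of one block.
std::vector<std::uint8_t> rotateCw90Blocks(const std::vector<std::uint8_t>& src,
                                           int srcW, int srcH, int blockBytes) {
    assert(srcW > 0 && srcH > 0 && blockBytes > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH * blockBytes);
    const int dstW = srcH;  // dimensions swap for a 90-degree turn
    const int dstH = srcW;
    std::vector<std::uint8_t> dst(src.size());
    for (int r = 0; r < dstH; ++r) {
        for (int c = 0; c < dstW; ++c) {
            // Destination (row r, col c) reads source (row srcH-1-c, col r).
            const std::size_t s =
                (static_cast<std::size_t>(srcH - 1 - c) * srcW + r) * blockBytes;
            const std::size_t d =
                (static_cast<std::size_t>(r) * dstW + c) * blockBytes;
            std::memcpy(&dst[d], &src[s], static_cast<std::size_t>(blockBytes));
        }
    }
    return dst;
}
```

A 3×2 grayscale image {1,2,3 / 4,5,6} becomes the 2×3 image {4,1 / 5,2 / 6,3}; with blockBytes = 3 the three bytes of each RGB pixel stay together.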

pyrDown

2× downsample (Gaussian pyramid, going down): first 5×5 Gaussian smoothing, then 2×2 downsampling.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t, uint16_t, float | |

CPP Version: acl::geometric::pyrDown

cpp
template<class T>
int pyrDown(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int cn,
    int srcStride = 0, int dstStride = 0,
    int dstWidth = 0, int dstHeight = 0,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| srcImage, dstImage | const T* / T* | Input / output | non-null |
| srcWidth, srcHeight | int | Source image size | ≥ 2 |
| cn | int | Channel count | 1 or 3 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| dstWidth, dstHeight | int | Destination size ((src+1)/2 when 0) | OpenCV-compatible: abs(dstW*2 - srcW) ≤ 2 |
| constant | const T* | BORDER_CONSTANT fill value | nullptr |
| bt | acl::BorderType | Border-handling mode | BORDER_REFLECT_101 |

NEON Version: acl::neon::geometric::pyrDown (uint8_t only)

cpp
template<class T>
int pyrDown(...)   // signature is completely identical to the CPP version

The type is fixed to uint8_t; parameter semantics match the CPP version.


pyrUp

2× upsample (Gaussian pyramid, going up): interpolated upscaling followed by 5×5 Gaussian smoothing.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T (CPP) | uint8_t | only uint8_t is currently supported |
| T (NEON) | uint8_t only | |

CPP / NEON Signature (identical)

cpp
template<class T>
int pyrUp(
    const T* srcImage, T* dstImage,
    int srcWidth, int srcHeight, int cn,
    int srcStride = 0, int dstStride = 0,
    int dstWidth = 0, int dstHeight = 0,
    const T* constant = nullptr,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);

Parameter semantics are the same as pyrDown; default output is srcWidth*2 × srcHeight*2.


buildPyramid

Builds a multi-level Gaussian pyramid by applying pyrDown repeatedly: pyramid[0] = pyrDown(srcImage), pyramid[1] = pyrDown(pyramid[0]), and so on. The source image itself is not stored in pyramid.

Tier: Starter+
Channels: 1ch / 3ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T (CPP) | uint8_t | only uint8_t is currently supported |
| T (NEON) | uint8_t only | |

CPP / NEON Signature (identical)

cpp
template<class T>
int buildPyramid(
    const T* srcImage,
    T** pyramid,
    int* widths, int* heights,
    int srcWidth, int srcHeight,
    int cn, int numLevels,
    int srcStride = 0,
    acl::BorderType bt = acl::BorderType::BORDER_REFLECT_101);
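
| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| srcImage | const T* | Input image (full-resolution source) | non-null |
| pyramid | T** | Output pointer array (numLevels entries, each buffer pre-allocated by the caller) | non-null |
| widths, heights | int* | Output sizes at each level (numLevels entries each) | non-null |
| srcWidth, srcHeight | int | Source image size | ≥ 2 |
| cn | int | Channel count | 1 or 3 |
| numLevels | int | Number of levels | ≥ 1 |
| srcStride | int | Source bytes per row | 0 = auto |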

Example

cpp
uint8_t srcImage[1920*1080];
uint8_t l0[960*540], l1[480*270], l2[240*135];
uint8_t* pyr[3] = { l0, l1, l2 };
int widths[3], heights[3];

acl::neon::geometric::buildPyramid<uint8_t>(
    srcImage, pyr, widths, heights, 1920, 1080, 1, 3);
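
Because the caller pre-allocates every level, it helps to compute the level sizes up front. The sketch below assumes each level uses the default pyrDown rounding of (src + 1) / 2; pyramidLevelSizes is a hypothetical helper, not an ACL Pack function.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns (width, height) for each of numLevels successive 2x reductions,
// assuming the default (src + 1) / 2 destination size at every level.
std::vector<std::pair<int, int>> pyramidLevelSizes(int srcWidth, int srcHeight,
                                                   int numLevels) {
    assert(srcWidth >= 2 && srcHeight >= 2 && numLevels >= 1);
    std::vector<std::pair<int, int>> sizes;
    int w = srcWidth;
    int h = srcHeight;
    for (int i = 0; i < numLevels; ++i) {
        w = (w + 1) / 2;  // odd sizes round up
        h = (h + 1) / 2;
        sizes.emplace_back(w, h);
    }
    return sizes;
}
```

For a 1920×1080 source and three levels this yields 960×540, 480×270 and 240×135, matching the buffer sizes in the example above; each buffer then needs width * height * cn elements.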

YUV resize (NV21 / NV12 / YV12 / YUV444)

Native YUV resize; avoids the YUV ↔ RGB conversion. Each channel is resized independently (for NV21 / NV12 the interleaved UV plane is split, each half is resized, and the result is re-interleaved, so U and V samples never mix).

Tier: Starter+
Channels: Y plane (1ch) + UV plane (interleaved or separate)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

NEON Signature

cpp
// NV21 / NV12: full-resolution Y; interleaved UV plane subsampled 2x in each dimension (4:2:0)
int resizeNV(
    const uint8_t* srcYImage, const uint8_t* srcUVImage,
    uint8_t* dstYImage, uint8_t* dstUVImage,
    int srcW, int srcH, int dstW, int dstH,
    int srcYStride = 0, int srcUVStride = 0,
    int dstYStride = 0, int dstUVStride = 0,
    bool nv21Fmt = true);

// YV12 / I420: full-resolution Y; separate U and V planes, each subsampled 2x in each dimension (4:2:0)
int resizeYV12(
    const uint8_t* srcYImage, const uint8_t* srcUImage, const uint8_t* srcVImage,
    uint8_t* dstYImage, uint8_t* dstUImage, uint8_t* dstVImage,
    int srcW, int srcH, int dstW, int dstH,
    int srcYStride = 0, int srcUVStride = 0,
    int dstYStride = 0, int dstUVStride = 0);

// YUV444 — 3 channels interleaved, no sub-sampling
int resizeYUV444(
    const uint8_t* srcImage, uint8_t* dstImage,
    int srcW, int srcH, int dstW, int dstH,
    int srcStride = 0, int dstStride = 0);

nv21Fmt = true → NV21 (V/U order); false → NV12 (U/V order).
NV / YV12 require input sizes to be even.
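
When allocating buffers for the 4:2:0 formats, the chroma data always totals half the luma plane, whether it is stored interleaved (NV21 / NV12) or as two planes (YV12 / I420). A minimal sketch, assuming tightly packed planes with no row padding; Yuv420Sizes and yuv420PlaneSizes are hypothetical names, not ACL Pack API.

```cpp
#include <cassert>
#include <cstddef>

struct Yuv420Sizes {
    std::size_t yBytes;       // full-resolution luma plane
    std::size_t chromaBytes;  // U + V together (interleaved or planar)
};

// Byte counts for an 8-bit 4:2:0 frame; both dimensions must be even.
Yuv420Sizes yuv420PlaneSizes(int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const std::size_t y = static_cast<std::size_t>(width) * height;
    return {y, y / 2};
}
```

For YV12 / I420, each of the U and V planes takes chromaBytes / 2.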


YUV rotate (NV21 / NV12 / YV12 / YUV444)

Native YUV rotation; avoids the YUV ↔ RGB conversion.

Tier: Starter+
Channels: Y plane (1ch) + UV plane (interleaved or separate)
Inplace: supported for ROT_180 / FLIP_H / FLIP_V
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

NEON Signature

cpp
// NV21 / NV12
int rotateNV(
    const uint8_t* srcYImage, const uint8_t* srcUVImage,
    uint8_t* dstYImage, uint8_t* dstUVImage,
    int srcW, int srcH,
    int* dstW = nullptr, int* dstH = nullptr,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90,
    bool nv21Fmt = true);

// YV12 / I420
int rotateYV12(
    const uint8_t* srcYImage, const uint8_t* srcUImage, const uint8_t* srcVImage,
    uint8_t* dstYImage, uint8_t* dstUImage, uint8_t* dstVImage,
    int srcW, int srcH,
    int* dstW = nullptr, int* dstH = nullptr,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90);

// YUV444 (single-plane interleaved, 3ch)
int rotateYUV444(
    const uint8_t* srcImage, uint8_t* dstImage,
    int srcW, int srcH,
    int* dstW = nullptr, int* dstH = nullptr,
    acl::RotateOrient ori = acl::RotateOrient::ROT_CW_90);

dstW / dstH are optional output parameters — the caller passes in pointers and the function fills in the rotated destination size (for ROT_CW_90 / ROT_CCW_90 / XPOSE, this is a width/height swap). Pass nullptr if this info is not needed.


Example

cpp
// NV21 is 4:2:0, so the interleaved UV plane is half the size of the Y plane
uint8_t srcY[1920*1280], srcUV[1920*1280/2];
uint8_t dstY[1280*1920], dstUV[1280*1920/2];

// NV21 clockwise 90°
int dstW = 0, dstH = 0;
acl::neon::geometric::rotateNV(
    srcY, srcUV, dstY, dstUV, 1920, 1280,
    &dstW, &dstH,
    acl::RotateOrient::ROT_CW_90,
    /*nv21Fmt=*/true);
// dstW == 1280, dstH == 1920

Feature Detection

Namespace: acl::feature (CPP) / acl::neon::feature (NEON)

Common data structures (in the acl:: namespace):

cpp
struct KeyPoint      { int x, y; float response; };
struct KeyPointORB   { float x, y, response, scale, angle; uint8_t descriptor[32]; };
struct KeyPointExt   { float x, y, response, scale, angle; float descriptor[128]; };
struct Point2f       { float x, y; };
struct DMatch        { int queryIdx, trainIdx; float distance; };
struct Vec2f         { float val[2]; };    // (rho, theta)
struct Vec3f         { float val[3]; };    // (cx, cy, radius)
struct Vec4i         { int val[4]; };      // (x1, y1, x2, y2)

FAST Corner Detection

FAST corner detection (Bresenham 16-pixel circle comparison).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int fastCornerDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int threshold = 20,
    bool nonmaxSuppress = true,
    int type = 9);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| srcImage | const uint8_t* | Input grayscale image | non-null |
| srcStride | int | Bytes per row | 0 = width |
| keypoints | vector<KeyPoint>& | Output corners | |
| threshold | int | Grayscale difference threshold | 20 |
| nonmaxSuppress | bool | Whether to perform NMS | true |
| type | int | FAST-N (9 or 12; N is the number of consecutive pixels) | 9 |

Harris Corner Detection

Harris corner detection. Response R = det(M) - k * trace(M)^2, where M is the structure tensor.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
// Detect corners (including NMS)
int harrisCornerDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int blockSize = 2,
    float k = 0.04f,
    float threshold = 1e6f);

// Emit the Harris response map only (caller does the threshold / NMS) — CPP only
int harrisResponse(
    const uint8_t* srcImage, float* dstImage,
    int width, int height, int srcStride,
    int blockSize = 2, float k = 0.04f);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| blockSize | int | Structure-tensor neighborhood radius | 2 |
| k | float | Harris free parameter | 0.04 |
| threshold | float | Minimum response threshold | 1e6 |
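
The response formula R = det(M) - k * trace(M)^2 can be written out for a single pixel as follows. The inputs are the three distinct entries of the structure tensor, i.e. the windowed sums of Ix², Iy² and Ix·Iy. harrisResponseAt is a hypothetical helper for illustration; gradient computation and windowing are left out, and the library's scaling of R may differ.

```cpp
#include <cassert>

// Harris response from the structure tensor M = [[sxx, sxy], [sxy, syy]].
float harrisResponseAt(float sxx, float syy, float sxy, float k = 0.04f) {
    assert(k > 0.0f);
    const float det = sxx * syy - sxy * sxy;
    const float trace = sxx + syy;
    return det - k * trace * trace;
}
```

Corners give a large positive R (both eigenvalues large), edges give a negative R (one eigenvalue dominates), and flat regions give a value near zero.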

Shi-Tomasi Corner Detection

Shi-Tomasi corners (Good Features to Track): response = min(λ1, λ2) (the structure tensor's eigenvalues).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int shiTomasiCornerDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int maxCorners = 500,
    float qualityLevel = 0.01f,
    float minDistance = 10.0f,
    int blockSize = 2);

// Alias (same semantics as shiTomasiCornerDetect)
int shiTomasiDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& corners,
    int maxCorners = 500,
    float qualityLevel = 0.01f,
    float minDistance = 10.0f,
    int blockSize = 2);

// Emit the min-eigenvalue response map only — CPP only
int minEigenValResponse(
    const uint8_t* srcImage, float* dstImage,
    int width, int height, int srcStride,
    int blockSize = 2);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| maxCorners | int | Upper bound on returned corners (0 = no limit) | 500 |
| qualityLevel | float | Minimum quality ratio relative to the strongest response | 0.01 |
| minDistance | float | Minimum Euclidean distance between adjacent corners | 10.0 |
| blockSize | int | Structure-tensor neighborhood radius | 2 |
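
For a 2×2 symmetric structure tensor, min(λ1, λ2) has a closed form, so no general eigen-solver is needed. The sketch below is a hypothetical per-pixel reference (minEigenValueAt is not an ACL Pack function); the inputs are the windowed sums of Ix², Iy² and Ix·Iy.

```cpp
#include <cassert>
#include <cmath>

// Smaller eigenvalue of M = [[sxx, sxy], [sxy, syy]].
float minEigenValueAt(float sxx, float syy, float sxy) {
    assert(sxx >= 0.0f && syy >= 0.0f);  // sums of squares cannot be negative
    const float halfTrace = 0.5f * (sxx + syy);
    const float halfDiff = 0.5f * (sxx - syy);
    return halfTrace - std::sqrt(halfDiff * halfDiff + sxy * sxy);
}
```

A corner is then kept when its response is at least qualityLevel times the strongest response in the image.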

ORB (Oriented FAST and Rotated BRIEF)

ORB detection + 256-bit rBRIEF descriptors.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
// Detect + compute descriptors
int orbDetectAndCompute(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPointORB>& keypoints,
    int maxKeypoints = 500,
    float scaleFactor = 1.2f,
    int nLevels = 8,
    int fastThreshold = 20);

// Detection only (discard descriptors, output KeyPoint)
int orbDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int maxKeypoints = 500,
    float scaleFactor = 1.2f,
    int nLevels = 8,
    int fastThreshold = 20);

// Hamming distance between two 256-bit descriptors (0-256)
int orbHammingDistance(const uint8_t desc1[32], const uint8_t desc2[32]);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| maxKeypoints | int | Target keypoint count | 500 |
| scaleFactor | float | Inter-level pyramid scale factor (> 1.0) | 1.2 |
| nLevels | int | Number of pyramid levels | 8 |
| fastThreshold | int | FAST internal threshold | 20 |

width / height must be ≥ 32.
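
The Hamming distance between two 256-bit descriptors is the number of differing bits across the 32 bytes. A portable reference sketch follows; hammingDistance256 is a hypothetical stand-in used for illustration, and the library version is presumably vectorized.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Counts differing bits between two 32-byte (256-bit) descriptors.
int hammingDistance256(const std::uint8_t a[32], const std::uint8_t b[32]) {
    assert(a != nullptr && b != nullptr);
    int distance = 0;
    for (int i = 0; i < 32; ++i) {
        // XOR leaves a 1 wherever the two bytes disagree.
        const unsigned diff = static_cast<unsigned>(a[i] ^ b[i]);
        distance += static_cast<int>(std::bitset<8>(diff).count());
    }
    return distance;
}
```

The result ranges from 0 (identical descriptors) to 256 (every bit differs).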


SIFT (Scale-Invariant Feature Transform)

SIFT scale-invariant features + 128-D descriptors.

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int siftDetectAndCompute(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPointExt>& keypoints,
    int nOctaves = 0,
    int nScalesPerOctave = 3,
    float contrastThresh = 0.04f,
    float edgeThresh = 10.0f,
    float sigma = 1.6f);

// Detection only
int siftDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    int nOctaves = 0, int nScalesPerOctave = 3,
    float contrastThresh = 0.04f, float edgeThresh = 10.0f,
    float sigma = 1.6f);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| nOctaves | int | Number of pyramid octaves (0 = auto: log2(min(w, h)) - 2) | 0 |
| nScalesPerOctave | int | Scales per octave | 3 |
| contrastThresh | float | DoG extremum contrast threshold | 0.04 |
| edgeThresh | float | Edge-response rejection threshold | 10.0 |
| sigma | float | Initial Gaussian sigma | 1.6 |

width / height must be ≥ 16.


SURF (Speeded-Up Robust Features)

SURF accelerated features. Integral image + Hessian determinant; outputs 64-D descriptors, stored in the first 64 entries of the 128-float KeyPointExt::descriptor array (the remaining entries are unused).

Tier: Business
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int surfDetectAndCompute(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPointExt>& keypoints,
    float hessianThresh = 100.0f,
    int nOctaves = 4,
    int nOctaveLayers = 3);

// Detection only
int surfDetect(
    const uint8_t* srcImage, int width, int height, int srcStride,
    std::vector<KeyPoint>& keypoints,
    float hessianThresh = 100.0f,
    int nOctaves = 4, int nOctaveLayers = 3);

width / height must be ≥ 24.


HOG (Histogram of Oriented Gradients)

HOG descriptor — histogram of gradient orientations; commonly used for pedestrian detection and classification features.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
struct HOGParams {
    int cellSize;    // default 8
    int blockSize;   // default 2 (block = blockSize × blockSize cells)
    int nbins;       // default 9
    int blockStride; // default 1 (in units of cells)
};

int computeHOG(
    const uint8_t* srcImage, int width, int height, int srcStride,
    float* descriptors, int& descriptorSize,
    const HOGParams& params = HOGParams());

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| descriptors | float* | Output descriptor array (pre-allocated by the caller) | non-null |
| descriptorSize | int& | Returns the actual number of floats written | |
| params | const HOGParams& | HOG parameters | default-constructed |

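Output size = blocksX * blocksY * (blockSize * blockSize * nbins), where blocksX and blocksY are the number of block positions along each axis. You can compute this in advance from the image size and parameters, or call once with a sufficiently large buffer and read descriptorSize.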
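
A sketch of the size calculation, assuming the conventional dense HOG layout: cells tile the image with integer division, and blocks slide over the cell grid in steps of blockStride cells. HogLayout and hogDescriptorSize are hypothetical names that mirror the documented defaults; how ACL Pack handles partial cells at the border is not specified here, so treat the result as an estimate.

```cpp
#include <cassert>

// Hypothetical mirror of the documented HOGParams defaults.
struct HogLayout {
    int cellSize = 8;
    int blockSize = 2;    // block = blockSize x blockSize cells
    int nbins = 9;
    int blockStride = 1;  // in units of cells
};

// Number of floats in the descriptor for a width x height image.
int hogDescriptorSize(int width, int height, const HogLayout& p = HogLayout()) {
    assert(width > 0 && height > 0);
    assert(p.cellSize > 0 && p.blockSize > 0 && p.nbins > 0 && p.blockStride > 0);
    const int cellsX = width / p.cellSize;
    const int cellsY = height / p.cellSize;
    if (cellsX < p.blockSize || cellsY < p.blockSize) {
        return 0;  // image too small to hold a single block
    }
    const int blocksX = (cellsX - p.blockSize) / p.blockStride + 1;
    const int blocksY = (cellsY - p.blockSize) / p.blockStride + 1;
    return blocksX * blocksY * p.blockSize * p.blockSize * p.nbins;
}
```

With the defaults, a 64×128 window has 8×16 cells and 7×15 block positions, giving 7 * 15 * 36 = 3780 floats.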


houghLines / houghLinesP

Standard Hough line detection (houghLines outputs (rho, theta)) and probabilistic Hough (houghLinesP outputs line-segment endpoints).

Tier: Pro+
Channels: 1ch binary edge map (typically produced by canny)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
// Standard Hough
int houghLines(
    const uint8_t* edgeImage, int width, int height, int stride,
    std::vector<acl::Vec2f>& lines,
    float rho,
    float theta,
    int threshold);

// Probabilistic Hough (returns line segments)
int houghLinesP(
    const uint8_t* edgeImage, int width, int height, int stride,
    std::vector<acl::Vec4i>& lines,
    float rho,
    float theta,
    int threshold,
    double minLineLength,
    double maxLineGap);

| Parameter | Type | Meaning | Recommended |
| --- | --- | --- | --- |
| edgeImage | const uint8_t* | Input binary edge image (non-zero = edge) | non-null |
| rho | float | Distance resolution (pixels) | 1.0 |
| theta | float | Angle resolution (radians) | M_PI/180 |
| threshold | int | Accumulator vote threshold | standard 100 / probabilistic 50 |
| minLineLength | double | (probabilistic only) Minimum line length | 0 |
| maxLineGap | double | (probabilistic only) Maximum gap between two points on the same line | 10 |

Output:

  • houghLines: Vec2f (rho, theta)
  • houghLinesP: Vec4i (x1, y1, x2, y2)
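
To draw a houghLines result, the (rho, theta) pair has to be turned into two endpoints. The sketch below assumes the usual normal form x·cos(theta) + y·sin(theta) = rho; Segment and houghLineToSegment are hypothetical names, not ACL Pack API.

```cpp
#include <cassert>
#include <cmath>

struct Segment {
    float x1, y1, x2, y2;
};

// Converts a line in normal form to a segment of length 2 * halfLength
// centered on the point of the line closest to the origin.
Segment houghLineToSegment(float rho, float theta, float halfLength) {
    assert(halfLength > 0.0f);
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    const float x0 = rho * c;  // foot of the perpendicular from the origin
    const float y0 = rho * s;
    // The line direction (-s, c) is perpendicular to the normal (c, s).
    return {x0 - halfLength * s, y0 + halfLength * c,
            x0 + halfLength * s, y0 - halfLength * c};
}
```

Choosing halfLength larger than the image diagonal guarantees the segment spans the whole image; clip it to the image bounds before drawing.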

houghCircles

Hough circle detection (21HT gradient method).

Tier: Pro+
Channels: 1ch grayscale image (internally performs Canny + gradient, no pre-binarization needed)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int houghCircles(
    const uint8_t* grayImage, int width, int height, int stride,
    std::vector<acl::Vec3f>& circles,
    float dp = 1.0f,
    float minDist = 20.0f,
    float param1 = 100.0f,
    float param2 = 100.0f,
    int minRadius = 0,
    int maxRadius = 0);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| grayImage | const uint8_t* | Input grayscale image | non-null |
| circles | vector<Vec3f>& | Output circles (cx, cy, radius) | |
| dp | float | Accumulator-to-image resolution ratio | 1.0 |
| minDist | float | Minimum distance between adjacent circle centers | 20.0 |
| param1 | float | Canny upper threshold (lower threshold is set automatically to param1 / 2) | 100.0 |
| param2 | float | Accumulator threshold for circle-center detection | 100.0 |
| minRadius | int | Minimum radius | 0 |
| maxRadius | int | Maximum radius (0 = max(width, height)) | 0 |

opticalFlowLK

Sparse pyramid Lucas-Kanade optical flow.

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | |

CPP / NEON Signature (identical)

cpp
int opticalFlowLK(
    const uint8_t* prevImage, const uint8_t* nextImage,
    int width, int height, int stride,
    const acl::Point2f* prevPts, acl::Point2f* nextPts,
    uint8_t* status, float* error,
    int numPoints,
    int winSize = 21,
    int maxLevel = 3,
    int maxIter = 30,
    float epsilon = 0.01f);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| prevImage, nextImage | const uint8_t* | Previous / next frames | non-null |
| prevPts | const Point2f* | Points in the previous frame to be tracked | non-null |
| nextPts | Point2f* | Tracked points in the next frame (caller pre-allocates numPoints entries) | non-null |
| status | uint8_t* | Per-point status (1 = tracking succeeded, 0 = lost; pre-allocated for numPoints) | non-null |
| error | float* | Per-point tracking error (may be nullptr; when non-null, pre-allocated for numPoints) | nullable |
| numPoints | int | Number of tracked points | |
| winSize | int | Search window size | 21 |
| maxLevel | int | Maximum pyramid level | 3 |
| maxIter | int | Max iterations per level | 30 |
| epsilon | float | Convergence threshold | 0.01 |

descriptorMatch (bfMatch / bfMatchBinary / bfKnnMatch / bfKnnMatchBinary)

Brute-force descriptor matching: bfMatch* returns 1 nearest neighbor per query; bfKnn* returns K nearest neighbors. The float version uses L2 distance; the Binary version uses Hamming distance.

Tier: Pro+
Channels: N/A (descriptor vectors)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | uint8_t | binary descriptors (e.g. ORB) |
| T | float | real-valued descriptors (e.g. SIFT/SURF) |

CPP / NEON Signature (identical)

cpp
// float descriptors, L2 distance, 1-NN
int bfMatch(
    const float* queryDescs, int queryCount, int descDim,
    const float* trainDescs, int trainCount,
    std::vector<acl::DMatch>& matches);

// Binary descriptors (e.g. ORB), Hamming distance, 1-NN
int bfMatchBinary(
    const uint8_t* queryDescs, int queryCount, int descBytes,
    const uint8_t* trainDescs, int trainCount,
    std::vector<acl::DMatch>& matches);

// float descriptors, L2, K-NN
int bfKnnMatch(
    const float* queryDescs, int queryCount, int descDim,
    const float* trainDescs, int trainCount,
    std::vector<std::vector<acl::DMatch>>& matches,
    int k = 2);

// Binary descriptors, Hamming, K-NN
int bfKnnMatchBinary(
    const uint8_t* queryDescs, int queryCount, int descBytes,
    const uint8_t* trainDescs, int trainCount,
    std::vector<std::vector<acl::DMatch>>& matches,
    int k = 2);

| Parameter | Type | Meaning |
| --- | --- | --- |
| queryDescs, trainDescs | const float* / const uint8_t* | Row-major descriptors, length Count × (descDim or descBytes) |
| descDim | int | float descriptor dimension (SIFT 128; the SURF implementation uses 64) |
| descBytes | int | Binary descriptor byte count (ORB 32) |
| matches | vector<DMatch>& or vector<vector<DMatch>>& | Output matches; each DMatch contains queryIdx, trainIdx, distance |
| k | int | K for K-NN (actually returns min(k, trainCount)) |

Example

cpp
uint8_t srcImage[1920*1080];
std::vector<acl::KeyPointORB> kps;

// 1) ORB detection + description
acl::neon::feature::orbDetectAndCompute(
    srcImage, 1920, 1080, 0, kps);

// 2) ORB matching between two images (Binary + Hamming)
std::vector<uint8_t> qDescs(kps.size() * 32), tDescs(/*...*/);
for (size_t i = 0; i < kps.size(); ++i)
    std::memcpy(qDescs.data() + i*32, kps[i].descriptor, 32);   // requires <cstring>

std::vector<acl::DMatch> matches;
acl::neon::feature::bfMatchBinary(
    qDescs.data(), (int)kps.size(), 32,
    tDescs.data(), /*trainCount=*/(int)(tDescs.size() / 32),   // one descriptor per 32 bytes
    matches);

Transform

Namespace: acl::transform (CPP) / acl::neon::transform (NEON)

Transform is split into two categories:

  • Matrix computation (compute the transform matrix from point pairs): getRotationMatrix2D / getAffineTransform / getPerspectiveTransform / findHomography
  • Applying the transform (resample the image using the matrix): warpAffine / warpPerspective / remap / yuvRemap

getRotationMatrix2D

Constructs the 2×3 affine matrix for a rotation about (cx, cy) combined with uniform scaling.

Tier: Starter+
Channels: N/A (matrix builder)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | double | |

CPP Signature

cpp
int getRotationMatrix2D(
    double cx, double cy,
    double angle, double scale,
    double* M);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| cx, cy | double | Rotation center coordinates | |
| angle | double | Rotation angle (degrees, counter-clockwise positive) | |
| scale | double | Scaling factor | |
| M | double* | Output 2×3 matrix, row-major, 6 doubles | non-null |
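
As a reference for the expected layout of M, the sketch below fills the six entries using the OpenCV-style convention (image coordinates with y pointing down, positive angle counter-clockwise). That ACL Pack follows exactly this convention is an assumption; rotationMatrix2DRef is a hypothetical helper, not the library function.

```cpp
#include <cassert>
#include <cmath>

// Fills M (row-major 2x3) for a rotation by angleDeg about (cx, cy),
// combined with uniform scaling. OpenCV-style convention assumed.
void rotationMatrix2DRef(double cx, double cy, double angleDeg, double scale,
                         double M[6]) {
    assert(M != nullptr);
    const double kPi = 3.14159265358979323846;
    const double rad = angleDeg * kPi / 180.0;
    const double a = scale * std::cos(rad);
    const double b = scale * std::sin(rad);
    M[0] = a;
    M[1] = b;
    M[2] = (1.0 - a) * cx - b * cy;  // translation that keeps (cx, cy) fixed
    M[3] = -b;
    M[4] = a;
    M[5] = b * cx + (1.0 - a) * cy;
}
```

Whatever the angle and scale, the center (cx, cy) maps to itself, which is a quick sanity check for any matrix produced this way.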

getAffineTransform

Compute a 2×3 affine transform matrix from 3 point pairs.

Tier: Starter+
Channels: N/A (matrix builder)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | double | |

CPP Signature

cpp
int getAffineTransform(
    const double* srcImage, const double* dstImage,
    double* M);

| Parameter | Type | Meaning |
| --- | --- | --- |
| srcImage | const double* | 3 source points (x0,y0, x1,y1, x2,y2), 6 doubles total |
| dstImage | const double* | 3 destination points (same format as above) |
| M | double* | Output 2×3 matrix, row-major |

getPerspectiveTransform

Compute a 3×3 perspective transform matrix from 4 point pairs.

Tier: Starter+
Channels: N/A (matrix builder)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | double | |

CPP / NEON Signature (identical)

cpp
int getPerspectiveTransform(
    const double* srcImage, const double* dstImage,
    double* M);

| Parameter | Type | Meaning |
| --- | --- | --- |
| srcImage | const double* | 4 source points, 8 doubles total |
| dstImage | const double* | 4 destination points |
| M | double* | Output 3×3 matrix, row-major, 9 doubles (normalized so that M[8] = 1) |

findHomography

Compute a 3×3 homography from N ≥ 4 point pairs; supports least-squares or RANSAC.

Tier: Pro+
Channels: N/A (point sets)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
| --- | --- | --- |
| T | double | |

CPP / NEON Signature (identical)

cpp
int findHomography(
    const double* srcPts, const double* dstPts,
    int numPts, double* H,
    int method = 0,
    double ransacThreshold = 3.0);

| Parameter | Type | Meaning | Default |
| --- | --- | --- | --- |
| srcPts | const double* | N source points (x0,y0, x1,y1, …), 2*N doubles total | non-null |
| dstPts | const double* | N destination points | non-null |
| numPts | int | Number of point pairs, must be ≥ 4 | |
| H | double* | Output 3×3 homography, row-major, normalized so that H[8] = 1 | non-null |
| method | int | 0 = least-squares DLT; 1 = RANSAC | 0 |
| ransacThreshold | double | RANSAC inlier distance threshold | 3.0 |

When numPts == 4, getPerspectiveTransform is used automatically (exact solution).


dltHomography

Low-level / algorithm-customization API. Most users should call findHomography; use this only if you need direct access to the linear DLT solver and will handle outlier rejection yourself.

Solve a 3×3 homography from numPts ≥ 4 point pairs using a single normalized Direct Linear Transform (DLT) least-squares pass — no RANSAC, no robust filtering.

Tier: Pro+
Channels: N/A (point sets)
Inplace: not supported


CPP Signature (acl::transform only)

This helper is not declared under acl::neon::transform; NEON packages expose findHomography, getPerspectiveTransform, warpAffine, and warpPerspective, but not the DLT helper.

cpp
int dltHomography(
    const double* srcPts, const double* dstPts,
    int numPts, double* H);

| Parameter | Type | Meaning |
|---|---|---|
| srcPts | const double* | N source points (x0,y0, x1,y1, …), 2*N doubles |
| dstPts | const double* | N destination points, same layout |
| numPts | int | Number of point pairs, must be ≥ 4 |
| H | double* | Output 3×3 homography, row-major, normalized H[8] = 1 |

Returns: 0 on success, non-zero on failure (degenerate point configuration, numPts < 4).


homographyError

Low-level / algorithm-customization API. Mostly useful when implementing custom outlier rejection or scoring loops on top of dltHomography.

Project a single point through a homography and return the squared Euclidean distance to its observed correspondence — i.e. the per-correspondence reprojection error used inside RANSAC scoring.

Tier: Pro+
Channels: N/A (scalar math)
Inplace: N/A


CPP Signature (acl::transform only)

This helper is not declared under acl::neon::transform; call acl::transform::homographyError from paid packages.

cpp
double homographyError(
    const double* H,
    double sx, double sy,
    double dx, double dy);

| Parameter | Type | Meaning |
|---|---|---|
| H | const double* | Row-major 3×3 homography |
| sx, sy | double | Source point coordinates |
| dx, dy | double | Observed destination point coordinates |

Returns: Squared Euclidean distance between H · (sx, sy, 1) (normalized to z=1) and (dx, dy).
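
The computation described above can be sketched in plain C++ for readers who want to cross-check results or build their own scoring loop. This is a reference sketch, not the library's code: `reprojErrorSq` and `countInliers` are hypothetical names, and the inlier test assumes the threshold is a distance, so it is squared before being compared against the squared error.

```cpp
#include <cstddef>

// Hypothetical reference helper (not an ACL Pack symbol): squared reprojection
// error of one correspondence under a row-major 3x3 homography H.
// The result is inf/NaN when the projected w component is zero.
double reprojErrorSq(const double* H, double sx, double sy, double dx, double dy) {
    const double w  = H[6] * sx + H[7] * sy + H[8];
    const double px = (H[0] * sx + H[1] * sy + H[2]) / w;  // normalize to z = 1
    const double py = (H[3] * sx + H[4] * sy + H[5]) / w;
    const double ex = px - dx;
    const double ey = py - dy;
    return ex * ex + ey * ey;
}

// Count correspondences whose reprojection distance is within `threshold`.
// Points use the interleaved layout (x0,y0, x1,y1, ...).
int countInliers(const double* H, const double* srcPts, const double* dstPts,
                 int numPts, double threshold) {
    const double t2 = threshold * threshold;  // compare squared against squared
    int inliers = 0;
    for (int i = 0; i < numPts; ++i) {
        const std::size_t k = static_cast<std::size_t>(2) * i;
        const double e = reprojErrorSq(H, srcPts[k], srcPts[k + 1],
                                       dstPts[k], dstPts[k + 1]);
        if (e <= t2) ++inliers;
    }
    return inliers;
}
```

A custom rejection loop would typically fit a candidate with dltHomography on a subset, score it with a function like `countInliers`, and keep the best-scoring candidate.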


warpAffine

Apply a 2×3 affine transform to an image. Uses inverse mapping: destination pixel (x', y') looks up source coordinates (x, y) = M * (x', y', 1)^T.

Tier: Starter+
Channels: 1ch / 3ch / 4ch (via runtime cn parameter)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| IMG_T | {uint8_t, uint16_t, float} | |
| MAT_T | {float, double} | |

CPP / NEON Signature (identical)

cpp
template<class IMG_T, class MAT_T = double>
int warpAffine(
    const IMG_T* srcImage, IMG_T* dstImage,
    const MAT_T* M,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D,
    int cn = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| srcImage, dstImage | const IMG_T* / IMG_T* | Input / output image | non-null |
| M | const MAT_T* | 2×3 affine matrix, row-major | non-null |
| srcWidth, srcHeight | int | Source image size | > 0 |
| dstWidth, dstHeight | int | Destination image size | > 0 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |
| interpMode | acl::InterpMode | NEAREST / LINEAR2D | LINEAR2D |
| cn | int | Channel count (1 / 3 / 4) | 1 |
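
Because warpAffine treats M as a destination-to-source mapping, a matrix that describes the forward (source-to-destination) motion would have to be inverted before it is passed in. The sketch below shows one way to invert a row-major 2×3 affine matrix. `invertAffine2x3` is a hypothetical helper, not an ACL Pack symbol, and the singularity tolerance is an arbitrary choice.

```cpp
#include <cmath>

// Hypothetical helper (not an ACL Pack symbol): invert a row-major 2x3 affine
// matrix [a b tx; c d ty]. Returns false when the 2x2 linear part is singular.
bool invertAffine2x3(const double* fwd, double* inv) {
    const double a = fwd[0], b = fwd[1], tx = fwd[2];
    const double c = fwd[3], d = fwd[4], ty = fwd[5];
    const double det = a * d - b * c;
    if (std::fabs(det) < 1e-12) return false;  // tolerance is an assumption
    const double id = 1.0 / det;
    // Inverse of the linear part.
    inv[0] =  d * id;  inv[1] = -b * id;
    inv[3] = -c * id;  inv[4] =  a * id;
    // Inverse translation: -(A^-1 * t).
    inv[2] = -(inv[0] * tx + inv[1] * ty);
    inv[5] = -(inv[3] * tx + inv[4] * ty);
    return true;
}
```

This section does not state whether matrix builders such as getRotationMatrix2D already return the destination-to-source form, so it is worth checking the output on a small test image before adding an inversion step.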

warpPerspective

Apply a 3×3 perspective transform to an image. Inverse mapping; homogeneous coordinates are divided by w.

Tier: Starter+
Channels: 1ch / 3ch / 4ch (via runtime cn parameter)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| IMG_T | {uint8_t, uint16_t, float} | |
| MAT_T | {float, double} | |

CPP / NEON Signature (identical)

cpp
template<class IMG_T, class MAT_T = double>
int warpPerspective(
    const IMG_T* srcImage, IMG_T* dstImage,
    const MAT_T* M,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D,
    int cn = 1);

Runtime parameters are the same as warpAffine; M is a 3×3 matrix (9 MAT_T).


Example (warpAffine + warpPerspective chain)

cpp
#include <cstdint>
#include <vector>

// Heap-allocated 1ch buffers; 2 MB stack arrays can overflow small thread stacks.
std::vector<uint8_t> srcImage(1920 * 1080), rotated(1920 * 1080), warped(1920 * 1080);

// 1) Build a 30° rotation matrix around the image center
double M_affine[6];
acl::transform::getRotationMatrix2D(960.0, 540.0, 30.0, 1.0, M_affine);

// 2) Apply affine (NEON-accelerated, LINEAR2D): srcImage → rotated
acl::neon::transform::warpAffine<uint8_t, double>(
    srcImage.data(), rotated.data(), M_affine, 1920, 1080, 1920, 1080,
    0, 0, acl::InterpMode::LINEAR2D, 1);

// 3) 4-point perspective matrix
double srcPts[8] = { 0,0,  1920,0,  1920,1080,  0,1080 };
double dstPts[8] = { 100,50,  1820,0,  1920,1080,  50,1050 };
double H[9];
acl::transform::getPerspectiveTransform(srcPts, dstPts, H);

// 4) Chain: apply the perspective warp to the affine result (rotated → warped).
//    Separate buffers are used because neither warp supports inplace operation.
acl::neon::transform::warpPerspective<uint8_t, double>(
    rotated.data(), warped.data(), H, 1920, 1080, 1920, 1080,
    0, 0, acl::InterpMode::LINEAR2D, 1);

remap

Generic pixel remap: dst(x, y) = src(mapX(x, y), mapY(x, y)). Can be used to implement fisheye correction, arbitrary distortion correction, etc.

Tier: Starter+
Channels: 1ch / 3ch / 4ch (via runtime cn parameter)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| IMG_T | {uint8_t, uint16_t, float} | |
| MAP_T | {float, double} | |

CPP Signature

cpp
template<class IMG_T, class MAP_T>
int remap(
    const IMG_T* srcImage, IMG_T* dstImage,
    const MAP_T* mapX, const MAP_T* mapY,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0,
    int mapStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D,
    int cn = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| mapX, mapY | const MAP_T* | Source coordinate maps, each of size dstWidth × dstHeight (together giving one (x, y) pair per destination pixel) | non-null |
| mapStride | int | Map bytes per row (shared by mapX and mapY) | 0 = auto |
| interpMode | acl::InterpMode | Interpolation mode | LINEAR2D |
| cn | int | Channel count | 1 |
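
To make the map layout concrete, here is a minimal sketch that fills mapX / mapY for a horizontal flip, so that dst(x, y) = src(width - 1 - x, y). It assumes tightly packed, row-major maps (the mapStride = 0 case); `buildFlipMaps` is a hypothetical helper, not part of the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ACL Pack symbol): build remap coordinate maps
// for a horizontal flip. Maps are assumed row-major and tightly packed, with
// one entry per destination pixel.
void buildFlipMaps(int width, int height,
                   std::vector<float>& mapX, std::vector<float>& mapY) {
    mapX.resize(static_cast<std::size_t>(width) * height);
    mapY.resize(mapX.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            mapX[i] = static_cast<float>(width - 1 - x);  // source column
            mapY[i] = static_cast<float>(y);              // source row
        }
    }
}
```

The resulting `mapX.data()` and `mapY.data()` would then be passed as the mapX and mapY arguments with MAP_T = float.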

yuvRemap

Remap an NV21 / NV12 image directly while preserving the Y / UV plane structure (equivalent to remap followed by restoration of the YUV sampling relationship).

Tier: Business
Channels: NV21 / NV12 (Y plane + interleaved UV plane)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| IMG_T | uint8_t | |
| MAP_T | {float, double} | |

CPP Signature

cpp
template<class IMG_T, class MAP_T>
int yuvRemap(
    const IMG_T* srcYImage, const IMG_T* srcUVImage,
    IMG_T* dstYImage, IMG_T* dstUVImage,
    const MAP_T* mapX, const MAP_T* mapY,
    int srcWidth, int srcHeight,
    int dstWidth, int dstHeight,
    int srcStride = 0, int dstStride = 0, int mapStride = 0,
    acl::InterpMode interpMode = acl::InterpMode::LINEAR2D);

| Parameter | Type | Meaning |
|---|---|---|
| srcYImage, srcUVImage | const IMG_T* | Source NV21 / NV12 planes |
| dstYImage, dstUVImage | IMG_T* | Destination NV21 / NV12 planes |
| mapX, mapY | const MAP_T* | Source coordinate map based on the Y plane size (UV is automatically sampled at half resolution) |

Math

Namespace: acl::neon::math

Discrete Fourier Transform (DFT / IDFT / complex spectrum multiplication). matchTemplate also uses this API set internally.

Tier: Pro+
NEON only (there is no corresponding standalone CPP API)

DftFlags (see acl::DftFlags):

| Flag | Value | Meaning |
|---|---|---|
| DFT_FORWARD | 0 | Forward transform (default) |
| DFT_INVERSE | 1 | Inverse transform |
| DFT_SCALE | 2 | Divide the result by N for normalization |

Flags can be OR-combined, e.g. DFT_INVERSE | DFT_SCALE.


dft1d

1D complex → complex DFT.

Tier: Pro+
Channels: N/A (1-D signal)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | float | |

cpp
int dft1d(
    const float* srcRe, const float* srcIm,
    float* dstRe, float* dstIm,
    int n,
    int flags = acl::DFT_FORWARD);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| srcRe | const float* | Input real part (n elements) | non-null |
| srcIm | const float* | Input imaginary part (n elements; may be nullptr for real-valued input) | nullable |
| dstRe, dstIm | float* | Output real / imaginary parts (n elements each) | non-null |
| n | int | Transform length | > 0 |
| flags | int | DFT_FORWARD / DFT_INVERSE, optionally OR'd with DFT_SCALE | DFT_FORWARD |
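
When validating output on small inputs, a naive O(n²) transform is a convenient yardstick. The sketch below mirrors the split real / imaginary layout and the flag semantics from the DftFlags table. It is a reference sketch only: the names are local stand-ins, and the sign convention (forward uses exp(-2πi·k·t/n)) is the common textbook one, assumed here rather than confirmed for the library.

```cpp
#include <cmath>

// Flag values copied from the DftFlags table; the names are local stand-ins.
constexpr int kForward = 0, kInverse = 1, kScale = 2;

// Naive O(n^2) reference DFT on split real/imaginary arrays (hypothetical
// helper, not an ACL Pack symbol). srcIm may be nullptr for real input.
void naiveDft1d(const float* srcRe, const float* srcIm,
                float* dstRe, float* dstIm, int n, int flags) {
    const double pi   = 3.14159265358979323846;
    const double sign = (flags & kInverse) ? 1.0 : -1.0;  // assumed convention
    const double norm = (flags & kScale) ? 1.0 / n : 1.0; // divide by N if set
    for (int k = 0; k < n; ++k) {
        double accRe = 0.0, accIm = 0.0;
        for (int t = 0; t < n; ++t) {
            const double ang = sign * 2.0 * pi * k * t / n;
            const double c = std::cos(ang), s = std::sin(ang);
            const double re = srcRe[t];
            const double im = srcIm ? srcIm[t] : 0.0;
            accRe += re * c - im * s;   // real part of (re + i*im) * e^{i*ang}
            accIm += re * s + im * c;   // imaginary part
        }
        dstRe[k] = static_cast<float>(accRe * norm);
        dstIm[k] = static_cast<float>(accIm * norm);
    }
}
```

A forward pass followed by an inverse pass with `kInverse | kScale` should reproduce the input up to floating-point rounding.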

dftReal1d

1D real → complex forward DFT (output is a half-spectrum, n/2 + 1 complex coefficients, exploiting conjugate symmetry).

Tier: Pro+
Channels: N/A (1-D signal)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | float | |

cpp
int dftReal1d(
    const float* srcImage,
    float* dstRe, float* dstIm,
    int n);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| srcImage | const float* | Input real-valued array (n elements) | non-null |
| dstRe, dstIm | float* | Output real / imaginary parts (n/2 + 1 elements each) | non-null |
| n | int | Input length | even and a power of 2 |
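
The half-spectrum is sufficient because a real-valued input satisfies X[n - k] = conj(X[k]). If downstream code expects all n bins, the missing ones can be rebuilt as sketched below. `expandHalfSpectrum` is a hypothetical helper, not an ACL Pack symbol.

```cpp
#include <vector>

// Hypothetical helper: rebuild the full n-point spectrum from the n/2 + 1
// bins of a half-spectrum using X[n - k] = conj(X[k]), which holds for
// real-valued input.
void expandHalfSpectrum(const float* halfRe, const float* halfIm, int n,
                        std::vector<float>& fullRe, std::vector<float>& fullIm) {
    fullRe.assign(n, 0.0f);
    fullIm.assign(n, 0.0f);
    for (int k = 0; k <= n / 2; ++k) {       // bins present in the input
        fullRe[k] = halfRe[k];
        fullIm[k] = halfIm[k];
    }
    for (int k = n / 2 + 1; k < n; ++k) {    // mirrored, conjugated bins
        fullRe[k] =  halfRe[n - k];
        fullIm[k] = -halfIm[n - k];
    }
}
```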

idftReal1d

1D complex (CCS half-spectrum) → real inverse DFT. Symmetric to dftReal1d: input is n/2 + 1 complex coefficients, output is n real values.

Tier: Pro+
Channels: N/A (1-D signal)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | float | |

cpp
int idftReal1d(
    const float* srcRe, const float* srcIm,
    float* dstImage,
    int n);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| srcRe, srcIm | const float* | Input real / imaginary parts (n/2 + 1 elements each, CCS format) | non-null |
| dstImage | float* | Output real array (n elements) | non-null |
| n | int | Output length | even and a power of 2 |

dft2d

2D complex → complex DFT (row-wise + column-wise, two 1D FFTs).

Tier: Pro+
Channels: 1ch
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | float | |

cpp
int dft2d(
    const float* srcRe, const float* srcIm,
    float* dstRe, float* dstIm,
    int width, int height,
    int flags = acl::DFT_FORWARD);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| srcRe | const float* | Input real part (width * height, row-major) | non-null |
| srcIm | const float* | Input imaginary part (same; may be nullptr for real input) | nullable |
| dstRe, dstIm | float* | Output real / imaginary parts (same size) | non-null |
| width, height | int | Columns / rows | > 0 |
| flags | int | Same as dft1d | DFT_FORWARD |

mulSpectrums

Per-element complex multiplication: C = A * B or C = A * conj(B). Commonly used for frequency-domain cross-correlation / convolution.

Tier: Pro+
Channels: N/A (complex spectra)
Inplace: supported (aRe / aIm may equal cRe / cIm)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | float | |

cpp
int mulSpectrums(
    const float* aRe, const float* aIm,
    const float* bRe, const float* bIm,
    float* cRe, float* cIm,
    int n,
    bool conjB = false);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| aRe, aIm | const float* | Real / imaginary parts of complex array A | non-null |
| bRe, bIm | const float* | Real / imaginary parts of complex array B | non-null |
| cRe, cIm | float* | Real / imaginary parts of the output product C | non-null |
| n | int | Number of complex elements | > 0 |
| conjB | bool | true = take the conjugate of B before multiplying | false |
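
For reference, the per-element arithmetic can be written out as below. This is a scalar sketch of the formula, not the NEON implementation, and `mulSpectrumsRef` is a hypothetical name. Each element is read into locals before the result is written, so the output may alias either input.

```cpp
// Scalar reference for C = A * B or C = A * conj(B) on split arrays
// (hypothetical helper, not an ACL Pack symbol).
void mulSpectrumsRef(const float* aRe, const float* aIm,
                     const float* bRe, const float* bIm,
                     float* cRe, float* cIm, int n, bool conjB) {
    for (int i = 0; i < n; ++i) {
        // Read all operands first so that c may alias a or b.
        const float ar = aRe[i], ai = aIm[i];
        const float br = bRe[i];
        const float bi = conjB ? -bIm[i] : bIm[i];
        cRe[i] = ar * br - ai * bi;   // real part of (ar + i*ai)(br + i*bi)
        cIm[i] = ar * bi + ai * br;   // imaginary part
    }
}
```

With conjB = true the product corresponds to cross-correlation in the spatial domain; with conjB = false it corresponds to convolution.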

Example

cpp
int n = 1024;   // input length: even and a power of 2
std::vector<float> srcImage(n, 0.0f);
std::vector<float> re(n / 2 + 1), im(n / 2 + 1);   // half-spectrum: n/2 + 1 coefficients
std::vector<float> back(n);

// 1) real → half-spectrum
acl::neon::math::dftReal1d(srcImage.data(), re.data(), im.data(), n);

// 2) Frequency-domain processing (example: pass-through)

// 3) half-spectrum → real restoration
acl::neon::math::idftReal1d(re.data(), im.data(), back.data(), n);

Drawing

Namespace: acl::draw

Draw lines / rectangles / circles / text on an image. All entry points are non-templated, uint8_t only, supporting 1ch / 3ch / 4ch.

Tier: Starter+
Channels: 1 / 3 / 4
Inplace: drawn directly on the source image (img is both input and output)


drawLine

Draw a line segment using the Bresenham algorithm.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

cpp
int drawLine(
    uint8_t* img,
    int width, int height, int cn, int stride,
    int x0, int y0, int x1, int y1,
    const uint8_t* color,
    int thickness = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| img | uint8_t* | Destination image (drawn in place) | non-null |
| width, height | int | Image size | > 0 |
| cn | int | Channel count | 1 / 3 / 4 |
| stride | int | Bytes per row | 0 = width * cn |
| x0, y0, x1, y1 | int | Line segment start / end points | |
| color | const uint8_t* | Color-value array of length cn | non-null |
| thickness | int | Line width (pixels) | 1 |
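
For readers unfamiliar with the algorithm, here is a minimal integer-only Bresenham sketch for a single-channel image with a line width of 1. It is illustrative only: `bresenhamLine` is a hypothetical name, and the per-pixel bounds check is just one simple way to clip. How drawLine itself handles thickness and clipping is not described in this section.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch (not an ACL Pack symbol): Bresenham line on a 1ch
// uint8_t image, thickness 1. Pixels outside the image are skipped.
void bresenhamLine(std::uint8_t* img, int width, int height, int stride,
                   int x0, int y0, int x1, int y1, std::uint8_t value) {
    const int dx =  std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    const int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;                       // combined error term
    for (;;) {
        if (x0 >= 0 && x0 < width && y0 >= 0 && y0 < height)
            img[y0 * stride + x0] = value;
        if (x0 == x1 && y0 == y1) break;     // reached the end point
        const int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }   // step in x
        if (e2 <= dx) { err += dx; y0 += sy; }   // step in y
    }
}
```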

drawRect

Draw a rectangle (outline or filled).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

cpp
int drawRect(
    uint8_t* img,
    int imgW, int imgH, int cn, int stride,
    int x, int y, int w, int h,
    const uint8_t* color,
    int thickness = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| x, y | int | Rectangle top-left | |
| w, h | int | Rectangle width / height | |
| thickness | int | Line width; -1 = filled rectangle | 1 |

Other parameters are the same as drawLine.


drawCircle

Draw a circle using the midpoint circle algorithm (outline or filled).

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

cpp
int drawCircle(
    uint8_t* img,
    int width, int height, int cn, int stride,
    int cx, int cy, int radius,
    const uint8_t* color,
    int thickness = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| cx, cy | int | Circle center | |
| radius | int | Radius | ≥ 0 |
| thickness | int | Line width; -1 = filled circle | 1 |

putText

Render ASCII text using the built-in 8×16 bitmap font. Supports newlines \n; the font size scales linearly via scale.

Tier: Starter+
Channels: 1ch / 3ch / 4ch
Inplace: supported (writes onto the provided image buffer)
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| T | uint8_t | |

cpp
int putText(
    uint8_t* img,
    int width, int height, int cn, int stride,
    const char* text,
    int x, int y,
    const uint8_t* color,
    int scale = 1);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| text | const char* | \0-terminated ASCII string | non-null |
| x, y | int | Text top-left coordinates | |
| scale | int | Font-size multiplier (1 = 8×16, 2 = 16×32 …) | 1 |
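
Since the font is a fixed 8×16 grid, the pixel footprint of a string can be estimated before drawing, for example to place a background rectangle behind it. The sketch below assumes each glyph advances 8 × scale pixels and each line 16 × scale pixels with no extra spacing. That spacing is an assumption, and `measureText` / `TextExtent` are hypothetical names rather than library symbols.

```cpp
// Hypothetical helper (not an ACL Pack symbol): estimated pixel extent of a
// multi-line ASCII string rendered with an 8x16 bitmap font.
struct TextExtent { int width; int height; };

TextExtent measureText(const char* text, int scale) {
    int cols = 0, maxCols = 0, lines = 1;
    for (const char* p = text; *p != '\0'; ++p) {
        if (*p == '\n') {            // newline starts a new row of glyphs
            ++lines;
            cols = 0;
        } else {
            ++cols;
            if (cols > maxCols) maxCols = cols;
        }
    }
    // Assumed advance: 8*scale px per glyph, 16*scale px per line.
    return { maxCols * 8 * scale, lines * 16 * scale };
}
```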

Example

cpp
// Heap-allocated, zero-initialized 3ch buffer (about 6 MB); an array this
// large can overflow the stack. Requires <vector>.
std::vector<uint8_t> img(1920 * 1080 * 3, 0);
uint8_t red[3]  = { 255, 0, 0 };
uint8_t blue[3] = { 0, 0, 255 };

acl::draw::drawLine(img.data(), 1920, 1080, 3, 0, 100, 100, 500, 500, red, 2);
acl::draw::drawRect(img.data(), 1920, 1080, 3, 0, 200, 200, 300, 150, blue, -1);   // -1 = filled
acl::draw::drawCircle(img.data(), 1920, 1080, 3, 0, 960, 540, 80, red, 3);
acl::draw::putText(img.data(), 1920, 1080, 3, 0, "Hello\nWorld", 50, 50, blue, 2);

Contour Analysis

Namespace: acl::contour

Contour geometry analysis. Input is std::vector<acl::Point2i> (typically produced by findContours).

Tier: Pro+
NEON: none (geometric computation, pure CPP)

Related structs

cpp
namespace acl {
struct Point2i     { int x, y; };                 // input point type
struct Rect        { int x, y, width, height; };
struct Point2f     { float x, y; };
struct Size2f      { float width, height; };
struct RotatedRect { Point2f center; Size2f size; float angle; };
}  // namespace acl

contourArea

Compute polygon area via the shoelace formula.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| Input | std::vector<acl::Point2i> | |
| Output | double | |

cpp
double contourArea(
    const std::vector<acl::Point2i>& contour,
    bool oriented = false);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| contour | point array | Contour (at least 3 points, otherwise returns 0) | |
| oriented | bool | true = signed area (positive = counter-clockwise, negative = clockwise); false = absolute value | false |

The return value is the area (not an error code).
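
As a point of reference, the shoelace formula can be sketched as follows. `Pt` stands in for acl::Point2i and `shoelaceArea` is a hypothetical name. The sign produced here is positive for counter-clockwise order in a y-up coordinate system; whether the library defines orientation relative to y-up or image (y-down) axes is not stated above, so treat the sign convention as an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { int x, y; };   // stand-in for acl::Point2i

// Hypothetical reference (not an ACL Pack symbol): polygon area via the
// shoelace formula, A = 1/2 * sum(x_i * y_{i+1} - x_{i+1} * y_i).
double shoelaceArea(const std::vector<Pt>& c, bool oriented) {
    if (c.size() < 3) return 0.0;          // fewer than 3 points: no area
    long long twice = 0;                   // exact integer accumulation
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Pt& p = c[i];
        const Pt& q = c[(i + 1) % c.size()];   // wrap to close the polygon
        twice += static_cast<long long>(p.x) * q.y
               - static_cast<long long>(q.x) * p.y;
    }
    const double area = 0.5 * static_cast<double>(twice);
    return oriented ? area : std::fabs(area);
}
```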


arcLength

Contour perimeter (sum of Euclidean distances between consecutive points).

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| Input | std::vector<acl::Point2i> | |
| Output | double | |

cpp
double arcLength(
    const std::vector<acl::Point2i>& contour,
    bool closed = true);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| closed | bool | true = closed (including last point → first point); false = open | true |

The return value is the length (not an error code).
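
The effect of the closed flag can be illustrated with a small reference sketch. `Pt` stands in for acl::Point2i and `polylineLength` is a hypothetical name; handling of contours with fewer than two points is a choice made here, not documented library behavior.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { int x, y; };   // stand-in for acl::Point2i

// Hypothetical reference (not an ACL Pack symbol): sum of segment lengths,
// optionally including the closing segment from the last point to the first.
double polylineLength(const std::vector<Pt>& c, bool closed) {
    double len = 0.0;
    for (std::size_t i = 1; i < c.size(); ++i) {
        len += std::hypot(static_cast<double>(c[i].x) - c[i - 1].x,
                          static_cast<double>(c[i].y) - c[i - 1].y);
    }
    if (closed && c.size() > 1) {
        len += std::hypot(static_cast<double>(c.front().x) - c.back().x,
                          static_cast<double>(c.front().y) - c.back().y);
    }
    return len;
}
```

For the 3-4-5 triangle (0,0), (3,0), (3,4), the open length is 3 + 4 and the closed length adds the hypotenuse of 5.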


boundingRect

Compute the axis-aligned bounding rectangle of a contour.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| Input | std::vector<acl::Point2i> | |
| Output | acl::Rect | |

cpp
acl::Rect boundingRect(
    const std::vector<acl::Point2i>& contour);

Returns a Rect (not an error code). An empty contour returns Rect(0, 0, 0, 0).


convexHull

Compute the convex hull using Andrew's monotone chain algorithm.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| input/output | std::vector<acl::Point2i> | |

cpp
int convexHull(
    const std::vector<acl::Point2i>& points,
    std::vector<acl::Point2i>& hull);

The output hull is arranged in counter-clockwise order.
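
For orientation, a compact version of Andrew's monotone chain is sketched below. It is a textbook rendering rather than the library's code: `Pt` stands in for acl::Point2i, `monotoneChain` and `cross2d` are hypothetical names, collinear points are dropped, and the result is counter-clockwise in a y-up coordinate system (whether the library measures orientation in y-up or image axes is not stated here).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Pt { int x, y; };   // stand-in for acl::Point2i

// Cross product of (a - o) and (b - o); > 0 means a left turn.
static long long cross2d(const Pt& o, const Pt& a, const Pt& b) {
    return (static_cast<long long>(a.x) - o.x) * (static_cast<long long>(b.y) - o.y)
         - (static_cast<long long>(a.y) - o.y) * (static_cast<long long>(b.x) - o.x);
}

// Hypothetical reference (not an ACL Pack symbol): Andrew's monotone chain.
std::vector<Pt> monotoneChain(std::vector<Pt> pts) {
    std::sort(pts.begin(), pts.end(), [](const Pt& a, const Pt& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    pts.erase(std::unique(pts.begin(), pts.end(), [](const Pt& a, const Pt& b) {
        return a.x == b.x && a.y == b.y;
    }), pts.end());
    const std::size_t n = pts.size();
    if (n < 3) return pts;                   // nothing to prune
    std::vector<Pt> hull(2 * n);
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {    // lower hull, left to right
        while (k >= 2 && cross2d(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    const std::size_t t = k + 1;             // upper hull must keep >= t points
    for (std::size_t i = n - 1; i > 0; --i) { // upper hull, right to left
        while (k >= t && cross2d(hull[k - 2], hull[k - 1], pts[i - 1]) <= 0) --k;
        hull[k++] = pts[i - 1];
    }
    hull.resize(k - 1);                      // last point repeats the first
    return hull;
}
```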


approxPolyDP

Simplify a polygon (reduce the number of vertices) using the Douglas-Peucker algorithm.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| input/output | std::vector<acl::Point2i> | |

cpp
int approxPolyDP(
    const std::vector<acl::Point2i>& curve,
    std::vector<acl::Point2i>& approx,
    double epsilon,
    bool closed = true);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| epsilon | double | Maximum distance between the original curve and the simplified curve | |
| closed | bool | Treat as a closed contour | true |

minAreaRect

Compute the minimum-area rotated bounding rectangle using rotating calipers.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| Input | std::vector<acl::Point2i> | |
| Output | acl::RotatedRect | |

cpp
int minAreaRect(
    const std::vector<acl::Point2i>& points,
    acl::RotatedRect& result);

The input should contain at least 3 points: 0 points returns an error, and 1 or 2 points return a degenerate RotatedRect.


fitEllipse

Fit an ellipse via Direct Least Squares; outputs the ellipse's rotated-rectangle description.

Tier: Pro+
Channels: N/A (point set)
Inplace: not supported
Types:

| Template parameter | Allowed types | Constraint |
|---|---|---|
| Input | std::vector<acl::Point2i> | ≥ 5 points |
| Output | acl::RotatedRect | |

cpp
int fitEllipse(
    const std::vector<acl::Point2i>& points,
    acl::RotatedRect& result);

The input requires at least 5 points.

Utilities

cropRect

Copy a rectangular region between two buffers; supports inplace stride reduction.

Namespace: acl::memory

cpp
template<class T>
int cropRect(
    const T* srcBuffer, T* dstBuffer,
    int copyWidth, int copyHeight,
    int srcStride = 0, int dstStride = 0,
    int srcLeft = 0, int srcTop = 0,
    int dstLeft = 0, int dstTop = 0);

| Parameter | Type | Meaning | Default |
|---|---|---|---|
| copyWidth, copyHeight | int | Size of the region to copy | |
| srcLeft, srcTop | int | Starting position in the source buffer | 0 |
| dstLeft, dstTop | int | Starting position in the destination buffer | 0 |
| srcStride, dstStride | int | Bytes per row | 0 = auto |

Supports inplace stride reduction when srcBuffer == dstBuffer (compresses in place when srcStride ≥ dstStride).
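
The condition srcStride ≥ dstStride is what makes in-place compaction safe: when rows are processed top to bottom, the destination of row y ends at or before the start of source row y + 1, so no unread source row is overwritten. A byte-level sketch of the idea follows. `compactRowsInPlace` is a hypothetical helper, not the library's implementation, and it assumes rowBytes ≤ dstStride ≤ srcStride.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch (not an ACL Pack symbol): repack rows from srcStride
// to dstStride bytes inside one buffer.
// Requires rowBytes <= dstStride <= srcStride.
void compactRowsInPlace(std::uint8_t* buf, int rowBytes, int rows,
                        int srcStride, int dstStride) {
    for (int y = 0; y < rows; ++y) {
        // Top-to-bottom order: row y lands before the start of source row y+1.
        // memmove handles the overlap between a row's old and new position.
        std::memmove(buf + static_cast<std::size_t>(y) * dstStride,
                     buf + static_cast<std::size_t>(y) * srcStride,
                     static_cast<std::size_t>(rowBytes));
    }
}
```

Expanding the stride in place (dstStride > srcStride) would need the opposite row order, which is presumably why only the reducing direction is listed as supported.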

Namespace Availability Matrix

Per-category distribution across acl::neon::* and the scalar CPP path (acl::{module}::*). When an operator exists in both, the signatures are identical unless the operator's own section calls out a difference.

CategoryBoth neon:: + CPPCPP only
Analysisintegral, histogram, equalizeHist, clahe, minMaxLoc, copyMakeBorder, matchTemplate, blockAveragehistMatch, moments, count, mean, connectedComponent_8n_dfs, connectedComponentLabeling, findContours, distanceTransform, extractBlockPixels
ArithmeticaddImg, absDiff, addWeighted, alphaImgFusion, mul, threshold, adaptiveThreshold, bitwise (And/Not/Xor), lut, convertScaleAbs, inRange, normalize, phaseMagnitudelinearTransform2x2
Color ConversionRGB2Gray / RGBA2Gray, channelSwap (5 Mode tags), bayer2RGB, rgb2YUV_fixed family, yuv2RGB_fixed family, rgb2HSV / bgr2HSV, rgb2Lab / bgr2Labhsv2BGR, lab2BGR, rgb2YUV_float family, yuv2RGB_float family, bayer2RGBA, gammaTransform
FiltergaussianBlur, boxFilter, filter2D, sepFilter2D, sobel3x3, scharr, laplacian, morphology (erode/dilate), canny, medianFilter3x3, bilateralFilter, nlMeansDenoising, guidedFilter, unsharpMask, stackBlur, gaborFilteredgePreservingFilter, detailEnhance, tonemap (Linear/Reinhard/Drago), mergeMertens
Geometricresize, rotate (NEON 1ch u8 only), pyrDown, pyrUp, buildPyramid
Geometric (NEON only)resizeNV, resizeYV12, resizeYUV444, rotateNV, rotateYV12, rotateYUV444
Feature DetectionFAST, Harris (detect), Shi-Tomasi (detect) / shiTomasiDetect, ORB (detect + describe), SIFT, SURF, HOG, houghLines / houghLinesP, houghCircles, opticalFlowLK, descriptorMatch (bfMatch / bfMatchBinary and K-NN variants)harrisResponse, minEigenValResponse
TransformgetPerspectiveTransform, findHomography, warpAffine, warpPerspectivegetRotationMatrix2D, getAffineTransform, remap, yuvRemap
Math
Math (NEON only)dft1d / dft2d / dftReal1d / idftReal1d / mulSpectrums
DrawingdrawLine, drawRect, drawCircle, putText
Contour AnalysiscontourArea, arcLength, boundingRect, convexHull, approxPolyDP, minAreaRect, fitEllipse
UtilitiescropRect

On Android arm64-v8a, prefer the NEON variant whenever the operator exists under acl::neon::. Concrete speedups over OpenCV vary by operator and image size — see the Performance Whitepaper.