Skip to content

Changelog

v1.0.3 (May 2026)

API unification + ARM Linux aarch64 platform — single ship gate covering Android arm64-v8a + Linux aarch64.

API change — type-only template parameter slots

61 operators previously exposed enum, bool, or int non-type template parameters as part of their public template signature. v1.0.3 demotes them all to runtime function arguments with sensible defaults.

cpp
// v1.0.1
acl::cvtcolor::RGB2Gray<uint8_t, ColorCvtGrayMode::GRAY_MAX>(src, dst, w, h);

// v1.0.3
acl::cvtcolor::RGB2Gray<uint8_t>(src, dst, w, h,
    /*rgbStride*/ 0, /*grayStride*/ 0,
    /*cR*/ 0.299f, /*cG*/ 0.587f, /*cB*/ 0.114f,
    ColorCvtGrayMode::GRAY_MAX);
  • v1.0.3 main batch — 47 ops: RGB2Gray family, normalize, histogram, histMatch, blockAverage, sobel3x3, scharr, boxFilter, filter2D, sepFilter2D, rotate (incl. NV/YV12/YUV444 NEON variants), resize (incl. NV NEON), warpAffine, warpPerspective, remap, yuvRemap, plus the full RGB↔YUV family (24 ops bundled into the new acl::YUVConvertParams struct)
  • v1.0.3-extension — 14 ops: mul, threshold, adaptiveThreshold, matchTemplate, bayer2RGB, bayer2RGBA, connectedComponent_8n_dfs, bilateralFilter NEON, canny NEON, geometric::rotate (blockW/blockH → runtime), neon::resize SHIFT_RIGHT
  • Behavior bit-identical — algorithms unchanged, kernel template specializations unchanged, identical results for matching parameter values
  • license.dat fully backward compatible — licenses issued for v1.0.1 / v1.0.2 work directly on v1.0.3
  • Header set unchanged: acl.h, api.h, err.h, typeDef.h

New: acl::YUVConvertParams struct

The 24 RGB↔YUV operators (rgb2YV12, rgb2NV21, rgb2YUV444, yv122RGB, nv212RGB, yuv4442RGB and their _fixed / RGBA siblings) previously required 7-9 non-type template parameters. They now accept a single struct argument with defaulted fields:

cpp
struct YUVConvertParams {
    YUVEncodeStandard yuv_std = STD_BT601;
    bool yuv444_fmt = true;
    bool nv21_fmt = false;
    bool rgb_fmt = true;
    int  bit_depth = 0;        // 0 = auto: sizeof(YUV_T)*8
    int  shift_right = 0;
    int  shift_left = 0;
    bool yuv_full_range = true;
    bool rgb_full_range = true;
};

// All defaults — common BT.601 8-bit full-range pipeline
acl::cvtcolor::rgb2YV12_fixed<uint8_t, uint8_t>(rgb, yv12, w, h);

// Override individual fields
acl::cvtcolor::rgb2YV12_fixed<uint8_t, uint8_t>(rgb, yv12, w, h,
    {.yuv_std = STD_BT709, .yuv_full_range = false});

New platform: ARM Linux aarch64

v1.0.3 ships a second platform alongside Android arm64-v8a — Linux aarch64 static libraries, validated end-to-end via QEMU + license isolation tests:

PlatformToolchainValidated on
Android arm64-v8aNDK 26.3 (Clang 17)DX-4 / DX-5 / SD8550 / SD8650
Linux aarch64aarch64-linux-gnu-g++ 9.4qemu-aarch64-static 4.2
  • Both platforms share the same acl:: public API — single source of truth
  • Same licenseId ledger and per-slot key binding across platforms
  • Both platforms pass the same 5-axis stock_qa gate: zip structure / positive init (rv=0) / negative init(nullptr) (rv=-1001) / pre-init operator call (rv=-1001) / 6-operator demo PASS

Release packaging — 3 paid SKUs × 2 platforms

commercial_release/ ships 6 paid tier zips (measured sizes; Linux aarch64 is ~30% smaller than Android — GCC produces more compact NEON code):

TierAndroid arm64-v8a ziplibacl.a (Android)Linux aarch64 ziplibacl.a (Linux)
starter1.51 MB6.18 MB0.82 MB3.94 MB
pro1.80 MB7.13 MB1.13 MB4.89 MB
business1.88 MB7.52 MB1.22 MB5.29 MB

Trial ships through the GitHub release channel (see v1.0.3 GitHub release notes), not via the Taobao stock pipeline.

Tier rename effective v1.0.2: core → starter / advanced → pro / full → business. Trial naming unchanged.

Regression highlights

  • Android (4 device × 3 paid tier × 7 size): 84 paid runs, 0 mismatch vs v1.0.1 release baseline; 6,032 PNG byte-identical across cross-device pairs
  • Linux aarch64 (3 paid tier × 2 slot × 5 axis): stock_qa 6/6 PASS via qemu-aarch64-static
  • Trial smoke (4 Android devices + Linux qemu): 6/6 PASS each — init returns -1001 without license, returns 0 with valid license, demo output carries trial watermark
  • Cross-platform consistency: Android and Linux libacl.a built from one source tree, ABI distinguished only by toolchain

Migration

  • scripts/migrate_to_v1.0.3.py — automated rewrite of v1.0.1/v1.0.2-style template-arg call sites to v1.0.3 runtime-arg form. YUV family emits manual-review warnings (struct literals cannot be auto-derived)
  • docs/MIGRATION_v1.0.3.md — per-operator migration recipes and YUV struct field mapping

Internals

  • Generated api.h excludes _v102kernel internal templates (audit cleanup)
  • All tiers rebuilt with fresh per-slot keys; per-slot key binding mechanism unchanged

v1.0.2 (May 2026)

Footprint reduction + tier rename.

Tier rename

SKU names aligned with marketing tiers:

v1.0.1 namev1.0.2 name
corestarter
advancedpro
fullbusiness
trialtrial (unchanged)

license.dat already issued under the old names continues to work — the licenseId ledger maps both old and new tier strings to the same per-slot key.

Footprint reduction

  • BorderType demoted from template parameter to runtime argument — one function instantiation now covers all five border modes (BORDER_REPLICATE / BORDER_REFLECT / BORDER_REFLECT_101 / BORDER_WRAP / BORDER_CONSTANT) instead of 5× template specializations
  • Kernel layer unchanged — wrapper folds the BORDER template dimension into a runtime switch; bit-identical for matching (op, T, DT, BT) combos
  • Commercial release builds add -fno-exceptions -fno-rtti
  • ld -r post-processing dedupes weak symbols across translation units
  • llvm-strip + .eh_frame removal drops unwind metadata not needed for static linkage

Validation

  • 4 device × 4 tier × 7 size = 112 runs all bit-identical vs v1.0.1 release baseline
  • Cross-device 6 pairs × 7 size = 42 pairs bit-identical
  • IP audit: 0 public symbols lost vs v1.0.1
  • DX-4 customer-style link smoke — single-translation-unit #include <acl/api.h> link + run PASS

v1.0.1 (April 2026)

Stability release — 109-operator set hardened through multi-device regression + license delivery pivot to Taobao auto-fulfillment.

License Delivery (Session 68)

  • Removed: per-app binding (Android package name + certificate SHA-256 verification)
  • Added: licenseId (ACL-YYYY-NNNNNN) recorded against Taobao buyer ID + order number for lifetime traceability
  • New purchase flow: Taobao auto-fulfillment — order triggers delivery of pre-generated license.dat + tier-matching SDK zip via Taobao message channel; no sign-up, no signing-key collection, no email round-trip
  • Anti-redistribution: legal notice shipped with every SDK (README.md / README_CN.md in the delivery zip); each licenseId in the ledger identifies the source of any leaked copy
  • API surface: acl::init(const char* licensePath) — the public contract only requires a license path; JNIEnv* / jobject context remain as trailing parameters defaulted to nullptr for backward compatibility with v1.0.0 integrations and are no longer consulted by the license gate
  • Error code -1004 (ACL_ERR_LICENSE_MISMATCH) repurposed: License tier doesn't match library tier (e.g. core license linked against libacl_full.a), instead of package-name / certificate mismatch
  • Runtime gate: license magic word check is injected into every operator — if acl::init() never ran or failed, any operator returns -1001 immediately

Release Packaging

  • 4 standard tier zips shipped from commercial/package/:
    • acl-pack-trial-v1.0.0-android-arm64.zip (~725 KB)
    • acl-pack-core-v1.0.0-android-arm64.zip (~1.2 MB)
    • acl-pack-advanced-v1.0.0-android-arm64.zip (~1.4 MB)
    • acl-pack-full-v1.0.0-android-arm64.zip (~1.5 MB)
  • Each zip layout: lib/arm64-v8a/libacl.a + include/acl/ + demo/ + README.md (EN) + README_CN.md (CN) + LICENSE

Regression Highlights

  • 4 devices × 7 size tiers = 7,813 rows/device = 31,252 benchmark rows total, all devices bit-identical
  • Pass rate 7,421 / 7,813 = 95.0% per device (was 7,561 / 7,717 = 98.0% at S67; new NEON sub_tests added to regression matrix pull this down); 154 failures all in documented categories
  • +89 PASS vs S67 / 0 new FAIL / +96 rows across the 7 size tiers (15 new NEON sub_tests for bilateralFilter / boxFilter / filter2D / sepFilter2D / resize variants)
  • Commercial v2b regression (all 4 SoCs) — 30,753 rows/device × 4 devices = 123,012 bit-identical tests, 0 PASS flips vs release baseline and 0 flips cross-device (4 SoCs × 4 SKU tiers × 7 image sizes; ship gate)
  • Overall M-tier speedup: 1.56× over OpenCV 4.13.0 (3,217 timed rows aggregated across 4 devices)

Bug Fixes (S47 → S67)

  • S60: houghLines NMS (CPP + NEON); opticalFlowLK rewrite
  • S62: 3 crash fixes — gabor vld1_u8, YUV odd-dimension, extractPixelPerUxV ceil boundary
  • S64: Geometric YUV takes YUV_STD + range template params (shares cvtcolor infra); BT.601/709/2020 × full/limited coverage 24/24 PASS
  • S65: cvtcolor roundtripTolerance 3 → 160 + BT.601 decoder range alignment (4-device FAIL 352 → 156)
  • S66-67: houghCircles CPP + NEON rewrite (gradient-line vote + top-K NMS + NEON); coverage tuned 0.10 → 0.12 (ACL 4 circles vs OCV 5, was 12 over-detect)

Release Hardening (2026-04-27)

  • Trial zip repackagedcommercial/package/acl-pack-trial-v1.0.0-android-arm64.zip (239 KB). acl.h auto-defines ACL_BUILT_TIER_TRIAL when linking libacl_trial.a, so trial consumers need no manual macro setup. Now ships a pre-signed license.dat at the zip root so acl::init("license.dat") succeeds out-of-the-box (2-year validity from release date)
  • watermark.h shipped in trial zip (3 ship-path fixes) — trial guard now activates automatically
  • Dead-code cleanup: removed jni_verify, strcasecmp_portable, license_manager.py, stray test_license_tier.cpp, and stale build.sh echo text
  • Gate negative test: acl::init(nullptr) returns -1001 (verified on DX-5 with fresh binary)
  • Plaintext audit across 4 tier .a files: 0 ACE11CDA magic leak, 0 acl::internal leak; only mangled C++ symbols exposed (acceptable per obfuscation design)
  • push_test_all.sh SD8550 trap diagnosed: a past adb root push left /data/local/tmp/acl_test/bin/ owned by root with write blocked for uid=2000 (shell), silently freezing binaries at the old build. One-shot fix documented; script defensive-purge is a follow-up TODO

Release Chain End-to-End Validation (2026-04-28)

  • Zip-only smoke 4/4 tier on DX-4: each tier zip is unzipped, compiled against its own bundled libacl.a + acl.h via NDK clang++ (the same path a customer takes), pushed to device, and run. All 4 tiers: 5/5 PASS. Trial correctly returns -1006 for 1921×1080 (resolution cap) and applies watermark; core/advanced/full allow 2048×1536 and leave output clean
  • Trial demo scoped to 3 operators: new demo_trial.cpp ships only the trial-whitelisted operators (gaussianBlur / threshold / resize) and demonstrates the cap + watermark. Other tiers continue to ship the full demo_acl.cpp
  • Package tooling hardening: package_sdk.sh now selects tier-appropriate demo (demo_trial.cpp for trial, demo_acl.cpp for core/advanced/full)
  • Second-device commercial validation (DX-4 × 4 tiers × M size): 4/4 pairs non-timing bit-identical to release DX-4 M baseline (1130 rows, 1110 grep-PASS, 19 grep-FAIL each). Combined with SD8550's 28/28, commercial validated on 2 physical devices × 32 (tier, size) pairs = 32/32 bit-identical; the 4-device cross-device invariant (ISA determinism) carries the remaining 2 devices

Test / Infra Improvements

  • 7-tier sizing: S 640×480, M 1920×1280, L 4096×3072, E1 639×479, E2 1277×717, E3 17×17, E4 1919×1279
  • Human review complete: 2,550+ rows across S, E1, and M tiers visually verified
  • Cross-device pairwise diff = 0 on all 6 device pairs

Performance (Session 67 new Top)

Top operators on DX-5 M tier (ACL > 0.5 ms):

SpeedupOperator
27.16×nlmeans_h10_p3_s5 NEON
14.96×gammaTransform_2.2_1ch CPP
13.79×nlmeans_h5_p2_s3 NEON
13.53×bgr2Lab NEON

v1.1.0 (April 2026) — superseded by v1.0.1

P1 operator expansion + YUV renaming (integrated into v1.0.1 final release).

New Operators (10)

CategoryOperatorTierNEON
FeaturehoughLinesAdvancedYes
FeaturehoughLinesPAdvancedYes
FeaturehoughCirclesAdvancedYes
FeatureopticalFlowLKAdvancedYes
FeaturebfMatch (L2)AdvancedYes
FeaturebfMatchBinary (Hamming)AdvancedYes
FeaturebfKnnMatch (L2)AdvancedYes
FeaturebfKnnMatchBinary (Hamming)AdvancedYes
ContourminAreaRectAdvancedNo
ContourfitEllipseAdvancedNo

Changes

  • Total operators: 99 → 109 (Core 53, Advanced 93 cumulative, Full 109)
  • YUV pipeline naming: The NV21/NV12 resize+rotate+YUV→RGB pipeline is exposed as YUV_utilities / YUV_utilities_crop / YUV_utilities_float / YUV_utilities_crop_float (Enterprise Only). An earlier experimental naming scheme (nv21ResizeRotate2RGB / yuvResizeRotate*) was reverted; only the YUV_utilities* headers ship.
  • New test files: test_feature_houghLines, test_feature_houghCircles, test_feature_opticalFlowLK, test_feature_descriptorMatch, test_contour_minAreaRect

v1.0.0 (April 2026)

Initial commercial release.

Highlights

  • 89 operators across 8 categories: filter, color conversion, geometric, arithmetic, analysis, feature detection, transform, math
  • ARM NEON acceleration for all performance-critical operators
  • 83% of NEON operators faster than OpenCV on ARM64 (140 faster / 7 tie / 21 slower)
  • Peak speedups: resize area 37.5x, NL-Means denoising 27.8x, Scharr 24.1x, LUT 6.9x (v1.0.0 single-device baseline; see v1.0.1 for multi-device S67 figures)
  • Zero dependencies — static library, no OpenCV/third-party runtime required
  • ~200KB library size (vs OpenCV ~50MB)

Operators by Category

CategoryCountKey Operators
Filter18gaussianBlur, boxFilter, medianFilter, bilateralFilter, canny, sobel, morphology, guidedFilter, nlMeansDenoising
Color Conversion11RGB/YUV (float+fixed), Bayer, grayscale, gamma, channel swap
Geometric7resize (NN/bilinear/area), rotate, NV21 compound ops, pyramid
Arithmetic14threshold, adaptiveThreshold, LUT, normalize, bitwise, addWeighted, inRange
Analysis16histogram, integral, matchTemplate (6 methods), CLAHE, moments, minMaxLoc
Feature8Harris, ORB, SIFT, SURF, shiTomasiDetect, HOG
Transform8warpAffine, warpPerspective, remap, findHomography
Math5DFT/IDFT (1D/2D), mulSpectrums

Platform Support

  • Android ARM64 (arm64-v8a), NDK r26+
  • ARM Linux (aarch64) — same library, different binding

Tier System

TierOperatorsData Types
Trial2 (gaussianBlur + resize)uint8_t
Starter (Core)48uint8_t
Pro (Advanced)71uint8_t, uint16_t, int16_t
Business (Full)89uint8_t, uint16_t, int16_t, float

License System

  • RSA-2048 signed license files (PKCS#1 v1.5 + SHA-256)
  • Per-application binding (Android package name + certificate SHA-256) — replaced in v1.0.1 by Taobao licenseId traceability
  • Expiration date enforcement
  • Tier validation (license tier must match library tier)

Known Limitations

  • 24 test cases with known deviations (canny edge thresholds, Bayer demosaic edge pixels, pyramid upscale boundary, NV21 color space rounding)
  • matchTemplate NEON: slower than OpenCV for 6 of 12 variants (TM_CCORR_NORMED, TM_CCOEFF_NORMED on large templates)
  • No iOS/macOS support in v1.0 (planned for v2.0)