Changelog
v1.0.3 (May 2026)
API unification + ARM Linux aarch64 platform — single ship gate covering Android arm64-v8a + Linux aarch64.
API change — type-only template parameter slots
61 operators previously exposed enum, bool, or int non-type template parameters as part of their public template signature. v1.0.3 demotes them all to runtime function arguments with sensible defaults.
// v1.0.1
acl::cvtcolor::RGB2Gray<uint8_t, ColorCvtGrayMode::GRAY_MAX>(src, dst, w, h);
// v1.0.3
acl::cvtcolor::RGB2Gray<uint8_t>(src, dst, w, h,
/*rgbStride*/ 0, /*grayStride*/ 0,
/*cR*/ 0.299f, /*cG*/ 0.587f, /*cB*/ 0.114f,
ColorCvtGrayMode::GRAY_MAX);
- v1.0.3 main batch (47 ops): RGB2Gray family, normalize, histogram, histMatch, blockAverage, sobel3x3, scharr, boxFilter, filter2D, sepFilter2D, rotate (incl. NV/YV12/YUV444 NEON variants), resize (incl. NV NEON), warpAffine, warpPerspective, remap, yuvRemap, plus the full RGB↔YUV family (24 ops bundled into the new acl::YUVConvertParams struct)
- v1.0.3-extension (14 ops): mul, threshold, adaptiveThreshold, matchTemplate, bayer2RGB, bayer2RGBA, connectedComponent_8n_dfs, bilateralFilter NEON, canny NEON, geometric::rotate (blockW/blockH → runtime), neon::resize SHIFT_RIGHT
- Behavior bit-identical: algorithms unchanged, kernel template specializations unchanged, identical results for matching parameter values
- license.dat fully backward compatible: licenses issued for v1.0.1 / v1.0.2 work directly on v1.0.3
- Header set unchanged: acl.h, api.h, err.h, typeDef.h
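The demotion described above can be pictured as a thin wrapper that keeps only the element type as a template parameter and dispatches on a runtime argument to kernels that are still specialized at compile time. The following is a minimal sketch under that assumption, not the library's actual code: `GrayMode`, `rgbToGrayKernel`, and `rgbToGray` are hypothetical names, and the `Max` mode is assumed to take the per-pixel channel maximum.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ColorCvtGrayMode; only GRAY_MAX is named in the changelog.
enum class GrayMode { Weighted, Max };

// Kernel layer: still specialized at compile time on the mode.
template <typename T, GrayMode Mode>
void rgbToGrayKernel(const T* rgb, T* gray, std::size_t pixels,
                     float cR, float cG, float cB) {
    for (std::size_t i = 0; i < pixels; ++i) {
        const T r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        if (Mode == GrayMode::Max) {
            // Assumed meaning of the "max" mode: largest of the three channels.
            gray[i] = std::max(r, std::max(g, b));
        } else {
            // Weighted sum, rounded to nearest.
            gray[i] = static_cast<T>(cR * r + cG * g + cB * b + 0.5f);
        }
    }
}

// Public wrapper: type-only template slot, mode as a defaulted runtime argument.
template <typename T>
void rgbToGray(const T* rgb, T* gray, std::size_t pixels,
               float cR = 0.299f, float cG = 0.587f, float cB = 0.114f,
               GrayMode mode = GrayMode::Weighted) {
    switch (mode) {
        case GrayMode::Max:
            rgbToGrayKernel<T, GrayMode::Max>(rgb, gray, pixels, cR, cG, cB);
            break;
        case GrayMode::Weighted:
            rgbToGrayKernel<T, GrayMode::Weighted>(rgb, gray, pixels, cR, cG, cB);
            break;
    }
}
```

Because the switch only selects between existing specializations, results for a given mode should not change, which is consistent with the bit-identical claim in the list above.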
New: acl::YUVConvertParams struct
The 24 RGB↔YUV operators (rgb2YV12, rgb2NV21, rgb2YUV444, yv122RGB, nv212RGB, yuv4442RGB and their _fixed / RGBA siblings) previously required 7-9 non-type template parameters. They now accept a single struct argument with defaulted fields:
struct YUVConvertParams {
    YUVEncodeStandard yuv_std = STD_BT601;
    bool yuv444_fmt = true;
    bool nv21_fmt = false;
    bool rgb_fmt = true;
    int bit_depth = 0;        // 0 = auto: sizeof(YUV_T)*8
    int shift_right = 0;
    int shift_left = 0;
    bool yuv_full_range = true;
    bool rgb_full_range = true;
};
// All defaults — common BT.601 8-bit full-range pipeline
acl::cvtcolor::rgb2YV12_fixed<uint8_t, uint8_t>(rgb, yv12, w, h);
// Override individual fields
acl::cvtcolor::rgb2YV12_fixed<uint8_t, uint8_t>(rgb, yv12, w, h,
{.yuv_std = STD_BT709, .yuv_full_range = false});
New platform: ARM Linux aarch64
v1.0.3 ships a second platform alongside Android arm64-v8a — Linux aarch64 static libraries, validated end-to-end via QEMU + license isolation tests:
| Platform | Toolchain | Validated on |
|---|---|---|
| Android arm64-v8a | NDK 26.3 (Clang 17) | DX-4 / DX-5 / SD8550 / SD8650 |
| Linux aarch64 | aarch64-linux-gnu-g++ 9.4 | qemu-aarch64-static 4.2 |
- Both platforms share the same acl:: public API as a single source of truth
- Same licenseId ledger and per-slot key binding across platforms
- Both platforms pass the same 5-axis stock_qa gate: zip structure / positive init (rv=0) / negative init(nullptr) (rv=-1001) / pre-init operator call (rv=-1001) / 6-operator demo PASS
Release packaging — 3 paid SKUs × 2 platforms
commercial_release/ ships 6 paid tier zips. Sizes below are measured; the Linux aarch64 artifacts come out roughly 30-46% smaller than their Android counterparts, which is attributed to GCC producing more compact NEON code:
| Tier | Android arm64-v8a zip | libacl.a (Android) | Linux aarch64 zip | libacl.a (Linux) |
|---|---|---|---|---|
| starter | 1.51 MB | 6.18 MB | 0.82 MB | 3.94 MB |
| pro | 1.80 MB | 7.13 MB | 1.13 MB | 4.89 MB |
| business | 1.88 MB | 7.52 MB | 1.22 MB | 5.29 MB |
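To make the size comparison reproducible, the relative reduction can be computed directly from the table. This is a small arithmetic sketch; `percentSmaller`, `TierSizes`, and `kLibSizes` are illustrative names, and the numbers are the libacl.a columns copied from the table above.

```cpp
#include <cassert>

// Percentage by which the Linux artifact is smaller than the Android one.
double percentSmaller(double androidMb, double linuxMb) {
    assert(androidMb > 0.0);
    return 100.0 * (1.0 - linuxMb / androidMb);
}

// libacl.a sizes in MB, copied from the table above.
struct TierSizes {
    const char* tier;
    double androidLib;
    double linuxLib;
};

const TierSizes kLibSizes[] = {
    {"starter", 6.18, 3.94},
    {"pro", 7.13, 4.89},
    {"business", 7.52, 5.29},
};
```

Applied to the static libraries, this gives a reduction of roughly 36% for starter and roughly 30% for business.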
Trial ships through the GitHub release channel (see v1.0.3 GitHub release notes), not via the Taobao stock pipeline.
Tier rename effective v1.0.2: core → starter / advanced → pro / full → business. Trial naming unchanged.
Regression highlights
- Android (4 device × 3 paid tier × 7 size): 84 paid runs, 0 mismatch vs v1.0.1 release baseline; 6,032 PNG byte-identical across cross-device pairs
- Linux aarch64 (3 paid tier × 2 slot × 5 axis): stock_qa 6/6 PASS via qemu-aarch64-static
- Trial smoke (4 Android devices + Linux qemu): 6/6 PASS each; init returns -1001 without license, returns 0 with valid license, demo output carries trial watermark
- Cross-platform consistency: Android and Linux libacl.a built from one source tree, ABI distinguished only by toolchain
Migration
- scripts/migrate_to_v1.0.3.py: automated rewrite of v1.0.1/v1.0.2-style template-arg call sites to v1.0.3 runtime-arg form. YUV family emits manual-review warnings (struct literals cannot be auto-derived)
- docs/MIGRATION_v1.0.3.md: per-operator migration recipes and YUV struct field mapping
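When filling in YUVConvertParams by hand during a YUV migration, one detail worth checking is the bit_depth default: the struct comment states that 0 means "auto", resolved as sizeof(YUV_T)*8. The sketch below illustrates that rule only. `YUVConvertParamsSketch` mirrors a single field of the real struct, and `effectiveBitDepth` is a hypothetical helper, not a library function.

```cpp
#include <cstdint>

// Local mirror of one field from the changelog's YUVConvertParams, for illustration only.
struct YUVConvertParamsSketch {
    int bit_depth = 0;  // 0 = auto: sizeof(YUV_T) * 8
};

// Hypothetical helper: resolve the effective bit depth for a given YUV sample type.
template <typename YUV_T>
int effectiveBitDepth(const YUVConvertParamsSketch& params) {
    if (params.bit_depth != 0) {
        return params.bit_depth;  // explicit override, e.g. 10-bit data stored in uint16_t
    }
    return static_cast<int>(sizeof(YUV_T) * 8);  // auto: full width of the sample type
}
```

Under this reading, a call site that stores 10-bit samples in uint16_t would need an explicit bit_depth of 10, since the auto rule would otherwise assume 16.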
Internals
- Generated api.h excludes _v102kernel internal templates (audit cleanup)
- All tiers rebuilt with fresh per-slot keys; per-slot key binding mechanism unchanged
v1.0.2 (May 2026)
Footprint reduction + tier rename.
Tier rename
SKU names aligned with marketing tiers:
| v1.0.1 name | v1.0.2 name |
|---|---|
| core | starter |
| advanced | pro |
| full | business |
| trial | trial (unchanged) |
license.dat already issued under the old names continues to work — the licenseId ledger maps both old and new tier strings to the same per-slot key.
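The old-to-new mapping in the table can be expressed as a small normalization step, so that both spellings compare equal. This is a sketch of the idea, not the ledger's actual implementation; `canonicalTier` and `sameTier` are hypothetical names.

```cpp
#include <string>

// Map a v1.0.1 tier name to its v1.0.2 name; other strings pass through unchanged.
std::string canonicalTier(const std::string& tier) {
    if (tier == "core") return "starter";
    if (tier == "advanced") return "pro";
    if (tier == "full") return "business";
    return tier;  // "trial" and the new names are already canonical
}

// Two tier strings refer to the same SKU if they normalize to the same name.
bool sameTier(const std::string& a, const std::string& b) {
    return canonicalTier(a) == canonicalTier(b);
}
```

With this, a license issued as "full" and a library built as "business" would be treated as the same tier.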
Footprint reduction
- BorderType demoted from template parameter to runtime argument: one function instantiation now covers all five border modes (BORDER_REPLICATE / BORDER_REFLECT / BORDER_REFLECT_101 / BORDER_WRAP / BORDER_CONSTANT) instead of 5× template specializations
- Kernel layer unchanged: wrapper folds the BORDER template dimension into a runtime switch; bit-identical for matching (op, T, DT, BT) combos
- Commercial release builds add -fno-exceptions -fno-rtti
- ld -r post-processing dedupes weak symbols across translation units
- llvm-strip + .eh_frame removal drops unwind metadata not needed for static linkage
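A runtime switch over the five border modes might look like the coordinate-mapping sketch below. The changelog does not define the modes, so the semantics here are an assumption borrowed from the OpenCV conventions that share these names. The enum values and the `borderIndex` helper are illustrative, not the library's API.

```cpp
#include <cassert>

// Mode names from the changelog; the numeric values here are arbitrary.
enum BorderType {
    BORDER_REPLICATE,
    BORDER_REFLECT,
    BORDER_REFLECT_101,
    BORDER_WRAP,
    BORDER_CONSTANT
};

// Map a possibly out-of-range coordinate p onto [0, len).
// Returns -1 for BORDER_CONSTANT, meaning "use the constant fill value".
int borderIndex(int p, int len, BorderType type) {
    assert(len > 0);
    if (p >= 0 && p < len) return p;
    switch (type) {
        case BORDER_REPLICATE:  // aaaa|abcdefgh|hhhh
            return p < 0 ? 0 : len - 1;
        case BORDER_REFLECT:      // dcba|abcdefgh|hgfe
        case BORDER_REFLECT_101:  // dcb|abcdefgh|gfe (edge pixel not repeated)
        {
            if (len == 1) return 0;
            const int delta = (type == BORDER_REFLECT_101) ? 1 : 0;
            do {
                if (p < 0) p = -p - 1 + delta;
                else p = len - 1 - (p - len) - delta;
            } while (p < 0 || p >= len);
            return p;
        }
        case BORDER_WRAP: {  // efgh|abcdefgh|abcd
            const int m = p % len;
            return m < 0 ? m + len : m;
        }
        case BORDER_CONSTANT:
            return -1;
    }
    return -1;
}
```

Since the mode is an ordinary argument, a filter wrapper needs only one instantiation per element type, which is the footprint saving the list above describes.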
Validation
- 4 device × 4 tier × 7 size = 112 runs all bit-identical vs v1.0.1 release baseline
- Cross-device 6 pairs × 7 size = 42 pairs bit-identical
- IP audit: 0 public symbols lost vs v1.0.1
- DX-4 customer-style link smoke: single-translation-unit #include <acl/api.h> link + run PASS
v1.0.1 (April 2026)
Stability release — 109-operator set hardened through multi-device regression + license delivery pivot to Taobao auto-fulfillment.
License Delivery (Session 68)
- Removed: per-app binding (Android package name + certificate SHA-256 verification)
- Added: licenseId (ACL-YYYY-NNNNNN) recorded against Taobao buyer ID + order number for lifetime traceability
- New purchase flow: Taobao auto-fulfillment. An order triggers delivery of pre-generated license.dat + tier-matching SDK zip via Taobao message channel; no sign-up, no signing-key collection, no email round-trip
- Anti-redistribution: legal notice shipped with every SDK (README.md / README_CN.md in the delivery zip); each licenseId in the ledger identifies the source of any leaked copy
- API surface: acl::init(const char* licensePath). The public contract only requires a license path; JNIEnv* / jobject context remain as trailing parameters defaulted to nullptr for backward compatibility with v1.0.0 integrations and are no longer consulted by the license gate
- Error code -1004 (ACL_ERR_LICENSE_MISMATCH) repurposed: license tier doesn't match library tier (e.g. core license linked against libacl_full.a), instead of package-name / certificate mismatch
- Runtime gate: license magic word check is injected into every operator; if acl::init() never ran or failed, any operator returns -1001 immediately
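The runtime gate in the last item can be sketched as a shared flag that init sets and every operator checks before doing any work. This is an illustration only: the real library verifies a signed license and a magic word, whereas this stand-in accepts any non-empty path. Everything in the `sketch` namespace is hypothetical except the return values 0 and -1001, which come from the changelog.

```cpp
#include <cstddef>

namespace sketch {

constexpr int kOk = 0;
constexpr int kErrNotInitialized = -1001;  // value from the changelog; the constant name is made up

// Hypothetical gate state, standing in for the license "magic word".
static bool g_licensed = false;

// Stand-in for acl::init: real signature verification is omitted here.
int init(const char* licensePath) {
    g_licensed = false;
    if (licensePath == nullptr || licensePath[0] == '\0') {
        return kErrNotInitialized;
    }
    g_licensed = true;
    return kOk;
}

// Example operator: the gate check runs before any pixel is touched.
int thresholdOp(const unsigned char* src, unsigned char* dst,
                std::size_t n, unsigned char thresh) {
    if (!g_licensed) return kErrNotInitialized;
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] > thresh ? 255 : 0;
    }
    return kOk;
}

}  // namespace sketch
```

Calling `sketch::thresholdOp` before `sketch::init`, or after `sketch::init(nullptr)`, returns -1001 and leaves the output untouched, matching the pre-init and negative-init behavior described above.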
Release Packaging
- 4 standard tier zips shipped from commercial/package/:
  - acl-pack-trial-v1.0.0-android-arm64.zip (~725 KB)
  - acl-pack-core-v1.0.0-android-arm64.zip (~1.2 MB)
  - acl-pack-advanced-v1.0.0-android-arm64.zip (~1.4 MB)
  - acl-pack-full-v1.0.0-android-arm64.zip (~1.5 MB)
- Each zip layout: lib/arm64-v8a/libacl.a + include/acl/ + demo/ + README.md (EN) + README_CN.md (CN) + LICENSE
Regression Highlights
- 4 devices × 7 size tiers: 7,813 rows per device, 31,252 benchmark rows in total (7,813 × 4); results are bit-identical across all devices
- Pass rate 7,421 / 7,813 = 95.0% per device (was 7,561 / 7,717 = 98.0% at S67; new NEON sub_tests added to regression matrix pull this down); 154 failures all in documented categories
- +89 PASS vs S67 / 0 new FAIL / +96 rows across the 7 size tiers (15 new NEON sub_tests for bilateralFilter / boxFilter / filter2D / sepFilter2D / resize variants)
- Commercial v2b regression (all 4 SoCs) — 30,753 rows/device × 4 devices = 123,012 bit-identical tests, 0 PASS flips vs release baseline and 0 flips cross-device (4 SoCs × 4 SKU tiers × 7 image sizes; ship gate)
- Overall M-tier speedup: 1.56× over OpenCV 4.13.0 (3,217 timed rows aggregated across 4 devices)
Bug Fixes (S47 → S67)
- S60: houghLines NMS (CPP + NEON); opticalFlowLK rewrite
- S62: 3 crash fixes: gabor vld1_u8, YUV odd-dimension, extractPixelPerUxV ceil boundary
- S64: Geometric YUV takes YUV_STD + range template params (shares cvtcolor infra); BT.601/709/2020 × full/limited coverage 24/24 PASS
- S65: cvtcolor roundtripTolerance 3 → 160 + BT.601 decoder range alignment (4-device FAIL 352 → 156)
- S66-67: houghCircles CPP + NEON rewrite (gradient-line vote + top-K NMS + NEON); coverage tuned 0.10 → 0.12 (ACL 4 circles vs OCV 5, was 12 over-detect)
Release Hardening (2026-04-27)
- Trial zip repackaged: commercial/package/acl-pack-trial-v1.0.0-android-arm64.zip (239 KB). acl.h auto-defines ACL_BUILT_TIER_TRIAL when linking libacl_trial.a, so trial consumers need no manual macro setup. Now ships a pre-signed license.dat at the zip root so acl::init("license.dat") succeeds out-of-the-box (2-year validity from release date)
- watermark.h shipped in trial zip (3 ship-path fixes); trial guard now activates automatically
- Dead-code cleanup: removed jni_verify, strcasecmp_portable, license_manager.py, stray test_license_tier.cpp, and stale build.sh echo text
- Gate negative test: acl::init(nullptr) returns -1001 (verified on DX-5 with fresh binary)
- Plaintext audit across 4 tier .a files: 0 ACE11CDA magic leak, 0 acl::internal leak; only mangled C++ symbols exposed (acceptable per obfuscation design)
- push_test_all.sh SD8550 trap diagnosed: a past adb root push left /data/local/tmp/acl_test/bin/ owned by root with write blocked for uid=2000 (shell), silently freezing binaries at the old build. One-shot fix documented; script defensive-purge is a follow-up TODO
Release Chain End-to-End Validation (2026-04-28)
- Zip-only smoke 4/4 tier on DX-4: each tier zip is unzipped, compiled against its own bundled libacl.a + acl.h via NDK clang++ (the same path a customer takes), pushed to device, and run. All 4 tiers: 5/5 PASS. Trial correctly returns -1006 for 1921×1080 (resolution cap) and applies watermark; core/advanced/full allow 2048×1536 and leave output clean
- Trial demo scoped to 3 operators: new demo_trial.cpp ships only the trial-whitelisted operators (gaussianBlur / threshold / resize) and demonstrates the cap + watermark. Other tiers continue to ship the full demo_acl.cpp
- Package tooling hardening: package_sdk.sh now selects tier-appropriate demo (demo_trial.cpp for trial, demo_acl.cpp for core/advanced/full)
- Second-device commercial validation (DX-4 × 4 tiers × M size): 4/4 pairs non-timing bit-identical to release DX-4 M baseline (1130 rows, 1110 grep-PASS, 19 grep-FAIL each). Combined with SD8550's 28/28, commercial builds are validated on 2 physical devices across 32 (tier, size) pairs, 32/32 bit-identical; the 4-device cross-device invariant (ISA determinism) carries the remaining 2 devices
Test / Infra Improvements
- 7-tier sizing: S 640×480, M 1920×1280, L 4096×3072, E1 639×479, E2 1277×717, E3 17×17, E4 1919×1279
- Human review complete: 2,550+ rows across S, E1, and M tiers visually verified
- Cross-device pairwise diff = 0 on all 6 device pairs
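The figure of 6 device pairs follows from choosing 2 out of 4 test devices. As a small sketch, the pairs can be enumerated as below; `devicePairs` is an illustrative helper, and the device names in the usage note are taken from the v1.0.3 platform table on the assumption that the same four devices were used here.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Enumerate all unordered pairs from a list of device names.
std::vector<std::pair<std::string, std::string>> devicePairs(
        const std::vector<std::string>& devices) {
    std::vector<std::pair<std::string, std::string>> pairs;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        for (std::size_t j = i + 1; j < devices.size(); ++j) {
            pairs.emplace_back(devices[i], devices[j]);
        }
    }
    return pairs;
}
```

For the list {"DX-4", "DX-5", "SD8550", "SD8650"} this yields 6 pairs, each of which would be diffed once.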
Performance (Session 67 new Top)
Top operators on DX-5 M tier (ACL > 0.5 ms):
| Speedup | Operator |
|---|---|
| 27.16× | nlmeans_h10_p3_s5 NEON |
| 14.96× | gammaTransform_2.2_1ch CPP |
| 13.79× | nlmeans_h5_p2_s3 NEON |
| 13.53× | bgr2Lab NEON |
v1.1.0 (April 2026) — superseded by v1.0.1
P1 operator expansion + YUV renaming (integrated into v1.0.1 final release).
New Operators (10)
| Category | Operator | Tier | NEON |
|---|---|---|---|
| Feature | houghLines | Advanced | Yes |
| Feature | houghLinesP | Advanced | Yes |
| Feature | houghCircles | Advanced | Yes |
| Feature | opticalFlowLK | Advanced | Yes |
| Feature | bfMatch (L2) | Advanced | Yes |
| Feature | bfMatchBinary (Hamming) | Advanced | Yes |
| Feature | bfKnnMatch (L2) | Advanced | Yes |
| Feature | bfKnnMatchBinary (Hamming) | Advanced | Yes |
| Contour | minAreaRect | Advanced | No |
| Contour | fitEllipse | Advanced | No |
Changes
- Total operators: 99 → 109 (Core 53, Advanced 93 cumulative, Full 109)
- YUV pipeline naming: The NV21/NV12 resize+rotate+YUV→RGB pipeline is exposed as YUV_utilities / YUV_utilities_crop / YUV_utilities_float / YUV_utilities_crop_float (Enterprise Only). An earlier experimental naming scheme (nv21ResizeRotate2RGB / yuvResizeRotate*) was reverted; only the YUV_utilities* headers ship.
- New test files: test_feature_houghLines, test_feature_houghCircles, test_feature_opticalFlowLK, test_feature_descriptorMatch, test_contour_minAreaRect
v1.0.0 (April 2026)
Initial commercial release.
Highlights
- 89 operators across 8 categories: filter, color conversion, geometric, arithmetic, analysis, feature detection, transform, math
- ARM NEON acceleration for all performance-critical operators
- 83% of NEON operators faster than OpenCV on ARM64 (140 faster / 7 tie / 21 slower)
- Peak speedups: resize area 37.5x, NL-Means denoising 27.8x, Scharr 24.1x, LUT 6.9x (v1.0.0 single-device baseline; see v1.0.1 for multi-device S67 figures)
- Zero dependencies — static library, no OpenCV/third-party runtime required
- ~200KB library size (vs OpenCV ~50MB)
Operators by Category
| Category | Count | Key Operators |
|---|---|---|
| Filter | 18 | gaussianBlur, boxFilter, medianFilter, bilateralFilter, canny, sobel, morphology, guidedFilter, nlMeansDenoising |
| Color Conversion | 11 | RGB/YUV (float+fixed), Bayer, grayscale, gamma, channel swap |
| Geometric | 7 | resize (NN/bilinear/area), rotate, NV21 compound ops, pyramid |
| Arithmetic | 14 | threshold, adaptiveThreshold, LUT, normalize, bitwise, addWeighted, inRange |
| Analysis | 16 | histogram, integral, matchTemplate (6 methods), CLAHE, moments, minMaxLoc |
| Feature | 8 | Harris, ORB, SIFT, SURF, shiTomasiDetect, HOG |
| Transform | 8 | warpAffine, warpPerspective, remap, findHomography |
| Math | 5 | DFT/IDFT (1D/2D), mulSpectrums |
Platform Support
- Android ARM64 (arm64-v8a), NDK r26+
- ARM Linux (aarch64) — same library, different binding
Tier System
| Tier | Operators | Data Types |
|---|---|---|
| Trial | 2 (gaussianBlur + resize) | uint8_t |
| Starter (Core) | 48 | uint8_t |
| Pro (Advanced) | 71 | uint8_t, uint16_t, int16_t |
| Business (Full) | 89 | uint8_t, uint16_t, int16_t, float |
License System
- RSA-2048 signed license files (PKCS#1 v1.5 + SHA-256)
- Per-application binding (Android package name + certificate SHA-256), replaced in v1.0.1 by Taobao licenseId traceability
- Expiration date enforcement
- Tier validation (license tier must match library tier)
Known Limitations
- 24 test cases with known deviations (canny edge thresholds, Bayer demosaic edge pixels, pyramid upscale boundary, NV21 color space rounding)
- matchTemplate NEON: slower than OpenCV for 6 of 12 variants (TM_CCORR_NORMED, TM_CCOEFF_NORMED on large templates)
- No iOS/macOS support in v1.0 (planned for v2.0)