// Copyright (c) 2023 Qualcomm Technologies, Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

include::{generated}/meta/{refprefix}VK_QCOM_image_processing2.adoc[]

=== Other Extension Metadata

*Last Modified Date*::
    2023-03-10
*Interactions and External Dependencies*::
  - This extension requires
    {spirv}/QCOM/SPV_QCOM_image_processing2.html[`SPV_QCOM_image_processing2`]
  - This extension provides API support for
    {GLSLregistry}/qcom/GLSL_QCOM_image_processing2.txt[`GL_QCOM_image_processing2`]

*Contributors*::
  - Jeff Leger, Qualcomm Technologies, Inc.

=== Description

This extension enables support for the SPIR-V code:TextureBlockMatch2QCOM
capability.
It builds on the functionality of QCOM_image_processing with the addition of
4 new image processing operations.

  * The code:opImageBlockMatchWindowSADQCOM` SPIR-V instruction builds upon
    the functionality of code:opImageBlockMatchSADQCOM` by repeatedly
    performing block match operations across a 2D window.
    The "`2D windowExtent`" and "`compareMode`" are are specified by
    slink:VkSamplerBlockMatchWindowCreateInfoQCOM in the sampler used to
    create the _target image_.
    Like code:OpImageBlockMatchSADQCOM, code:opImageBlockMatchWindowSADQCOM
    computes an error metric, that describes whether a block of texels in
    the _target image_ matches a corresponding block of texels in the
    _reference image_.
    Unlike code:OpImageBlockMatchSADQCOM, this instruction computes an error
    metric at each (X,Y) location within the 2D window and returns either
    the minimum or maximum error.
    The instruction only supports single-component formats.
    Refer to the pseudocode below for details.
  * The code:opImageBlockMatchWindowSSDQCOM follows the same pattern,
    computing the SSD error metric at each location within the 2D window.
  * The code:opImageBlockMatchGatherSADQCOM builds upon
    code:OpImageBlockMatchSADQCOM.
    This instruction computes an error metric, that describes whether a
    block of texels in the _target image_ matches a corresponding block of
    texels in the _reference image_.
    The instruction computes the SAD error metric at 4 texel offsets and
    returns the error metric for each offset in the X,Y,Z,and W components.
    The instruction only supports single-component texture formats.
    Refer to the pseudocode below for details.
  * The code:opImageBlockMatchGatherSSDQCOM follows the same pattern,
    computing the SSD error metric for 4 offsets.

Each of the above 4 image processing instructions are limited to
single-component formats.

Below is the pseudocode for GLSL built-in function
code:textureWindowBlockMatchSADQCOM.
The pseudocode for code:textureWindowBlockMatchSSD is identical other than
replacing all instances of `"SAD"` with `"SSD"`.

[source,c]
----
vec4 textureBlockMatchWindowSAD( sampler2D target,
                                 uvec2 targetCoord,
                                 samler2D reference,
                                 uvec2 refCoord,
                                 uvec2 blocksize) {
    // compareMode (MIN or MAX) comes from the vkSampler associated with `target`
    // uvec2 window  comes from the vkSampler associated with `target`
    minSAD = INF;
    maxSAD = -INF;
    uvec2 minCoord;
    uvec2 maxCoord;

    for (uint x=0, x < window.width; x++) {
        for (uint y=0; y < window.height; y++) {
            float SAD = textureBlockMatchSAD(target,
                                            targetCoord + uvec2(x, y),
                                            reference,
                                            refCoord,
                                            blocksize).x;
            // Note: the below comparison operator will produce undefined results
            // if SAD is a denorm value.
            if (SAD < minSAD) {
                minSAD = SAD;
                minCoord = uvec2(x,y);
            }
            if (SAD > maxSAD) {
                maxSAD = SAD;
                maxCoord = uvec2(x,y);
            }
        }
    }
    if (compareMode=MIN) {
        return vec4(minSAD, minCoord.x, minCoord.y, 0.0);
    } else {
        return vec4(maxSAD, maxCoord.x, maxCoord.y, 0.0);
    }
}
----

Below is the pseudocode for code:textureBlockMatchGatherSADQCOM.
The pseudocode for code:textureBlockMatchGatherSSD follows an identical
pattern.

[source,c]
----
vec4 textureBlockMatchGatherSAD( sampler2D target,
                                 uvec2 targetCoord,
                                 samler2D reference,
                                 uvec2 refCoord,
                                 uvec2 blocksize) {
    vec4 out;
    for (uint x=0, x<4; x++) {
            float SAD = textureBlockMatchSAD(target,
                                            targetCoord + uvec2(x, 0),
                                            reference,
                                            refCoord,
                                            blocksize).x;
            out[x] = SAD;
    }
    return out;
}
----

include::{generated}/interfaces/VK_QCOM_image_processing2.adoc[]

=== Issues

1) What is the precision of the min/max comparison checks?

*RESOLVED*: Intermediate computations for the new operations are performed
at 16-bit floating point precision.
If the value of `"float SAD"` in the above code sample is a 16-bit denorm
value, then behavior of the MIN/MAX comparison is undefined.

=== Version History

  * Revision 1, 2023-03-10 (Jeff Leger)