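Motivation

The most direct motivation is that I recently needed to implement screen-space contact shadows. At the Sony Creators Conference at SIGGRAPH 2023, Graham Aldridge of Sony's Bend Studio presented how Days Gone computes screen-space contact shadows; the slides and reference code can be found here. Slides 24-27 show a novel way of dispatching a compute shader. A conventional dispatch usually cuts the frame horizontally and vertically into small tiles whose pixel count is a multiple of 64 and maps each thread group to one tile, whereas Days Gone maps each thread group to a radially arranged strip of pixels. The figure below gives the general idea: adjacent pixels of the same color belong to the same thread group, with the conventional dispatch on the left and the radial dispatch on the right.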

RadiallyDispatchedComputeShader.png

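When doing a radial blur or computing contact shadows, we often need to sample a texture repeatedly along some direction. With many samples, the usual idea is to cache texels in the compute shader's group shared memory to reduce the number of texture fetches. But caching for an arbitrary direction requires O((N+C)^2) colors; if the thread group size or the number of steps is large, this easily exceeds the group shared memory limit. If we dispatch radially instead, so that the pixels of each thread group line up along the sampling direction, we only need to cache (N+C)*2 colors even when accounting for linear interpolation, which makes long marches much more practical.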
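To get a feel for these numbers, here is a rough back-of-the-envelope sketch. It assumes that N is the number of pixels a thread group covers along one axis, that C is the maximum march distance in pixels, that each cached color is a float4 (16 bytes), and that the limit is the 32 KB of group shared memory available to a Direct3D 11 compute shader. The function names are hypothetical and the example sizes are mine, not taken from the talk.

```python
BYTES_PER_COLOR = 16             # assumption: one float4 per cached texel
SHARED_MEMORY_LIMIT = 32 * 1024  # assumption: 32 KB per thread group (D3D11 cs_5_0)

def square_tile_cache_bytes(n, c):
    """Cache for an n x n tile that may march c pixels in any direction."""
    return (n + c) ** 2 * BYTES_PER_COLOR

def radial_cache_bytes(n, c):
    """Cache for n pixels laid out along the march direction,
    with two rows so that neighboring texels can be interpolated."""
    return (n + c) * 2 * BYTES_PER_COLOR

if __name__ == "__main__":
    print(square_tile_cache_bytes(16, 64), "bytes for a 16x16 tile marching 64 pixels")
    print(radial_cache_bytes(128, 64), "bytes for a 128-pixel radial strip marching 64 pixels")
```

Under these assumptions the square tile needs roughly three times the available shared memory, while the radial strip uses under a fifth of it despite covering eight times as many pixels along the march direction.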

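Compared with Sony's presentation, this article solves the problem of overlapping pixels between thread groups, and tries to cover the various conditionals involved in setting up the dispatch parameters. I use Unity 2022.3.21f1 with URP 14.0.10.

How to dispatch radially

Dispatch scheme and rationale

First, note that all rays on screen pointing toward the center fall into four kinds: bottom-left, top-left, bottom-right, and top-right. The most obvious difference between them is the sign of their components, so we can split the dispatch into four groups of data. Each group finds its offset in the same way; multiplying the offset by the sign and adding the center coordinates gives the pixel coordinates.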
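As a minimal sketch of this sign trick, the Python below mirrors what GetDispatchDirection does in the shader listing further down. It assumes the same ordering as that listing (two dispatch types per quadrant, in the order bottom-left, top-left, bottom-right, top-right); group_start_pixel is a hypothetical helper name.

```python
def dispatch_direction(dispatch_type):
    """Map a dispatch type 0..7 to (sign, center_offset) for its quadrant."""
    quadrant = dispatch_type // 2                 # two dispatch types per quadrant
    direction = (quadrant // 2, quadrant % 2)     # 0 = negative side, 1 = positive side
    sign = (direction[0] * 2 - 1, direction[1] * 2 - 1)
    # Negative sides start one pixel before the center so quadrants do not overlap.
    center_offset = (direction[0] - 1, direction[1] - 1)
    return sign, center_offset

def group_start_pixel(center, dispatch_type, offset):
    """Combine the integer center, the quadrant sign and a non-negative offset."""
    sign, center_offset = dispatch_direction(dispatch_type)
    return (center[0] + center_offset[0] + sign[0] * offset[0],
            center[1] + center_offset[1] + sign[1] * offset[1])

if __name__ == "__main__":
    for dispatch_type in range(0, 8, 2):
        print(dispatch_type, group_start_pixel((100, 50), dispatch_type, (3, 2)))
```

The same non-negative offset therefore lands in four mirrored positions around the center, which is why only one quadrant has to be worked out in detail.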

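So we only need to consider one case; take the top-right quadrant as an example. The figure below is a schematic of a radial dispatch. The green point is the center, and all thread groups are arranged radially around it. The black frame is the screen region to the top-right of the center (a small 18x10-pixel region is used for simplicity). Every four adjacent white squares belong to the same thread group (I did not draw the remaining thread groups). The blue cells are the starting pixels of the thread groups. There are two shades, dark blue and light blue, corresponding to two dispatch patterns: one is laid out as squares, the other as a rectangle. The gray cells are all the pixels computed for the threads. To make the gray area cover the entire black frame, we have to dispatch more threads than there are pixels in the frame.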

RadialDispatchDiagram.png

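Computing each thread's pixel directly seems difficult, so we split the dispatch into two dimensions: the first dimension gives the starting pixel of the thread group (the blue cells above), and the second dimension, together with that starting pixel, gives the pixel position. The dispatched data therefore becomes a GroupID and a GroupIndex. Note that the position of the light-blue region depends on the aspect ratio of the black frame: when the frame is taller than it is wide, the light-blue region sits above the dark-blue one and runs horizontally. We can handle this with an xMajor test: if the frame is not xMajor, we swap the x and y components, do all the calculations, and swap them back at the end.

Following the dark-blue and light-blue regions in the figure, we compute the GroupID for the two regions separately. The light-blue region is the simpler one: given the height of each column, we compute the column index of the GroupID and its index within that column, which gives the coordinates of the starting pixel. For the dark-blue region, simply summing ring by ring leads to a quadratic equation; it can be solved, but not very efficiently. Instead we can borrow Gauss's summation trick and merge the vertical pixels of the first ring with the horizontal pixels of the last ring into one column (in the figure, dark-blue cells with the same pattern in their top-left corner belong to the same column). Every column then has the same height, so the index can be computed the same way as for the light-blue region; afterwards we compare the index within the column against the length of the vertical part to decide whether it is a vertical or a horizontal pixel.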
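The pairing can be sketched in a few lines of Python. This is a simplified version that assumes the center is on screen (all offsets are zero) and the layout is x-major; it follows the dark-blue branch of GetDispatchOffset in the shader listing below with those offsets removed. The helper names and the brute-force reference are mine.

```python
def dark_blue_group_start(dispatch_index, ring_count, thread_count):
    """Map a linear index to the start pixel of a thread group in the square rings.

    Ring j (0-based) has thread_count * (j + 1) vertical start pixels and one
    fewer horizontal start pixel, so pairing the vertical edge of ring c with
    the horizontal edge of ring ring_count - 1 - c gives every column the
    same height.
    """
    stride = (ring_count + 1) * thread_count - 1   # constant column height
    col, row = divmod(dispatch_index, stride)
    vertical_count = (col + 1) * thread_count
    if row < vertical_count:
        return (vertical_count - 1, row)           # vertical edge of ring `col`
    # Horizontal edge of the paired ring `ring_count - 1 - col`.
    return (row - vertical_count, stride - vertical_count)

def expected_ring_starts(ring_count, thread_count):
    """Brute-force reference: the L-shaped outer edge of every ring."""
    starts = set()
    for ring in range(ring_count):
        edge = (ring + 1) * thread_count - 1
        starts.update((edge, y) for y in range(edge + 1))   # vertical edge, corner included
        starts.update((x, edge) for x in range(edge))       # horizontal edge, corner excluded
    return starts

if __name__ == "__main__":
    rings, threads = 3, 4
    total = ((rings + 1) * threads - 1) * rings
    mapped = [dark_blue_group_start(i, rings, threads) for i in range(total)]
    print(total, "groups,", len(set(mapped)), "distinct start pixels")
```

Only a division and a comparison are needed per group; no square root is required to find which ring an index falls into.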

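Once we have the starting coordinates of the thread group, we take the vector from the starting pixel to the center, step along X or Y in units of 1, and round the other component to the nearest integer. This gives the offset of the current thread's pixel relative to the thread group's starting pixel; adding the two gives the final pixel coordinates.

In practice the center may lie outside the screen, in which case the figure above becomes the one below. When computing the column height we additionally have to account for the offset of the center, and the dark-blue region skips rings that lie entirely off screen.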
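A minimal Python sketch of this stepping, following the arithmetic of the RadialDispatch kernel listed further down. It assumes the group's starting pixel does not sit exactly on the center (otherwise the slope is undefined); thread_pixel is a hypothetical name.

```python
def thread_pixel(group_start, center, thread_index):
    """Pixel visited by thread `thread_index` of the group starting at `group_start`."""
    # Direction from the center of the starting pixel to the dispatch center.
    to_center = (center[0] - (group_start[0] + 0.5),
                 center[1] - (group_start[1] + 0.5))
    abs_dir = (abs(to_center[0]), abs(to_center[1]))
    sign = tuple((d > 0) - (d < 0) for d in to_center)
    x_major = abs_dir[0] >= abs_dir[1]
    ratio = abs_dir[1] / abs_dir[0] if x_major else abs_dir[0] / abs_dir[1]
    # One pixel per thread on the major axis, nearest integer on the minor axis.
    minor = int(thread_index * ratio + 0.5)
    offset = (thread_index, minor) if x_major else (minor, thread_index)
    return (group_start[0] + offset[0] * sign[0],
            group_start[1] + offset[1] * sign[1])

if __name__ == "__main__":
    print([thread_pixel((7, 3), (0.0, 0.0), t) for t in range(8)])
```

For a group starting at (7, 3) with the center at the origin, the eight threads walk a rasterized line down to (0, 0).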

RadialDispatchDiagram2.png
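To make the off-screen case concrete, the sketch below reproduces the per-axis bookkeeping that GetDispatchList performs in the C# listing further down: how far the center lies outside the texture on each side, and how many THREAD_COUNT-sized blocks are needed on each side. It is an illustration in Python rather than the article's code; trunc_div is a hypothetical helper added because Python's // rounds down while C# integer division truncates toward zero.

```python
THREAD_COUNT = 128

def trunc_div(a, b):
    """Integer division that truncates toward zero, like C# int division."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def quadrant_extents(center, texture_size):
    """Center offsets and block counts for the left/bottom and right/top sides."""
    # How far the center lies beyond the right/top edge (affects left/bottom rays).
    offset_lb = tuple(max(0, c - s) for c, s in zip(center, texture_size))
    # How far the center lies beyond the left/bottom edge (affects right/top rays).
    offset_rt = tuple(max(0, -c) for c in center)
    # Number of THREAD_COUNT-sized blocks on each side, rounded up.
    coord_lb = tuple(trunc_div(c + THREAD_COUNT - 1, THREAD_COUNT) for c in center)
    coord_rt = tuple(trunc_div(s - c + THREAD_COUNT - 1, THREAD_COUNT)
                     for c, s in zip(center, texture_size))
    return offset_lb, offset_rt, coord_lb, coord_rt

if __name__ == "__main__":
    print(quadrant_extents((960, 540), (1920, 1080)))   # center on screen
    print(quadrant_extents((-200, 300), (1920, 1080)))  # center left of the screen
```

With the center 200 pixels to the left of a 1920x1080 target, the left side needs no blocks at all on the x axis, while the right side needs 17 and carries an x offset of 200.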

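Extra parameters for radial dispatch

To compute each thread's pixel in the compute shader, we need to pass some extra parameters from the CPU. In a radial dispatch, what we read from SV_DispatchThreadID is really two values, a GroupID and a GroupIndex. From the discussion above, there are 4 * 2 cases: the combinations of bottom-left, top-left, bottom-right, and top-right with dark blue and light blue. For each combination we need to know the total count in order to compute the GroupID within that combination. Given how we compute things, we also need the column height and the xMajor flag for each combination. To support a center outside the screen, we also need the offset of the center. So our parameters are 8 sets of 5 int values: offset X, offset Y, the running total of thread groups so far, the column height, and xMajor. xMajor is really a boolean and could be packed into one bit of the column height, which would make it exactly four ints, but for ease of demonstration I skip that optimization here.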

private struct DispatchParams
{
    public int2 offset;
    public int count;
    public int stride;
    public int xMajor;
    public DispatchParams(int2 offset, int count, int stride, int xMajor)
    {
        this.offset = offset; this.count = count; this.stride = stride; this.xMajor = xMajor;
    }
}
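The running totals can be resolved back into a dispatch type with a short linear scan, which is what GetDispatchType does in the shader listing below. The Python sketch here mirrors that lookup and also illustrates the packing idea mentioned above. The packing layout (xMajor in the lowest bit of the column height) is purely an assumption of mine; the article's code does not pack anything.

```python
def cumulative_counts(counts):
    """Running totals, as written into the `count` field of each entry."""
    totals, running = [], 0
    for count in counts:
        running += count
        totals.append(running)
    return totals

def find_dispatch_type(group_id, totals):
    """Return (dispatch_type, index_within_type) for a linear group id.

    The index counts down from the end of each range, matching
    `count - 1 - index` in the shader.
    """
    for dispatch_type, total in enumerate(totals):
        local = total - 1 - group_id
        if local >= 0:
            return dispatch_type, local
    return 0, -1  # group id beyond the last range

def pack_stride(stride, x_major):
    """Hypothetical packing: keep xMajor in the lowest bit of the stride field."""
    return (stride << 1) | (1 if x_major else 0)

def unpack_stride(packed):
    """Inverse of pack_stride: returns (stride, x_major)."""
    return packed >> 1, bool(packed & 1)

if __name__ == "__main__":
    totals = cumulative_counts([3, 0, 2, 4, 0, 0, 1, 0])
    print(totals)
    print([find_dispatch_type(i, totals) for i in range(10)])
```

Empty combinations (count 0) simply repeat the previous running total, so the scan skips them without any special casing.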

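Fixing the flicker caused by multiple threads mapping to the same pixel

Because we step from the outside inward, more and more threads inevitably collide on the same pixel as we approach the center; the schematic also shows the aliasing getting stronger toward the center. To solve this, we cast a ray from the center through the current thread's pixel and find the thread group starting pixel closest to that ray. If it is the same as the current thread group's starting pixel, the pixel belongs to the current thread group and we keep it; otherwise the pixel belongs to another thread group and we skip the write. This ensures that a pixel is written by at most one thread group.

Since pixels on screen are no longer guaranteed to be written (especially when we have introduced a bug), I recommend clearing the RenderTexture to 0 first while debugging; this article keeps ClearMain in the compute shader for that purpose.

The code

The guiding idea is what was described above, but the actual implementation gets dizzying with all the modulo, remainder, plus-one, and minus-one operations... There is also a small optimization: we compare {the slope of the ray from the center to the current thread group, multiplied by the current pixel's horizontal offset} with {the slope of the ray from the center to the current pixel, multiplied by the current pixel's horizontal offset} to quickly decide whether the current pixel belongs to the current thread group. RadialDispatch is the entry point for the radial dispatch and NormalDispatch the entry point for the regular dispatch; both are visualized by hashing the GroupID.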
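The effect of this ownership test can be checked on a tiny case. The Python sketch below is a simplified model, not the shader itself: it handles only x-major groups in the top-right quadrant and uses the slope comparison from the RadialDispatch kernel listed further down. The four-thread groups and the function names are assumptions for illustration.

```python
def minor_offset(slope, step):
    """Nearest integer offset on the minor axis after `step` major-axis steps."""
    return int(step * slope + 0.5)

def slope_to(center, pixel):
    """Minor/major slope of the ray between `center` and the center of `pixel`."""
    dx = abs(pixel[0] + 0.5 - center[0])
    dy = abs(pixel[1] + 0.5 - center[1])
    return dy / dx

def group_writes(group_start, center, thread_count, check_owner):
    """Pixels written by one x-major thread group in the top-right quadrant."""
    group_slope = slope_to(center, group_start)
    written = []
    for step in range(thread_count):
        base = minor_offset(group_slope, step)
        pixel = (group_start[0] - step, group_start[1] - base)  # march toward the center
        if check_owner and minor_offset(slope_to(center, pixel), step) != base:
            continue  # the ray through this pixel belongs to another thread group
        written.append(pixel)
    return written

def all_writes(check_owner):
    """Four groups of four threads starting on the column x = 3."""
    writes = []
    for y in range(4):
        writes += group_writes((3, y), (0.0, 0.0), 4, check_owner)
    return writes

if __name__ == "__main__":
    print(len(all_writes(False)), "writes without the test")
    print(len(all_writes(True)), "writes with the test")
```

In this small case the four groups issue 16 writes onto 10 distinct pixels without the test, and exactly one write per pixel with it, with no pixel left uncovered.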

RadialDispatchComputeShader.compute

#pragma kernel RadialDispatch
#pragma kernel NormalDispatch
#pragma kernel ClearMain
// #pragma warning(disable: 3556)

#define THREAD_COUNT 128

Texture2D<float4> _ColorTex;
RWTexture2D<float4> _RW_TargetTex;

float2 _CenterPosSS;
float4 _TextureSize;

struct DispatchParams
{
    int2 offset;
    int count;
    int stride;
    int xMajor;
};
StructuredBuffer<DispatchParams> _DispatchData;

int GetDispatchType(int index, out int dispatchIndex, out DispatchParams dispatchParams)
{
    for (int i=0; i<8; ++i)
    {
        dispatchParams = _DispatchData[i];
        dispatchIndex = dispatchParams.count - 1 - index;
        if (dispatchIndex >= 0) return i;
    }
    return 0;
}

int2 GetDispatchDirection(int dispatchType, out int2 iLightPosOffset)
{
    dispatchType /= 2;
    int xDir = dispatchType / 2;
    int yDir = dispatchType % 2;
    int2 dir = int2(xDir, yDir);
    iLightPosOffset = dir - 1;
    return dir * 2 - 1;
}

int2 GetDispatchOffset(int dispatchType, int dispatchIndex, DispatchParams dispatchParams, out int groupIndex)
{
    groupIndex = 0;
    int2 dispatchOffset = int2(0, 0);
    int offsetType = dispatchType % 2;
    int colIndexOffset = max(dispatchParams.offset.x,dispatchParams.offset.y)/THREAD_COUNT;
    int2 indexOffset = dispatchParams.xMajor==1?dispatchParams.offset:dispatchParams.offset.yx;
    
    int stride = dispatchParams.stride;
    int colIndex = dispatchIndex / stride;
    int rowIndex = dispatchIndex - colIndex * stride;
    if (offsetType == 0)
    {
        int offsetedColIndex = colIndex + colIndexOffset;
        int tempIndex = rowIndex + indexOffset.y - (offsetedColIndex + 1) * THREAD_COUNT;
        if (tempIndex >= 0)
        {
            dispatchOffset = int2(tempIndex + indexOffset.x, dispatchParams.stride - (colIndex + colIndexOffset + 1) * THREAD_COUNT + indexOffset.x + indexOffset.y);
            groupIndex = tempIndex;
        }
        else
        {
            dispatchOffset = int2((offsetedColIndex + 1) * THREAD_COUNT - 1, rowIndex + indexOffset.y);
            groupIndex = rowIndex;
        }
    }
    else
    {
        int minOffsetX = max(dispatchParams.stride + indexOffset.y, (colIndexOffset + 1) * THREAD_COUNT);
        dispatchOffset = int2(minOffsetX + colIndex * THREAD_COUNT - 1, rowIndex + indexOffset.y);
        groupIndex = rowIndex;
    }
    if (dispatchParams.xMajor == 0) dispatchOffset.xy = dispatchOffset.yx;
    return dispatchOffset;
}

// https://www.shadertoy.com/view/4djSRW
//  1 out, 1 in...
float hash11(float p)
{
    p = frac(p * .1031);
    p *= p + 33.33;
    p *= p + p;
    return frac(p);
}

// https://www.shadertoy.com/view/MsS3Wc
// Smooth HSV to RGB conversion 
float3 hsv2rgb_smooth(float3 c)
{
    float3 rgb = clamp(abs(fmod(c.x*6.0+float3(0.0,4.0,2.0),6.0)-3.0)-1.0, 0.0, 1.0);
    rgb = rgb*rgb*(3.0-2.0*rgb); // cubic smoothing
    return c.z * lerp(float3(1.0, 1.0f, 1.0f), rgb, c.y);
}

[numthreads(1, THREAD_COUNT, 1)]
void RadialDispatch(uint3 id : SV_DISPATCHTHREADID)
{
    float2 centerPosSS = _CenterPosSS;
    int2 iCenterPosSS = int2(floor(centerPosSS + 0.5f));

    int dispatchIndex;
    DispatchParams dispatchParams;
    int dispatchType = GetDispatchType(id.x, dispatchIndex, dispatchParams);
    int2 iCenterPosOffset;
    int2 dispatchDirection = GetDispatchDirection(dispatchType, iCenterPosOffset);
    int groupIndex;
    int2 dispatchOffset = GetDispatchOffset(dispatchType, dispatchIndex, dispatchParams, groupIndex);
    int2 iGroupStartSS = iCenterPosSS + iCenterPosOffset + dispatchDirection * dispatchOffset;

    float2 toCenter = centerPosSS - (float2(iGroupStartSS) + 0.5f);
    float2 absDir = abs(toCenter);
    int2 signDir = sign(toCenter);
    bool xMajor = absDir.x >= absDir.y;
    float2 absNDir = normalize(absDir);

    float absToCenterStepRatio = xMajor ?  absDir.y / absDir.x :  absDir.x / absDir.y;
    int baseOffsetY = int(float(id.y) * absToCenterStepRatio + 0.5f);
    int2 iOffset = xMajor ? int2(id.y, baseOffsetY) : int2(baseOffsetY, id.y);
    int2 iPosSS = iGroupStartSS + iOffset * signDir;
    if (any(iPosSS < int2(0, 0)) || any(iPosSS >= int2(_TextureSize.xy))) return;

    float2 posSS = float2(iPosSS) + 0.5f;
    float2 toPosSS = posSS - centerPosSS;
    float2 absToPos = abs(toPosSS);
    float absToPosStepRatio = xMajor ? absToPos.y / absToPos.x :  absToPos.x / absToPos.y;
    int yIntersect = int(float(id.y) * absToPosStepRatio + 0.5f);
    int yVal = baseOffsetY;
    if (yIntersect != yVal) return;

    float rv1 = hash11(id.x);
    float3 color = hsv2rgb_smooth(float3(rv1, 0.8f, 1.0f));

    _RW_TargetTex[iPosSS] = float4(color, 1.0f);
}

[numthreads(1, THREAD_COUNT, 1)]
void NormalDispatch(uint3 groupID : SV_GroupID,
                    uint groupIndex : SV_GroupIndex,
                    uint3 dispatchThreadID : SV_DispatchThreadID)
{
    float rv1 = hash11(THREAD_COUNT * groupID.x + groupID.y + THREAD_COUNT * groupID.z);   
    float3 color = hsv2rgb_smooth(float3(rv1, 0.8f, 1.0f));

    _RW_TargetTex[dispatchThreadID.xy] = float4(color, 1.0f);
}

[numthreads(16, 16, 1)]
void ClearMain(uint3 id : SV_DISPATCHTHREADID)
{
    _RW_TargetTex[id.xy] = 0.0f;
}

RadialDispatchRenderPass.cs

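One thing to watch out for: when computing the integer nearest to the center, you cannot simply write int2 iCenterPosSS = new int2(centerPosSS + 0.5f);, because the components of centerPosSS may well be negative, and the conversion to int truncates toward zero instead of rounding down.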
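The difference is easy to reproduce outside Unity. In the Python sketch below, int() truncates toward zero like a C# int cast, while math.floor rounds down like the math.floor call used in the pass; the two helper names are hypothetical.

```python
import math

def nearest_pixel_truncating(center):
    """Mimics a plain int cast, which truncates toward zero."""
    return tuple(int(c + 0.5) for c in center)

def nearest_pixel_flooring(center):
    """Mimics int2(math.floor(center + 0.5f)), which always rounds down."""
    return tuple(math.floor(c + 0.5) for c in center)

if __name__ == "__main__":
    center = (-0.7, 10.2)
    print(nearest_pixel_truncating(center))
    print(nearest_pixel_flooring(center))
```

For non-negative coordinates both versions agree; they differ whenever c + 0.5 is negative and not an integer, which is exactly the off-screen case.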

using Unity.Mathematics;

namespace UnityEngine.Rendering.Universal
{
    public class RadialDispatchRenderPass : ScriptableRenderPass
    {
        public static Transform centerTrans;

        private static readonly string passName = "Radial Dispatch Render Pass";
        private ScriptableRenderer renderer;
        private RadialDispatchRendererFeature.RadialDispatchSettings settings;
        private RadialDispatch radialDispatch;
        private ComputeShader computeShader;
        private Vector2Int textureSize;

        private static readonly string radialDispatchTextureName = "_RadialDispatchTexture";
        private static readonly int radialDispatchTextureID = Shader.PropertyToID(radialDispatchTextureName);
        private RTHandle radialDispatchTextureHandle;

        private ComputeBuffer computeBuffer;

        private static readonly int THREAD_COUNT = 128;
        private static readonly int DISPATCH_DATA_COUNT = 8;
        private static readonly int DISPATCH_DATA_STRIDE = 5;
        private static readonly int DISPATCH_DATA_SIZE = DISPATCH_DATA_COUNT * DISPATCH_DATA_STRIDE;
        private int[] dispatchData = new int[DISPATCH_DATA_SIZE];

        public RadialDispatchRenderPass(RadialDispatchRendererFeature.RadialDispatchSettings settings)
        {
            this.settings = settings;
            computeShader = settings.computeShader;
            renderPassEvent = settings.renderPassEvent;
            profilingSampler = new ProfilingSampler(passName);
        }

        public void Setup(ScriptableRenderer renderer, RadialDispatch RadialDispatch)
        {
            this.renderer = renderer;
            this.radialDispatch = RadialDispatch;
        }

        private void EnsureComputeBuffer(int count, int stride)
        {
            if (computeBuffer == null || computeBuffer.count != count || computeBuffer.stride != stride)
            {
                if (computeBuffer != null)
                {
                    computeBuffer.Release();
                }
                computeBuffer = new ComputeBuffer(count, stride, ComputeBufferType.Structured);
            }
        }

        public override void OnCameraSetup(CommandBuffer cmd, ref RenderingData renderingData)
        {
            EnsureComputeBuffer(DISPATCH_DATA_COUNT, DISPATCH_DATA_STRIDE * 4);

            RenderTextureDescriptor desc = renderingData.cameraData.cameraTargetDescriptor;
            textureSize = new Vector2Int(desc.width, desc.height);

            desc.enableRandomWrite = true;
            desc.graphicsFormat = Experimental.Rendering.GraphicsFormat.R16G16B16A16_SFloat;
            desc.depthBufferBits = 0;
            desc.msaaSamples = 1;
            desc.useMipMap = false;
            RenderingUtils.ReAllocateIfNeeded(ref radialDispatchTextureHandle, desc, FilterMode.Point, TextureWrapMode.Clamp, false, 1, 0, radialDispatchTextureName);
        }

        private Vector4 GetTextureSizeParameter(Vector2Int textureSize)
        {
            return new Vector4(textureSize.x, textureSize.y, 1.0f / textureSize.x, 1.0f / textureSize.y);
        }

        private struct DispatchParams
        {
            public int2 offset;
            public int count;
            public int stride;
            public int xMajor;
            public DispatchParams(int2 offset, int count, int stride, int xMajor)
            {
                this.offset = offset; this.count = count; this.stride = stride; this.xMajor = xMajor;
            }
        }

        private void GetDispatchParams(int2 coord, int2 offset, out DispatchParams dp1, out DispatchParams dp2)
        {
            int colIndexOffset = math.max(offset.x, offset.y) / THREAD_COUNT;
            int yIndexOffset;
            int minVal, maxVal, xMajor;
            if (coord.x >= coord.y)
            {
                minVal = coord.y;
                maxVal = coord.x;
                yIndexOffset = offset.y;
                xMajor = 1;
            }
            else
            {
                minVal = coord.x;
                maxVal = coord.y;
                yIndexOffset = offset.x;
                xMajor = 0;
            }

            int stride1 = math.max(0, (minVal + colIndexOffset + 1) * THREAD_COUNT - 1 - offset.x - offset.y);
            int count1 = stride1 * math.max(0, minVal - colIndexOffset);
            int stride2 = math.max(0, (minVal + 1) * THREAD_COUNT - yIndexOffset);
            int count2 = stride2 * math.max(0, maxVal - math.max(minVal, colIndexOffset));
            dp1 = new DispatchParams(offset, count1, stride1, xMajor);
            dp2 = new DispatchParams(offset, count2, stride2, xMajor);
        }

        private void GetDispatchList(int2 iCenterPosSS, int2 textureSize, out DispatchParams[] dispatchList)
        {
            int2 offsetLB = math.max(0, iCenterPosSS - textureSize);
            int2 offsetRT = math.max(0, new int2(0, 0) - iCenterPosSS);
            int2 coordLB = (iCenterPosSS + THREAD_COUNT - 1) / THREAD_COUNT;
            int2 coordRT = (textureSize - iCenterPosSS + THREAD_COUNT - 1) / THREAD_COUNT;

            int2 coordRB = new int2(coordRT.x, coordLB.y);
            int2 coordLT = new int2(coordLB.x, coordRT.y);
            int2 offsetRB = new int2(offsetRT.x, offsetLB.y);
            int2 offsetLT = new int2(offsetLB.x, offsetRT.y);

            GetDispatchParams(coordLB, offsetLB, out DispatchParams dpLB1, out DispatchParams dpLB2);
            GetDispatchParams(coordLT, offsetLT, out DispatchParams dpLT1, out DispatchParams dpLT2);
            GetDispatchParams(coordRB, offsetRB, out DispatchParams dpRB1, out DispatchParams dpRB2);
            GetDispatchParams(coordRT, offsetRT, out DispatchParams dpRT1, out DispatchParams dpRT2);
            dispatchList = new DispatchParams[] { dpLB1, dpLB2, dpLT1, dpLT2, dpRB1, dpRB2, dpRT1, dpRT2 };
        }

        private int SetDispatchData(DispatchParams[] dispatchList)
        {
            if (dispatchList.Length != 8) return 0;
            int totalCount = 0;
            for (int i = 0; i < 8; ++i)
            {
                var param = dispatchList[i];
                totalCount += param.count;
                dispatchData[5 * i + 0] = param.offset.x;
                dispatchData[5 * i + 1] = param.offset.y;
                dispatchData[5 * i + 2] = totalCount;
                dispatchData[5 * i + 3] = param.stride;
                dispatchData[5 * i + 4] = param.xMajor;
            }
            computeBuffer.SetData(dispatchData);
            return totalCount;
        }

        public override void Execute(ScriptableRenderContext context, ref RenderingData renderingData)
        {
            CommandBuffer cmd = renderingData.commandBuffer;
            UniversalRenderer universalRenderer = renderer as UniversalRenderer;
            if (universalRenderer == null || computeShader == null || centerTrans == null) return;

            using (new ProfilingScope(cmd, profilingSampler))
            {
                float4 centerPosWS = new float4(centerTrans.position, 1.0f);
                float4x4 viewMat = renderingData.cameraData.GetViewMatrix();
                float4x4 projMat = renderingData.cameraData.GetGPUProjectionMatrix();
                float4x4 vpMat = math.mul(projMat, viewMat);
                float4 centerPosCS = math.mul(vpMat, centerPosWS);
                centerPosCS.xyz /= math.abs(centerPosCS.w);
                centerPosCS.y = -centerPosCS.y;

                float2 centerPosSS = (centerPosCS.xy * 0.5f + 0.5f) * new float2(textureSize.x, textureSize.y);
                int2 iCenterPosSS = new int2(math.floor(centerPosSS + 0.5f));
                int2 ts = new int2(textureSize.x, textureSize.y);
                GetDispatchList(iCenterPosSS, ts, out DispatchParams[] dispatchList);
                int totalDispatchCount = SetDispatchData(dispatchList);

                var backBuffer = universalRenderer.m_ColorBufferSystem.GetBackBuffer(cmd);
                int clearID = computeShader.FindKernel("ClearMain");
                cmd.SetComputeTextureParam(computeShader, clearID, "_RW_TargetTex", radialDispatchTextureHandle);
                computeShader.GetKernelThreadGroupSizes(clearID, out uint x1, out uint y1, out uint z1);
                cmd.DispatchCompute(computeShader, clearID,
                                    Mathf.CeilToInt((float)textureSize.x / x1),
                                    Mathf.CeilToInt((float)textureSize.y / y1),
                                    1);

                if (radialDispatch.radialDispatch.value)
                {
                    int kernelID = computeShader.FindKernel("RadialDispatch");
                    cmd.SetComputeTextureParam(computeShader, kernelID, "_ColorTex", backBuffer);
                    cmd.SetComputeTextureParam(computeShader, kernelID, "_RW_TargetTex", radialDispatchTextureHandle);
                    cmd.SetComputeVectorParam(computeShader, "_CenterPosSS", new float4(centerPosSS, 0.0f, 0.0f));
                    cmd.SetComputeVectorParam(computeShader, "_TextureSize", GetTextureSizeParameter(textureSize));
                    cmd.SetComputeBufferParam(computeShader, kernelID, "_DispatchData", computeBuffer);

                    computeShader.GetKernelThreadGroupSizes(kernelID, out uint x, out uint y, out uint z);
                    cmd.DispatchCompute(computeShader, kernelID,
                                         Mathf.CeilToInt((float)totalDispatchCount / x),
                                         1,
                                         1);
                }
                else
                {
                    int kernelID = computeShader.FindKernel("NormalDispatch");
                    cmd.SetComputeTextureParam(computeShader, kernelID, "_ColorTex", backBuffer);
                    cmd.SetComputeTextureParam(computeShader, kernelID, "_RW_TargetTex", radialDispatchTextureHandle);
                    cmd.SetComputeVectorParam(computeShader, "_CenterPosSS", new float4(centerPosSS, 0.0f, 0.0f));
                    cmd.SetComputeVectorParam(computeShader, "_TextureSize", GetTextureSizeParameter(textureSize));
                    cmd.SetComputeBufferParam(computeShader, kernelID, "_DispatchData", computeBuffer);

                    computeShader.GetKernelThreadGroupSizes(kernelID, out uint x, out uint y, out uint z);
                    cmd.DispatchCompute(computeShader, kernelID,
                                         Mathf.CeilToInt((float)textureSize.x / x),
                                         Mathf.CeilToInt((float)textureSize.y / y),
                                         1);
                }
                cmd.Blit(radialDispatchTextureHandle, backBuffer);
            }
        }

        public void Dispose()
        {
            radialDispatchTextureHandle?.Release();
            if (computeBuffer != null)
            {
                computeBuffer.Release();
                computeBuffer = null;
            }
        }
    }
}

RadialDispatchRendererFeature.cs

using System;

namespace UnityEngine.Rendering.Universal
{
    public class RadialDispatchRendererFeature : ScriptableRendererFeature
    {

        [Serializable]
        public class RadialDispatchSettings
        {
            public ComputeShader computeShader;
            public RenderPassEvent renderPassEvent = RenderPassEvent.BeforeRenderingPostProcessing;
        }

        public RadialDispatchSettings settings = new RadialDispatchSettings();
        private RadialDispatchRenderPass radialDispatchRenderPass;

        public override void Create()
        {
            radialDispatchRenderPass = new RadialDispatchRenderPass(settings);
        }

        public override void AddRenderPasses(ScriptableRenderer renderer, ref RenderingData renderingData)
        {
            RadialDispatch rd = VolumeManager.instance.stack.GetComponent<RadialDispatch>();
            if (rd.IsActive())
            {
                radialDispatchRenderPass.Setup(renderer, rd);
                renderer.EnqueuePass(radialDispatchRenderPass);
            }
        }

        protected override void Dispose(bool disposing)
        {
            radialDispatchRenderPass?.Dispose();
            base.Dispose(disposing);
        }
    }
}

RadialDispatch.cs

using System;

namespace UnityEngine.Rendering.Universal
{
    [Serializable, VolumeComponentMenuForRenderPipeline("Post-processing/Radial Dispatch", typeof(UniversalRenderPipeline))]
    public sealed class RadialDispatch : VolumeComponent, IPostProcessComponent
    {
        public BoolParameter isEnabled = new BoolParameter(false);
        public BoolParameter radialDispatch = new BoolParameter(true);

        public bool IsActive()
        {
            return isEnabled.value;
        }

        public bool IsTileCompatible() => false;
    }
}

RadialDispatchCenter.cs

using UnityEngine;

[ExecuteAlways]
public class RadialDispatchCenter : MonoBehaviour
{
    public static RadialDispatchCenter Instance { get; private set; }

    private void OnEnable()
    {
        if (Instance == null)
        {
            Instance = this;
            UnityEngine.Rendering.Universal.RadialDispatchRenderPass.centerTrans = this.transform;
        }
        else
        {
            Debug.LogError("Only one instance of RadialDispatchCenter is allowed to exist at the same time.");
            enabled = false;
        }
    }

    private void OnDisable()
    {
        if (Instance == this)
        {
            Instance = null;
            UnityEngine.Rendering.Universal.RadialDispatchRenderPass.centerTrans = null;
        }
    }

    private void OnDestroy()
    {
        if (Instance == this)
        {
            Instance = null;
            UnityEngine.Rendering.Universal.RadialDispatchRenderPass.centerTrans = null;
        }
    }

}

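Afterword

Once again a dizzying mess of modulo, remainder, plus-one, and minus-one, on top of an irregular mapping between thread groups and pixels that is extremely hard to debug. Several times I found myself staring blankly at black regions on the screen, but in the end I got it working. The final code is rather abstract, though; just as I did not look much at the reference code provided by Bend Studio, readers (if there are any) probably will not look much at mine either...

Many thanks to Unity's Mathematics package, which greatly reduced the work of copying the same code into C# for debugging. Hopefully there are no bugs left that I missed. Tomorrow I should be able to write a post on radial blur, then one on screen-space contact shadows, and after that probably grass rendering.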