Generating & Filtering Metal Textures From Live Video

My last post, Applying CIFilters to a Live Camera Feed with Swift, showed the basics of setting up an AVCaptureSession to create and filter CIImages. This post expands on that and discusses using the same capture session fundamentals to create a Metal texture and apply a MetalPerformanceShader Gaussian blur to it.

You could use this technique to apply your own Metal kernel functions to images for video processing or for mapping camera feeds onto 3D objects in Metal scenes.

The function that does the hard work is CVMetalTextureCacheCreateTextureFromImage, and its use may not be immediately obvious because the pixel buffer contains two YCbCr planes - one holding the luma component and the other holding the blue and red chroma differences. Our application has to generate two Metal textures from those planes and, with a small compute shader, combine them into a single RGB texture to display to the user.
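
Incidentally, those two planes are only there if the video data output supplies bi-planar YCbCr pixel buffers rather than BGRA. The color offset and matrix in the shader further down assume video-range YCbCr, so a matching setting, sketched here against an assumed AVCaptureVideoDataOutput named videoOutput, looks something like this:

    let videoOutput = AVCaptureVideoDataOutput()

    videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String:
        NSNumber(unsignedInt: kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)]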

Before we start, I need to thank McZonk for their work doing this in Objective-C. Their article proved invaluable in getting this demo working and discusses the technique in great detail.

After setting up the AVFoundation stuff, to get up and running in Swift we need to follow these steps:

First we need to create a texture cache with CVMetalTextureCacheCreate, passing in a Metal device. A single cache can supply both the luma and chroma textures:

    let device = MTLCreateSystemDefaultDevice()

    // A single texture cache serves both the Y and the CbCr planes
    CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &videoTextureCache)
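
The cache itself lives in a property; given the takeUnretainedValue() calls below, its declaration is along these lines:

    var videoTextureCache : Unmanaged<CVMetalTextureCacheRef>?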

Next, inside the captureOutput() function we create a Core Video texture reference which supplies source image data to Metal. From the pixel buffer we can figure out the size of our texture for the plane we want to use:

    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)

    var cbcrTextureRef : Unmanaged<CVMetalTextureRef>?

    let cbcrWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer!, 1)
    let cbcrHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer!, 1)

...and with that information, we can populate the CVMetalTextureRef from the buffer:

    CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
        videoTextureCache!.takeUnretainedValue(),
        pixelBuffer!,
        nil,
        MTLPixelFormat.RG8Unorm,
        cbcrWidth, cbcrHeight, 1,
        &cbcrTextureRef)

The penultimate step is to get the Metal texture from the Core Video texture:

    let cbcrTexture = CVMetalTextureGetTexture((cbcrTextureRef?.takeUnretainedValue())!)

...and finally, we need to release that texture reference:

    cbcrTextureRef?.release()

These steps are repeated for the luma plane, replacing the plane index '1' with '0' in CVPixelBufferGetWidthOfPlane, CVPixelBufferGetHeightOfPlane and CVMetalTextureCacheCreateTextureFromImage, and using the single-channel MTLPixelFormat.R8Unorm in place of RG8Unorm.
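
For completeness, the luma version looks something like this (yTextureRef, yWidth and yHeight are just my names here):

    var yTextureRef : Unmanaged<CVMetalTextureRef>?

    let yWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer!, 0)
    let yHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer!, 0)

    CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
        videoTextureCache!.takeUnretainedValue(),
        pixelBuffer!,
        nil,
        MTLPixelFormat.R8Unorm,
        yWidth, yHeight, 0,
        &yTextureRef)

    let yTexture = CVMetalTextureGetTexture((yTextureRef?.takeUnretainedValue())!)

    yTextureRef?.release()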

I've created a small compute shader, based on McZonk's fragment shader, that combines the Y and CbCr textures into a single RGB output using a color offset and matrix. One thing to note is that the CbCr texture is half the size of the Y plane because of the 4:2:0 chroma subsampling, so the chroma coordinates are halved before reading. My shader looks like:

    kernel void YCbCrColorConversion(texture2d<float, access::read> yTexture [[texture(0)]],
                                     texture2d<float, access::read> cbcrTexture [[texture(1)]],
                                     texture2d<float, access::write> outTexture [[texture(2)]],
                                     uint2 gid [[thread_position_in_grid]])
    {
        float3 colorOffset = float3(-(16.0 / 255.0), -0.5, -0.5);
        float3x3 colorMatrix = float3x3(
                                        float3(1.164, 1.164, 1.164),
                                        float3(0.000, -0.392, 2.017),
                                        float3(1.596, -0.813, 0.000)
                                        );

        // The CbCr plane is half the size of the Y plane, so halve the coordinates
        uint2 cbcrCoordinates = uint2(gid.x / 2, gid.y / 2);

        float y = yTexture.read(gid).r;
        float2 cbcr = cbcrTexture.read(cbcrCoordinates).rg;

        float3 ycbcr = float3(y, cbcr);
        float3 rgb = colorMatrix * (ycbcr + colorOffset);

        outTexture.write(float4(rgb, 1.0), gid);
    }
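
The shader is dispatched over every pixel of the drawable, so the threadsPerThreadgroup and threadgroupsPerGrid values used in the dispatch below can be derived from the drawable's size. A minimal sketch, assuming a 16 × 16 threadgroup and a drawable whose dimensions are a multiple of 16:

    let threadsPerThreadgroup = MTLSizeMake(16, 16, 1)

    let threadgroupsPerGrid = MTLSizeMake(
        drawable.texture.width / threadsPerThreadgroup.width,
        drawable.texture.height / threadsPerThreadgroup.height,
        1)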

For the win, I've plugged in the Gaussian blur Metal Performance Shader. I've taken a slightly different approach to my previous implementations in that I'm encoding the blur "in place". This removes the need for an intermediate texture, so the guts of my Swift Metal code look like this:

    // Bind the two camera planes and the drawable's texture to the compute shader
    commandEncoder.setTexture(yTexture, atIndex: 0)
    commandEncoder.setTexture(cbcrTexture, atIndex: 1)
    commandEncoder.setTexture(drawable.texture, atIndex: 2) // out texture

    commandEncoder.dispatchThreadgroups(threadgroupsPerGrid, threadsPerThreadgroup: threadsPerThreadgroup)

    commandEncoder.endEncoding()

    // Wrap the drawable's texture in a pointer so the blur can work on it in place
    let inPlaceTexture = UnsafeMutablePointer<MTLTexture?>.alloc(1)
    inPlaceTexture.initialize(drawable.texture)

    blur.encodeToCommandBuffer(commandBuffer, inPlaceTexture: inPlaceTexture, fallbackCopyAllocator: nil)

    commandBuffer.presentDrawable(drawable)
    commandBuffer.commit()
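
Not shown above is the creation of the command buffer, compute command encoder and pipeline state for the YCbCrColorConversion kernel. A minimal sketch of that setup, assuming the shader is compiled into the app's default library:

    // One-off setup: build a compute pipeline state from the kernel function
    let defaultLibrary = device.newDefaultLibrary()!
    let kernelFunction = defaultLibrary.newFunctionWithName("YCbCrColorConversion")!
    let pipelineState = try! device.newComputePipelineStateWithFunction(kernelFunction)
    let commandQueue = device.newCommandQueue()

    // Per frame: create a command buffer and encoder and bind the pipeline state
    let commandBuffer = commandQueue.commandBuffer()
    let commandEncoder = commandBuffer.computeCommandEncoder()
    commandEncoder.setComputePipelineState(pipelineState)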

The sigma value of the blur is controlled by a horizontal slider with a maximum value of 50; even at that radius, the Metal Performance Shaders implementation of the Gaussian blur is ludicrously fast!
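
The blur kernel itself isn't shown above. An MPSImageGaussianBlur's sigma is fixed when it's created, so one approach is to rebuild it whenever the slider moves; a minimal sketch, assuming a UISlider wired to a hypothetical sliderChangeHandler action and a blur property on the view controller:

    // Hypothetical slider action: MPSImageGaussianBlur's sigma can't be changed
    // after initialisation, so create a fresh kernel for each new slider value
    func sliderChangeHandler(slider: UISlider)
    {
        blur = MPSImageGaussianBlur(device: device, sigma: slider.value)
    }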

This code works under Xcode 7 beta 3 and iOS 9 beta 3 on my iPad Air 2. I can't guarantee it will work on any other platforms, but if you get it running elsewhere, I'd love to hear about it!

The project is available at my GitHub repository here.
