Barracuda PoseNet Tutorial 2nd Edition Pt. 7
Overview
In this post, we will cover how to create pose skeletons so that we can compare the estimated key point locations to the source video feed.
Create PoseSkeleton Script
We will implement the functionality for creating pose skeletons in a new script. Open the Scripts folder in the Assets section and create a new C# script called PoseSkeleton. The PoseSkeleton class will handle creating a single pose skeleton and updating the positions of its key points. We will create as many PoseSkeleton instances as specified by the maxPoses variable in the PoseEstimator script.
Add Required Namespace
We need to add the System namespace as we will once again be using the Tuple class.
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using System;
Remove MonoBehaviour Inheritance
The PoseSkeleton class does not need to be a MonoBehaviour, so we can remove the inheritance.
public class PoseSkeleton
Add Variables
We will need a Transform array to keep track of the positions of the key point objects in the scene.
We also need a GameObject array to store the lines connecting the key point objects.
Next, we will create a static string array to store the names of the key points predicted by the model. The names will be ordered based on their key point id number (e.g. nose is at index 0).
The number of key points predicted by the model will not change, so we will store the number in a static int variable.
Much like the parentChildrenTuples variable in the Utils script, we will create a Tuple array to keep track of which key points should be connected by lines. We could actually just use the pairs from parentChildrenTuples, but the skeleton would look a bit weird.
Instead, we will make a pose skeleton that looks like this.
To help distinguish the different body areas, we will create a Color array so that we can specify what color we want each line to be.
Lastly, we need a float variable to specify the line width for the pose skeleton lines.
// The list of key point GameObjects that make up the pose skeleton
public Transform[] keypoints;
// The GameObjects that contain data for the lines between key points
private GameObject[] lines;
// The names of the body parts that will be detected by the PoseNet model
private static string[] partNames = new string[]{
    "nose", "leftEye", "rightEye", "leftEar", "rightEar", "leftShoulder",
    "rightShoulder", "leftElbow", "rightElbow", "leftWrist", "rightWrist",
    "leftHip", "rightHip", "leftKnee", "rightKnee", "leftAnkle", "rightAnkle"
};
private static int NUM_KEYPOINTS = partNames.Length;
// The pairs of key points that should be connected on a body
private Tuple<int, int>[] jointPairs = new Tuple<int, int>[]{
    // Nose to Left Eye
    Tuple.Create(0, 1),
    // Nose to Right Eye
    Tuple.Create(0, 2),
    // Left Eye to Left Ear
    Tuple.Create(1, 3),
    // Right Eye to Right Ear
    Tuple.Create(2, 4),
    // Left Shoulder to Right Shoulder
    Tuple.Create(5, 6),
    // Left Shoulder to Left Hip
    Tuple.Create(5, 11),
    // Right Shoulder to Right Hip
    Tuple.Create(6, 12),
    // Left Shoulder to Right Hip
    Tuple.Create(5, 12),
    // Right Shoulder to Left Hip
    Tuple.Create(6, 11),
    // Left Hip to Right Hip
    Tuple.Create(11, 12),
    // Left Shoulder to Left Elbow
    Tuple.Create(5, 7),
    // Left Elbow to Left Wrist
    Tuple.Create(7, 9),
    // Right Shoulder to Right Elbow
    Tuple.Create(6, 8),
    // Right Elbow to Right Wrist
    Tuple.Create(8, 10),
    // Left Hip to Left Knee
    Tuple.Create(11, 13),
    // Left Knee to Left Ankle
    Tuple.Create(13, 15),
    // Right Hip to Right Knee
    Tuple.Create(12, 14),
    // Right Knee to Right Ankle
    Tuple.Create(14, 16)
};
// Colors for the skeleton lines (one entry per joint pair, in the same order)
private Color[] colors = new Color[] {
    // Head
    Color.magenta, Color.magenta, Color.magenta, Color.magenta,
    // Torso
    Color.red, Color.red, Color.red, Color.red, Color.red, Color.red,
    // Arms
    Color.green, Color.green, Color.green, Color.green,
    // Legs
    Color.blue, Color.blue, Color.blue, Color.blue
};
// The width for the skeleton lines
private float lineWidth;
// The material for the key point objects
private Material keypointMat;
Create InitializeLine Method
The first method we will create will handle the initialization of a single line in the pose skeleton.
Method Steps
- Get the starting and ending key point indices from the joint pair to indicate which two key points are being connected
- Use the names of the two key points to create the name for the line object
- Create a new standard GameObject
- Add a LineRenderer component to the new GameObject
- Create a new Material for the line with the appropriate color from the Color array
- Indicate that the line will only have two points
- Set the line width
/// <summary>
/// Create a line between the two key points of the joint pair at the given index
/// </summary>
/// <param name="pairIndex">The index of the joint pair in jointPairs</param>
/// <param name="width">The width of the line</param>
/// <param name="color">The color of the line</param>
private void InitializeLine(int pairIndex, float width, Color color)
{
    int startIndex = jointPairs[pairIndex].Item1;
    int endIndex = jointPairs[pairIndex].Item2;
    // Create new line GameObject
    string name = $"{keypoints[startIndex].name}_to_{keypoints[endIndex].name}";
    lines[pairIndex] = new GameObject(name);
    // Add LineRenderer component
    LineRenderer lineRenderer = lines[pairIndex].AddComponent<LineRenderer>();
    // Make LineRenderer Shader Unlit
    lineRenderer.material = new Material(Shader.Find("Unlit/Color"));
    // Set the material color
    lineRenderer.material.color = color;
    // The line will consist of two points
    lineRenderer.positionCount = 2;
    // Set the width from the start point
    lineRenderer.startWidth = width;
    // Set the width from the end point
    lineRenderer.endWidth = width;
}
Create InitializeSkeleton Method
We will call the InitializeLine method for each joint pair in jointPairs.
/// <summary>
/// Initialize the pose skeleton
/// </summary>
private void InitializeSkeleton()
{
    for (int i = 0; i < jointPairs.Length; i++)
    {
        InitializeLine(i, lineWidth, colors[i]);
    }
}
Create Constructor
Now we can define the class constructor that will initialize the pose skeleton.
Method Steps
- Initialize the keypoints array
- Create a new material for the key point objects
- Create a new GameObject for each key point
  - Create a sphere GameObject
  - Set the position to the origin
  - Set the size of the GameObject using the provided pointScale value
  - Assign the new material
  - Set the name for the object
- Set the lineWidth value
- Initialize the lines array
- Call the InitializeSkeleton method
public PoseSkeleton(float pointScale = 10f, float lineWidth = 5f)
{
    this.keypoints = new Transform[NUM_KEYPOINTS];
    Material keypointMat = new Material(Shader.Find("Unlit/Color"));
    keypointMat.color = Color.yellow;
    for (int i = 0; i < NUM_KEYPOINTS; i++)
    {
        this.keypoints[i] = GameObject.CreatePrimitive(PrimitiveType.Sphere).transform;
        this.keypoints[i].position = new Vector3(0, 0, 0);
        this.keypoints[i].localScale = new Vector3(pointScale, pointScale, 0);
        this.keypoints[i].gameObject.GetComponent<MeshRenderer>().material = keypointMat;
        this.keypoints[i].gameObject.name = partNames[i];
    }
    this.lineWidth = lineWidth;
    // The number of joint pairs (one line per pair)
    int numPairs = jointPairs.Length;
    // Initialize the lines array
    lines = new GameObject[numPairs];
    // Initialize the pose skeleton
    InitializeSkeleton();
}
Create ToggleSkeleton Method
Just because we initialize a given number of pose skeletons based on the value for maxPoses does not mean that the model will find that many poses in an input image. We will need to hide the excess skeletons when they are not needed. We will hide the skeletons by deactivating the associated key point and line objects. We can use the same function to unhide the skeleton when it is needed. This method will be called from the PoseEstimator script, so it needs to be public.
/// <summary>
/// Toggles visibility for the skeleton
/// </summary>
/// <param name="show">True to show the skeleton, false to hide it</param>
public void ToggleSkeleton(bool show)
{
    for (int i = 0; i < jointPairs.Length; i++)
    {
        lines[i].SetActive(show);
        keypoints[jointPairs[i].Item1].gameObject.SetActive(show);
        keypoints[jointPairs[i].Item2].gameObject.SetActive(show);
    }
}
Create Cleanup Method
When we reduce the max number of poses to estimate, we should remove the skeletons that are no longer needed. This method is nearly identical to the ToggleSkeleton method, except that we will be destroying the objects rather than deactivating them.
/// <summary>
/// Clean up skeleton GameObjects
/// </summary>
public void Cleanup()
{
    for (int i = 0; i < jointPairs.Length; i++)
    {
        GameObject.Destroy(lines[i]);
        GameObject.Destroy(keypoints[jointPairs[i].Item1].gameObject);
        GameObject.Destroy(keypoints[jointPairs[i].Item2].gameObject);
    }
}
Create UpdateKeyPointPositions Method
We will update the key point positions with the latest model output in a new method called UpdateKeyPointPositions.
This method will take in the following as input:
- A Keypoint array for a single pose
- The scale value to scale the key point positions from the input resolution to the source resolution
- The source RenderTexture
- A bool to indicate whether to mirror the key point positions when using a webcam
- A float value to indicate the minimum confidence score a key point needs to have to be displayed
Method Steps
- Iterate through the Keypoint array
- Hide the key point objects that do not meet the minimum confidence score
- Scale the key point positions from the input resolution up to the source resolution
- Flip the Y axis coordinates vertically to compensate for the difference between heatmap indices and scene coordinates
- Mirror the X axis coordinates if using a webcam
- Update the key point object positions with the new coordinate values
/// <summary>
/// Update the positions for the key point GameObjects
/// </summary>
/// <param name="keypoints">The key points for a single pose</param>
/// <param name="sourceScale">The value used to scale positions from the input resolution to the source resolution</param>
/// <param name="sourceTexture">The source RenderTexture</param>
/// <param name="mirrorImage">Whether to mirror the key point positions (used with a webcam)</param>
/// <param name="minConfidence">The minimum confidence score, from 0 to 100, required to display a key point</param>
public void UpdateKeyPointPositions(Utils.Keypoint[] keypoints,
    float sourceScale, RenderTexture sourceTexture, bool mirrorImage, float minConfidence)
{
    // Iterate through the key points
    for (int k = 0; k < keypoints.Length; k++)
    {
        // Check if the current confidence value meets the confidence threshold
        if (keypoints[k].score >= minConfidence / 100f)
        {
            // Show the current key point GameObject
            this.keypoints[k].GetComponent<MeshRenderer>().enabled = true;
        }
        else
        {
            // Hide the current key point GameObject
            this.keypoints[k].GetComponent<MeshRenderer>().enabled = false;
        }
        // Scale the keypoint position to the original resolution
        Vector2 coords = keypoints[k].position * sourceScale;
        // Flip the keypoint position vertically
        coords.y = sourceTexture.height - coords.y;
        // Mirror the x position if using a webcam
        if (mirrorImage) coords.x = sourceTexture.width - coords.x;
        // Update the current key point location
        // Set the z value to -1f to place it in front of the video screen
        this.keypoints[k].position = new Vector3(coords.x, coords.y, -1f);
    }
}
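To make the coordinate math concrete, here is a minimal standalone sketch of the same scale, flip, and mirror steps using plain floats instead of Unity types. The KeypointTransformDemo class, the ToSourceCoords helper, and the sample numbers (a 1280x720 source and a scale of 2.8125) are assumptions chosen for illustration and are not part of the tutorial project.

```csharp
using System;

public static class KeypointTransformDemo
{
    // Hypothetical helper that mirrors the steps in UpdateKeyPointPositions
    public static (float x, float y) ToSourceCoords(
        float x, float y, float sourceScale,
        int sourceWidth, int sourceHeight, bool mirrorImage)
    {
        // Scale from the input resolution up to the source resolution
        float sx = x * sourceScale;
        float sy = y * sourceScale;
        // Flip vertically to go from heatmap-style rows to scene coordinates
        sy = sourceHeight - sy;
        // Mirror horizontally when using a webcam
        if (mirrorImage) sx = sourceWidth - sx;
        return (sx, sy);
    }

    public static void Main()
    {
        // Assumed values: 1280x720 source, key point at (100, 64) in the input image
        var plain = ToSourceCoords(100f, 64f, 2.8125f, 1280, 720, false);
        var mirrored = ToSourceCoords(100f, 64f, 2.8125f, 1280, 720, true);
        // Invariant formatting keeps the decimal separator consistent across locales
        Console.WriteLine(FormattableString.Invariant($"{plain.x}, {plain.y}"));       // 281.25, 540
        Console.WriteLine(FormattableString.Invariant($"{mirrored.x}, {mirrored.y}")); // 998.75, 540
    }
}
```

With these numbers, the key point is scaled to (281.25, 180), the flip moves Y to 720 - 180 = 540, and mirroring moves X to 1280 - 281.25 = 998.75.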
Create UpdateLines Method
Once we have updated the positions of the key point objects in the scene, we need to update the starting and ending coordinates for the skeleton lines. We will do so in a new method called UpdateLines.
Method Steps
- Get references to the starting and ending key point objects
- Check if both the starting and ending key point objects are visible
  - If true
    - Make the line object active
    - Update the starting position for the line
    - Update the ending position for the line
  - If false, deactivate the line object
/// <summary>
/// Draw the pose skeleton based on the latest location data
/// </summary>
public void UpdateLines()
{
    // Iterate through the joint pairs
    for (int i = 0; i < jointPairs.Length; i++)
    {
        // Set the GameObject for the starting key point
        Transform startingKeyPoint = keypoints[jointPairs[i].Item1];
        // Set the GameObject for the ending key point
        Transform endingKeyPoint = keypoints[jointPairs[i].Item2];
        // Check if both the starting and ending key points are visible
        if (startingKeyPoint.GetComponent<MeshRenderer>().enabled &&
            endingKeyPoint.GetComponent<MeshRenderer>().enabled)
        {
            // Activate the line
            lines[i].SetActive(true);
            LineRenderer lineRenderer = lines[i].GetComponent<LineRenderer>();
            // Update the starting position
            lineRenderer.SetPosition(0, startingKeyPoint.position);
            // Update the ending position
            lineRenderer.SetPosition(1, endingKeyPoint.position);
        }
        else
        {
            // Deactivate the line
            lines[i].SetActive(false);
        }
    }
}
Update PoseEstimator Script
Back in the PoseEstimator script, we need to add some new variables to use the PoseSkeleton class.
Add Public Variables
We will add a couple of public float variables for setting the size of the key point objects and the width of the skeleton lines.
We will also add a public int variable to specify the minimum confidence value a key point needs to have for it to be used for the pose skeleton.
[Tooltip("The size of the pose skeleton key points")]
public float pointScale = 10f;
[Tooltip("The width of the pose skeleton lines")]
public float lineWidth = 5f;
[Tooltip("The minimum confidence level required to display the key point")]
[Range(0, 100)]
public int minConfidence = 70;
Add Private Variables
Lastly, we will declare a PoseSkeleton array to store the pose skeletons.
// Array of pose skeletons
private PoseSkeleton[] skeletons;
Create InitializeSkeletons Method
We will create a new method called InitializeSkeletons to populate the skeletons array. When performing single pose estimation, the max number of poses will be set to 1. This method will be called in the Start method as well as any time the maxPoses value gets updated.
/// <summary>
/// Initialize pose skeletons
/// </summary>
private void InitializeSkeletons()
{
    // Initialize the list of pose skeletons
    if (estimationType == EstimationType.SinglePose) maxPoses = 1;
    skeletons = new PoseSkeleton[maxPoses];
    // Populate the list of pose skeletons
    for (int i = 0; i < maxPoses; i++) skeletons[i] = new PoseSkeleton(pointScale, lineWidth);
}
Modify Start Method
We will call the InitializeSkeletons method at the end of the Start method.
// Initialize pose skeletons
InitializeSkeletons();
Full Code
// Start is called before the first frame update
void Start()
{
    if (useWebcam)
    {
        // Limit application framerate to the target webcam framerate
        Application.targetFrameRate = webcamFPS;
        // Create a new WebCamTexture
        webcamTexture = new WebCamTexture(webcamDims.x, webcamDims.y, webcamFPS);
        // Start the Camera
        webcamTexture.Play();
        // Deactivate the Video Player
        videoScreen.GetComponent<VideoPlayer>().enabled = false;
        // Update the videoDims.y
        videoDims.y = webcamTexture.height;
        // Update the videoDims.x
        videoDims.x = webcamTexture.width;
    }
    else
    {
        // Update the videoDims.y
        videoDims.y = (int)videoScreen.GetComponent<VideoPlayer>().height;
        // Update the videoDims.x
        videoDims.x = (int)videoScreen.GetComponent<VideoPlayer>().width;
    }
    // Create a new videoTexture using the current video dimensions
    videoTexture = RenderTexture.GetTemporary(videoDims.x, videoDims.y, 24, RenderTextureFormat.ARGBHalf);
    // Initialize the videoScreen
    InitializeVideoScreen(videoDims.x, videoDims.y, useWebcam);
    // Adjust the camera based on the source video dimensions
    InitializeCamera();
    // Adjust the input dimensions to maintain the source aspect ratio
    aspectRatioScale = (float)videoTexture.width / videoTexture.height;
    targetDims.x = (int)(imageDims.y * aspectRatioScale);
    imageDims.x = targetDims.x;
    // Initialize the RenderTexture that will store the processed input image
    rTex = RenderTexture.GetTemporary(imageDims.x, imageDims.y, 24, RenderTextureFormat.ARGBHalf);
    // Initialize the Barracuda inference engine based on the selected model
    InitializeBarracuda();
    // Initialize pose skeletons
    InitializeSkeletons();
}
Modify Update Method
At the end of the Update method, we first need to check if the maxPoses value has been updated. If it has, we will clean up the existing pose skeletons and initialize new ones.
We then need to calculate the scale value to upscale the key point positions from the input image resolution to the source video resolution.
We can then iterate through the pose skeletons in the skeletons array. If there are more pose skeletons than poses returned by the ProcessOutput method, we will hide the extra pose skeletons.
// Reinitialize pose skeletons
if (maxPoses != skeletons.Length)
{
    foreach (PoseSkeleton skeleton in skeletons)
    {
        skeleton.Cleanup();
    }
    // Initialize pose skeletons
    InitializeSkeletons();
}
// The smallest dimension of the videoTexture
int minDimension = Mathf.Min(videoTexture.width, videoTexture.height);
// The value used to scale the key point locations up to the source resolution
float scale = (float)minDimension / Mathf.Min(imageDims.x, imageDims.y);
// Update the pose skeletons
for (int i = 0; i < skeletons.Length; i++)
{
    if (i <= poses.Length - 1)
    {
        skeletons[i].ToggleSkeleton(true);
        // Update the positions for the key point GameObjects
        skeletons[i].UpdateKeyPointPositions(poses[i], scale, videoTexture, useWebcam, minConfidence);
        skeletons[i].UpdateLines();
    }
    else
    {
        skeletons[i].ToggleSkeleton(false);
    }
}
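As a quick sanity check on the scale calculation, the following standalone sketch computes the same ratio with plain integers. The ScaleDemo class, the ComputeScale helper, and the sample resolutions (a 1280x720 source with a 455x256 model input) are assumptions for illustration only.

```csharp
using System;

public static class ScaleDemo
{
    // Hypothetical helper: ratio between the smallest source dimension
    // and the smallest model input dimension
    public static float ComputeScale(int sourceWidth, int sourceHeight,
                                     int inputWidth, int inputHeight)
    {
        int minSource = Math.Min(sourceWidth, sourceHeight);
        int minInput = Math.Min(inputWidth, inputHeight);
        // Cast before dividing to avoid integer division
        return (float)minSource / minInput;
    }

    public static void Main()
    {
        // Assumed values: 1280x720 source video, 455x256 model input
        float scale = ComputeScale(1280, 720, 455, 256);
        Console.WriteLine(FormattableString.Invariant($"{scale}")); // 2.8125
    }
}
```

Because the input dimensions are adjusted to match the source aspect ratio, the ratio of the smallest dimensions (720 / 256 here) is approximately the factor needed for both axes.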
Full Code
// Update is called once per frame
void Update()
{
    // Copy webcamTexture to videoTexture if using webcam
    if (useWebcam) Graphics.Blit(webcamTexture, videoTexture);
    // Prevent the input dimensions from going too low for the model
    imageDims.x = Mathf.Max(imageDims.x, 64);
    imageDims.y = Mathf.Max(imageDims.y, 64);
    // Update the input dimensions while maintaining the source aspect ratio
    if (imageDims.x != targetDims.x)
    {
        aspectRatioScale = (float)videoTexture.height / videoTexture.width;
        targetDims.y = (int)(imageDims.x * aspectRatioScale);
        imageDims.y = targetDims.y;
        targetDims.x = imageDims.x;
    }
    if (imageDims.y != targetDims.y)
    {
        aspectRatioScale = (float)videoTexture.width / videoTexture.height;
        targetDims.x = (int)(imageDims.y * aspectRatioScale);
        imageDims.x = targetDims.x;
        targetDims.y = imageDims.y;
    }
    // Update the rTex dimensions to the new input dimensions
    if (imageDims.x != rTex.width || imageDims.y != rTex.height)
    {
        RenderTexture.ReleaseTemporary(rTex);
        // Assign a temporary RenderTexture with the new dimensions
        rTex = RenderTexture.GetTemporary(imageDims.x, imageDims.y, 24, rTex.format);
    }
    // Copy the src RenderTexture to the new rTex RenderTexture
    Graphics.Blit(videoTexture, rTex);
    // Prepare the input image to be fed to the selected model
    ProcessImage(rTex);
    // Reinitialize Barracuda with the selected model and backend
    if (engine.modelType != modelType || engine.workerType != workerType)
    {
        engine.worker.Dispose();
        InitializeBarracuda();
    }
    // Execute neural network with the provided input
    engine.worker.Execute(input);
    // Release GPU resources allocated for the Tensor
    input.Dispose();
    // Decode the keypoint coordinates from the model output
    ProcessOutput(engine.worker);
    // Reinitialize pose skeletons
    if (maxPoses != skeletons.Length)
    {
        foreach (PoseSkeleton skeleton in skeletons)
        {
            skeleton.Cleanup();
        }
        // Initialize pose skeletons
        InitializeSkeletons();
    }
    // The smallest dimension of the videoTexture
    int minDimension = Mathf.Min(videoTexture.width, videoTexture.height);
    // The value used to scale the key point locations up to the source resolution
    float scale = (float)minDimension / Mathf.Min(imageDims.x, imageDims.y);
    // Update the pose skeletons
    for (int i = 0; i < skeletons.Length; i++)
    {
        if (i <= poses.Length - 1)
        {
            skeletons[i].ToggleSkeleton(true);
            // Update the positions for the key point GameObjects
            skeletons[i].UpdateKeyPointPositions(poses[i], scale, videoTexture, useWebcam, minConfidence);
            skeletons[i].UpdateLines();
        }
        else
        {
            skeletons[i].ToggleSkeleton(false);
        }
    }
}
Summary
Now we can compare the estimated key point locations to the source video feed.
Previous: Part 6
Project Resources: GitHub Repository
I’m Christian Mills, a deep learning consultant specializing in practical AI implementations. I help clients leverage cutting-edge AI technologies to solve real-world problems.