Weekly Recap
Introduction
Wow, it is really easy to get off schedule with these weekly recaps. I let over three weeks go by without posting one. Maybe I should try doing daily recaps just so I don’t fall out of the routine as easily. It’s too tempting to put off these weekly posts when I’m in the middle of an actual project.
End-to-End Style Transfer Tutorial
I finally completed the end-to-end style transfer tutorial. I’m a bit irritated that I let myself dump so much time into experimenting with different style transfer models. I think it would have been better to invest that time explaining how others can conduct their own experiments. I might make a post examining how I could have better approached the project and avoided going so far over my time budget. Perhaps it would have helped to set an actual time budget.
I’m in the process of making a follow-up post explaining how to use the video style transfer model instead of the model used in the tutorial. I don’t plan on trying to make a super optimized version of the model like in the tutorial. My previous experiments didn’t yield great results when significantly reducing the size of the model. However, I dumped so much time into it that I figure I should at least make a post showing how someone else can tinker with it. I plan to have that finished next week.
Next Project
Before moving on to a completely new project, I plan to make some updates and additions to the PoseNet tutorial series. I realized that there were some unnecessary steps that could be removed for a small performance gain. I also realized I completely forgot to explain how to use the more efficient MobileNet version of the model instead of the ResNet50 version. I didn’t use the MobileNet version in the tutorial because it’s less accurate. However, a reader expressed interest in performing inference with the C# Burst backend. The smaller model would be a much better choice in that instance.
After I update the PoseNet tutorial, I plan to work on getting a facial pose estimation model working with the Barracuda library. Hopefully, that will take less time than the style transfer tutorial. If the model proves incompatible with the current versions of Barracuda, I’ll look into making a C++ plugin so I can use the PyTorch C++ frontend instead.
Links of the Last Few Weeks
Deep Learning
PyTorch 1.8
The latest version of PyTorch has been released and comes with many updates, including beta support for AMD GPUs through ROCm.
PyTorch3D: A library for deep learning with 3D data
Last year, the Facebook AI Research team introduced a deep learning library that aims to make working with 3D data a lot easier. I didn’t learn about it until I decided to click on a video YouTube had been recommending for weeks. The library supports batching of inputs with different sizes. This is helpful since cropping a 3D mesh isn’t as straightforward as cropping 2D images. The library also supports several common operations for 3D data as well as a differentiable rendering API. My mind instantly jumped to wondering what it would take to integrate this with Blender. I plan to make time soon so that I can work through the available tutorials.
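To make the batching point a bit more concrete, here is a minimal plain-Python sketch of two common ways to batch meshes with different vertex counts: padding every mesh to the size of the largest one, or packing all vertices into one flat list with an index back to the source mesh. This is not PyTorch3D's API. The function names `pad_batch` and `pack_batch` are hypothetical, and the real library works on tensors rather than lists.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Mesh = List[Vertex]  # assumption: a mesh is represented only by its vertices here


def pad_batch(meshes: List[Mesh], pad_value: float = 0.0):
    """Pad every mesh to the vertex count of the largest mesh.

    Returns the padded meshes and the original vertex counts, so that
    the padding can be ignored later.
    """
    if not meshes:
        return [], []
    lengths = [len(m) for m in meshes]
    max_len = max(lengths)
    pad = (pad_value, pad_value, pad_value)
    padded = [list(m) + [pad] * (max_len - len(m)) for m in meshes]
    return padded, lengths


def pack_batch(meshes: List[Mesh]):
    """Concatenate all vertices into one flat list.

    Returns the flat list and, for each vertex, the index of the mesh
    it came from.
    """
    packed = [v for m in meshes for v in m]
    mesh_index = [i for i, m in enumerate(meshes) for _ in m]
    return packed, mesh_index


# Two meshes with different vertex counts (3 and 4).
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tetra = triangle + [(0.0, 0.0, 1.0)]

padded, lengths = pad_batch([triangle, tetra])
packed, mesh_index = pack_batch([triangle, tetra])
```

Padding keeps a regular batch shape at the cost of wasted entries, while packing wastes nothing but needs the extra index to tell the meshes apart.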
Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans
Training models on Multiple GPUs using fastai
This blog post explores different approaches to training models using multiple GPUs.
Introducing Noisy Imagenette
This is a new version of the Imagenette dataset with noisy labels.
Taming Transformers for High-Resolution Image Synthesis
This is an impressive new model that can be trained to perform a wide variety of image generation tasks. Check out the related Two Minute Papers video to see what it can do.
StyleGAN Components
This blog post covers the key components of a StyleGAN model and uses them to build a basic version of the model.
Multimodal Neurons in Artificial Neural Networks
A recent post by OpenAI that explores how their Contrastive Language-Image Pre-Training (CLIP) model responds to the same concept whether presented literally, symbolically, or conceptually.
Programming
All of the python 3.9 standard library
An organized and hyperlinked index to every module, function, and class in the Python standard library.
Dev Simulator: A Coding RPG
This is an upcoming RPG where players build a real full-stack web app while playing through the storyline with 8-bit co-workers.
Git scraping
This five-minute video demonstrates how to schedule web scrapers using GitHub Actions.
Manim
An animation engine for creating precise animations in Python. This is a fork of the animation engine used in the videos on the 3Blue1Brown YouTube channel.