Advanced AI: Transformers for Computer Vision

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 0h 55m | 165 MB

Transformers are quickly becoming the go-to architecture for many computer vision tasks. If you work in the field, it’s a must-have skill to keep on hand in your AI toolkit. In this course, AI consultant Jonathan Fernandes takes you on a deep dive into the world of transfer learning and transformer model architecture.

Explore the basics of computer vision, image datasets, preprocessing, and fine-tuning, with hands-on examples and easy-to-follow demonstrations using Google Colab and the Hugging Face library. Discover tips and practical strategies for model training and testing as you go, building out your skill set with the popular inference tools Gradio and Hugging Face Spaces. By the end of this course, you’ll be prepared to design and train larger, more sophisticated language models.

Table of Contents

1 Transformers for computer vision
2 What you should know

Transformer Architecture
3 History of transformers
4 Comparing Vision Transformers to BERT

Datasets and Preprocessing
5 Getting set up
6 Getting the data
7 Using datasets
8 Using a pretrained model without fine-tuning
9 Defining a model
10 Preprocessing images
11 A transformed image
12 Getting images in the correct format

Model Training
13 Training arguments
14 Model training
15 Inference in notebook
16 Inference on phone using Gradio
17 Gradio and Hugging Face Spaces

18 Learn more about transformers and large language models