from fastai.vision.all import *       # fastai vision API (also provides L, Path, download_images, ...)
from duckduckgo_search import DDGS    # DuckDuckGo search client

def search_images(keywords, max_images=200):
    # Search DuckDuckGo for images and return a list (fastcore L) of image URLs
    return L(DDGS().images(keywords, max_results=max_images)).itemgot('image')
We continue our journey into image classifiers in lesson 2 of the fastai course. For this lesson, the course focuses on training a bear classifier model to differentiate between three types of bear: grizzly, black, and teddy. Instead, I’ll be building a similar classifier to identify different types of Labrador based on their color: black, yellow, and chocolate.
Later on we will learn how to deploy our Labrador classifier to Hugging Face Spaces, using FastHTML for the app UI, rather than Gradio.
Download Labrador Images
As before, the first thing we need to do before training a model is to gather the data! The simple `search_images` function defined above searches the internet for images and returns a list of URLs. Here, we’re using the DuckDuckGo API for the web search.
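As a quick sanity check (not part of the original notebook output), we can call the function for a single search term and look at what comes back; the exact URLs will differ on every run:

urls = search_images('black labrador photo', max_images=3)
print(len(urls))   # number of URLs returned
print(urls[0])     # first image URL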
We can use this function to download our Labrador images for the three distinct colors: black, yellow, and chocolate.
searches = 'black', 'yellow', 'chocolate'
path = Path('labradors')

if not path.exists():
    for o in searches:
        dest = (path/o)
        dest.mkdir(exist_ok=True, parents=True)
        download_images(dest, urls=search_images(f'{o} labrador photo')[:200])
        time.sleep(5)
        resize_images(dest, max_size=400, dest=dest)
This creates a `labradors` folder with three sub-folders: `black`, `yellow`, and `chocolate`. We then perform a web search for the terms ‘black labrador photo’, ‘yellow labrador photo’, and ‘chocolate labrador photo’, and download a maximum of 200 images for each term into the relevant folder (`labradors/black`, `labradors/yellow`, `labradors/chocolate`).
Let’s see all the different types of images downloaded and how many of each there are.
fldrs = ['black', 'yellow', 'chocolate']
all_suffixes = set()

def get_extensions(folders):
    for folder in folders:
        folder_path = path / folder
        image_files = [f for f in folder_path.iterdir() if f.is_file()]

        # Add suffixes to the set
        for f in image_files:
            all_suffixes.add(f.suffix.lower())

        print(f"{folder}: {len(image_files)} image files")
get_extensions(fldrs)
black: 186 image files
yellow: 192 image files
chocolate: 191 image files
print("\nAll suffixes found:", all_suffixes)
All suffixes found: {'.gif', '.webp', '.jpeg', '.png', '.jpg!d', '.jpg'}
Downloaded Image Cleanup
Before proceeding, let’s ‘clean’ the downloaded images by removing any corrupted files, and also any that don’t match the file extensions we’re interested in: `.jpg` and `.jpeg`.
Let’s remove any corrupted files first.
failed = verify_images(get_image_files(path))
failed.map(Path.unlink)
print(f"{len(failed)} images deleted")
12 images deleted
Now let’s remove all files whose extensions don’t match `.jpg` or `.jpeg`.
exts = ['.png', '.jpg!d', '.gif', '.webp']

def delete_files_by_extension(exts):
    extensions = [ext.lower() for ext in exts]

    # Find matching files before deletion
    files_to_delete = [f for f in path.rglob("*") if f.is_file() and f.suffix.lower() in extensions]
    print(f"Found {len(files_to_delete)} files with specified extensions before deletion.")

    # Delete files
    for file in files_to_delete:
        file.unlink()

    # Confirm how many remain
    remaining = [f for f in path.rglob("*") if f.is_file() and f.suffix.lower() in extensions]
    print(f"{len(remaining)} matching files remain after deletion.")
delete_files_by_extension(exts)
Found 20 files with specified extensions before deletion.
0 matching files remain after deletion.
So 20 files were deleted in total, and the number of Labrador images remaining in each category is printed below.
all_suffixes = set()
get_extensions(fldrs)
black: 176 image files
yellow: 184 image files
chocolate: 177 image files
Viewing the Dataset
As in lesson 1, we create the `DataLoaders` object from our downloaded data and view a sample of the dataset that will be fed to the model during training. Here we’re using a combination of `RandomResizedCrop` and `aug_transforms` to generate additional synthetic data for training, rather than just the raw downloaded images. This helps improve model performance by feeding it lots of variations of the same image (slightly rotated, scaled, different contrast, cropping, etc.).
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms()
).dataloaders(path, bs=32)
dls.show_batch(max_n=16)
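If you want to see exactly what the augmentations are doing, one option (not shown in the original notebook, but a standard fastai trick) is to repeat a single training image across the batch with `unique=True`, so every tile is a different augmented view of the same photo:

# Show several augmented variations of the same training image
dls.train.show_batch(max_n=8, nrows=2, unique=True)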
Training the Model
We can now train (fine-tune) a model using the fastai API. We could use any readily available model, but the ResNet18 CNN is fine for now. You can view details of this and other ResNet models here. We’ll train the model for 5 epochs initially.
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(5)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.120630 | 0.171037 | 0.046729 | 00:02 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.462287 | 0.182710 | 0.046729 | 00:02 |
1 | 0.385913 | 0.249435 | 0.112150 | 00:00 |
2 | 0.344761 | 0.284427 | 0.084112 | 00:02 |
3 | 0.310859 | 0.308950 | 0.084112 | 00:02 |
4 | 0.268280 | 0.317518 | 0.084112 | 00:02 |
Labrador Model Predictions
The model results aren’t exactly perfect, at around 92% accuracy from the table above. Still, let’s see what we get from some predictions using the validation dataset.
learn.show_results(max_n=16, shuffle=True, figsize=(12, 10))
Here we see that most of the validation sample predictions result in the correct label. But we can also see an issue with the data: some of the downloaded images contain two Labrador dogs, which is likely to confuse the model when they are of different colors.
To see how many predictions were correct/incorrect we can use a confusion matrix to summarize the results.
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
We can see that yellow Labradors performed pretty well. Only once was an image labelled as ‘yellow’ but predicted to be something else, which was ‘black’ in this case. This is likely to be caused by a black Labrador being present in the image (see the third image in the bottom row of the `show_results()` grid above).
Black and chocolate predictions didn’t do too well, with the chocolate predictions performing the worst out of the three Labrador categories.
We can also use another visualisation of results that summarizes the top losses throughout all predictions of the validation dataset.
interp.plot_top_losses(k=15, nrows=5, figsize=(12, 12))
The `plot_top_losses` plot reveals that the model’s biggest errors often involve high-confidence misclassifications, particularly confusing similar-looking colors such as ‘black vs chocolate’ or ‘yellow vs chocolate’. Several top-loss images contain two Labradors of different colors, which likely contributes to label ambiguity and model confusion.
While some predictions are incorrect with near-100% confidence (suggesting overfitting), others are correct but with low confidence, indicating room for improvement in certainty. Overall, the model struggles most with visually ambiguous inputs and overlapping class features.
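If you prefer raw numbers to the plot, a small sketch like the following lists the highest-loss validation items alongside their file names. This isn’t in the original notebook; it assumes the `dls` and `interp` objects created above, and that `dls.valid_ds.items` holds the validation file paths:

# List the k validation images the model got most wrong (highest loss)
losses, idxs = interp.top_losses(5)
for loss, idx in zip(losses, idxs):
    print(f"{dls.valid_ds.items[idx].name}: loss={loss:.3f}")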
Cleaning Our Labrador Dataset
Fastai includes a really useful notebook widget to visually inspect model predictions and flag incorrect or mislabeled images for deletion or relabeling. The images displayed in the widget for each category are sorted by highest loss, so the most error-prone or uncertain predictions are shown first. You can then scroll to the right to see other predictions with lower associated loss.
from fastai.vision.widgets import ImageClassifierCleaner

cleaner = ImageClassifierCleaner(learn)
cleaner
We can go through each Labrador category and relabel or delete images as necessary. Once done we can delete/move the updated images.
import shutil
# Delete flagged images; move relabelled images to the correct category folder
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)
Retraining the Model
Now that we have identified and fixed some issues with our data, let’s re-run the training and see if the `error_rate` improves.
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms()
).dataloaders(path, bs=32)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(5)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.169166 | 0.489635 | 0.076190 | 00:02 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.306015 | 0.538382 | 0.057143 | 00:02 |
1 | 0.244973 | 0.650789 | 0.095238 | 00:02 |
2 | 0.251994 | 0.655468 | 0.066667 | 00:02 |
3 | 0.216400 | 0.676472 | 0.076190 | 00:02 |
4 | 0.192956 | 0.663825 | 0.066667 | 00:02 |
We can see that the `error_rate` has now improved, from 91.6% accuracy to 93.3%, just by cleaning our dataset! Let’s take a final look at the `show_results` plot, confusion matrix, and `plot_top_losses()` plot.
learn.show_results(max_n=16, shuffle=True, figsize=(12, 10))
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.plot_top_losses(k=15, nrows=5, figsize=(12, 12))
We can see that even though we relabelled some images, and removed other images containing two or more Labrador dogs, there are quite a few that remain. If we went through the entire dataset to rectify this then we’d expect the training `error_rate` to improve even further.
Deploying the Model
Now that the Labrador prediction model is performing reasonably well we’d like to share it online so others can use it for predictions too. This is what is meant by model deployment.
To make this happen though we need to build a simple user interface that will allow images to be uploaded and fed to the model for prediction. Also, how do we export a learned model and use it in a web application? In the course material Jeremy used Gradio for the UI but I’ll be using FastHTML, a modern Python framework for building web applications.
Export the Trained Model
Fastai makes it easy to export a trained model as a single file. This makes the model portable for use in other applications, such as our web app! To do this we can simply call `export()` on the learner object that was the result of model training.
learn.export()
This exports the model locally to the same folder as the Jupyter notebook file.
!ls -sh
total 51M
45M export.pkl 5.5M index.ipynb 4.0K labradors
Our trained model weighs in at around 45MB in size.
Let’s download three images for each Labrador color, resize them to 128 x 128 pixels, and save them locally to test our exported model.
colors = ['black', 'yellow', 'chocolate']
fig, axs = plt.subplots(1, 3, figsize=(6, 2.5))

for ax, c in zip(axs, colors):
    fname = f'{c}.jpg'
    url = search_images(f'{c} labrador photo', max_images=1)[0]
    download_url(url, fname, show_progress=False)

    img = Image.open(fname).resize((128, 128))
    img.save(fname)

    ax.imshow(img)
    ax.set_title(f'{c.capitalize()} labrador')
    ax.axis('off')

plt.tight_layout()
plt.show()
for c in ['black', 'yellow', 'chocolate']:
    path = Path(f'{c}.jpg')
    if path.exists():
        with Image.open(path) as img:
            dims = img.size  # (width, height)
        size_kb = path.stat().st_size / 1024
        print(f'{path.name}: {dims[0]}x{dims[1]}, {size_kb:.1f} KB')
    else:
        print(f'{path.name} not found.')
black.jpg: 128x128, 3.2 KB
yellow.jpg: 128x128, 4.7 KB
chocolate.jpg: 128x128, 4.4 KB
Model Import and Test
To test our model we can import it and store a reference to it in `labrador_learner` so it can be used for inference (making predictions on new image data).
labrador_learner = load_learner('export.pkl')
black_lab = labrador_learner.predict('black.jpg')
yellow_lab = labrador_learner.predict('yellow.jpg')
chocolate_lab = labrador_learner.predict('chocolate.jpg')
black_lab, yellow_lab, chocolate_lab
(('black', tensor(0), tensor([9.9983e-01, 1.5655e-05, 1.5291e-04])),
('yellow', tensor(2), tensor([6.6400e-07, 5.8236e-08, 1.0000e+00])),
('chocolate', tensor(1), tensor([2.7038e-05, 9.9981e-01, 1.6754e-04])))
In all three cases the model correctly predicted what type of Labrador was present in the input image, with at least 99% confidence.
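One thing worth noting is that the probability tensor is ordered by the learner’s class vocabulary (alphabetical here: black, chocolate, yellow), not the order we searched in. A small sketch like this (not in the original notebook) makes the mapping explicit for one test image:

# Map each class in the learner's vocab to its predicted probability
vocab = labrador_learner.dls.vocab
label, class_idx, probs = labrador_learner.predict('black.jpg')
for cls, p in zip(vocab, probs):
    print(f'{cls}: {p.item():.4f}')
print('Predicted:', label)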
Creating the Web App
Now that the model has been tested and shown to be making accurate predictions, we can deploy it to Hugging Face Spaces to share it publicly. As mentioned above I’ll be using FastHTML for the app UI. The FastHTML code required to build the web app is more verbose than the corresponding Gradio code, but it’s not opinionated and is very flexible. I used TailwindCSS for the CSS styles.
I developed the web app locally in just a couple of hours. The user interface looks like this.
img = Image.open('scrn.png')
display(img)
The full code for the app is just under 300 lines of code but is mostly HTML and TailwindCSS class definitions. The Python code to do the inferencing is almost identical to what we have seen already.
# Model inference
labrador_learner = load_learner('export.pkl')
prediction = labrador_learner.predict(filename)

# Extract prediction data
label, class_idx, probabilities = prediction
confidence = probabilities[class_idx.item()].item() * 100
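For context, here is a minimal sketch of how that inference code might be wired into a FastHTML upload route. This is not the actual app code (which adds TailwindCSS styling and more UI); the route names, form layout, and temp-file handling are my own simplifications:

from pathlib import Path
from fasthtml.common import *
from fastai.vision.all import load_learner

app, rt = fast_app()
labrador_learner = load_learner('export.pkl')

@rt('/')
def get():
    # Bare-bones upload form; the real app wraps this in TailwindCSS-styled components
    return Titled('Labrador Classifier',
        Form(Input(type='file', name='uf', accept='image/*'),
             Button('Predict'),
             enctype='multipart/form-data',
             hx_post='/predict', hx_target='#result'),
        Div(id='result'))

@rt('/predict')
async def post(uf: UploadFile):
    # Save the uploaded image to disk, then run inference on it
    tmp = Path('upload.jpg')
    tmp.write_bytes(await uf.read())
    label, class_idx, probabilities = labrador_learner.predict(tmp)
    confidence = probabilities[class_idx.item()].item() * 100
    return P(f'{label} labrador ({confidence:.1f}% confidence)')

serve()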
Deploying to Hugging Face Spaces
It’s fairly straightforward to deploy a FastHTML app to Hugging Face Spaces, and this is made even easier if you follow the official deployment guide provided by AnswerDotAI (the team behind FastHTML). Once you have followed the guide, and set up your Hugging Face Space, you can deploy your FastHTML app simply by entering the command: `fh_hf_deploy <space_name>`.
I did have one difficulty deploying, though, which took me a while to figure out. The FastHTML app ran fine locally, but after deploying to Hugging Face I was getting some obscure error messages surrounding the model PKL file during inferencing (making model predictions). No matter what I tried it refused to load the PKL file properly.
In the end it turned out that, locally, I had trained and exported the model using Python 3.11, but Hugging Face Spaces could not run the PKL file based on that particular version. So, I had to create a new local environment running Python 3.10, reinstall all the necessary dependencies, and train/export the model once more. As soon as I did this and redeployed to Hugging Face Spaces it worked perfectly!
It just goes to show that even an issue as simple as this can sometimes trip you up and cost you a few hours’ development time.
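If you hit something similar, a simple precaution (my own suggestion, not from the course) is to print the interpreter version in the environment where you train and export the model, and compare it with the Python version your Space runs:

import platform
# learn.export() produces a pickle that may not load under a different Python minor version,
# so record the interpreter version used for training/export
print(platform.python_version())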
The Labrador Classifier can be accessed on Hugging Face Spaces here. If you try it out please let me know any thoughts, comments, or suggestions, on Twitter/X. 🙂