Encoding & Recognition
Managing people in photo collections has evolved rapidly with digital photography and widely available pro-grade tools. Most of us now have huge libraries — which makes it surprisingly hard to find all photos of a specific person. This is especially important when you organize by people (movie stars, athletes, supermodels, or simply relatives and friends). To manage people effectively, we need intelligent search that can recognize who’s in a photo.
Modern photo managers like Apple Photos and Google Photos auto-detect faces, making it easier to group and search by person. Even so, challenges remain: accurately identifying people in older or low-quality images, handling duplicates, and keeping metadata consistent. The good news: recent progress in face recognition and person re-identification (ReID) — plus better metadata practices — means much of “people management” can be automated with the right approach.
This short post (the first in a series) shows how to leverage professional-grade methods with approachable tools — so you can keep your library organized and searchable, and take your people management to the next level. We’ll start with the basics: encoding and recognition.
A layered identity model (what we’re building next)
Eventually, we’ll use a layered model:
- Objective layer, observations from a recognition engine: embeddings that represent a person’s face (and optionally body).
- Subjective layer, human-readable assertions: a database of people with names and biographical details.
- Linking layer, mappings that connect the two so your software can “speak both languages.”
This post focuses on the foundation — creating encodings and recognizing matches.
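To make the three layers a bit more tangible, here is a minimal sketch of how they could be modelled in Python. Every class and field name below (Observation, Person, IdentityLink, confirmed_by_human) is a hypothetical placeholder chosen for illustration, not a fixed design; the real schema may look different once we build it.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class Observation:
    """Objective layer: one embedding produced by a recognition engine."""
    observation_id: str
    source_image: str
    embedding: Sequence[float]  # e.g. a 128-D face vector


@dataclass
class Person:
    """Subjective layer: a human-readable record."""
    person_id: str
    name: str
    bio: Dict[str, str] = field(default_factory=dict)


@dataclass
class IdentityLink:
    """Linking layer: connects one observation to one person."""
    observation_id: str
    person_id: str
    confirmed_by_human: bool = False  # hypothetical flag for manual review


def observations_for(person_id, links, observations):
    """Return every observation linked to the given person."""
    wanted = {link.observation_id for link in links if link.person_id == person_id}
    return [obs for obs in observations if obs.observation_id in wanted]


# Tiny usage example with made-up values
people = [Person("p1", "Person1")]
observations = [
    Observation("o1", "image1.jpg", [0.1, 0.2]),
    Observation("o2", "image2.jpg", [0.3, 0.4]),
]
links = [IdentityLink("o1", "p1")]
print([obs.source_image for obs in observations_for("p1", links, observations)])
```

Keeping the link as its own record means a name can be corrected or a match rejected without touching the embeddings themselves.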
Building your own Recognition Engine
We’ll start with Python-based face recognition and later extend to full-body ReID. A practical starting point for body ReID is FastReID, a PyTorch-based toolbox with a “model zoo” of pre-trained models. (ReID asks “have I seen this person before in my collection?” using body features; face recognition focuses on the face.)
Key idea: a recognizer needs a knowledge base — previously encoded faces (and bodies) of known people. Here’s a simple, effective dataset layout:
/persons_dataset
    /Person1
        – image1.jpg
        – image2.jpg
    /Person2
        – image1.jpg
        – image2.jpg
Aim for 15–20 varied images per person (angles, lighting, expressions). Each folder name is the person label you want the system to learn.
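Before encoding, it can help to check how many images each person folder actually holds. The following is a small sketch using only the standard library; the function names, the extension list, and the default minimum of 15 are assumptions made for this example.

```python
from pathlib import Path

IMAGE_EXT = {".jpg", ".jpeg", ".png"}  # assumption: extend as needed


def count_images_per_person(dataset_dir):
    """Map each person folder name to the number of image files inside it."""
    counts = {}
    for person_dir in sorted(Path(dataset_dir).iterdir()):
        if person_dir.is_dir():
            counts[person_dir.name] = sum(
                1
                for p in person_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXT
            )
    return counts


def underfilled(counts, minimum=15):
    """Return the people who have fewer than `minimum` images."""
    return {name: n for name, n in counts.items() if n < minimum}


if __name__ == "__main__":
    # Only runs the report if the dataset folder from this post exists
    if Path("persons_dataset").is_dir():
        counts = count_images_per_person("persons_dataset")
        for name, n in underfilled(counts).items():
            print(f"{name}: only {n} images")
```

Files that sit directly in the dataset root are ignored, since only folders count as person labels.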
Step 1 — Encode faces
Below is a small script that imports a few libraries for data handling plus face_recognition, the ‘simplest face recognition library’, which is built on dlib’s deep-learning face recognition model. The script scans your dataset, extracts a 128-D face embedding from every image that contains exactly one detected face, and saves the embeddings alongside the corresponding names.
# encode_faces.py
import os, pickle, pathlib
import numpy as np
import face_recognition

DATASET_DIR = pathlib.Path("persons_dataset")
OUT_FILE = "encodings.pkl"

encodings = []
names = []

for person_name in sorted(os.listdir(DATASET_DIR)):
    person_dir = DATASET_DIR / person_name
    if not person_dir.is_dir():
        continue
    for img_name in os.listdir(person_dir):
        img_path = person_dir / img_name
        try:
            image = face_recognition.load_image_file(img_path)
            # Optional: use 'cnn' model if you have a GPU and dlib compiled with CUDA
            boxes = face_recognition.face_locations(image, model="hog")
            face_vecs = face_recognition.face_encodings(image, boxes)
            # Keep only unambiguous images: zero or several faces are skipped
            if len(face_vecs) == 1:
                encodings.append(face_vecs[0])
                names.append(person_name)
        except Exception as e:
            print(f"Skip {img_path}: {e}")

with open(OUT_FILE, "wb") as f:
    pickle.dump({"encodings": np.asarray(encodings), "names": np.asarray(names)}, f)

print(f"Saved {len(encodings)} encodings for {len(set(names))} people to {OUT_FILE}")

That’s it, you just created your first photo knowledge base (KB)! Let’s use it.
Step 2 — Recognize faces in an image
We’ll load the encodings from our KB, open a test image, encode every face found in it, and then find the closest known face for each one by Euclidean distance. You control strictness via TOLERANCE (typical range 0.5–0.6; lower = stricter).
# recognize_image.py
import pickle
import numpy as np
import face_recognition

with open("encodings.pkl", "rb") as f:
    db = pickle.load(f)
KNOWN = db["encodings"]
NAMES = db["names"]
TOLERANCE = 0.55  # adjust to taste

def recognize_persons(image_path):
    image = face_recognition.load_image_file(image_path)
    boxes = face_recognition.face_locations(image, model="hog")
    faces = face_recognition.face_encodings(image, boxes)
    results = []
    for face in faces:
        # Distance from this face to every known encoding; the smallest one wins
        dists = face_recognition.face_distance(KNOWN, face)
        idx = int(np.argmin(dists))
        if dists[idx] <= TOLERANCE:
            results.append(NAMES[idx])
        else:
            results.append("Unknown")
    return results

if __name__ == "__main__":
    test_image = "test_image.jpg"
    print(", ".join(recognize_persons(test_image)) or "No faces found.")
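If you want to see the matching step in isolation, it can be sketched with plain NumPy. This assumes, as far as I know correctly, that face_distance boils down to the Euclidean norm between the query and each known vector. The function name nearest_match and the toy 3-D vectors are made up for the example; real embeddings are 128-D.

```python
import numpy as np


def nearest_match(known, names, query, tolerance=0.55):
    """Return (label, distance) for the closest known vector.

    The label is "Unknown" when even the closest vector is farther
    away than `tolerance`, or when there are no known vectors at all.
    """
    known = np.asarray(known, dtype=float)
    query = np.asarray(query, dtype=float)
    if known.size == 0:
        return "Unknown", float("inf")
    dists = np.linalg.norm(known - query, axis=1)  # one distance per known vector
    idx = int(np.argmin(dists))
    label = str(names[idx]) if dists[idx] <= tolerance else "Unknown"
    return label, float(dists[idx])


# Toy vectors standing in for real embeddings
known = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
names = ["Person1", "Person2"]
print(nearest_match(known, names, [0.9, 0.1, 0.0]))  # close to Person2
print(nearest_match(known, names, [0.5, 0.5, 0.5]))  # too far from both
```

Returning the distance alongside the label makes it easier to tune the tolerance: you can log borderline matches and decide whether a stricter or looser value suits your library.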
Processing only one image at a time is not very efficient, so let’s make the script more useful.
Step 3 — Batch process a folder
Below is an adapted script that processes a whole directory, copying images into subfolders by recognized name (or Unknown). Images in which no face is detected also land in Unknown.
# batch_process.py
import shutil, pathlib, time, pickle
import numpy as np
import face_recognition

INPUT_DIR = pathlib.Path("E:/persons_unknown")
OUTPUT_DIR = pathlib.Path("E:/persons_processed")
# Point this at the encodings.pkl written by encode_faces.py
ENC_FILE = pathlib.Path("E:/persons_dataset/encodings.pkl")
VALID_EXT = {".jpg", ".jpeg", ".png", ".webp"}
TOLERANCE = 0.55
MOVE_FILES = False  # set True to move instead of copy

with open(ENC_FILE, "rb") as f:
    db = pickle.load(f)
KNOWN = db["encodings"]
NAMES = db["names"]

def save_to_bucket(img_path, label):
    # Always copy here; process_image removes the original once if MOVE_FILES is set
    dest_dir = OUTPUT_DIR / label
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / img_path.name
    shutil.copy2(str(img_path), dest)

def process_image(img_path):
    try:
        image = face_recognition.load_image_file(img_path)
        boxes = face_recognition.face_locations(image, model="hog")
        faces = face_recognition.face_encodings(image, boxes)
        # No face found: file the image under Unknown
        labels = set() if faces else {"Unknown"}
        # If multiple faces, save one copy per recognized label (dedup with a set)
        for face in faces:
            dists = face_recognition.face_distance(KNOWN, face)
            idx = int(np.argmin(dists))
            label = str(NAMES[idx]) if dists[idx] <= TOLERANCE else "Unknown"
            labels.add(label)
        for label in labels:
            save_to_bucket(img_path, label)
        # Moving per label would fail on the second label, so delete the source once
        if MOVE_FILES:
            img_path.unlink()
    except Exception as e:
        print(f"Error {img_path}: {e}")

def main():
    start = time.time()
    images = [p for p in INPUT_DIR.rglob("*") if p.is_file() and p.suffix.lower() in VALID_EXT]
    total = len(images)
    for i, p in enumerate(images, 1):
        process_image(p)
        print(f"Progress: {i/total:0.1%} ({i}/{total})")
    print(f"Done in {time.time()-start:0.1f}s")

if __name__ == "__main__":
    main()
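One thing to watch in a batch run like this: the input folder is scanned recursively, so two files with the same name from different subfolders end up at the same destination path, and the later copy overwrites the earlier one. A possible way around that is a small helper that picks a free file name. The helper name unique_destination and the _1, _2 suffix scheme are assumptions made for this sketch.

```python
from pathlib import Path


def unique_destination(dest_dir, filename):
    """Return a path inside dest_dir that does not exist yet.

    If dest_dir/filename is taken, try filename_1, filename_2, ...
    (the suffix goes before the extension).
    """
    dest_dir = Path(dest_dir)
    candidate = dest_dir / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = dest_dir / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

To use it, replace the `dest_dir / img_path.name` expression in save_to_bucket with `unique_destination(dest_dir, img_path.name)`. The trade-off is that re-running the batch on the same input then adds numbered duplicates instead of overwriting the earlier results.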
In three small steps we created the means to encode and recognize people’s faces in your photo collections. Later, we will show you how to add a small PyQt6 GUI to pick input/output folders, start the run, show progress, and inspect results. For now, these small examples suffice to demonstrate the core mechanics while staying clear and minimal.
Where body ReID fits (next posts)
Face recognition works best when faces are visible and reasonably sharp. Person ReID complements this by using full-body appearance (body parts, clothing, silhouette) to cluster or match the same person across photos where the face isn’t clear. Tools like FastReID and Torchreid provide strong pre-trained models, which you can use to build a store of body embeddings for your known identities. We’ll add this in the next post as an optional layer on top of the face pipeline.
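To give a rough idea of what matching against a store of body embeddings could look like, here is a minimal NumPy sketch using cosine similarity, a common choice for comparing ReID features. It does not call any ReID library: the function names, the 4-D toy vectors, and the 0.7 threshold are all placeholders, and real models produce much longer vectors and need their own tuned threshold.

```python
import numpy as np


def l2_normalize(vectors):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)  # avoid division by zero


def match_body(gallery, gallery_names, query, min_similarity=0.7):
    """Return (name, similarity) of the best gallery match.

    The name is "Unknown" when the best similarity is below min_similarity.
    """
    sims = l2_normalize(gallery) @ l2_normalize(query)
    idx = int(np.argmax(sims))
    name = gallery_names[idx] if sims[idx] >= min_similarity else "Unknown"
    return name, float(sims[idx])


# Toy gallery: one made-up body embedding per known person
gallery = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
gallery_names = ["Person1", "Person2"]
print(match_body(gallery, gallery_names, [0.9, 0.1, 0.0, 0.0]))  # resembles Person1
print(match_body(gallery, gallery_names, [0.0, 0.0, 1.0, 0.0]))  # resembles nobody
```

Note the mirror image of the face pipeline: there a smaller distance is better, here a larger similarity is better.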