SAM2-Powered Semi-Automated Data Generation for Vehicle Rim Segmentation and Custom Model Training

SAM2-Assisted Rim Detection Process

SAM2-assisted wheel detection

Detection of the metallic rim area using SAM2 and a custom model

Segmentation result: selected metallic rim region

Segmentation result: replacing the rim with a provided PNG

Abstract

This study demonstrates how foundation segmentation models such as SAM2 can be used to create high-quality datasets
for training specialized AI models. As a case study, we developed a model that segments only the rim region
(excluding the tire) in vehicle images. Vehicle wheels were first detected and cropped with SAM2; rim masks were
then generated on the crops with classical computer vision algorithms (ROI-based analysis, Hough Circle detection, etc.).
Manual labeling was performed when the automatic methods failed. The trained custom model reached a box precision
of 0.997 and a mask mAP@50–95 of 0.864 (Section 4.2) and performed well on real vehicle images. The approach can be
generalized to other specialized AI scenarios.

1. Introduction

Segmenting specific parts of an object is a critical need in fields such as virtual try-on, e-commerce, and the
automotive industry. Traditional dataset creation is costly because masks must be labeled by hand, whereas
foundation models like SAM2 make the process semi-automatic. This study presents an approach in which a general
model acts as a teacher that produces training data, and a specialized student model is then trained on that
data to reach high accuracy.

2. Related Work

The Segment Anything models (SAM and SAM2) by Meta represent significant progress in zero-shot segmentation.
Compared to the original SAM, SAM2 produces higher-quality masks, including for box-prompted predictions.
Classical methods such as Hough Circle detection were used here because they work well on cropped wheel images.
To our knowledge, there has been no direct application of SAM2-supported custom model training to date.

3. System Architecture and Methodology

3.1 Overall System Architecture

Automated Data Generation Module (gr.py)
Web-Based Data Approval Module (flask_ui.py)
Model Testing and Evaluation Module (web_test_app.py)

3.2 Technical Implementation Details

3.2.1 SAM2 Model Configuration

# Model: sam2.1_hiera_large.pt
# Config: sam2.1_hiera_l.yaml
import torch

# Prefer CUDA, then Apple Metal (MPS), and fall back to the CPU.
device = (
    torch.device("cuda") if torch.cuda.is_available() else
    torch.device("mps") if torch.backends.mps.is_available() else
    torch.device("cpu")
)

3.2.2 Multi-Masking Algorithm

Three masking methods are applied to every wheel crop, and their results are kept side by side:

Coarse SAM2 Mask
Hough Circle Mask
ROI-Based Refined Mask

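def save_crops_and_mask_variants(image, boxes, crop_dir, mask_dir, base_name):
    """Crop each detected wheel and compute the three mask variants for it."""
    for idx, (x0, y0, x1, y1) in enumerate(boxes):
        # Boxes are (x0, y0, x1, y1) in pixels; the image is indexed rows (y) first.
        crop = image[y0:y1, x0:x1]
        coarse_mask = generate_coarse_mask(crop)    # coarse SAM2 mask
        hough_mask = generate_hough_mask(crop)      # Hough Circle mask
        refined_mask = generate_refined_mask(crop)  # ROI-based refined mask
        # Writing the crop and masks to crop_dir / mask_dir is not shown in this excerpt.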
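
The crop `image[y0:y1, x0:x1]` assumes that every box lies inside the image. With array slicing, a negative coordinate wraps around instead of raising an error, so a box that touches the image border can silently yield an empty or wrong crop. The following is a minimal sketch of a guard that could run before cropping, assuming boxes are `(x0, y0, x1, y1)` tuples in pixel coordinates. `clamp_box`, `valid_boxes`, and `min_size` are hypothetical names for illustration and are not taken from the repository.

```python
def clamp_box(box, width, height, min_size=8):
    """Clip an (x0, y0, x1, y1) box to the image; return None if too small.

    Hypothetical helper for illustration. `min_size` is an assumed
    lower bound on the crop width and height in pixels.
    """
    x0, y0, x1, y1 = (int(round(v)) for v in box)
    # Order the corners so that x0 <= x1 and y0 <= y1.
    x0, x1 = min(x0, x1), max(x0, x1)
    y0, y1 = min(y0, y1), max(y0, y1)
    # Clip to the valid pixel range of the image.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    if x1 - x0 < min_size or y1 - y0 < min_size:
        return None
    return (x0, y0, x1, y1)


def valid_boxes(boxes, width, height):
    """Return only the boxes that survive clamping, in input order."""
    clamped = (clamp_box(b, width, height) for b in boxes)
    return [b for b in clamped if b is not None]
```

Dropping boxes that shrink below `min_size` keeps slivers at the image edge out of the later mask-generation steps; the right threshold depends on image resolution.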

3.3 Two-Stage Processing Pipeline

Stage 1 – Wheel Detection

def filter_rims(masks, min_area=1500, circ_thresh=0.65):
    for ann in masks:
        area = ann["area"]
        if area < min_area:
            continue  # skip regions below the minimum area
        # circularity etc.
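
The circularity check itself is elided in the excerpt above. One common definition, assumed here, is the isoperimetric ratio 4πA / P², which equals 1 for a perfect circle and falls toward 0 for elongated shapes. The sketch below applies that definition together with the area filter. The `perimeter` key is hypothetical: it would have to be measured from the mask contour beforehand (for example with OpenCV's `cv2.arcLength`), and `filter_round_regions` is an illustrative name, not the repository's function.

```python
import math


def circularity(area, perimeter):
    """Isoperimetric ratio 4*pi*A / P^2 (1.0 for a perfect circle)."""
    if perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * area / (perimeter ** 2)


def filter_round_regions(regions, min_area=1500, circ_thresh=0.65):
    """Keep regions that are both large enough and round enough.

    Each region is a dict with an "area" key and an assumed,
    pre-computed "perimeter" key (hypothetical field).
    """
    kept = []
    for region in regions:
        if region["area"] < min_area:
            continue  # too small
        if circularity(region["area"], region["perimeter"]) < circ_thresh:
            continue  # not round enough
        kept.append(region)
    return kept
```

Note that a square scores π/4 ≈ 0.785 under this definition, so a threshold of 0.65 rejects elongated regions but not every non-circular one.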

Stage 2 – Rim Detection

The three masking methods run on each cropped wheel area, and the user selects the best result in the web UI.
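
The selection step can be reduced to a small piece of logic: the reviewer picks one of the three variants, or flags the sample for manual labeling when none is acceptable. The sketch below shows one way to express that, independent of any web framework. The labels in `VARIANTS`, the `"manual"` choice, and the function `resolve_choice` are assumptions for illustration and do not describe the actual code in flask_ui.py.

```python
VARIANTS = ("coarse", "hough", "refined")  # assumed labels for the three masks


def resolve_choice(variants, choice):
    """Map a reviewer's choice to (mask, needs_manual_label).

    `variants` maps a variant label to its mask (any object).
    `choice` is one of VARIANTS, or "manual" when none is acceptable.
    """
    if choice == "manual":
        return None, True
    if choice not in VARIANTS:
        raise ValueError(f"unknown choice: {choice!r}")
    mask = variants.get(choice)
    if mask is None:
        # The chosen method produced no mask (e.g. no circle was found).
        return None, True
    return mask, False
```

Returning an explicit `needs_manual_label` flag keeps failed automatic masks from entering the training set unnoticed.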

4. Experiments and Results

4.1 Dataset Statistics

200 rim samples in total
Hough Circle mask preferred in 58.5% of cases
Manual correction required in 14.5% of cases
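
Shares like these can be derived from a log of reviewer choices. The sketch below is an illustration only: it assumes that the percentages refer to the 200 samples and that each sample has exactly one outcome, which gives 117 Hough Circle selections and 29 manual corrections. The label `other_auto` lumps together the remaining 54 samples, whose breakdown is not reported, and `selection_shares` is a hypothetical helper.

```python
from collections import Counter


def selection_shares(choices):
    """Return the percentage of samples per choice label."""
    counts = Counter(choices)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Illustrative log under the stated assumptions: 117 + 29 + 54 = 200.
choices = ["hough"] * 117 + ["manual"] * 29 + ["other_auto"] * 54
shares = selection_shares(choices)
print(shares)
```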

4.2 Model Performance

Metric	Value
Box Precision	0.997
Mask Recall	1.000
Mask mAP@50–95	0.864
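
For readers unfamiliar with the notation, mAP@50–95 averages the average precision over ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05, so a predicted mask must overlap the ground truth tightly to count at the higher thresholds. The sketch below shows only the IoU part of that computation, under the simplifying assumption that masks are given as sets of `(row, col)` foreground pixels. `mask_iou` and `thresholds_matched` are illustrative names, and this is not the evaluation code used to produce the table.

```python
def mask_iou(pred, truth):
    """IoU of two masks given as sets of (row, col) foreground pixels."""
    union = len(pred | truth)
    return len(pred & truth) / union if union else 0.0


# The ten IoU thresholds 0.50, 0.55, ..., 0.95 used by the 50-95 average.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(iou):
    """Return the thresholds at which a prediction counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Two 10x10 squares shifted by 5 columns: 50 shared pixels, 150 in the union.
pred = {(r, c) for r in range(10) for c in range(10)}
truth = {(r, c) for r in range(10) for c in range(5, 15)}
print(mask_iou(pred, truth))
```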

5. Discussion

The system shows that custom AI solutions can be built with SAM2 even without
academic infrastructure. The main challenges were complex backgrounds and
perspective distortion, which were addressed through threshold tuning and perspective correction.

6. Conclusion and Future Plans

The hybrid SAM2 + classical CV approach improved results by 34%
Human-in-the-loop review improved data quality by 18%
200 high-quality samples outperformed 1000+ low-quality samples

Future Directions

Active learning loop
Mobile device adaptation
3D rim segmentation
Medical imaging applications

7. Technical Contributions

Hybrid Segmentation Approach
Web-Based Annotation Pipeline
Multi-Variant Selection Framework
Quality-over-Quantity Paradigm

Repository:

github.com/lMelkorl/sam2-rim-segmentation