SAM2-Assisted Rim Detection Process
Figure: SAM2-assisted wheel detection
Figure: Detection of the metallic rim area using SAM2 and a custom model
Figure: Segmentation result with the metallic rim region selected
Figure: Segmentation result with the rim replaced by a provided PNG
Abstract
This study demonstrates how foundation segmentation models such as SAM2 can be used to build high-quality datasets
for training specialized AI models. As a case study, we developed a model that segments only the rim region of a
wheel, excluding the tire, in vehicle images. Wheels were first detected and cropped with SAM2; rim masks were then
generated on the crops with classical computer vision methods (ROI-based analysis, Hough Circle detection, etc.),
and labeled manually when the automatic methods failed. The custom model trained on this data performed well on
real vehicle images (Section 4.2 reports the metrics). The same approach can be generalized to other specialized AI scenarios.
1. Introduction
Specialized part segmentation is a critical need in fields such as virtual try-on, e-commerce, and the automotive industry.
Creating such datasets by hand is costly, but foundation models like SAM2 make the process semi-automatic.
This study presents an approach in which a general-purpose model acts as a teacher: its outputs, once filtered and
reviewed, become the training data for an accurate specialized student model.
2. Related Work
The Segment Anything models (SAM and SAM2) by Meta advanced promptable, zero-shot segmentation.
SAM2 extends SAM to video and reports faster, more accurate image segmentation than the original model; both accept point and box prompts.
Classical methods such as the Hough Circle transform were used here because they work well on cropped wheel images, where the rim is the dominant, roughly circular shape.
To our knowledge, SAM2-supported custom model training has not previously been applied directly in this way.
3. System Architecture and Methodology
3.1 Overall System Architecture
- Automated Data Generation Module (gr.py)
- Web-Based Data Approval Module (flask_ui.py)
- Model Testing and Evaluation Module (web_test_app.py)
3.2 Technical Implementation Details
3.2.1 SAM2 Model Configuration
    import torch

    # Model: sam2.1_hiera_large.pt
    # Config: sam2.1_hiera_l.yaml
    # Prefer CUDA, then Apple MPS, then fall back to CPU.
    device = (
        torch.device("cuda") if torch.cuda.is_available() else
        torch.device("mps") if torch.backends.mps.is_available() else
        torch.device("cpu")
    )
3.2.2 Multi-Masking Algorithm
Three masking methods are applied to each wheel crop, producing one candidate mask per method:
- Coarse SAM2 Mask
- Hough Circle Mask
- ROI-Based Refined Mask
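    def save_crops_and_mask_variants(image, boxes, crop_dir, mask_dir, base_name):
        """Crop each detected wheel and compute the three candidate rim masks."""
        for idx, (x0, y0, x1, y1) in enumerate(boxes):
            # Boxes are (x0, y0, x1, y1) pixel coordinates; array slicing takes rows (y) first.
            crop = image[y0:y1, x0:x1]
            coarse_mask = generate_coarse_mask(crop)    # coarse SAM2 mask
            hough_mask = generate_hough_mask(crop)      # Hough Circle mask
            refined_mask = generate_refined_mask(crop)  # ROI-based refined mask
            # Saving the crop and mask variants to crop_dir / mask_dir is not shown in this excerpt.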
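The slice `image[y0:y1, x0:x1]` assumes integer box coordinates that lie inside the image. The following is a minimal sketch of a guard for that, assuming boxes arrive as `(x0, y0, x1, y1)` in pixel units and possibly as floats. `clamp_box` and its `min_size` parameter are hypothetical helpers for illustration, not part of the repository.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]


def clamp_box(box, width: int, height: int, min_size: int = 1) -> Optional[Box]:
    """Round an (x0, y0, x1, y1) box and clip it to the image bounds.

    Returns None when the clipped box is narrower or shorter than
    min_size pixels, so the caller can skip it.
    """
    x0, y0, x1, y1 = (int(round(v)) for v in box)
    # Order the corners in case the box was given right-to-left or bottom-to-top.
    x0, x1 = min(x0, x1), max(x0, x1)
    y0, y1 = min(y0, y1), max(y0, y1)
    # Clip to the image; negative indices would otherwise wrap around
    # when used in an array slice.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    if x1 - x0 < min_size or y1 - y0 < min_size:
        return None
    return x0, y0, x1, y1
```

Boxes that come back as `None` can simply be skipped before cropping, which avoids writing empty crops to disk.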
3.3 Two-Stage Processing Pipeline
Stage 1 – Wheel Detection
    def filter_rims(masks, min_area=1500, circ_thresh=0.65):
        for ann in masks:
            area = ann["area"]
            # Skip segments below the minimum area.
            if area < min_area:
                continue
            # circularity etc.
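The circularity test is elided in the excerpt above. One way to approximate it from annotation fields alone is sketched below, assuming each annotation carries an `area` and a `bbox` in XYWH form (as in the output of SAM2's automatic mask generator). The score is a stand-in for true circularity, which would need the contour perimeter. The names `roundness_from_bbox`, `filter_round_masks`, and `round_thresh` are hypothetical, and this is not necessarily the repository's method.

```python
import math


def roundness_from_bbox(area: float, bbox) -> float:
    """Score in [0, 1]; 1.0 means the mask matches a disk inscribed in a square bbox."""
    _, _, w, h = bbox  # assumed XYWH layout
    if w <= 0 or h <= 0 or area <= 0:
        return 0.0
    aspect = min(w, h) / max(w, h)          # 1.0 for a square bounding box
    fill = area / (math.pi / 4.0 * w * h)   # 1.0 when the area equals the inscribed ellipse
    return aspect * min(fill, 1.0 / fill)   # penalize both under- and over-filling


def filter_round_masks(masks, min_area=1500, round_thresh=0.65):
    """Keep annotations that are large enough and roughly disk-shaped."""
    kept = []
    for ann in masks:
        if ann["area"] < min_area:
            continue
        if roundness_from_bbox(ann["area"], ann["bbox"]) < round_thresh:
            continue
        kept.append(ann)
    return kept
```

A filled square scores about 0.785 under this measure and a 2:1 ellipse scores 0.5, so the threshold controls how much elongation from perspective is tolerated.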
Stage 2 – Rim Detection
The three methods are run on the cropped wheel area, and the user selects the best result in the web UI.
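For the Hough Circle variant, the last step is turning a detected circle into a binary mask. The following is a minimal pure-Python sketch of that step, assuming a circle `(cx, cy, r)` has already been found on the crop (for example with OpenCV's `cv2.HoughCircles`). `circle_mask` is an illustrative helper, and the repository's `generate_hough_mask` may work differently.

```python
def circle_mask(width, height, cx, cy, r):
    """Return a height x width grid of 0/1 values with 1 inside the circle."""
    r2 = r * r
    return [
        [1 if (x - cx) ** 2 + (y - cy) ** 2 <= r2 else 0 for x in range(width)]
        for y in range(height)
    ]


def mask_coverage(mask):
    """Fraction of pixels set to 1, useful as a quick sanity check on a candidate."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0
```

A coverage value close to 0 or 1 suggests the circle missed the rim or swallowed the whole crop, which is the kind of case the manual review step is meant to catch.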
4. Experiments and Results
4.1 Dataset Statistics
- 200 rim samples
- Hough Circle preferred in 58.5% of cases
- Manual correction 14.5%
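Shares like the ones above can be computed from a log of reviewer decisions. The following is a minimal sketch, assuming each decision is stored as one label per sample. The label names (`hough`, `coarse`, `refined`, `manual`) and the toy log are hypothetical and are not the study's data.

```python
from collections import Counter


def selection_shares(choices):
    """Map each label to its percentage of all decisions, rounded to 0.1."""
    counts = Counter(choices)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


# Toy log of eight decisions, for illustration only.
toy_log = ["hough", "hough", "refined", "hough", "manual", "coarse", "hough", "refined"]
shares = selection_shares(toy_log)
```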
4.2 Model Performance
| Metric | Value |
|---|---|
| Box Precision | 0.997 |
| Mask Recall | 1.000 |
| Mask mAP@50–95 | 0.864 |
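The `@50–95` suffix follows the COCO convention: average precision is computed at IoU thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged. The following is a minimal sketch of the mask IoU underlying that metric. Masks are represented as sets of `(row, col)` pixels purely for brevity, which is an assumption of this example; real evaluators work on arrays or run-length encodings.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as iterables of (row, col) pixels."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# The ten IoU thresholds averaged by mAP@50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou):
    """Thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Two 10x10 squares offset by two columns: 80 shared pixels, 120 in the union.
pred = {(r, c) for r in range(10) for c in range(10)}
truth = {(r, c) for r in range(10) for c in range(2, 12)}
iou = mask_iou(pred, truth)
```

A prediction only contributes at the thresholds it passes, so a high mAP@50–95 requires tight mask boundaries and not just correct localization.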
5. Discussion
The system shows that custom AI solutions can be built with SAM2 even
without academic infrastructure. Challenges included complex backgrounds
and perspective distortion, which were addressed through threshold tuning and perspective correction.
6. Conclusion and Future Plans
- Hybrid SAM2 + CV approach improved results by 34%
- Human-in-the-loop improved data by 18%
- 200 high-quality samples beat 1000+ low-quality samples
Future Directions
- Active learning loop
- Mobile device adaptation
- 3D rim segmentation
- Medical imaging applications
7. Technical Contributions
- Hybrid Segmentation Approach
- Web-Based Annotation Pipeline
- Multi-Variant Selection Framework
- Quality-over-Quantity Paradigm
Repository: github.com/lMelkorl/sam2-rim-segmentation