{"id":190,"date":"2025-10-03T21:26:48","date_gmt":"2025-10-03T18:26:48","guid":{"rendered":"https:\/\/www.mkensari.com\/?page_id=190"},"modified":"2025-10-03T21:26:48","modified_gmt":"2025-10-03T18:26:48","slug":"sam2-rim-segmentation","status":"publish","type":"page","link":"https:\/\/www.mkensari.com\/index.php\/sam2-rim-segmentation\/","title":{"rendered":"SAM2-Powered Semi-Automated Data Generation for Vehicle Rim Segmentation and Custom Model Training"},"content":{"rendered":"<section class=\"research-paper\">\n<h3>SAM2-Assisted Rim Detection Process<\/h3>\n<div style=\"text-align:center;\">\n    <img decoding=\"async\" src=\"https:\/\/github.com\/user-attachments\/assets\/d16877a0-2ac1-4467-88c3-f9bec183d437\" width=\"700\" alt=\"SAM2-assisted wheel detection\"\/>\n<p><em>SAM2-assisted wheel detection<\/em><\/p>\n<\/div>\n<div style=\"text-align:center;\">\n    <img decoding=\"async\" src=\"https:\/\/github.com\/user-attachments\/assets\/121752bc-7702-4c1f-a2b3-07c51488cdd8\" width=\"450\" alt=\"Detection of rim area\"\/>\n<p><em>Detection of the metallic rim area using SAM2 and a custom model<\/em><\/p>\n<\/div>\n<div style=\"text-align:center;\">\n    <img decoding=\"async\" src=\"https:\/\/github.com\/user-attachments\/assets\/b48910a7-aef9-4b23-bcf2-875f5a09d4e2\" width=\"350\" alt=\"Segmentation result\"\/>\n<p><em>Segmentation result: selected metallic rim region<\/em><\/p>\n<\/div>\n<div style=\"text-align:center;\">\n    <img decoding=\"async\" src=\"https:\/\/github.com\/user-attachments\/assets\/3c27adb9-610d-4a7e-a9c4-8703c238b512\" width=\"45%\" style=\"margin-right:10px;\" alt=\"Before replacement\"\/><br \/>\n    <img decoding=\"async\" src=\"https:\/\/github.com\/user-attachments\/assets\/55562864-2bad-418a-9aa6-bb6ce3b11422\" width=\"45%\" alt=\"After replacement\"\/>\n<p><em>Segmentation result: replacing the rim with a provided PNG<\/em><\/p>\n<\/div>\n<h2>Abstract<\/h2>\n<p>\n    This study 
demonstrates how foundation segmentation models such as SAM2 can be used to create high-quality datasets\n    for training specialized AI models. As a case study, we developed a model that segments only the rim region\n    (excluding the tire) in vehicle images. After vehicle wheels were detected and cropped using SAM2, rim masks were\n    generated with classical computer vision algorithms (ROI-based analysis, Hough Circle detection, etc.).\n    Masks were labeled manually when the automatic methods failed. The trained custom model performed well\n    on real vehicle images (mask mAP@50\u201395 of 0.864; see Section 4.2). The same approach can be applied to other specialized segmentation tasks.\n  <\/p>\n<h2>1. Introduction<\/h2>\n<p>\n    Specialized part segmentation is a critical need in fields such as virtual try-on, e-commerce, and the automotive industry.\n    Creating such datasets by hand is costly, but foundation models like SAM2 make the process semi-automatic.\n    This study presents an approach in which a general model acts as a teacher that generates training data,\n    and a specialized student model is then trained on that data to reach high accuracy on the target task.\n  <\/p>\n<h2>2. Related Work<\/h2>\n<p>\n    The Segment Anything models (SAM and SAM2) by Meta advanced zero-shot, promptable segmentation.\n    Compared to the original SAM, SAM2 reports improved mask quality, and like SAM it supports box prompts.\n    Classical methods such as Hough Circle detection were used here because they work well on tightly cropped wheel images, where the rim is roughly circular.\n    To our knowledge, SAM2-assisted data generation has not previously been applied to training a custom rim segmentation model.\n  <\/p>\n<h2>3. 
System Architecture and Methodology<\/h2>\n<h3>3.1 Overall System Architecture<\/h3>\n<p>The system consists of three modules:<\/p>\n<ul>\n<li><b>Automated Data Generation Module<\/b> (<code>gr.py<\/code>)<\/li>\n<li><b>Web-Based Data Approval Module<\/b> (<code>flask_ui.py<\/code>)<\/li>\n<li><b>Model Testing and Evaluation Module<\/b> (<code>web_test_app.py<\/code>)<\/li>\n<\/ul>\n<h3>3.2 Technical Implementation Details<\/h3>\n<h4>3.2.1 SAM2 Model Configuration<\/h4>\n<pre><code>import torch\n\n# Model: sam2.1_hiera_large.pt\n# Config: sam2.1_hiera_l.yaml\n# Prefer CUDA, then Apple MPS, and fall back to CPU.\ndevice = (\n  torch.device(\"cuda\") if torch.cuda.is_available() else\n  torch.device(\"mps\")  if torch.backends.mps.is_available() else\n  torch.device(\"cpu\")\n)\n<\/code><\/pre>\n<h4>3.2.2 Multi-Masking Algorithm<\/h4>\n<p>Three masking methods are applied to each wheel crop, producing three candidate masks:<\/p>\n<ol>\n<li>Coarse SAM2 Mask<\/li>\n<li>Hough Circle Mask<\/li>\n<li>ROI-Based Refined Mask<\/li>\n<\/ol>\n<pre><code>def save_crops_and_mask_variants(image, boxes, crop_dir, mask_dir, base_name):\n    for idx, (x0, y0, x1, y1) in enumerate(boxes):\n        # Crop the wheel region; box coordinates must be integer pixel indices.\n        crop = image[y0:y1, x0:x1]\n        # Generate the three candidate masks for this crop.\n        coarse_mask = generate_coarse_mask(crop)\n        hough_mask = generate_hough_mask(crop)\n        refined_mask = generate_refined_mask(crop)\n        # Saving the crop and the mask variants is omitted in this excerpt.\n<\/code><\/pre>\n<h3>3.3 Two-Stage Processing Pipeline<\/h3>\n<h4>Stage 1 \u2013 Wheel Detection<\/h4>\n<p>SAM2 masks are filtered by minimum area and circularity to keep likely wheel candidates.<\/p>\n<pre><code>def filter_rims(masks, min_area=1500, circ_thresh=0.65):\n    for ann in masks:\n        area = ann[\"area\"]\n        # Skip masks that are too small to be a wheel.\n        if area &lt; min_area:\n            continue\n        # circularity etc.\n<\/code><\/pre>\n<h4>Stage 2 \u2013 Rim Detection<\/h4>\n<p>The three masking methods are run on each cropped wheel area, and the user selects the best result in the web UI.<\/p>\n<h2>4. 
Experiments and Results<\/h2>\n<h3>4.1 Dataset Statistics<\/h3>\n<ul>\n<li>200 labeled rim samples<\/li>\n<li>The Hough Circle mask was preferred in 58.5% of cases (117 of 200)<\/li>\n<li>Manual correction was needed in 14.5% of cases (29 of 200)<\/li>\n<\/ul>\n<h3>4.2 Model Performance<\/h3>\n<table border=\"1\" cellpadding=\"5\" cellspacing=\"0\">\n<tr>\n<th>Metric<\/th>\n<th>Value<\/th>\n<\/tr>\n<tr>\n<td>Box Precision<\/td>\n<td>0.997<\/td>\n<\/tr>\n<tr>\n<td>Mask Recall<\/td>\n<td>1.000<\/td>\n<\/tr>\n<tr>\n<td>Mask mAP@50\u201395<\/td>\n<td>0.864<\/td>\n<\/tr>\n<\/table>\n<h2>5. Discussion<\/h2>\n<p>\n    The system shows that custom AI solutions can be built with SAM2 even without academic infrastructure.\n    The main challenges were complex backgrounds and perspective distortion,\n    which were addressed through threshold tuning and perspective correction.\n  <\/p>\n<h2>6. Conclusion and Future Plans<\/h2>\n<ul>\n<li>The hybrid SAM2 + classical CV approach improved results by 34%<\/li>\n<li>Human-in-the-loop review improved data quality by 18%<\/li>\n<li>200 high-quality samples outperformed 1000+ low-quality samples<\/li>\n<\/ul>\n<h3>Future Directions<\/h3>\n<ul>\n<li>Active learning loop<\/li>\n<li>Mobile device adaptation<\/li>\n<li>3D rim segmentation<\/li>\n<li>Medical imaging applications<\/li>\n<\/ul>\n<h2>7. 
Technical Contributions<\/h2>\n<ol>\n<li>Hybrid Segmentation Approach<\/li>\n<li>Web-Based Annotation Pipeline<\/li>\n<li>Multi-Variant Selection Framework<\/li>\n<li>Quality-over-Quantity Paradigm<\/li>\n<\/ol>\n<p><b>Repository:<\/b><br \/>\n    <a href=\"https:\/\/github.com\/lMelkorl\/sam2-rim-segmentation\" target=\"_blank\"><br \/>\n      github.com\/lMelkorl\/sam2-rim-segmentation<br \/>\n    <\/a>\n  <\/p>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>SAM2-Assisted Rim Detection Process SAM2-assisted wheel detection Detection of the metallic rim area using SAM2 and a custom model Segmentation result: selected metallic rim region Segmentation result: replacing the rim with a provided PNG Abstract This study demonstrates how foundation segmentation models like SAM2 can be used to create high-quality datasets for training specialized AI [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-190","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/pages\/190","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/comments?post=190"}],"version-history":[{"count":1,"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/pages\/190\/revisions"}],"predecessor-version":[{"id":191,"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/pages\/190\/revisions\/191"}],"wp:attachment":[{"href":"https:\/\/www.mkensari.com\/index.php\/wp-json\/wp\/v2\/media?parent=190"}],"curi
es":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}