From 32c9c809ccc0ec031ef21d922563f7f688913643 Mon Sep 17 00:00:00 2001 From: a2nr Date: Fri, 20 Mar 2026 17:38:13 +0700 Subject: [PATCH] docs: update vision feasibility with split processing architecture Revise vision sensor feasibility study based on feedback: - Pi as camera server only (capture + publish JPEG, no OpenCV) - All OpenCV processing moved to executor (desktop) side - Slot-based object counting with gap detection - HMI camera widget for live feed + ROI configuration - 5 Blockly blocks proposed (snapshot, detect, getSlot, trainColor, hmiSetCamera) Co-Authored-By: Claude Opus 4.6 --- src/amr_vision_node/docs/feasibility.md | 1453 +++++++++++++++-------- 1 file changed, 948 insertions(+), 505 deletions(-) diff --git a/src/amr_vision_node/docs/feasibility.md b/src/amr_vision_node/docs/feasibility.md index dce25fc..31fa709 100644 --- a/src/amr_vision_node/docs/feasibility.md +++ b/src/amr_vision_node/docs/feasibility.md @@ -1,44 +1,94 @@ # Feasibility Study: Vision Sensor for AMR ROS2 K4 > **Date**: 2026-03-20 -> **Scope**: Color/object recognition, object counting, Blockly integration +> **Scope**: Color/object recognition, slot-based object counting, Blockly + HMI integration > **Platform**: Raspberry Pi 4/5 (linux-aarch64) + Desktop (linux-64) --- ## 1. Executive Summary -Implementasi vision sensor pada Kiwi Wheel AMR layak dilakukan menggunakan **OpenCV dengan HSV color thresholding** sebagai pendekatan utama. Pendekatan ini ringan secara komputasi (15-30 FPS pada Raspberry Pi 4 di resolusi 640x480), tidak memerlukan GPU atau model ML, dan dapat diintegrasikan langsung ke arsitektur Blockly yang sudah ada menggunakan pattern yang sama dengan odometry (fetch once, extract many). +Implementasi vision sensor pada Kiwi Wheel AMR layak dilakukan dengan arsitektur **split processing**: Raspberry Pi bertindak sebagai **camera server** (capture + publish snapshot saja), sementara semua pemrosesan OpenCV dilakukan di sisi **executor** (desktop). 
Hasil capture dapat dipantau melalui **HMI camera widget**, dan pengaturan ROI (Region of Interest) dilakukan secara interaktif di HMI. -**Rekomendasi**: Mulai dari Phase 1 (MVP) — OpenCV direct capture, HSV thresholding, 4 Blockly blocks, color profile JSON. Tidak perlu ROS2 image pipeline di awal. +Pendekatan HSV color thresholding ringan secara komputasi dan berjalan efisien di desktop. Object counting menggunakan model **slot-based** yang mendukung gap detection — posisi tertentu bisa kosong (misal: slot 1 ada, slot 2 kosong, slot 3 ada, slot 4 ada). + +**Rekomendasi**: Phase 1 (MVP) — Pi sebagai camera server, OpenCV di executor, HMI camera widget + ROI config, 5 Blockly blocks. --- ## 2. Requirements Analysis -Berdasarkan brief di readme.md, terdapat 3 kebutuhan utama: - ### R1: Pengenalan Warna dan Obyek + Prosedur Training Warna - Deteksi obyek berdasarkan warna dalam frame kamera - User dapat mendefinisikan (training) warna baru melalui prosedur yang user-friendly - Return data obyek terdeteksi: label warna, posisi di frame, bounding box -### R2: Penghitungan Obyek (Urut Kiri ke Kanan) +### R2: Penghitungan Obyek (Urut Kiri ke Kanan) dengan Gap Detection - Hitung obyek yang tersusun secara sekuensial di pandangan kamera - Urutan berdasarkan posisi horizontal (x-coordinate) dari kiri ke kanan -- Return: total count dan posisi individual tiap obyek +- **Slot-based counting**: setiap slot bisa berisi obyek (present) atau kosong (absent) +- Return: total slots, present count, dan status individual tiap slot -### R3: Integrasi Blockly App +### R3: Integrasi Blockly App + HMI -- Vision blocks harus terintegrasi ke visual programming Blockly yang sudah ada -- Mengikuti pattern yang established: JS block registration, handler decorator, ROS2 action -- User dapat menggunakan vision blocks dalam program Blockly tanpa menulis code +- Vision blocks terintegrasi ke Blockly visual programming +- **HMI camera widget** menampilkan live camera feed +- **ROI configuration** 
dilakukan secara interaktif di HMI (drag/resize) +- Mengikuti pattern established: JS block registration, handler decorator, ROS2 action --- -## 3. Hardware Options — Kamera untuk Raspberry Pi +## 3. System Architecture — Split Processing + +### 3.1 Overview + +``` +┌─── Raspberry Pi (aarch64) ───┐ ┌─── Desktop (x86_64) ────────────────────────┐ +│ │ │ │ +│ amr_vision_node │ │ blockly_executor │ +│ ├── Camera capture │ │ ├── handlers/vision.py │ +│ ├── JPEG compress │ ROS2│ │ ├── vision_snapshot → return base64 │ +│ └── Publish /vision/snapshot ├────►│ │ ├── vision_detect → OpenCV + results │ +│ (on-demand, low rate) │topic│ │ └── vision_train → sample ROI + save │ +│ │ │ └── OpenCV processing (all compute here) │ +│ gpio_node, pca9685_node, │ │ │ +│ as5600_node (unchanged) │ │ blockly_app (HMI) │ +│ │ │ ├── Camera widget (displays base64 frame) │ +└───────────────────────────────┘ │ ├── ROI overlay (interactive drag/resize) │ + │ ├── Slot dividers + detection overlays │ + │ └── Color training UI │ + └──────────────────────────────────────────────┘ +``` + +### 3.2 Kenapa Split Processing? + +| Aspek | Processing di Pi (versi lama) | Processing di Executor (baru) | +|-------|-------------------------------|-------------------------------| +| Beban Pi | Berat (OpenCV + camera + motor control) | Ringan (camera capture saja) | +| Performa OpenCV | Terbatas (Pi 4: 15-30 FPS) | Cepat (desktop: 100+ FPS) | +| HMI integration | Sulit (image harus dikirim balik ke desktop) | Mudah (image sudah di desktop) | +| ROI interaktif | Tidak mungkin (no GUI di Pi) | Natural (HMI widget di desktop) | +| Debugging | Susah (headless Pi) | Mudah (bisa preview di HMI) | + +### 3.3 Data Flow — On Demand + +Pi tidak perlu publish terus-menerus. Snapshot dipublish hanya saat dibutuhkan (pada rate rendah, misalnya 2-5 Hz, atau trigger-based): + +``` +User clicks "Run" di Blockly + ↓ +HMI program loop (~20 Hz): + 1. 
executeAction('vision_snapshot', {}) → handler ambil frame cache → return base64 + 2. HMI.setCamera('cam1', base64image) → tampilkan di HMI widget + 3. executeAction('vision_detect', {color:'red', slots:4, roi:{...}}) → OpenCV → results + 4. HMI.setSlotStatus('slots1', results) → update slot indicator di HMI +``` + +--- + +## 4. Hardware — Kamera untuk Raspberry Pi ### Option A: Raspberry Pi Camera Module v2 / v3 @@ -46,145 +96,98 @@ Berdasarkan brief di readme.md, terdapat 3 kebutuhan utama: |-------|--------| | Interface | CSI (MIPI) via ribbon cable | | Resolusi | 8 MP (v2), 12 MP (v3), autofocus pada v3 | -| Kelebihan | Native Pi support, low latency, hardware-accelerated capture via `libcamera`/`picamera2` | -| Kekurangan | Kabel pendek, posisi mounting terbatas, CSI tidak tersedia pada semua konfigurasi Pi | +| Kelebihan | Native Pi support, low latency, hardware-accelerated JPEG encode via `libcamera` | +| Kekurangan | Kabel pendek, posisi mounting terbatas | | Harga | ~$25 (v2), ~$35 (v3) | -### Option B: USB Webcam (Logitech C270, C920, atau sejenisnya) +### Option B: USB Webcam (Logitech C270, C920) | Aspek | Detail | |-------|--------| | Interface | USB (V4L2) | | Resolusi | 720p - 1080p | -| Kelebihan | Plug and play, kabel USB panjang, mudah mounting, tersedia luas, langsung bekerja dengan OpenCV `VideoCapture` | -| Kekurangan | Latency lebih tinggi dari CSI, USB bandwidth contention di Pi, konsumsi daya USB | +| Kelebihan | Plug and play, kabel panjang, mudah mounting | +| Kekurangan | Latency lebih tinggi, USB bandwidth | | Harga | ~$20 (C270), ~$60 (C920) | ### Rekomendasi -**USB webcam untuk prototyping, CSI camera untuk production.** - -Kedua jenis kamera muncul sebagai `/dev/video*` di Linux melalui V4L2. Node harus mengabstraksi akses kamera sehingga keduanya bisa digunakan — cukup ganti device path via ROS2 parameter. 
- -``` -Camera (CSI atau USB) - ↓ V4L2 (/dev/video0) -OpenCV VideoCapture - ↓ -amr_vision_node -``` +Kedua jenis muncul sebagai `/dev/video*` via V4L2. Node mengabstraksi akses — cukup ganti device path via ROS2 parameter. **USB webcam untuk prototyping** (mudah dipasang, panjang kabel fleksibel). --- -## 4. Software Stack +## 5. Software Stack -### 4.1 OpenCV — Library Utama (Rekomendasi) +### 5.1 Pi Side — Minimal Dependencies -- Tersedia di conda-forge sebagai `py-opencv` -- Berjalan di `linux-64` dan `linux-aarch64` -- Menyediakan semua fungsi yang dibutuhkan: color space conversion, thresholding, contour detection, morphological operations -- Ringan, well-supported di Raspberry Pi -- Tidak memerlukan GPU untuk color detection dasar - -### 4.2 ROS2 Vision Packages (Optional, Phase 2) - -| Package | Fungsi | -|---------|--------| -| `ros-jazzy-cv-bridge` | Konversi antara ROS2 `sensor_msgs/Image` dan OpenCV `cv::Mat` | -| `ros-jazzy-image-transport` | Publishing gambar efisien dengan kompresi | -| `ros-jazzy-camera-info-manager` | Manajemen kalibrasi kamera | - -**Catatan**: Ketersediaan packages di atas dalam RoboStack `robostack-jazzy` channel untuk `linux-aarch64` perlu diverifikasi. Jika tidak tersedia, gunakan OpenCV `VideoCapture` langsung (Phase 1 approach). - -### 4.3 Fallback: OpenCV Direct (Phase 1) - -Untuk Phase 1, gunakan OpenCV `VideoCapture` langsung tanpa ROS2 image pipeline: +Pi hanya butuh OpenCV untuk capture dan JPEG compress. Tidak perlu full OpenCV — bahkan bisa pakai `picamera2` untuk Pi Camera atau `v4l2` langsung. 
```python +# Minimal capture di Pi import cv2 - -cap = cv2.VideoCapture("/dev/video0") # atau device index 0 +cap = cv2.VideoCapture("/dev/video0") cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640) cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480) - -ret, frame = cap.read() # BGR numpy array +ret, frame = cap.read() +_, jpeg = cv2.imencode('.jpg', frame, [cv2.IMWRITE_JPEG_QUALITY, 80]) ``` -Pendekatan ini memiliki **zero additional ROS2 dependencies** dan cukup untuk semua kebutuhan di Phase 1. +### 5.2 Executor Side — Full OpenCV Processing + +Desktop menjalankan semua vision processing: +- `py-opencv` dari conda-forge +- Color space conversion (BGR → HSV) +- `cv2.inRange()`, morphological operations, contour detection +- Image annotation (bounding boxes, slot overlays) +- Base64 encoding untuk HMI display + +### 5.3 ROS2 Image Transport + +Untuk transport gambar dari Pi ke executor: + +| Option | Format | Size (640x480) | Pros | Cons | +|--------|--------|-----------------|------|------| +| `sensor_msgs/CompressedImage` | JPEG bytes | ~30-50 KB | Standard ROS2, efficient | Butuh `cv_bridge` di kedua sisi | +| `std_msgs/String` (base64) | Base64 JPEG | ~40-70 KB | Zero extra deps | 33% overhead, tidak standard | +| Custom `VisionSnapshot.msg` | bytes + metadata | ~30-50 KB | Typed, bisa tambah metadata | Perlu define message baru | + +**Rekomendasi Phase 1**: `sensor_msgs/CompressedImage` jika tersedia di RoboStack. Fallback: `std_msgs/String` dengan base64 encoding. --- -## 5. Color Recognition — HSV Thresholding +## 6. Color Recognition — HSV Thresholding -### 5.1 Pipeline - -HSV (Hue-Saturation-Value) color space lebih robust terhadap variasi pencahayaan dibanding RGB karena memisahkan informasi warna (Hue) dari intensitas cahaya (Value). 
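The rationale for working in HSV can be sanity-checked with the standard library alone — a quick illustrative sketch (not part of the pipeline) showing that dimming the light changes Value but leaves Hue, the color identity, untouched:

```python
import colorsys

# The same orange surface under full vs half lighting (RGB in 0..1)
bright = (0.8, 0.4, 0.1)
dim = (0.4, 0.2, 0.05)   # every channel halved

h1, s1, v1 = colorsys.rgb_to_hsv(*bright)
h2, s2, v2 = colorsys.rgb_to_hsv(*dim)

print(round(h1, 3), round(h2, 3))  # hue is identical: 0.071 0.071
print(v1, v2)                      # value halves: 0.8 0.4
```

This is why the trained thresholds below are ranges over (H, S, V) with a tolerance, rather than raw RGB matches.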
+### 6.1 Pipeline (dijalankan di Executor) ``` +Snapshot JPEG dari Pi + ↓ cv2.imdecode() Frame (BGR) ↓ cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) Frame (HSV) - ↓ cv2.inRange(hsv, lower_bound, upper_bound) + ↓ Crop ke ROI (dari HMI config) +ROI (HSV) + ↓ cv2.inRange(roi, lower_bound, upper_bound) Binary Mask (0/255) - ↓ cv2.erode() + cv2.dilate() ← morphological cleanup + ↓ cv2.erode() + cv2.dilate() Clean Mask ↓ cv2.findContours() Contours - ↓ filter by area (reject noise) + ↓ filter by area Detected Objects ``` -### 5.2 Contoh Implementasi +### 6.2 Color Training Procedure (via HMI) -```python -import cv2 -import numpy as np +Training warna dilakukan secara visual melalui HMI — **bukan** dengan input angka di Blockly block. -def detect_color(frame, lower_hsv, upper_hsv, min_area=500): - """Detect objects of a specific color in a BGR frame. +**Flow**: - Args: - frame: BGR image from camera - lower_hsv: (H, S, V) lower bound, e.g. (0, 100, 100) - upper_hsv: (H, S, V) upper bound, e.g. (10, 255, 255) - min_area: minimum contour area in pixels to filter noise - - Returns: - List of detected objects: [{x, y, w, h, area, cx, cy}, ...] - """ - hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) - mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv)) - - # Morphological cleanup — remove small noise, fill small holes - kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)) - mask = cv2.erode(mask, kernel, iterations=1) - mask = cv2.dilate(mask, kernel, iterations=2) - - contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) - - objects = [] - for cnt in contours: - area = cv2.contourArea(cnt) - if area < min_area: - continue - x, y, w, h = cv2.boundingRect(cnt) - cx, cy = x + w // 2, y + h // 2 # centroid - objects.append({"x": x, "y": y, "w": w, "h": h, "area": int(area), "cx": cx, "cy": cy}) - - return objects -``` - -### 5.3 Color Training Procedure - -**Tujuan**: User dapat mendefinisikan warna baru tanpa menulis code. 
Prosedur dilakukan melalui Blockly block. - -**Langkah-langkah**: - -1. **Persiapan**: User menempatkan objek referensi warna di depan kamera, dengan pencahayaan yang konsisten -2. **Capture**: Node menangkap N frame (default: 10) dari kamera -3. **Sampling**: Dari tiap frame, ambil Region of Interest (ROI) di tengah frame (default: 50x50 pixel) -4. **Kalkulasi**: Hitung median HSV dari semua sample ROI, tentukan range sebagai `median ± tolerance` -5. **Simpan**: Color profile disimpan sebagai JSON file +1. User melihat camera feed di HMI widget +2. User **mengatur ROI** (drag rectangle di HMI) ke area objek referensi +3. User menjalankan block `trainColor` — executor mengambil snapshot, sampling HSV dari ROI +4. Hasil training (HSV range) disimpan sebagai JSON profile +5. User bisa langsung melihat detection result di HMI (bounding boxes overlay) **Format Color Profile** (`~/.amr_vision/colors.json`): @@ -209,354 +212,760 @@ def detect_color(frame, lower_hsv, upper_hsv, min_area=500): } ``` -**Contoh Training Algorithm**: +**Training Algorithm** (dijalankan di executor): ```python -def train_color(cap, color_name, roi_size=50, num_samples=10, tolerance=15): - """Train a color by sampling the center of the camera frame. +def train_color(frame_jpeg, roi, color_name, num_samples=10, tolerance=15): + """Train a color from a ROI on the camera frame. Args: - cap: OpenCV VideoCapture object - color_name: name for the trained color (e.g. 
"red") - roi_size: size of the square ROI at frame center - num_samples: number of frames to sample - tolerance: HSV range tolerance (+/-) + frame_jpeg: JPEG bytes from Pi snapshot + roi: dict {x, y, w, h} from HMI widget + color_name: label for the trained color + num_samples: frames to average (executor requests multiple snapshots) + tolerance: HSV tolerance (+/-) Returns: - Color profile dict with lower_hsv and upper_hsv + Color profile with lower_hsv and upper_hsv """ - hsv_samples = [] + frame = cv2.imdecode(np.frombuffer(frame_jpeg, np.uint8), cv2.IMREAD_COLOR) + hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) - for _ in range(num_samples): - ret, frame = cap.read() - if not ret: - continue - hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) + # Crop to ROI defined by user in HMI + roi_hsv = hsv[roi['y']:roi['y']+roi['h'], roi['x']:roi['x']+roi['w']] + median_hsv = np.median(roi_hsv.reshape(-1, 3), axis=0).astype(int) - h, w = hsv.shape[:2] - cx, cy = w // 2, h // 2 - half = roi_size // 2 - roi = hsv[cy - half:cy + half, cx - half:cx + half] + lower = np.clip(median_hsv - tolerance, [0, 0, 0], [179, 255, 255]).tolist() + upper = np.clip(median_hsv + tolerance, [0, 0, 0], [179, 255, 255]).tolist() - # Median HSV of the ROI - median_hsv = np.median(roi.reshape(-1, 3), axis=0) - hsv_samples.append(median_hsv) - - overall_median = np.median(hsv_samples, axis=0).astype(int) - - lower = np.clip(overall_median - tolerance, [0, 0, 0], [179, 255, 255]).tolist() - upper = np.clip(overall_median + tolerance, [0, 0, 0], [179, 255, 255]).tolist() - - return { - "lower_hsv": lower, - "upper_hsv": upper, - "samples": num_samples, - "tolerance": tolerance, - } + return {"lower_hsv": lower, "upper_hsv": upper, "tolerance": tolerance} ``` -**Catatan tentang Hue wrapping**: Warna merah memiliki Hue di sekitar 0° dan 180° (wrapping). 
Untuk menangani ini, training procedure harus mendeteksi apakah Hue sample berada di kedua ujung range dan menghasilkan dua range terpisah yang digabung dengan bitwise OR. +**Catatan Hue wrapping**: Warna merah memiliki Hue di sekitar 0° dan 180°. Training procedure harus mendeteksi bimodal distribution dan menghasilkan dua range yang digabung dengan `cv2.bitwise_or()`. --- -## 6. Object Detection & Counting (Left-to-Right) +## 7. Slot-Based Object Counting with Gap Detection -### 6.1 Algoritma +### 7.1 Konsep -Setelah color detection menghasilkan daftar contour per warna: +Berbeda dengan simple counting (hitung semua obyek), slot-based counting membagi area pandang kamera menjadi **N slot** (zona vertikal dari kiri ke kanan). Setiap slot di-inspect: ada obyek (present) atau kosong (absent). -1. **Hitung centroid** tiap obyek: `cx = x + w/2` -2. **Sort by x-coordinate** (ascending) → otomatis urut kiri ke kanan -3. **Assign index** sekuensial: 1, 2, 3, ... -4. **Minimum separation filter**: jika dua obyek terlalu dekat (< `min_distance` pixel), gabungkan sebagai satu obyek — menghindari double-counting dari fragmentasi mask +``` +Camera frame (ROI area): +┌──────────┬──────────┬──────────┬──────────┐ +│ Slot 1 │ Slot 2 │ Slot 3 │ Slot 4 │ +│ │ │ │ │ +│ [RED] │ (empty) │ [BLUE] │ [RED] │ +│ │ │ │ │ +└──────────┴──────────┴──────────┴──────────┘ + ✓ ada ✗ kosong ✓ ada ✓ ada +``` -### 6.2 Output Format +### 7.2 Algoritma + +```python +def detect_slots(frame_jpeg, roi, num_slots, color_profiles, min_area=500): + """Detect objects in each slot of the ROI. 
+ + Args: + frame_jpeg: JPEG bytes from Pi + roi: {x, y, w, h} from HMI + num_slots: number of left-to-right slots + color_profiles: trained color HSV ranges + min_area: minimum contour area + + Returns: + Slot detection results with gap info + """ + frame = cv2.imdecode(np.frombuffer(frame_jpeg, np.uint8), cv2.IMREAD_COLOR) + roi_frame = frame[roi['y']:roi['y']+roi['h'], roi['x']:roi['x']+roi['w']] + hsv = cv2.cvtColor(roi_frame, cv2.COLOR_BGR2HSV) + + # Detect all colored objects + all_objects = [] + for name, profile in color_profiles.items(): + mask = cv2.inRange(hsv, np.array(profile["lower_hsv"]), np.array(profile["upper_hsv"])) + kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)) + mask = cv2.erode(mask, kernel, iterations=1) + mask = cv2.dilate(mask, kernel, iterations=2) + contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) + + for cnt in contours: + area = cv2.contourArea(cnt) + if area < min_area: + continue + x, y, w, h = cv2.boundingRect(cnt) + cx = x + w // 2 + all_objects.append({"cx": cx, "color": name, "area": int(area), + "x": x, "y": y, "w": w, "h": h}) + + # Assign objects to slots + slot_width = roi['w'] / num_slots + slots = [] + for i in range(num_slots): + slot_left = i * slot_width + slot_right = (i + 1) * slot_width + + # Find objects whose centroid falls within this slot + slot_objects = [o for o in all_objects if slot_left <= o["cx"] < slot_right] + + if slot_objects: + # Take the largest object in the slot + best = max(slot_objects, key=lambda o: o["area"]) + slots.append({ + "slot": i + 1, + "present": True, + "color": best["color"], + "cx": best["cx"], + "cy": best["y"] + best["h"] // 2, + "area": best["area"], + }) + else: + slots.append({ + "slot": i + 1, + "present": False, + }) + + present_count = sum(1 for s in slots if s["present"]) + + return { + "total_slots": num_slots, + "present_count": present_count, + "absent_count": num_slots - present_count, + "slots": slots, + } +``` + +### 7.3 
Output Format ```json { - "count": 3, - "objects": [ - {"index": 1, "cx": 120, "cy": 240, "w": 60, "h": 55, "color": "red", "area": 2850}, - {"index": 2, "cx": 320, "cy": 235, "w": 58, "h": 52, "color": "red", "area": 2640}, - {"index": 3, "cx": 510, "cy": 242, "w": 62, "h": 57, "color": "red", "area": 3020} + "total_slots": 4, + "present_count": 3, + "absent_count": 1, + "slots": [ + {"slot": 1, "present": true, "color": "red", "cx": 80, "cy": 120, "area": 2850}, + {"slot": 2, "present": false}, + {"slot": 3, "present": true, "color": "blue", "cx": 320, "cy": 115, "area": 2640}, + {"slot": 4, "present": true, "color": "red", "cx": 480, "cy": 122, "area": 3020} ] } ``` -### 6.3 Multi-Color Detection +### 7.4 Use Cases -Untuk mendeteksi multiple warna sekaligus: - -```python -def detect_all_colors(frame, color_profiles, min_area=500): - all_objects = [] - for name, profile in color_profiles.items(): - objects = detect_color(frame, profile["lower_hsv"], profile["upper_hsv"], min_area) - for obj in objects: - obj["color"] = name - all_objects.extend(objects) - - # Sort all objects left-to-right regardless of color - all_objects.sort(key=lambda o: o["cx"]) - for i, obj in enumerate(all_objects): - obj["index"] = i + 1 - - return {"count": len(all_objects), "objects": all_objects} -``` +| Skenario | Slots | Hasil | +|----------|-------|-------| +| Assembly check: semua posisi terisi | 4 | `present_count: 4, absent_count: 0` | +| Quality control: ada yang kosong | 4 | `present_count: 3, absent_count: 1` → alert | +| Color sorting: urutan warna benar? | 3 | Check `slots[0].color == "red"`, `slots[1].color == "blue"`, ... | +| Counting: berapa total? | N | `present_count` langsung | --- -## 7. ROS2 Node Design — `amr_vision_node` +## 8. ROS2 Node Design — `amr_vision_node` (Pi Camera Server) -### 7.1 Package Type +### 8.1 Prinsip -**ament_python** — konsisten dengan `blockly_executor` karena semua logika adalah OpenCV/Python. 
+Node ini **hanya menangkap gambar dan mengirimkan snapshot**. Tidak ada pemrosesan OpenCV. Ringan, sederhana, reliable. -### 7.2 Package Structure +### 8.2 Package Structure ``` src/amr_vision_node/ ├── docs/ -│ └── feasibility.md # dokumen ini +│ └── feasibility.md # dokumen ini ├── amr_vision_node/ │ ├── __init__.py -│ ├── vision_node.py # Main ROS2 node -│ ├── color_detector.py # HSV thresholding + contour detection -│ ├── color_trainer.py # Color training / calibration logic -│ └── config/ -│ └── default_colors.json # Default color definitions +│ └── vision_node.py # Camera capture + publish (SATU file saja) ├── resource/ -│ └── amr_vision_node # ament resource marker +│ └── amr_vision_node # ament resource marker ├── package.xml ├── setup.py └── setup.cfg ``` -### 7.3 Node Architecture +**Package type**: ament_python — hanya satu file Python, minimal dependencies. + +### 8.3 Node Architecture ``` -amr_vision_node (Python, ROS2 Node) +amr_vision_node (Python, ROS2 Node — berjalan di Pi) │ -├── Timer callback (configurable, default 10 Hz) +├── Timer callback (configurable, default 2 Hz) │ ├── Capture frame dari kamera (OpenCV VideoCapture) -│ ├── Untuk setiap trained color: detect objects, compute bounding boxes -│ └── Cache detection results (thread-safe) -│ -├── Subscriber: /vision/train (std_msgs/String) -│ ├── Receive JSON: {"color_name": "red", "roi_size": 50, "samples": 10} -│ └── Execute training procedure → save ke colors.json -│ -├── Publisher: /vision/detections (std_msgs/String) -│ └── Publish JSON detection results setiap cycle (untuk executor handler) +│ ├── Encode ke JPEG (cv2.imencode, quality 80) +│ └── Publish ke /vision/snapshot │ └── ROS2 Parameters: ├── camera_device: string = "/dev/video0" ├── frame_width: int = 640 ├── frame_height: int = 480 - ├── publish_rate: double = 10.0 - ├── min_area: int = 500 - └── colors_file: string = "~/.amr_vision/colors.json" + ├── jpeg_quality: int = 80 + └── publish_rate: double = 2.0 ← rendah, hemat 
bandwidth ``` -### 7.4 Communication Pattern - -Mengikuti 2 pattern yang sudah established di project ini: - -**Read pattern** (seperti `as5600_node` → `odometry_read` handler): -``` -amr_vision_node → publish /vision/detections (JSON string) - ↑ -executor handler (vision_detect) ← lazy-subscribe, cache latest value -``` - -**Write pattern** (seperti `gpio_node` write): -``` -executor handler (vision_train_color) → publish /vision/train (JSON string) - ↓ - amr_vision_node ← subscribe, execute training -``` - -### 7.5 Custom Messages — Tidak Diperlukan untuk Phase 1 - -Hasil deteksi dikembalikan sebagai **JSON string melalui `BlocklyAction.action` yang sudah ada** — identik dengan pattern odometry handler. Ini menghindari kebutuhan custom message baru dan menjaga `blockly_interfaces` tetap minimal. +### 8.4 Contoh Implementasi Node ```python -# handlers/vision.py -@handler("vision_detect") -def handle_vision_detect(params, hardware): - color = params.get("color", "all") - # Read from cache (lazy-subscribed to /vision/detections) - return (True, json.dumps({"count": 3, "objects": [...]})) +"""amr_vision_node — Camera server for Pi. 
Capture + publish snapshots only.""" + +import cv2 +import rclpy +from rclpy.node import Node +from sensor_msgs.msg import CompressedImage + + +class VisionNode(Node): + def __init__(self): + super().__init__('amr_vision_node') + + # Parameters + self.declare_parameter('camera_device', '/dev/video0') + self.declare_parameter('frame_width', 640) + self.declare_parameter('frame_height', 480) + self.declare_parameter('jpeg_quality', 80) + self.declare_parameter('publish_rate', 2.0) + + device = self.get_parameter('camera_device').value + width = self.get_parameter('frame_width').value + height = self.get_parameter('frame_height').value + self._jpeg_quality = self.get_parameter('jpeg_quality').value + rate = self.get_parameter('publish_rate').value + + # Camera + self._cap = cv2.VideoCapture(device) + self._cap.set(cv2.CAP_PROP_FRAME_WIDTH, width) + self._cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height) + + if not self._cap.isOpened(): + self.get_logger().error(f'Cannot open camera: {device}') + return + + # Publisher + self._pub = self.create_publisher(CompressedImage, '/vision/snapshot', 1) + + # Timer + self.create_timer(1.0 / rate, self._timer_cb) + self.get_logger().info(f'Vision camera started: {device} @ {width}x{height}, {rate} Hz') + + def _timer_cb(self): + ret, frame = self._cap.read() + if not ret: + return + + _, jpeg = cv2.imencode('.jpg', frame, + [cv2.IMWRITE_JPEG_QUALITY, self._jpeg_quality]) + + msg = CompressedImage() + msg.header.stamp = self.get_clock().now().to_msg() + msg.format = 'jpeg' + msg.data = jpeg.tobytes() + self._pub.publish(msg) + + def destroy_node(self): + if self._cap: + self._cap.release() + super().destroy_node() + + +def main(args=None): + rclpy.init(args=args) + node = VisionNode() + rclpy.spin(node) + node.destroy_node() + rclpy.shutdown() ``` -Jika di kemudian hari diperlukan typed messages (Phase 2+), custom messages bisa ditambahkan ke `blockly_interfaces`: +### 8.5 Bandwidth Estimate -``` -# msg/VisionDetection.msg -string 
color_name -uint16 x -uint16 y -uint16 width -uint16 height -uint32 area -``` +| Resolusi | JPEG Quality | Size per frame | Rate | Bandwidth | +|----------|-------------|----------------|------|-----------| +| 640x480 | 80 | ~30-50 KB | 2 Hz | ~60-100 KB/s | +| 640x480 | 80 | ~30-50 KB | 5 Hz | ~150-250 KB/s | +| 320x240 | 70 | ~10-15 KB | 5 Hz | ~50-75 KB/s | -### 7.6 pixi.toml Dependencies - -```toml -# Tambah ke [dependencies] atau [target.linux-aarch64.dependencies] -py-opencv = "*" - -# Build & run tasks -[tasks.build-vision] -cmd = "colcon build --packages-select amr_vision_node" -depends-on = ["build-interfaces"] - -[tasks.vision-node] -cmd = "ros2 run amr_vision_node vision_node" -depends-on = ["build-vision"] -``` - -Jika `ros-jazzy-cv-bridge` dan `ros-jazzy-image-transport` tersedia di RoboStack, tambahkan untuk Phase 2. Jika tidak, OpenCV `VideoCapture` langsung (zero extra deps) sudah cukup. +Pada 2 Hz dengan 640x480, bandwidth ~100 KB/s — sangat terjangkau untuk WiFi atau ethernet Pi. --- -## 8. Blockly Integration Proposal +## 9. Executor Handler Design — `handlers/vision.py` -### 8.1 Overview — 4 Blocks, Mengikuti Pattern Odometry +### 9.1 Prinsip -| Block | Tipe | Pattern | Deskripsi | -|-------|------|---------|-----------| -| `visionDetect` | ROS2 value block | mirrors `odometryRead.js` | Fetch semua deteksi dari kamera | -| `visionGetCount` | Client-side | mirrors `odometryGet.js` | Extract jumlah obyek | -| `visionGetObject` | Client-side | mirrors `odometryGet.js` | Extract field obyek ke-N | -| `visionTrainColor` | ROS2 statement | mirrors `digitalOut.js` | Trigger training warna | +**Semua pemrosesan OpenCV terjadi di sini** — di sisi executor (desktop). Handler menerima snapshot dari Pi via ROS2 subscription, menjalankan deteksi, dan mengembalikan hasil. 
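The slot-assignment core of `vision_detect` can be exercised in isolation, without OpenCV or ROS2. A pure-Python sketch over already-extracted centroids — `assign_slots` is a hypothetical helper for illustration, mirroring the logic of §7.2:

```python
def assign_slots(objects, roi_width, num_slots):
    """Map detected objects (centroid cx, color, area) onto left-to-right slots.

    A slot containing no centroid is reported as a gap (present=False).
    """
    slot_width = roi_width / num_slots
    slots = []
    for i in range(num_slots):
        left, right = i * slot_width, (i + 1) * slot_width
        candidates = [o for o in objects if left <= o["cx"] < right]
        if candidates:
            best = max(candidates, key=lambda o: o["area"])  # largest object wins
            slots.append({"slot": i + 1, "present": True, "color": best["color"]})
        else:
            slots.append({"slot": i + 1, "present": False})
    return slots

# The four-slot example from section 7.1 — slot 2 is empty
objects = [
    {"cx": 80, "color": "red", "area": 2850},
    {"cx": 320, "color": "blue", "area": 2640},
    {"cx": 480, "color": "red", "area": 3020},
]
slots = assign_slots(objects, roi_width=640, num_slots=4)
print([s["present"] for s in slots])  # [True, False, True, True]
```

Slot 2 comes back with `present: False`, which is the gap-detection behavior reflected in the §7.3 output format.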
-### 8.2 Block 1: `visionDetect` — Fetch Detections +### 9.2 Handler Architecture ``` -┌────────────────────────────────────────────┐ -│ getVision color: [All ▾] │ → output: Object (JSON) -└────────────────────────────────────────────┘ +handlers/vision.py +│ +├── _get_snapshot_subscriber() ← Lazy-subscribe ke /vision/snapshot +│ └── Cache latest JPEG bytes (thread-safe) +│ +├── @handler("vision_snapshot") ← Return base64 image untuk HMI display +│ └── Decode cached JPEG → annotate (optional) → base64 → JSON +│ +├── @handler("vision_detect") ← Full detection pipeline +│ └── Decode → crop ROI → HSV threshold → contour → slot assignment → JSON +│ +└── @handler("vision_train_color") ← Train color from ROI + └── Decode → crop ROI → sample HSV → compute range → save profile ``` -- **Dropdown**: `All`, atau nama warna yang sudah di-training -- **Category**: `Robot` -- **Command**: `vision_detect` +### 9.3 Contoh Implementasi -**Generator** (mengikuti pattern `odometryRead.js`): +```python +"""Vision handlers — all OpenCV processing happens here (executor side).""" + +import base64 +import json +import os +import threading + +import cv2 +import numpy as np + +from . 
import handler +from .hardware import Hardware + +_COLORS_FILE = os.path.expanduser("~/.amr_vision/colors.json") + + +def _load_color_profiles(): + """Load trained color profiles from disk.""" + if os.path.exists(_COLORS_FILE): + with open(_COLORS_FILE) as f: + return json.load(f).get("colors", {}) + return {} + + +def _save_color_profiles(profiles): + """Save trained color profiles to disk.""" + os.makedirs(os.path.dirname(_COLORS_FILE), exist_ok=True) + with open(_COLORS_FILE, "w") as f: + json.dump({"colors": profiles}, f, indent=2) + + +def _get_snapshot_subscriber(hardware: Hardware): + """Lazy-create subscriber for /vision/snapshot from Pi.""" + if not hasattr(hardware.node, "_vision_snapshot_cache"): + hardware.node._vision_snapshot_cache = None # raw JPEG bytes + hardware.node._vision_snapshot_lock = threading.Lock() + hardware.node._vision_snapshot_sub = None + + if hardware.node._vision_snapshot_sub is None: + from sensor_msgs.msg import CompressedImage + + def _snapshot_cb(msg: CompressedImage): + with hardware.node._vision_snapshot_lock: + hardware.node._vision_snapshot_cache = bytes(msg.data) + + hardware.node._vision_snapshot_sub = hardware.node.create_subscription( + CompressedImage, "/vision/snapshot", _snapshot_cb, 1 + ) + + return hardware.node + + +def _get_cached_frame(hardware: Hardware): + """Get the latest cached JPEG bytes from Pi.""" + node = _get_snapshot_subscriber(hardware) + with node._vision_snapshot_lock: + return node._vision_snapshot_cache + + +def _decode_frame(jpeg_bytes): + """Decode JPEG bytes to OpenCV BGR frame.""" + return cv2.imdecode(np.frombuffer(jpeg_bytes, np.uint8), cv2.IMREAD_COLOR) + + +@handler("vision_snapshot") +def handle_vision_snapshot( + params: dict[str, str], hardware: Hardware +) -> tuple[bool, str]: + """Return latest camera frame as base64 for HMI display.""" + hardware.log("vision_snapshot()") + + if hardware.is_real(): + jpeg_bytes = _get_cached_frame(hardware) + if jpeg_bytes is None: + return (True, 
json.dumps({"image": None, "error": "no frame"})) + + b64 = base64.b64encode(jpeg_bytes).decode("ascii") + return (True, json.dumps({ + "image": "data:image/jpeg;base64," + b64, + })) + + # Dummy mode — return placeholder + return (True, json.dumps({"image": None, "dummy": True})) + + +@handler("vision_detect") +def handle_vision_detect( + params: dict[str, str], hardware: Hardware +) -> tuple[bool, str]: + """Detect objects in slots using HSV thresholding.""" + color = params.get("color", "all") + num_slots = int(params.get("slots", "1")) + roi_json = params.get("roi", "{}") + hardware.log(f"vision_detect(color={color}, slots={num_slots})") + + data = {"total_slots": num_slots, "present_count": 0, + "absent_count": num_slots, "slots": []} + + if hardware.is_real(): + jpeg_bytes = _get_cached_frame(hardware) + if jpeg_bytes is None: + return (True, json.dumps(data)) + + frame = _decode_frame(jpeg_bytes) + h, w = frame.shape[:2] + roi = json.loads(roi_json) if roi_json != "{}" else {"x": 0, "y": 0, "w": w, "h": h} + + profiles = _load_color_profiles() + if color != "all": + profiles = {k: v for k, v in profiles.items() if k == color} + + # Crop to ROI + rx, ry, rw, rh = roi["x"], roi["y"], roi["w"], roi["h"] + roi_frame = frame[ry:ry+rh, rx:rx+rw] + hsv = cv2.cvtColor(roi_frame, cv2.COLOR_BGR2HSV) + + # Detect all colored objects + all_objects = [] + kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)) + min_area = int(params.get("min_area", "500")) + + for name, profile in profiles.items(): + lower = np.array(profile["lower_hsv"]) + upper = np.array(profile["upper_hsv"]) + mask = cv2.inRange(hsv, lower, upper) + mask = cv2.erode(mask, kernel, iterations=1) + mask = cv2.dilate(mask, kernel, iterations=2) + contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) + + for cnt in contours: + area = cv2.contourArea(cnt) + if area < min_area: + continue + bx, by, bw, bh = cv2.boundingRect(cnt) + cx = bx + bw // 2 + 
all_objects.append({"cx": cx, "color": name, "area": int(area)}) + + # Assign to slots + slot_width = rw / num_slots if num_slots > 0 else rw + slots = [] + for i in range(num_slots): + slot_left = i * slot_width + slot_right = (i + 1) * slot_width + slot_objs = [o for o in all_objects if slot_left <= o["cx"] < slot_right] + + if slot_objs: + best = max(slot_objs, key=lambda o: o["area"]) + slots.append({"slot": i + 1, "present": True, + "color": best["color"], "area": best["area"]}) + else: + slots.append({"slot": i + 1, "present": False}) + + present = sum(1 for s in slots if s["present"]) + data = {"total_slots": num_slots, "present_count": present, + "absent_count": num_slots - present, "slots": slots} + + return (True, json.dumps(data)) + + +@handler("vision_train_color") +def handle_vision_train_color( + params: dict[str, str], hardware: Hardware +) -> tuple[bool, str]: + """Train a color by sampling ROI from the current frame.""" + name = params.get("name", "unknown") + roi_json = params.get("roi", "{}") + tolerance = int(params.get("tolerance", "15")) + hardware.log(f"vision_train_color(name={name}, tolerance={tolerance})") + + if hardware.is_real(): + jpeg_bytes = _get_cached_frame(hardware) + if jpeg_bytes is None: + return (True, "No camera frame available") + + frame = _decode_frame(jpeg_bytes) + h, w = frame.shape[:2] + roi = json.loads(roi_json) if roi_json != "{}" else { + "x": w // 4, "y": h // 4, "w": w // 2, "h": h // 2 + } + + hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) + roi_hsv = hsv[roi["y"]:roi["y"]+roi["h"], roi["x"]:roi["x"]+roi["w"]] + median_hsv = np.median(roi_hsv.reshape(-1, 3), axis=0).astype(int) + + lower = np.clip(median_hsv - tolerance, [0, 0, 0], [179, 255, 255]).tolist() + upper = np.clip(median_hsv + tolerance, [0, 0, 0], [179, 255, 255]).tolist() + + profiles = _load_color_profiles() + profiles[name] = {"lower_hsv": lower, "upper_hsv": upper, "tolerance": tolerance} + _save_color_profiles(profiles) + + return (True, 
json.dumps({"name": name, "lower_hsv": lower, "upper_hsv": upper}))
+
+    return (True, f"Training color '{name}' (dummy mode)")
+```
+
+---
+
+## 10. HMI Integration — Camera Widget
+
+### 10.1 New Widget Type: `camera`
+
+The `camera` widget displays the camera frame from the Pi as an `<img>` element, with an overlay canvas for the ROI and detection results.
+
+**Default grid size**: [4, 3] (larger than a regular widget because it displays an image).
+
+### 10.2 HMI Manager Addition — `hmi-manager.js`
+
+```javascript
+// ─── Camera widget ──────────────────────────────────────────────
+// Default size for camera widget
+// _defaultSizes.camera = [4, 3];
+
+function _renderCamera(body, widget) {
+  // Image element
+  var img = document.createElement('img');
+  img.className = 'hmi-camera-image';
+  img.style.width = '100%';
+  img.style.height = 'auto';
+  img.style.objectFit = 'contain';
+
+  if (widget.image) {
+    img.src = widget.image;  // data:image/jpeg;base64,...
+  } else {
+    img.alt = 'No camera feed';
+    img.style.background = '#333';
+    img.style.minHeight = '120px';
+  }
+  body.appendChild(img);
+
+  // Overlay for detection results
+  if (widget.detections && widget.detections.slots) {
+    var overlay = document.createElement('div');
+    overlay.className = 'hmi-camera-overlay';
+    overlay.style.position = 'relative';
+    overlay.style.fontSize = '11px';
+    overlay.style.marginTop = '4px';
+    overlay.style.display = 'flex';
+    overlay.style.gap = '2px';
+
+    widget.detections.slots.forEach(function (slot) {
+      var indicator = document.createElement('div');
+      indicator.className = 'hmi-slot-indicator';
+      indicator.style.flex = '1';
+      indicator.style.textAlign = 'center';
+      indicator.style.padding = '2px 4px';
+      indicator.style.borderRadius = '3px';
+
+      if (slot.present) {
+        indicator.style.background = '#4caf50';
+        indicator.style.color = '#fff';
+        indicator.textContent = slot.slot + ': ' + (slot.color || '✓');
+      } else {
+        indicator.style.background = '#f44336';
+        indicator.style.color = 
'#fff';
+        indicator.textContent = slot.slot + ': ✗';
+      }
+      overlay.appendChild(indicator);
+    });
+    body.appendChild(overlay);
+  }
+}
+
+function setCamera(name, image, detections) {
+  var existing = _widgets.get(name);
+  if (existing) {
+    existing.image = image;
+    existing.detections = detections || null;
+    _scheduleRender(name);
+  } else {
+    addWidget(name, 'camera', { image: image, detections: detections || null });
+  }
+}
+```
+
+### 10.3 ROI Configuration via HMI
+
+The ROI is stored in the widget state and can be configured via a Blockly block parameter:
+
+```javascript
+// Blockly block: hmiSetCameraROI
+// Sets the ROI for vision processing — visual feedback in HMI
+function setCameraROI(name, x, y, w, h) {
+  var existing = _widgets.get(name);
+  if (existing) {
+    existing.roi = { x: x, y: y, w: w, h: h };
+    _scheduleRender(name);
+  }
+}
+
+// getCameraROI — read current ROI from HMI widget state
+function getCameraROI(name) {
+  var widget = _widgets.get(name);
+  if (!widget || !widget.roi) return { x: 0, y: 0, w: 640, h: 480 };
+  return widget.roi;
+}
+```
+
+**Phase 2 enhancement**: Interactive ROI — the user can drag/resize a rectangle directly on top of the camera feed (pointer events on the overlay canvas).
+
+---
+
+## 11. Blockly Integration Proposal
+
+### 11.1 Overview — 5 Blocks
+
+| Block | Type | Category | Description |
+|-------|------|----------|-----------|
+| `visionSnapshot` | ROS2 value (HMI) | Vision | Take a snapshot + display it in the HMI camera widget |
+| `visionDetect` | ROS2 value | Vision | Detect objects in slots → returns JSON |
+| `visionGetSlot` | Client-side | Vision | Extract the status of the N-th slot (present/absent/color) |
+| `visionTrainColor` | ROS2 statement | Vision | Train a color from the current ROI |
+| `hmiSetCamera` | Client-side (HMI) | HMI | Update the HMI camera widget with frame + detections |
+
+### 11.2 Block: `visionSnapshot`
+
+Fetches a camera snapshot from the Pi via the executor and returns a base64 image. 
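For reference, the `image` field returned by this block is a plain data URL, so any consumer outside the HMI can recover the raw JPEG bytes by splitting off the prefix. A small self-contained sketch (`data_url_to_bytes` is a hypothetical helper, not part of the proposed handlers):

```python
import base64


def data_url_to_bytes(data_url):
    """Strip the 'data:image/jpeg;base64,' prefix and decode the JPEG payload."""
    header, payload = data_url.split(",", 1)
    if header != "data:image/jpeg;base64":
        raise ValueError("unexpected data URL header: " + header)
    return base64.b64decode(payload)


# Round-trip: encode fake JPEG bytes the way the handler does, then decode.
jpeg = b"\xff\xd8\xff\xe0fake-jpeg-payload"
url = "data:image/jpeg;base64," + base64.b64encode(jpeg).decode("ascii")
```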
+
+```
+┌────────────────────────────────────┐
+│  getSnapshot                       │  → output: Object {image, ...}
+└────────────────────────────────────┘
+```

 ```javascript
-// blocks/visionDetect.js
 BlockRegistry.register({
-  name: 'visionDetect',
-  category: 'Robot',
-  categoryColor: '#5b80a5',
+  name: 'visionSnapshot',
+  category: 'Vision',
+  categoryColor: '#8E24AA',
   color: '#8E24AA',
-  tooltip: 'Fetch vision detection data — use with "set variable" block',
+  tooltip: 'Fetch camera snapshot from Pi — returns image data for HMI display',

   definition: {
     init: function () {
       this.appendDummyInput()
-        .appendField('getVision')
-        .appendField(new Blockly.FieldDropdown([
-          ['All', 'all'],
-          ['Red', 'red'],
-          ['Blue', 'blue'],
-          ['Green', 'green']
-        ]), 'COLOR');
+        .appendField('getSnapshot');
       this.setOutput(true, null);
       this.setColour('#8E24AA');
-      this.setTooltip('Fetch all vision detections (count, objects[]) from camera');
+      this.setTooltip('Fetch latest camera frame as base64 image');
     }
   },

   generator: function (block) {
-    var color = block.getFieldValue('COLOR');
     var code =
-      'JSON.parse((await executeAction(\'vision_detect\', { color: \'' + color + '\' })).message)';
+      'JSON.parse((await executeAction(\'vision_snapshot\', {})).message)';
     return [code, Blockly.JavaScript.ORDER_AWAIT];
   }
 });
 ```

-### 8.3 Block 2: `visionGetCount` — Extract Count
+### 11.3 Block: `visionDetect`
+
+Detects objects in the frame using slot-based counting. 
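Before the block's UI and generator, it may help to see the slot-assignment rule in isolation — a standalone sketch mirroring the handler logic in §9.3 (`assign_slots` is a hypothetical helper, not part of the codebase):

```python
def assign_slots(objects, roi_width, num_slots):
    """Map detected objects to left-to-right slots; slots with no object stay absent.

    `objects` is a list of dicts with centroid x ("cx"), "color", and "area",
    all relative to the ROI. Each slot keeps only its largest object.
    """
    slot_width = roi_width / num_slots
    slots = []
    for i in range(num_slots):
        left, right = i * slot_width, (i + 1) * slot_width
        hits = [o for o in objects if left <= o["cx"] < right]
        if hits:
            best = max(hits, key=lambda o: o["area"])
            slots.append({"slot": i + 1, "present": True, "color": best["color"]})
        else:
            slots.append({"slot": i + 1, "present": False})  # gap detected
    return slots


# Example: 3 objects in a 480 px ROI split into 4 slots — slot 2 is empty.
detected = [
    {"cx": 40, "color": "red", "area": 900},
    {"cx": 290, "color": "blue", "area": 700},
    {"cx": 430, "color": "red", "area": 800},
]
result = assign_slots(detected, roi_width=480, num_slots=4)
# slots 1, 3, 4 present; slot 2 absent
```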
``` ┌───────────────────────────────────────────────┐ -│ getVisionCount from [detection ▾] │ → output: Number +│ detectVision color: [All ▾] slots: [■ 4] │ +│ ROI: x [■] y [■] w [■] h [■] │ → output: Object (JSON) └───────────────────────────────────────────────┘ ``` -**Generator** (mengikuti pattern `odometryGet.js`): - ```javascript -// blocks/visionGetCount.js BlockRegistry.register({ - name: 'visionGetCount', - category: 'Robot', - categoryColor: '#5b80a5', + name: 'visionDetect', + category: 'Vision', + categoryColor: '#8E24AA', color: '#8E24AA', - tooltip: 'Get the number of detected objects from vision data', + tooltip: 'Detect objects in slots — supports gap detection (empty slots)', definition: { init: function () { - this.appendValueInput('VAR') - .appendField('getVisionCount') - .appendField('from'); - this.setOutput(true, 'Number'); + this.appendValueInput('SLOTS') + .appendField('detectVision') + .appendField(new Blockly.FieldDropdown([ + ['All colors', 'all'], + ['Red', 'red'], + ['Blue', 'blue'], + ['Green', 'green'] + ]), 'COLOR') + .appendField('slots:'); + this.appendValueInput('ROI_X').appendField('ROI x:').setCheck('Number'); + this.appendValueInput('ROI_Y').appendField('y:').setCheck('Number'); + this.appendValueInput('ROI_W').appendField('w:').setCheck('Number'); + this.appendValueInput('ROI_H').appendField('h:').setCheck('Number'); + this.setInputsInline(true); + this.setOutput(true, null); this.setColour('#8E24AA'); - this.setTooltip('Extract object count from vision data'); + this.setTooltip('Detect objects in N slots — returns {total_slots, present_count, absent_count, slots[]}'); } }, generator: function (block) { - var varCode = Blockly.JavaScript.valueToCode( - block, 'VAR', Blockly.JavaScript.ORDER_MEMBER) || '{}'; - var code = '(' + varCode + '.count)'; - return [code, Blockly.JavaScript.ORDER_MEMBER]; + var color = block.getFieldValue('COLOR'); + var slots = Blockly.JavaScript.valueToCode( + block, 'SLOTS', 
Blockly.JavaScript.ORDER_ATOMIC) || '1';
+    var roiX = Blockly.JavaScript.valueToCode(
+      block, 'ROI_X', Blockly.JavaScript.ORDER_ATOMIC) || '0';
+    var roiY = Blockly.JavaScript.valueToCode(
+      block, 'ROI_Y', Blockly.JavaScript.ORDER_ATOMIC) || '0';
+    var roiW = Blockly.JavaScript.valueToCode(
+      block, 'ROI_W', Blockly.JavaScript.ORDER_ATOMIC) || '640';
+    var roiH = Blockly.JavaScript.valueToCode(
+      block, 'ROI_H', Blockly.JavaScript.ORDER_ATOMIC) || '480';
+
+    var code = 'JSON.parse((await executeAction(\'vision_detect\', ' +
+      '{ color: \'' + color + '\', slots: String(' + slots + '), ' +
+      'roi: JSON.stringify({x: ' + roiX + ', y: ' + roiY +
+      ', w: ' + roiW + ', h: ' + roiH + '}) })).message)';
+    return [code, Blockly.JavaScript.ORDER_AWAIT];
   }
 });
 ```

-### 8.4 Block 3: `visionGetObject` — Extract Object Field
+### 11.4 Block: `visionGetSlot`
+
+Extracts the status of a specific slot (client-side, no ROS2 round-trip).

 ```
-┌───────────────────────────────────────────────────────────┐
-│  getVisionObject [■ index] [X ▾] from [detection ▾]       │  → output: Number
-└───────────────────────────────────────────────────────────┘
+┌──────────────────────────────────────────────────────────┐
+│  getSlot [■ index] [Present? 
▾] from [detection ▾] │ → output: value +└──────────────────────────────────────────────────────────┘ ``` -**Generator**: - ```javascript -// blocks/visionGetObject.js BlockRegistry.register({ - name: 'visionGetObject', - category: 'Robot', - categoryColor: '#5b80a5', + name: 'visionGetSlot', + category: 'Vision', + categoryColor: '#8E24AA', color: '#8E24AA', - tooltip: 'Get a field from a detected object by index (0-based, left to right)', + tooltip: 'Get slot status from detection result (present/absent/color/count)', definition: { init: function () { this.appendValueInput('INDEX') - .appendField('getVisionObject'); + .appendField('getSlot'); this.appendDummyInput() .appendField(new Blockly.FieldDropdown([ - ['Center X', 'cx'], - ['Center Y', 'cy'], - ['Width', 'w'], - ['Height', 'h'], - ['Area', 'area'], - ['Color', 'color'] + ['Present?', 'present'], + ['Color', 'color'], + ['Present Count', 'present_count'], + ['Absent Count', 'absent_count'], + ['Total Slots', 'total_slots'], ]), 'FIELD') .appendField('from'); this.appendValueInput('VAR'); this.setInputsInline(true); this.setOutput(true, null); this.setColour('#8E24AA'); - this.setTooltip('Extract a field from detected object at index'); + this.setTooltip('Extract slot info — index for per-slot, or overall counts'); } }, @@ -566,31 +975,39 @@ BlockRegistry.register({ var field = block.getFieldValue('FIELD'); var varCode = Blockly.JavaScript.valueToCode( block, 'VAR', Blockly.JavaScript.ORDER_MEMBER) || '{}'; - var code = '(' + varCode + '.objects[' + indexCode + '].' + field + ')'; + + var code; + // Overall fields don't need index + if (field === 'present_count' || field === 'absent_count' || field === 'total_slots') { + code = '(' + varCode + '.' + field + ')'; + } else { + // Per-slot fields need index (slot array is 0-based in JS) + code = '(' + varCode + '.slots[' + indexCode + '].' 
+ field + ')';
+    }
     return [code, Blockly.JavaScript.ORDER_MEMBER];
   }
 });
 ```

-### 8.5 Block 4: `visionTrainColor` — Train New Color
+### 11.5 Block: `visionTrainColor`
+
+Trains a color from the ROI area of the current camera frame.

 ```
 ┌──────────────────────────────────────────────┐
-│  Vision Train Color  name: [input]           │
-│  ROI size: [50]  samples: [10]               │
+│  trainColor  name: [input]                   │
+│  ROI: x [■] y [■] w [■] h [■]                │
+│  tolerance: [15]                             │
 └──────────────────────────────────────────────┘
 ```

-**Generator**:
-
 ```javascript
-// blocks/visionTrainColor.js
 BlockRegistry.register({
   name: 'visionTrainColor',
-  category: 'Robot',
-  categoryColor: '#5b80a5',
+  category: 'Vision',
+  categoryColor: '#8E24AA',
   color: '#8E24AA',
-  tooltip: 'Train a new color — place reference object in front of camera before running',
+  tooltip: 'Train a new color from ROI area — place colored object in view first',

   definition: {
     init: function () {
@@ -598,236 +1015,262 @@ BlockRegistry.register({
         .appendField('trainColor')
         .appendField('name:')
         .appendField(new Blockly.FieldTextInput('red'), 'NAME');
+      this.appendValueInput('ROI_X').appendField('ROI x:').setCheck('Number');
+      this.appendValueInput('ROI_Y').appendField('y:').setCheck('Number');
+      this.appendValueInput('ROI_W').appendField('w:').setCheck('Number');
+      this.appendValueInput('ROI_H').appendField('h:').setCheck('Number');
       this.appendDummyInput()
-        .appendField('ROI size:')
-        .appendField(new Blockly.FieldNumber(50, 10, 200), 'ROI_SIZE')
-        .appendField('samples:')
-        .appendField(new Blockly.FieldNumber(10, 1, 50), 'SAMPLES');
+        .appendField('tolerance:')
+        .appendField(new Blockly.FieldNumber(15, 5, 50), 'TOLERANCE');
       this.setPreviousStatement(true, null);
       this.setNextStatement(true, null);
       this.setColour('#8E24AA');
-      this.setTooltip('Train a color by sampling the camera ROI');
+      this.setTooltip('Train a color by sampling the ROI area');
     }
   },

   generator: function (block) {
     var name = block.getFieldValue('NAME');
-    var roiSize = 
block.getFieldValue('ROI_SIZE');
-    var samples = block.getFieldValue('SAMPLES');
+    var tolerance = block.getFieldValue('TOLERANCE');
+    var roiX = Blockly.JavaScript.valueToCode(
+      block, 'ROI_X', Blockly.JavaScript.ORDER_ATOMIC) || '0';
+    var roiY = Blockly.JavaScript.valueToCode(
+      block, 'ROI_Y', Blockly.JavaScript.ORDER_ATOMIC) || '0';
+    var roiW = Blockly.JavaScript.valueToCode(
+      block, 'ROI_W', Blockly.JavaScript.ORDER_ATOMIC) || '640';
+    var roiH = Blockly.JavaScript.valueToCode(
+      block, 'ROI_H', Blockly.JavaScript.ORDER_ATOMIC) || '480';
+
     var code = 'await executeAction(\'vision_train_color\', ' +
-      '{ name: \'' + name + '\', roi_size: \'' + roiSize + '\', samples: \'' + samples + '\' });\n';
+      '{ name: \'' + name + '\', tolerance: \'' + tolerance + '\', ' +
+      'roi: JSON.stringify({x: ' + roiX + ', y: ' + roiY +
+      ', w: ' + roiW + ', h: ' + roiH + '}) });\n';
     return code;
   }
 });
 ```

-### 8.6 Contoh Penggunaan di Blockly
+### 11.6 Block: `hmiSetCamera`
+
+Updates the HMI camera widget — displays the frame and detection results. 
-**Program sederhana — hitung obyek merah**: ``` -┌─ Main Program ──────────────────────────────┐ -│ │ -│ set [det] to [getVision color: Red] │ -│ set [count] to [getVisionCount from [det]] │ -│ print ["Jumlah obyek: " + count] │ -│ │ -│ repeat [count] times with [i]: │ -│ set [x] to [getVisionObject [i] │ -│ [Center X] from [det]] │ -│ print ["Obyek " + (i+1) + " di x=" + x] │ -│ │ -└──────────────────────────────────────────────┘ +┌───────────────────────────────────────────────────────┐ +│ HMI Camera [cam1] frame: [■ snapshot] │ +│ detections: [■ detection_result] (optional) │ +└───────────────────────────────────────────────────────┘ ``` -**Program training warna baru**: -``` -┌─ Main Program ─────────────────────────────────┐ -│ │ -│ print ["Taruh obyek KUNING di depan kamera"] │ -│ delay [3] seconds │ -│ trainColor name: "kuning" │ -│ ROI size: 50 samples: 10 │ -│ print ["Training selesai!"] │ -│ │ -│ set [det] to [getVision color: kuning] │ -│ print ["Terdeteksi: " + getVisionCount [det]] │ -│ │ -└──────────────────────────────────────────────────┘ +```javascript +BlockRegistry.register({ + name: 'hmiSetCamera', + category: 'HMI', + categoryColor: '#00BCD4', + color: '#00BCD4', + tooltip: 'Display camera feed with detection overlay on HMI panel', + + definition: { + init: function () { + this.appendValueInput('FRAME') + .appendField('HMI Camera') + .appendField(new Blockly.FieldTextInput('cam1'), 'NAME') + .appendField('frame:'); + this.appendValueInput('DETECTIONS') + .appendField('detections:'); + this.setPreviousStatement(true, null); + this.setNextStatement(true, null); + this.setColour('#00BCD4'); + this.setTooltip('Display camera image with slot detection overlay in HMI panel'); + } + }, + + generator: function (block) { + var name = block.getFieldValue('NAME'); + var frame = Blockly.JavaScript.valueToCode( + block, 'FRAME', Blockly.JavaScript.ORDER_ATOMIC) || 'null'; + var detections = Blockly.JavaScript.valueToCode( + block, 'DETECTIONS', 
Blockly.JavaScript.ORDER_ATOMIC) || 'null';
+
+    return (
+      "await highlightBlock('" + block.id + "');\n" +
+      "HMI.setCamera('" + name + "', (" + frame + ").image, " + detections + ");\n"
+    );
+  }
+});
+```

-### 8.7 Handler Python — `handlers/vision.py`
+### 11.7 Complete Blockly Program Examples

-```python
-# handlers/vision.py — auto-discovered, no imports to update
-import json
-import threading
+**Main Program** — Detect and check slots:
+```
+┌─ Main Program ──────────────────────────────────────┐
+│                                                      │
+│  set [det] to [detectVision All slots: 4             │
+│                ROI x:50 y:50 w:540 h:380]            │
+│                                                      │
+│  set [present] to [getSlot [0] Present Count [det]]  │
+│  set [absent] to [getSlot [0] Absent Count [det]]    │
+│  print ["Present: " + present + ", Empty: " + absent]│
+│                                                      │
+│  repeat 4 times with [i]:                            │
+│    if [getSlot [i] Present? from [det]] then:        │
+│      set [color] to [getSlot [i] Color from [det]]   │
+│      print ["Slot " + (i+1) + ": " + color]          │
+│    else:                                             │
+│      print ["Slot " + (i+1) + ": EMPTY"]             │
+│                                                      │
+└──────────────────────────────────────────────────────┘
+```

-from . 
import handler -from .hardware import Hardware +**HMI Program** — Live camera feed + detection overlay: +``` +┌─ HMI Program ───────────────────────────────────────┐ +│ │ +│ set [snap] to [getSnapshot] │ +│ set [det] to [detectVision All slots: 4 │ +│ ROI x:50 y:50 w:540 h:380] │ +│ HMI Camera [cam1] frame: [snap] detections: [det] │ +│ │ +│ HMI Number [present_count] = [getSlot [] Present │ +│ Count from [det]] unit: "pcs" │ +│ │ +└──────────────────────────────────────────────────────┘ +``` - -def _get_vision_subscriber(hardware: Hardware): - """Lazy-create subscriber for /vision/detections.""" - if not hasattr(hardware.node, "_vision_cache"): - hardware.node._vision_cache = {} - hardware.node._vision_lock = threading.Lock() - hardware.node._vision_sub = None - - if hardware.node._vision_sub is None: - from std_msgs.msg import String - - def _vision_cb(msg: String): - with hardware.node._vision_lock: - hardware.node._vision_cache = json.loads(msg.data) - - hardware.node._vision_sub = hardware.node.create_subscription( - String, "/vision/detections", _vision_cb, 10 - ) - - return hardware.node._vision_cache - - -def _get_vision_publisher(hardware: Hardware): - """Lazy-create publisher for /vision/train.""" - if not hasattr(hardware.node, "_vision_train_pub"): - from std_msgs.msg import String - - hardware.node._vision_train_pub = hardware.node.create_publisher( - String, "/vision/train", 10 - ) - return hardware.node._vision_train_pub - - -@handler("vision_detect") -def handle_vision_detect( - params: dict[str, str], hardware: Hardware -) -> tuple[bool, str]: - color = params.get("color", "all") - hardware.log(f"vision_detect(color={color})") - - data = {"count": 0, "objects": []} - - if hardware.is_real(): - cache = _get_vision_subscriber(hardware) - with hardware.node._vision_lock: - if cache: - if color == "all": - data = cache - else: - # Filter by color - filtered = [o for o in cache.get("objects", []) if o.get("color") == color] - data = {"count": 
len(filtered), "objects": filtered}
-
-    return (True, json.dumps(data))
-
-
-@handler("vision_train_color")
-def handle_vision_train_color(
-    params: dict[str, str], hardware: Hardware
-) -> tuple[bool, str]:
-    name = params.get("name", "unknown")
-    roi_size = params.get("roi_size", "50")
-    samples = params.get("samples", "10")
-    hardware.log(f"vision_train_color(name={name}, roi_size={roi_size}, samples={samples})")
-
-    if hardware.is_real():
-        from std_msgs.msg import String
-
-        pub = _get_vision_publisher(hardware)
-        msg = String()
-        msg.data = json.dumps({"color_name": name, "roi_size": int(roi_size), "samples": int(samples)})
-        pub.publish(msg)
-
-    return (True, f"Training color '{name}' initiated")
+**Training Program** — Train new colors:
+```
+┌─ Main Program ──────────────────────────────────────┐
+│                                                      │
+│  print ["Place a RED object in the ROI area"]        │
+│  delay [3] seconds                                   │
+│  trainColor name: "red"                              │
+│             ROI x: 200 y: 150 w: 240 h: 180          │
+│             tolerance: 15                            │
+│  print ["Red training done!"]                        │
+│                                                      │
+│  print ["Place a BLUE object in the ROI area"]       │
+│  delay [3] seconds                                   │
+│  trainColor name: "blue"                             │
+│             ROI x: 200 y: 150 w: 240 h: 180          │
+│             tolerance: 15                            │
+│  print ["Blue training done!"]                       │
+│                                                      │
+└──────────────────────────────────────────────────────┘
+```

 ---

-## 9. Implementation Phases
+## 12. 
Implementation Phases

-### Phase 1 — Minimum Viable Product (Rekomendasi untuk memulai)
+### Phase 1 — MVP

 | Komponen | Detail |
 |----------|--------|
-| Camera | OpenCV `VideoCapture` langsung (no ROS2 image pipeline) |
-| Detection | HSV thresholding + contour detection |
-| Training | Capture ROI samples → compute HSV range → save JSON |
-| Blockly | 4 blocks: `visionDetect`, `visionGetCount`, `visionGetObject`, `visionTrainColor` |
-| Handler | `vision_detect`, `vision_train_color` (pattern identik odometry) |
-| Message | Tidak perlu custom message — JSON via `BlocklyAction.action` |
-| Platform | Berjalan di Pi 4/5 dan Desktop |
+| Pi Node | `amr_vision_node` — capture + publish JPEG only, ament_python |
+| Executor | `handlers/vision.py` — OpenCV processing, 3 handlers |
+| HMI | `setCamera()` widget — display base64 image + slot indicators |
+| Blockly | 5 blocks: `visionSnapshot`, `visionDetect`, `visionGetSlot`, `visionTrainColor`, `hmiSetCamera` |
+| Counting | Slot-based with gap detection |
+| Color Training | HSV sampling from the ROI, saved as JSON |
+| Image Transport | `sensor_msgs/CompressedImage` (fallback: `std_msgs/String` base64) |

-**Deliverables**:
-- `src/amr_vision_node/` — ROS2 Python package lengkap
-- 4 Blockly block files di `src/blockly_app/.../blocks/`
-- 2 handler functions di `src/blockly_executor/.../handlers/vision.py`
-- Update `manifest.js` dan `pixi.toml`
-- Integration tests
+**New dependencies**:
+```toml
+# pixi.toml — Desktop
+py-opencv = "*"

-### Phase 2 — Enhanced (setelah Phase 1 stabil)
+
+# pixi.toml — Pi (minimal, capture only)
+py-opencv = "*"  # or picamera2 for a CSI camera
+```
+
+### Phase 2 — Interactive HMI

 | Komponen | Detail |
 |----------|--------|
-| ROS2 Image Pipeline | `cv_bridge`, `image_transport`, `sensor_msgs/Image` |
-| HMI Camera Feed | Widget menampilkan live camera thumbnail di HMI panel |
-| ML Color Classifier | k-Nearest Neighbors (KNN) trained on HSV samples |
-| Multi-Color | Deteksi 
beberapa warna secara simultan |
-| Custom Messages | `VisionDetection.msg`, `VisionDetections.msg` |
+| ROI drag/resize | Pointer events on the canvas overlay in the camera widget |
+| Slot divider visual | Vertical lines on the camera feed marking slot boundaries |
+| Detection overlay | Bounding boxes + color labels drawn on the canvas |
+| Color picker | Train a color by clicking an area of the camera feed |
+| Dynamic color dropdown | Blockly dropdown populated from the trained-colors file |

-### Phase 3 — Advanced (future enhancement)
+### Phase 3 — Advanced

 | Komponen | Detail |
 |----------|--------|
-| YOLO Detection | YOLOv8-nano via ONNX Runtime (~5 FPS on Pi 5) |
-| Object Tracking | Track objects across frames (persistent ID) |
-| Shape Recognition | Deteksi bentuk selain warna (lingkaran, persegi, dll) |
+| YOLO Detection | YOLOv8-nano via ONNX Runtime on the executor |
+| Object Tracking | Persistent ID across frames |
+| Camera calibration | Distortion correction, pixel-to-cm mapping |
+| Multi-camera | Support for multiple Pi cameras |

 ---

-## 10. Performance Estimates pada Raspberry Pi
+## 13. 
Performance Estimates

-Berdasarkan benchmark OpenCV pada Raspberry Pi 4/5 yang dipublikasikan:
+### Pi Side (Camera Server Only)

-| Operasi | Pi 4 | Pi 5 |
-|---------|------|------|
-| HSV threshold + contour (640x480) | 15-30 FPS | 30+ FPS |
-| Single color detection pipeline | ~10-20 ms/frame | ~5-10 ms/frame |
-| 3 colors simultaneously | ~30-50 ms/frame | ~15-25 ms/frame |
-| Memory usage (OpenCV + camera buffer) | ~50-100 MB | ~50-100 MB |
-| YOLO v8-nano (ONNX Runtime) | ~2-3 FPS | ~5-7 FPS |
+| Operation | Time | Bandwidth (2 Hz) |
+|---------|-------|-------------------|
+| Camera capture (640x480) | ~5-10 ms | — |
+| JPEG encode (quality 80) | ~3-5 ms | — |
+| ROS2 publish | ~1 ms | ~60-100 KB/s |
+| **Total per frame** | **~10-15 ms** | — |

-**Handler round-trip** (Blockly → executor → vision_node cache → result): menambah ~10-100 ms, sehingga effective detection rate dari Blockly adalah 5-15 Hz. Cukup memadai untuk sequential object counting yang tidak memerlukan real-time tracking.
+The Pi needs only ~3% CPU for the camera server at 2 Hz, leaving the rest of its resources for motor control, encoders, and GPIO.
+
+### Executor Side (Desktop Processing)
+
+| Operation | Time |
+|---------|-------|
+| JPEG decode | ~2-5 ms |
+| HSV convert + threshold (640x480) | ~3-5 ms |
+| Contour detection | ~1-2 ms |
+| Slot assignment | ~0.1 ms |
+| Base64 encode for HMI | ~2-3 ms |
+| **Total per frame** | **~10-15 ms** |
+
+The desktop can process 60-100 FPS, so the bottleneck is the Pi publish rate (2-5 Hz), not the processing.
+
+### HMI Display
+
+| Operation | Time |
+|---------|-------|
+| Set img src (base64, ~50 KB) | ~5-10 ms |
+| Render slot indicators | ~1 ms |
+| **Effective HMI camera rate** | **2-5 FPS** |
+
+2-5 FPS is sufficient for monitoring — this is inspection snapshotting, not video streaming.

 ---

-## 11. Risks & Mitigations
+## 14. 
Risks & Mitigations

 | Risk | Impact | Likelihood | Mitigation |
 |------|--------|------------|------------|
-| RoboStack tidak punya `cv_bridge`/`image_transport` untuk aarch64 | Tidak bisa pakai ROS2 image pipeline | Medium | Phase 1 pakai OpenCV `VideoCapture` langsung — zero ROS2 image deps |
-| Sensitivitas pencahayaan HSV | Deteksi tidak akurat saat cahaya berubah | High | Training procedure, tolerance parameter adjustable, auto white balance kamera |
-| Pi overheat saat continuous vision | Throttling, FPS drop | Medium | Kurangi frame rate, gunakan heatsink/fan, configurable `publish_rate` |
-| USB bandwidth contention | Frame drops | Low | Gunakan CSI camera, atau kurangi resolusi |
-| Obyek overlapping/occlusion | Count salah | Medium | Minimum separation filter, morphological operations, area filter |
-| Hue wrapping untuk warna merah | Training merah gagal | Medium | Deteksi Hue bimodal, gunakan 2 range + bitwise OR |
+| `sensor_msgs/CompressedImage` missing from RoboStack aarch64 | No standard image transport | Medium | Fall back to `std_msgs/String` base64 |
+| HSV sensitivity to lighting | Inaccurate detection | High | Training procedure, adjustable tolerance, HMI live preview for verification |
+| Base64 image too large for the action result | Slow response | Low | Tune JPEG quality, lower the resolution, or separate image transport from detection |
+| WiFi latency Pi → Desktop | Frame delay | Medium | Use Ethernet, or reduce the frame resolution |
+| Ambiguous slot assignment (object on a slot boundary) | Object counted in 2 slots | Medium | Assign by centroid, add a tolerance zone |
+| `py-opencv` unavailable in RoboStack aarch64 | Pi cannot capture | Low | Use `picamera2` for the Pi Camera, or raw `v4l2` capture |

 ---

-## 12. Conclusion & Recommendation
+## 15. Conclusion & Recommendation

 ### Kelayakan

-Implementasi vision sensor pada AMR ROS2 K4 **layak dilakukan** dengan pendekatan HSV color thresholding menggunakan OpenCV. 
Pendekatan ini:
+Implementing the vision sensor with the split-processing architecture **is feasible**:

-1. **Ringan secara komputasi** — berjalan 15-30 FPS pada Raspberry Pi 4 tanpa GPU
-2. **Memenuhi semua requirements** — color recognition, color training, object counting left-to-right
-3. **Terintegrasi natural** ke arsitektur Blockly yang sudah ada — mengikuti pattern odometry yang terbukti (fetch once, extract many via JSON)
-4. **Tidak memerlukan custom message baru** — JSON via `BlocklyAction.action` cukup untuk Phase 1
-5. **Inkremental** — Phase 1 bisa dimulai segera, Phase 2/3 bisa ditambahkan saat dibutuhkan
+1. **The Pi stays light** — camera capture + JPEG publish only, ~3% CPU usage
+2. **Processing runs on the desktop** — OpenCV is fast there (~10-15 ms/frame) and puts no load on the Pi
+3. **Natural HMI integration** — the camera widget + slot indicators reuse the existing HMI patterns
+4. **Slot-based counting** — supports gap detection (empty slots), more realistic than simple counting
+5. **ROI via HMI** — user-friendly and visual, no manual number entry required
+6. **Incremental** — Phase 1 can start immediately; interactive features follow in Phase 2

-### Rekomendasi Langkah Selanjutnya
+### Next Steps

-1. **Verifikasi** ketersediaan `py-opencv` di RoboStack `linux-aarch64` channel
-2. **Implementasi Phase 1** — `amr_vision_node`, 4 Blockly blocks, 2 handlers
-3. **Testing** — integration tests di dummy mode + manual test dengan kamera USB di Pi
-4. **Iterasi** — tune HSV parameters, tambah default color profiles, uji berbagai kondisi pencahayaan
+1. **Verify** that `py-opencv` and `sensor_msgs` are available in RoboStack for both platforms
+2. **Implement Phase 1** — Pi camera node, executor handlers, HMI camera widget, 5 Blockly blocks
+3. **Testing** — dummy-mode tests + a manual test with a USB webcam
+4. **Iterate** — tune HSV, test under varied lighting, add default color profiles
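One HSV pitfall worth handling while tuning (step 4): red hues wrap around OpenCV's 0-179 hue boundary, so a single `median ± tolerance` range — as in the §9.3 training sketch — can miss half of the red pixels. A wrap-aware variant would compute two ranges and OR the resulting masks (`hue_ranges` is a hypothetical helper, not part of the proposal above):

```python
def hue_ranges(median_hue, tolerance):
    """Return one or two (low, high) hue ranges in OpenCV's 0-179 hue space.

    Red sits at both ends of the hue circle, so a range that crosses the
    0/179 boundary is split in two; each range would feed its own
    cv2.inRange mask, combined with cv2.bitwise_or.
    """
    lo, hi = median_hue - tolerance, median_hue + tolerance
    if 0 <= lo and hi <= 179:
        return [(lo, hi)]                  # no wrap: single range
    if lo < 0:
        return [(0, hi), (180 + lo, 179)]  # wraps below 0 (e.g. hue 5)
    return [(lo, 179), (0, hi - 180)]      # wraps above 179 (e.g. hue 175)
```

Saturation and value bounds are unaffected and can stay as a single clipped range, exactly as in the training handler.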