Control LeKiwi robot using Gemini Robotics-ER API or local dimos/Pinocchio IK for visual task understanding and manipulation
Two approaches for LeKiwi visual manipulation:
| Mode | Vision | IK/Motion | Network | Speed |
|---|---|---|---|---|
| Gemini | Gemini-2.5-Flash | Hardcoded/simple | Required | ~2-3s/cycle |
| dimos | OpenCV colors | Pinocchio IK | Offline | ~30fps camera |
cd ~/projects/lerobot
source .venv/bin/activate
# Runs on Pi5 completely offline
python dimos_toy_cleanup.py
Controls:
c = detect and pickup largest toy
p = place in detected bin
h = go home
q = quit

# Set API key
export GEMINI_API_KEY="your-key"
# Test connectivity
python ~/.hermes/skills/gemini-robotics/scripts/test_gemini.py
# Run generic task executor
python ~/.hermes/skills/gemini-robotics/scripts/gemini_task_executor.py "pick up the red block and put it in the blue bowl"
# Multi-camera (body + wrist) for better manipulation
python ~/.hermes/skills/gemini-robotics/scripts/gemini_task_executor.py "pick up the screw" --cameras 0 1
# Dry-run mode (no robot movement)
python ~/.hermes/skills/gemini-robotics/scripts/gemini_task_executor.py "pick up the cup" --dry-run
Runs entirely on Pi5 - no API calls, no network, no heavy ML.
A base `pip install dimos` works on the Pi5, BUT the IK code must be isolated from the rest of the package:
# This fails (langchain deps in dimos.core):
from dimos.manipulation.planning.kinematics.pinocchio_ik import PinocchioIK
# WORKAROUND: Inline the IK class (~80 lines from dimos)
class PinocchioIK:
def __init__(self, model, data, ee_joint_id): ...
def solve(self, target_pose, q_init): ...
def forward_kinematics(self, q): ...
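The solve loop inside such an IK class follows the standard damped-least-squares iteration. As a rough illustration of that structure (plain NumPy on a hypothetical 2-link planar arm, not the Pinocchio API — all names here are illustrative):

```python
import numpy as np

def fk(q, l1=1.0, l2=1.0):
    """Forward kinematics of a 2-link planar arm."""
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

def jacobian(q, l1=1.0, l2=1.0):
    """Analytic Jacobian of the end-effector position w.r.t. joint angles."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def solve_ik(target, q0, iters=200, damping=1e-3, tol=1e-6):
    """Damped-least-squares IK: dq = J^T (J J^T + lambda I)^-1 err."""
    q = np.array(q0, dtype=float)
    for _ in range(iters):
        err = target - fk(q)
        if np.linalg.norm(err) < tol:
            break
        J = jacobian(q)
        q += J.T @ np.linalg.solve(J @ J.T + damping * np.eye(2), err)
    return q
```

The damping term keeps the update bounded near singular configurations, which is the same reason Pinocchio-based solvers use it.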
cd ~/projects/lerobot
source .venv/bin/activate
uv pip install dimos # Installs pinocchio, open3d-unofficial-arm
# Main script location:
python ~/projects/lerobot/dimos_toy_cleanup.py
The pixel→world mapping is currently rough:
# Current (needs tuning):
world_x = 0.15 # Fixed
world_y = (cx - 320) / 320 * 0.1 # Horizontal scale
world_z = 0.1 - (cy - 240) / 240 * 0.1 # Depth guess
# TODO: Add camera matrix + hand-eye calibration
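Once a camera matrix is available, the back-projection the TODO asks for is one division per axis. A minimal sketch with placeholder intrinsics (the `K` values below are assumptions, not calibrated for this camera):

```python
import numpy as np

# Hypothetical pinhole intrinsics for a 640x480 camera (fx, fy, cx, cy)
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])

def pixel_to_camera(u, v, depth, K):
    """Back-project pixel (u, v) to a 3D point in the camera frame,
    given a known depth (e.g. the measured camera-to-table distance)."""
    x = (u - K[0, 2]) / K[0, 0] * depth
    y = (v - K[1, 2]) / K[1, 1] * depth
    return np.array([x, y, depth])
```

A full fix still needs a hand-eye transform to map this camera-frame point into the arm's base frame.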
URDF joints for the LeKiwi arm: Pinocchio encodes continuous joints as cos/sin pairs, so the configuration vector holds 18 values for 9 DOF.
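Recovering a raw angle from a cos/sin pair (e.g. when reading a configuration vector back out for the servos) is an `atan2`; a small helper, with hypothetical naming:

```python
import numpy as np

def continuous_q_to_angle(cos_t, sin_t):
    """Recover the joint angle from Pinocchio's (cos t, sin t) encoding
    of a continuous joint."""
    return np.arctan2(sin_t, cos_t)
```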
Generic Task Executor - works with any natural-language instruction:
gemini-robotics-er-1.6-preview (latest robotics model)

# Using uv (recommended for LeRobot)
uv pip install "google-generativeai>=0.8.3"
# Or with pip
pip install "google-generativeai>=0.8.3" pillow opencv-python
Follows the "Calling a custom robot API" pattern:
User Instruction → Camera Image → Gemini Vision → Object Detection → Action Plan → Robot Execution
class LeKiwiAPI:
    def move(self, x, y, high):
        """Move arm to normalized coordinates 0-1000.
        high=True lifts above the scene (obstacle avoidance);
        high=False places the gripper on the surface."""

    def setGripperState(self, opened):
        """opened=True opens the gripper; opened=False closes it."""

    def returnToOrigin(self):
        """Return to the home pose."""
Coordinates are normalized [y, x] in 0-1000 range (image center = 500,500).
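Converting Gemini's normalized [y, x] coordinates back to pixels is a simple rescale; a sketch assuming a 640x480 frame (the resolution is an assumption, not fixed by the API):

```python
def norm_to_pixel(ny, nx, width=640, height=480):
    """Map Gemini's [y, x] in the 0-1000 range to (px, py) pixel coordinates."""
    return int(nx / 1000 * width), int(ny / 1000 * height)
```

So the normalized center [500, 500] maps to the pixel center (320, 240) at this resolution.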
| File | Purpose | Mode |
|---|---|---|
| `gemini_task_executor.py` | Generic task executor with Gemini API | Gemini |
| `gemini_pick_place_lerobot.py` | Original tutorial example | Gemini |
| `gemini_toy_cleanup.py` | Specific toy cleanup demo | Gemini |
| `gemini_robot.py` | Interactive Gemini control | Gemini |
| `test_gemini.py` | API connectivity test | Gemini |
| `pick_and_place.py` | Standalone pick/place | Gemini |
| `dimos_toy_cleanup.py` | Local CV + Pinocchio IK | dimos |
Use dimos (local) when:
- You need to run fully offline on the Pi5 (no network, no API key)
- Speed matters (~30fps camera loop vs ~2-3s per Gemini cycle)

Use Gemini (cloud) when:
- The task needs open-ended natural-language understanding beyond simple color detection
- Network access and an API key are available
"GEMINI_API_KEY not set"
export GEMINI_API_KEY="your-key-here"
"Cannot open camera"
# Find camera index
python -c "import cv2; [print(f'{i}: {cv2.VideoCapture(i).isOpened()}') for i in range(5)]"
# Use different index
python gemini_task_executor.py "task" --camera 2
"google-generativeai not installed"
cd ~/projects/lerobot && uv pip install "google-generativeai>=0.8.3"
Robot not moving
- Use --dry-run to test plan generation without hardware
- Check ~/.cache/huggingface/lerobot/calibration/robots/lekiwi/ for the calibration file

"ModuleNotFoundError: No module named 'langchain_core'"
This happens with from dimos.manipulation.planning.kinematics.pinocchio_ik import PinocchioIK
Solution: Use the inlined PinocchioIK class from dimos_toy_cleanup.py instead:
# Don't use this (pulls in dimos.core → langchain):
from dimos.manipulation.planning.kinematics.pinocchio_ik import PinocchioIK
# Use this (standalone class already in dimos_toy_cleanup.py):
class PinocchioIK:
"""Standalone Pinocchio IK solver (~80 lines from dimos)"""
def __init__(self, model, data, ee_joint_id):
self._model = model
self._data = data
self._ee_joint_id = ee_joint_id
...
"IK converging to NaN"
- Continuous joints expect cos(theta), sin(theta) in the configuration vector, not raw angles
- Verify ee_joint_id matches the URDF (LeKiwi: joint 6 = gripper)

"Robot arm not reaching target"
- Check that joint angles from get_observation() match the IK initial guess
- Some joints may need a sign flip (-angle)

"IK solved but robot moved wrong way"
Adjust the per-servo signs in move_to_pose():
# Try these sign flips per servo:
action = {
"arm_shoulder_pan": -np.degrees(target_angles[0]), # or +angle
"arm_shoulder_pitch": -np.degrees(target_angles[1]),
"arm_elbow": -np.degrees(target_angles[2]),
"arm_wrist_pitch": -np.degrees(target_angles[3]),
"arm_wrist_roll": -np.degrees(target_angles[4]),
"arm_gripper": -np.degrees(target_angles[5]) * 2,
}
"Camera detection not finding objects"
python -c "
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Adjust these HSV bounds until only the target color survives:
    mask = cv2.inRange(hsv, np.array([20, 100, 100]), np.array([35, 255, 255]))
    cv2.imshow('mask', mask)
    if cv2.waitKey(1) == ord('q'): break
cap.release()
cv2.destroyAllWindows()
"
"Move command does nothing / servos twitch"
- Action keys need the arm_ prefix: arm_shoulder_pan, not shoulder_pan

Debug mode - saves camera view:
python gemini_task_executor.py "task" --debug
# Saves to /tmp/lerobot_gemini_view.png
Multiple cameras:
python gemini_task_executor.py "task" --camera 1 # USB camera
Custom integration:
from gemini_task_executor import GeminiVisionClient, LeKiwiAPI

vision = GeminiVisionClient(genai)
objects = vision.locate_objects(image, ["red block", "blue bowl"])
plan = vision.generate_plan(image, instruction, objects)
api = LeKiwiAPI(robot)
execute_plan(api, plan)