Meet Pico.

Your AI companion that sees, hears, and feels.

An emotionally responsive desktop robot that communicates like a pet — through chirps, expressions, and movement.

Core Concept

What Is Pico?

More than a smart speaker. A companion that truly responds.

“Unlike smart speakers that just answer questions, Pico behaves like a living pet. It's a non-verbal AI companion that communicates through expressive sounds, animated eyes, and head movements.”

It Sees You

Face detection and recognition. Knows your face, remembers you.

It Hears You

Wake-word detection and speech-to-text. Understands your commands.

It Feels Touch

Capacitive touch sensor. Pet it and it purrs.

IDLE

Default state

Subtle breathing sounds

Personality System

A Personality That Breathes

Pico's Emotion Engine is a state machine that processes sensor inputs and transitions between emotions, much as a living creature reacts to its surroundings.

Emotion States

How It Works

→ Sensory inputs (camera, mic, touch) trigger emotion transitions
→ Each state has its own eye expression, sounds, and head movement
→ Transitions are smooth, with a half-blink between expressions
→ Idle behaviors like random blinking and pupil drift add life
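
A minimal sketch of how such a state machine could look in Python, the language used for the PC simulation. Only IDLE is named on this page; the other states, the event names, and the transition table are hypothetical and chosen purely for illustration.

```python
from enum import Enum, auto


class Emotion(Enum):
    IDLE = auto()      # the default state named on this page
    HAPPY = auto()     # hypothetical
    CURIOUS = auto()   # hypothetical
    SLEEPY = auto()    # hypothetical


# Hypothetical (current state, event) -> next state table.
TRANSITIONS = {
    (Emotion.IDLE, "face_seen"): Emotion.CURIOUS,
    (Emotion.IDLE, "touched"): Emotion.HAPPY,
    (Emotion.IDLE, "timeout"): Emotion.SLEEPY,
    (Emotion.CURIOUS, "touched"): Emotion.HAPPY,
    (Emotion.CURIOUS, "face_lost"): Emotion.IDLE,
    (Emotion.HAPPY, "timeout"): Emotion.IDLE,
    (Emotion.SLEEPY, "touched"): Emotion.HAPPY,
    (Emotion.SLEEPY, "wake_word"): Emotion.CURIOUS,
}


class EmotionEngine:
    def __init__(self):
        self.state = Emotion.IDLE
        self.log = []  # records each expression change for inspection

    def handle(self, event: str) -> Emotion:
        """Apply one sensory event; unknown events leave the state as is."""
        nxt = TRANSITIONS.get((self.state, event), self.state)
        if nxt is not self.state:
            # A half-blink bridges the old and the new eye expression.
            self.log.append(("half_blink", self.state.name, nxt.name))
            self.state = nxt
        return self.state
```

Keeping the transitions in a plain table rather than in nested conditionals should make the same logic easier to carry over to the firmware later.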

Capabilities

What Pico Can Do

A complete AI system packed into a tiny companion.

AI-1

Wake Word Detection

"Hey Pico" trigger via ESP-SR, with offline fallback.

AI-2

Speech-to-Text

Google Cloud Speech-to-Text, with 60 minutes per month on the free tier.

AI-3

Natural Language Processing

Google Gemini for conversational AI and contextual responses.

AI-4

Sound Bank Communication

20+ expressive WAV files — no TTS, pure personality-driven audio.
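
One way a sound bank like this might be indexed, as a rough sketch. It assumes clips are named `<emotion>_<n>.wav` inside a single folder; that naming scheme and the `SoundBank` class are assumptions made for illustration, and playback itself is left out.

```python
import random
from collections import defaultdict
from pathlib import Path


class SoundBank:
    """Indexes WAV clips by the emotion prefix in their file name."""

    def __init__(self, folder, rng=None):
        self.rng = rng or random.Random()
        self.clips = defaultdict(list)
        for path in sorted(Path(folder).glob("*.wav")):
            # "happy_2.wav" -> emotion "happy" (assumed naming scheme)
            emotion = path.stem.rsplit("_", 1)[0]
            self.clips[emotion].append(path)

    def pick(self, emotion):
        """Return a random clip for the emotion, or None if there is none."""
        options = self.clips.get(emotion)
        return self.rng.choice(options) if options else None
```

Picking at random among several clips per emotion keeps repeated reactions from sounding identical.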

Development Methodology

Software-First Approach

Perfect your AI on a PC before spending a single rupee on hardware.

Simulate on PC

Develop and test all of the AI functionality on your laptop using Python. Your laptop's camera, microphone, and speakers stand in for the robot's hardware. Zero hardware cost to start.

Python 3.11 + OpenCV + face_recognition
Google Cloud Speech-to-Text + Gemini AI
Sound Bank with 20+ expressive WAV files
Full Emotion Engine state machine
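
A sketch of how the simulation could keep robot logic independent of the hardware: the behavior is written against small interfaces, and PC stand-ins implement them. All class names and the `purr.wav` clip name are hypothetical.

```python
from typing import Optional, Protocol


class TouchSensor(Protocol):
    def is_touched(self) -> bool: ...


class Speaker(Protocol):
    def play(self, clip: str) -> None: ...


class KeyboardTouch:
    """PC stand-in: a key press (set by the simulator) counts as a touch."""

    def __init__(self):
        self.pressed = False

    def is_touched(self) -> bool:
        return self.pressed


class LoggingSpeaker:
    """PC stand-in: records clip names instead of driving an amplifier."""

    def __init__(self):
        self.played = []

    def play(self, clip: str) -> None:
        self.played.append(clip)


def react_to_touch(sensor: TouchSensor, speaker: Speaker) -> Optional[str]:
    """Shared logic: purr when petted. Knows nothing about the backend."""
    if sensor.is_touched():
        speaker.play("purr.wav")  # hypothetical clip name
        return "purr.wav"
    return None
```

With this split, only the stand-in classes would need replacing when the logic is later ported to the robot.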

Port to Hardware

Translate validated Python logic to C++ (Arduino framework). Upload firmware to the ESP32-S3-EYE via USB. Same algorithms, embedded performance.

Arduino IDE + ESP-IDF toolchain
ESP-WHO face detection on-device
ESP-SR wake-word recognition
I2S audio + OLED display drivers

Integrate & Build

Assemble the physical robot with 3D-printed shell, servo head tracking, touch sensor, and speaker. Total hardware cost under ₹10,000.

ESP32-S3-EYE + 0.96" OLED display
Pan-tilt servo head (2× SG90)
MAX98357A I2S amp + speaker
3D-printed magnetic shell

Hardware Platform

Built on ESP32-S3

Everything you need, at an accessible price point.

₹4,810 – ₹6,760

Total estimated cost • Indian market pricing • Verified suppliers

ESP32-S3-EYE

Dual-core 240MHz processor
8MB PSRAM + 8MB Flash
Built-in 2MP camera + microphone

₹4,200 – ₹5,500

0.96″ OLED (SSD1306)

128×64 I2C display
Self-emissive — no backlight
Perfect for expressive eye animation

₹150 – ₹300
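
The SSD1306 stores its 128×64 pixels as 8 pages of 128 bytes, where each byte covers 8 vertically stacked pixels with the least significant bit at the top. The sketch below draws a filled elliptical eye and packs it into that layout; the eye geometry is illustrative only, and sending the buffer over I2C is left out.

```python
WIDTH, HEIGHT = 128, 64


def draw_eye(cx, cy, rx, ry):
    """Return a HEIGHT x WIDTH grid of 0/1 pixels with one filled ellipse."""
    grid = [[0] * WIDTH for _ in range(HEIGHT)]
    for y in range(HEIGHT):
        for x in range(WIDTH):
            if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0:
                grid[y][x] = 1
    return grid


def pack_ssd1306(grid):
    """Pack pixels into 8 pages x 128 columns, LSB = top pixel of each page."""
    buf = bytearray(WIDTH * HEIGHT // 8)
    for y in range(HEIGHT):
        for x in range(WIDTH):
            if grid[y][x]:
                buf[(y // 8) * WIDTH + x] |= 1 << (y % 8)
    return bytes(buf)


# A wide-open eye and a half-closed one (smaller vertical radius).
open_eye = pack_ssd1306(draw_eye(cx=64, cy=32, rx=20, ry=24))
half_blink = pack_ssd1306(draw_eye(cx=64, cy=32, rx=20, ry=8))
```

Each packed frame is 1,024 bytes, so a handful of precomputed expressions fits comfortably in memory.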

2× SG90 Micro Servo

180° rotation range
1.8 kg·cm torque
Pan-tilt head tracking assembly

₹200 – ₹400
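
A rough sketch of the head-tracking math for a pan-tilt pair like this. The 320×240 frame size, the gain, and the 500 to 2500 µs pulse range are assumptions; real servos vary and should be calibrated, and the correction signs depend on how the servos are mounted.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def track_step(pan, tilt, face_x, face_y, frame_w=320, frame_h=240, gain=20.0):
    """Nudge pan/tilt (degrees) so the face drifts toward the frame center.

    gain is the correction, in degrees, applied when the face sits at the
    very edge of the frame.
    """
    err_x = (face_x - frame_w / 2) / (frame_w / 2)   # -1.0 .. 1.0
    err_y = (face_y - frame_h / 2) / (frame_h / 2)
    pan = clamp(pan - gain * err_x, 0.0, 180.0)
    tilt = clamp(tilt + gain * err_y, 0.0, 180.0)
    return pan, tilt


def angle_to_pulse_us(angle, min_us=500, max_us=2500):
    """Linear map from 0-180 degrees to a servo pulse width in microseconds."""
    angle = clamp(angle, 0.0, 180.0)
    return round(min_us + (max_us - min_us) * angle / 180.0)
```

Calling `track_step` once per camera frame gives a simple proportional controller; a smaller gain trades responsiveness for smoother motion.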

MAX98357A Amplifier

I2S digital audio interface
3W output power
No DAC required — direct ESP32 connection

₹180 – ₹350

3W Speaker (40mm)

4Ω impedance
Full-range driver
Clear sound for chirps and beeps

₹50 – ₹150
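
A chirp of the kind a small speaker like this plays well can be synthesized with nothing but the Python standard library. This is only a sketch: the frequencies, duration, and sample rate are illustrative guesses, not values taken from the project's sound bank.

```python
import math
import struct
import wave


def write_chirp(path, f_start=800.0, f_end=2400.0, seconds=0.25, rate=16000):
    """Write a mono 16-bit WAV whose pitch sweeps linearly upward."""
    n = int(seconds * rate)
    frames = bytearray()
    phase = 0.0
    for i in range(n):
        t = i / n
        freq = f_start + (f_end - f_start) * t
        phase += 2 * math.pi * freq / rate
        # Fade in and out so the clip does not click at its edges.
        envelope = math.sin(math.pi * t)
        sample = int(0.6 * 32767 * envelope * math.sin(phase))
        frames += struct.pack("<h", sample)
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n
```

For example, `write_chirp("chirp.wav")` writes a quarter-second rising chirp to the current directory.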

Touch Sensor (TTP223)

Capacitive touch detection
Single-pin digital output
Pet-petting interaction trigger

₹30 – ₹60
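
The component ranges above can be totalled with a few lines of Python. The sketch assumes the servo price covers both servos and leaves out anything not itemized on this page, such as the 3D-printed shell, wiring, and power supply.

```python
# (low, high) price ranges in rupees, copied from the component list above.
PARTS = {
    "ESP32-S3-EYE": (4200, 5500),
    "0.96 inch OLED (SSD1306)": (150, 300),
    "2x SG90 micro servo": (200, 400),
    "MAX98357A amplifier": (180, 350),
    "3W speaker (40mm)": (50, 150),
    "Touch sensor (TTP223)": (30, 60),
}

total_low = sum(low for low, _ in PARTS.values())
total_high = sum(high for _, high in PARTS.values())
print(f"₹{total_low:,} – ₹{total_high:,}")  # prints "₹4,810 – ₹6,760"
```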

Technology

The Tech Inside

A complete AI stack — from Python simulation to embedded firmware.

PC Simulation

Phase 1 — Python 3.11

OpenCV: Camera & image processing
face-recognition: Face detection & identification
sounddevice: Audio recording & playback
Google Cloud STT: Speech recognition (free tier)
Google Gemini: Conversational AI

Cloud Services

Free Tier APIs

Speech-to-Text: 60 min/month free
Gemini AI: 60 req/minute free
WiFi Bridge: ESP32 ↔ Cloud
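
To stay inside a per-minute request quota like the one listed above, the client can throttle itself before calling the cloud. Below is a minimal sliding-window sketch; the `RateLimiter` class is hypothetical, and the 60-per-minute default simply mirrors the figure on this page, which may change.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most max_calls per period seconds."""

    def __init__(self, max_calls=60, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()  # timestamps of recent accepted calls

    def try_acquire(self) -> bool:
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```

When `try_acquire()` returns False, the robot could fall back to a non-verbal reaction from the sound bank instead of waiting on the cloud.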

Robot Firmware

Phase 2 — C++ (Arduino)

ESP-WHO: On-device face detection
ESP-SR: Wake-word recognition
FreeRTOS: Real-time OS multitasking
I2S Audio: Digital audio output
SSD1306 Driver: OLED display rendering

Ready to Build?

Start with zero hardware cost. Perfect your AI on your laptop, then bring it to life on the ESP32.

1. Clone the PICO repository

2. Install Python dependencies

3. Run the robot simulator on your PC

4. Start modifying the Emotion Engine
