Sound Design
How Pico communicates through sound
PICO Sound Bank Guide
Document Type: Technical Implementation Guide
Version: 1.0
Date: December 5, 2025
Status: Production-Ready
Purpose: Complete guide for creating and managing PICO's non-verbal sound communication system
Table of Contents
- Overview
- Sound Bank Structure
- Technical Specifications
- Sound Design Philosophy
- Creation Methods
- Required Sounds List
- Processing Workflow
- Testing & Quality Control
- Integration Guide
- Sound Mapping Configuration
Overview
PICO's personality is expressed entirely through non-verbal sounds. This guide provides comprehensive instructions for creating a complete sound library that makes PICO feel alive, expressive, and emotionally engaging.
Goal: Create 20-30 high-quality sound effects that convey emotion and personality without using human speech.
Design Philosophy: PICO is a non-verbal pet companion that communicates through pre-recorded sound effects (.wav files) like chirps, purrs, and whistles - similar to R2-D2, Wall-E, or Pokemon.
Sound Bank Structure
Complete Directory Structure
pico_robot/
├── assets/
│   └── sounds/
│       ├── README.md  # Sound Bank documentation
│       ├── ATTRIBUTION.md  # Source credits for sounds
│       │
│       ├── emotional/  # Emotional expression sounds
│       │   ├── happy_chirp_01.wav  # Primary happiness sound
│       │   ├── happy_chirp_02.wav  # Variation 1
│       │   ├── happy_chirp_03.wav  # Variation 2
│       │   ├── curious_hum_01.wav  # Questioning/investigation
│       │   ├── curious_hum_02.wav  # Variation
│       │   ├── sad_whimper_01.wav  # Disappointment
│       │   ├── excited_whistle_01.wav  # High energy
│       │   ├── excited_whistle_02.wav  # Variation
│       │   ├── loved_purr_01.wav  # Contentment (continuous)
│       │   ├── sleepy_yawn_01.wav  # Low energy/tired
│       │   ├── playful_giggle_01.wav  # During active play
│       │   ├── surprised_beep_01.wav  # Unexpected event
│       │   └── confused_warble_01.wav  # Didn't understand
│       │
│       ├── reactions/  # System response sounds
│       │   ├── greeting_beep.wav  # Hello sound
│       │   ├── greeting_melody.wav  # Morning greeting variation
│       │   ├── acknowledgment_chirp.wav  # "I heard you"
│       │   ├── success_ding.wav  # Command successful
│       │   ├── error_buzz.wav  # Something went wrong
│       │   ├── listening_bing.wav  # Wake word detected
│       │   ├── goodbye_chime.wav  # Farewell sound
│       │   ├── notification_ping.wav  # General alert
│       │   └── alert_urgent.wav  # Important notification
│       │
│       ├── ambient/  # Background/continuous sounds
│       │   ├── startup_beep.wav  # Power-on sequence
│       │   ├── thinking_hum.wav  # Processing indicator
│       │   ├── idle_breathing.wav  # Subtle life presence
│       │   ├── working_hum.wav  # During long tasks
│       │   ├── night_lullaby.wav  # Bedtime sound
│       │   └── morning_chirp.wav  # Time-based greeting
│       │
│       └── test/  # Test and calibration sounds
│           ├── test_tone_440hz.wav  # Calibration tone
│           ├── test_sweep.wav  # Frequency sweep test
│           └── volume_reference.wav  # Volume calibration
Naming Conventions
Pattern: emotion_type_variation.wav
Examples:
- happy_chirp_01.wav - Happy emotion, chirp type, variation 1
- curious_hum_01.wav - Curious emotion, hum type, variation 1
- success_ding.wav - Success reaction, ding type (no variation number if only one)
Rules:
- All lowercase
- Underscore separated
- No spaces or special characters
- Variation numbers: 01, 02, 03 (zero-padded)
- Descriptive type (chirp, hum, beep, ding, buzz, whistle, purr, etc.)
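The rules above can be checked mechanically. The following is a minimal sketch, not part of the PICO codebase: `is_valid_sound_name` is a hypothetical helper, and it assumes that digits are allowed inside a name part (as in test_tone_440hz.wav) while a purely numeric final part is a variation number.

```python
import re

# Hypothetical helper (not from the PICO project): one lowercase alphanumeric name part.
_PART = re.compile(r"[a-z0-9]+")


def is_valid_sound_name(filename: str) -> bool:
    """Return True if filename follows the lowercase, underscore-separated pattern."""
    if not filename.endswith(".wav"):
        return False
    stem = filename[: -len(".wav")]
    parts = stem.split("_")
    # At least two parts (e.g. emotion + type), each lowercase letters/digits only.
    if len(parts) < 2 or not all(_PART.fullmatch(p) for p in parts):
        return False
    # Assumption: a purely numeric last part is a variation number and must be
    # zero-padded to two digits (01, 02, 03).
    last = parts[-1]
    if last.isdigit() and len(last) != 2:
        return False
    return True


if __name__ == "__main__":
    for name in ["happy_chirp_01.wav", "success_ding.wav", "Happy Chirp.wav"]:
        print(name, is_valid_sound_name(name))
```

A check like this could run alongside the verification script so that misnamed files are caught before they reach the robot.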
Technical Specifications
Required Audio Format
All sounds MUST meet these specifications:
| Property | Value | Reason |
|----------|-------|--------|
| Format | WAV (uncompressed) | No decoding overhead, instant playback |
| Sample Rate | 16kHz | Optimal for ESP32, good quality, low file size |
| Bit Depth | 16-bit | Standard quality, efficient storage |
| Channels | Mono | Single speaker, reduces file size by 50% |
| Duration | 0.3 - 3.0 seconds | Quick responses, not too short/long |
| Max File Size | ~100KB per sound | Memory constraints on ESP32 |
Why These Specs?
- 16kHz: The ESP32-S3 can handle higher, but 16kHz is the sweet spot for:
  - Clear sound quality for beeps/chirps
  - Efficient memory usage (3 sec @ 16kHz = ~96KB)
  - Fast playback startup
- Mono: PICO has one speaker, stereo is wasted
- WAV format: No MP3 decoding lag, instant response
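The ~96KB figure follows directly from the format: samples per second times bytes per sample times channels times duration. A small sketch of that arithmetic is below; the function names are illustrative, and the 44-byte header is the size of a minimal PCM WAV header (files with extra metadata chunks will be slightly larger).

```python
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2   # 16-bit
CHANNELS = 1           # mono
WAV_HEADER_BYTES = 44  # minimal PCM WAV header; extra chunks add more


def pcm_bytes(duration_s, sample_rate=SAMPLE_RATE_HZ,
              bytes_per_sample=BYTES_PER_SAMPLE, channels=CHANNELS):
    """Size of the raw audio data for a clip of the given duration."""
    return round(duration_s * sample_rate) * bytes_per_sample * channels


def wav_file_bytes(duration_s, **kwargs):
    """Approximate on-disk size: audio data plus a minimal header."""
    return pcm_bytes(duration_s, **kwargs) + WAV_HEADER_BYTES


if __name__ == "__main__":
    print(pcm_bytes(3.0))              # 3 s of 16 kHz, 16-bit mono audio
    print(pcm_bytes(3.0, channels=2))  # the same clip in stereo is twice the size
```

This also shows why the 3.0 second duration limit and the ~100KB size limit are effectively the same constraint at this format.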
Sound Design Philosophy
What Makes a Good PICO Sound?
PICO sounds should be:
- Emotionally Clear: Listeners should immediately understand the emotion
- Non-Human: Should not sound like words or human vocalization
- Endearing: Cute, lovable, not annoying or harsh
- Distinctive: Each sound should be recognizable
- Varied: Multiple variations of common sounds prevent monotony
Inspiration Sources
- R2-D2 (Star Wars): Electronic beeps with personality
- Wall-E: Expressive robotic sounds with emotional range
- Pokemon: Simple syllable-based communication
- BB-8: Rolling beeps and excited chirps
- Portal Turrets: Friendly mechanical sounds
Emotional Sound Characteristics
| Emotion | Pitch | Speed | Complexity | Example Pattern |
|---------|-------|-------|------------|-----------------|
| Happy | High, rising | Fast | Simple | ↗️ "bee-beep!" |
| Sad | Low, falling | Slow | Simple | ↘️ "wooo-omp" |
| Curious | Medium, wavering | Medium | Rising question | ↗️? "bee?-oop?" |
| Excited | Very high | Very fast | Multiple notes | ↗️↗️↗️ "beepbeepbeep!" |
| Confused | Wobbly, uncertain | Variable | Dissonant | ~~ "woo-blee-orp?" |
| Loved/Content | Low, rumbling | Slow, steady | Warm, continuous | ~~~~ "purrrr" |
| Sleepy | Low, fading | Very slow | Descending | ↘️... "yaaawwwn" |
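To make the "rising" and "falling" pitch columns concrete, here is a rough sketch of a linear pitch glide generated sample by sample. The `CONTOURS` values are illustrative assumptions chosen to match the table's descriptions, not tuned PICO settings, and `sweep` is a hypothetical helper.

```python
import math

SAMPLE_RATE = 16_000

# Illustrative contours only (assumed values): (start Hz, end Hz, duration s).
CONTOURS = {
    "happy": (800.0, 1200.0, 0.3),  # high, rising, fast
    "sad": (400.0, 200.0, 0.9),     # low, falling, slow
}


def sweep(start_hz, end_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return float samples in [-1, 1] for a sine whose pitch glides linearly."""
    n = round(duration_s * sample_rate)
    samples = []
    phase = 0.0
    for i in range(n):
        # Instantaneous frequency moves linearly from start_hz toward end_hz.
        freq = start_hz + (end_hz - start_hz) * (i / n)
        # Accumulating phase keeps the waveform continuous as the pitch changes.
        phase += 2.0 * math.pi * freq / sample_rate
        samples.append(math.sin(phase))
    return samples


if __name__ == "__main__":
    happy = sweep(*CONTOURS["happy"])
    print(len(happy), "samples")
```

Phase accumulation is used instead of `sin(2*pi*f*t)` with a changing `f`, because the latter produces clicks and wrong pitches when the frequency varies over time.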
Creation Methods
Method 1: Software Synthesis (Easiest)
Tools:
- Bfxr (https://www.bfxr.net/) - Free, web-based, perfect for game sounds
- ChipTone (https://sfbgames.itch.io/chiptone) - Retro chiptune generator
- Audacity - Free audio editor with tone generator
Bfxr Workflow for Happy Chirp:
1. Open Bfxr in browser
2. Click "Pickup/Coin" preset
3. Adjust parameters:
- Base Frequency: 800-1000 Hz
- Slide: +0.3 (rising pitch)
- Duration: 0.2-0.4 seconds
- Wave Type: Square or Sawtooth
4. Preview until it sounds cheerful
5. Export as WAV
6. Post-process in Audacity (see Processing Workflow)
Audacity Tone Generator for Beeps:
1. Generate > Tone
2. Waveform: Sine (smooth) or Square (electronic)
3. Frequency: 440-880 Hz
4. Amplitude: 0.5
5. Duration: 0.3 seconds
6. Apply envelope (Attack: 0.01s, Decay: 0.02s)
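For readers who prefer scripting over the Audacity UI, the same kind of beep can be sketched with Python's standard library. This is an approximation under stated assumptions: "decay" is interpreted here as a linear fade-out at the end of the tone, 660 Hz is just a value inside the 440-880 Hz range, and both function names are hypothetical.

```python
import math
import struct
import wave


def tone_with_envelope(freq_hz=660.0, duration_s=0.3, amplitude=0.5,
                       attack_s=0.01, decay_s=0.02, sample_rate=16_000):
    """Sine tone with a linear fade-in (attack) and fade-out (decay)."""
    n = round(duration_s * sample_rate)
    attack_n = max(1, round(attack_s * sample_rate))
    decay_n = max(1, round(decay_s * sample_rate))
    out = []
    for i in range(n):
        # Gain ramps 0 -> 1 over the attack and 1 -> 0 over the final decay.
        gain = min(1.0, i / attack_n, (n - 1 - i) / decay_n)
        out.append(amplitude * gain * math.sin(2.0 * math.pi * freq_hz * i / sample_rate))
    return out


def write_wav(path, samples, sample_rate=16_000):
    """Write float samples in [-1, 1] as a 16-bit mono PCM WAV file."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(frames)
```

Calling `write_wav("greeting_beep.wav", tone_with_envelope())` would produce a file that already matches the required format, so only the listening tests remain. The short fades matter: a tone that starts or stops abruptly produces an audible click on small speakers.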
Method 2: Voice Modulation (Most Expressive)
Tools:
- Audacity with pitch shift
- Adobe Audition (paid, but powerful)
- Your voice + effects
Workflow:
1. Record yourself making sounds:
- "Bee-boop!" (happy)
- "Hmmmm?" (curious)
- "Awwww" (sad)
- "Wheeee!" (excited)
2. In Audacity:
- Effect > Change Pitch: +800-1200% (makes it robotic/cute)
- Effect > Change Speed: +25-50% (makes it snappier)
- Effect > Equalization: Boost 800-2000Hz (electronic tone)
- Effect > Reverb: Small room, 10% (adds space)
3. Trim to 0.5-1.5 seconds
4. Normalize to -3dB
This method creates the most "alive" sounds because they have natural human emotional inflection, just pitched up to sound robotic.
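When choosing a pitch-shift amount it helps to see what a percentage means musically. The sketch below assumes the percentage is a change in frequency (so +100% doubles the frequency, which is one octave up); `percent_to_semitones` is an illustrative helper, not an Audacity API.

```python
import math


def percent_to_semitones(percent_change):
    """Convert a percentage change in frequency to a shift in semitones."""
    ratio = 1.0 + percent_change / 100.0
    if ratio <= 0.0:
        raise ValueError("percent_change must be greater than -100")
    # 12 semitones per octave, and one octave is a doubling of frequency.
    return 12.0 * math.log2(ratio)


if __name__ == "__main__":
    for pct in (50.0, 100.0, 800.0, 1200.0):
        print(f"{pct:+.0f}% -> {percent_to_semitones(pct):.1f} semitones")
```

Under that assumption, +800% is a ninefold frequency increase, a little over three octaves. It is worth previewing smaller shifts first and increasing until the voice stops sounding human.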
Method 3: Sampling & Editing (Professional)
Sources:
- Freesound.org - Huge library of Creative Commons sounds
- Zapsplat.com - Free sound effects
- BBC Sound Effects - Large archive, free for personal, educational, or research use under the RemArc licence (not public domain)
- Field recording - Record real objects
Search terms:
- "robot beep"
- "electronic chirp"
- "game pickup"
- "toy sound"
- "notification"
Workflow:
1. Download promising sounds
2. Import to Audacity
3. Cut to best 1-2 seconds
4. Remove background noise (Effect > Noise Reduction)
5. Adjust pitch/speed to fit PICO's character
6. Apply filters:
- High-pass filter @ 200Hz (remove rumble)
- Boost 1-2kHz (clarity)
7. Normalize to -3dB
8. Export as WAV
Method 4: Hybrid Approach (Recommended)
Combine methods for best results:
Base Sound (Bfxr) + Voice Expression (Audacity) + Polish (Effects)
Example - Creating "Loved Purr":
1. Generate low sine wave in Bfxr (100-200Hz)
2. Record yourself making "prrrr" sound
3. Layer them in Audacity (50% each)
4. Add a slight wobble (Effect > Wahwah, LFO Frequency: 4Hz)
5. Result: Warm, mechanical purr with organic feel
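The layering step can also be expressed in code. The sketch below mixes two sample lists at 50% each and then applies a slow 4 Hz amplitude wobble. Note that this is plain amplitude modulation (tremolo), used here as a simple stand-in; it is not what Audacity's Wahwah effect does internally. Function names and the `depth` default are assumptions.

```python
import math


def mix(a, b, gain_a=0.5, gain_b=0.5):
    """Layer two sample lists; the result is as long as the shorter input."""
    n = min(len(a), len(b))
    return [gain_a * a[i] + gain_b * b[i] for i in range(n)]


def tremolo(samples, rate_hz=4.0, depth=0.3, sample_rate=16_000):
    """Slow amplitude wobble: gain varies between 1 - depth and 1."""
    out = []
    for i, s in enumerate(samples):
        # Low-frequency oscillator scaled to the range 0..1.
        lfo = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate_hz * i / sample_rate))
        out.append(s * (1.0 - depth * lfo))
    return out


if __name__ == "__main__":
    sr = 16_000
    low_sine = [math.sin(2.0 * math.pi * 150.0 * i / sr) for i in range(sr)]
    voice = [0.0] * sr  # placeholder for a recorded "prrrr" track
    purr = tremolo(mix(low_sine, voice))
    print(len(purr), "samples")
```

Mixing at 50% each guarantees the sum cannot exceed full scale when both inputs are within [-1, 1].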
Required Sounds List
Priority 1: Essential Sounds (Must Have)
Emotional Expressions:
1. happy_chirp_01.wav - Primary happiness sound
2. happy_chirp_02.wav - Variation to avoid repetition
3. curious_hum_01.wav - Questioning/investigation
4. sad_whimper_01.wav - Disappointment/sadness
5. excited_whistle_01.wav - High energy excitement
6. loved_purr_01.wav - Contentment when petted
System Responses:
7. listening_bing.wav - Wake word detected
8. acknowledgment_chirp.wav - "I heard you"
9. success_ding.wav - Command executed successfully
10. error_buzz.wav - Something went wrong
Ambient:
11. startup_beep.wav - Power on sequence
12. thinking_hum.wav - Processing indicator
Priority 2: Enhanced Personality (Recommended)
More Emotions:
13. sleepy_yawn_01.wav - Low battery/idle too long
14. confused_warble_01.wav - Didn't understand
15. surprised_beep_01.wav - Unexpected face detected
16. greeting_melody.wav - Morning greeting
Interactions:
17. playful_giggle_01.wav - During active interaction
18. goodbye_chime.wav - User leaving
19. notification_ping.wav - Alert sound
20. alert_urgent.wav - Important notification
Priority 3: Advanced Features (Optional)
Contextual:
21. morning_chirp.wav - Time-based greeting
22. night_lullaby.wav - Bedtime sound
23. working_hum.wav - During long tasks
24. idle_breathing.wav - Subtle life presence
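A quick way to track progress against this list is to check which Priority 1 files are still missing from the sound directory. This is a minimal sketch: `missing_sounds` is a hypothetical helper, and the category folders are taken from the directory structure shown earlier.

```python
from pathlib import Path

# Priority 1 files from the list above, grouped by category folder.
PRIORITY_1 = {
    "emotional": [
        "happy_chirp_01.wav", "happy_chirp_02.wav", "curious_hum_01.wav",
        "sad_whimper_01.wav", "excited_whistle_01.wav", "loved_purr_01.wav",
    ],
    "reactions": [
        "listening_bing.wav", "acknowledgment_chirp.wav",
        "success_ding.wav", "error_buzz.wav",
    ],
    "ambient": ["startup_beep.wav", "thinking_hum.wav"],
}


def missing_sounds(sounds_dir, required=PRIORITY_1):
    """Return 'category/filename' entries that do not exist under sounds_dir."""
    root = Path(sounds_dir)
    return [
        f"{category}/{name}"
        for category, names in required.items()
        for name in names
        if not (root / category / name).is_file()
    ]


if __name__ == "__main__":
    for entry in missing_sounds("assets/sounds"):
        print("missing:", entry)
```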
Processing Workflow
Universal Processing Pipeline
Every sound should go through these steps in Audacity:
Step 1: Import & Trim
1. File > Import > Audio
2. Select the best 0.5-3 second segment
3. Edit > Remove Audio or Trim Audio
Step 2: Noise Reduction (if recorded)
1. Select silent part (if any)
2. Effect > Noise Reduction > Get Noise Profile
3. Select all (Ctrl+A)
4. Effect > Noise Reduction > OK (Reduction: 12dB)
Step 3: EQ (Frequency Shaping)
Effect > Filter Curve EQ:
- High-pass at 200Hz (remove rumble)
- Gentle boost at 800-2000Hz (clarity and presence)
- Roll-off above 8kHz (a 16kHz sample rate cannot represent anything above 8kHz, so focus on mids)
Presets:
- Happy sounds: Boost 1-2kHz (bright)
- Sad sounds: Boost 200-400Hz (warm, low)
- Electronic sounds: Notch at 1kHz, boost 2-4kHz (metallic)
Step 4: Dynamics
Effect > Compressor:
- Threshold: -12dB
- Ratio: 3:1
- Attack: 0.01s
- Release: 0.2s
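This reduces dynamic range: peaks above the threshold are turned down, which leaves room to raise the overall level (and with it the quiet parts) without clipping.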
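The threshold and ratio settings define a simple input-to-output level curve, sketched below. This shows only the static curve; it ignores attack, release, and make-up gain, and `compressed_level_db` is an illustrative name.

```python
def compressed_level_db(input_db, threshold_db=-12.0, ratio=3.0):
    """Static compressor curve: output level in dB for a given input level."""
    if input_db <= threshold_db:
        return input_db  # below the threshold the signal passes unchanged
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio


if __name__ == "__main__":
    for level in (-20.0, -12.0, -6.0, 0.0):
        print(f"{level:6.1f} dB in -> {compressed_level_db(level):6.1f} dB out")
```

With these settings a full-scale peak at 0 dB comes out at -8 dB, while anything at or below -12 dB is untouched.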
Step 5: Normalization
Effect > Normalize:
- Normalize peak amplitude to: -3.0 dB
- ✅ Normalize stereo channels independently (if stereo)
This ensures consistent volume across all sounds.
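Peak normalization is a single gain applied to the whole clip so that its largest sample lands at the target level. A minimal sketch, assuming float samples in [-1, 1] and a hypothetical `normalize_peak` helper:

```python
def normalize_peak(samples, target_db=-3.0):
    """Scale samples so the largest absolute value sits at target_db (dBFS)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or empty input) cannot be normalized
    # Convert the dB target to a linear amplitude: -3 dB is roughly 0.708.
    target_amplitude = 10.0 ** (target_db / 20.0)
    gain = target_amplitude / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    quiet = [0.1, -0.2, 0.05]
    print(max(abs(s) for s in normalize_peak(quiet)))
```

Because this matches peaks rather than perceived loudness, two normalized sounds can still differ in how loud they seem, which is why the listening tests remain necessary.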
Step 6: Convert to Mono
Tracks > Mix > Mix Stereo Down to Mono
Step 7: Resample to 16kHz
Tracks > Resample:
- New sample rate: 16000 Hz
Step 8: Export
File > Export > Export Audio:
- Format: WAV (Microsoft) signed 16-bit PCM
- Filename: descriptive_name.wav
Testing & Quality Control
Listening Tests
Before finalizing sounds, test them:
- Emotional Clarity Test
  - Play sound without context
  - Ask: "What emotion is this?"
  - Should be immediately obvious
- Volume Consistency Test
  - Play all sounds in sequence
  - All should be roughly the same loudness
  - Use loudness meter (Audacity: Analyze > Contrast)
- Speaker Test
  - Test on small speaker (similar to PICO's)
  - Sounds that are too bass-heavy won't work
  - Laptop speakers are a good reference
- Repetition Test
  - Play same sound 10 times
  - Does it get annoying? (if yes, redesign)
  - Is there variation if applicable?
- Context Test
  - Play in sequence: happy > sad > curious
  - Transitions should feel natural
  - Each should be distinct
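The volume consistency test can be backed by a rough numeric check: compute each sound's RMS level and flag the ones that sit far from the rest. This is a sketch under assumptions: RMS is only a crude proxy for perceived loudness, the 3 dB tolerance is an arbitrary starting point, and both function names are hypothetical.

```python
import math
import statistics


def rms_dbfs(samples):
    """RMS level of float samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def loudness_outliers(levels_db, tolerance_db=3.0):
    """Names whose level differs from the median by more than tolerance_db."""
    median = statistics.median(levels_db.values())
    return sorted(
        name for name, level in levels_db.items()
        if abs(level - median) > tolerance_db
    )


if __name__ == "__main__":
    levels = {"happy_chirp_01": -12.0, "sad_whimper_01": -13.0, "error_buzz": -20.0}
    print(loudness_outliers(levels))
```

Anything this flags is a candidate for a second listen, not an automatic failure; a purr is supposed to be quieter than an alert.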
Technical Verification
# verify_sounds.py
import os

import soundfile as sf


def verify_sound_files(sounds_dir='assets/sounds/'):
    """Verify all sound files meet specifications"""
    issues = []
    for root, dirs, files in os.walk(sounds_dir):
        for file in files:
            if not file.endswith('.wav'):
                continue
            path = os.path.join(root, file)
            file_issues = []  # Issues for this file only, so "OK" is reported per file
            try:
                data, samplerate = sf.read(path)
                duration = len(data) / samplerate
                channels = 1 if len(data.shape) == 1 else data.shape[1]
                peak = abs(data).max()
                # Check specifications
                if samplerate != 16000:
                    file_issues.append(f"❌ {file}: Wrong sample rate ({samplerate}Hz, should be 16000Hz)")
                if channels != 1:
                    file_issues.append(f"❌ {file}: Not mono ({channels} channels)")
                if duration > 3.0:
                    file_issues.append(f"⚠️ {file}: Too long ({duration:.2f}s, max 3.0s)")
                if duration < 0.2:
                    file_issues.append(f"⚠️ {file}: Very short ({duration:.2f}s)")
                # Check for clipping
                if peak > 0.99:
                    file_issues.append(f"⚠️ {file}: Possible clipping detected")
                # Check for silence
                if peak < 0.01:
                    file_issues.append(f"❌ {file}: Too quiet or silent")
                if not file_issues:
                    print(f"✅ {file}: OK ({duration:.2f}s, {samplerate}Hz)")
            except Exception as e:
                file_issues.append(f"❌ {file}: Error reading file - {e}")
            issues.extend(file_issues)
    if issues:
        print("\n⚠️ Issues Found:")
        for issue in issues:
            print(issue)
    else:
        print("\n✅ All sound files verified!")


if __name__ == "__main__":
    verify_sound_files()
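The script above does not check two items from the specification table: bit depth and file size. The following sketch covers both using only the standard library. It assumes "~100KB" means 100 KiB, `check_depth_and_size` is a hypothetical helper, and note that the `wave` module only opens PCM WAV files, so other encodings raise `wave.Error`.

```python
import os
import wave

MAX_BYTES = 100 * 1024  # assumption: "~100KB" read as 100 KiB


def check_depth_and_size(path):
    """Return a list of problems with bit depth or file size (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wf:
        sample_width = wf.getsampwidth()  # bytes per sample
    if sample_width != 2:
        problems.append(f"{8 * sample_width}-bit audio, expected 16-bit")
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        problems.append(f"{size} bytes, exceeds {MAX_BYTES} byte limit")
    return problems


if __name__ == "__main__":
    for root, _dirs, files in os.walk("assets/sounds"):
        for name in files:
            if name.endswith(".wav"):
                for problem in check_depth_and_size(os.path.join(root, name)):
                    print(f"{name}: {problem}")
```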
Integration Guide
File Organization
Place all sound files in the appropriate category folders as shown in the Sound Bank Structure section above.
Testing in Simulation
# test_sounds.py
from sound_bank import SoundBank
import time


def test_all_sounds():
    """Play all sounds for review"""
    sb = SoundBank('assets/sounds/')
    print("🎵 Testing PICO Sound Bank\n")
    categories = sb.get_available_sounds()
    for category, sounds in categories.items():
        print(f"\n=== {category.upper()} ===")
        for sound in sorted(sounds):
            print(f"\nPlaying: {sound}")
            input("Press Enter to play...")
            sb.play(sound)
            time.sleep(2)
    print("\n✅ Sound bank test complete!")


if __name__ == "__main__":
    test_all_sounds()
Adding New Sounds
- Create sound following this guide
- Save to appropriate category folder
- Restart robot or call sound_bank.load_all_sounds()
- Test in context
- Document in sound_bank.py mapping if needed
Sound Mapping Configuration
emotion_sound_map.json
{
  "emotions": {
    "happy": {
      "sounds": ["happy_chirp_01", "happy_chirp_02", "happy_chirp_03"],
      "selection": "random_variation",
      "category": "emotional"
    },
    "curious": {
      "sounds": ["curious_hum_01", "curious_hum_02"],
      "selection": "random_variation",
      "category": "emotional"
    },
    "sad": {
      "sounds": ["sad_whimper_01"],
      "selection": "single",
      "category": "emotional"
    },
    "excited": {
      "sounds": ["excited_whistle_01", "excited_whistle_02"],
      "selection": "random_variation",
      "category": "emotional"
    },
    "loved": {
      "sounds": ["loved_purr_01"],
      "selection": "continuous",
      "category": "emotional"
    },
    "sleepy": {
      "sounds": ["sleepy_yawn_01"],
      "selection": "single",
      "category": "emotional"
    }
  },
  "reactions": {
    "greeting": {
      "sounds": ["greeting_beep", "greeting_melody"],
      "selection": "context_based",
      "category": "reactions"
    },
    "acknowledgment": {
      "sounds": ["acknowledgment_chirp"],
      "selection": "single",
      "category": "reactions"
    },
    "success": {
      "sounds": ["success_ding"],
      "selection": "single",
      "category": "reactions"
    },
    "error": {
      "sounds": ["error_buzz"],
      "selection": "single",
      "category": "reactions"
    }
  },
  "ambient": {
    "startup": {
      "sounds": ["startup_beep"],
      "selection": "single",
      "category": "ambient"
    },
    "thinking": {
      "sounds": ["thinking_hum"],
      "selection": "looping",
      "category": "ambient"
    }
  }
}
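One way a loader might consume this mapping is sketched below. It is not the actual sound_bank.py logic: `pick_sound` is hypothetical, the `category/name.wav` path layout is inferred from the directory structure, and only "random_variation" gets special handling. The other selection modes ("single", "continuous", "looping", "context_based") simply fall back to the first listed sound here.

```python
import json
import random

# A two-entry excerpt of the mapping above, inlined so the sketch is self-contained.
EXAMPLE_MAP = """
{
  "emotions": {
    "happy": {"sounds": ["happy_chirp_01", "happy_chirp_02", "happy_chirp_03"],
              "selection": "random_variation", "category": "emotional"},
    "sad": {"sounds": ["sad_whimper_01"],
            "selection": "single", "category": "emotional"}
  }
}
"""


def pick_sound(sound_map, group, key, rng=random):
    """Return a relative file path for the given group/key entry."""
    entry = sound_map[group][key]
    sounds = entry["sounds"]
    if entry["selection"] == "random_variation":
        name = rng.choice(sounds)  # vary playback to avoid monotony
    else:
        # Assumption: other modes need playback logic that is out of scope
        # here, so this sketch just takes the first listed sound.
        name = sounds[0]
    return f'{entry["category"]}/{name}.wav'


if __name__ == "__main__":
    sound_map = json.loads(EXAMPLE_MAP)
    print(pick_sound(sound_map, "emotions", "happy"))
    print(pick_sound(sound_map, "emotions", "sad"))
```

Passing a seeded `random.Random` instance as `rng` makes the selection repeatable, which is handy when testing.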
Resources
Free Sound Creation Tools
- Bfxr: https://www.bfxr.net/
- ChipTone: https://sfbgames.itch.io/chiptone
- Audacity: https://www.audacityteam.org/
- LMMS: https://lmms.io/ (music production)
Free Sound Libraries
- Freesound: https://freesound.org/
- Zapsplat: https://www.zapsplat.com/
- BBC Sound Effects: https://sound-effects.bbcrewind.co.uk/
- Sonniss GDC: https://sonniss.com/gameaudiogdc (annual free pack)
Learning Resources
- YouTube: "How to make robot sounds"
- YouTube: "Chiptune sound design tutorial"
- Game Audio 101: Tutorials on game sound design
- r/AudioPost: Reddit community for audio questions
Quick Start Checklist
Ready to create PICO's voice? Follow this:
- [ ] Install Audacity (or use Bfxr web app)
- [ ] Create Priority 1 sounds (12 essential sounds)
- [ ] Process all sounds through standard workflow
- [ ] Verify with verification script
- [ ] Test in simulation
- [ ] Create variations of frequently used sounds
- [ ] Add Priority 2 sounds for more personality
- [ ] Document any custom sounds in sound_bank.py
- [ ] Backup sound library
Conclusion
Creating PICO's sound bank is where its personality truly comes alive. Take your time with this - good sounds make the difference between a robot that feels mechanical and one that feels like a living companion.
Remember: PICO doesn't need to speak words to communicate. A well-crafted chirp can convey more emotion and personality than any sentence ever could.
Happy sound designing! 🎵🤖✨