Name: Interpret Sounds
Author: raphaelsalaja

ffmpeg -i sprite.mp3 -ss <start_seconds> -t <duration_seconds> -acodec pcm_s16le -ar 44100 output/<name>.wav
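When a sprite holds several sounds, the extraction command above can be generated once per segment. The sketch below is a minimal illustration, not part of the original skill: `build_extract_cmd`, `extract_all`, and the segment names are hypothetical, and it assumes `ffmpeg` is on the PATH and that the start and duration of each sound are already known.

```python
import subprocess
from pathlib import Path


def build_extract_cmd(sprite_path, name, start_s, duration_s, out_dir="output"):
    """Build the ffmpeg argument list for one sound (same flags as the command above)."""
    return [
        "ffmpeg", "-i", str(sprite_path),
        "-ss", f"{start_s:.3f}",       # start offset in seconds
        "-t", f"{duration_s:.3f}",     # duration in seconds
        "-acodec", "pcm_s16le",        # 16-bit PCM WAV
        "-ar", "44100",                # resample to 44.1 kHz
        str(Path(out_dir) / f"{name}.wav"),
    ]


def extract_all(sprite_path, segments, out_dir="output"):
    """Run ffmpeg once per entry in segments: {name: (start_s, duration_s)}."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for name, (start_s, duration_s) in segments.items():
        cmd = build_extract_cmd(sprite_path, name, start_s, duration_s, out_dir)
        subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails


# Hypothetical usage:
# extract_all("sprite.mp3", {"click": (0.5, 0.25), "pop": (1.2, 0.4)})
```

Passing the arguments as a list avoids shell quoting problems when a file name contains spaces.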

ffmpeg -i sprite.mp3 -af silencedetect=noise=-40dB:d=0.05 -f null -
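The `silencedetect` filter reports `silence_start` and `silence_end` values in ffmpeg's log output on stderr. A rough sketch of turning that log into sound segments follows; the function name `sound_segments` is hypothetical, and it assumes the log text has already been captured (for example with `subprocess.run(..., stderr=subprocess.PIPE, text=True)`) and that the total duration of the sprite is known.

```python
import re

# Patterns for the values silencedetect writes to the log.
SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def sound_segments(log_text, total_duration):
    """Invert the silent intervals into a list of (start_s, duration_s) sound segments."""
    starts = [float(v) for v in SILENCE_START.findall(log_text)]
    ends = [float(v) for v in SILENCE_END.findall(log_text)]
    segments = []
    cursor = 0.0  # end of the most recent silence
    for i, silence_start in enumerate(starts):
        if silence_start > cursor:
            segments.append((cursor, silence_start - cursor))
        # A silence that runs to the end of the file has no matching silence_end.
        cursor = ends[i] if i < len(ends) else total_duration
    if cursor < total_duration:
        segments.append((cursor, total_duration - cursor))
    return segments
```

Each returned pair can be passed to the extraction command as `-ss` and `-t`.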

import numpy as np
from scipy.io import wavfile
from scipy.fft import rfft, rfftfreq

sample_rate, data = wavfile.read("sound.wav")
if data.ndim > 1:
    data = data[:, 0]

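def analyze_slice(data, sample_rate, start_ms, window_ms=10):
    """Return the dominant frequency (Hz) in a short window, or None if the window is too short."""
    start = int(sample_rate * start_ms / 1000)
    end = start + int(sample_rate * window_ms / 1000)
    segment = data[start:end].astype(float)
    # At least two samples are needed to have a non-DC FFT bin.
    if len(segment) < 2:
        return None
    # A Hann window reduces spectral leakage from the abrupt slice edges.
    segment *= np.hanning(len(segment))
    spectrum = np.abs(rfft(segment))
    freqs = rfftfreq(len(segment), 1 / sample_rate)
    # Skip bin 0 (DC offset) when searching for the peak.
    peak_idx = np.argmax(spectrum[1:]) + 1
    return freqs[peak_idx]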
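To see how windowed peak detection yields a pitch envelope, the sketch below samples the peak frequency at several offsets of a synthetic downward sweep. It is an illustration under assumptions: `peak_frequency` restates the windowed-FFT approach in compact form, while `pitch_envelope` and `make_sweep` are hypothetical helpers that are not part of the original skill.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq


def peak_frequency(data, sample_rate, start_ms, window_ms=10):
    """Dominant frequency (Hz) in a Hann-windowed slice, or None if the slice is too short."""
    start = int(sample_rate * start_ms / 1000)
    segment = data[start:start + int(sample_rate * window_ms / 1000)].astype(float)
    if len(segment) < 2:
        return None
    spectrum = np.abs(rfft(segment * np.hanning(len(segment))))
    freqs = rfftfreq(len(segment), 1 / sample_rate)
    return float(freqs[np.argmax(spectrum[1:]) + 1])  # skip the DC bin


def pitch_envelope(data, sample_rate, offsets_ms=(0, 5, 10, 20, 50)):
    """Peak frequency at each offset; the first and last values suggest start and end."""
    return {ms: peak_frequency(data, sample_rate, ms) for ms in offsets_ms}


def make_sweep(sample_rate=44100, duration_s=0.07, f_start=2000.0, f_end=500.0):
    """Synthetic linear sweep used only to exercise the analysis."""
    t = np.arange(int(sample_rate * duration_s)) / sample_rate
    # Phase is the integral of the instantaneous frequency of a linear chirp.
    phase = 2 * np.pi * (f_start * t + (f_end - f_start) / (2 * duration_s) * t ** 2)
    return np.sin(phase)


envelope = pitch_envelope(make_sweep(), 44100)
```

With a 10 ms window at 44100 Hz the FFT bins are 100 Hz apart, so each reading is only accurate to the nearest 100 Hz. A longer window improves frequency resolution at the cost of time resolution.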

Parameter	Method	Maps to
Fundamental frequency	Peak FFT bin at onset (5-10ms in)	`source.frequency`
Pitch envelope	Compare frequency at 0ms, 5ms, 10ms, 20ms, 50ms	`source.frequency: { start, end }`
Active duration	Time above noise floor (-40dB threshold)	`envelope.decay` / `envelope.release`
Attack time	Onset to peak amplitude	`envelope.attack`
Sustain level	Amplitude ratio: sustain region vs peak	`envelope.sustain`
Harmonic content	Ratio of harmonic peaks to fundamental	Infer `source.type`
Peak amplitude	Max absolute sample value, normalized 0-1	`gain`
Stereo balance	L/R channel amplitude ratio	`pan` (-1 to 1)
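Several rows of the table above can be measured directly in the time domain. The following is a minimal sketch under assumptions: samples are floats normalized to the range -1 to 1, the -40 dB threshold is taken relative to full scale, and the mapping from channel levels to `pan` is one plausible choice rather than the skill's own formula. The names `measure_envelope` and `stereo_pan` are hypothetical.

```python
import numpy as np


def measure_envelope(samples, sample_rate, floor_db=-40.0):
    """Estimate gain, attack time, and active duration of a mono sound."""
    level = np.abs(np.asarray(samples, dtype=float))
    if level.size == 0:
        return None
    threshold = 10 ** (floor_db / 20)  # -40 dB is 0.01 of full scale
    active = np.nonzero(level >= threshold)[0]
    if len(active) == 0:
        return None  # nothing rises above the noise floor
    onset, last = int(active[0]), int(active[-1])
    peak_idx = int(np.argmax(level))
    return {
        "gain": float(level.max()),                      # peak amplitude, 0-1
        "attack_s": (peak_idx - onset) / sample_rate,    # onset to peak
        "active_s": (last - onset + 1) / sample_rate,    # time above the floor
    }


def stereo_pan(left, right):
    """Map the RMS level of each channel to a pan value in [-1, 1] (assumed mapping)."""
    l = float(np.sqrt(np.mean(np.asarray(left, dtype=float) ** 2)))
    r = float(np.sqrt(np.mean(np.asarray(right, dtype=float) ** 2)))
    total = l + r
    return 0.0 if total == 0 else (r - l) / total
```

For 16-bit WAV data read with `wavfile.read`, dividing the samples by 32768 first gives the normalized range this sketch expects.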

Pattern	Source type
Fundamental only, harmonics < -40dB	`sine`
Odd harmonics rolling off as 1/n²	`triangle`
All harmonics rolling off as 1/n	`sawtooth`
Odd harmonics rolling off as 1/n	`square`
No clear harmonic structure, broadband energy	`noise`
Custom harmonic profile (none of the above)	`wavetable`

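def classify_waveform(spectrum, freqs, fundamental_freq):
    # Collect the amplitude at the FFT bin nearest each harmonic, n = 2..8.
    harmonics = []
    for n in range(2, 9):
        target = fundamental_freq * n
        idx = np.argmin(np.abs(freqs - target))
        harmonics.append(spectrum[idx])
    fund_amp = spectrum[np.argmin(np.abs(freqs - fundamental_freq))]
    # ratios[0] is the 2nd harmonic, ratios[1] the 3rd, and so on.
    ratios = [h / fund_amp for h in harmonics]

    # Every harmonic at least 40 dB (a factor of 0.01) below the fundamental.
    if all(r < 0.01 for r in ratios):
        return "sine"
    # Even harmonics (2nd, 4th, 6th) nearly absent means an odd-only spectrum.
    odd_only = all(ratios[i] < 0.05 for i in [0, 2, 4])
    if odd_only and ratios[1] > 0.05:
        # An ideal square wave has a 3rd harmonic at 1/3; an ideal triangle at 1/9.
        return "square" if ratios[1] > 0.3 else "triangle"
    if all(r > 0.01 for r in ratios[:4]):
        return "sawtooth"
    return "wavetable"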
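One way to sanity-check the classification thresholds is to measure harmonic ratios on signals with a known harmonic profile. The sketch below builds test tones by additive synthesis and measures the ratios with a windowed FFT. It is illustrative only: `synthesize`, `harmonic_ratios`, and the chosen 440 Hz fundamental are assumptions, not part of the original skill.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq

SAMPLE_RATE = 44100
FUNDAMENTAL = 440.0  # Hz; with a 0.1 s signal every harmonic lands on an exact FFT bin


def synthesize(harmonic_amps, duration_s=0.1):
    """Sum sine partials; harmonic_amps[k] is the amplitude of harmonic k + 1."""
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    return sum(a * np.sin(2 * np.pi * FUNDAMENTAL * (k + 1) * t)
               for k, a in enumerate(harmonic_amps))


def harmonic_ratios(signal, fundamental, count=8):
    """Amplitude of harmonics 2..count relative to the fundamental."""
    spectrum = np.abs(rfft(signal * np.hanning(len(signal))))
    freqs = rfftfreq(len(signal), 1 / SAMPLE_RATE)
    amps = [spectrum[np.argmin(np.abs(freqs - fundamental * n))]
            for n in range(1, count + 1)]
    return [a / amps[0] for a in amps[1:]]


# Ideal magnitude profiles for harmonics 1..8. Phases are ignored, so the
# time-domain shapes differ from textbook waveforms but the spectra match.
N = range(1, 9)
SAWTOOTH = [1 / n for n in N]
SQUARE = [1 / n if n % 2 else 0.0 for n in N]
TRIANGLE = [1 / n ** 2 if n % 2 else 0.0 for n in N]

square_ratios = harmonic_ratios(synthesize(SQUARE), FUNDAMENTAL)
triangle_ratios = harmonic_ratios(synthesize(TRIANGLE), FUNDAMENTAL)
sawtooth_ratios = harmonic_ratios(synthesize(SAWTOOTH), FUNDAMENTAL)
```

The third-harmonic ratio comes out near 1/3 for the square profile and near 1/9 for the triangle profile, which is why a threshold of 0.3 on that ratio can separate the two.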

Observation	Filter type	Maps to
High-frequency rolloff steeper than source would produce	`lowpass`	`filter: { type: "lowpass", frequency: <cutoff_hz> }`
Low-frequency rolloff	`highpass`	`filter: { type: "highpass", frequency: <cutoff_hz> }`
Narrow frequency band passes through	`bandpass`	`filter: { type: "bandpass", frequency: <center_hz> }`
Narrow frequency notch removed	`notch`	`filter: { type: "notch", frequency: <notch_hz> }`
Resonant peak near cutoff	High Q	`filter: { ..., resonance: <Q_value> }`
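For the lowpass row, one rough way to estimate the cutoff is to find where the averaged spectrum falls 3 dB below its passband level. This is a sketch under assumptions, not the skill's own method: it uses Welch's method from `scipy.signal`, the passband range is a guess that must sit below the real cutoff, and `estimate_lowpass_cutoff` and `make_filtered_noise` are hypothetical names. The second helper only produces test input.

```python
import numpy as np
from scipy.signal import butter, sosfilt, welch


def estimate_lowpass_cutoff(signal, sample_rate, passband_hz=(100.0, 500.0)):
    """Return the first frequency (Hz) at least 3 dB below the passband level, or None."""
    freqs, psd = welch(signal, fs=sample_rate, nperseg=2048)
    psd_db = 10 * np.log10(psd + 1e-20)  # small offset avoids log(0)
    in_band = (freqs >= passband_hz[0]) & (freqs <= passband_hz[1])
    reference = psd_db[in_band].mean()
    below = np.nonzero((freqs > passband_hz[1]) & (psd_db < reference - 3.0))[0]
    return float(freqs[below[0]]) if len(below) else None


def make_filtered_noise(cutoff_hz, sample_rate=44100, seconds=4, seed=0):
    """White noise through a 4th-order Butterworth lowpass, used only as test input."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(sample_rate * seconds)
    sos = butter(4, cutoff_hz, btype="lowpass", fs=sample_rate, output="sos")
    return sosfilt(sos, noise)


cutoff = estimate_lowpass_cutoff(make_filtered_noise(2000.0), 44100)
filter_config = {"type": "lowpass", "frequency": cutoff}
```

This works best on broadband sources. For a tonal source the spectrum already rolls off with its harmonics, so the measured slope has to be compared against what the detected source type would produce on its own.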

Slope (dB/octave)	Color	Maps to
~0 (flat)	`white`	`source: { type: "noise", color: "white" }`
~-3	`pink`	`source: { type: "noise", color: "pink" }`
~-6	`brown`	`source: { type: "noise", color: "brown" }`
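The slope in the table can be estimated by fitting a line to the power spectrum in dB against log2 of frequency. The sketch below assumes the slopes refer to power spectral density, uses Welch's method from `scipy.signal`, and fits over a hypothetical 100 Hz to 5 kHz band. The names `spectral_slope_db_per_octave` and `classify_noise_color` are illustrative.

```python
import numpy as np
from scipy.signal import welch


def spectral_slope_db_per_octave(signal, sample_rate, band_hz=(100.0, 5000.0)):
    """Least-squares slope of the PSD (dB) versus log2(frequency) within band_hz."""
    freqs, psd = welch(signal, fs=sample_rate, nperseg=4096)
    mask = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    psd_db = 10 * np.log10(psd[mask] + 1e-30)  # small offset avoids log(0)
    slope, _intercept = np.polyfit(np.log2(freqs[mask]), psd_db, 1)
    return float(slope)


def classify_noise_color(slope):
    """Pick the color whose nominal slope is closest to the measured one."""
    references = {"white": 0.0, "pink": -3.0, "brown": -6.0}
    return min(references, key=lambda color: abs(slope - references[color]))


# Synthetic check: integrating white noise gives brown noise.
rng = np.random.default_rng(0)
white = rng.standard_normal(44100 * 4)
brown = np.cumsum(white)
white_color = classify_noise_color(spectral_slope_db_per_octave(white, 44100))
brown_color = classify_noise_color(spectral_slope_db_per_octave(brown, 44100))
```

The result feeds `source: { type: "noise", color: ... }`. Short sounds give noisy slope estimates, so a measured value halfway between two references should be treated as ambiguous.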

Interpret Sounds

Pipeline

Step 1: Acquire source audio

Step 2: Extract individual sounds

Step 3: FFT analysis

Core parameters

Inferring oscillator type from harmonics

Noise color classification

Filter detection
