Audio Engineering
Category: Media
Updated: April 14, 2026
Install: npx skillvault add Tibsfox/tibsfox-gsd-skill-creator-examples-skills-media-audio-engineering-skill-md
Built on research from 30 Music cluster projects, deep dives on 360 PNW musicians (S36/SPS), Ableton Live research (ABL), Deep Audio (DAA), Dead Frequencies (DFQ), and High Fidelity amplifier analysis (HFR/HFE).
Expert-level audio engineering covering mastering, mixing, EQ, compression, loudness standards, synthesis, podcast production, music theory, and spectrum analysis. Works alongside the ffmpeg-media skill for codec/format operations.
Loudness Standards
Target Levels by Platform
| Platform | Target LUFS | True Peak | Standard |
|---|---|---|---|
| Spotify | -14 LUFS | -1 dBTP | AES streaming |
| Apple Music | -16 LUFS | -1 dBTP | Sound Check |
| YouTube | -14 LUFS | -1 dBTP | ITU-R BS.1770 |
| Podcast (Apple) | -16 LUFS | -1 dBTP | Apple spec |
| Podcast (Spotify) | | | |
| Broadcast TV | -23 LUFS | -1 dBTP | EBU R128 |
| Broadcast US | -24 LKFS | -2 dBTP | ATSC A/85 |
| CD master | -9 to -12 LUFS | -0.3 dBTP | Red Book |
| Film/Cinema | -24 LUFS | -1 dBTP | SMPTE RP 200 |
Measurement Commands
# Measure integrated loudness (LUFS) with ffmpeg
ffmpeg -i input.wav -af loudnorm=print_format=json -f null - 2>&1 | grep -A20 "Parsed_loudnorm"
# Full EBU R128 scan
ffmpeg -i input.wav -af ebur128=peak=true -f null - 2>&1 | tail -20
# Loudness normalization to -14 LUFS (two-pass for accuracy)
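# Pass 1: measure (loudnorm prints its JSON report on stderr, so capture stderr and keep only the JSON block)
ffmpeg -i input.wav -af loudnorm=I=-14:LRA=11:TP=-1:print_format=json -f null - 2>&1 | sed -n '/^{/,/^}/p' > /tmp/loudnorm.json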
# Pass 2: apply (use measured values from pass 1)
ffmpeg -i input.wav -af loudnorm=I=-14:LRA=11:TP=-1:measured_I=-18.5:measured_LRA=9.2:measured_TP=-0.5:measured_thresh=-28.3 output.wav
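Copying the measured values into pass 2 by hand is easy to get wrong. The following is a minimal sketch of one way to automate it, assuming the pass-1 report is available as the JSON object that loudnorm prints (keys `input_i`, `input_lra`, `input_tp`, `input_thresh`, `target_offset`). The helper name `loudnorm_pass2_filter` is hypothetical and not part of ffmpeg.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read loudnorm's pass-1 JSON report on stdin and print
# the filter string for pass 2.
# Arguments: target I, LRA and TP (defaults: -14, 11, -1).
loudnorm_pass2_filter() {
  local target_i="${1:--14}" target_lra="${2:-11}" target_tp="${3:--1}"
  # Split each line on double quotes: field 2 is the key, field 4 the value.
  awk -F'"' -v I="$target_i" -v LRA="$target_lra" -v TP="$target_tp" '
    $2 == "input_i"       { mi  = $4 }
    $2 == "input_lra"     { ml  = $4 }
    $2 == "input_tp"      { mtp = $4 }
    $2 == "input_thresh"  { mth = $4 }
    $2 == "target_offset" { off = $4 }
    END {
      printf "loudnorm=I=%s:LRA=%s:TP=%s:measured_I=%s:measured_LRA=%s:measured_TP=%s:measured_thresh=%s:offset=%s:linear=true\n",
             I, LRA, TP, mi, ml, mtp, mth, off
    }'
}
```

With the report saved to a file (here called `report.json` for illustration), pass 2 could then be run as `ffmpeg -i input.wav -af "$(loudnorm_pass2_filter -14 11 -1 < report.json)" output.wav`.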
With sox
# Normalize peak to -1 dBFS
sox input.wav output.wav gain -n -1
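# Compressor (threshold -20 dB, ratio 4:1, attack 5 ms, release 50 ms)
# Transfer points: -20 dB in gives -20 dB out, 0 dB in gives -15 dB out (20 dB over the threshold becomes 5 dB)
sox input.wav output.wav compand 0.005,0.05 -20,-20,0,-15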
# 3-band EQ (low shelf +3dB at 200Hz, mid cut -2dB at 2kHz, high shelf +1dB at 8kHz)
sox input.wav output.wav bass +3 200 equalizer 2000 1q -2 treble +1 8000
# Noise reduction (profile then reduce)
sox noisy.wav -n noiseprof /tmp/noise.prof
sox noisy.wav clean.wav noisered /tmp/noise.prof 0.21
# Generate tone (440Hz sine, 3 seconds)
sox -n -r 44100 -c 1 tone.wav synth 3 sine 440
# Spectrum analysis (generate spectrogram PNG)
sox input.wav -n spectrogram -o spectrum.png
Mastering Chain
Standard Mastering Signal Flow
Input → EQ (corrective) → Compression → EQ (tonal) → Stereo Width → Limiting → Dithering → Output
With ffmpeg Filters
# Full mastering chain: EQ → compression → limiting → loudness normalization
ffmpeg -i mix.wav -af "\
equalizer=f=80:t=h:w=100:g=2,\
equalizer=f=3000:t=h:w=1000:g=-1.5,\
equalizer=f=12000:t=h:w=2000:g=1,\
acompressor=threshold=-18dB:ratio=3:attack=10:release=100:knee=6,\
alimiter=limit=-1dB:level=false,\
loudnorm=I=-14:LRA=11:TP=-1\
" -ar 44100 -sample_fmt s16 mastered.wav
# Dithering (16-bit with triangular dither for CD; ffmpeg applies dither through the aresample filter)
ffmpeg -i master_24bit.wav -af "aresample=44100:osf=s16:dither_method=triangular" -c:a pcm_s16le cd_master.wav
EQ Reference
Frequency Bands and Characteristics
| Band | Range | Character | Common Uses |
|---|---|---|---|
| Sub-bass | 20-60 Hz | Felt, not heard | Kick fundamental, sub bass |
| Bass | 60-250 Hz | Warmth, body | Bass guitar, kick punch, vocal warmth |
| Low-mid | 250-500 Hz | Muddiness zone | Cut here to clean up mixes |
| Mid | 500-2000 Hz | Body, presence | Vocal clarity, guitar body |
| Upper-mid | 2-4 kHz | Presence, bite | Vocal intelligibility, guitar attack |
| Presence | 4-6 kHz | Definition, edge | Consonant clarity, string attack |
| Brilliance | 6-12 kHz | Air, shimmer | Cymbals, vocal air, acoustic sparkle |
| Ultra-high | 12-20 kHz | Air, sparkle | Subtle sheen (careful: sibilance) |
Common Problem Frequencies
200-300 Hz — boominess in vocals, acoustic guitar
400-600 Hz — cardboard/boxy sound
1-2 kHz — nasal, telephone quality
3-5 kHz — harshness, listening fatigue
6-8 kHz — sibilance (de-ess here)
Compression Reference
Settings by Source
| Source | Threshold | Ratio | Attack | Release | Knee |
|---|---|---|---|---|---|
| Vocals | -18 to -12 dB | 2:1 to 4:1 | 5-15 ms | 40-80 ms | Soft |
| Drums (bus) | -15 to -10 dB | 3:1 to 6:1 | 10-30 ms | 50-100 ms | Hard |
| Bass | -15 to -8 dB | 3:1 to 8:1 | 10-30 ms | 100-200 ms | Hard |
| Acoustic guitar | -20 to -12 dB | 2:1 to 4:1 | 10-25 ms | 100-150 ms | Soft |
| Mix bus | -20 to -15 dB | 1.5:1 to 2:1 | 10-30 ms | 100-300 ms | Soft |
| Podcast | -20 to -15 dB | 3:1 to 5:1 | 5-10 ms | 50-100 ms | Soft |
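To make the threshold and ratio settings concrete, the sketch below computes the static output level of an idealized hard-knee compressor: levels below the threshold pass unchanged, and any overshoot above it is divided by the ratio. It ignores attack, release, knee softening and makeup gain, and the function name `compressed_level` is only illustrative.

```shell
#!/usr/bin/env bash
# Static hard-knee compressor curve: output level in dB for a given input.
# Usage: compressed_level INPUT_DB THRESHOLD_DB RATIO
compressed_level() {
  awk -v in_db="$1" -v thr="$2" -v ratio="$3" 'BEGIN {
    in_db += 0; thr += 0; ratio += 0           # force numeric values
    if (in_db <= thr) out = in_db              # below threshold: unchanged
    else out = thr + (in_db - thr) / ratio     # above: overshoot divided by ratio
    printf "%.1f\n", out
  }'
}

compressed_level -6 -18 3    # 12 dB over a -18 dB threshold at 3:1, prints -14.0
compressed_level -24 -18 3   # below the threshold, prints -24.0
```

The difference between input and output (8 dB in the first call) is the gain reduction a meter would show for a sustained signal at that level.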
Compression Types
VCA — fast, transparent, precise (SSL, dbx 160)
Optical — smooth, musical, slow (LA-2A, CL 1B)
FET — aggressive, colorful, fast (1176, Distressor)
Variable-mu — warm, glue, gentle (Fairchild 670, Manley Vari-Mu)
Synthesis Reference
Synthesis Types
| Type | How It Works | Character | Classic Synths |
|---|---|---|---|
| Subtractive | Oscillator → Filter → Amplifier | Warm, analog, rich | Minimoog, Prophet-5, Juno-106 |
| FM | Operators modulating each other's frequency | Metallic, bell-like, bright | DX7, FM8 |
| Wavetable | Morphing between stored waveforms | Evolving, complex, modern | PPG Wave, Serum, Vital |
| Granular | Tiny audio grains layered and scattered | Atmospheric, textural, ambient | Granulator, Pigments |
| Additive | Sum of individual sine wave partials | Precise, organ-like | Kawai K5, Razor |
| Physical modeling | Mathematical model of physical instrument | Realistic, expressive | Chromaphone, Pianoteq |
| Sample-based | Recorded audio, pitch-shifted and layered | Realistic, natural | Kontakt, Sampler |
ADSR Envelope Quick Reference
Pad : A=500ms, D=200ms, S=0.8, R=1000ms
Pluck : A=1ms, D=200ms, S=0, R=100ms
Bass : A=5ms, D=100ms, S=0.6, R=50ms
Lead : A=10ms, D=50ms, S=0.7, R=200ms
Kick drum : A=0ms, D=150ms, S=0, R=50ms
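As a rough illustration of how these four values shape a note, the sketch below evaluates a piecewise-linear ADSR envelope at a given time. Linear segments and a separate gate (note-held) length are assumptions made for the example; real synths often use exponential curves. The function name `adsr_level` is hypothetical.

```shell
#!/usr/bin/env bash
# Envelope level (0..1) at time T for a linear ADSR; all times in ms.
# Usage: adsr_level T ATTACK DECAY SUSTAIN RELEASE GATE
# GATE is how long the note is held and is assumed to be >= ATTACK + DECAY.
adsr_level() {
  awk -v t="$1" -v a="$2" -v d="$3" -v s="$4" -v r="$5" -v gate="$6" 'BEGIN {
    t += 0; a += 0; d += 0; s += 0; r += 0; gate += 0
    if (t < 0)             lvl = 0
    else if (t < a)        lvl = t / a                       # attack ramp up to 1
    else if (t < a + d)    lvl = 1 - (1 - s) * (t - a) / d   # decay toward sustain
    else if (t <= gate)    lvl = s                           # held at sustain level
    else if (t < gate + r) lvl = s * (1 - (t - gate) / r)    # release ramp to 0
    else                   lvl = 0
    printf "%.2f\n", lvl
  }'
}

# Pad preset (A=500, D=200, S=0.8, R=1000) held for 2000 ms
adsr_level 250 500 200 0.8 1000 2000    # halfway through the attack, prints 0.50
adsr_level 2500 500 200 0.8 1000 2000   # halfway through the release, prints 0.40
```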
Music Theory Quick Reference
Circle of Fifths (Major Keys)
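The order of major keys around the circle can be derived by moving up a perfect fifth (7 semitones) twelve times, wrapping at the octave. A minimal sketch, assuming the usual key-signature spellings for the black-key roots (the function name is illustrative):

```shell
#!/usr/bin/env bash
# Walk the circle of fifths clockwise: each step is +7 semitones (mod 12).
# Enharmonic spellings follow common key-signature names (an assumption).
circle_of_fifths() {
  local names=(C Db D Eb E F F#/Gb G Ab A Bb B)
  local pc=0 out=() i
  for ((i = 0; i < 12; i++)); do
    out+=("${names[pc]}")
    pc=$(( (pc + 7) % 12 ))
  done
  echo "${out[*]}"
}

circle_of_fifths   # prints: C G D A E B F#/Gb Db Ab Eb Bb F
```

Reading the output left to right adds one sharp per step up to F#; reading it right to left from C adds one flat per step.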
Common Chord Progressions
| Name | Numerals | Example in C | Use |
|---|---|---|---|
| Pop | I-V-vi-IV | C-G-Am-F | Very common in pop music |
| Blues | I-IV-V | C-F-G | Blues, rock |
| Jazz ii-V-I | ii-V-I | Dm7-G7-Cmaj7 | Jazz standard |
| Andalusian | i-VII-VI-V | Am-G-F-E | Flamenco, dramatic |
| Canon | I-V-vi-iii-IV-I-IV-V | C-G-Am-Em-F-C-F-G | Pachelbel, ballads |
| Minor blues | i-iv-V | Am-Dm-E | Minor blues |
Scales
Major (Ionian) : W-W-H-W-W-W-H
Natural Minor (Aeolian) : W-H-W-W-H-W-W
Pentatonic Major : 1-2-3-5-6
Pentatonic Minor : 1-b3-4-5-b7
Blues : 1-b3-4-#4-5-b7
Dorian : W-H-W-W-W-H-W (minor with raised 6th — jazz, funk)
Mixolydian : W-W-H-W-W-H-W (major with flat 7th — blues rock)
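The W/H patterns above translate directly into notes: W is two semitones, H is one. The following sketch spells a scale from a root and a step pattern. It uses sharp names only, which is a simplification (flat keys are not respelled), and `scale_notes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Spell a scale from a root note and a W/H step pattern such as W-W-H-W-W-W-H.
# Note names use sharps only (a simplification; flat keys are not respelled).
scale_notes() {
  local root="$1" pattern="$2"
  local names=(C C# D D# E F F# G G# A A# B)
  local pc=-1 i step
  # Find the pitch class (0-11) of the root.
  for i in "${!names[@]}"; do
    [[ "${names[i]}" == "$root" ]] && pc=$i
  done
  (( pc >= 0 )) || { echo "unknown root: $root" >&2; return 1; }
  local out=("${names[pc]}")
  local IFS='-'                      # split the pattern on hyphens
  for step in $pattern; do
    case "$step" in
      W) pc=$(( (pc + 2) % 12 )) ;;  # whole step
      H) pc=$(( (pc + 1) % 12 )) ;;  # half step
      *) echo "unknown step: $step" >&2; return 1 ;;
    esac
    out+=("${names[pc]}")
  done
  IFS=' '                            # join the result with spaces
  echo "${out[*]}"
}

scale_notes C W-W-H-W-W-W-H   # C major, prints: C D E F G A B C
scale_notes D W-H-W-W-W-H-W   # D Dorian, prints: D E F G A B C D
```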
Podcast Production Workflow
Recording
# Record from default mic (sox)
sox -d -r 44100 -c 1 -b 16 recording.wav
# Record with ffmpeg (specify ALSA device on Linux)
ffmpeg -f alsa -i default -ar 44100 -ac 1 recording.wav
Processing Chain
# 1. Noise reduction (profile taken from the first 0.5 s, which should contain only room noise)
sox recording.wav -n trim 0 0.5 noiseprof /tmp/noise.prof
sox recording.wav clean.wav noisered /tmp/noise.prof 0.21
# 2. Normalize + compress + EQ for voice
ffmpeg -i clean.wav -af "\
highpass=f=80,\
lowpass=f=12000,\
equalizer=f=3000:t=h:w=1000:g=2,\
acompressor=threshold=-20dB:ratio=4:attack=5:release=50,\
loudnorm=I=-16:TP=-1\
" -ar 44100 podcast_ready.wav
# 3. Export MP3 for distribution
ffmpeg -i podcast_ready.wav -c:a libmp3lame -b:a 128k \
-metadata title="Episode Title" \
-metadata artist="Show Name" \
-metadata album="Podcast Name" \
-metadata genre="Podcast" \
episode.mp3
# 4. Generate waveform for show notes
ffmpeg -i episode.mp3 -filter_complex "showwavespic=s=1920x200:colors=0x1a1a2e" -frames:v 1 waveform.png
# Set all metadata
ffmpeg -i episode.mp3 -c copy \
-metadata title="EP 42: The Memory Architecture" \
-metadata artist="GSD Podcast" \
-metadata album="Getting Shit Done" \
-metadata track="42" \
-metadata date="2026" \
-metadata comment="LOD-tiered memory system deep dive" \
tagged.mp3
BPM and Key Detection
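With aubio
# Install aubio for beat/pitch detection
# apt install aubio-tools
# Beat tracking (prints beat timestamps; BPM follows from the spacing between beats)
aubiotrack -i input.wav
# Pitch tracking (frame-by-frame fundamental frequency, a starting point for key estimation)
aubiopitch -i input.wav -p yinfft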
# Onset detection (transient markers)
aubioonset input.wav
With sox
# Generate stats (includes RMS, peak, DC offset)
sox input.wav -n stats 2>&1
Sample Rate / Bit Depth Reference
| Format | Sample Rate | Bit Depth | Use |
|---|---|---|---|
| CD | 44.1 kHz | 16-bit | Consumer playback |
| DVD | 48 kHz | 24-bit | Video soundtrack |
| Hi-Res | 96 kHz | 24-bit | Audiophile streaming |
| Studio | 96-192 kHz | 32-bit float | Recording/mixing |
| Podcast | 44.1 kHz | 16-bit | Voice distribution |
| Phone/VoIP | 8-16 kHz | 16-bit | Voice calls |
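What these numbers buy can be estimated from two textbook relations for integer PCM: the theoretical dynamic range of a full-scale sine is about 6.02 dB per bit plus 1.76 dB, and the highest representable frequency (the Nyquist limit) is half the sample rate. A small sketch, with `pcm_limits` as an illustrative name; it does not apply to the 32-bit float row, which behaves differently.

```shell
#!/usr/bin/env bash
# Theoretical limits of integer linear PCM:
#   dynamic range of a full-scale sine = 20*log10(2^bits) + 1.76 dB
#   Nyquist frequency                  = sample_rate / 2
# Usage: pcm_limits SAMPLE_RATE_HZ BITS
pcm_limits() {
  awk -v rate="$1" -v bits="$2" 'BEGIN {
    dr = 20 * log(2 ^ bits) / log(10) + 1.76   # awk log() is natural log
    printf "%.1f dB, %.2f kHz\n", dr, rate / 2000
  }'
}

pcm_limits 44100 16   # CD, prints: 98.1 dB, 22.05 kHz
pcm_limits 48000 24   # DVD, prints: 146.3 dB, 24.00 kHz
```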
Conversion
# Downsample from 96kHz/24-bit to 44.1kHz/16-bit with dither
sox input_96_24.wav -r 44100 -b 16 output_441_16.wav dither -s
# Same with ffmpeg (resampling, format conversion and dither are all handled by the aresample filter)
ffmpeg -i input_96_24.wav -af "aresample=44100:osf=s16:dither_method=triangular" -c:a pcm_s16le output.wav
ffmpeg-media — codec/format operations, video+audio conversion
ffmpeg-processor agent — media processing specialist
gource-visualizer — repository visualization with audio sync capability
Audio research: ABL, DAA, DFQ, HFR, HFE, S36/SPS (360 musicians)
When This Skill Activates
Audio mastering, mixing, EQ, compression
Loudness measurement and normalization (LUFS, EBU R128)
Podcast recording, editing, production
Music theory questions (chords, scales, progressions)
Synthesis design (FM, subtractive, granular, wavetable)
Spectrum analysis and audio visualization
Sample rate/bit depth conversion
Noise reduction and audio cleanup
BPM/key detection