SilentR is a professional-grade web application designed to streamline the post-production workflow for podcasters and audio engineers. By automating the tedious process of silence removal, it creates "jump-cut" style edits across hours of audio in minutes. Unlike simple tools, SilentR offers a cinematic, non-linear editor experience right in the browser.
Core Capabilities
- Smart Silence Detection: Utilizes a custom algorithm to analyze the audio noise floor and detect silence with a user-configurable threshold (dB) and minimum duration (ms); a sketch of this kind of scan follows the list.
- Cinematic Visualizer: A high-performance waveform viewer that supports zooming down to the sample level and smooth scrolling, optimized for long recordings.
- Non-Destructive Editing: All cuts are virtual. Users can adjust "Keep" and "Remove" regions endlessly before finalizing the export.
- Batch Processing: A manifest-based system allows users to upload a JSON file containing metadata for hundreds of tracks, which are then processed by parallel workers (an example manifest shape is sketched after this list).
- Multi-Format/Multi-Track: Support for stereo, mono, and multi-channel audio files across MP3, WAV, AAC, and OGG formats.
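To make the threshold and duration settings concrete, here is a minimal sketch of the kind of windowed scan a silence-detection pass performs. The function name detectSilence, the RMS windowing, and the default values are illustrative assumptions, not SilentR's actual internals.

```typescript
interface SilenceRegion {
  start: number; // seconds
  end: number;   // seconds
}

// Scan mono PCM and return stretches that stay below a dB threshold
// for at least minDurationMs. Windowed RMS keeps the scan cheap.
function detectSilence(
  samples: Float32Array,
  sampleRate: number,
  thresholdDb = -40,
  minDurationMs = 500,
  windowMs = 10,
): SilenceRegion[] {
  const windowSize = Math.max(1, Math.floor((windowMs / 1000) * sampleRate));
  const minSilenceSamples = (minDurationMs / 1000) * sampleRate;
  const regions: SilenceRegion[] = [];

  let runStart = -1; // sample index where the current quiet run began (-1 = no run)
  for (let i = 0; i < samples.length; i += windowSize) {
    // RMS level of this window, converted to dBFS.
    const end = Math.min(i + windowSize, samples.length);
    let sumSquares = 0;
    for (let j = i; j < end; j++) sumSquares += samples[j] * samples[j];
    const db = 20 * Math.log10(Math.sqrt(sumSquares / (end - i)) + 1e-12); // epsilon avoids log(0)

    if (db < thresholdDb) {
      if (runStart < 0) runStart = i; // quiet run starts here
    } else {
      if (runStart >= 0 && i - runStart >= minSilenceSamples) {
        regions.push({ start: runStart / sampleRate, end: i / sampleRate });
      }
      runStart = -1;
    }
  }
  // Flush a quiet run that extends to the end of the buffer.
  if (runStart >= 0 && samples.length - runStart >= minSilenceSamples) {
    regions.push({ start: runStart / sampleRate, end: samples.length / sampleRate });
  }
  return regions;
}
```

Regions returned by a scan like this map naturally onto the "Keep"/"Remove" regions described above, which the user can then adjust before export.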
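And to show what the batch manifest might look like, here is a hypothetical shape plus a small worker pool. The field names, the silence-worker.ts module, and the pool sizing are assumptions for illustration, not SilentR's actual format.

```typescript
// Hypothetical manifest shape for batch processing; field names are illustrative.
interface BatchManifestEntry {
  file: string;          // path or URL of the source track
  title?: string;
  thresholdDb?: number;  // per-track override of the silence threshold
  minSilenceMs?: number; // per-track override of the minimum silence duration
}

interface BatchManifest {
  version: 1;
  tracks: BatchManifestEntry[];
}

// Fan the tracks out across a small pool of Web Workers.
async function runBatch(manifest: BatchManifest, poolSize = navigator.hardwareConcurrency ?? 4) {
  const queue = [...manifest.tracks];
  const workers = Array.from({ length: Math.min(poolSize, queue.length) }, () =>
    // "silence-worker.ts" is an assumed module name for the per-track worker.
    new Worker(new URL('./silence-worker.ts', import.meta.url), { type: 'module' }),
  );

  await Promise.all(
    workers.map(async (worker) => {
      while (queue.length > 0) {
        const track = queue.shift()!;
        await new Promise<void>((resolve, reject) => {
          worker.onmessage = () => resolve();
          worker.onerror = (e) => reject(e);
          worker.postMessage(track);
        });
      }
      worker.terminate();
    }),
  );
}
```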
Technical Deep Dive
SilentR pushes the boundaries of what is possible in a web browser, leveraging modern Web APIs to keep data local and secure.
Browser-Based Audio Processing
To ensure privacy and avoid massive server bandwidth costs, SilentR processes audio entirely on the client side using WebAssembly (WASM).
- FFmpeg.wasm: We compiled the legendary FFmpeg library to WASM to handle file decoding and encoding. This allows the app to support virtually any audio format without sending a single byte to a server (decode sketch below).
- AudioWorklet: The AudioWorklet API moves audio processing onto the dedicated audio rendering thread, off the main UI thread, so the interface remains responsive even during heavy computation (a minimal worklet is sketched below).
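A rough sketch of the client-side decode step, using the older createFFmpeg/fetchFile interface of @ffmpeg/ffmpeg (newer releases expose a class-based API instead); file names and conversion parameters here are illustrative.

```typescript
import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

const ffmpeg = createFFmpeg({ log: false });

// Decode any supported container to mono 48 kHz WAV, entirely in the browser.
// The bytes never leave the client.
async function decodeToWav(file: File): Promise<Uint8Array> {
  if (!ffmpeg.isLoaded()) await ffmpeg.load();

  ffmpeg.FS('writeFile', file.name, await fetchFile(file));
  await ffmpeg.run('-i', file.name, '-ac', '1', '-ar', '48000', 'out.wav');
  const data = ffmpeg.FS('readFile', 'out.wav');

  // Free the in-memory virtual files once we have the result.
  ffmpeg.FS('unlink', file.name);
  ffmpeg.FS('unlink', 'out.wav');
  return data;
}
```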
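And a minimal AudioWorklet pairing to show the threading split: the processor module runs on the audio rendering thread and posts results back over its MessagePort. The processor name, file path, and the peak-metering logic are placeholder assumptions, not SilentR's actual worklet.

```typescript
// level-meter-processor.ts — loaded via audioWorklet.addModule(), runs on the audio thread.
class LevelMeterProcessor extends AudioWorkletProcessor {
  process(inputs: Float32Array[][]): boolean {
    const channel = inputs[0]?.[0];
    if (channel) {
      let peak = 0;
      for (let i = 0; i < channel.length; i++) peak = Math.max(peak, Math.abs(channel[i]));
      this.port.postMessage(peak); // report to the main thread without blocking it
    }
    return true; // keep the processor alive
  }
}
registerProcessor('level-meter', LevelMeterProcessor);

// main thread: register the module and insert the node into the graph
const ctx = new AudioContext();
await ctx.audioWorklet.addModule('/worklets/level-meter-processor.js');
const meter = new AudioWorkletNode(ctx, 'level-meter');
meter.port.onmessage = (e) => console.log('peak', e.data);
```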
Waveform Rendering Engine
Rendering a waveform for a 2-hour audio file is memory-intensive, so the viewer relies on two techniques:
- Canvas Optimization: We use an offscreen canvas technique to pre-render the waveform in chunks. As the user scrolls, we only draw the visible chunks.
- Peak Decimation: Instead of rendering every sample, we calculate min/max peaks for different zoom levels (LOD, levels of detail), reducing the dataset size by roughly 99% for zoomed-out views while maintaining visual accuracy. Both techniques are sketched below.
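A compressed sketch of both ideas: compute min/max peaks for one level of detail, then pre-render a chunk of those peaks into an OffscreenCanvas that the scroll loop can blit later. The bucket size, color, and one-pixel-per-peak mapping are illustrative choices, not SilentR's actual renderer.

```typescript
interface PeakPair { min: number; max: number }

// Build one LOD: min/max peaks for buckets of `samplesPerPeak` samples.
function buildPeaks(samples: Float32Array, samplesPerPeak: number): PeakPair[] {
  const peaks: PeakPair[] = [];
  for (let i = 0; i < samples.length; i += samplesPerPeak) {
    let min = Infinity;
    let max = -Infinity;
    const end = Math.min(i + samplesPerPeak, samples.length);
    for (let j = i; j < end; j++) {
      const s = samples[j];
      if (s < min) min = s;
      if (s > max) max = s;
    }
    peaks.push({ min, max });
  }
  return peaks;
}

// Pre-render one chunk of the waveform offscreen; one vertical line per peak pair.
function renderChunk(peaks: PeakPair[], width: number, height: number): OffscreenCanvas {
  const canvas = new OffscreenCanvas(width, height);
  const ctx = canvas.getContext('2d')!;
  ctx.strokeStyle = '#4ade80';
  ctx.beginPath();
  const mid = height / 2;
  for (let x = 0; x < Math.min(width, peaks.length); x++) {
    const { min, max } = peaks[x];
    ctx.moveTo(x + 0.5, mid - max * mid);
    ctx.lineTo(x + 0.5, mid - min * mid);
  }
  ctx.stroke();
  return canvas;
}
```

At draw time, the viewer then only needs drawImage calls for the pre-rendered chunks that intersect the visible scroll window.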
Application Architecture
- State Management: Built with Zustand to manage the complex state of the editor: cursor position, selection regions, zoom level, and the history stack (a trimmed-down store is sketched below).
- SPA Performance: The app is a Single Page Application (SPA) optimized with Vite. We use aggressive code splitting so the heavy audio-processing modules load only when a file is actually imported (see the lazy-import sketch below).
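As an illustration of the store shape (not SilentR's actual state tree), a trimmed-down Zustand slice covering cursor, zoom, regions, and a naive undo stack might look like this:

```typescript
import { create } from 'zustand';

interface Region { start: number; end: number; kind: 'keep' | 'remove' }

interface EditorState {
  cursor: number;      // playhead position in seconds
  zoom: number;        // samples per pixel
  regions: Region[];
  history: Region[][]; // simple undo stack of past region sets
  setCursor: (t: number) => void;
  setZoom: (z: number) => void;
  toggleRegion: (index: number) => void;
  undo: () => void;
}

export const useEditorStore = create<EditorState>((set) => ({
  cursor: 0,
  zoom: 512,
  regions: [],
  history: [],
  setCursor: (t) => set({ cursor: t }),
  setZoom: (z) => set({ zoom: z }),
  toggleRegion: (index) =>
    set((state) => ({
      history: [...state.history, state.regions],
      regions: state.regions.map((r, i) =>
        i === index ? { ...r, kind: r.kind === 'keep' ? 'remove' : 'keep' } : r,
      ),
    })),
  undo: () =>
    set((state) => {
      const prev = state.history[state.history.length - 1];
      return prev ? { regions: prev, history: state.history.slice(0, -1) } : state;
    }),
}));
```

Components subscribe with selectors, e.g. const cursor = useEditorStore((s) => s.cursor), so a moving playhead doesn't force the whole editor to re-render.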
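The code-splitting point boils down to a dynamic import at the file-import boundary; the module path below is an assumed one.

```typescript
// The heavy WASM decoder chunk is only fetched when a file is actually imported.
// './audio/decode' is an assumed module path; Vite splits dynamic imports into separate chunks.
async function onFileSelected(file: File) {
  const { decodeToWav } = await import('./audio/decode');
  const wavBytes = await decodeToWav(file);
  // ...hand the decoded audio to the editor
}
```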
Technology Stack
- Frontend: React, TypeScript, Vite, TailwindCSS
- Audio Core: Web Audio API, AudioWorklets, FFmpeg.wasm
- Visualization: HTML5 Canvas, React Konva (for overlay editors)
- State: Zustand, Immer
- Deployment: Vercel (Edge Network)
Challenges & Solutions
Challenge: Memory limits in the browser (Tab crash).
Solution: Browsers cap memory usage per tab (often ~2GB). Attempting to load a decoded 3-hour WAV file into memory as a Float32Array can crash the tab. We implemented a "Streaming Decode" approach where we process the file in 30-second buffers, calculate the silence regions, and then discard the raw PCM data, storing only the metadata. We only re-decode specific chunks when the user hits "Play" (a simplified version of this loop is sketched below).
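A simplified version of that loop, reusing the hypothetical detectSilence, SilenceRegion, and ffmpeg helpers from the sketches above; the -ss/-t seek flags, the 30-second window, and the bookkeeping are illustrative, not the production implementation.

```typescript
// Decode one 30-second slice to raw 32-bit float PCM, analyze it, then let it go.
async function scanForSilence(fileName: string, totalSeconds: number): Promise<SilenceRegion[]> {
  const CHUNK_SECONDS = 30;
  const SAMPLE_RATE = 48000;
  const regions: SilenceRegion[] = [];

  for (let start = 0; start < totalSeconds; start += CHUNK_SECONDS) {
    await ffmpeg.run(
      '-ss', String(start), '-t', String(CHUNK_SECONDS),
      '-i', fileName,
      '-f', 'f32le', '-ac', '1', '-ar', String(SAMPLE_RATE),
      'chunk.pcm',
    );
    const bytes = ffmpeg.FS('readFile', 'chunk.pcm');
    // Copy into a fresh, aligned buffer before viewing it as floats.
    const samples = new Float32Array(bytes.slice().buffer);

    // Keep only the metadata (offset by the chunk start), then drop the raw PCM.
    for (const r of detectSilence(samples, SAMPLE_RATE)) {
      regions.push({ start: r.start + start, end: r.end + start });
    }
    ffmpeg.FS('unlink', 'chunk.pcm');
  }
  return regions;
}
```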
Challenge: Synchronization between Visuals and Audio.
Solution: AudioContext.currentTime is accurate, but it advances in discrete steps (one render quantum at a time) and is not synchronized with the requestAnimationFrame loop, so reading it directly every frame yields a visibly stepped playhead. We created a custom hook, useAudioSync, that interpolates the playhead position between updates to deliver silky-smooth cursor movement that stays aligned with the sound, which is essential for precise editing (sketched below).
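A sketch of the interpolation idea; the real useAudioSync presumably also handles pause, seek, and playback-rate changes.

```typescript
import { useEffect, useRef } from 'react';

// Anchor AudioContext.currentTime to the wall clock each time it advances,
// then extrapolate between those updates so the playhead moves every frame.
export function useAudioSync(ctx: AudioContext, onTick: (playheadSeconds: number) => void) {
  const anchor = useRef({ audioTime: 0, perfTime: 0 });

  useEffect(() => {
    let raf = 0;
    const loop = () => {
      const audioTime = ctx.currentTime;
      if (audioTime !== anchor.current.audioTime) {
        // currentTime stepped forward: re-anchor against performance.now().
        anchor.current = { audioTime, perfTime: performance.now() };
      }
      // Extrapolate forward from the last observed currentTime.
      const interpolated =
        anchor.current.audioTime + (performance.now() - anchor.current.perfTime) / 1000;
      onTick(interpolated);
      raf = requestAnimationFrame(loop);
    };
    raf = requestAnimationFrame(loop);
    return () => cancelAnimationFrame(raf);
  }, [ctx, onTick]);
}
```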