@bseddon
Created June 4, 2025 09:14
Splitting audio files in silent spaces using PHP
<?php
/*
* This file is part of PHP-FFmpeg.
*
* (c) Alchemy <[email protected]>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace FFMpeg\Format\Audio;
require_once __DIR__.'/SilenceDetectListener.php';
use FFMpeg\Format\Listener\SilenceDetectListener;
use Evenement\EventEmitter;
use FFMpeg\Exception\InvalidArgumentException;
use FFMpeg\FFProbe;
use FFMpeg\Format\AudioInterface;
use FFMpeg\Format\ProgressableInterface;
use FFMpeg\Media\MediaTypeInterface;
class NullAudio extends EventEmitter implements AudioInterface, ProgressableInterface
{
/** @var string */
protected $audioCodec = null;
/** @var int */
protected $audioKiloBitrate = null;
/** @var int */
protected $audioChannels = null;
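/**
* Silences collected by SilenceDetectListener, each an array with
* 'start', 'end' and 'duration' keys (values in seconds).
*
* @var array[]
*/
public $silences = [];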
/**
* {@inheritDoc}
*/
public function getAvailableAudioCodecs()
{
return [];
}
/**
* {@inheritdoc}
*/
public function getExtraParams()
{
return ['-af', 'silencedetect=n=-50dB:d=1', '-f', 'null'];
}
/**
* {@inheritdoc}
*/
public function getAudioCodec()
{
return $this->audioCodec;
}
/**
* Not supported; the value is ignored.
*
* @param string $audioCodec
*
* @return $this
*/
public function setAudioCodec($audioCodec)
{
// $this->audioCodec = $audioCodec;
return $this;
}
/**
* {@inheritdoc}
*/
public function getAudioKiloBitrate()
{
return $this->audioKiloBitrate;
}
/**
* Not supported; the value is ignored.
*
* @param int $kiloBitrate
*
* @return $this
*/
public function setAudioKiloBitrate($kiloBitrate)
{
return $this;
}
/**
* {@inheritdoc}
*/
public function getAudioChannels()
{
return $this->audioChannels;
}
/**
* Not supported; the value is ignored.
*
* @param int $channels
*
* @return $this
*/
public function setAudioChannels($channels)
{
return $this;
}
/**
* {@inheritdoc}
*/
public function createProgressListener(MediaTypeInterface $media, FFProbe $ffprobe, $pass, $total, $duration = 0)
{
$format = $this;
$listener = new SilenceDetectListener($media, $format);
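// Note: SilenceDetectListener::handle() stores its results directly in $format->silences
// and does not emit a 'silence' event itself, so this callback only runs if one is emitted elsewhere.
$listener->on('silence', function () use ($media, $format) {
// $format->emit('progress', array_merge([$media, $format], func_get_args()));
echo 'Silence detected in ' . $media->getPathfile() . PHP_EOL;
});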
return [$listener];
}
/**
* {@inheritDoc}
*/
public function getPasses()
{
return 1;
}
}

At the time of writing, OpenAI's Whisper service limits the size of files to be transcribed to 25 MB. That is roughly 17 minutes of audio at a 192 kbit/s bit rate. To transcribe bigger audio files it is necessary to split them. The OpenAI page about Whisper recommends splitting the file during moments of silence so that sentences are not split across two files.
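
The relationship between file size, bit rate and playing time can be checked with a few lines of arithmetic. The sketch below is only an estimate: the function name maxMinutesForSize is hypothetical, it assumes a constant bit rate and 1 MB = 1,000,000 bytes, and it ignores container overhead. The 10 MB figure is just a convenient unit to scale from.

```php
<?php
declare(strict_types=1);

/**
 * Estimate how many minutes of constant-bit-rate audio fit in a size budget.
 * Hypothetical helper: assumes 1 MB = 1,000,000 bytes and no container overhead.
 */
function maxMinutesForSize(float $sizeMb, int $kiloBitrate): float
{
    if ($sizeMb <= 0 || $kiloBitrate <= 0) {
        throw new InvalidArgumentException('Size and bit rate must be positive.');
    }

    $bits = $sizeMb * 1000000 * 8;            // size budget in bits
    $seconds = $bits / ($kiloBitrate * 1000); // bits divided by bits per second

    return $seconds / 60;
}

// Minutes of audio per 10 MB at a few common bit rates
foreach ([64, 128, 192] as $kbps) {
    printf("%d kbit/s: about %.1f minutes per 10 MB\n", $kbps, maxMinutesForSize(10.0, $kbps));
}
```

Multiplying the per-10 MB figure by the service's current limit gives the longest clip that can be sent in one request at that bit rate.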

FFMPEG can both detect silences and split files, and doing this manually is easy. The PHP-FFMpeg repository implements a great PHP wrapper around FFMPEG that supports programmatic control of FFMPEG, which is really useful for anyone responsible for a WordPress site.

The repository code surfaces the FFMPEG features needed to split files but not its silence detection capability. This Gist provides two files, each containing a class, that together make it possible to capture silence information. I'm not an expert on the repo code so there may be a better, simpler way, but it works for me. This Gist says a few things about the files and provides an example of using them.

The wrapper captures output from FFMPEG as it runs and provides that information as event data that classes can register to listen for. SilenceDetectListener.php implements a class to handle FFMPEG output about silences within an audio file. Silence information is returned as pairs of lines: the first gives the time code of the start of the silence, the second gives the end and the duration, all in seconds.

[silencedetect @ 000001daa80cb5c0] silence_start: 555.333016
[silencedetect @ 000001daa80cb5c0] silence_end: 563.114649 | silence_duration: 7.781633
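
To make the pairing of these lines concrete, here is a minimal standalone sketch that turns such output into start/end/duration records. It is independent of the listener class in this Gist: the function name parseSilences and the exact pattern are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Parse silencedetect log lines into start/end/duration records.
 * Hypothetical helper for illustration; not part of PHP-FFMpeg.
 *
 * @return array<int, array{start: float, end: float, duration: float}>
 */
function parseSilences(string $log): array
{
    // One alternative per line type: silence_start, or silence_end with its duration
    $pattern = '/silence_start:\s*(?<start>\S+)'
        . '|silence_end:\s*(?<end>\S+)\s*\|\s*silence_duration:\s*(?<duration>\S+)/';

    $silences = [];
    $pendingStart = null; // start time waiting for its matching silence_end line

    preg_match_all($pattern, $log, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        if (($match['start'] ?? '') !== '') {
            $pendingStart = (float) $match['start'];
        } elseif ($pendingStart !== null) {
            $silences[] = [
                'start'    => $pendingStart,
                'end'      => (float) $match['end'],
                'duration' => (float) $match['duration'],
            ];
            $pendingStart = null;
        }
    }

    return $silences;
}

$log = <<<'LOG'
[silencedetect @ 000001daa80cb5c0] silence_start: 555.333016
[silencedetect @ 000001daa80cb5c0] silence_end: 563.114649 | silence_duration: 7.781633
LOG;

print_r(parseSilences($log));
```

Remembering the start time until the matching end line appears means an end line without a preceding start is simply ignored rather than producing a half-filled record.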

The code includes a regular expression to parse an arbitrary number of these pairs into a set of records, each holding the start, end and duration of a silence. The records are stored in the silences property of the format object, which leads on to the second file.

NullAudio.php implements a class that serves as the 'format' required by the repo code's save function. The save function runs FFMPEG to, for example, transcode from MP3 to WMA, so a format implementation normally represents the expected output format. Silence detection produces no output audio, but the save function still requires a format, so this class implements a null format. The two other essential things it does are:

  • register the silence detection class and
  • define the specific silence detection parameters (noise threshold and minimum duration) to be used.

In this implementation of the NullAudio class, the parameters are hard-coded but the class can be modified to take parameters in a constructor or via a 'set' method.
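
One possible way to make the parameters configurable is sketched below. The class SilenceDetectOptions and its method toExtraParams are hypothetical names, not part of PHP-FFMpeg or of the classes in this Gist, and the validation rules are my own assumptions. Only the n (noise) and d (duration) options of FFMPEG's silencedetect filter are used, as in the hard-coded version.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical value object holding silencedetect settings.
 * Not part of PHP-FFMpeg; shown only to illustrate the idea.
 */
final class SilenceDetectOptions
{
    private float $noiseDb;
    private float $minDuration;

    public function __construct(float $noiseDb = -50.0, float $minDuration = 1.0)
    {
        // Assumption: the threshold is given in dB below full scale, so it cannot be positive
        if ($noiseDb > 0) {
            throw new InvalidArgumentException('Noise threshold must be 0 dB or lower.');
        }
        if ($minDuration <= 0) {
            throw new InvalidArgumentException('Minimum duration must be positive.');
        }

        $this->noiseDb = $noiseDb;
        $this->minDuration = $minDuration;
    }

    /**
     * Build the argument list in the same shape NullAudio::getExtraParams() returns.
     *
     * @return string[]
     */
    public function toExtraParams(): array
    {
        $filter = sprintf('silencedetect=n=%sdB:d=%s', $this->noiseDb, $this->minDuration);

        return ['-af', $filter, '-f', 'null'];
    }
}

// The defaults reproduce the hard-coded values; a stricter setting is shown second
print_r((new SilenceDetectOptions())->toExtraParams());
print_r((new SilenceDetectOptions(-35.0, 0.5))->toExtraParams());
```

NullAudio could accept such an object in its constructor and return toExtraParams() from getExtraParams(), which would leave the rest of the class unchanged.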
// Adjust the path to wherever NullAudio.php is saved (forward slashes work on Windows too)
require_once 'misc_pages/audio/NullAudio.php';

// Instantiate the main class and open an audio file
$ffmpeg = FFMpeg\FFMpeg::create();
$audio = $ffmpeg->open( $filePath );

// NullAudio is a custom format that does not actually save the audio, but allows us to process it and detect silences
$format = new FFMpeg\Format\Audio\NullAudio();
// Using this format, the command run will be
// ffmpeg -i $filePath -af silencedetect=n=-50dB:d=1 -f null -
// (see NullAudio::getExtraParams() for the filter arguments)
$audio->save( $format, '-');

// Each entry holds the 'start', 'end' and 'duration' of one silence, in seconds
print_r( $format->silences );
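
Once the silences are known, the next step is to decide where to cut. The sketch below is one possible approach rather than anything provided by the Gist or by PHP-FFMpeg: it cuts at the midpoint of a silence, always choosing the latest silence that keeps the current segment within a maximum length, and falls back to a hard cut when no silence is in range. The function name chooseSplitPoints and the sample values are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Choose cut points (in seconds) so that no segment exceeds $maxSegment.
 * Hypothetical helper; expects records with 'start' and 'end' keys.
 *
 * @param array<int, array{start: float, end: float}> $silences
 * @return float[]
 */
function chooseSplitPoints(array $silences, float $totalDuration, float $maxSegment): array
{
    if ($maxSegment <= 0) {
        throw new InvalidArgumentException('Maximum segment length must be positive.');
    }

    // Candidate cut points: the midpoint of each silence
    $candidates = array_map(fn(array $s): float => ($s['start'] + $s['end']) / 2, $silences);

    $cuts = [];
    $segmentStart = 0.0;

    // Keep cutting while the remaining audio is longer than one segment
    while ($totalDuration - $segmentStart > $maxSegment) {
        $best = null;
        foreach ($candidates as $point) {
            $inRange = $point > $segmentStart && $point - $segmentStart <= $maxSegment;
            if ($inRange && ($best === null || $point > $best)) {
                $best = $point; // latest silence that still fits
            }
        }

        // No silence within reach: fall back to a hard cut at the limit
        $best = $best ?? $segmentStart + $maxSegment;

        $cuts[] = $best;
        $segmentStart = $best;
    }

    return $cuts;
}

// Sample data: three silences in a 700 second file, segments of at most 300 seconds
$silences = [
    ['start' => 100.0, 'end' => 104.0],
    ['start' => 290.0, 'end' => 294.0],
    ['start' => 555.333016, 'end' => 563.114649],
];

print_r(chooseSplitPoints($silences, 700.0, 300.0));
```

Each pair of consecutive cut points then defines one segment, which can be extracted with FFMPEG, for instance using its -ss and -to options.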
<?php
namespace FFMpeg\Format\Listener;
use Alchemy\BinaryDriver\Listeners\ListenerInterface;
use Evenement\EventEmitter;
use FFMpeg\Exception\RuntimeException;
use FFMpeg\Format\Audio\NullAudio;
use FFMpeg\Media\AbstractStreamableMedia;
/**
* Parses ffmpeg silencedetect information. An example:
*
* <pre>
* [silencedetect @ 000001daa80cb5c0] silence_start: 555.333016
* [silencedetect @ 000001daa80cb5c0] silence_end: 563.114649 | silence_duration: 7.781633
* </pre>
*
* @author Robert Gruendler <[email protected]>
*/
class SilenceDetectListener extends EventEmitter implements ListenerInterface
{
/**
* @var AbstractStreamableMedia
*/
public $media = null;
/**
* @var NullAudio
*/
public $audioFormat = null;
/**
* SilenceDetectListener constructor.
*
* @param AbstractStreamableMedia $media The media being analysed
* @param NullAudio|null $audioFormat The format object that collects the detected silences
*
* @throws RuntimeException If either argument is missing or of the wrong type
*/
public function __construct( $media, ?NullAudio $audioFormat = null )
{
if ( ! $media instanceof AbstractStreamableMedia )
{
throw new RuntimeException( 'SilenceDetectListener requires an AbstractStreamableMedia instance.' );
}
if ( ! $audioFormat instanceof NullAudio )
{
throw new RuntimeException( 'SilenceDetectListener requires a NullAudio instance.' );
}
$this->media = $media;
$this->audioFormat = $audioFormat;
}
/**
* Start time of a silence whose matching silence_end line has not been seen yet.
*
* @var float|null
*/
private $pendingStart = null;
/**
* Matches either a silence_start line or a silence_end line of silencedetect output.
*/
public function getPattern()
{
return '/\[silencedetect[^\]]*\]\s(?:silence_start:\s(?<start>\S+)|silence_end:\s(?<end>\S+)\s\|\ssilence_duration:\s(?<duration>\S+))/';
}
/**
* {@inheritdoc}
*/
public function handle($type, $data)
{
$matches = [];
// preg_match_all() returns false on error and 0 when nothing matched
if ( ! preg_match_all( $this->getPattern(), $data, $matches ) )
{
return null;
}
if ( ! $this->media )
{
return null;
}
// There is one entry per matched line: a silence_start line fills only the 'start' group,
// a silence_end line fills the 'end' and 'duration' groups.
// The start is remembered so a pair still matches up if FFMPEG's output arrives in separate chunks.
foreach ( $matches[0] as $index => $line )
{
if ( $matches['start'][ $index ] !== '' )
{
$this->pendingStart = (float) $matches['start'][ $index ];
continue;
}
// A silence_end line without a preceding silence_start is ignored
if ( $this->pendingStart === null )
{
continue;
}
$this->audioFormat->silences[] = [
'start' => $this->pendingStart,
'end' => (float) $matches['end'][ $index ],
'duration' => (float) $matches['duration'][ $index ],
];
$this->pendingStart = null;
}
}
/**
* {@inheritdoc}
*/
public function forwardedEvents()
{
return [];
}
}