@dxrcy
Created May 25, 2023 02:05
Extract audio from every PowerPoint file in a folder.
#!/bin/bash
# === PPTX Audio ===
# Extracts the audio from every PowerPoint file in a folder and concatenates it into one file.
# Requires `ffmpeg` and `unzip` to be installed.
# See the config below to set the input, output, and temporary paths.
# Output must be `m4a` (MPEG-4 audio), as that is the format PowerPoint uses.
# The slowest step is the `unzip` command.
# =============== CONFIG =================
# Directory of input pptx files
in='pptx'
# Output file for audio
# Must be m4a, unless the ffmpeg command is changed
out='audio.m4a'
# Temporary file directory
temp='/tmp/pptx-audio'
# ========================================
# Clean temp dir
[ -d "$temp" ] && rm -r "$temp"
mkdir "$temp"
mkdir "$temp/pptx" # Extracted pptx data
mkdir "$temp/audio" # All audio files
# Unzip each pptx file and collect its audio files in one folder
echo 'Extracting audio from all pptx files...'
for pptx in "$in"/*.pptx; do
    # Skip the unexpanded pattern if the folder has no pptx files
    [ -e "$pptx" ] || continue
    # Get name, without path or file extension, with spaces replaced by underscores
    name=$(basename -- "$pptx")
    name=${name%.*}
    name=${name// /_}
    # Unzip pptx to temp folder
    echo 'Unzipping pptx file...'
    unzip "$pptx" -d "$temp/pptx" \
        || break
    echo 'Moving out all audio files.'
    # Find all audio files (m4a)
    for audio in "$temp"/pptx/ppt/media/*.m4a; do
        # Skip the unexpanded pattern if this pptx has no m4a audio
        [ -e "$audio" ] || continue
        # Move audio file to temporary folder
        id=$(basename -- "$audio")
        mv "$audio" "$temp/audio/${name}_$id"
    done
    # Delete extracted folder, to save disk space during process
    rm -r "$temp/pptx"
done
# Create audio inputs file for ffmpeg
echo 'Including all audio files as inputs.'
: > "$temp/audio.txt"
for audio in "$temp"/audio/*; do
    # Skip the unexpanded pattern if no audio files were found
    [ -e "$audio" ] || continue
    # Set path relative to audio.txt file
    audio=${audio#"$temp"/}
    echo "including: $audio"
    # Add to list of inputs, one `file '...'` line per audio file
    printf "file '%s'\n" "$audio" >> "$temp/audio.txt"
done
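# For illustration only: with two decks named `Week 1.pptx` and `Week 2.pptx`
# (hypothetical names, each holding one audio clip), the list written to
# audio.txt would look roughly like:
#   file 'audio/Week_1_media1.m4a'
#   file 'audio/Week_2_media1.m4a'
# ffmpeg resolves each relative path against the directory containing audio.txt.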
# Run ffmpeg command, with audio inputs from text file
# `-c copy` tells ffmpeg to copy data, without re-encoding (same codec) (very fast!)
echo 'Concatenating audio files...'
ffmpeg -y \
    -f concat \
    -i "$temp/audio.txt" \
    -c copy \
    "$out" \
    && echo "Saved to $out"
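# A possible variation, not part of the original script: to write a different
# format, `-c copy` cannot be used, because the audio has to be re-encoded.
# Assuming an ffmpeg build that includes the libmp3lame encoder, an MP3
# (with the hypothetical output name audio.mp3) could be produced with:
#   ffmpeg -y -f concat -i "$temp/audio.txt" -c:a libmp3lame audio.mp3
# This is slower than copying, since every input is decoded and encoded again.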
# Remove temp dir
rm -r "$temp"