vunb · February 14, 2021 18:04 · vunb · Oct 24, 2013
diff --git a/clean_audio.sh b/clean_audio.sh
 # Create background noise profile from mp3
 /usr/bin/sox noise.mp3 -n noiseprof noise.prof
 
 # Remove noise from mp3 using profile
 /usr/bin/sox input.mp3 output.mp3 noisered noise.prof 0.21
 
 # Remove silence from mp3
 /usr/bin/sox input.mp3 output.mp3 silence -l 1 0.3 5% -1 2.0 5%
 
 # Remove noise and silence in a single command
 /usr/bin/sox input.mp3 output.mp3 noisered noise.prof 0.21 silence -l 1 0.3 5% -1 2.0 5%
 
 # Batch process files
 /usr/bin/find . -type f -name "*.mp3" -mmin +30 -exec sox -S --multi-threaded --buffer 131072 {} /path/to/output/{} noisered noise.prof 0.21 silence -l 1 0.3 5% -1 2.0 5% \;
 
 # Remove insignificant files
 /usr/bin/find . -type f -name "*.mp3" -mmin +30 -size -500k -delete
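The batch command writes each result to /path/to/output/{}, which relies on find substituting {} inside a longer argument (GNU find does this) and on any subdirectories already existing under the output directory. A minimal sketch of a per-file helper that creates the target directory first. The names out_path, clean_one and DRY_RUN are hypothetical and not part of the original script; the sox arguments are copied from the commands above.

```shell
#!/bin/sh
# Sketch only: out_path, clean_one and DRY_RUN are illustrative names.

# Map a path as printed by `find .` (e.g. ./a/b.mp3) to a path under the output directory.
out_path() {
    printf '%s/%s\n' "${2%/}" "${1#./}"
}

# Denoise and trim one file, creating the target directory first.
# With DRY_RUN=1 the sox command is printed instead of executed.
clean_one() {
    target=$(out_path "$1" "$2")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "sox $1 $target noisered noise.prof 0.21 silence -l 1 0.3 5% -1 2.0 5%"
    else
        mkdir -p "$(dirname "$target")" &&
        sox "$1" "$target" noisered noise.prof 0.21 silence -l 1 0.3 5% -1 2.0 5%
    fi
}

# Possible usage (breaks on file names that contain newlines):
#   find . -type f -name '*.mp3' -mmin +30 |
#       while IFS= read -r f; do clean_one "$f" /path/to/output; done
```

Running it once with DRY_RUN=1 shows the commands that would be executed before any file is written.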
diff --git a/CMU Sphinx - Speech Recognition b/CMU Sphinx - Speech Recognition
 CMU Sphinx 
 http://cmusphinx.sourceforge.net/wiki/tutorialam
 http://www.speech.cs.cmu.edu/sphinxman/scriptman1.html


 PocketSphinx
 http://ghatage.com/2012/12/voice-to-text-in-linux-using-pocketsphinx/
 http://ghatage.com/2012/12/make-pocketsphinx-recognize-new-words/



 Language model adaptation:
 http://pwnetics.wordpress.com/2011/07/01/sphinx-4-language-model-adaptation/
diff --git a/Convert to sphinxAudioFormat b/Convert to sphinxAudioFormat
 1. Convert wav to the standard input format for Sphinx:
 Input File     : 'resampled.wav'
 Channels       : 1
 Sample Rate    : 16000
 Precision      : 16-bit
 Duration       : 00:00:02.62 = 41878 samples ~ 196.303 CDDA sectors
 Sample Encoding: 16-bit Signed Integer PCM


 2. Command to convert a single file:
 Run: sox [input.wav] -r 16k -e signed -b 16 -c 1 [output.wav]
 Short: sox [input.wav] -r 16k [output.wav]


 Before: 

 [vi@Manlab wav]$ file khong8k.wav 
 khong8k.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 8000 Hz

 [vi@Manlab wav]$ soxi khong8k.wav 

 Input File     : 'khong8k.wav'
 Channels       : 1
 Sample Rate    : 8000
 Precision      : 16-bit
 Duration       : 00:00:02.62 = 20939 samples ~ 196.303 CDDA sectors
 Sample Encoding: 16-bit Signed Integer PCM

 Full command:

 [vi@Manlab wav]$ sox khong8k.wav -r 16k -e signed -b 16 -c 1 khong16k.wav

 Short form, sufficient for the input above because it is already 16-bit signed mono:

 [vi@Manlab wav]$ sox khong8k.wav -r 16k khong16k.wav


 After:

 [vi@Manlab wav]$ file khong16k.wav 
 khong16k.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 16000 Hz

 [vi@Manlab wav]$ soxi khong16k.wav 

 Input File     : 'khong16k.wav'
 Channels       : 1
 Sample Rate    : 16000
 Precision      : 16-bit
 Duration       : 00:00:02.62 = 41878 samples ~ 196.303 CDDA sectors
 Sample Encoding: 16-bit Signed Integer PCM
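The soxi fields to compare against the target format are Channels, Sample Rate, Precision and Sample Encoding. A minimal sketch of a check that reads soxi output on stdin and succeeds only when all four match the values listed in step 1. The function name is_sphinx_format is hypothetical, and the parsing assumes the "Key : value" layout shown in the output above.

```shell
#!/bin/sh
# Sketch only: is_sphinx_format is an illustrative name.
# Reads `soxi FILE` output on stdin; succeeds only for mono, 16000 Hz,
# 16-bit precision, signed integer PCM.
is_sphinx_format() {
    awk -F: '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the field name
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim padding around the value
        }
        key == "Channels"        && val == "1"                         { ch = 1 }
        key == "Sample Rate"     && val == "16000"                     { sr = 1 }
        key == "Precision"       && val == "16-bit"                    { pr = 1 }
        key == "Sample Encoding" && val == "16-bit Signed Integer PCM" { en = 1 }
        END { exit !(ch && sr && pr && en) }
    '
}

# Possible usage:
#   soxi khong16k.wav | is_sphinx_format && echo "already in Sphinx format"
```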


 3. Shell batch:

 [vi@Manlab wav]$ for i in test/* ; do echo "$i" ; done
diff --git a/etc - rename config file.sh b/etc - rename config file.sh
 for i in huanluyen_diadiem* ; do mv -- "$i" "${i:10}" ; done
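The ${i:10} expansion drops the first 10 characters of each name, which is the length of the huanluyen_ prefix. It is a bash extension and cuts 10 characters whether or not the prefix is present. A minimal sketch of the same rename using the POSIX ${var#prefix} form, which leaves a name unchanged when it lacks the prefix. The function names strip_prefix and rename_all are hypothetical.

```shell
#!/bin/sh
# Sketch only: strip_prefix and rename_all are illustrative names.

# Remove a leading "huanluyen_" from a file name, if present.
strip_prefix() {
    printf '%s\n' "${1#huanluyen_}"
}

# Rename every matching file in the current directory.
# The -e test skips the unexpanded pattern when nothing matches.
rename_all() {
    for i in huanluyen_diadiem*; do
        [ -e "$i" ] || continue
        mv -- "$i" "$(strip_prefix "$i")"
    done
}
```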
diff --git a/pocketsphinx_continuous b/pocketsphinx_continuous
 Error: the recording device cannot be opened when using pocketsphinx_continuous:

 INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
 INFO: continuous.c(367): pocketsphinx_continuous COMPILED ON: Apr  3 2012, AT: 17:50:38

 ad_oss.c(103): Failed to open audio device(/dev/dsp): No such file or directory
 FATAL_ERROR: "continuous.c", line 246: Failed to open audio device


 Solutions:
 1. Install the ALSA development package and recompile sphinxbase.
 Run: yum install 'alsa-*'

 2. If you still get the error message: ad_oss.c(103): Failed to open audio device(/dev/dsp): No such file or directory
 Then run "modprobe snd_pcm_oss" as root.

 3. If you then get a different error message: ad_oss.c(99): Audio device(/dev/dsp) busy
 Then close all applications that are recording or otherwise using the audio device.
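Before working through the solutions, it can help to confirm which case applies. A minimal sketch that only tests whether the device node exists and is readable; it does not detect the "busy" case. The function name check_audio_device is hypothetical, and /dev/dsp is the path taken from the error message.

```shell
#!/bin/sh
# Sketch only: check_audio_device is an illustrative name.
# Prints "missing", "unreadable" or "present" for the given device path
# (default: /dev/dsp, the path named in the error message).
check_audio_device() {
    dev="${1:-/dev/dsp}"
    if [ ! -e "$dev" ]; then
        echo "missing"      # corresponds to "No such file or directory"
        return 1
    elif [ ! -r "$dev" ]; then
        echo "unreadable"   # the node exists but the current user cannot read it
        return 2
    fi
    echo "present"
}
```

If it prints "missing" after the ALSA packages are installed, loading snd_pcm_oss as described in step 2 is the next thing to try.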
diff --git a/sox - noise removal b/sox - noise removal
 noiseprof [profile-file]

 Calculate a profile of the audio for use in noise reduction. See the description of the noisered effect for details.

 noisered [profile-file [amount]]

 Reduce noise in the audio signal by profiling and filtering. This effect is moderately effective at removing consistent background noise such as hiss or hum. To use it, first run SoX with the noiseprof effect on a section of audio that ideally would contain silence but in fact contains noise; such sections are typically found at the beginning or the end of a recording. noiseprof will write out a noise profile to profile-file, or to stdout if no profile-file is given or if '-' is given. E.g.

   sox speech.wav -n trim 0 1.5 noiseprof speech.noise-profile

 To actually remove the noise, run SoX again, this time with the noisered effect; noisered will reduce noise according to a noise profile (which was generated by noiseprof), read from profile-file, or from stdin if no profile-file is given or if '-' is given. E.g.

   sox speech.wav cleaned.wav noisered speech.noise-profile 0.3

 How much noise should be removed is specified by amount, a number between 0 and 1 with a default of 0.5. Higher numbers will remove more noise but present a greater likelihood of removing wanted components of the audio signal. Before replacing an original recording with a noise-reduced version, experiment with different amount values to find the optimal one for your audio; use headphones to check that you are happy with the results, paying particular attention to quieter sections of the audio.

 On most systems, the two stages - profiling and reduction - can be combined using a pipe, e.g.

   sox noisy.wav -n trim 0 1 noiseprof | play noisy.wav noisered
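Since amount must lie between 0 and 1, a script that takes it as a parameter may want to validate it before calling sox. A minimal sketch of such a check; the function name valid_amount is hypothetical and the check is not part of SoX itself.

```shell
#!/bin/sh
# Sketch only: valid_amount is an illustrative name.
# Succeeds when the argument is a plain decimal number in the range 0..1.
valid_amount() {
    case "$1" in
        ''|*[!0-9.]*|*.*.*|.) return 1 ;;   # empty, non-numeric, two dots, or a lone dot
    esac
    awk -v a="$1" 'BEGIN { exit !(a + 0 >= 0 && a + 0 <= 1) }'
}

# Possible usage:
#   valid_amount "$amount" &&
#       sox speech.wav cleaned.wav noisered speech.noise-profile "$amount"
```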
diff --git a/sphinx4Normalize.sh b/sphinx4Normalize.sh
 #!/usr/bin/env bash
 usage="Help, usage: sphinx4Normalize -i /path/to/audio/input/ -o /path/to/audio/output/ [-t wav|mp3]"

 # Get the number of arguments from $#; print the usage text if there are none.
 if [ $# -eq 0 ]; then
     echo "$usage"
     exit 128
 fi

 # Walk the argument list ($@), consuming one option and its value per iteration.
 input=""
 output=""
 fileout=""
 type=wav

 while [ "$1" != "" ]; do
     case $1 in
         -i | -di | --input )    shift
                                 input=$1
                                 ;;
         -o | -do | --output )   shift
                                 output=$1
                                 ;;
         -t | --type )           shift
                                 # Only wav and mp3 are accepted; anything else keeps the default.
                                 case $1 in
                                     wav|mp3)
                                         type=$1
                                         ;;
                                 esac
                                 ;;
         -h | --help )           echo "$usage"
                                 exit 0
                                 ;;
         * )                     echo "$usage"
                                 exit 1
                                 ;;
     esac
     shift
 done

 # Both the input and the output directory are required.
 if [ -z "$input" ] || [ -z "$output" ]; then
     echo "$usage"
     exit 128
 fi

 # Convert every matching file to 16 kHz, 16-bit signed, mono.
 for i in "$input"/*"$type"; do
     fileout="$output/$(basename "$i")"
     echo "Processing $i"
     sox "$i" -r 16k -e signed -b 16 -c 1 "$fileout"
     echo "Output: $fileout"
 done

 echo "Complete!"