@PatheticMustan
Last active May 5, 2025 19:39
What are computers for?
- entertainment
- communication
- navigation
- science
+ unlocks a whole new frontier in what is computable and what is doable
+ "big data", but slightly irrelevant to topic
Why do computers need to understand language?
- Just like we use language to communicate meaning/intent to other people, language is one of the ways we can interact with our tools/computers
- Computers can even be a language aid
+ language translation
+ emails, voice transcription
+ voice recording/sending
+ hearing aids
- or we can use it to turn natural language into computer queries to make certain tasks easier (information/action)
+ "What's the temperature at 2pm tomorrow?"
+ "What time is the next bus?"
+ "When does Presti's Bakery close on Wednesday?"
+ "Call Mom"
+ "Text Dad to buy cat food"
+ "Turn on the kitchen lights"
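A minimal sketch of turning a few of the example commands above into machine-readable queries, using hand-written patterns. The intent names, slot names, and the `parse_command` helper are hypothetical and exist only for illustration; real assistants rely on trained language-understanding models rather than a fixed list of regular expressions.

```python
import re

# Hypothetical intents and patterns, written by hand for this sketch.
INTENT_PATTERNS = [
    ("call", re.compile(r"^call (?P<contact>.+)$", re.IGNORECASE)),
    ("text", re.compile(r"^text (?P<contact>\w+) to (?P<message>.+)$", re.IGNORECASE)),
    ("lights", re.compile(r"^turn (?P<state>on|off) the (?P<room>.+) lights$", re.IGNORECASE)),
]


def parse_command(utterance):
    """Return (intent, slots) for the first matching pattern, else (None, {})."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    cleaned = utterance.strip().rstrip("?.!")
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return intent, match.groupdict()
    return None, {}


if __name__ == "__main__":
    for command in ["Call Mom", "Text Dad to buy cat food", "Turn on the kitchen lights"]:
        print(command, "->", parse_command(command))
```

Questions such as "What time is the next bus?" match none of these patterns and come back as `(None, {})`, which hints at why fixed rules do not scale to open-ended language.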
Popular Examples
- Virtual Assistants (Siri/Alexa/Google Assistant)
- Watson
Speech Synthesis
- Vocaloids
Natural Language Processing (NLP)
- Google Translate
- turning voice sample into machine queries
- understanding user intent
Speech Recognition
Difficulties with V
What is Synthetic Speech? History
- text to speech
Common Types
- recording words, putting them together
- recording individual phonemes
- Miku --> Vocaloids
- how does Siri work?
- using ML to create realistic voices
- SOTA ML to recreate a voice from just a sample
+ https://vocloner.com/tts.php?voiceid=028fcee5c13c45c49d182566cc6d5f8c
+ show live demo?
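A rough sketch of the first type listed above (recording words, putting them together). Short sine tones stand in for the recorded words, so the `CLIPS` table, the sample rate, and the `synthesize` helper are all assumptions made for illustration; a real system would load actual recordings of a speaker.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def placeholder_clip(frequency_hz, duration_s=0.2):
    """Stand-in for a recorded word: a short sine tone as a list of samples."""
    n = round(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE) for i in range(n)]


# Hypothetical "recordings", one clip per word the system can say.
CLIPS = {
    "next": placeholder_clip(440),
    "stop": placeholder_clip(550),
    "union": placeholder_clip(660),
    "station": placeholder_clip(770),
}


def synthesize(text, pause_s=0.05):
    """Join the clip for each word, separated by short silences."""
    pause = [0.0] * round(SAMPLE_RATE * pause_s)
    output = []
    for word in text.lower().split():
        if word not in CLIPS:
            raise KeyError(f"no recording for word: {word!r}")
        if output:
            output.extend(pause)  # silence between words, not before the first
        output.extend(CLIPS[word])
    return output


if __name__ == "__main__":
    samples = synthesize("Next stop Union Station")
    print(len(samples), "samples,", len(samples) / SAMPLE_RATE, "seconds")
```

The approach can only say words that were recorded in advance, which is one reason it suits fixed announcements (for example on trains) better than open-ended text.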
Where is it commonly used?
- trains
- smart assistants (Hey Google/Siri/Alexa)
- smart devices
Popular example: Siri
- Siri did not originate at Apple; it started as a standalone app (SRI International AI Center)
- show Siri timeline (https://en.wikipedia.org/wiki/Siri)
- Siri's original voice actor didn't realize she was the voice of Siri (https://www.youtube.com/watch?v=QP-iVhdXjPk)
- show change of Siri's voice over time (https://www.youtube.com/watch?v=QP-iVhdXjPk)
- use cases, how "natural" it's become over the years
- https://www.youtube.com/watch?v=4ryQTkDWmBg
- side tangent: Apple ceded the lead in user voice data to Google/Amazon, who were able to improve voice recognition/naturalness
+ even led to Apple being seen as "stagnant" (https://www.youtube.com/watch?v=4ryQTkDWmBg)
- Siri delay (https://www.youtube.com/watch?v=nSdvj6yphoY)
- Siri was not originally created as a voice assistant, but rather just for "voice recognition"
unrelated:
- https://qz.com/1222958/a-siri-creator-is-still-surprised-by-how-much-siri-cant-do
side tangent: Watson (Jeopardy!)
- https://www.youtube.com/watch?v=P18EdAKuC1U