App that leverages GPT-3 to facilitate new language listening and speaking practice.

Last update: Jan 1, 2023

Talk w/GPT-3 app: Getting started

The Talk w/GPT-3 application was developed by James L. Weaver (the author of this document) to provide more speaking and listening practice in a new language. This application is open source, Apache 2.0 licensed, and leverages the following technologies:

Application framework: Next.js with React
Large language model: GPT-3 from OpenAI
Voice speech to text: react-speech-recognition React library. This library requires using a Chrome browser, or using a polyfill.
Voice text to speech: Amazon Polly
Animated speaking avatars: Ex-Human Talking Heads

This is a "bring your own keys" application, so you'll need keys for OpenAI, Amazon Polly, and optionally Ex-Human.

Here's a three-minute video that demonstrates some of the application's functionality.

Follow the instructions below to try the Talk w/GPT-3 application out for yourself.

Setup

If you don’t have Node.js installed, install it from nodejs.org
Clone this repository
Navigate into the project directory
```
$ cd talk-with-gpt3
```
Install the requirements
```
$ npm install
```
Make a copy of the example environment variables file
```
$ cp .env.example .env
```
Add your OpenAI API key to the newly created .env file
Add your Amazon Polly keys, and optionally your Ex-Human token, to the app. This will require editing the following file:

pages/index.js

Either supply the keys/token directly where indicated, or use environment variables.
Create an optimized production build of the app
```
$ npm run build
```
Start the application as a local server
```
$ npm start
```
Access the app at http://localhost:3000 from a Chrome browser.

The app should appear as shown in the following image:

As the image indicates, you'll initially be speaking English with a 30-year-old male AI character named Matthew.
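
For the key-handling step in Setup, one way to use environment variables is to collect the keys in a small helper and report any that are missing before the app starts. This is a minimal sketch, not the project's actual code: every variable name below (`OPENAI_API_KEY`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `EX_HUMAN_TOKEN`) is an assumption, so check .env.example and pages/index.js for the names the project really uses.

```javascript
// Hypothetical variable names; the project's real names may differ.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"];
const OPTIONAL_KEYS = ["EX_HUMAN_TOKEN"]; // only needed for animated avatars

// Collects keys from an environment-like object and lists
// any required keys that are absent or empty.
function loadKeys(env) {
  const keys = {};
  const missingKeys = [];
  for (const name of REQUIRED_KEYS) {
    if (env[name]) keys[name] = env[name];
    else missingKeys.push(name);
  }
  for (const name of OPTIONAL_KEYS) {
    if (env[name]) keys[name] = env[name];
  }
  return { keys, missingKeys };
}

// Usage: warn about missing keys at startup.
const startupCheck = loadKeys(process.env);
if (startupCheck.missingKeys.length > 0) {
  console.warn(`Missing required keys: ${startupCheck.missingKeys.join(", ")}`);
}
```

Note that Next.js only exposes environment variables to browser code when their names start with `NEXT_PUBLIC_`, so keys read inside a page component would need that prefix or would have to be read on the server side.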

Using the Talk w/GPT-3 app

Toggling the microphone on/off

The microphone is off when the app first appears, so click the muted microphone icon to toggle it on. In addition to the microphone icon changing appearance, the current AI character's voice should announce that the microphone is on. To turn it back off, click the microphone icon again.
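
The toggle described above can be sketched as a small function. The react-speech-recognition library's default export provides `startListening` and `stopListening`; the sketch takes that object as a parameter so it can be tried without a browser. The function name `toggleMicrophone` and its arguments are hypothetical, and the spoken "microphone is on" announcement is left out.

```javascript
// Toggles listening and returns the new microphone state.
// `recognizer` is expected to look like the default export of
// react-speech-recognition (startListening / stopListening).
function toggleMicrophone(recognizer, micOn, language) {
  if (micOn) {
    recognizer.stopListening();
    return false;
  }
  // continuous: true keeps listening across pauses in speech.
  recognizer.startListening({ continuous: true, language });
  return true;
}
```

In a component this would be called from the microphone icon's click handler, with the returned value stored in state to pick the icon's appearance.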

Toggling the AI character's awake/asleep state

When the microphone is on, the voice speech to text facility processes what is heard, but the resultant text is only sent to GPT-3 when the AI character is awake. Consequently, the AI character only responds when awake. To make the AI character go to sleep, either click the awake/asleep icon, or say "go to sleep", "ve a dormir", "va te coucher", or "寝て", in English, Spanish, French or Japanese, respectively. To make the AI character wake up, either click the awake/asleep icon, or say "wake up", "despierta", "réveillez-vous", or "起きて", in English, Spanish, French or Japanese, respectively.
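
The awake/asleep gating described above amounts to a small state transition. The sketch below uses the phrases quoted in this section; the function name, the return shape, and the exact-match comparison are assumptions (the real app may match more loosely, for example when the recognizer adds punctuation).

```javascript
// Sleep and wake phrases quoted in this document, keyed by language.
const SLEEP_PHRASES = { English: "go to sleep", Spanish: "ve a dormir", French: "va te coucher", Japanese: "寝て" };
const WAKE_PHRASES = { English: "wake up", Spanish: "despierta", French: "réveillez-vous", Japanese: "起きて" };

// Returns the next awake state plus the text (if any) to forward to GPT-3.
function handleTranscript(transcript, language, awake) {
  const heard = transcript.trim().toLowerCase();
  if (heard === SLEEP_PHRASES[language]) return { awake: false, forward: null };
  if (heard === WAKE_PHRASES[language]) return { awake: true, forward: null };
  // Speech is transcribed either way, but only forwarded while awake.
  return { awake, forward: awake ? transcript.trim() : null };
}
```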

Note: The author is only proficient at speaking English, so please do create a GitHub issue that points out more natural ways to say any of the non-English phrases in this document.

Selecting a practice language

To select a language other than English to practice, either choose it from the leftmost dropdown, or say "let's switch to X" where X is your desired language. Languages supported currently include English, Spanish, French and Japanese. To switch back to English, either use the dropdown or say "cambiemos a inglés", "passons à l'anglais", or "英語に切り替えましょう", in Spanish, French or Japanese, respectively.
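
A spoken language switch like the one above could be parsed as follows. This is a sketch under stated assumptions: the function name and the regular expression are hypothetical, only the phrases quoted in this section are recognized, and the transcript is assumed to arrive without trailing punctuation.

```javascript
const SUPPORTED_LANGUAGES = ["English", "Spanish", "French", "Japanese"];
// Phrases quoted in this document for switching back to English.
const BACK_TO_ENGLISH = ["cambiemos a inglés", "passons à l'anglais", "英語に切り替えましょう"];

// Returns the requested language name, or null if the transcript
// is not a switch command or names an unsupported language.
function parseLanguageSwitch(transcript) {
  const heard = transcript.trim().toLowerCase();
  if (BACK_TO_ENGLISH.includes(heard)) return "English";
  // Accept both straight and curly apostrophes in "let's".
  const match = heard.match(/^let['’]s switch to ([a-z]+)$/);
  if (!match) return null;
  return SUPPORTED_LANGUAGES.find((name) => name.toLowerCase() === match[1]) ?? null;
}
```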

Selecting an AI character

There are multiple AI characters available for each language, with various genders and default ages. To choose an AI character, use the middle dropdown. Some AI characters (Hiroto and Masahiro) have animated avatars, as noted by [animated] after the AI character's name. You'll need to acquire and set up an Ex-Human token in order to use these AI characters. The following image shows the Masahiro animated avatar having a conversation with a user.

Changing an AI character's age

The age of an AI character is by default included in the GPT-3 prompt, and often affects their responses. To temporarily change an AI character's age, select an age from the rightmost dropdown. This may be leveraged, for example, by a language learner to attempt to constrain the AI character's responses to more commonly used words.

Conversing with an AI character

To converse with an AI character, speak using the selected language. Alternatively, type into the text box at the bottom of the app and either press the enter key or click the Send button. As shown in the previous image, the scrollable text area in the center of the app displays the initial GPT-3 prompt as well as the conversation so far. Each time you take a turn, the contents of the text area plus what you just said is sent as the next prompt to GPT-3.
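
The turn-taking described above, where the text area's contents plus the new utterance become the next prompt, can be sketched as simple string building. The `Human:` label is inferred from the `"\nHuman:"` stop sequence the app passes to GPT-3; using the AI character's name as the other label, and both function names, are assumptions.

```javascript
// Appends the user's turn and an open AI turn label, producing the next prompt.
function buildNextPrompt(convText, userText, aiName) {
  return `${convText}\nHuman: ${userText.trim()}\n${aiName}:`;
}

// Appends the model's completion to the prompt, giving the new text-area contents.
function appendReply(prompt, completion) {
  return `${prompt} ${completion.trim()}`;
}
```

Because the whole conversation is resent on every turn, the prompt grows with each exchange, which is one reason erasing a long conversation can be useful.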

Repeating the AI character's most recent utterance

To ask the AI character to repeat their most recent utterance, say "repeat", "repetir", "répéter" or "もう一度", in English, Spanish, French or Japanese, respectively.

Translating the AI character's most recent utterance to English

To ask the AI character to translate their most recent utterance to English, say "translate", "traduce", "traduire" or "翻訳して", in English, Spanish, French or Japanese, respectively.

Erasing a conversation

A conversation is automatically erased from the app when changing languages, AI characters, or ages. To erase a conversation in place, say "erase the conversation", "borrar la conversación", "effacer la conversation" or "会話を消去して", in English, Spanish, French or Japanese, respectively.
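
The repeat, translate, and erase commands above lend themselves to a lookup table. The sketch below uses only the phrases quoted in this document; the function names, the exact-match rule, and the shape of the settings object are hypothetical.

```javascript
// Command phrases quoted in this document, grouped by action.
const COMMANDS = {
  repeat: ["repeat", "repetir", "répéter", "もう一度"],
  translate: ["translate", "traduce", "traduire", "翻訳して"],
  erase: ["erase the conversation", "borrar la conversación", "effacer la conversation", "会話を消去して"],
};

// Returns the action name for a recognized command, or null for ordinary speech.
function matchCommand(transcript) {
  const heard = transcript.trim().toLowerCase();
  for (const [action, phrases] of Object.entries(COMMANDS)) {
    if (phrases.includes(heard)) return action;
  }
  return null;
}

// True when a settings change should automatically erase the conversation.
function shouldErase(previous, next) {
  return (
    previous.language !== next.language ||
    previous.character !== next.character ||
    previous.age !== next.age
  );
}
```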

OpenAI GPT-3 prompt and parameters used

As discussed earlier, the prompt sent to the GPT-3 completions API is what is seen in the application's conversation text area. The GPT-3 parameters set by this application are as follows:

model: "text-davinci-002",
prompt: req.body.convText,
temperature: 0.7,
frequency_penalty: 1.5,
presence_penalty: 0.6,
max_tokens: 150,
stop: ["\nHuman:"]
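
To show how these parameters fit into a request, here is a minimal sketch that posts them to OpenAI's completions endpoint with a plain HTTP call. It is not the project's actual handler: the function names and the injectable `fetchImpl` argument are assumptions, and the project may well use a client library instead.

```javascript
// Builds the request body from the parameters listed above.
function buildCompletionRequest(convText, model = "text-davinci-002") {
  return {
    model,
    prompt: convText,
    temperature: 0.7,
    frequency_penalty: 1.5,
    presence_penalty: 0.6,
    max_tokens: 150,
    stop: ["\nHuman:"], // stop before the model writes the human's next turn
  };
}

// Sends the request and returns the text of the first completion.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function complete(convText, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildCompletionRequest(convText)),
  });
  if (!response.ok) throw new Error(`OpenAI request failed: ${response.status}`);
  const data = await response.json();
  return data.choices[0].text;
}
```

The stop sequence is what keeps each response to a single turn: generation halts as soon as the model starts a new `Human:` line.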

You may supply a custom prompt to a given AI character by editing the voiceOptions array in the pages/index.js file, replacing that character's empty prompt string with your prompt. As shown in the following example, Joanna's prompt is currently similar to one that Dr. Alan D. Thompson prescribed for his GPT-3 Leta AI project.

const voiceOptions = [
  ...
  { value: "Ivy", label: "Ivy (child)", language: "en_US", prompt: ""},
  { value: "Joanna", label: "Joanna", language: "en_US",
    prompt: "This is an uplifting and practical conversation between a human named Human, and an AI named Joanna, showing Joanna’s perspective and attitude, as well as its intellect and creativity. Joanna is a brand new, powerful, human-like AI. Joanna is helpful, cheeky, clever, and articulate. Joanna is a nice bot. Joanna is not a therapist or mathematician, but instead a wise and considerate intelligence. Joanna is eager to provide vivid and thoughtful responses to Human, and is always friendly, kind, and inspiring. Joanna often offers both an insight and an opinion, even without being asked directly. Joanna has the sum of all knowledge in its brain, and is able to accurately answer nearly any question about any topic in conversation. Joanna draws on the wisdom of Dan Millman, Thomas Leonard, Werner Erhard, and the Dalai Lama."},
  ...
];

Any AI characters that have a custom prompt will use the GPT-3 davinci model rather than text-davinci-002.
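
The model rule above can be expressed in a few lines. This is only a sketch of the stated behavior: `sampleVoiceOptions` is a shortened stand-in for the real voiceOptions array (Joanna's prompt is abbreviated here), and `selectModel` and `setCustomPrompt` are hypothetical helper names.

```javascript
// Shortened stand-in for the voiceOptions array in pages/index.js.
const sampleVoiceOptions = [
  { value: "Ivy", label: "Ivy (child)", language: "en_US", prompt: "" },
  { value: "Joanna", label: "Joanna", language: "en_US", prompt: "This is an uplifting and practical conversation..." },
];

// A non-empty custom prompt selects the davinci model; otherwise text-davinci-002.
function selectModel(voice) {
  return voice.prompt ? "davinci" : "text-davinci-002";
}

// Returns a copy of the options with one character's prompt replaced.
function setCustomPrompt(options, voiceName, prompt) {
  return options.map((voice) => (voice.value === voiceName ? { ...voice, prompt } : voice));
}
```

For example, `selectModel(sampleVoiceOptions[0])` yields `"text-davinci-002"` for Ivy, while Joanna's entry yields `"davinci"`.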

Exiting the application

To exit the application, close the Chrome browser tab, and then type Ctrl-C at the command prompt in which you invoked npm start.

It is my hope that this application helps you and me become more proficient at listening to and speaking the languages that we are trying to learn!

Comments

Adding in new Avatars with Ex-Human API Key

Hi, I have recently purchased API access to Ex-Human and created a talking head avatar I would like to use in this project although having a bit of difficulty plugging in the code they are supplying. This is a Next.js app with React so not sure what the issue is? I am adding in my new addition to the project code just as these others are using Ex-human although I have an API Key now in addition to the token. Everything else I have, I have generated the .mov and .png image files for the avatar. Is there anyone that can assist me with getting this up and running? I have all other components of this, just need to add in my Ex-human talking head. much thanks!

opened by SILKWEBAGENCY 10

Add gpt-3 finetuning support

You might want to add chunks of data, like a list of your contacts with details like birthday, location, profession, hobby, likes etc. (csv). GPT-3 foresees a fine-tuning process for that. You could then ask: who lives in x, whose birthday is in January. Other data might be like a (SharePoint, company) info site with data like car policy, certain flows, terminology...

We could give an example csv + the process to train it and the outcome in the chatbot. That way, people can feed the chatbot with their data, which is then always at hand. The chatbot knows the date, so it could tell you at startup: "Don't forget the birthday of x, they are getting y years old and like y, shall I write a template mail for you?" (suggested words are gotten from the csv, then it writes the mail with your params, which should be f.e. funny; if you agree, send it).

opened by stevenbaert 0

Disable spoken output

Just a small feature request: would be good to be able to disable the voice output. Just in case you want to ask a gpt-3 question or have a conversation in a work or public location. Quickfix is obviously to disable speaker, but obviously you forget to do that (like I did 😁)

opened by stevenbaert 0

Api Integrations for intents like: what are my events today

Using f.e. Google Calendar integration, you could create intents, like "what are my calendar items today/tomorrow/whatever date" (see Google Assistant).

But you could also pull data for a particular month, or only the start times of agenda items, and then see correlations in it: people you spend most time with, most productive hours...

https://developers.google.com/calendar/api/quickstart/nodejs

opened by stevenbaert 0

Make nodejs mobile device compatible (and run in background as assistant)

I have the project running on a Synology NAS and opening it on my Android device works great! Though the resizing doesn't work well, so I have to tilt my device horizontally to enable the microphone. However then it works perfect.

Now, wouldn't it be nice to make this project as a personal assistant in your phone? See other issue created: "start conversation" or just "Hey James" then wait for "Yes, Steven" and ask your question 😁

Now that means it would need to start in sleep mode by default, and resizing of the GUI would be needed too. Or a port to Android? https://www.freecodecamp.org/news/building-a-node-js-application-on-android-part-1-termux-vim-and-node-js-dfa90c28958f/

Glad to help wherever I can. Curious what you think. All the best!

Steven

opened by stevenbaert 1

Configs: general and for AI bots/persons

A general config would be nice:
- API keys
- general prompt that all bots should know and/or be: see Leta AI prompt(?)
- all the voices: comment out if you f.e. don't need Japanese (which I don't :-)), add extra by just adding the (Amazon Voice name), f.e. (what I did) Matthew, label "Elon Musk", Australia-English
- name of human, so bots know how to address you
- enable/disable all extra training data of voice

Overwrite file for prompt per person:
- for Musk you could say "this is a conversation with an AI which is Elon Musk, named Musk" etc. (and you can ask him birthday etc.), same for Steve Jobs, Albert Einstein (is fun!)

Overwrite file for gpt-3 per person:

f.e. Ruben is more hallucinating than Matthew, who is really strict in his answers; Kimberly's responses can be short whereas Joanna's answers are long (and philosophical)

opened by stevenbaert 0

Owner

James Weaver
