Voiced Conversation – OpenAI + ElevenLabs
This sequence generates a scripted two-character conversation: OpenAI writes the dialogue and ElevenLabs converts it to speech. The user supplies the topic and can edit the script before it is voiced. Below is a step-by-step explanation of the workflow.
1. Defining the Conversation Topic
- Step: The sequence begins by asking the user:
- “What is the conversation about?”
- User Input: The user provides the topic of the conversation, which defines the context for the generated dialogue.
- Purpose: This ensures the AI-generated script is aligned with the user’s goals.
2. Generating the Scripted Dialogue
- Process: The sequence uses OpenAI to generate a short dialogue between two characters based on the provided topic.
- Prompt Details: The AI is instructed to:
- Script a conversation between two characters based on the provided topic.
- Limit each line to a maximum of 10 words.
- Alternate lines between characters.
- Use — to indicate pauses for natural flow.
- Produce a conversation with exactly 6 lines.
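The prompt constraints above can be sketched in Python as a prompt builder plus a simple check on the returned script. This is a minimal illustration, not the sequence's actual prompt: the function names `build_prompt` and `check_script`, the exact prompt wording, and the rule that a "word" is any whitespace-separated token are all assumptions.

```python
def build_prompt(topic: str, num_lines: int = 6, max_words: int = 10) -> str:
    """Assemble the dialogue instructions into one prompt string (wording is assumed)."""
    return (
        f"Script a conversation between two characters about: {topic}\n"
        f"- Each line has at most {max_words} words.\n"
        "- Alternate lines between the two characters.\n"
        # \u2014 is the dash character the prompt uses as a pause marker.
        "- Use \u2014 to indicate pauses for natural flow.\n"
        f"- Produce exactly {num_lines} lines."
    )


def check_script(script: str, num_lines: int = 6, max_words: int = 10) -> list[str]:
    """Return a list of constraint violations; an empty list means the script conforms."""
    # Assumption: one dialogue line per text line; blank lines are ignored.
    lines = [ln.strip() for ln in script.splitlines() if ln.strip()]
    problems = []
    if len(lines) != num_lines:
        problems.append(f"expected {num_lines} lines, got {len(lines)}")
    for i, ln in enumerate(lines, start=1):
        # Assumption: words are whitespace-separated tokens.
        if len(ln.split()) > max_words:
            problems.append(f"line {i} has more than {max_words} words")
    return problems
```

A check like this could run before the review step, so the user sees whether the generated script already meets the length and line-count limits.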
3. Human Review and Editing
- Step: The user is presented with the generated script and asked:
- “Please edit as needed.”
- Functionality:
- Users can modify the script to refine tone, content, or phrasing.
- The original script is also provided for reference under Review Material.
4. Text-to-Speech Conversion
- Step: The finalized script is converted into audio using ElevenLabs text-to-speech technology.
- Voices Assigned:
- Two unique voices (predefined in the sequence) are assigned to the characters:
- Voice 1: For odd-numbered lines (e.g., lines 1, 3, 5).
- Voice 2: For even-numbered lines (e.g., lines 2, 4, 6).
- Process: The script is split into lines, and each line is matched with its character's voice. Each line is converted to audio individually.
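The split-and-match step described above might look like the following sketch. It assumes the script holds one dialogue line per text line; the voice identifiers `voice_1` and `voice_2` are hypothetical placeholders, since the real voices are predefined in the sequence.

```python
from itertools import cycle

# Hypothetical placeholder IDs; the real voices are predefined in the sequence.
VOICES = ("voice_1", "voice_2")


def assign_voices(script: str, voices=VOICES) -> list[tuple[str, str]]:
    """Split the script into non-empty lines and pair each with a voice in turn."""
    lines = [ln.strip() for ln in script.splitlines() if ln.strip()]
    # cycle() repeats voice_1, voice_2, voice_1, ... and zip stops at the last line,
    # so odd-numbered lines get the first voice and even-numbered lines the second.
    return list(zip(cycle(voices), lines))
```

Each resulting `(voice, text)` pair can then be handed to the text-to-speech step on its own.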
5. Iterative Dialogue Playback
- Step: The sequence loops through each line of the script, applying text-to-speech processing for all six lines.
- Logic:
- A mathematical node determines which voice to use based on whether the line number is odd or even.
- This ensures the correct alternation of voices.
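As a rough illustration of the odd/even logic, the loop can be sketched with a modulo test on the 1-based line number. The names `voice_number`, `synthesize_all`, and `fake_tts` are hypothetical, and `fake_tts` is only a stand-in for the real ElevenLabs call, which is not shown here.

```python
from typing import Callable


def voice_number(line_number: int) -> int:
    """Map a 1-based line number to voice 1 (odd) or voice 2 (even)."""
    return 1 if line_number % 2 == 1 else 2


def synthesize_all(lines: list[str], tts: Callable[[int, str], bytes]) -> list[bytes]:
    """Call the supplied TTS function once per line, alternating voices."""
    clips = []
    for line_number, text in enumerate(lines, start=1):
        clips.append(tts(voice_number(line_number), text))
    return clips


def fake_tts(voice: int, text: str) -> bytes:
    # Stand-in for the real text-to-speech request; returns a tagged byte string.
    return f"[voice {voice}] {text}".encode("utf-8")
```

Passing the text-to-speech function in as a parameter keeps the parity logic testable without any network access.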
6. Sequence Completion
- Step: After processing all six lines of dialogue, the sequence ends.
Version: v1