Structured data is essential for training robust and accurate multimodal AI models. Clear examples of how to organize data across modalities (text, audio, image, and video) show how to build comprehensive datasets that improve model performance. This section offers further examples of structured data, each a short scenario with one human turn and one AI turn, to illustrate best practices for data organization and annotation.
A customer service representative assists a customer with an issue regarding their internet connection.
{
  "conversation": [
    {
      "turn": 1,
      "model": "Human",
      "parameters": {
        "timestamp": "00:00:01",
        "text": {
          "content": "I'm having trouble with my internet connection."
        },
        "audio": {
          "file": "audio_segment_1.wav"
        },
        "video": {
          "file": "video_segment_1.mp4"
        },
        "sync": {
          "text_timestamp": "00:00:01",
          "audio_timestamp": "00:00:01",
          "video_timestamp": "00:00:01"
        }
      }
    },
    {
      "turn": 2,
      "model": "AI",
      "parameters": {
        "timestamp": "00:00:05",
        "text": {
          "content": "I'm sorry to hear that. Can you please describe the problem in more detail?"
        },
        "audio": {
          "file": "audio_segment_2.wav"
        },
        "animation": {
          "type": "show_concern",
          "duration": "2s"
        },
        "goal": "Gather detailed information",
        "behavior": "Show empathy and gather more details",
        "actions": [
          {
            "type": "speak",
            "content": "I'm sorry to hear that. Can you please describe the problem in more detail?"
          },
          {
            "type": "show_concern"
          }
        ],
        "sync": {
          "text_timestamp": "00:00:05",
          "audio_timestamp": "00:00:05",
          "animation_timestamp": "00:00:05"
        }
      }
    }
  ]
}
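In the example above, every entry in a turn's "sync" object carries the same value as that turn's "timestamp". The following is a minimal Python sketch of a check for that pattern. The function name sync_mismatches is hypothetical, and the rule that sync values must equal the turn timestamp is an assumption read off the examples, not a stated requirement of the format.

```python
import json


def sync_mismatches(record):
    """Return (turn, key, value) for each sync entry that differs
    from the timestamp of the turn it belongs to."""
    problems = []
    for turn in record["conversation"]:
        params = turn["parameters"]
        expected = params["timestamp"]
        # Assumption: aligned modalities share the turn's timestamp.
        for key, value in params.get("sync", {}).items():
            if value != expected:
                problems.append((turn["turn"], key, value))
    return problems


# A trimmed copy of the first turn of the customer service example.
sample = json.loads("""
{
  "conversation": [
    {
      "turn": 1,
      "model": "Human",
      "parameters": {
        "timestamp": "00:00:01",
        "sync": {
          "text_timestamp": "00:00:01",
          "audio_timestamp": "00:00:01",
          "video_timestamp": "00:00:01"
        }
      }
    }
  ]
}
""")

print(sync_mismatches(sample))  # → []
```

A check like this could run over a whole dataset before training to flag records whose modalities drift apart.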
An AI tutor helps a student understand a mathematical concept, providing explanations and checking for understanding.
{
  "conversation": [
    {
      "turn": 1,
      "model": "Human",
      "parameters": {
        "timestamp": "00:00:01",
        "text": {
          "content": "Can you explain how to solve quadratic equations?"
        },
        "audio": {
          "file": "audio_segment_1.wav"
        },
        "video": {
          "file": "video_segment_1.mp4"
        },
        "sync": {
          "text_timestamp": "00:00:01",
          "audio_timestamp": "00:00:01",
          "video_timestamp": "00:00:01"
        }
      }
    },
    {
      "turn": 2,
      "model": "AI",
      "parameters": {
        "timestamp": "00:00:05",
        "text": {
          "content": "Sure! A quadratic equation is of the form ax^2 + bx + c = 0. To solve it, you can use the quadratic formula: x = (-b ± √(b^2 - 4ac)) / (2a)."
        },
        "audio": {
          "file": "audio_segment_2.wav"
        },
        "animation": {
          "type": "write_equation",
          "content": "x = (-b ± √(b^2 - 4ac)) / (2a)",
          "duration": "5s"
        },
        "goal": "Explain the concept clearly",
        "behavior": "Provide a detailed explanation",
        "actions": [
          {
            "type": "speak",
            "content": "Sure! A quadratic equation is of the form ax^2 + bx + c = 0. To solve it, you can use the quadratic formula: x = (-b ± √(b^2 - 4ac)) / (2a)."
          },
          {
            "type": "write_equation",
            "content": "x = (-b ± √(b^2 - 4ac)) / (2a)"
          }
        ],
        "sync": {
          "text_timestamp": "00:00:05",
          "audio_timestamp": "00:00:05",
          "animation_timestamp": "00:00:05"
        }
      }
    }
  ]
}
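In the AI turn above, the "speak" action repeats the text content, and the animation type reappears as an action type. A short Python sketch of a consistency check for that redundancy follows. The helper name action_inconsistencies is hypothetical, and both rules are assumptions inferred from the examples rather than documented constraints.

```python
def action_inconsistencies(turn):
    """List places where a turn's actions disagree with its
    text and animation fields."""
    params = turn["parameters"]
    actions = params.get("actions", [])
    issues = []

    # Assumed rule 1: a "speak" action repeats text.content verbatim.
    spoken = [a.get("content") for a in actions if a.get("type") == "speak"]
    text = params.get("text", {}).get("content")
    if spoken and text not in spoken:
        issues.append("speak action does not repeat text.content")

    # Assumed rule 2: the animation type also appears as an action type.
    animation = params.get("animation")
    if animation is not None:
        action_types = {a.get("type") for a in actions}
        if animation.get("type") not in action_types:
            issues.append("animation.type has no matching action")

    return issues


# An abbreviated AI turn in the same shape as the tutoring example.
ai_turn = {
    "turn": 2,
    "model": "AI",
    "parameters": {
        "text": {"content": "Sure!"},
        "animation": {"type": "write_equation", "duration": "5s"},
        "actions": [
            {"type": "speak", "content": "Sure!"},
            {"type": "write_equation"},
        ],
    },
}

print(action_inconsistencies(ai_turn))  # → []
```

Human turns, which carry no "actions" or "animation" fields in these examples, pass the check with an empty result.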
An AI assistant provides health advice to a user based on their symptoms.
{
  "conversation": [
    {
      "turn": 1,
      "model": "Human",
      "parameters": {
        "timestamp": "00:00:01",
        "text": {
          "content": "I've been feeling very tired and have a sore throat."
        },
        "audio": {
          "file": "audio_segment_1.wav"
        },
        "video": {
          "file": "video_segment_1.mp4"
        },
        "sync": {
          "text_timestamp": "00:00:01",
          "audio_timestamp": "00:00:01",
          "video_timestamp": "00:00:01"
        }
      }
    },
    {
      "turn": 2,
      "model": "AI",
      "parameters": {
        "timestamp": "00:00:05",
        "text": {
          "content": "I'm sorry to hear that. Based on your symptoms, it could be a common cold or something more serious. I recommend resting, staying hydrated, and consulting a healthcare professional if your symptoms persist."
        },
        "audio": {
          "file": "audio_segment_2.wav"
        },
        "animation": {
          "type": "show_concern",
          "duration": "3s"
        },
        "goal": "Provide health advice",
        "behavior": "Show empathy and give practical advice",
        "actions": [
          {
            "type": "speak",
            "content": "I'm sorry to hear that. Based on your symptoms, it could be a common cold or something more serious. I recommend resting, staying hydrated, and consulting a healthcare professional if your symptoms persist."
          },
          {
            "type": "show_concern"
          }
        ],
        "sync": {
          "text_timestamp": "00:00:05",
          "audio_timestamp": "00:00:05",
          "animation_timestamp": "00:00:05"
        }
      }
    }
  ]
}
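The examples store times as "HH:MM:SS" strings and animation durations as strings such as "3s". To compare or order these values, a consumer would likely convert them to numbers first. The sketch below shows one way to do that in Python. The helper names are hypothetical, and it assumes durations are always given in seconds with an "s" suffix, which is the only form that appears in the examples.

```python
def to_seconds(timestamp):
    """Convert an "HH:MM:SS" string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def duration_seconds(duration):
    """Convert a duration such as "3s" to seconds.
    Assumption: the unit is always seconds, written with an "s" suffix."""
    if not duration.endswith("s"):
        raise ValueError(f"unsupported duration format: {duration!r}")
    return float(duration[:-1])


def animation_window(params):
    """Return (start, end) in seconds for a turn's animation."""
    start = to_seconds(params["sync"]["animation_timestamp"])
    end = start + duration_seconds(params["animation"]["duration"])
    return start, end


# Values taken from the AI turn of the health advice example.
params = {
    "animation": {"type": "show_concern", "duration": "3s"},
    "sync": {"animation_timestamp": "00:00:05"},
}

print(animation_window(params))  # → (5, 8.0)
```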