In today's fast-paced digital world, enterprises grapple with an overwhelming number of customer interactions conducted through chats. This communication avalanche comprises approximately a million daily conversations between users, bots, and agents. Given this vast volume, determining conversation outcomes manually is a labor-intensive task for human agents. The complexity of the content means that traditional rule-based checkers fail to automate the process efficiently, thereby necessitating manual oversight and escalating human effort. The laborious task of sifting through chat histories to classify outcomes not only demands considerable time and resources but also limits the contact center teams' ability to focus on more crucial tasks.
To tackle this challenge, we propose an AI-driven solution designed to efficiently classify chat conversations into three specific types: "dropoff", "incomplete", and "complete". This classification system aims to provide a clear understanding of customer interactions, enabling quicker and more accurate responses to customer service needs.
Dropoff
A conversation is labeled a dropoff when it ends prematurely, such as when a user stops responding to a bot or agent, possibly after an interactive message that is neither a service confirmation nor a welcome menu. Another instance is a user who waits too long for an agent and then does not engage when the dialog opportunity arises.
Incomplete
In this scenario, the conversation concludes without providing the user with the required information. This is identified when a bot asks whether the user's needs have been met and the user explicitly indicates a lack of satisfaction, for example by stating that their request remains incomplete.
Complete
A conversation is deemed complete when it ends satisfactorily, often recognized by non-interactive messages that close the interaction, service confirmations, or users rating the service. Even if a conversation does not fully meet the user's specific needs, it may still be categorized as complete when no further interaction follows.
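To make the taxonomy concrete, here is a minimal sketch (in Python, with hypothetical names such as `Outcome` and `ClassifiedConversation`) of how the three target labels and a classified record could be represented in a pipeline; the actual data model used in the project is not specified in this post.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(str, Enum):
    """The three target classes described above."""
    DROPOFF = "dropoff"
    INCOMPLETE = "incomplete"
    COMPLETE = "complete"


@dataclass
class ClassifiedConversation:
    conversation_id: str
    outcome: Outcome
    reasoning: str  # short explanation returned by the model


# Example of a labeled record as it might flow through the pipeline
example = ClassifiedConversation(
    conversation_id="conv-001",
    outcome=Outcome.COMPLETE,
    reasoning="Bot confirmed the service request and the user rated the interaction.",
)
```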
Round 0: Initial Testing
Actions: Applied data cleansing to remove noise and categorize messages, and created the initial prompt by spelling out the instructions, including a definition for each target class and the expected format of the response message.
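As an illustration of this round, the sketch below shows one way a simple cleansing step and an initial instruction-style prompt could look. The field names and the abbreviated class definitions are assumptions for illustration; the real prompt and message schema are not reproduced in this post.

```python
import re

# Abbreviated, hypothetical class definitions; the real prompt contains the full wording.
CLASS_DEFINITIONS = """
dropoff: the user stops responding before the conversation reaches a resolution.
incomplete: the conversation ends and the user explicitly indicates their need was not met.
complete: the conversation ends satisfactorily (service confirmation, rating, or no further interaction needed).
"""

SYSTEM_PROMPT = f"""You are a contact-center analyst.
Classify the chat transcript into exactly one of: dropoff, incomplete, complete.

Definitions:
{CLASS_DEFINITIONS}

Respond with the label followed by a one-sentence justification.
"""


def clean_transcript(raw_messages: list[dict]) -> str:
    """Simple noise removal: drop empty messages and normalise whitespace
    before the transcript is placed into the prompt."""
    lines = []
    for msg in raw_messages:
        text = re.sub(r"\s+", " ", msg.get("text", "")).strip()
        if not text:
            continue
        lines.append(f"{msg.get('role', 'user')}: {text}")
    return "\n".join(lines)
```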
Round 1: Enhancing Model Understanding
Actions: Refined the prompt instructions and included more detailed definitions for each conversation type. Few-shot examples were introduced to better guide the models.
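A minimal sketch of the few-shot idea follows, assuming a chat-style message list; the example transcripts and labels here are invented for illustration, whereas the real ones would be drawn from labelled conversations.

```python
# Hypothetical few-shot examples; real ones come from labelled transcripts.
FEW_SHOT_EXAMPLES = [
    {
        "transcript": "bot: How can I help?\nuser: I need my invoice.\nbot: Here it is. Anything else?\nuser: No, thanks!",
        "label": "complete",
    },
    {
        "transcript": "bot: How can I help?\nuser: I need my invoice.\nbot: Please hold for an agent.",
        "label": "dropoff",
    },
]


def build_messages(system_prompt: str, transcript: str) -> list[dict]:
    """Interleave few-shot (transcript, label) pairs before the real query
    so the model sees worked examples of each class."""
    messages = [{"role": "system", "content": system_prompt}]
    for ex in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": ex["transcript"]})
        messages.append({"role": "assistant", "content": ex["label"]})
    messages.append({"role": "user", "content": transcript})
    return messages
```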
Round 2: Training and Testing Strategy
Actions: The dataset was split for dedicated training and testing.
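A dedicated split might look like the sketch below; the split ratio and the tiny illustrative dataset are assumptions, as the post does not state the actual proportions.

```python
from sklearn.model_selection import train_test_split

# Tiny illustrative dataset; the real one holds labelled chat transcripts.
labelled_conversations = [
    ("bot: Hi\nuser: invoice please\nbot: sent, anything else?\nuser: no thanks", "complete"),
    ("bot: Hi\nuser: invoice please\nbot: hold for an agent", "dropoff"),
    ("bot: Hi\nuser: invoice please\nbot: done, did that help?\nuser: no, still missing", "incomplete"),
    ("bot: Hi\nuser: cancel my order\nbot: cancelled, rate us?\nuser: 5 stars", "complete"),
] * 10  # repeated only so the split has enough rows per class

train_set, test_set = train_test_split(
    labelled_conversations,
    test_size=0.3,  # assumed ratio; not stated in the post
    stratify=[label for _, label in labelled_conversations],  # keep class balance
    random_state=42,
)
# train_set supplies few-shot examples and prompt tuning; test_set stays held out for evaluation.
```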
Round 3: Expanding Data Testing
Actions: New data samples from August and October were added for testing to verify model robustness across different data sets.
Round 4: Format and Reasoning Precision
Actions: Output was restricted to a structured JSON format, and reasoning for classification decisions was made concise to improve clarity.
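One common way to enforce a structured output is shown below, assuming the OpenAI chat completions API with GPT-4o; the exact model call and JSON schema used in the project are assumptions for illustration.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def classify(messages: list[dict]) -> dict:
    """Request a strict JSON object so the output is machine-parseable,
    with a short 'reasoning' field instead of a free-form explanation.
    The prompt itself must also instruct the model to answer in JSON."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        response_format={"type": "json_object"},
        temperature=0,
    )
    result = json.loads(response.choices[0].message.content)
    # Expected shape (enforced via the prompt):
    # {"label": "dropoff" | "incomplete" | "complete", "reasoning": "<one sentence>"}
    return result
```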
Round 5: Dynamic Few Shots Injection
Actions: Advanced techniques such as Dynamic Few Shots Injection were implemented to clarify label definitions, and additional synthetic data generated with GPT-4o was used.
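The post does not detail how the dynamic few-shot examples are retrieved; one typical approach is embedding-similarity selection, sketched below. The embedding model name and the `select_dynamic_few_shots` helper are assumptions, not the project's confirmed implementation.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()


def embed(texts: list[str]) -> np.ndarray:
    """Embed transcripts; the model name here is an assumption."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])


def select_dynamic_few_shots(query: str, pool: list[dict], k: int = 4) -> list[dict]:
    """Pick the k labelled examples most similar to the incoming transcript,
    so the injected few-shots are tailored to each conversation."""
    vectors = embed([query] + [ex["transcript"] for ex in pool])
    query_vec, pool_vecs = vectors[0], vectors[1:]
    sims = pool_vecs @ query_vec / (
        np.linalg.norm(pool_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    top = np.argsort(sims)[::-1][:k]
    return [pool[i] for i in top]
```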
Round 6: Expanding Data Testing
Actions: Focused on expanding the testing datasets to include more recent interactions.
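When testing spans several data sets, a small per-class evaluation helper like the sketch below can compare results across them; the `monthly_test_sets` structure in the usage comment is hypothetical, as the post does not describe the evaluation code.

```python
from collections import defaultdict


def evaluate(predictions: list[str], labels: list[str]) -> dict:
    """Per-class precision/recall plus overall accuracy for one test set."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    correct = 0
    for pred, gold in zip(predictions, labels):
        if pred == gold:
            correct += 1
            counts[gold]["tp"] += 1
        else:
            counts[pred]["fp"] += 1
            counts[gold]["fn"] += 1
    report = {"accuracy": correct / len(labels)}
    for cls, c in counts.items():
        precision = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
        recall = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
        report[cls] = {"precision": round(precision, 3), "recall": round(recall, 3)}
    return report


# Hypothetical usage with one held-out set per month, e.g.
# monthly_test_sets = {"august": (preds, golds), "october": (preds, golds)}
# for month, (preds, golds) in monthly_test_sets.items():
#     print(month, evaluate(preds, golds))
```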
The automation reduces the need for extensive human intervention, leading to significant time savings and freeing up personnel to focus on more strategic activities.
The potential for errors and subjective judgment is drastically reduced, ensuring that insights derived from conversation data are consistent and reliable.
The efficiency gained also enables businesses to analyze larger datasets than previously possible, unlocking new opportunities for understanding customer needs and enhancing service quality.