
Data Analytics
How data science helps to better understand customer feedback.
Mar 31, 2025
How would you rate this app? Tell us something about this product! How satisfied were you with our rental car service? These and similar sentences will be familiar to most people. We encounter them constantly in everyday life and, especially in e-commerce, digital services, and online marketing, they are an integral part of the business model. Customer and user experiences are worth their weight in gold, particularly in B2C business.
The goal of customer surveys like these is usually either to identify weaknesses in the product or to generate so-called „electronic word of mouth“ (eWOM). These goals are not mutually exclusive, but the strategy for how, where, and when the customer is asked for feedback can vary.
Customer Survey to Generate Electronic Word of Mouth
For generating eWOM, it is essential that public forums are targeted and that the customer feedback obtained can be read by other potential customers. Typically, the aim is to maximize the three factors „volume“, „valence“, and „dispersion“.
Volume is the total amount of generated customer feedback.
Valence is the mood (also called „sentiment“) of the customer feedback.
Dispersion is the degree to which feedback is spread across multiple platforms.
Ideally, a company thus generates a large amount of positive feedback on multiple platforms. In this way, customer surveys can be used as a marketing tool and, in the best case, act as a vote of confidence for potential customers. Examples of this include product reviews, which reflect the authentic experiences of real customers and can significantly influence the purchasing decision.
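The three factors can be made concrete with a small calculation. The following is only a sketch under assumptions: the sample reviews, the sentiment scale from -1 to 1, and the choice of normalized Shannon entropy as a dispersion measure are illustrative and not taken from a specific eWOM standard.

```python
import math
from collections import Counter

# Hypothetical sample: one (platform, sentiment score in [-1, 1]) pair per review.
reviews = [
    ("app_store", 0.8),
    ("app_store", 0.4),
    ("forum", -0.2),
    ("marketplace", 0.6),
]

def ewom_metrics(reviews):
    """Return (volume, valence, dispersion) for a list of (platform, score) pairs."""
    volume = len(reviews)
    if volume == 0:
        return 0, 0.0, 0.0
    # Valence: mean sentiment score across all reviews.
    valence = sum(score for _, score in reviews) / volume
    # Dispersion (assumed measure): Shannon entropy of the platform shares,
    # normalized to [0, 1]. 0 means a single platform, 1 means an even spread
    # across the platforms that occur in the data.
    counts = Counter(platform for platform, _ in reviews)
    if len(counts) == 1:
        return volume, valence, 0.0
    entropy = -sum((c / volume) * math.log(c / volume) for c in counts.values())
    dispersion = entropy / math.log(len(counts))
    return volume, valence, dispersion

volume, valence, dispersion = ewom_metrics(reviews)
print(volume, round(valence, 2), round(dispersion, 2))
```

Other dispersion measures, such as the plain number of platforms with at least one review, would serve the same purpose.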

Customer Survey for Collecting Product Improvements
To identify product weaknesses, customer feedback does not necessarily have to be published. It can also be collected anonymously, because the insights from the feedback are used exclusively for internal further development. The type of feedback that is useful here differs from the marketing-oriented use case: while marketing prefers positive feedback, critical reviews are often considerably more valuable for product improvement.
This type of analysis of customer feedback is interesting for various companies:
Companies with an iterative product development process in which the product is constantly being developed further and insights from the previous generation are incorporated into the new generation.
Companies that offer a long-term service and want to uncover problems and optimization potential along their processes.
With certain methodologies, it is possible to systematically evaluate customer feedback and derive product improvements. In the following, we will use an example to illustrate the requirements, methods, and technologies with which such a system can be integrated.
Customer Sentiment Analysis
What exactly is customer feedback, and how can the content of feedback be classified? The research field of „Customer Sentiment Analysis“ deals with these questions. This term describes the subject area around the analysis of customer opinions about a brand, product, topic, or services and is a core part of data science. Customer Sentiment Analysis combines algorithms, psychology, and linguistics to systematically extract information from customer feedback.
The following terms are relevant to understanding how feedback, for example from an online review, can be systematically analyzed:
Opinion: An online review can contain several opinions. An opinion has two components: an opinion target and a sentiment.
Opinion target: The entity about which the opinion holder expresses an opinion. It is sometimes also referred to as the sentiment target or the topic.
Sentiment: Sentiment lies on a spectrum between positive and negative.
Opinion holder: The person who expresses an opinion. This does not necessarily have to be the same person as the author of the review.
Time of opinion: The point in time at which the opinion was expressed. It is needed to make changes in opinion traceable and, for example, to determine which product version was reviewed.
In summary, an author writes an online review at a defined point in time. This review contains one or more opinions, each of which belongs to an opinion holder and consists of an opinion target with an assigned sentiment. This theory can be applied using an example:
„(1) Overall, I am satisfied with the vehicle, but there are some points that my wife and I do not like. (2) I love the extremely comfortable interior. (3) What really gets on my nerves is the sensitive Lane-Keep Assist. (4) My wife thinks the loading edge in the trunk is too high.“
This review contains four opinions:
Opinion | Opinion target | Sentiment | Opinion holder |
1 | Overall vehicle | Rationally positive | Author |
2 | Interior | Emotionally positive | Author |
3 | Lane-Keep Assist | Emotionally negative | Author |
4 | Trunk (loading edge) | Rationally negative | Wife |
It makes a fundamental difference whether an opinion is expressed from a rational standpoint, or whether emotions influence the opinion. Together with the general sentiment, this results in a broad spectrum on which the stated opinion can be categorized. The goal is therefore to capture the opinion holder's sentiment as precisely as possible.
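The terms above map naturally onto a small data model. The sketch below is one possible representation, not a fixed schema: the class and field names are assumptions, the review date is a placeholder, and the split into polarity and basis mirrors the rational/emotional distinction from the table.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum

class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"

class Basis(Enum):
    RATIONAL = "rational"
    EMOTIONAL = "emotional"

@dataclass(frozen=True)
class Opinion:
    target: str         # opinion target
    polarity: Polarity  # direction of the sentiment
    basis: Basis        # rational or emotional standpoint
    holder: str         # opinion holder, not necessarily the review author

@dataclass
class Review:
    author: str
    written_on: date    # time of opinion
    opinions: list[Opinion] = field(default_factory=list)

# The example review from the text; the date is a placeholder.
review = Review(
    author="Author",
    written_on=date(2025, 1, 1),
    opinions=[
        Opinion("Overall vehicle", Polarity.POSITIVE, Basis.RATIONAL, "Author"),
        Opinion("Interior", Polarity.POSITIVE, Basis.EMOTIONAL, "Author"),
        Opinion("Lane-Keep Assist", Polarity.NEGATIVE, Basis.EMOTIONAL, "Author"),
        Opinion("Trunk", Polarity.NEGATIVE, Basis.RATIONAL, "Wife"),
    ],
)

# Negative opinions are the ones most relevant for product improvement.
negative_targets = [o.target for o in review.opinions if o.polarity is Polarity.NEGATIVE]
print(negative_targets)  # ['Lane-Keep Assist', 'Trunk']
```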
Analysis Methods of Customer Sentiment Analysis
The overarching goal of the analysis is to classify opinions, for example by opinion target, sentiment, or opinion holder. In general, two approaches can be distinguished:
Rule-based analysis: A typical example is the so-called Sentiment Lexicon. Here, rules based on human expertise are provided, for example which adjectives have a positive or negative connotation. However, these models typically have problems correctly interpreting context. The word „long“ is positive in the context of „long battery life“, for example, but negative in the context of „long charging time“.
Statistical models: These models estimate how likely words are to occur together or to indicate a particular class, and classify on the basis of these probabilities. Often these models have to be trained. In these cases, a dataset is labeled to create a ground truth, from which the model learns relationships between words and classes. When the model is fed new data, it applies the learned relationships to that data. Statistical models are therefore more labor-intensive to set up, but they usually cope better with context than fixed rules. There are also models that do not require labels for the task at hand. This is referred to as unsupervised or zero-shot learning. Large language models, or LLMs for short, are one example of this class.
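The context problem of the rule-based approach can be reproduced in a few lines. The lexicon and the bigram rule below are made up for illustration and far smaller than any real Sentiment Lexicon; the point is only that each context case has to be added by hand.

```python
import re

# Tiny hypothetical lexicon: word -> polarity score.
LEXICON = {"comfortable": 1, "love": 1, "annoying": -1, "slow": -1, "long": -1}

# Hand-written context rules: a word pair overrides the score of its first word.
BIGRAM_OVERRIDES = {("long", "battery"): 1}

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def lexicon_score(text):
    """Sum the polarity of every lexicon word, ignoring context."""
    return sum(LEXICON.get(word, 0) for word in tokenize(text))

def lexicon_score_with_context(text):
    """Same as lexicon_score, but bigram rules take precedence over single words."""
    words = tokenize(text)
    score = 0
    for i, word in enumerate(words):
        pair = (word, words[i + 1]) if i + 1 < len(words) else None
        if pair in BIGRAM_OVERRIDES:
            score += BIGRAM_OVERRIDES[pair]
        else:
            score += LEXICON.get(word, 0)
    return score

print(lexicon_score("long charging time"))              # -1, as intended
print(lexicon_score("long battery life"))               # -1, although this is praise
print(lexicon_score_with_context("long battery life"))  # 1, after adding a rule by hand
```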
Challenges in Customer Sentiment Analysis
Detecting sarcastic statements is not trivial for conventional methods. Read literally, a sarcastic statement often sounds positive at first, even though the opposite is meant. Rhetorical questions are similarly challenging, as they can be misunderstood as sincere questions. The general sentiment and emotional state of the opinion holder also influence the objectivity of an opinion. What is needed is a mechanism that can recognize these aspects and classify them accordingly. Handling multilingual data is not easy either, because translation can introduce distortions into the data.
Customer Sentiment Analysis with LLMs
One way to address these challenges is to use LLMs. Suppose we have an international product, such as a vehicle, about which we have collected customer feedback. We want to use the customer feedback for development purposes and route the customers' biggest pain points directly to the development teams. Here, an LLM can be used to clean the data, structure it, and finally classify it. In this case, the opinion target of a piece of feedback would be classified according to the responsible development team.
Customer Sentiment Analysis Process

The figure above shows an example of what such a process could look like. The knowledge of the LLM can be further enriched at each individual process step. For example, it can be specified into which classes the feedback should be classified and which area of responsibility each development team has. For highly technical and domain-specific feedback, a Sentiment Lexicon can be provided that categorizes the terminology. Subjectivity, sarcasm, colloquial language, and rhetorical questions can be interpreted as well, especially when the LLM is specifically instructed to do so.
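A classification step of this kind could be sketched as follows. Everything here is an assumption for illustration: the team names and responsibilities are hypothetical, the prompt wording is only one possibility, and `call_llm` stands for whatever function sends a prompt to the chosen model and returns its text reply. A stand-in is used so the example runs without a model.

```python
import json

# Hypothetical teams and areas of responsibility.
TEAMS = {
    "interior": "Seats, materials, cabin comfort",
    "driver_assistance": "Lane keeping, cruise control, parking aids",
    "body": "Trunk, doors, loading edge",
}

def build_prompt(feedback):
    """Assemble instructions, class definitions, and the feedback into one prompt."""
    team_lines = "\n".join(f"- {name}: {scope}" for name, scope in TEAMS.items())
    return (
        "Split the customer feedback into individual opinions.\n"
        "For each opinion, return the opinion target, the sentiment "
        "(positive or negative), and the responsible team.\n"
        "Interpret sarcasm and rhetorical questions by their intended meaning.\n"
        f"Teams:\n{team_lines}\n"
        'Answer with a JSON list of objects with the keys "target", "sentiment", "team".\n'
        f"Feedback: {feedback}"
    )

def classify(feedback, call_llm):
    """call_llm: any function that takes a prompt string and returns the reply text."""
    reply = call_llm(build_prompt(feedback))
    opinions = json.loads(reply)
    # Keep only entries that were assigned to one of the known teams.
    return [o for o in opinions if isinstance(o, dict) and o.get("team") in TEAMS]

# Stand-in for a real model call so the sketch runs offline.
def fake_llm(prompt):
    return json.dumps([
        {"target": "Lane-Keep Assist", "sentiment": "negative", "team": "driver_assistance"},
        {"target": "Price", "sentiment": "negative", "team": "sales"},
    ])

result = classify("What really gets on my nerves is the sensitive Lane-Keep Assist.", fake_llm)
print(result)
```

Validating the reply against the known classes, as done in `classify`, is a simple guard against answers that fall outside the specified teams.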
LLMs are non-deterministic systems. This means that the output is always subject to a certain random factor. This can lead to problems in some use cases. In the field of customer sentiment analysis, however, the enormous advantages in the interpretation of natural language outweigh these concerns. In addition, it can be argued that the interpretation of an opinion is not deterministic anyway and allows for a certain degree of room for interpretation. Finally, such an LLM procedure can also be supplemented with other statistical methods.
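One simple statistical supplement is to classify the same feedback several times and take the majority label, with the share of agreeing runs as a rough stability indicator. This is a sketch of that idea under assumptions, not a description of a specific production setup; `flaky_classifier` is a stand-in that replays fixed answers in place of a real model.

```python
from collections import Counter

def majority_label(classify_once, feedback, runs=5):
    """Run a non-deterministic classifier several times; return (label, agreement)."""
    votes = Counter(classify_once(feedback) for _ in range(runs))
    label, count = votes.most_common(1)[0]
    return label, count / runs

# Stand-in for an LLM call: replays a fixed sequence of answers.
answers = iter(["negative", "negative", "positive", "negative", "negative"])

def flaky_classifier(feedback):
    return next(answers)

label, agreement = majority_label(flaky_classifier, "Great, the assistant beeps all the time.")
print(label, agreement)  # negative 0.8
```

Feedback with low agreement can then be flagged for manual review instead of being routed automatically.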
Customer Sentiment Analysis with CarByte
Many customers are concerned with topics such as data security and the provision of LLMs. CarByte has already gathered some experience with the use of LLMs and actively works with customers to find solutions for the given case. In addition, we offer technical expertise in the field of customer sentiment analysis and are also happy to provide advice on best practices and the collection of customer feedback.