Sentiment Analysis with Azure Text Analytics: Reading between the lines

4 min read · Jun 25, 2020

Today’s enterprises have access to a large volume of raw customer feedback from multiple sources (e.g. social media, survey responses, support tickets), which can give great insight into customers’ thoughts and opinions. A challenge with raw data, such as answers to open-ended questions, is that it must be structured before it can be analysed.

Azure Cognitive Services can help organisations extract insights from unstructured data. Azure Text Analytics uses pre-trained models to perform sentiment analysis, key phrase extraction, language detection, and named entity recognition. In this test, I use Azure Text Analytics to identify key phrases and perform sentiment analysis on more than 2,000 Amazon reviews, drawn from the Amazon Customer Review Dataset for Watches. A full list of the available datasets can be found here.

Create Text Analytics endpoint

The Azure Cognitive Services interface is intuitive, and you can create the Text Analytics endpoint in a few clicks. Navigate to Cognitive Services and add a new Text Analytics service. Select the Pricing Tier that matches the expected number of API transactions to avoid overage charges. On the Free Tier, usage is throttled and there are no overage charges.

Prepare the API request

Once the Text Analytics endpoint has been created, you can find the API key under the “Keys and Endpoint” section. We will need this key to access the API. In this test I am interested in the Sentiment and Key Phrases APIs, and the two request URLs used are:

https://{endpoint}/text/analytics/v3.0-preview.1/sentiment

and

https://{endpoint}/text/analytics/v3.0-preview.1/keyPhrases

For this test I used PowerShell to draw a random subset from the 960,000+ reviews and call the API. You can use Invoke-RestMethod, as in the example below.

Invoke-RestMethod -Method Post -Uri $requestUri -ContentType 'application/json' -Headers @{"Ocp-Apim-Subscription-Key" = $apiKey} -Body ($requestBody | ConvertTo-Json)
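The random subset mentioned above can be drawn in a single pass over the file, without loading all 960,000+ reviews into memory. The following is a minimal Python sketch using reservoir sampling. It is not the PowerShell I used for this test, and the file path and the column names (review_id, star_rating, review_body) are assumptions about the dataset's TSV layout.

```python
import csv
import random


def reservoir_sample(rows, k, seed=None):
    """Return up to k rows chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    sample = []
    for index, row in enumerate(rows):
        if index < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = row
    return sample


def sample_reviews(path, k, seed=None):
    """Sample k reviews from a tab-separated file (column names are assumed)."""
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = (
            (row["review_id"], row["star_rating"], row["review_body"])
            for row in reader
        )
        return reservoir_sample(rows, k, seed)
```

Passing a fixed seed makes the subset reproducible, which helps when comparing runs.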

The $requestBody variable was built as below and then converted to JSON.

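# One document per request: the review id and text come from the sampled review
$requestBody = @{
    documents = @(
        @{
            language = "en"           # all reviews in this test are in English
            id       = "$reviewId"    # the API expects the id as a string
            text     = "$reviewBody"
        }
    )
}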

The Microsoft documentation here provides more information about the Text Analytics API, along with code samples in other languages.
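The request body above holds a single review, but the documents array can carry several reviews per call. The sketch below, in Python, shows one way to batch reviews and prepare the request without sending it. The batch size of 10 and the 5,120-character cap are assumptions about the service limits and should be checked against the documentation; build_request_bodies and build_request are hypothetical helper names.

```python
import json
import urllib.request


def build_request_bodies(reviews, batch_size=10, max_chars=5120, language="en"):
    """Yield JSON strings, each holding up to batch_size documents.

    reviews is an iterable of (review_id, review_body) pairs.
    batch_size and max_chars are assumed limits, not confirmed values.
    """
    batch = []
    for review_id, review_body in reviews:
        batch.append({
            "language": language,
            "id": str(review_id),
            "text": review_body[:max_chars],  # truncate overly long reviews
        })
        if len(batch) == batch_size:
            yield json.dumps({"documents": batch})
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield json.dumps({"documents": batch})


def build_request(endpoint, operation, api_key, body):
    """Prepare (but do not send) a POST request for 'sentiment' or 'keyPhrases'."""
    url = f"https://{endpoint}/text/analytics/v3.0-preview.1/{operation}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,
        },
        method="POST",
    )
```

Sending each prepared request with urllib.request.urlopen would then play the role of Invoke-RestMethod in the PowerShell version.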

Sentiment Analysis Results

The Text Analytics Sentiment Analysis API returns a label and a confidence score at both sentence and document level. In this test, I was interested in the document-level label. The four possible labels are positive, negative, mixed and neutral, which are determined as below:

Positive: At least one positive sentence in the document and the rest of the sentences are neutral
Negative: At least one negative sentence in the document and the rest of the sentences are neutral
Mixed: At least one negative sentence and at least one positive sentence in the document
Neutral: All sentences in the document are neutral
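
The four rules above can be written as a small function. This Python sketch only restates the rules as listed, taking sentence-level labels as input. It is not the service's internal implementation, and document_label is a hypothetical name.

```python
def document_label(sentence_labels):
    """Derive a document label from sentence labels using the rules listed above."""
    labels = set(sentence_labels)
    has_positive = "positive" in labels
    has_negative = "negative" in labels
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    # Only neutral sentences (or no sentences at all) remain.
    return "neutral"
```

For example, a review with one positive sentence, one neutral sentence and one negative sentence is labelled mixed, however strongly worded either side is.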

The returned label was compared against the number of stars in the review. For this test I included results at every confidence score across all labels; excluding low-confidence results could improve accuracy. Without excluding any returned result, 62.31% of the 1-star reviews were labelled as Negative, while 74.9% of the 5-star reviews were labelled as Positive. 4.52% of 1-star reviews were labelled as Positive and 1.58% of 5-star reviews were labelled as Negative.
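
A comparison like this is a cross-tabulation of star rating against returned label. The Python sketch below shows one way to compute the percentages, with an optional confidence threshold for dropping low-confidence results. The (stars, label, confidence) tuple layout and the function name are assumptions made for illustration, not the shape of the API response.

```python
from collections import Counter, defaultdict


def label_share_by_stars(results, min_confidence=0.0):
    """Return {stars: {label: percentage}} for results at or above min_confidence.

    results is an iterable of (stars, label, confidence) tuples, where
    confidence is the score the service gave to the returned label.
    """
    counts = defaultdict(Counter)
    for stars, label, confidence in results:
        if confidence >= min_confidence:
            counts[stars][label] += 1

    shares = {}
    for stars, label_counts in counts.items():
        total = sum(label_counts.values())
        shares[stars] = {
            label: round(100 * n / total, 2)
            for label, n in label_counts.items()
        }
    return shares
```

Raising min_confidence shrinks the sample, so the percentages are then relative to the reviews that remain and not to the full subset.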

Key Phrases Results

With the Text Analytics Key Phrases API you can extract the main points in a collection of documents. In this example, I combine the key phrases and the sentiment to gain insight into the main points of the reviews with positive and negative feedback.

62.83% of the reviews with the key phrase “price” were positive, while 3.72% were negative. Categorised by the number of stars, 68.03% of them were 5-star reviews and 4.83% were 1-star reviews. Of the reviews with the key phrase “battery”, 17.89% were labelled as negative, 24.39% as positive and 56.91% as mixed. I checked an additional key phrase, “quality”: 46.19% of those reviews were labelled as positive and 5.58% as negative.
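
Percentages like these come from joining the two API results on the document id and then counting labels among the reviews that contain a given key phrase. The following Python sketch assumes each review has already been merged into a dictionary with "sentiment" and "key_phrases" fields (a hypothetical layout) and that a phrase must match exactly, ignoring case.

```python
from collections import Counter


def sentiment_share_for_phrase(documents, phrase):
    """Return {label: percentage} among documents that contain the key phrase."""
    wanted = phrase.lower()
    counts = Counter(
        doc["sentiment"]
        for doc in documents
        if any(wanted == key_phrase.lower() for key_phrase in doc["key_phrases"])
    )
    total = sum(counts.values())
    if total == 0:
        # No document mentions the phrase, so there is nothing to report.
        return {}
    return {label: round(100 * n / total, 2) for label, n in counts.items()}
```

Matching on substrings instead (so that “battery life” counts towards “battery”) is a design choice that would change the totals.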

Combining key phrases with sentiment analysis or review stars gives great insight into the main points of positive and negative reviews, which can help identify what customers like or dislike about a product.

Final thoughts

Sentiment analysis still has challenges, such as an inability to understand sarcasm or irony and low performance in specific domains. However, organisations can benefit from cloud-based natural language processing services to track their customers’ feedback and understand how their brand and products are perceived. With so many feedback channels available nowadays (e.g. emails, reviews, website analytics), enterprises have a great opportunity to understand their customers like never before.