Investigating Microsoft’s Transformation under Satya Nadella

This project was completed as part of the Metis data science bootcamp in February 2021. This blog post walks through how I used Natural Language Processing (NLP) tools in Python to derive observations about the changes in Microsoft’s philosophy and strategy under CEO Satya Nadella, and how he was able to lead a successful turnaround at the company.


Performing due diligence on the CEO is a key component of investment research

I’m a firm believer that qualitative analysis of a company is as important as, and in some cases more important than, quantitative analysis when it comes to investing. One of the key components of that analysis is understanding the CEO: their philosophy, their strategy, and the culture they bring to the company.

That work becomes even more important when there is a change of CEO. Investors get a view into a CEO’s mind when the company hosts its quarterly investor calls to discuss financial results, so that is where I was naturally drawn to see whether NLP could be applied.

I chose to look at Microsoft, because it is a perfect example of where a change in CEO catalyzed a transformation in the company.

A clear inflection in Microsoft’s earnings and stock price after Satya Nadella becomes CEO

Satya Nadella took over the reins from Steve Ballmer in February 2014, and the chart above shows that the changes he made over the next seven years translated into financial success as well. My goal was to analyze Microsoft’s earnings transcripts from the pre- and post-Nadella periods to extract insights about how the company’s philosophy and strategy evolved over time.

Collecting and preprocessing data

I retrieved Microsoft’s earnings transcripts as PDFs from the Capital IQ API.

  • 3Q’07–2Q’14: 28 quarters of transcripts in the Steve Ballmer era
  • 3Q’14–2Q’21: 28 quarters of transcripts in the Satya Nadella era

I then went through a pipeline of text preprocessing steps using NLTK and SpaCy:

  • Removed punctuation and numbers
  • Removed stopwords
  • Lemmatized words
  • Corrected spelling errors
  • Removed people’s names
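
The steps above can be sketched in a few lines. This is only a minimal, standard-library illustration and not the pipeline used in the project: the small STOPWORDS, LEMMAS, SPELLING, and NAMES tables are hypothetical stand-ins for what NLTK and SpaCy would provide, and the function name preprocess is my own.

```python
import re

# Hypothetical stand-ins for the NLTK / SpaCy resources used in the project
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "by", "we", "our", "is", "are", "with"}
LEMMAS = {"products": "product", "results": "result", "grew": "grow", "customers": "customer"}
SPELLING = {"recieve": "receive", "custmer": "customer"}
NAMES = {"satya", "nadella", "steve", "ballmer"}


def preprocess(text):
    """Turn one raw transcript string into a list of cleaned tokens."""
    # Lowercase and keep alphabetic runs only, which drops punctuation and numbers
    tokens = re.findall(r"[a-z]+", text.lower())
    cleaned = []
    for tok in tokens:
        # Skip stopwords, people's names, and single-letter fragments
        if len(tok) < 2 or tok in STOPWORDS or tok in NAMES:
            continue
        tok = SPELLING.get(tok, tok)  # spelling correction via lookup
        tok = LEMMAS.get(tok, tok)    # lemmatization via lookup
        cleaned.append(tok)
    return cleaned
```

Each transcript comes out as a list of tokens, which is the shape a vectorizer configured with an identity tokenizer would expect.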

Topic modeling — history of struggles, transition, and emergence

I performed topic modeling to get five topics, each composed of six topic words. Interestingly, applying non-negative matrix factorization (NMF) to the TF-IDF-vectorized corpus gave a neat 14-year history of Microsoft, with the topics coming one after another chronologically to form five distinct periods.

## Imports
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

## Create a document-term matrix with TfidfVectorizer
## (the identity tokenizer implies corpus_msft holds already-tokenized transcripts)
cv_tfidf_msft = TfidfVectorizer(tokenizer=lambda doc: doc, lowercase=False, min_df=2, max_df=0.5)
X_tfidf_msft = cv_tfidf_msft.fit_transform(corpus_msft).toarray()
## Matrix factorization into document-topic and topic-word matrices
nmf_msft = NMF(n_components=5)
doc_topic_nmf_msft = nmf_msft.fit_transform(X_tfidf_msft)
topic_word_nmf_msft = nmf_msft.components_
## Get top six words for each topic, ranked by topic coefficient
## (get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out())
words_tfidf_msft = cv_tfidf_msft.get_feature_names_out()
t_nmf_msft = topic_word_nmf_msft.argsort(axis=1)[:,-1:-7:-1]
top_topic_words_nmf_msft = [[words_tfidf_msft[i] for i in topic] for topic in t_nmf_msft]

Topics neatly form periods with distinct themes in Microsoft’s corporate history
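
One way to check that the topics line up chronologically is to find the highest-weight topic for each transcript in the document-topic matrix. The helper below is a hypothetical sketch, not code from the project: the name dominant_topics and the quarter labels are assumptions, and it works on any row-per-document matrix such as doc_topic_nmf_msft.

```python
def dominant_topics(doc_topic, labels):
    """Pair each document label with the index of its highest-weight topic."""
    result = []
    for label, weights in zip(labels, doc_topic):
        weights = list(weights)
        # index() returns the first position of the maximum weight
        result.append((label, weights.index(max(weights))))
    return result


# Toy document-topic matrix: three quarters, two topics (made-up numbers)
toy_doc_topic = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
toy_labels = ["Q1", "Q2", "Q3"]
periods = dominant_topics(toy_doc_topic, toy_labels)
```

If the dominant topic index changes only a handful of times when the transcripts are sorted by date, the topics form contiguous periods.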

The first and third periods come right before and after the 2008–09 recession, and I call them the dark days because the topic words mostly describe Microsoft’s missteps and struggles. For example, Vista and Kinect were failed product launches, and Nokia, aQuantive, and Skype were all disastrous acquisitions. The word legal relates to the company’s antitrust charges. Microsoft also made several unsuccessful attempts to acquire Yahoo to better compete with Google in search, an effort that also ended in disaster. It is interesting to see the word happy here. It turns out management overused it in phrases like ‘we are happy with our product’ or ‘we are happy with our results’ during the calls to deflect questions about their missteps.

The second topic has words like economy, condition, and weak that characterize a global recession, and Microsoft was not immune to the downturn. Then comes the transition period after Satya Nadella took over the reins, as implied by the words restructuring and transformation. Words like mobility, SaaS, and PaaS reflect his idea of aggressively shifting software to a subscription-based model and making the company’s cloud product, Azure, the centerpiece of his long-term growth strategy.

Lastly, the most recent three years or so are characterized by Microsoft’s attempt to compete at the frontiers of technology using its strengths in productivity software, cloud, and developer platforms, rather than playing catch-up in areas like search, social media, and smartphones.

Keyword frequency — how has Microsoft’s strategy evolved over time?

As a next step, I wanted to take a closer look and see if certain keywords gained or lost importance over time as Microsoft’s strategy and philosophy changed.

## Imports
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.feature_extraction.text import CountVectorizer

## Create a document-term matrix with CountVectorizer
cv_msft = CountVectorizer(tokenizer=lambda doc: doc, lowercase=False, min_df=2)
X_msft = cv_msft.fit_transform(corpus_msft).toarray()
## (get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out())
df_msft = pd.DataFrame(X_msft, columns=cv_msft.get_feature_names_out())
## Get word frequencies for the keywords (rows are keywords, columns are transcripts)
df_msft_keywords = df_msft.T.loc[['cloud', 'ai', 'iot', 'saas', 'license', 'piracy', 'hardware', 'shipment']]
## Plot a heatmap of keyword frequencies over time
fig, ax = plt.subplots(1, 1, figsize = (15, 6))
sns.heatmap(df_msft_keywords, cmap='Blues', annot=False, ax=ax)
ax.set_ylabel('Key Words', fontsize=15)
ax.set_xlabel('Fiscal Period', fontsize=15)

A heatmap of how frequently each keyword appears in earnings transcripts over time

I used Seaborn to create a heatmap showing how the frequency of each keyword in the earnings transcripts changes over time.

As you would expect, words like cloud, AI, and SaaS appear more frequently over time under Satya Nadella. On the other hand, words like license, piracy, and shipment decreased in importance as the company moved away from its licensed software model and hardware strategy.
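
A quick numeric check of a trend like this is to compare a keyword’s average count across the two eras. The sketch below is hypothetical and not part of the project code: era_means is my own helper name, and the default split of 28 assumes the transcripts are in chronological order with 28 quarters per CEO, as described in the data section.

```python
def era_means(counts_by_quarter, split=28):
    """Return (mean count before the split, mean count from the split onward)."""
    first, second = counts_by_quarter[:split], counts_by_quarter[split:]
    return sum(first) / len(first), sum(second) / len(second)


# Toy example: six quarters with the CEO change after the third (made-up numbers)
toy_cloud_counts = [0, 2, 4, 10, 12, 14]
before, after = era_means(toy_cloud_counts, split=3)
```

With the real data, one row of the keyword table (for example the counts for cloud) could be passed in as a list.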

A simple visualization of keyword frequencies helps illustrate the two CEOs’ diverging strategies

Two other keywords that showed contrasting trends were commercial and consumer. This shows that another part of Satya’s strategy was shifting focus from consumer to commercial customers: he saw the company’s know-how in serving large enterprise customers as its forte, and those customers also provided better economics. Another trend is a growing emphasis on margin and a declining emphasis on cost over time, as the company pulled away from hardware and pushed cloud products.

Lastly, I think the last four keywords in the above chart demonstrate the differences between Steve Ballmer’s and Satya Nadella’s philosophies that allowed the latter to lead a successful turnaround. The words guidance and forecast were more prevalent in the early years, which suggests the company was overly focused on meeting its own short-term financial targets. On the other hand, the words secular and differentiate, used often by Satya, reflect his focus on creating differentiated products in areas like AI and IoT that have long-term secular growth. Essentially, the Microsoft of the old era chose to forgo long-term investments to lower expenses and meet investors’ short-term expectations, while Satya came in and committed the company’s resources to building expensive yet long-lasting platforms underpinned by structural competitive advantages.

Sentiment analysis — Has the tone gotten more positive or negative over time?

I moved on to sentiment analysis to see if I could gain additional insights into the tone of the earnings calls and link it back to changes in philosophy and strategy. I used the Loughran-McDonald Financial Sentiment Dictionary to count the words in the transcripts that matched dictionary entries in the positive, negative, and uncertainty categories.
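
The dictionary-matching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the project’s code: the word sets in SENTIMENT_WORDS are tiny made-up samples rather than the actual contents of the Loughran-McDonald dictionary, and sentiment_counts is a hypothetical helper name.

```python
from collections import Counter

# Tiny illustrative samples; NOT the real Loughran-McDonald word lists
SENTIMENT_WORDS = {
    "positive": {"happy", "improve", "opportunity", "innovation"},
    "negative": {"difficult", "weak", "decline"},
    "uncertainty": {"uncertain", "risk"},
}


def sentiment_counts(tokens):
    """Count how many tokens fall into each sentiment category."""
    counts = Counter()
    for tok in tokens:
        for category, words in SENTIMENT_WORDS.items():
            if tok in words:
                counts[category] += 1
    # Always report every category, including those with zero matches
    return {category: counts[category] for category in SENTIMENT_WORDS}


toy_tokens = ["happy", "result", "weak", "economy", "risk", "opportunity"]
toy_sentiment = sentiment_counts(toy_tokens)
```

Running this per transcript gives one row of counts per quarter, which can then be plotted over time.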

The sentiment of the calls turned quite negative during the recession, and the analysis also detected a lot of uncertainty during that period. The sentiment was highly positive immediately before and after the financial crisis. The rest of the periods don’t appear to show any particular trend…

Change in sentiment in the Q&A of Microsoft’s earnings calls over time

…Until you look at the words picked up by the sentiment dictionary in the two sets of transcripts (see the table below). While both periods show a similar count of positive words, the words themselves are very different. The words happy and improve under Steve Ballmer are used to put a positive spin on the company’s many struggles during this period, while the words opportunity and innovation under Satya are more forward-looking.

Similarly, among the negative words, difficult, weak, and decline all describe the dire situations that Microsoft was actually in, whereas the negative words under Satya are in fact neutral or even positive in the context of describing his strategy (e.g., meeting customer challenges, introducing disruptive products).
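
Listing the most frequent matched words for each era is a small extension of the counting step. The sketch below is hypothetical: top_matched_words is my own helper name, and the tokens and word set are made-up examples rather than real transcript data or dictionary contents.

```python
from collections import Counter


def top_matched_words(era_tokens, category_words, n=3):
    """Return the n most frequent tokens that appear in a sentiment word set."""
    counts = Counter(tok for tok in era_tokens if tok in category_words)
    return counts.most_common(n)


# Toy tokens standing in for all transcripts from one era
toy_era_tokens = ["happy", "happy", "improve", "cloud", "happy", "improve", "great"]
toy_positive = {"happy", "improve", "great"}
top_two = top_matched_words(toy_era_tokens, toy_positive, n=2)
```

Calling it once per era and per category yields a table like the one described here.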

Positive and negative words used under the Ballmer and Nadella eras have very different connotations


By using NLP tools to delve into topic models, word frequencies, and sentiment over time, we were able to see a positive change in Microsoft’s philosophy and strategy under Satya Nadella from different angles. His mindset of joining the secular wave in technology early and building platforms for long-term success allowed the company to reemerge from the brink of irrelevance. We also observed drastic changes in the company’s strategy, such as shifting the focus to cloud and commercial customers. In future work, I could perform a similar NLP analysis on a case where a change in CEO brought about a failed transformation, and then compare the findings with those for Microsoft.

For more information on this project, including code and presentation slides, please check out my GitHub repository here.

If you would like to share ideas about any of my projects, please do not hesitate to contact me on LinkedIn.


A Word2Vec embeddings plot: words that share similar contexts are clustered together



Mike Choi

Data scientist looking to create value with a growth mindset. As Warren Buffett once said, “growth is always a component of the calculation of value.”