ChatGPT has gained a lot of traction over the past few months, and for many it sounds like a revolution.
Microsoft quickly announced a deep and exclusive partnership with OpenAI to integrate this technology into its search engine, Bing, and deliver conversational, enhanced results.
Something like ChatGPT could change Microsoft’s web destiny!
The missed internet revolution
Microsoft is one of the oldest IT companies, born years before the public internet. It shaped many trends and habits in the IT world (the spread of graphical UIs with Windows 3.11, the mouse,…). It also faced the dark sides of IT: the first viruses, zero-day bugs and backward-compatibility constraints.
So Microsoft was very skilled and ready to address any IT challenge.
However, when the internet became public, Microsoft missed something: an overly custom IE browser, poorly implemented web standards, an inefficient search engine (remember how slowly the search page loaded over a 56k modem…),…
The space left by Microsoft allowed other players to emerge, such as Google, Facebook and Amazon.
Even today, Microsoft struggles to find its place in the picture, split between legacy tools (the Office365 desktop version, for instance) and the web user experience (the Office365 online version). Anyone who has used both Office365 versions knows how painful feature compatibility, collaboration mode and the rest can be…
But now, with conversational agents, something could finally happen and fix that historical bug!
The conversational agent hype
In recent years, conversational agents have become very popular, with Google Assistant, Siri and Alexa in the lead. Many of you are familiar with the concept, which we can split into 4 stages:
- Intent detection: understand the intention and the context of the user request
- Parameters extraction: get the meaningful parts of the intent to extract actionable values
- Execution: perform a query/request, based on the intent and the extracted parameters.
- Feedback: provide the result of the execution
We can take an example:
Ok Google, what will the weather be tomorrow in Paris?
- Intent detection: Know the weather
- Parameters extraction: Location = Paris, Date = Tomorrow
- Execution: Query the weather API with “Paris” and “Tomorrow” as parameters
- Feedback: “Tomorrow, the weather in Paris will be sunny with a high of 20° and a low of 10°”
You can then continue the conversation in that context, asking about the weather this weekend, next week,…
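To make the 4 stages concrete, here is a minimal sketch of that weather example in Python. It is only an illustration under my own assumptions: the keyword matching, the regular expressions and the `FAKE_WEATHER` table (a stand-in for a real weather API) are all hypothetical, and a real assistant relies on trained models rather than rules like these.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Request:
    intent: str
    location: Optional[str]
    date: Optional[str]


# Hypothetical stand-in for a real weather API: (location, date) -> (sky, max, min)
FAKE_WEATHER = {("Paris", "tomorrow"): ("sunny", 20, 10)}


def detect_intent(text: str) -> str:
    """Stage 1: naive keyword-based intent detection."""
    return "get_weather" if "weather" in text.lower() else "unknown"


def extract_parameters(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Stage 2: pull a location ("in Paris") and a date out of the sentence."""
    location = re.search(r"\bin ([A-Z][a-z]+)", text)
    date = re.search(r"\b(today|tomorrow)\b", text, re.IGNORECASE)
    return (
        location.group(1) if location else None,
        date.group(1).lower() if date else None,
    )


def execute(request: Request):
    """Stage 3: run the query (here, a lookup in the fake weather table)."""
    if request.intent != "get_weather":
        return None
    return FAKE_WEATHER.get((request.location, request.date))


def feedback(request: Request, result) -> str:
    """Stage 4: turn the raw result into a sentence for the user."""
    if result is None:
        return "Sorry, I cannot answer that."
    sky, high, low = result
    return (f"{request.date.capitalize()}, the weather in {request.location} "
            f"will be {sky} with a high of {high}° and a low of {low}°.")


def answer(text: str) -> str:
    location, date = extract_parameters(text)
    request = Request(detect_intent(text), location, date)
    return feedback(request, execute(request))


print(answer("Ok Google, what will be the weather tomorrow in Paris?"))
```

Each function maps to one stage, so you can see where the external call (stage 3) sits between understanding the request and phrasing the reply.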
That question/response flow also exists in ChatGPT, but the Large Language Model (LLM) behind the scenes is much more impressive, which largely explains the hype around those tools today!
But… ChatGPT is an anomaly, not a conversational agent
If you take the 4 stages of a conversational agent, ChatGPT only performs the 1st and the 4th step. It extracts nothing and it is unable to perform external requests.
That behavior comes from its underlying model: GPT-3. Keep in mind that GPT-3 is a text generator. A wonderful, super powerful text generator that does its job extremely well.
But it only generates text: start writing something, and it will complete the text with the most probable sequence of words. Poem, summary, press article, fake news, code,… GPT-3 can generate good and pretty convincing text for almost everything!
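To illustrate what "complete the text with the most probable sequence of words" means, here is a toy sketch. Be careful: this is not how GPT-3 works internally (GPT-3 is a large neural network, not a word counter). The tiny corpus and the `train_bigrams` and `complete` helpers are mine, and they only show the principle of repeatedly picking the most likely next word.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str):
    """Count, for every word, which words follow it and how often."""
    words = corpus.split()
    follow = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follow[current][nxt] += 1
    return follow


def complete(follow, prompt: str, max_words: int = 5) -> str:
    """Greedily append the most frequent follower of the last word."""
    words = prompt.split()
    if not words:
        return prompt
    for _ in range(max_words):
        candidates = follow.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


corpus = (
    "the cat sat on the mat . "
    "the cat sat on the sofa . "
    "the cat ate the fish ."
)
model = train_bigrams(corpus)
print(complete(model, "the cat", max_words=3))
```

The generator has no notion of intent or parameters. It simply continues whatever text it is given, which is the point of this section.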
Almost everything, including theater-like text, where two actors have a conversation, for instance Bob and Alice.
In the GPT-3 playground, you can simulate a simple conversation between Bob and Alice.
And now, you can reproduce the exact same conversation with ChatGPT. But this time I’m Bob, and ChatGPT plays the role of Alice.
Pretty similar! And that’s totally normal: the text generation process is the same.
So, YES, I know that ChatGPT runs on GPT-3.5, that reinforcement learning is involved, and that the answers are more “conversational”. But the foundations are still the same.
Roughly speaking, ChatGPT is a wrapper around the GPT-3 generative AI, not a conversational agent.
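The "wrapper" idea can be sketched in a few lines: format the conversation as a theater-like script, ask a text generator to continue it, and keep only Alice's next line. This is my own simplified illustration, not OpenAI's actual implementation. `fake_generate` is a hypothetical stand-in for a text-completion model, and `build_prompt` and `chat_turn` are names I made up.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


def build_prompt(history: List[Turn], bot_name: str = "Alice") -> str:
    """Render the conversation as a script that ends on the bot's cue."""
    lines = ["The following is a conversation between Bob and Alice."]
    lines += [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"{bot_name}:")
    return "\n".join(lines)


def chat_turn(
    generate: Callable[[str], str],
    history: List[Turn],
    user_message: str,
    user_name: str = "Bob",
    bot_name: str = "Alice",
) -> List[Turn]:
    """Add the user's message, let the generator continue, keep one reply."""
    history = history + [(user_name, user_message)]
    completion = generate(build_prompt(history, bot_name))
    # A plain generator may keep writing both roles: cut at Bob's next line.
    reply = completion.split(f"\n{user_name}:")[0].strip()
    return history + [(bot_name, reply)]


def fake_generate(prompt: str) -> str:
    """Hypothetical model: always continues the script the same way."""
    return " Hi Bob, nice to meet you!\nBob: How are you?\nAlice: Fine."


history = chat_turn(fake_generate, [], "Hello Alice, I'm Bob.")
print(build_prompt(history[:1]))
print(history[-1])
```

Notice that nothing here extracts parameters or calls an external service: the "conversation" is only text completion with some formatting around it.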
Bing ChatGPT version
Despite that limitation, Microsoft thought that something great could be built by combining ChatGPT with a search engine.
The objective is to transform ChatGPT into a true conversational agent, backed by Bing search to query the internet and improve the answers, and thus to cover stages 2 and 3 of a conversational agent.
Because ChatGPT is not natively designed to be a conversational agent, Microsoft had to perform surgery and plug in an engine named Prometheus. It sounds like a hack, but it’s already up and a few selected users have had the chance to test it, for better and often for worse. That last point was to be expected: because it’s not the native behavior of ChatGPT, the current adaptation becomes freakish from time to time, like Frankenstein!
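I don't know how Prometheus works internally, so take the following only as a generic sketch of the idea of "a text generator backed by a search engine": run the search first, paste the results into the prompt, then let the generator write the answer. `fake_search` and `fake_generate` are hypothetical stand-ins for a search engine and a language model, and the prompt wording is my own assumption.

```python
from typing import Callable, List


def search_backed_answer(
    question: str,
    search: Callable[[str], List[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Query the search engine, then ask the generator to answer from the results."""
    snippets = search(question)[:top_k]
    context = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    prompt = (
        "Answer the question using only the search results below.\n\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return generate(prompt).strip()


def fake_search(query: str) -> List[str]:
    """Hypothetical search engine returning canned snippets."""
    return [
        "Paris is the capital of France.",
        "France is a country in Western Europe.",
    ]


def fake_generate(prompt: str) -> str:
    """Hypothetical model: simply repeats the first search result."""
    first = next(
        (line for line in prompt.splitlines() if line.startswith("[1] ")),
        "[1] I found nothing.",
    )
    return " " + first[len("[1] "):]


print(search_backed_answer("What is the capital of France?", fake_search, fake_generate))
```

In this sketch the quality of the final answer depends directly on what the search step returns, whatever the generator is.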
Google Bard version
On the other side of the internet, and shaken by Microsoft’s bold move, Google also announced its own conversational agent, backed by its own LLM: Bard.
Bard uses LaMDA, an LLM specially trained for dialog (the D in the name stands for Dialogue). LaMDA understands the intent, extracts the parameters, uses Google Search (and other Google tools/AI to curate answers, I guess to eliminate fake news) to get the search response, and generates a verbose text answer.
There is too little information about Bard and its real capabilities. The product is in the dogfooding phase at Google (and my contacts tell me that it is getting better day after day), but there is no external feedback for now.
Core features and differentiators
Microsoft shot first, for better and for worse. Google is following closely. I have no doubt that within the next 2 years both giants will achieve a great, stable and efficient integration of a conversational agent into their search engines.
So, in the end, the conversational agents will be similar.
Where, then, will the differences lie in the next-gen search engine?
1. Firstly, go back to the 2000s and ask yourself why Google came to dominate the internet over its competitors.
Search engine quality, efficiency and reliability are the answer. Today, try Google Search, try Bing Search (without ChatGPT) and compare the results.
Which is the most relevant, the most efficient, the most trustworthy?
Google Search has this quality.
If each conversational agent uses its own search engine, and the generative AI reaches roughly the same quality at Microsoft and Google, the difference will be made only by the quality of the search!
2. Then, a conversational agent is great for enhancing search engine answers, but the real goal is a personal assistant that knows you, your profile, your preferences and your data, so that it can enrich answers, pre-select the most relevant one (train ticket, weather location,…) or perform tasks on your behalf.
In that context, Google has a very very VERY large user base, with lots of data in many areas (maps, mail, navigation, shopping,…). Google has been building this base for the past 20 years.
And Microsoft won’t be able to fill this gap in the near future.
3. Finally, I would like to highlight the innovation part. Because YES, GPT-3 is a great product, and YES, Microsoft reuses OpenAI’s amazing work, but where is the innovation?
GPT stands for Generative Pre-trained Transformer. Who introduced the transformer?
Google, in 2017! Ok ok, having good researchers does not mean you are the best at leveraging the power of your own discovery.
But it also means that Google has had that skill for many years now: it can tweak, tune and deeply understand transformers. It might be the only one able to do that. Or it may already be working on the next generation of generative AI algorithms to go to the next level!
Innovation is the key in IT. And I don’t see many releases from Microsoft. I don’t see any innovative products from Microsoft (in the cloud or elsewhere) except the Hololens.
It’s hard to compete without leading innovations.
Lots of noise for a status quo
Since 2020, Google has been focused on building an “Answer engine” instead of a “Search engine”, but without any real demo or view of what it is or what it will be.
In the end, Microsoft simply pushed Google to show its muscle and enter the game.
Microsoft’s buzz was smart, well done and powerful. But in the end, when the shock wave fades, the legacy strengths and weaknesses will remain.
Of course, we will never again browse the internet, interact with machines and write things (code, articles, reports) in the same way as before, but I’m convinced that the current “web order” will remain!