ChatGPT-4o Enters The Generative Search Race

Generative Search Runners and Riders

There’s a new player in generative search for B2B – and it’s not a search engine

Why was ChatGPT so poor for B2B research in the past?

I didn’t include ChatGPT in my previous research on generative search as it was clear it just couldn’t be a good tool for most B2B research. That’s down to the way ChatGPT is built. It uses a language model that’s built from a fixed set of training data. This brings in limitations of both recency and scope:

Recency

LLMs are built and trained using data from a fixed moment in time. For instance in the case of ChatGPT 3.5, the training process finished in early 2022. ChatGPT 3.5 simply doesn’t know anything about the world after that point, and never will. New versions of the language models are released from time to time but there will always be a time lag between the LLM’s latest model and current reality. For most B2B research that lack of recency is a showstopper – there’s no point drawing up a shortlist of suppliers based on the market dynamics of two years ago, or making a choice of product based on outdated pricing or feature information.

Scope

AI models are built using impressively large sets of training data, but they can’t encompass every single fact from the whole internet. A lot of B2B research is about very detailed, niche questions – such as “what’s the cheapest dog-friendly flexible office in Milton Keynes?”. ChatGPT and other LLMs will never be able to encompass all of those facts within their training sets. So if you ask them a question like that, you’ll find that either they don’t know the answer, or – worse – they will “hallucinate” and invent a plausible but non-factual answer. Neither of those is helpful for a B2B research task.

So if you used ChatGPT 3.5 for B2B research, you’d have obtained an out-of-date, incomplete and quite possibly non-factual answer. Not an attractive option!

A quick note on ChatGPT naming and pricing

Before we go too much further: there are three main “versions” of ChatGPT in play as of today:

ChatGPT-3.5 was the first version to really impact the mass market. It’s still available via a free tier of ChatGPT’s usage model. It was trained on data up to early 2022. It does NOT support the live web links that I’m focussing on today.

ChatGPT-4 was a major new version released in March 2023 and trained on data up to September 2021. It’s only available as a paid tool with subscriptions costing from US$20 per month.

ChatGPT-4o (the “o” stands for “omni”, reflecting its ability to work with different modes of communication such as images and audio) was rolled out around May 2024 and is again a paid tool available via the same subscription as ChatGPT-4. It has a lot of improvements over previous versions including substantially faster response times.

If you don’t mind stumping up the US$20/month for a paid account, I can’t see any reason to ever prefer one of the earlier versions. ChatGPT-4o seems superior to its predecessors in every meaningful way.

So when did ChatGPT become a search engine?

ChatGPT-4o (and indeed its predecessor ChatGPT-4, but NOT the free ChatGPT-3.5) have a lot of different capabilities to link intelligently to other technologies outside the core LLM. One is the ability to carry out a live web search and access other live web data. That is – ChatGPT recognises that it needs data from outside its training set in order to answer a question, creates the relevant web search, browses some of the live pages in the search results and pulls back the results from the search engine and integrates those results into the conversation. Here’s an example:

See how ChatGPT has decided to search for “best coworking space in Oxford UK 2024” in response to my question? And then it has fetched data from the sites that appear in the search results, and integrated all of that into its answer:

If I want, I can click on that search icon in the chat history and see the details:

I guess in principle ChatGPT could use any search engine for this purpose, but in all of the examples I’ve seen it uses Bing. That’s not surprising given the close commercial relationship between Bing’s owner Microsoft and ChatGPT’s owner OpenAI.

Does this approach look familiar? It’s basically the same process used by Bing Chat (now renamed Copilot). So we’re seeing a convergence of sorts between ChatGPT as a general-purpose AI-powered assistant, and the AI-powered search engines like Bing Copilot and Google SGE.

How does ChatGPT perform as a B2B “search engine”?

This all sounds exciting but it’s time to bring out my favourite question about a new technology: “is it any use?”. I’ve put ChatGPT-4o through its paces using my previous research methodology. As a reminder, this uses a standard set of 12 realistic B2B research tasks split across different industries and different stages of the research/buying journey.

The short summary: ChatGPT-4o is good. Really good. The experience is a step change relative to any generative search system I’ve tested before. Here are a few highlights from my testing.

ChatGPT-4o makes intelligent choices about when and how to use external search and external browsing

I showed an example above where ChatGPT-4o took my prompt “what’s the best coworking space in Oxford?” and immediately went to Bing with “best coworking space in Oxford UK 2024”. That’s smart – “Oxford UK” to avoid any confusion with other places, and “2024” in recognition of the implied recency in my question.

ChatGPT-4o is smart – sometimes!

Compare these answers to my test question “how should I split my budget between Google Search and LinkedIn Ads?”. The first is from Bing Chat back in April 2023:

It’s not bad. There’s a suggested numerical answer to the question and a clearly-quoted source. There’s some additional information about the technical details of budgets on the different platforms, which is correct but irrelevant to my original question. But here’s how ChatGPT-4o answered that same question:

Now that’s a longer, much more thoughtful answer. Instead of just “use a 50:50 split”, ChatGPT-4o has taken a broader interpretation of my question and has outlined the strategy one should use for deciding on and monitoring the best split. And it’s a GOOD answer! I’d be happy to receive an answer like that from an expert B2B PPC practitioner.

ChatGPT-4o responds well to follow-up prompts. This is two-edged. If you try to use ChatGPT-4o like a traditional search engine, where you put in a single query and then scroll through a bunch of answers, you will sometimes be disappointed. But if you are willing to learn the habit of using a few follow-up prompts, you’ll get much better results. And it’s not hard to learn some good prompting habits.

Here’s an example from my research:

I found the follow-up prompt “What were your sources for the last answer?” incredibly helpful. I also found that including the sentence “Please check your information is current” within the prompt would usually encourage ChatGPT-4o to check relevant websites.

ChatGPT-4o is fast. Logically this shouldn’t really matter. If a tool saves me an hour of manual research, whether I wait 5 seconds or 50 seconds for the output isn’t a big deal. But previous generative search tools did feel a little sluggish sometimes. ChatGPT-4o has trimmed its response times to the point where it is lot easier to remain fully engaged with the results as they appear. There are still occasional pregnant pauses, but overall the experience feels much more conversational and the end result is more pleasant at a human level. This is especially valuable when follow-up prompts are needed – which they often are. I think most B2B researchers will feel this is a big deal.

ChatGPT-4o still hallucinates. I’ve written before about the “hallucination” issue with generative search. My testing suggests that hallucinations are rarer and more subtle in ChatGPT-4o than in previous generative search tools. But they still exist, and in niche searches they have potentially dangerous consequences. For instance when I used my test question “Is Sharp Ahead a good B2B digital marketing agency?”, ChatGPT-4o referenced a very plausible, but totally imaginary, positive online review. The saving grace here is that ChatGPT-4o will reference its sources, so it’s easier to fact check its answers.

There are no ads, and no evidence that ChatGPT-4o sees Bing Ads. Even when ChatGPT-4o has pulled in search results from Bing that would normally include paid ads, the results that appear in ChatGPT-4o show no trace of the paid content.

There are still a few obvious bugs. For instance, I saw a few examples of formatting problems when I asked ChatGPT-4o to put its results in a table.

And you’ll see that one of the suggestions for “best alternatives to WP Engine” is, erm, WP Engine.

But these bugs were minor and didn’t significantly undermine my overall confidence in the tool or the usefulness of the results.

The learning curve for ChatGPT-4o is shallow and enjoyable. I learned a lot about ChatGPT-4o in just a few hours of methodical testing. I wasn’t frustrated or annoyed at any point. Sometimes the tool didn’t do what I wanted, but I was able to quickly pick up techniques that improved the results. The rapid response time helped here, I’m sure.

I’d like to share a few more of my tips and tricks about how to get the best from ChatGPT-4o as a B2B research tool, but that will have to wait for another blog. (Or get in touch if you want to chat!)

What might this mean for B2B marketers?

I’ve already written some of the possible implications of generative search for the future of B2B marketing here. I think everything I said then still holds true, and the emergence of ChatGPT as a credible B2B search engine introduces another possibility – if ChatGPT is successful, high intent B2B search traffic might disappear from conventional search engines altogether.

ChatGPT is currently offered as a paid tool. Its use is funded by subscriptions, not by advertising. If that stays the same and if ChatGPT is able to win market share from Google, Bing and the other established search engines, paid search might become a much less important part of the B2B marketing mix in future. And SEO will need to change to target ChatGPT’s special use cases.

But will ChatGPT actually displace the existing search engines? In part that will depend on how good a job Google and Bing do with their own generative search experiences. And those are improving all the time.

So it’s time for me to take a fresh look at those and see how the latest evolutions of generative search from Bing and Google compare with the new ChatGPT kid on the block. It’s going to be particularly interesting to compare ChatGPT-4o with the latest developments from Bing – is it better to add generative AI to a search engine, or to integrate a search engine within a generative AI tool? Stay tuned for an update in a future blog post!

If you are interested in how Generative Search might impact your B2B marketing strategies, or if you need help with any other aspect of your B2B digital marketing, please get in touch. We offer a free, no-obligation 30-minute consultation.