June 2, 2021
How to do Path Analysis in Google Analytics 4
Analysis Hub in Google Analytics 4 is a suite of features that allow you to dive deeper into your data. In the past, I have explained how to work with Exploration reports and Funnel Analysis. This time, let’s take a look at another type of report, Path Analysis.
In Universal Analytics (GA3) flow reports were quite limited (that’s why I rarely used them). But with GA4, the improvement is obvious. Even though Path Analysis is not perfect and there is more room for enhancements, you can already achieve more with it than with flow reports in GA3.
In this blog post, I will show you how to use it and share several ideas for what you can analyze next.
Table of Contents
- What is Path Analysis?
- #1. Overview of the interface
- #2. Examples of Path Analysis
- #3. Ideas of what to analyze next
- Final words
If you prefer video content, take a look at the tutorial on my YouTube channel.
What is Path Analysis in Google Analytics 4?
Simply put, you can see how your users/visitors are using the website/app. You can see what kind of events/pages occurred first and if you click on a certain branch, you can then see what pages/events occurred after that.
This can help you understand paths that your users/visitors take after completing certain actions.
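To make the idea concrete, here is a minimal Python sketch of what a path report counts. It is only an illustration, not how GA4 works internally: the `sessions` sample data and the `next_steps` helper are made up, and for simplicity every path starts at the first event of a session.

```python
from collections import Counter

# Hypothetical sample data: each inner list is one visitor's events in order.
sessions = [
    ["session_start", "page_view", "view_item", "add_to_cart"],
    ["session_start", "page_view", "page_view"],
    ["session_start", "page_view", "view_item"],
]


def next_steps(sessions, path):
    """Count which event comes right after the given path (a list of event names)."""
    counts = Counter()
    n = len(path)
    for events in sessions:
        # Only sessions that begin with the selected branch contribute.
        if events[:n] == path and len(events) > n:
            counts[events[n]] += 1
    return counts


# "Clicking" the page_view branch that follows session_start:
print(next_steps(sessions, ["session_start", "page_view"]))
```

Each key in the result corresponds to a branch you would see in the next column, and each value to the number shown under it.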
If it’s still not quite clear, don’t worry. In the second half of this blog post, we will take a look at practical examples.
#1. Overview of the interface
In this section, I will describe the features that you can use. Later, you will learn how to apply them in practice.
To get started, log in to your Google Analytics 4 property and click Analysis > Analysis Hub (in the left sidebar). Then select Path Analysis.
The interface in the Analysis hub is split into 3 main parts/columns:
- Variables
- Tab settings
- The output (the report/visualization that is generated based on your configuration)
Let’s take a closer look at each part.
#1.1. Variables
This has nothing to do with Google Tag Manager’s variables. The Variables section in the GA4 Exploration reports is the place where you select data that you plan to use in a report:
- Date range
- Segments
- Dimensions
- Metrics
Also, you can change the name of the analysis in the top-left corner of the interface.
To select a different date range, click the date in the top-left corner (below the analysis name) and select whatever you need (like “last 7 days”, “Last 30 days”, etc.).
#1.1.1. Segments
If you want to see how different groups of your users/visitors are behaving, you should include those segments in the Segments section first (I will later show how to add them to the actual path analysis).
You can choose from several segments that are already included or you can add your own. You can do that by clicking the plus icon:
Then you can either create a custom segment or select a suggested one.
Speaking of custom segments, there are 3 types:
- User segment where you can include data of all users that match certain user criteria. Example: show all events of users who are from the US. This is equivalent to user segments in Universal Analytics.
- Session segment where you can include data of sessions that match certain criteria. Example: show all events of sessions where visitors downloaded an ebook. This is equivalent to session segments in Universal Analytics.
- Event segment where you include ONLY events that match certain criteria. This was not available in Universal Analytics. Example: show data only for events whose name is either page_view or session_start.
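The difference between the three segment types is easier to see on a tiny data set. The following Python sketch is only an illustration under simplified assumptions: the field names (`user`, `session`, `name`) and the helper functions are hypothetical and do not mirror GA4's actual data model, and the user segment here is defined by an event the user triggered rather than by a user attribute such as country.

```python
# Hypothetical event rows; field names are illustrative only.
events = [
    {"user": "u1", "session": "s1", "name": "session_start"},
    {"user": "u1", "session": "s1", "name": "page_view"},
    {"user": "u1", "session": "s2", "name": "session_start"},
    {"user": "u1", "session": "s2", "name": "file_download"},
    {"user": "u2", "session": "s3", "name": "session_start"},
    {"user": "u2", "session": "s3", "name": "page_view"},
]


def is_download(event):
    return event["name"] == "file_download"


def event_segment(events, condition):
    """Keep only the events that match the condition themselves."""
    return [e for e in events if condition(e)]


def session_segment(events, condition):
    """Keep every event of any session that contains a matching event."""
    matching = {e["session"] for e in events if condition(e)}
    return [e for e in events if e["session"] in matching]


def user_segment(events, condition):
    """Keep every event of any user who has at least one matching event."""
    matching = {e["user"] for e in events if condition(e)}
    return [e for e in events if e["user"] in matching]


print(len(event_segment(events, is_download)))    # 1: the download event only
print(len(session_segment(events, is_download)))  # 2: all events of session s2
print(len(user_segment(events, is_download)))     # 4: all events of user u1
```

The same condition returns more data as the scope widens from event to session to user.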
#1.1.2. Dimensions and metrics
Dimensions in Google Analytics are parameters/attributes of an event, product, transaction, user, etc. Basically, they are attributes that describe something. For example:
- Event name
- Transaction ID
- User’s pricing plan
- Product category, etc.
Metrics help us measure. For example, how many events were made yesterday, how many transactions were completed last week, etc.
Here are some examples of metrics that you can find in GA4:
- Event count
- Number of active users
- Number of transactions
- Engagement rate
However, at the moment of writing this blog post, Path Analysis supports only three metrics:
- Active users
- Event count
- Total users
Now that we have covered the basics, here’s the catch in the Analysis hub. If you want to use a metric or a dimension in any analysis technique within that report, you have to include the dimension/metric in the Variables column first. You can do that by clicking the Plus icon and then selecting what you need.
After you select the needed metric/dimension, click the Apply button in the top-right corner.
As I’ve said before, you cannot add more metrics to your Path Analysis. They will all be greyed out.
To sum up, the Variables column is responsible for the data input. If you want to use some segments/metrics/dimensions, they must be included in that column. If you are missing something later down the line, you can add new items on the fly. Also, you can change the date range (which, obviously, also affects the scope of data input).
#1.2. Tab settings
In this column, you can configure what the report will look like. First, there is a Technique drop-down menu (where you can select from things like Exploration, Funnel Analysis, etc.). In this blog post, I focus only on the Path Analysis part.
#1.2.1. Segment
Here you can add one segment to analyze only a certain subset of your data. Unlike in Exploration reports, segment comparison is not available here.
#1.2.2. Node type
Unfortunately, you cannot change anything here. This part shows the types of nodes that you can use in the report.
What is a node? Every branch in the path analysis visualization is a node. At the moment, the data can be split by Event name, Page title and screen name, or Page title and screen class. I wish there was one more dimension here, Page Path. Hopefully, this will be added at some point in the future.
#1.2.3. View unique nodes only
Below the Node type, there is a View unique nodes only switch. It’s easiest to explain this feature with an example. Imagine that a visitor lands on your website and you track the following events in this exact order:
- session_start
- page_view
- page_view
- view_promotion
If that toggle is disabled, consecutive identical event names (or page/app screen values) are all shown. The flow will look like this: session_start -> page_view -> page_view -> view_promotion.
If you enable View unique nodes only, consecutive identical events are collapsed and only the first one is shown. With the 4 events above, the flow will look like this: session_start -> page_view -> view_promotion.
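If it helps to think about this in code, the toggle behaves roughly like collapsing runs of identical consecutive values. Here is a minimal Python sketch of that idea. The `unique_nodes` function is a hypothetical helper written for this illustration, not something GA4 exposes.

```python
from itertools import groupby


def unique_nodes(events):
    """Collapse runs of identical consecutive values, keeping the first of each run."""
    return [name for name, _run in groupby(events)]


events = ["session_start", "page_view", "page_view", "view_promotion"]

print(" -> ".join(events))                # toggle disabled
print(" -> ".join(unique_nodes(events)))  # toggle enabled
```

Note that only consecutive repeats are collapsed. In this sketch, a page_view that comes back after some other event still appears as its own node.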
#1.2.4. Breakdown
In this section, you can add a dimension that you want to use as an attribute to slice your data. For example, you can add the Device category dimension, and then you will see how your mobile users are behaving vs desktop.
At the bottom of the visualization, you will start seeing a legend. Click on the one that you are interested in and it will be highlighted.
#1.2.5. Values
Here you can add a metric that will be displayed (as numbers) in the path analysis visualization. As of the moment of writing this blog post, 3 metrics are supported:
- Active users
- Event count
- Total users
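To see why the number under a node changes when you switch the metric, here is a simplified Python sketch that compares Event count with Total users for a single node. The sample rows and the two helper functions are hypothetical. Active users is left out on purpose, because it depends on GA4's engagement criteria, which this toy data does not model.

```python
# Hypothetical rows that all belong to one node (page_view); user IDs are made up.
node_events = [
    {"user": "u1", "name": "page_view"},
    {"user": "u1", "name": "page_view"},
    {"user": "u2", "name": "page_view"},
]


def event_count(rows):
    """Every event is counted, even when the same user triggers it again."""
    return len(rows)


def total_users(rows):
    """Each user is counted once, no matter how many events they triggered."""
    return len({row["user"] for row in rows})


print(event_count(node_events))  # 3
print(total_users(node_events))  # 2
```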
#1.2.6. Filters
You can add filters that will be used to narrow down a subset of data that you are analyzing. However, I’ve noticed some quirky behavior here.
Here is a situation that I wanted to achieve.
I wanted to see only certain events in the path analysis: page_view, session_start, and purchase. Unfortunately, I was not able to make it work. Even when I added a single filter (Event exactly matches page_view), the output of the report was empty.
If you, like me, want to narrow down just to a certain list of events, create an Event segment instead (where you narrow down to just certain events). See the screenshot below.
However, a filter where I narrowed it down to only mobile users worked well.
#1.2.7. Node filters
If you want to exclude a certain node from your visualization, right-click that node’s name and select Exclude node. You can then choose to exclude it from all paths or just from this one.
That filter then will be added to the Node filters section.
#1.3. The Output (report)
Once you configure all the necessary things in the Variables and Tab Settings columns, the path analysis visualization will be visible. Here you can do several things.
Add new tabs if needed. Every tab can use a different analysis technique (exploration, funnel analysis, etc.). Also, duplicate and delete them by clicking the Triangle icon next to a tab name:
Start over, undo, redo, share the report, download report:
In fact, clicking Start over will unlock a long-requested feature, reverse pathing. In GA3, it was only possible to create flow reports from the beginning (e.g. when someone lands on the homepage).
In GA4 path analysis, you can start from the end (e.g. when someone subscribes to your newsletter). Then you can see what events/pageviews occurred before a visitor subscribed. This can give you some hints and a better understanding of what happens.
#1.3.1. Navigating the Path Analysis report
When the path analysis visualization is ready, click on a branch that you are interested in. Then you will see what visitors did next. Under the page title or event name, you will see a metric. This metric depends on what you have selected in the Values section of the Tab settings.
If you want to change the starting point (for example, you need a different event or page), you will have to click Start over. This will clear the visualization (but your Variables and Tab Settings will remain unchanged).
Speaking of all the other columns in the visualization, you can change their types, for example, instead of Event Name, you can select Page Title and Screen Name.
If you don’t want to see certain nodes in a particular column, you can click the pencil icon above it and then uncheck events/pages/screen names that you don’t want to see.
Then those values will be hidden in the bottom branch (a.k.a. More).
In Path Analysis, you can have up to 10 columns.
If you want to start from the end (a.k.a. reverse pathing), you will need to click Start over. Then click on the Ending point and select whether the initial column will be based on the event, page title, or some other dimension displayed there.
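Conceptually, reverse pathing asks "what happened right before this event?". Here is a minimal Python sketch of that question. It is an illustration only: the `sessions` sample data and the `previous_steps` helper are hypothetical and greatly simplified compared to what GA4 does.

```python
from collections import Counter

# Hypothetical sample data: each inner list is one visitor's events in order.
sessions = [
    ["session_start", "page_view", "view_item", "generate_lead"],
    ["session_start", "page_view", "generate_lead"],
    ["session_start", "page_view", "scroll"],
]


def previous_steps(sessions, ending_point, steps_back=1):
    """Count the events found `steps_back` positions before each ending_point."""
    counts = Counter()
    for events in sessions:
        for i, name in enumerate(events):
            if name == ending_point and i - steps_back >= 0:
                counts[events[i - steps_back]] += 1
    return counts


print(previous_steps(sessions, "generate_lead"))                # one step back
print(previous_steps(sessions, "generate_lead", steps_back=2))  # two steps back
```

The third session never reaches the ending point, so it does not contribute to the counts at all.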
#2. Examples of Path Analysis
Enough of the theory. It’s time for some practical stuff. In this demonstration, I will be using the official Google Analytics 4 demo account. It allows you to play around with real data that is coming from Google’s Merch Store.
If you want to get access to it too, click this link and then click ACCESS DEMO ACCOUNT.
If you are not sure how to use the demo account, watch this quick tutorial.
#2.1. What happens in the checkout after the begin_checkout event
If you have implemented GA4 e-commerce tracking (you can learn how to do it in my Intermediate/Advanced GTM course), you will start collecting the begin_checkout event that is sent every time a visitor starts a checkout.
Let’s take a look at what your visitors do when this event is fired.
First, click Start over in the top-right corner of the analysis hub. Then select the Event name in the starting point.
Then a sidebar will appear on the right side. Select begin_checkout.
Out of curiosity, let’s disable the View unique nodes only option.
This is a friendly reminder that I am doing this example in the GA4 demo account. When you do this, the data will be different (because we are working with different date ranges).
Now, I will click the first begin_checkout. I see that the next event is page_view but there is one more event hiding under More and the event count there is even larger. I’ll click it.
Another begin_checkout event? Let’s click it.
One more begin_checkout event. That’s not good. Either the tracking setup on the Google Merch store is incorrect or something is wrong with the actual checkout.
In these cases, I usually start doubting the tracking setup. Things can easily break (or maybe they were never correct in the first place).
So I decided to start the checkout on Google’s merch store myself and check the browser console with Data Layer Inspector. Turns out, when the checkout begins, several identical events are sent to the same GA4 property.
The best option here is to contact the developer of the website (whoever is responsible for managing the dataLayer.push code) and ask them to fix it. Only one begin_checkout event should be sent to GA when a visitor starts the checkout.
And I was able to identify this with the help of Path Analysis in Google Analytics 4.
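If you happen to have the ordered list of event names for a session (for example, copied from a debugging tool), the same symptom can be checked with a few lines of Python. This is just a sketch: the `repeated_runs` helper and the sample `session` list are hypothetical, and the numbers are not taken from the demo account.

```python
def repeated_runs(events, name):
    """Return the length of every run where `name` fires more than once in a row."""
    runs = []
    current = 0
    for event in events:
        if event == name:
            current += 1
        else:
            if current > 1:
                runs.append(current)
            current = 0
    # Do not forget a run that lasts until the end of the list.
    if current > 1:
        runs.append(current)
    return runs


# Hypothetical session with the duplicated event described above.
session = ["view_cart", "begin_checkout", "begin_checkout", "begin_checkout", "page_view"]
print(repeated_runs(session, "begin_checkout"))  # [3]
```

An empty list means the event never fired twice in a row, which is what you would expect from a healthy checkout setup.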
#2.2. An example of reverse pathing
In my YouTube tutorial about path analysis, I also share an example of reverse pathing.
I saw that Google’s merchandising store had a lot of error events tracked. So I clicked Start over, selected the event errors as the ending point of the analysis, and then checked what users were doing before it.
Turns out there was a broken link on the homepage that redirected visitors to a “Page not found” error. In fact, that error was the most popular event that people were doing after they landed on the homepage. Yikes!
You can watch the YouTube video with that path analysis example here.
#3. Ideas of what to analyze next with Path Analysis
If you are looking for more inspiration for what to analyze next with Path Analysis in Google Analytics 4, here are several ideas. I won’t be diving deep into their actual configuration but I believe that after reading this blog post and watching the tutorial, you should be able to do this yourself.
Idea #1. What are your visitors doing once they land on your homepage?
Idea #2. What are your visitors doing in the checkout when they face an error? You can create a session segment where there is one event named begin_checkout and another event called errors (or whatever the error event is called in your setup). Just keep in mind that neither begin_checkout nor errors is an automatically tracked event. You will need to configure them manually.
Idea #3. What are your visitors doing before they subscribe to a newsletter? Let’s say that every time someone subscribes to your newsletter, a generate_lead event is sent to GA4. Then you could select that event as an ending point of the path analysis and go backwards. Learn what is the most common behavior of your visitors/users in this case.
Idea #4. What do your users do after they log in? For that, you would need to implement login tracking. Ask a developer to push the login event to the Data Layer every time a user logs in. Then send that event to GA4. If you want to learn more about custom event tracking with GTM and GA4, my course for beginners is exactly what you need.
Path Analysis in Google Analytics 4: Final words
Compared to flow reports in Universal Analytics (GA3), path analysis is a definite improvement. The ability to do reverse pathing alone makes it worth a try.
If some of this blog post’s content was a bit difficult to follow, I recommend watching my YouTube tutorial. Topics like these, in my opinion, are learned best when you combine text + video.