Exclude Query Parameters in Google Analytics 4
In this tutorial you will learn how to exclude query parameters from web addresses in Google Analytics 4.
If you instead want to completely block page views from GA tracking because of their URL parameters, I have another tutorial for that.
You can also use this tutorial to clean other URLs (e.g. video or link addresses) from query strings.
To do this we will build a filter with Google Tag Manager (GTM). It removes any parameter we define from a URL string.
This guide requires that you use GTM for your tracking with GA4.
Before we start solving the problem, let's first define what query parameters are so that we are on the same page.
What is a query parameter?
Query parameters are part of the web address for a web page.
Query parameters start with a ? after the usual web address and assign values to variables. Several query parameters, separated by &, form a query string.
Programmers sometimes use query strings so a server can tell from the parameters that it should serve a modified version of the requested web page.
For example, a query string might look like this:
?site=bluerivermountains.com&referrer=google.com
The entire web address including query parameters then looks like this:
https://bluerivermountains.com?site=bluerivermountains.com&referrer=google.com
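To see how a browser splits such an address, here is a minimal sketch using the standard URL API. It can be pasted into the browser console; the variable names are only for illustration and are not part of the GTM setup described later.

```javascript
// Parse the example address with the standard URL API
var address = new URL("https://bluerivermountains.com?site=bluerivermountains.com&referrer=google.com");

// The query string, including the leading question mark
var queryString = address.search;

// Collect each parameter name and value into a plain object
var params = {};
address.searchParams.forEach(function (value, name) {
  params[name] = value;
});

console.log(queryString);
console.log(params);
```

The object holds one entry per query parameter, here site and referrer.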
With that cleared up, let's now learn how to create a filter function for query parameters in GA4 with the help of Google Tag Manager (GTM).
What causes query parameters in the Google Analytics reports?
User input
Every time a web page is loaded, the GA4 library sends the URL of the page to Google's server with an event.
That means that any visitor can theoretically insert a parameter into the URL and thus send it to your GA reports.
Try it. Add the following query parameter to the end of any web address in the browser and press Enter:
?testParameter=true
The page will load without problems in most cases. If the site uses Google Analytics, the parameter was sent along when the page was loaded.
Tracking Services
Many tools and services use parameters in URLs to track clicks on links.
Google Ads, for example, adds the parameter ?gclid= to the link when someone clicks on an ad.
UTM parameters, which are used for campaign tracking in Google Analytics, are another example.
The only reason these parameters don't show up in Google Analytics reports is that they are automatically filtered out as the parameters are part of Google's tracking system.
Parameters from non-Google tools and services, however, are not filtered out automatically. There are countless examples, since tracking parameters are a common solution.
Here is a short list of tracking services and the associated query parameters:
Tracking Service | Query Parameter |
---|---|
Bing Ads | ?msclkid= |
Facebook Ads | ?fbclid= |
Google DoubleClick | ?gclsrc= |
Adobe Analytics | ?s_kwcid= |
Klaviyo | ?_ke= |
HubSpot | ?hsa_cam= |
eBay | ?mkcid= |
So if any service adds its custom tracking parameter to a link to your site, you will later find it in the Google Analytics reports.
Why are query parameters a problem?
Parameters are not necessarily a problem. But in some scenarios they generate one. For example:
Data privacy issues
Many website systems use parameters during the registration process to send user data to the backend.
If such websites are tracked with Google Analytics or the Facebook pixel, you automatically break Google's and Facebook's terms of use, because private data is sent to their servers via the query parameters in the web addresses.
You will then either get warnings or, in the worst case, have your account blocked. On top of that, you are also breaking EU data protection rules (GDPR).
Problems with Facebook event matching
If you operate Facebook tracking via the browser and via the server using Facebook's Conversion API, tracking data must be deduplicated. Among other things, the web addresses are used for this deduplication. Query parameters that contain personal data are often filtered out in only one of the two data sources, i.e. either in the browser or on the server. As a result, the event matching scores on Facebook plummet.
Problems with data analysis
Tracking tools mostly treat URLs as ordinary strings. This means that parameters are not filtered out automatically. This creates problems in data analysis, since data for the same page path is not grouped.
See the following table as an example:
Video URL | Views |
---|---|
https://myvideos/coolSpring?kjh1249nnj=1 | 3 |
https://myvideos/hotSpring?kasd1249nnj=12 | 1 |
https://myvideos/hotSpring?123456=true | 9 |
https://myvideos/coolSpring?kjdkj49nnj=asasjhb328 | 43 |
https://myvideos/hotSpring?k123nj=false | 2 |
https://myvideos/coolSpring?asf45nj | 15 |
As you can see in the table above, it is difficult to calculate the sum of views for a video when the video URLs contain parameters.
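As an illustration of what grouping by the cleaned address gives you, here is a small sketch that sums the views from the table per URL without its query string. The function name sumViewsByCleanUrl is made up for this example; it only demonstrates the effect and is not part of the GTM solution.

```javascript
// Rows from the table above: video URL and view count
var rows = [
  ["https://myvideos/coolSpring?kjh1249nnj=1", 3],
  ["https://myvideos/hotSpring?kasd1249nnj=12", 1],
  ["https://myvideos/hotSpring?123456=true", 9],
  ["https://myvideos/coolSpring?kjdkj49nnj=asasjhb328", 43],
  ["https://myvideos/hotSpring?k123nj=false", 2],
  ["https://myvideos/coolSpring?asf45nj", 15]
];

// Sum the views per URL after dropping the query string
function sumViewsByCleanUrl(rows) {
  var totals = {};
  rows.forEach(function (row) {
    var url = new URL(row[0]);
    var cleanUrl = url.origin + url.pathname;
    totals[cleanUrl] = (totals[cleanUrl] || 0) + row[1];
  });
  return totals;
}

console.log(sumViewsByCleanUrl(rows));
```

With the query strings removed, the six rows collapse into two: 61 views for coolSpring and 12 for hotSpring.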
How to exclude query parameters in GA4
How can we filter out query parameters from web addresses in Google Analytics 4?
First we will create a JavaScript variable that removes every parameter on a predefined list from the URL. We then send the URL to Google Analytics without the excluded parameters.
Here we go!
1. Delete query parameters in GTM
To remove unwanted parameters from the query string (and the question mark ? itself if nothing is left), the first thing we will do in Google Tag Manager is create a new user-defined variable of type "Custom JavaScript" called Page Location - Custom.
Next, let's add the following custom JS code:
function() {
  // define parameters to exclude
  var excludeStrings = [
    "hsa_acc", "fbclid", "wbraid", "hsa_cam", "hsa_grp", "hsa_ad",
    "hsCtaTracking", "submissionGuid", "hsa_src", "hsa_tgt", "hsa_kw",
    "hsa_mt", "hsa_net", "hsa_ver", "li_fat_id", "q", "msclkid", "ref",
    "cache", "_x_tr_sl", "_sm_nck"
  ];
  var addressString = new URL(document.location);
  var queryString = addressString.search;

  // check if query string holds any parameters, otherwise just return the url without them
  if (queryString.indexOf("?") != -1) {
    // https://stackoverflow.com/questions/901115/how-can-i-get-query-string-values-in-javascript
    var getQueryParamsFromURL = function getQueryParamsFromURL() {
      var match,
        search = /([^&=]+)=?([^&]*)/g,
        decode = function decode(s) {
          return decodeURIComponent(s);
        },
        query = addressString.search.substring(1);
      var urlParams = {};
      while ((match = search.exec(query))) {
        urlParams[decode(match[1])] = decode(match[2]);
      }
      return urlParams;
    };

    // create param object from query string
    var urlParams = getQueryParamsFromURL();

    // if it holds any of the defined parameters, remove the key and keep the rest
    Object.keys(urlParams).forEach(function (key) {
      if (excludeStrings.includes(key)) delete urlParams[key];
    });

    // Create filtered query string
    queryString = new URLSearchParams(urlParams).toString();

    // add ? to querystring unless it's empty
    if (queryString != "") queryString = "?" + queryString;
  }

  // return cleaned URL
  return addressString.origin + addressString.pathname + queryString;
}
Now look at the excludeStrings variable near the top of the code:
An array with a list of parameters for filtering is defined. Each of these strings represents the name of a query parameter. If one of the parameters appears in the web address, it will be deleted.
The rest of the URL and the query string remain intact so that important parameters such as gclid parameters (Google Ads) or UTM parameters (campaign tracking) are not accidentally deleted.
The parameter list in the code snippet above grew over time, mostly from HubSpot and Facebook parameters. If you want, delete all of them and add your own. Just keep the syntax: "parameter1", "parameter2", "parameter3", etc.
For example, if you wanted to add a personal parameter named myPersonalParam, the array would look like this (see the last entry):
// define parameters to exclude
var excludeStrings = [
  "hsa_acc", "fbclid", "wbraid", "hsa_cam", "hsa_grp", "hsa_ad",
  "hsCtaTracking", "submissionGuid", "hsa_src", "hsa_tgt", "hsa_kw",
  "hsa_mt", "hsa_net", "hsa_ver", "li_fat_id", "q", "msclkid", "ref",
  "cache", "_x_tr_sl", "_sm_nck", "myPersonalParam"
];
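If you want to check your parameter list before publishing the container, the same idea can be tried out as a standalone function in the browser console or in Node.js. The following is only a sketch: the helper name stripExcludedParams, the test list and the example.com address are made up for illustration, and it relies on the standard searchParams.delete method rather than the parsing code used in the GTM variable.

```javascript
// Hypothetical test helper: removes the listed parameters from a URL string
function stripExcludedParams(urlString, excludeStrings) {
  var address = new URL(urlString);
  excludeStrings.forEach(function (name) {
    address.searchParams.delete(name);
  });
  // address.search is empty when no parameter is left, otherwise it starts with "?"
  return address.origin + address.pathname + address.search;
}

// Shortened example list, including the personal parameter from above
var testList = ["fbclid", "msclkid", "myPersonalParam"];

console.log(
  stripExcludedParams(
    "https://example.com/page?fbclid=abc&utm_source=news&myPersonalParam=1",
    testList
  )
);
```

In this example only utm_source survives, because it is not on the list; fbclid and myPersonalParam are removed.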
We are nearly finished.
2. Set GA4 configuration
Next, let's go to the tags in our Google Tag Manager container and open the GA4 configuration tag.
Now add a field named page_location to the fields to set and use our new JavaScript variable Page Location - Custom as its value.
This setting overrides the web address we send to Google Analytics with our own custom web address.
That means, if the query parameter name was specified in the code snippet, this custom web address no longer contains the parameter.
Finished.
Filter out all query parameters in GA4
The previous solution is based on the assumption that we have no way of knowing which parameters will be added to the URLs and whether or not they should be filtered out.
The analyst must first notice the parameters in the GA4 reports, then decide if they need to be removed and finally expand the parameter list in the JS variable.
On the one hand, the solution gives the user control, since each filtration is considered at least once; on the other hand, updating the list can be cumbersome.
An alternative solution is to filter out all query parameters except the ones Google Analytics needs (namely gclid and UTM campaign parameters).
However, the disadvantage of this approach is that control over the filtered out parameters is lost. Actually, you only know which parameters are retained (gclid and utm parameters) and nothing more.
Such a solution ensures consistent web addresses in Google Analytics without manual effort. Therefore, if you are willing to give up some control, you can put the following JS code in the Page Location - Custom variable instead of the above script:
function() {
  // define parameters to keep if available
  var includeStrings = ["gclid", "utm_", "gtm_debug"];
  var addressString = new URL(document.location);
  var queryString = addressString.search;

  // check if query string holds any parameters, otherwise just return the url without them
  if (queryString.indexOf("?") != -1) {
    // transpile ES2016 => ES2015
    var _defineProperty = function (obj, key, value) {
      if (key in obj) {
        Object.defineProperty(obj, key, {
          value: value,
          enumerable: true,
          configurable: true,
          writable: true
        });
      } else {
        obj[key] = value;
      }
      return obj;
    };

    // https://stackoverflow.com/questions/901115/how-can-i-get-query-string-values-in-javascript
    var getQueryParamsFromURL = function getQueryParamsFromURL() {
      var match,
        search = /([^&=]+)=?([^&]*)/g,
        decode = function decode(s) {
          return decodeURIComponent(s);
        },
        query = addressString.search.substring(1);
      var urlParams = {};
      while ((match = search.exec(query))) {
        urlParams[decode(match[1])] = decode(match[2]);
      }
      return urlParams;
    };

    var filterParamsFromList = function filterParamsFromList(obj, list) {
      var urlParamKeysFinal = [];
      var urlParamKeys = Object.keys(obj);
      // test each param for availability and create array with final keys
      for (var i = 0; i < list.length; i++) {
        urlParamKeysFinal.push(
          urlParamKeys.filter(function (key) {
            return key.includes(list[i]);
          })
        );
      }
      // merge all keys into one list
      // https://stackoverflow.com/questions/10865025/merge-flatten-an-array-of-arrays
      urlParamKeysFinal = [].concat.apply([], urlParamKeysFinal);
      return urlParamKeysFinal.reduce(function (cur, key) {
        return Object.assign(cur, _defineProperty({}, key, obj[key]));
      }, {});
    };

    // create param object from query string
    var urlParams = getQueryParamsFromURL();

    // Create filtered query string
    queryString = new URLSearchParams(
      // remove any non-matching keys from param object
      filterParamsFromList(urlParams, includeStrings)
    ).toString();

    // add ? to querystring unless it's empty
    if (queryString != "") queryString = "?" + queryString;
  }

  // return cleaned URL
  return addressString.origin + addressString.pathname + queryString;
}
Note that at the beginning of the script, the variable includeStrings is defined with all the parameter names that should always be kept in the URL: gclid and utm parameters.
If there are other parameters you want to keep, just add them to the array.
I also added the gtm_debug parameter. It signals Google Analytics when a page is visited in GTM debug mode. As a result, page views are filtered out of the GA reports during debugging.
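To try out a keep-list before putting it into GTM, the approach can again be reduced to a small standalone function. This is a sketch under assumptions: the helper name keepOnlyParams and the example address are invented for illustration, and the helper uses searchParams.delete instead of the parsing code above. Like the script above, it keeps a parameter when its name contains one of the list entries, so "utm_" covers utm_source, utm_medium and so on.

```javascript
// Hypothetical test helper: keeps only parameters whose name contains an entry of keepStrings
function keepOnlyParams(urlString, keepStrings) {
  var address = new URL(urlString);
  // copy the names first, because deleting while iterating could skip entries
  var names = Array.from(address.searchParams.keys());
  names.forEach(function (name) {
    var keep = keepStrings.some(function (entry) {
      return name.indexOf(entry) !== -1;
    });
    if (!keep) address.searchParams.delete(name);
  });
  // address.search is empty when no parameter is left, otherwise it starts with "?"
  return address.origin + address.pathname + address.search;
}

console.log(
  keepOnlyParams(
    "https://example.com/?utm_source=news&sessionToken=abc&gclid=123",
    ["gclid", "utm_", "gtm_debug"]
  )
);
```

Because the match is a substring test, a parameter such as my_utm_test would also be kept. If that is too loose for your setup, an exact or prefix comparison may be the safer choice.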
Filter query strings from other event parameters in GA4
You can also use the above scripts for other event parameters in GA4, for example for video URLs or URLs of external links.
Near the beginning of my code I define the variable addressString:
var addressString = new URL(document.location);
Now instead generate the variable from the GTM variable that outputs the video URL:
var addressString = new URL({{Video URL}});
The script will from now on remove the query strings from the video URL.
Then you can replace the {{Video URL}} in the GA4 event tag with the new JavaScript variable.
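One thing to watch for when you feed other variables into the script: new URL() throws an error if the value is empty or not a complete web address, which could happen if {{Video URL}} is undefined for a given event. Below is a minimal sketch of a guard, assuming you would rather fall back to the raw value than lose it. The helper name cleanUrlOrFallback is hypothetical, and for brevity it strips the whole query string instead of applying a parameter list.

```javascript
// Hypothetical helper: strips the query string, but returns the input unchanged
// if it cannot be parsed as a complete URL
function cleanUrlOrFallback(rawUrl) {
  try {
    var address = new URL(rawUrl);
    return address.origin + address.pathname;
  } catch (e) {
    // not a parsable URL (e.g. undefined or a relative path): keep the raw value
    return rawUrl;
  }
}

console.log(cleanUrlOrFallback("https://myvideos/coolSpring?kjh1249nnj=1"));
console.log(cleanUrlOrFallback(undefined));
```

Inside a GTM Custom JavaScript variable, the same try/catch could wrap the new URL({{Video URL}}) line.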
Be careful when filtering gclid parameters and UTM parameters
Gclid parameters are query parameters that Google Ads adds to the web address of the landing page when an ad is clicked. For attribution in Google Analytics, it is important that these parameters remain in the URL so that the click can be attributed to the paid search channel.
UTM parameters are campaign parameters that Google Analytics users add to links to your website. Using the utm parameter, the user can later see in the Google Analytics reports exactly which website or campaign a visitor came from.
Gclid parameters and UTM parameters are automatically filtered out by Google Analytics during data processing and are not visible in the reports. You therefore do not have to filter them out manually.