Google Analytics Problem
tl;dr
I went on a bit of an adventure working out why some pages didn’t appear in Google Analytics. The answer turned out to be simple (it always does) but it took a while to get there and I enjoyed it.
The problem
I maintain http://www.projectreal.co.uk/. It’s hosted on GitHub Pages from https://github.com/equalityTime/projectreal and I use GA4 to track how many of our lesson plans are downloaded.
Until the 19th October 2021 all the right pages appeared in Google Analytics.
But since then, none of the session pages (like http://www.projectreal.co.uk/session1) turn up at all:
I’m going to try and work out what stopped the pages from being seen.
The first hint was that I made some commits to the site on that day.
But those diffs are very simple so I don’t think it’s them.
Maybe I pushed some other commits that day? (I.e. I’d made a bunch of commits locally but only pushed them on that day.)
Can I find out what date commits were pushed?
It turns out that this answer gave me a great way of seeing all the pushes and other actions I’d done on the repo - but unhelpfully I only cloned this particular repo this January. There’s also nothing useful in my work logs.
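For the record, the trick boils down to Git's reflog, which records when each ref changed in the local clone, pushes included. Here is a minimal sketch, assuming the reflog is the mechanism that answer relied on; the function name and the default branch name `origin/master` are my own placeholders:

```shell
# List every recorded update to a remote-tracking branch, with ISO dates.
# Entries labelled "update by push" are pushes made from this clone.
# Assumption: the branch is origin/master; pass another name if it is not.
show_pushes() {
  git reflog show --date=iso "${1:-origin/master}"
}

# Usage (inside the repository):
#   show_pushes origin/master | grep 'update by push'
```

The reflog only exists in the clone where the actions happened, which is exactly why a fresh clone had nothing to say about pushes made before it existed.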
Tracking down the offending commit
To find out which commit made the difference I’m going to revert the site to a previous state (the 8th October 2021) and see if it works.
19/10/22 15:15:49 git branch mcve
19/10/22 15:15:53 git checkout mcve
19/10/22 15:15:57 git reset --hard 70e0da036de2b7509e89782f740b893a1037c909
19/10/22 15:16:01 git status
19/10/22 15:16:03 git log
19/10/22 15:16:16 git push origin mcve
EDIT: I did the Git stuff in a slightly silly way because of a lack of confidence, but am still quite pleased.
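With hindsight, the branch can be created directly at the old commit, without the checkout and hard reset. A sketch of what I would probably type now (the function name is only for illustration):

```shell
# Create a branch pointing at a given commit and publish it,
# without touching the current working tree.
publish_snapshot() {
  branch=$1
  commit=$2
  git branch "$branch" "$commit" && git push origin "$branch"
}

# Usage:
#   publish_snapshot mcve 70e0da036de2b7509e89782f740b893a1037c909
```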
That gave me this:
OMG it worked? The commit from Oct 8th successfully shows the realtime view?
Yep, checked the next morning. Okay. Now to move up the chain.
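Each step up the chain meant pushing a branch and waiting for Analytics, so I did it by hand. For a check that can be scripted, `git bisect run` does the same walk automatically. A minimal sketch; the function name and the grep check in the usage line are illustrative assumptions, not what I actually ran:

```shell
# Binary-search the commits between a known-good and a known-bad revision.
# Everything after the two revisions is the check command: it must exit 0
# on a good commit and non-zero (1-127, except 125) on a bad one.
find_breaking_commit() {
  good=$1
  bad=$2
  shift 2
  git bisect start "$bad" "$good" && git bisect run "$@"
}

# Usage (hypothetical check: does the page still have its heading?):
#   find_breaking_commit 70e0da0 HEAD grep -q '^## Session 1' session1.md
#   git bisect reset   # return to where you started
```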
I worked my way up the chain to commit 382e83c198576605563561566fa9755f1f527320. That commit is the one that makes the difference. It has this sort of diff:
--- a/session1.md
+++ b/session1.md
@@ -2,7 +2,6 @@
layout: page
---
-## Session 1: Fake News
The materials for the session can be found on the PowerPoint (“Fake News”). The PowerPoint has notes which outline the timings and other information which might be useful. This session should take around 50-60 minutes. You do not need any equipment for this session except a projector. The session is based on small group work/paired work.
The aims of this session are to teach students:
diff --git a/session2.md b/session2.md
index bfc16fb..f1b61a9 100644
--- a/session2.md
+++ b/session2.md
@@ -2,7 +2,6 @@
layout: page
---
I rejected this commit earlier because it was only ‘content’, and that couldn’t make a difference to the Analytics. But now I’m thinking that ‘removes the page title’ possibly does.
Let’s find out about titles
Looking back at the failing version, the title isn’t used at all?
Actually, the title is this:
<title>Project Real | Using domain experts and co-creation to combat misinformation online and in person. Shared under a creative commons Attribution-NonCommercial-ShareAlike CC BY-NC-SA licence.</title>
which is the description from the .yml file.
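A quick way to compare titles between builds is to pull the `<title>` straight out of the generated HTML. A rough sketch, assuming the element sits on a single line (as it does in the output here); `page_title` is just a name I made up:

```shell
# Print the contents of the first <title> element in an HTML file.
# Assumes the opening and closing tags are on the same line.
page_title() {
  sed -n 's/.*<title>\(.*\)<\/title>.*/\1/p' "$1" | head -n 1
}

# Usage (assuming a local build in Jekyll's default _site directory):
#   page_title _site/session4.html
```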
So that’s super strange. The title in the last version that worked is:
<title>Session 2: Fake Photos | Project Real</title>
So Jekyll is taking the title from the first heading in the page? But then something very strange is happening, because when there is a heading, it’s shown twice:
So where does this title come from?
So I look at the _includes files in Jekyll and can’t find a <title> tag. When I look again at the generated html I find:
</script>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1"><!-- Begin Jekyll SEO tag v2.8.0 -->
<title>Project Real | Using domain experts and co-creation to combat misinformation online and in person. Shared under a creative commons Attribution-NonCommercial-ShareAlike CC BY-NC-SA licence.</title>
<meta name="generator" content="Jekyll v3.9.2" />
<meta property="og:title" content="Project Real" />
<meta property="og:locale" content="en_US" />
<meta name="description" content="Using domain experts and co-creation to combat misinformation online and in person. Shared under a creative commons Attribution-NonCommercial-ShareAlike CC BY-NC-SA licence." />
<meta property="og:description" content="Using domain experts and co-creation to combat misinformation online and in person. Shared under a creative commons Attribution-NonCommercial-ShareAlike CC BY-NC-SA licence." />
<link rel="canonical" href="http://www.projectreal.co.uk//session4.html" />
<meta property="og:url" content="http://www.projectreal.co.uk//session4.html" />
<meta property="og:site_name" content="Project Real" />
<meta property="og:type" content="website" />
<meta name="twitter:card" content="summary" />
<meta property="twitter:title" content="Project Real" />
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"WebPage","description":"Using domain experts and co-creation to combat misinformation online and in person. Shared under a creative commons Attribution-NonCommercial-ShareAlike CC BY-NC-SA licence.","headline":"Project Real","url":"http://www.projectreal.co.uk//session4.html"}</script>
<!-- End Jekyll SEO tag -->
<link rel="stylesheet" href="/assets/main.css"><link type="application/atom+xml" rel="alternate" href="http://www.projectreal.co.uk//feed.xml" title="Project Real" /></head>
<body><header class="site-header" role="banner">
Everything in the segment above is in _includes/head.html except the stuff between the “Begin Jekyll SEO tag v2.8.0” and “End Jekyll SEO tag” comments, which is generated by a `{% seo %}` tag. That solves a problem from over a year ago: I couldn’t work out why the double header problem was only occurring on the live site rather than locally. I think that’s the fault of the SEO plugin.
Okay, so what’s the plan?
My hypothesis is that removing the SEO plugin will let pages have natural titles, particularly if I also cherry pick the right commit. Let’s try that.
Okay, so I removed the SEO text, but then found that I was getting NO page view notes at all (although it did solve the header problem). None.
Finally working.
It turned out that I hadn’t set the page.title attribute for the affected pages. I hadn’t done it because when I did, this happened:
More to the point - the SEO plugin gets confused when there isn’t a title - so it inserts a title directly above the heading that it JUST took the title from. Sigh.
I used this handy blog to properly remove code from the header while keeping the titles and now I have a functioning site that correctly tracks pages!
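To catch this earlier next time, something like the following would list the pages whose front matter has no `title:` line. It is a sketch under assumptions: pages are `.md` files in one directory, front matter is the block between the first two `---` lines, and `pages_without_title` is a hypothetical helper rather than anything Jekyll provides:

```shell
# Print every Markdown file in a directory whose YAML front matter
# has no 'title:' key. Files with no front matter at all are listed too.
pages_without_title() {
  for f in "${1:-.}"/*.md; do
    [ -f "$f" ] || continue
    awk '
      NR == 1 && $0 != "---" { exit }              # no front matter at all
      NR > 1 && $0 == "---"  { exit }              # end of front matter
      /^title:/              { found = 1; exit }   # title key present
      END { exit (found ? 0 : 1) }
    ' "$f" || echo "$f"
  done
}

# Usage (from the root of the site):
#   pages_without_title .
```

Any file it prints needs a `title:` line added to its front matter so that page.title is set.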
Also…
In the process of trying to locate the error I discovered this:
Which I will have to get around to.