Lean web analytics
I don't like Google Analytics. It is slow and invasive. So I'm removing it completely.
I'm still curious about analytics, because they can answer questions like:
- Which blog posts and articles are interesting to people?
- Are there any broken links that I missed?
- How do people find my content?
At the same time, I don't want to call third-party services, let companies set tracking cookies, or store personal information for mining.
Here is the current setup:
- Caddy - open source web server with automatic HTTPS
- GoAccess - open source web log analyzer
- Blog - custom static site generator similar to Pelican
Web server
Web traffic is served by Caddy. It is a lean web server that handles SSL certificates out of the box.
/etc/caddy/CaddyFile looks like this:
abdullin.com {
    root * /var/www/abdullin.com
    file_server
    encode zstd gzip
    handle_errors {
        @404 {
            expression {http.error.status_code} == 404
        }
        rewrite @404 /404.html
        file_server
    }
    log {
        output file /var/log/caddy/abdullin.com-access.json
    }
}
This serves the contents of /var/www/abdullin.com. It also records structured access logs to /var/log/caddy/abdullin.com-access.json. These logs are rotated and eventually cleaned up.
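Since the access log is just one JSON object per line, it is easy to poke at it directly. Here is a minimal sketch that answers the "broken links" question by counting requests that ended in a 404. It assumes each entry has a top-level `status` field and a `request.uri` field, which is how Caddy's JSON access logs usually look; the `broken_links` helper and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def broken_links(lines):
    """Count requests that ended in a 404, keyed by the requested URI."""
    hits = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank, truncated or partially written lines
        if not isinstance(entry, dict):
            continue
        # Assumed field names: "status" and "request" -> "uri"
        if entry.get("status") == 404:
            uri = entry.get("request", {}).get("uri", "?")
            hits[uri] += 1
    return hits


# Hypothetical sample entries, trimmed to the fields used above
sample = [
    '{"status": 200, "request": {"uri": "/"}}',
    '{"status": 404, "request": {"uri": "/old-post/"}}',
    '{"status": 404, "request": {"uri": "/old-post/"}}',
    'not json',
]

for uri, count in broken_links(sample).most_common():
    print(count, uri)
```

To run it against the real log, pass an open file instead of the sample list, for example `broken_links(open("/var/log/caddy/abdullin.com-access.json"))`.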
Web Analytics
Analytics can be done with GoAccess, which supports Caddy's JSON access log format out of the box. Just install the latest version and execute:
goaccess abdullin.com-access.json --log-format CADDY --ignore-crawlers
Or you can generate an HTML report:
goaccess access.json --log-format CADDY --ignore-crawlers -o report.html
GoAccess configuration is located at /etc/goaccess/goaccess.conf. You can enable referral details there. My overrides:
exclude-ip MY_IP
#comment these out
#ignore-panel REFERRERS
#ignore-panel KEYPHRASES
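For a quick look at "how do people find my content?" without GoAccess, the same log can be grouped by referring site. This is only a sketch: it assumes request headers are logged under `request.headers` as lists of strings (so the referrer sits in `request.headers.Referer`), and the `referrer_hosts` helper with its `own_host` parameter is a hypothetical name, not part of any tool.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def referrer_hosts(lines, own_host="abdullin.com"):
    """Count external referring hosts found in the access log lines."""
    hosts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or partially written lines
        if not isinstance(entry, dict):
            continue
        # Assumed layout: request -> headers -> "Referer": ["https://..."]
        headers = entry.get("request", {}).get("headers", {})
        for ref in headers.get("Referer", []):
            host = urlsplit(ref).hostname
            if host and host != own_host:  # ignore internal navigation
                hosts[host] += 1
    return hosts


# Hypothetical sample entries, trimmed to the fields used above
sample = [
    '{"request": {"uri": "/", "headers": {"Referer": ["https://news.ycombinator.com/item?id=1"]}}}',
    '{"request": {"uri": "/post/", "headers": {"Referer": ["https://abdullin.com/"]}}}',
    '{"request": {"uri": "/", "headers": {}}}',
]

for host, count in referrer_hosts(sample).most_common():
    print(count, host)
```

Internal clicks are dropped by comparing against `own_host`, so only visits arriving from other sites are counted.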
It is possible to download the MaxMind GeoIP database and use it to aggregate visits by country or city:
goaccess access.json --log-format CADDY --ignore-crawlers --geoip-database City.mmdb
Next
This approach is nice, but it stores IP addresses and doesn't display user interaction flows. We try to improve things in (Over) Designing privacy-first analytics.
Published: June 10, 2022.
Next post in Opinionated Tech story: Analyze website logs with clickhouse-local