Intro
The main job of a pixel is to extract useful information about the user (the URL with all its parameters, the referrer URL, the IP address, a fingerprint for a guest user or the user ID of a logged-in user, etc.) and deliver it to our data warehouse.
There are three steps to this process:
- Collect the information for pageviews and events.
- Send it to the backend.
- Make this data available in our data warehouse.
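For example, the collected data for a single pageview might look like this (a hypothetical record shape; the field names are assumptions, not taken from any particular pixel):

```python
# A minimal sketch of the record a pixel might collect for one pageview.
# All field names here are hypothetical, not from Matomo or Snowplow.

def collect_pageview(url, referrer, ip, fingerprint=None, user_id=None):
    """Assemble a pageview record; guests get a fingerprint, logged-in users an ID."""
    return {
        "event_type": "pageview",
        "url": url,                # full URL, including query parameters
        "referrer_url": referrer,  # where the visitor came from
        "ip_address": ip,          # useful for geo lookup and bot filtering
        "visitor_id": user_id if user_id is not None else fingerprint,
        "is_logged_in": user_id is not None,
    }

record = collect_pageview(
    url="https://example.com/pricing?utm_source=newsletter",
    referrer="https://example.com/",
    ip="203.0.113.7",
    fingerprint="fp_8f3a",
)
print(record["visitor_id"])  # → fp_8f3a
```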
Third-party solutions
The quickest way is to set up a third-party pixel. Companies like https://matomo.org/ and https://snowplowanalytics.com/ provide a pixel that is almost identical to the one from Google Analytics and can give you the raw data.
DIY solution
Pixel code
Both Matomo and Snowplow have open-sourced their pixels.
This means the first step from our list is covered: we just need to host the minified pixel JS on our CDN, and our pages can start sending pageviews and events.
Backend and storage
After adding the pixel JS to our web pages, we need to configure it with a URL to send the data to.
Monolith approach
We can implement such an endpoint in our production server. It’s the quickest way, although our analytics traffic will compete with our users for resources, which is not ideal.
Microservice approach
We can implement a custom backend endpoint that receives requests from the pixel and stores them in a database.
This won’t add latency to the user experience, but it gives us another server to maintain.
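A minimal sketch of what such a collector endpoint might look like, written as a plain WSGI app using only the Python standard library; the in-memory `STORED` list stands in for a real database, and the parameter names are assumptions:

```python
# A sketch of a pixel collector endpoint as a WSGI app (Python stdlib only).
# In production this would sit behind a real server and write to a database;
# here records go into an in-memory list to keep the example runnable.
from urllib.parse import parse_qs

STORED = []  # stand-in for the database

def pixel_app(environ, start_response):
    params = parse_qs(environ.get("QUERY_STRING", ""))
    record = {
        "url": params.get("url", [""])[0],
        "referrer_url": params.get("referrer_url", [""])[0],
        "ip_address": environ.get("REMOTE_ADDR", ""),
    }
    STORED.append(record)
    # Respond with 204 No Content: the pixel does not need a body back.
    start_response("204 No Content", [])
    return [b""]
```

In practice you would run this behind a WSGI server such as `wsgiref.simple_server` or gunicorn, and replace the list append with a database or queue write.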
Serverless
We can go serverless and implement an AWS Lambda function that will process pixel requests.
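A sketch of such a Lambda handler, assuming the API Gateway proxy event format; the database write is stubbed with a `print`, and the response body is a transparent 1x1 GIF so the pixel request receives a real image:

```python
# A sketch of an AWS Lambda handler for pixel requests, assuming the
# API Gateway proxy event shape. The storage write is stubbed out.
import base64

# A tiny transparent GIF, so the pixel <img> request gets a real image back.
TRANSPARENT_GIF = base64.b64encode(
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00"
    b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00\x00"
    b"\x02\x02D\x01\x00;"
).decode()

def handler(event, context):
    params = event.get("queryStringParameters") or {}
    record = {
        "url": params.get("url", ""),
        "referrer_url": params.get("referrer_url", ""),
        "ip_address": (event.get("requestContext", {})
                       .get("identity", {}).get("sourceIp", "")),
    }
    # In a real function we would write `record` to Kinesis, S3, or a database.
    print(record)  # CloudWatch Logs can serve as a crude sink while prototyping
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "image/gif", "Cache-Control": "no-store"},
        "isBase64Encoded": True,
        "body": TRANSPARENT_GIF,
    }
```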
CDN logs
The best approach is to use the CDN itself to log requests. In that scenario, the pixel won’t send AJAX requests to our server but will instead load a 1px image from our CDN: https://our-cdn-url.com/?url=...&referrer_url=...&...
It’s the fastest approach because we’re leveraging the CDN’s architecture (requests hit the server closest to the user).
If you go with the CDN approach, you’ll have to parse the CDN logs regularly to extract the pageview and event data.
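A sketch of that parsing step, assuming a Common Log Format-style access log; real CDN log formats differ by provider, so the regex and field names here are illustrative only:

```python
# A sketch of extracting pageview data from CDN access logs. Log formats
# vary by provider; this assumes a Common Log Format-style line purely
# to illustrate the approach.
import re
from urllib.parse import urlparse, parse_qs

LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) HTTP/[\d.]+" \d+')

def parse_pixel_hit(line):
    """Return a pageview record from one log line, or None if it doesn't match."""
    m = LOG_LINE.match(line)
    if not m:
        return None
    ip, path = m.group(1), m.group(2)
    params = parse_qs(urlparse(path).query)
    return {
        "ip_address": ip,
        "url": params.get("url", [""])[0],
        "referrer_url": params.get("referrer_url", [""])[0],
    }

sample = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
          '"GET /?url=https%3A%2F%2Fexample.com%2Fpricing'
          '&referrer_url=https%3A%2F%2Fgoogle.com HTTP/1.1" 200')
print(parse_pixel_hit(sample))
```

A scheduled job (e.g. a cron task or another Lambda) would run this over each new log file and load the resulting records into the warehouse.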