Every time someone visits your website, they leave traces. Pages viewed. Buttons clicked. Items added to a cart. This is first-party data — information you collect directly from your own visitors, on your own site. It’s the most valuable data you have, and unlike third-party tracking, it doesn’t require shady workarounds or invasive surveillance.
But first-party data collection has boundaries. Some things you can track freely. Others require explicit consent. And a few things you simply shouldn’t track at all, no matter how useful they seem. In this guide, I’ll walk you through what counts as first-party data, what you can legally collect, and where the lines are drawn.
First-Party vs Second-Party vs Third-Party Data
Before diving into what you can track, let’s clarify the three types of data. This distinction matters because privacy regulations treat them very differently.
| Data Type | Who Collects It | How It’s Collected | Examples | Privacy Risk |
|---|---|---|---|---|
| First-party | You, on your website | Directly from your visitors | Pageviews, form submissions, purchase history | Low (if handled correctly) |
| Second-party | A trusted partner | Shared through a data agreement | Co-marketing data, partner audience insights | Medium |
| Third-party | External companies | Cross-site tracking, cookies, ad networks | Browsing history across sites, ad profiles | High |
First-party data is the gold standard for privacy-first analytics. You collected it. You know the context. Your visitors gave it to you — either explicitly (by filling out a form) or implicitly (by browsing your site). Third-party data, by contrast, is collected by someone else and often aggregated without meaningful consent.
The shift away from third-party cookies is making first-party data even more important. As browsers block cross-site tracking and regulations tighten, businesses that rely on first-party data are better positioned. For more context on this shift, see our guide on how tracking works without cookies.
What Counts as First-Party Data?
First-party data is any information collected directly through your own channels. It falls into two categories: behavioural data (what people do on your site) and declared data (what people tell you).
Behavioural Data (Observed)
- Pageviews — which pages were visited, in what order, and for how long (see why pageviews still matter)
- Sessions — groups of interactions within a time window (learn about sessions in web analytics)
- Click events — button clicks, link clicks, downloads
- Scroll depth — how far down a page someone reads
- Referral source — how the visitor arrived (search, social, direct)
- Device and browser type — screen size, operating system, browser version
- Approximate location — country or city based on IP (without storing the IP)
- Cart actions — items added, removed, or abandoned
Declared Data (Given by the Visitor)
- Email address — from newsletter sign-ups or account creation
- Name and contact details — from forms, checkouts, or support requests
- Preferences — language, notification settings, product interests
- Survey responses — feedback, NPS scores, product reviews
- Purchase history — what someone bought, when, and for how much
What You Can Track Without Consent
Under most privacy frameworks — including GDPR — you can collect certain data without asking for consent, provided you do it properly. The key requirement is that the data must be anonymous or aggregated, meaning it cannot be tied back to a specific individual.
Here’s what typically falls into the consent-free zone:
| Data Point | Consent Needed? | Conditions |
|---|---|---|
| Aggregate pageview counts | No | No individual identification |
| Referral source (aggregated) | No | Not linked to individual sessions |
| Browser/device type (aggregated) | No | Not used to fingerprint individuals |
| Country-level location | No | Derived without storing IP addresses |
| Page performance metrics | No | Technical data, not personal |
| Anonymous event counts | No | E.g., “42 people clicked the signup button” |
This is exactly how privacy-first analytics tools like Plausible, Umami, and Fathom operate. They collect aggregate statistics without identifying individual visitors. No cookies. No IP storage. No consent banner needed.
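To make the idea of aggregate-only collection concrete, here is a minimal sketch in Python. It is not how any of the tools above are implemented; the in-memory `pageview_totals` store and the function names are assumptions for illustration. The point is that only a running total per day and page is kept, with nothing that refers to a visitor.

```python
from collections import Counter

# Hypothetical in-memory store: totals keyed by (day, page path).
# Nothing about the visitor (IP, cookie, user ID) is ever recorded.
pageview_totals: Counter = Counter()


def record_pageview(day: str, path: str) -> None:
    """Increment the aggregate counter for one page on one day."""
    pageview_totals[(day, path)] += 1


def top_pages(day: str, limit: int = 3) -> list[tuple[str, int]]:
    """Return the most viewed paths for a day, highest count first."""
    counts = Counter(
        {path: n for (d, path), n in pageview_totals.items() if d == day}
    )
    return counts.most_common(limit)


record_pageview("2024-05-01", "/pricing")
record_pageview("2024-05-01", "/pricing")
record_pageview("2024-05-01", "/blog")
print(top_pages("2024-05-01"))  # [('/pricing', 2), ('/blog', 1)]
```

Because the store holds only totals, there is no individual record to link back to a person later.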
For a deeper look at whether you need a cookie banner, check our guide: Do you actually need a cookie banner?
What Requires Consent
The moment you start collecting data that identifies an individual, or could be used to identify one, you enter consent territory. Under GDPR, such data is called personal data. Other frameworks, like the Australian Privacy Act, apply similar rules.
Data that requires consent includes:
- Email addresses — even for analytics purposes
- Full IP addresses — these are personal data under GDPR
- User IDs or account identifiers — anything that links activity to a person
- Cookies that track individuals — persistent cookies, session cookies for tracking
- Device fingerprints — combining browser, screen, and OS data to identify someone
- Cross-site tracking data — following users between different websites
- Location data (precise) — GPS or neighbourhood-level location
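One way to enforce this list in practice is a small consent gate that strips identifying fields from an event unless the visitor has opted in. The sketch below is an illustration only: the field names and the `apply_consent` function are hypothetical and would need to match your own event schema.

```python
# Hypothetical field names; adapt them to your own event schema.
CONSENT_REQUIRED_FIELDS = {
    "email",
    "ip_address",
    "user_id",
    "device_fingerprint",
    "precise_location",
}


def apply_consent(event: dict, has_consent: bool) -> dict:
    """Return a copy of the event, dropping identifying fields without consent."""
    if has_consent:
        return dict(event)
    return {k: v for k, v in event.items() if k not in CONSENT_REQUIRED_FIELDS}


event = {
    "path": "/checkout",
    "country": "AU",
    "email": "a@example.com",
    "ip_address": "203.0.113.7",
}
print(apply_consent(event, has_consent=False))
# {'path': '/checkout', 'country': 'AU'}
```

Filtering on the way in, rather than at reporting time, means the identifying fields are never stored in the first place.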
What You Should Never Track
Some data is either illegal to collect, or so risky that no analytics benefit justifies it:
- Health information — unless you’re a health provider with appropriate safeguards
- Racial or ethnic origin
- Political opinions or religious beliefs
- Biometric data — facial recognition, fingerprint data
- Data about children — under-13s (or under-16s in some jurisdictions) require parental consent
- Passwords or payment card numbers — never log these in analytics events
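Under GDPR Article 9, most of these (health data, racial or ethnic origin, political opinions, religious beliefs, and biometric data used to identify someone) are special category data and have the strictest protections. Children’s data (covered by Article 8) and credentials such as passwords and card numbers are not special categories, but they warrant the same caution. Even accidental collection, such as a free-text form field that captures health information, can create liability.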
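A simple guard against passwords or card numbers leaking into analytics events is to scrub event properties before they are sent. The following is a rough sketch, not a complete safeguard: the key names in `BLOCKED_KEYS` are hypothetical, and the pattern only catches card-like digit runs, so it will not detect something like health details typed into a free-text field.

```python
import re

# Hypothetical key names; extend this set to match your own forms.
BLOCKED_KEYS = {"password", "card_number", "cvv"}

# Matches runs of 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def scrub_event(props: dict) -> dict:
    """Drop blocked keys and mask card-like numbers in string values."""
    clean = {}
    for key, value in props.items():
        if key.lower() in BLOCKED_KEYS:
            continue  # never forward these fields at all
        if isinstance(value, str):
            value = CARD_LIKE.sub("[redacted]", value)
        clean[key] = value
    return clean


raw = {"plan": "pro", "password": "hunter2", "note": "card 4111 1111 1111 1111"}
print(scrub_event(raw))
# {'plan': 'pro', 'note': 'card [redacted]'}
```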
How Privacy-First Tools Handle First-Party Data
Privacy-first analytics tools are designed to collect useful first-party data while staying within legal boundaries. Here’s how they do it:
| Technique | What It Does | Used By |
|---|---|---|
| IP hashing/discarding | Converts or removes IP addresses before storage | Plausible, Fathom, Umami |
| No cookies | Avoids persistent identifiers entirely | Plausible, Fathom, Umami, GoatCounter |
| Session estimation | Uses hashed, rotating identifiers instead of cookies | Plausible, Fathom |
| Aggregation | Stores only totals, not individual records | GoatCounter |
| Data minimisation | Collects only what’s needed for the report | All privacy-first tools |
The result is that you get meaningful analytics — traffic sources, top pages, visitor counts, and trends — without ever storing data that could identify a specific person. For a complete guide to setting this up properly, read our privacy-compliant tracking guide.
Practical Advice for Small Businesses
If you’re running a small business website, here’s a straightforward approach to first-party data collection:
- Use a cookie-free analytics tool — Plausible, Fathom, or Umami will give you traffic stats without consent requirements
- Keep form data separate from analytics — your email sign-up form and your analytics tool don’t need to share data
- Don’t track what you won’t use — if you’ll never look at scroll depth data, don’t collect it
- Write a clear privacy policy — explain what you collect and why, even if consent isn’t required
- Set data retention limits — don’t keep analytics data forever. 12–24 months is usually enough
- Review your forms — make sure you’re not accidentally collecting data you don’t need
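The retention limit in particular is easy to automate. Below is a minimal sketch of a purge step, assuming analytics rows are dictionaries with a `day` field and that 730 days (roughly 24 months) is the chosen policy. Both the row shape and the `purge_old_rows` name are hypothetical; in a real setup this would more likely be a scheduled delete against your database.

```python
from datetime import date, timedelta

RETENTION_DAYS = 730  # roughly 24 months; an assumed policy value


def purge_old_rows(
    rows: list[dict], today: date, retention_days: int = RETENTION_DAYS
) -> list[dict]:
    """Keep only rows whose day falls inside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [row for row in rows if row["day"] >= cutoff]


rows = [
    {"day": date(2022, 1, 10), "path": "/", "views": 120},
    {"day": date(2024, 3, 2), "path": "/", "views": 95},
]
kept = purge_old_rows(rows, today=date(2024, 6, 1))
print(len(kept))  # 1
```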
I worked with a small e-commerce client in Brisbane who was collecting 23 data points on their checkout form. After an audit, we reduced it to 8. Conversion rates went up, privacy risk went down, and they had cleaner data in their analytics. Less really is more.
The Future of First-Party Data
As browsers restrict third-party cookies and privacy regulations expand, first-party data becomes your most reliable source of insight. Businesses that build strong first-party data practices now will be better placed than those that still depend on third-party data.
This doesn’t mean tracking more. It means tracking better. Collecting the right data, with clear consent where needed, and turning it into action. A small, clean dataset you actually use is worth more than a massive data warehouse you never open.
Frequently Asked Questions
Is first-party data always compliant with GDPR?
Not automatically. First-party data can include personal data (like email addresses), which requires a legal basis for processing under GDPR. Anonymous, aggregated first-party data — like pageview counts — doesn’t require consent. But the moment you can link data to an individual, GDPR applies.
Can I use first-party data for remarketing?
Yes, but only with proper consent. If someone gives you their email address via a form and you want to send them marketing emails, you need explicit opt-in consent. The data may be first-party, but using it for marketing beyond its original purpose requires additional permission.
What’s the difference between first-party data and first-party cookies?
First-party data is the information itself — pageviews, purchases, form submissions. First-party cookies are a mechanism for collecting some of that data. A first-party cookie is set by your domain (not a third party), but it can still track individuals and may require consent depending on its purpose.
Do privacy-first analytics tools collect first-party data?
Yes. Tools like Plausible, Fathom, and Umami collect first-party behavioural data — pageviews, referral sources, device types. However, they do it without cookies and without storing personal identifiers. The data they collect is anonymous by design, which is why they don’t require consent banners.
How long should I keep first-party data?
GDPR’s storage limitation principle says you should keep personal data only as long as necessary for the purpose you collected it for. For analytics, 12–24 months is a common retention period. For declared data like customer emails, keep it as long as the relationship is active plus any legal retention requirements. Set automatic deletion policies where possible.
The Bottom Line
First-party data collection is the foundation of ethical, effective analytics. You can learn a tremendous amount about your visitors — what they want, where they struggle, what makes them convert — without ever crossing a privacy line. The key is knowing the difference between anonymous behavioural data and personal identifiers, and treating each appropriately.
Start with a privacy-first tool. Track what matters. Ask for consent when you need personal data. And remember: the goal isn’t to know everything about your visitors. It’s to know enough to serve them better.