How it works

Five detection layers run on every contact-form submission. Each layer is cheap, fast, and runs in a fixed order.

FormFence doesn't ask your customers to identify traffic lights or pick the bicycles. It runs five checks on every contact-form submission, in a fixed order, and decides whether the message reaches your inbox.

The cheapest checks run first. If any layer blocks a submission, the rest don't run.
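The short-circuit behaviour can be sketched as follows. This is a minimal illustration, not FormFence's actual code: the `Submission`, `Layer`, and `Verdict` types, the `runLayers` function, and the two toy layers are all hypothetical names chosen for the example.

```typescript
// Hypothetical types for illustration; not FormFence's real data model.
interface Submission {
  email: string;
  body: string;
  honeypot: string;
}

type Verdict = { blocked: true; layer: string } | { blocked: false };

interface Layer {
  name: string;
  // Returns true when this layer wants to block the submission.
  blocks: (s: Submission) => boolean;
}

// Run layers in their fixed order and stop at the first one that blocks.
// `ran` collects the names of the layers that actually executed.
function runLayers(layers: Layer[], s: Submission, ran: string[] = []): Verdict {
  for (const layer of layers) {
    ran.push(layer.name);
    if (layer.blocks(s)) {
      return { blocked: true, layer: layer.name };
    }
  }
  return { blocked: false };
}

// Two toy layers, cheapest first.
const layers: Layer[] = [
  { name: "honeypot", blocks: (s) => s.honeypot.length > 0 },
  { name: "content", blocks: (s) => /viagra/i.test(s.body) },
];
```

Because the loop returns on the first blocking layer, the more expensive checks later in the list cost nothing for submissions that an early layer already rejected.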

Why this is needed alongside Shopify's hCaptcha

Shopify applies invisible hCaptcha to most customer-facing forms by default, including the contact form. hCaptcha is a risk-scoring engine tuned to catch automated form-filling at scale: many submissions, high velocity, machine-like behaviour, suspicious IPs, headless browsers. It does that job well, and you should leave it on.

The shift since 2024 is that the new wave of contact-form spam doesn't look like that. It arrives one carefully crafted message at a time, on a residential IP, from a plausible-looking email address, with body text written by GPT-4-class models. Those signals don't trip hCaptcha's risk scoring. They look like a real customer right up until you read the body and notice it's pitching SEO services or a crypto launch.

hCaptcha also gives you no visibility. There's no log, no verdict reason, no way to see what was blocked, no way to rescue a false positive. A real customer hitting a challenge and abandoning the form is invisible to you. So is whatever hCaptcha is letting through.

FormFence is built for the layer hCaptcha doesn't cover. The two systems do different jobs and run side by side:

Shopify hCaptcha decides "can this submission happen at all?" (automation-style protection)
FormFence decides "is this submission worth your time?" (content-aware filtering, with a log you can read)

Keep hCaptcha on. Add FormFence on top.

The five layers

1. Honeypot field

The Theme App Extension adds an invisible field to your contact form. Your customers never see it and never type in it. Bots that fill every field they find will fill this one too, and any submission that arrives with it filled in is blocked.

This catches the bulk of automated spam before any other check runs. It costs nothing on the storefront (one hidden input element) and nothing on the server (a string comparison).
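As a rough sketch of the server-side comparison, assuming the form arrives as a map of field names to strings: the field name `contact[company_website]` and the function `tripsHoneypot` are made up for this example and are not the names the extension actually uses.

```typescript
// Hypothetical field name; the real extension's field name is not documented here.
const HONEYPOT_FIELD = "contact[company_website]";

// A submission trips the honeypot when the hidden field holds any non-blank text.
// A missing field is treated the same as an empty one.
function tripsHoneypot(fields: Record<string, string | undefined>): boolean {
  const value = fields[HONEYPOT_FIELD] ?? "";
  return value.trim() !== "";
}
```

A real customer's browser submits the hidden field empty, so `tripsHoneypot` returns `false` for them.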

2. Per-IP rate limit

If one IP address submits more than a small number of contact-form messages in a short window, the later ones are dropped. Real customers don't fill out your contact form five times in a minute. Spam bots routinely do.

The threshold is fixed and not user-configurable.
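One common way to implement this kind of check is a sliding-window counter per IP. The sketch below is an assumption about the general technique, not FormFence's implementation: the class name `IpRateLimiter` is hypothetical, the store is in-memory for simplicity, and the limit and window are constructor arguments because the real fixed values are not stated on this page.

```typescript
// In-memory sliding-window limiter. The limit and window are placeholders:
// FormFence's real threshold is fixed but not published on this page.
class IpRateLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxPerWindow: number;
  private readonly windowMs: number;

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true if the submission is allowed, and records it if so.
  // Dropped submissions are not recorded, so they do not extend the window.
  allow(ip: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.hits.get(ip) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxPerWindow) {
      this.hits.set(ip, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(ip, recent);
    return true;
  }
}
```

With `new IpRateLimiter(3, 60_000)`, for example, a fourth submission from the same address inside one minute is dropped while other addresses are unaffected.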

3. Disposable email detection

The submitter's email is checked against a curated list of throwaway email providers (Mailinator, Guerrilla Mail, and similar). Submissions from those domains are blocked.

The list is updated as new providers surface. If you want a specific domain treated differently, email formfence@harbourlabs.app and we'll review it.
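A domain check of this kind might look like the sketch below. The two-entry `DISPOSABLE_DOMAINS` set is a tiny stand-in for the curated list, and `isDisposableEmail` is a hypothetical name. Matching subdomains of a listed provider is an assumption of this example, not a documented FormFence behaviour.

```typescript
// Tiny sample list; the curated list is far larger and updated over time.
const DISPOSABLE_DOMAINS = new Set(["mailinator.com", "guerrillamail.com"]);

function isDisposableEmail(email: string): boolean {
  const at = email.lastIndexOf("@");
  if (at < 0) return false; // no domain part to check
  const domain = email.slice(at + 1).trim().toLowerCase();
  const labels = domain.split(".");
  // Check the full domain, then each parent domain with at least two labels,
  // so "inbox.mailinator.com" matches the "mailinator.com" entry.
  for (let i = 0; i + 1 < labels.length; i++) {
    if (DISPOSABLE_DOMAINS.has(labels.slice(i).join("."))) return true;
  }
  return false;
}
```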

4. Content patterns

The message body and subject are checked against a layered catalogue of detection rules covering every spam family we've seen in real Shopify contact forms:

High-confidence keyword and regex rules that block on a single distinctive phrase (for example "reset your password", "claim your prize", "viagra")
Wordlist density rules that fire when several themed terms appear together (for example romance vocabulary like "lonely + relationship + soulmate")
Weighted-vocabulary scoring that sums small weights across ~80 marketing-spam terms and blocks when the total crosses a threshold. This catches templated spam built from generic words ("free", "click here", "amazing") only when many of them co-occur, which keeps the false-positive risk low

Families currently in the catalogue: phishing, romance / dating, reward / prize, marketing-promo blasts, NSFW, pharma, work-from-home, crypto, SEO outreach, cold web design, gambling, loan scams, Latin filler (lorem-ipsum spam), URL floods, link shorteners, all-caps body, very short body. The catalogue grows from real spam landing in real shops.
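To make the three rule types concrete, here is a minimal sketch. Every phrase, weight, and threshold in it is illustrative rather than taken from the real catalogue, and the function names are hypothetical. It also uses naive substring matching (so "free" would match "freedom"), which a production rule set would likely tighten.

```typescript
// All phrases, weights, and thresholds are illustrative, not FormFence's catalogue.
const HIGH_CONFIDENCE: RegExp[] = [/claim your prize/i, /\bviagra\b/i];

const ROMANCE_TERMS = ["lonely", "relationship", "soulmate"];
const ROMANCE_MIN_HITS = 3;

const MARKETING_WEIGHTS: Record<string, number> = {
  "free": 1,
  "click here": 2,
  "amazing": 1,
  "limited time": 2,
  "guaranteed": 2,
};
const WEIGHTED_THRESHOLD = 8; // assumed value for this example

// Sum the weight of every listed term that appears in the text.
function weightedScore(text: string): number {
  const lower = text.toLowerCase();
  let total = 0;
  for (const [term, weight] of Object.entries(MARKETING_WEIGHTS)) {
    if (lower.includes(term)) total += weight;
  }
  return total;
}

// Returns the kind of rule that fired, or null when nothing matched.
function contentVerdict(text: string): "keyword" | "density" | "weighted" | null {
  const lower = text.toLowerCase();
  if (HIGH_CONFIDENCE.some((re) => re.test(text))) return "keyword";
  const hits = ROMANCE_TERMS.filter((t) => lower.includes(t)).length;
  if (hits >= ROMANCE_MIN_HITS) return "density";
  if (weightedScore(text) >= WEIGHTED_THRESHOLD) return "weighted";
  return null;
}
```

A genuine question such as "Is shipping free to Canada?" scores only 1 here and passes, while a message that stacks all five terms reaches the threshold and is blocked.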

5. AI classifier (LLM hybrid)

When the rule-based layers haven't decided, the message goes to an AI classifier (Anthropic Claude Haiku via Vercel AI Gateway, zero data retention) for a second-opinion verdict. The classifier returns spam / not-spam plus a confidence score; FormFence blocks only when the AI is at least 70% confident the message is spam.

The AI step:

Only runs when rules left the verdict as "passed" and the body is long enough to classify (~20+ characters). Clearly-spam messages caught by rules and clearly-clean short messages never trigger it.
Has a 2-second timeout. On timeout or error, the rule-based verdict stands (fail-open). A legitimate customer is never blocked because the AI was slow.
Can be disabled per shop from the Settings page. With it off, only rule-based detection runs and no submission text is ever sent to the AI provider.
Is fully audited per submission: the detail pane shows whether the AI was consulted, what it returned, and at what confidence.
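The gating, confidence threshold, and fail-open timeout described above can be sketched like this. The classifier itself is passed in as a function, because the real request and response shapes are not documented on this page; `AiResult`, `Classify`, and all function names are hypothetical.

```typescript
// Hypothetical shapes: the real classifier request and response are not
// documented on this page.
interface AiResult {
  spam: boolean;
  confidence: number; // 0 to 1
}
type Verdict = "passed" | "blocked";
type Classify = (body: string) => Promise<AiResult>;

const MIN_BODY_CHARS = 20; // "~20+ characters"
const MIN_CONFIDENCE = 0.7; // block only at 70% confidence or more
const TIMEOUT_MS = 2000; // 2-second timeout

// The AI is consulted only when it is enabled, rules passed the message,
// and the body is long enough to classify.
function shouldConsultAi(ruleVerdict: Verdict, body: string, aiEnabled: boolean): boolean {
  return aiEnabled && ruleVerdict === "passed" && body.trim().length >= MIN_BODY_CHARS;
}

// A null result stands for "timed out or failed", which fails open.
function applyAiResult(result: AiResult | null): Verdict {
  if (result !== null && result.spam && result.confidence >= MIN_CONFIDENCE) {
    return "blocked";
  }
  return "passed";
}

async function aiSecondOpinion(
  ruleVerdict: Verdict,
  body: string,
  classify: Classify,
  aiEnabled: boolean = true,
  timeoutMs: number = TIMEOUT_MS,
): Promise<Verdict> {
  if (!shouldConsultAi(ruleVerdict, body, aiEnabled)) return ruleVerdict;

  // Race the classifier against a timer that resolves to null.
  let onTimeout: () => void = () => {};
  const timeout = new Promise<null>((resolve) => {
    onTimeout = () => resolve(null);
  });
  const timer = setTimeout(() => onTimeout(), timeoutMs);
  try {
    return applyAiResult(await Promise.race([classify(body), timeout]));
  } catch {
    return applyAiResult(null); // classifier error: the "passed" verdict stands
  } finally {
    clearTimeout(timer);
  }
}
```

Treating a timeout and an error identically, as a `null` result, keeps the fail-open rule in one place: nothing short of a confident spam verdict can turn "passed" into "blocked".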

Sensitivity controls strictness across all layers

The Sensitivity setting (Low / Medium / High) tunes how aggressive the rule-based layers are AND scales the weighted-vocabulary scoring threshold:

Setting	Weighted threshold	Effect
Low	× 1.5	Needs strong evidence to block. Lowest false-positive risk.
Medium (default)	× 1.0	Catalogue defaults. Balanced.
High	× 0.5	Single weight-4 term blocks alone. Catches the most spam, more false positives.
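The multipliers in the table can be read as a simple scaling of one base threshold. In the sketch below the multipliers come from the table, but `BASE_THRESHOLD = 8` is an assumption: it is the value at which the High multiplier gives 4, matching the note that a single weight-4 term then blocks alone. The real catalogue default is not stated on this page, and the function names are hypothetical.

```typescript
type Sensitivity = "low" | "medium" | "high";

// Multipliers from the table above.
const MULTIPLIER: Record<Sensitivity, number> = { low: 1.5, medium: 1.0, high: 0.5 };

// Assumed base threshold, chosen so that x0.5 yields 4 as the High row implies.
const BASE_THRESHOLD = 8;

function effectiveThreshold(level: Sensitivity): number {
  return BASE_THRESHOLD * MULTIPLIER[level];
}

// A message is blocked when its weighted score reaches the scaled threshold.
function blocksOnScore(score: number, level: Sensitivity): boolean {
  return score >= effectiveThreshold(level);
}
```

Under this assumption the effective thresholds are 12 (Low), 8 (Medium), and 4 (High), so a lone weight-4 term blocks only at High.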

If a real enquiry gets caught, you can move it back to the Passed log with one click. See Settings reference for the full description of each level.

What FormFence does store

FormFence stores the contents of each submission so you can read genuine enquiries that pass and review what was blocked. It also stores a geo-IP lookup of the sender at submit time (city + country) so you can spot patterns without us retaining the raw IP itself past 7 days. The full data inventory, retention windows, and sub-processor list are on the Privacy and data page and in our full privacy policy.
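As an illustration of how a 7-day raw-IP window can sit alongside a longer-lived city and country, consider the sketch below. The `StoredSubmission` shape and the `expireRawIp` function are hypothetical, and how FormFence actually schedules or performs the purge is not described on this page.

```typescript
// Hypothetical record shape for illustration only.
interface StoredSubmission {
  body: string;
  submittedAt: number; // epoch milliseconds
  city: string;
  country: string;
  rawIp: string | null;
}

const RAW_IP_RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Returns a copy with the raw IP cleared once the record is older than the
// retention window. City and country were derived at submit time and remain.
function expireRawIp(record: StoredSubmission, nowMs: number): StoredSubmission {
  const expired = nowMs - record.submittedAt > RAW_IP_RETENTION_MS;
  return expired ? { ...record, rawIp: null } : record;
}
```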

What FormFence doesn't do

It doesn't show your customer a CAPTCHA
It doesn't slow down page loads. Detection runs after the form is submitted, not inline
It doesn't train any third-party machine-learning model on your submissions
It doesn't profile customers across stores
It doesn't block submissions based on the customer's identity
It doesn't store the raw sender IP for longer than 7 days. The location shown in the admin is derived from the IP at submit time and stored as city + country only

If you want to dig further into the storefront integration, see the Settings reference.
