How to obfuscate email links

gtirloni · February 27, 2023, 6:29pm

I want to add a link to my email but I want to make it harder for non-humans to harvest it.

It seems I cannot have something like “user at domain dot com” in a mailto link. The Markdown gets broken somehow.

Any suggestions?

jsonbecker · February 27, 2023, 7:48pm

FYI, I have had a link to my email address on my page for a long time (not my real email, but one that works) and I’ve received 0 spam. That said, your best bet is a real link on, say, a Contact page, with robots.txt set up to keep that page from being indexed.

gtirloni · February 27, 2023, 8:12pm

thanks for the advice. glad to hear micro.blog is not targeted by spambots (yet?).

I’ve opted to create a typeform form and embed it into the page. it seems to work fine.

sod · February 27, 2023, 8:47pm

There’s no way to protect yourself 100% from crawlers. If there’s a way for humans to click your mailto link, sophisticated email scrapers will find it. There are strategies like encoding the address (using HTML entities or URL encoding) or obfuscating the address with JavaScript. But they will only trick the most naïve bots.
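For anyone who still wants to try the HTML-entity approach, here is a minimal sketch in Python. The function names `to_html_entities` and `obfuscated_mailto` are made up for illustration, and it assumes your page accepts raw HTML, so you can paste the generated tag instead of a Markdown link. As said above, this only stops the simplest harvesters: anything that decodes entities (as every browser does) recovers the address.

```python
import html


def to_html_entities(text: str) -> str:
    """Encode every character as a decimal HTML entity, e.g. 'a' -> '&#97;'."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def obfuscated_mailto(address: str, label: str = "Email me") -> str:
    """Build an <a> tag whose href ('mailto:' + address) is entity-encoded.

    The label is plain text on purpose; putting the address in the label
    would expose it again.
    """
    href = to_html_entities(f"mailto:{address}")
    return f'<a href="{href}">{html.escape(label)}</a>'


if __name__ == "__main__":
    link = obfuscated_mailto("user@example.com")
    print(link)
    # html.unescape mimics what a browser (or a smarter scraper) does
    # to get the original address back.
    print(html.unescape(link))
```

The raw address never appears in the page source, but a single `html.unescape` call is enough to reveal it, which is why this trick only fools bots that search the source for a literal `user@domain` pattern.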

Instead:

- don’t publish your email address and use a (spam-protected) form
- or use a service like Apple’s Hide My Email to get a unique address (if you get overwhelmed with spam, just generate a new one)

Or, just publish your email address and let your spam filter take care of the spam.

gtirloni · February 27, 2023, 8:59pm

Yep, that’s what I’m doing. Not looking for 100% protection.

Miraz · February 27, 2023, 9:16pm

Thanks Jason. I think this would be useful information in the @custom blog. Suppose I had a page at https://example.com/about/ which includes an email address. Can you tell me:

- where would I put the robots.txt file on a hosted blog?
- what would that file contain (if this was all it had in it)?

Miraz · February 27, 2023, 9:17pm

Good suggestion re Hide My Email.

jsonbecker · February 27, 2023, 9:41pm

https://help.micro.blog/t/search-engine-indexing/63/31

This is a good link – basically you can override robots.txt in your template and do something like:

User-agent: *
Disallow: /contact
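If you want to check which paths those two rules cover before publishing them, Python’s standard-library `urllib.robotparser` can evaluate them locally. This is only a sketch: the helper `is_allowed` and the sample paths are made up for illustration, and it models how a rule-following crawler reads the file, nothing more.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, inlined so nothing is fetched over the network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /contact
"""


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Return True if a rule-following crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths; replace them with the pages on your own blog.
    for path in ("/contact/", "/contact-us/", "/about/"):
        print(path, is_allowed(path))
```

Note that `Disallow` is a prefix match, so `/contact` also covers paths such as `/contact-us/`. For a page at `/about/` the rule would be `Disallow: /about/` instead.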

Miraz · February 27, 2023, 9:45pm

Thanks.

sod · February 27, 2023, 10:21pm

@jsonbecker @Miraz This is a great tip to prevent well-behaved crawlers, like search engines, from indexing a page. But bad actors, like email address scrapers, will just ignore robots.txt and scrape the page anyway.