Parsing of Links in Email

One of the features we have received questions about is the parsing of links, which are part of the email message object in the messages API.

Example Email Message Object

{
  "_id": "BotvTxaona7gLID1Adtpfj8Fnfi7HSSv-0",
  "from": [
    {
      "address": "",
      "name": "Microsoft Store"
    }
  ],
  "to": [
    {
      "address": "",
      "name": ""
    }
  ],
  "cc": null,
  "bcc": null,
  "subject": "Ahoy, Sea of Thieves for PC is here",
  "savedBy": null,
  "originalInbox": "",
  "inbox": "",
  "domain": "",
  "received": "2018-03-29T18:28:07.732Z",
  "size": 23420,
  "attachments": ["c830ee26e0a326e0a30c585494793479"],
  "ip": "",
  "via": "",
  "folder": "inbox",
  "labels": [],
  "read": null,
  "rtls": true,
  "links": [...],
  "spam": 0.331
}

In this example there are 4 links that have been parsed from the email.

HTML vs Plain Text
Email messages are generally sent with the same content in two formats, each with its own content type. The first, text/plain, is designed for email clients that do not support HTML. The second, text/html, is designed for email clients that support HTML.
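To make the two formats concrete, here is a minimal sketch that builds a message carrying both a text/plain and a text/html part using Python's standard `email` package, then lists the content type of each part. The addresses, subject, and URL are placeholders chosen for illustration, not values from a real message.

```python
from email.message import EmailMessage


def build_sample_message() -> EmailMessage:
    """Build a message with a text/plain part and a text/html alternative."""
    msg = EmailMessage()
    msg["Subject"] = "Sample message"         # placeholder subject
    msg["From"] = "sender@example.com"        # placeholder address
    msg["To"] = "recipient@example.com"       # placeholder address

    # The plain-text part, for clients that do not render HTML.
    msg.set_content("See the offer at https://example.com/offer\n")

    # The HTML part. Adding it turns the message into multipart/alternative.
    msg.add_alternative(
        '<p>See <a href="https://example.com/offer">the offer</a>.</p>',
        subtype="html",
    )
    return msg


def part_types(msg: EmailMessage) -> list:
    """Return the content type of every non-container part, in order."""
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


if __name__ == "__main__":
    message = build_sample_message()
    print(message.get_content_type())
    print(part_types(message))
```

The container type is multipart/alternative, which tells the receiving client that the parts are different renderings of the same content and that it should display whichever one it supports best.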

What Links Are Parsed?
Mailsac only parses links found in the text/plain section. Marketing emails may contain dozens of tracking links in their HTML-formatted emails, making it hard to determine which links are important. If a link is important for the customer to see, it is almost always included in both the text/html and text/plain sections.
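The idea of reading links only from the text/plain section can be sketched as follows. This is an illustration of the approach, not Mailsac's actual implementation: the regular expression, the trailing-punctuation cleanup, and the function name `plain_text_links` are all assumptions made for this example.

```python
import re
from email.message import EmailMessage

# Assumed pattern: http(s) URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def plain_text_links(msg: EmailMessage) -> list:
    """Return unique links found in the text/plain part, in order of appearance."""
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return []  # the message has no text/plain part
    links = []
    for match in URL_PATTERN.findall(body.get_content()):
        url = match.rstrip(".,;:!?)")  # drop punctuation that trails the URL
        if url not in links:
            links.append(url)
    return links


def build_marketing_message() -> EmailMessage:
    """Build a placeholder message whose HTML part has more links than its text part."""
    msg = EmailMessage()
    msg["Subject"] = "Weekly deals"  # placeholder subject
    msg.set_content("To unsubscribe, visit https://example.com/unsubscribe.\n")
    msg.add_alternative(
        '<a href="https://example.com/t/1">Deal</a>'
        '<a href="https://example.com/t/2">Menu</a>'
        '<a href="https://example.com/unsubscribe">Unsubscribe</a>',
        subtype="html",
    )
    return msg


if __name__ == "__main__":
    print(plain_text_links(build_marketing_message()))
```

In this sample only the unsubscribe link is returned, because the tracking links appear in the HTML part alone.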

As an example, the latest email I received from Del Taco contained 19 links. How many of those links were present in the text/plain section? Only one: the unsubscribe link.
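To run a similar comparison on your own messages, you could count the anchors in the text/html section. The sketch below collects `href` values with Python's standard `html.parser`. The class and function names are hypothetical, and the sample HTML is a placeholder, not the Del Taco email.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def html_links(html: str) -> list:
    """Return all anchor targets in the given HTML, in document order."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


if __name__ == "__main__":
    sample_html = (
        '<a href="https://example.com/t/1">Deal</a>'
        '<a href="https://example.com/t/2">Menu</a>'
        '<a href="https://example.com/unsubscribe">Unsubscribe</a>'
    )
    print(len(html_links(sample_html)))
```

Comparing this count against the number of links in the text/plain section shows how many links exist only for HTML readers.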

What other questions do you have about our processing of messages?

The pressing question, @mjmayer, is why do you subscribe to Del Taco’s newsletter? And I hope you’re fetching Mailsac email using POP3 into Gmail, not just using Gmail…

In any event, Mailsac recently started parsing HTML links.