Parsing of Links in Email

One feature we have received questions about is link parsing. Parsed links are part of the email message object returned by the messages API.

Example Email Message Object

{
  "_id": "BotvTxaona7gLID1Adtpfj8Fnfi7HSSv-0",
  "from": [
    {
      "address": "[email protected]",
      "name": "Microsoft Store"
    }
  ],
  "to": [
    {
      "address": "[email protected]",
      "name": ""
    }
  ],
  "cc": null,
  "bcc": null,
  "subject": "Ahoy, Sea of Thieves for PC is here",
  "savedBy": null,
  "originalInbox": "[email protected]",
  "inbox": "[email protected]",
  "domain": "mailsac.com",
  "received": "2018-03-29T18:28:07.732Z",
  "size": 23420,
  "attachments": ["c830ee26e0a326e0a30c585494793479"],
  "ip": "65.55.234.211",
  "via": "144.202.71.79",
  "folder": "inbox",
  "labels": [],
  "read": null,
  "rtls": true,
  "links": [
    "https://support.xbox.com/games/game-titles/xbox-play-anywhere-help",
    "https://e.microsoft.com/Key-3567701.C.CQZpy.J.K0.-.CpMBp0",
    "https://account.microsoft.com/profile/unsubscribe?CTID=0&ECID=jIce0uXtDC5qRlyCYqZsz5yCL"
  ],
  "spam": 0.331
}

In this example, 3 links have been parsed from the email.
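As a rough sketch of how a client might consume this field, the snippet below loads a trimmed copy of the message object and reads its links array. It assumes the JSON has already been fetched from the messages API. The helper get_links is a hypothetical name, and treating a missing or null links value as an empty list is a defensive assumption, not documented behavior.

```python
import json

# Trimmed copy of the example message object; only the fields used here are kept.
raw_message = """
{
  "subject": "Ahoy, Sea of Thieves for PC is here",
  "links": [
    "https://support.xbox.com/games/game-titles/xbox-play-anywhere-help",
    "https://e.microsoft.com/Key-3567701.C.CQZpy.J.K0.-.CpMBp0",
    "https://account.microsoft.com/profile/unsubscribe?CTID=0&ECID=jIce0uXtDC5qRlyCYqZsz5yCL"
  ]
}
"""


def get_links(message: dict) -> list:
    """Return the parsed links, or an empty list if the field is absent or null.

    Hypothetical helper; the null handling is an assumption.
    """
    return message.get("links") or []


message = json.loads(raw_message)
links = get_links(message)

# Example filter: pick out anything that looks like an unsubscribe link.
unsubscribe_links = [url for url in links if "unsubscribe" in url.lower()]

for url in links:
    print(url)
```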

HTML vs Plain Text
Email messages are generally sent with two content types. The first, text/plain, is designed for email clients that do not support HTML. The second, text/html, is designed for email clients that support HTML.
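To make the two parts concrete, here is a minimal sketch using Python's standard email package. It builds a message with both a text/plain and a text/html body and then reads each part back. The subject, body text, and example.com URL are made up for illustration.

```python
from email.message import EmailMessage

# Build a message that carries the same content in two forms.
msg = EmailMessage()
msg["Subject"] = "Example"
msg.set_content("Visit https://example.com/welcome\n")  # text/plain part
msg.add_alternative(
    '<p><a href="https://example.com/welcome">Visit</a></p>',
    subtype="html",  # text/html part
)

# Walking the message shows the container and both alternatives.
content_types = [part.get_content_type() for part in msg.walk()]

# get_body() selects a part by preference; here each part is requested explicitly.
plain_part = msg.get_body(preferencelist=("plain",))
html_part = msg.get_body(preferencelist=("html",))

print(content_types)
print(plain_part.get_content())
print(html_part.get_content())
```

A client that cannot render HTML shows the text/plain part. One that can usually prefers text/html.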

What Links Are Parsed?
Mailsac only parses links found in the text/plain section. Marketing emails may contain dozens of tracking links in their HTML-formatted bodies, making it hard to determine which links are important. If a link is important for the customer to see, it is almost always included in both the text/html and text/plain sections.

As an example, the latest email I received from Del Taco contained 19 links. How many of those links were present in the text/plain section? Just one: the unsubscribe link.
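The difference between the two sections can be illustrated with a small sketch. This is not Mailsac's parser. The regular expression, the helper names links_from_plain and links_from_html, and the sample bodies are all assumptions made up for illustration.

```python
import re
from html.parser import HTMLParser

# Simplistic URL pattern (an assumption): http(s) followed by anything
# up to whitespace, an angle bracket, or a double quote.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+')


def links_from_plain(text):
    """Collect unique URLs from a text/plain body, in order of appearance."""
    found = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in found:
            found.append(url)
    return found


class HrefCollector(HTMLParser):
    """Collect http(s) href values from <a> tags in a text/html body."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.hrefs.append(value)


def links_from_html(html):
    collector = HrefCollector()
    collector.feed(html)
    return collector.hrefs


# Made-up bodies mimicking a marketing email: many HTML links, one plain-text link.
plain_body = "Don't want these emails? Unsubscribe: https://example.com/unsubscribe.\n"
html_body = (
    '<a href="https://example.com/track/1">Deals</a>'
    '<a href="https://example.com/track/2">Menu</a>'
    '<a href="https://example.com/unsubscribe">Unsubscribe</a>'
)

plain_links = links_from_plain(plain_body)
html_links = links_from_html(html_body)
print(len(plain_links), "plain-text link(s);", len(html_links), "HTML link(s)")
```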

What other questions do you have about our processing of messages?

The pressing question, @mjmayer, is why do you subscribe to Del Taco’s newsletter? And I hope you’re fetching Mailsac email into Gmail using POP3, not just using Gmail…

In any event, Mailsac recently started parsing HTML links.
