What exactly is a permalink?

Links are the primary way we navigate the internet. With just a simple string of characters I can point you to a specific resource on another server that I think you should check out for one reason or another. This can be incredibly useful for citations, in which researchers want to draw on many articles' worth of knowledge to support their claims, but cannot fit all that text within their own article. These get included at the end of the article, in the References section, and historically they would have included metadata to allow someone to go find a physical resource in a library. In the digital age it is now common and expected to link to the location of the resource on the internet. It is vital that the link placed in the references stays stable and resolvable, because once published, scientific papers usually aren’t edited unless there is a necessary revision or retraction. So, what defines these “permalinks”, and are they really as stable as we think?

This is where it starts to get interesting, and where some terms that you may or may not be familiar with start to show up. What is essential to keep in mind is that on a technical basis permalinks will never get more complex than a series of redirects, and that the process is never free from human work and upkeep.

The Digital Object Identifier (DOI) System

The Digital Object Identifier System was first introduced in 1997[1], and an organization (the DOI Foundation) was created alongside it to manage the system. The DOI system currently follows the ISO 26324 standard. A DOI consists of two components, a prefix and a suffix, which together form a prefix/suffix dyad. The prefix refers to the namespace that has been allocated to a given provider[2]. The suffix represents the unique resource within that provider’s namespace, and can be any length and follow any standard. Thus, a given provider can never run out of suffixes to assign.

A common critique of the DOI name system is that it costs money to “mint” a DOI. This terminology can be confusing, especially in the age of cryptocurrency, because no complex mathematical operations are required to generate a DOI. Rather, the cost exists because the system is centralized amongst a set of registration agencies, and those agencies have bills to pay. The fee to register a DOI under their namespace recoups those costs.

A DOI itself (in the form of prefix/suffix) doesn’t lead to the resource. A resolver must be used to take the DOI and conduct the redirects required to serve the current URL associated with the resource. The most common resolver is https://doi.org. It is so common that Crossref recommends displaying DOIs in the full resolver + DOI form[3].
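To make the prefix/suffix split and the resolver form concrete, here is a minimal Python sketch. The function names split_doi and doi_to_url are my own, not part of any DOI tooling, and the check is deliberately loose: it only confirms that the prefix starts with "10." and that a suffix exists. The example DOI is the one used later in this post.

```python
from __future__ import annotations

from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    # Loose sanity check only: DOI prefixes start with "10." and the
    # suffix must not be empty. This is not a full validator.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def doi_to_url(doi: str) -> str:
    """Build the full resolver + DOI form of a DOI name."""
    prefix, suffix = split_doi(doi)
    # Suffixes can contain characters that are not URL-safe, so they are
    # percent-encoded; slashes inside the suffix are left as they are.
    return f"{DOI_RESOLVER}{prefix}/{quote(suffix, safe='/')}"


print(doi_to_url("10.47366/sabia.v5n1a3"))
```

Nothing here contacts the network. The sketch only builds the URL, and following the redirects is left to whatever client opens it.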

The Archival Resource Key (ARK) System

As opposed to the centralized nature of the DOI system, Archival Resource Keys are built to be decentralized. The general thinking behind the ARK system is that all archival integrity is essentially a promise between the host and recipients of a resource. In other words, people trust DOIs because they trust that the promise of the DOI Foundation will be honoured for a long time, but there is no reason why any other entity cannot also make that promise, and they should have the ability to do so. Then, users can choose which entities to trust and not trust.

ARKs have three requirements[4]:

  1. Give users a promise of stewardship.
  2. Give users a description of the object.
  3. Give users the object itself.

The composition of ARKs is as follows:

ARK ANATOMY OVERVIEW
================

    Resolver Service            Compact ARK
   __________________  ______________________________
  /                  \/                              \
  https://example.org/ark:12345/x6np1wh8k/c3/s5.v7.xsl
  \___________________________/\________/\___________/
              Prefixes          Base Name    Suffixes
  \__________________________________________________/
                      Mapping ARK

ARKs can be resolved from any hostname; they do not have to go through a dedicated hostname such as doi.org. However, organizations can register in the NAAN registry to obtain a five-digit NAAN (Name Assigning Authority Number) that can be used to resolve links through n2t.net (n2t stands for “name to thing”). n2t.net can also resolve other identifiers, such as DOIs (https://n2t.net/doi:10.47366/sabia.v5n1a3) or ORCID IDs (https://n2t.net/orcid:0000-0002-8786-5167).
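The anatomy diagram can be turned into a small parser. This is a rough Python sketch rather than a spec-complete implementation: ArkParts, parse_ark and to_n2t are names I made up, and the rule that the base name ends at the first "/" or "." is my reading of the diagram above. It accepts both the "ark:" form and the older "ark:/" form.

```python
import re
from typing import NamedTuple


class ArkParts(NamedTuple):
    resolver: str   # e.g. "https://example.org/" (empty for a compact ARK)
    naan: str       # the number assigned to the organization
    base_name: str  # the name of the object itself
    suffixes: str   # everything after the base name, e.g. "/c3/s5.v7.xsl"


def parse_ark(ark: str) -> ArkParts:
    """Split a mapping ARK or compact ARK into its labelled parts."""
    resolver, sep, rest = ark.partition("ark:")
    if not sep:
        raise ValueError(f"no 'ark:' label found in {ark!r}")
    rest = rest.lstrip("/")  # tolerate the older "ark:/" spelling
    naan, _, name = rest.partition("/")
    # Assumption: the base name runs up to the first "/" or ".".
    match = re.match(r"[^/.]+", name)
    if not naan or match is None:
        raise ValueError(f"incomplete ARK: {ark!r}")
    return ArkParts(resolver, naan, match.group(0), name[match.end():])


def to_n2t(ark: str) -> str:
    """Rewrite an ARK so that it resolves through n2t.net."""
    p = parse_ark(ark)
    return f"https://n2t.net/ark:{p.naan}/{p.base_name}{p.suffixes}"


print(parse_ark("https://example.org/ark:12345/x6np1wh8k/c3/s5.v7.xsl"))
print(to_n2t("https://example.org/ark:12345/x6np1wh8k/c3/s5.v7.xsl"))
```

Swapping the resolver is only a string operation because the compact ARK carries everything that identifies the object. The hostname in front of it is replaceable.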

The Persistent URL System

PURL stands for “Persistent Uniform Resource Locator” and was originally introduced in 1995[5]. The concept was simple: redirect from the PURL to the URL (usually using a hostname that contained the term “purl” somewhere in it). purl.org has since changed hands, and the resolver is currently run and maintained by the Internet Archive via https://purl.archive.org/. Anyone can sign up and use this service.

The difference between the ways of thinking is outlined well in Getting Started with Persistent Identifiers.

  • PURL (Persistent URL) - “URLs are fine if you redirect from purl.org”
  • URN (Uniform Resource Name), DOI (Digital Object Identifier) & Handle - URLs and domain names are bad, except for ours, and we redirect
  • Tim Berners-Lee - “Cool URLs don’t break” (didn’t go over this in this blog post)
  • ARK (Archival Resource Key) - “URLs are fine if managed well, but do tell us which of your URLs are meant for what kind of persistence”.

When choosing between the various mechanisms, a lot comes down to how much longer you think DNS will be around in its current state. If a permanent link mechanism involves a specific hostname, then you have to believe that hostname will be around for however long you need the link to resolve.

There have been some recent developments focused on attaching permanent link structures to objects such as blog posts. The Rogue Scholar service is available to science bloggers who expose their full posts via RSS and wish to have a DOI for their posts available.

Practically speaking, you can create your own permanent link structure for your own blog however you like, and there are numerous ways you could do this. As mentioned above, you could register on purl.archive.org, or implement your own resolver. For example, you could run archive.example.com, or buy another domain specifically for that purpose (who doesn’t love buying more domains?). What matters for longevity is how long you realistically think something should be around, and how much people believe that such links will truly be stable (so they use them!).
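If you went the “implement your own resolver” route, the core of it could be as small as a lookup table and a redirect. Below is a minimal sketch using only Python’s standard library. The TARGETS table, the example.com URL, and the resolve and serve names are all hypothetical; a real setup would keep the table somewhere you can update without redeploying.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical mapping from permanent path to the current location.
# When a post moves, only the right-hand side changes.
TARGETS = {
    "/what-is-a-permalink": "https://example.com/blog/what-is-a-permalink/",
}


def resolve(path: str) -> Optional[str]:
    """Return the current URL for a permanent path, or None if unknown."""
    key = path.split("?", 1)[0].rstrip("/")  # ignore query and trailing slash
    return TARGETS.get(key)


class Resolver(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        target = resolve(self.path)
        if target is None:
            self.send_error(404, "Unknown identifier")
            return
        # A temporary redirect, so clients keep asking the permalink
        # instead of remembering the target.
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the resolver locally until interrupted."""
    HTTPServer(("127.0.0.1", port), Resolver).serve_forever()
```

Calling serve() and visiting http://127.0.0.1:8000/what-is-a-permalink should redirect to the target. The code is the easy part. The human upkeep mentioned at the start of this post is keeping that table accurate for as long as you promised to.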

For me, I plan on keeping markpitblado.me for life, and my blog at /blog, but who knows what the internet will look like in the future. Perhaps in a few decades’ time I will have written enough articles that I will need something more than the title to distinguish articles from one another in the URL, but for now, things seem to be pretty stable.

Would you ever implement a permanent link system for any of your sites? Have you already? Feel free to send me an email using the button below. I would love to hear all about it.


  1. https://www.doi.org/doi-handbook/HTML/introduction-to-the-doi-system.html

  2. https://www.doi.org/doi-handbook/HTML/index.html

  3. https://www.crossref.org/display-guidelines/

  4. https://datatracker.ietf.org/doc/html/draft-kunze-ark-39#name-ark-anatomy

  5. https://library.oclc.org/digital/collection/p267701coll28/id/1839