SLUGS, GLORIOUS SLUGS

Posted on 05 October 2024 by admin

For a long while, I maintained a very simple URL path for blog posts in the form of "/post/<primary key>". However, after studying the URLs of other blogs and websites in general, I saw that this wasn't very conventional, clear, or optimal for SEO. I figured the title of the post should be incorporated in some way in place of the primary key, and that the "/post/" segment was redundant in the context of a blog. It would be clearer using the publication date instead.

The new URL path structure I settled on was "/<year>/<month>/<slug>". Whilst the year and month could be obtained from the publication date field attributes, I wanted the slug to be created automatically from the title rather than being set by the user. So, I overrode the save method for the model subclass and used the "slugify" Django utility on the post title:

def save(self, *args, **kwargs):
    self.slug = slugify(self.title)

    super().save(*args, **kwargs)

Almost immediately, the edge case occurred to me where two posts published in the same year and month could have the same title, and thus the same slug, causing a URL collision. Sure, I could have reverted back to a manually-set slug and applied a unique=True validator, but that seemed like taking the easy way out.

Instead, I added a while loop to the save method which would check for existing posts with the same publication year and month and slugified title. In the case of a conflict, the new post would have an incremented counter appended to its slug:

def save(self, *args, **kwargs):
    original_slug = slugify(self.title)
    unique_slug = original_slug
    counter = 1
    while Post.objects.filter(
        slug=unique_slug,
        pub_date__year=self.pub_date.year,
        pub_date__month=self.pub_date.month,
    ).exists():
        unique_slug = f"{original_slug}-{counter}"
        counter += 1
    self.slug = unique_slug

    super().save(*args, **kwargs)

This was all well and good until the time came to actually write, save, edit and re-save posts. When re-saving an existing post without modifying the title, the save method would match it against all 3 queryset filters, and it would be assigned a new, incremented slug. If re-saved again, the original slug would be available and assigned, and so on.

Having considered my options, I decided to constrain slug creation to new posts and posts with modified titles. Additionally, slug creation was moved to its own method for separation of concerns: 

def get_slug(self, title):
    original_slug = slugify(title)
    unique_slug = original_slug
    counter = 1
    while Post.objects.filter(
        slug=unique_slug,
        pub_date__year=self.pub_date.year,
        pub_date__month=self.pub_date.month,
    ).exists():
        unique_slug = f"{original_slug}-{counter}"
        counter += 1

    return unique_slug

def save(self, *args, **kwargs):
    if not self.pk or (
        self.pk and Post.objects.get(pk=self.pk).title != self.title
    ):
        self.slug = self.get_slug(self.title)

    super().save(*args, **kwargs)

The last, and perhaps most unlikely, edge case to consider was that the title may be non-empty, but cannot be slugified due to not having any alphanumeric characters. For instance, it could be a goofy title only made up of emojis.

In this case, the post will still need a highly-unique identifier, but also one that can be easily and automatically created, such as UUID (universally unique identifier). This is the final implementation:

def get_slug(self, title):
    original_slug = slugify(title)
    if not original_slug:
        return str(uuid.uuid4())

    unique_slug = original_slug
    counter = 1
    while Post.objects.filter(
        slug=unique_slug,
        pub_date__year=self.pub_date.year,
        pub_date__month=self.pub_date.month,
    ).exists():
        unique_slug = f"{original_slug}-{counter}"
        counter += 1

    return unique_slug

def save(self, *args, **kwargs):
    if not self.pk or (
        self.pk and Post.objects.get(pk=self.pk).title != self.title
    ):
        self.slug = self.get_slug(self.title)

    super().save(*args, **kwargs)