For a long while, I maintained a very simple URL path for blog posts in the form of "/post/<primary key>". However, after studying the URLs of other blogs and websites in general, I saw that this wasn't very conventional, clear, or optimal for SEO. I figured the title of the post should be incorporated in some way in place of the primary key, and that the "/post/" segment was redundant in the context of a blog. It would be clearer using the publication date instead.
The new URL path structure I settled on was "/<year>/<month>/<slug>". Whilst the year and month could be obtained from the publication date field attributes, I wanted the slug to be created automatically from the title rather than being set by the user. So, I overrode the save method for the model subclass and used the "slugify" Django utility on the post title:
def save(self, *args, **kwargs):
self.slug = slugify(self.title)
super().save(*args, **kwargs)
Almost immediately, the edge case occurred to me where two posts published in the same year and month could have the same title, and thus the same slug, causing a URL collision. Sure, I could have reverted back to a manually-set slug and applied a unique=True validator, but that seemed like taking the easy way out.
Instead, I added a while loop to the save method which would check for existing posts with the same publication year and month and slugified title. In the case of a conflict, the new post would have an incremented counter appended to its slug:
def save(self, *args, **kwargs):
original_slug = slugify(self.title)
unique_slug = original_slug
counter = 1
while Post.objects.filter(
slug=unique_slug,
pub_date__year=self.pub_date.year,
pub_date__month=self.pub_date.month,
).exists():
unique_slug = f"{original_slug}-{counter}"
counter += 1
self.slug = unique_slug
super().save(*args, **kwargs)
This was all well and good until the time came to actually write, save, edit and re-save posts. When re-saving an existing post without modifying the title, the save method would match it against all 3 queryset filters, and it would be assigned a new, incremented slug. If re-saved again, the original slug would be available and assigned, and so on.
Having considered my options, I decided to constrain slug creation to new posts and posts with modified titles. Additionally, slug creation was moved to its own method for separation of concerns:
def get_slug(self, title):
original_slug = slugify(title)
unique_slug = original_slug
counter = 1
while Post.objects.filter(
slug=unique_slug,
pub_date__year=self.pub_date.year,
pub_date__month=self.pub_date.month,
).exists():
unique_slug = f"{original_slug}-{counter}"
counter += 1
return unique_slug
def save(self, *args, **kwargs):
if not self.pk or (
self.pk and Post.objects.get(pk=self.pk).title != self.title
):
self.slug = self.get_slug(self.title)
super().save(*args, **kwargs)
The last, and perhaps most unlikely, edge case to consider was that the title may be non-empty, but cannot be slugified due to not having any alphanumeric characters. For instance, it could be a goofy title only made up of emojis.
In this case, the post will still need a highly-unique identifier, but also one that can be easily and automatically created, such as UUID (universally unique identifier). This is the final implementation:
def get_slug(self, title):
original_slug = slugify(title)
if not original_slug:
return str(uuid.uuid4())
unique_slug = original_slug
counter = 1
while Post.objects.filter(
slug=unique_slug,
pub_date__year=self.pub_date.year,
pub_date__month=self.pub_date.month,
).exists():
unique_slug = f"{original_slug}-{counter}"
counter += 1
return unique_slug
def save(self, *args, **kwargs):
if not self.pk or (
self.pk and Post.objects.get(pk=self.pk).title != self.title
):
self.slug = self.get_slug(self.title)
super().save(*args, **kwargs)