In case you haven’t noticed, the Markdown file of this post is named “2020-05-12-jekyll*+_ slưü.g.md”. Why so?

What they are

A post in Jekyll is the result of parsing a file which is named with the structure

<year> - <month> - <day> - <slug>.<extension>

The spaces are only there for readability; a real file name contains none.

The Jekyll docs call that “slug” part the “title”, but when you write jekyll*+_ slưü.g, what you get is that “title”. Oh sorry, I forgot to add the raw tag: “when you write {{page.slug}}”.

I have only written 4 posts here (this is the fifth, but one has been moved to the newly created Notes section), and one of them contains a dot in its slug. Yet you don’t see any dot in the URLs around here, whether in this post’s or in that old post’s, even though I also use slugs to build URLs. The reason is that the slug used to build a URL and page.slug are slugified differently.

Slugification (don’t take my word for it!) is the act of converting a string to a form with no spaces and no “strange” characters. In Jekyll’s flavor of Liquid, the slugify filter does this for you. How “strange” characters are converted depends on which mode you pass to the filter:

  • {{page.slug | slugify}} gives “jekyll-slưü-g” just like what you see in the address bar;
  • {{page.slug | slugify: "ascii"}} gives “jekyll-sl-g”, which is what I’m gonna use with this site now;
  • {{page.slug | slugify: "latin"}} gives “jekyll-sl-u-g”. This is what I was about to use, because I thought of writing something in Vietnamese in the future, but since my characters are not preserved (I only expected them to be partly preserved anyway), I will skip it for now.
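To make the difference between the first two modes more concrete, here is a minimal Ruby sketch that reproduces the results above. It is only my approximation: slugify_sketch and its regular expressions are assumptions of mine, not Jekyll’s actual code, and the “latin” mode is left out because it also involves transliteration.

```ruby
# Rough approximation of two slugify modes; NOT Jekyll's actual implementation.
def slugify_sketch(text, mode = "default")
  # "ascii" keeps only A-Z, a-z and 0-9;
  # "default" also keeps Unicode letters, combining marks and digits.
  unwanted = mode == "ascii" ? /[^A-Za-z0-9]+/ : /[^\p{L}\p{M}\p{Nd}]+/
  hyphenated = text.gsub(unwanted, "-")          # each run of unwanted characters becomes one hyphen
  trimmed = hyphenated.gsub(/\A-+|-+\z/, "")     # drop hyphens at both ends
  trimmed.downcase
end

# This post's slug, with ư and ü written as escapes.
slug = "jekyll*+_ sl\u01B0\u00FC.g"

puts slugify_sketch(slug)           # jekyll-slưü-g
puts slugify_sketch(slug, "ascii")  # jekyll-sl-g
```

Note that the underscore counts as “strange” in both modes, which is why it disappears from the URL.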

And as you might already see, page.slug still contains a bunch of special characters, which means that it is not slugified at all. So yes, page.slug is less sluggy than the URL.

Why I am doing this

Back to the old post with a dot in its slug. When I used only page.slug to identify which post a Staticman comment belongs to, comments on that old post ended up in a directory with a dot in its name:

toggle-krunner-plasma-5.17

However, the dot does not appear in the property name of the data object site.data.comments that Jekyll builds:

{"toggle-krunner-plasma-517"=>{"entry…

So when my code tried to get site.data.comments[page.slug], nothing was returned.

How Jekyll converts the directory name is still unknown to me. For example, “jekyll*+_ slưü.g”, this post’s slug, when read by Jekyll, appears as “jekyll__slg”, which shows that:

  • Underscores are kept;
  • Asterisks, dots, and pluses are removed. Non-ASCII letters, or at least ü and ư, are removed too;
  • Spaces are changed to underscores.

And that’s all I know so far.
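For what it’s worth, all of these observations (plus the kept hyphens and the removed dot in “toggle-krunner-plasma-517”) can be reproduced with two substitutions. The Ruby sketch below is only a guess inferred from the outputs I saw; data_key_sketch and its regular expressions are my assumptions, not something I took from Jekyll.

```ruby
# A guess at the directory-name conversion, inferred only from the observed
# outputs. NOT confirmed against Jekyll's source.
def data_key_sketch(name)
  # Drop everything except ASCII word characters (letters, digits, underscore),
  # whitespace and hyphens.
  kept = name.gsub(/[^\w\s-]+/, "")
  # Turn each run of whitespace into a single underscore.
  kept.gsub(/\s+/, "_")
end

# This post's slug, with ư and ü written as escapes.
puts data_key_sketch("jekyll*+_ sl\u01B0\u00FC.g")   # jekyll__slg
puts data_key_sketch("toggle-krunner-plasma-5.17")   # toggle-krunner-plasma-517
```

If the guess is close, a slug that contains only ASCII letters, digits, and hyphens passes through unchanged, so the directory name and the property name stay the same.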

What I can do with my Staticman commenting function is to slugify page.slug in two places: in the comment form, to indicate which post a comment belongs to, and in the comment list, to tell which post’s comments to get. Let’s see if my comment (later) shows up down there.

While playing around with the slug, I also ran into a problem with Jekyll being unable to sort some null object. While fixing this error, I found out that Jekyll reads each data file as a property of an object that represents its folder. To be clearer, consider a post “How to slugify” with the slug “how-to-slugify”. Its comments are placed as in this directory tree:

.
├─ _data
│  └─ comments
│     ├─ how-to-slugify
│     │  ├─ entry1.yml
│     │  ├─ entry2.yml
│     │  └─ entry3.yml
│     ├─ ...

and the data are read into the object site.data.comments as:

{
  "how-to-slugify": {
    "entry1": {...},
    "entry2": {...},
    "entry3": {...}
  },
  ...
} 

So when I passed the object site.data.comments[slug] to a sort filter, it was not an array, and the filter threw an error.

I went on and converted these properties into an array of comment objects. You can see more here.

{% assign props_array = '' | split: '' %}
{% for entry in site.data.comments[slug] %}
  {% assign props_array = props_array | concat: entry %}
{% endfor %}
{% assign comments_array = props_array | where_exp: "entry", "entry['_id'] != nil" %}
{% assign first_level = comments_array | where_exp: "item", "item.reply_to == ''" | sort: 'date' %}

Basically, Liquid does not let you create an array directly; you can only split a string to get an array of strings. So I did the trick of splitting an empty string and then concatenating each property onto the resulting array. The concatenated properties, however, also contain the property names (“entry1”, “entry2”, “entry3” in the example) as separate array elements, so I filtered them out with a where_exp filter that checks for the “_id” property in each element.
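If the Liquid is hard to follow, the same steps can be sketched in plain Ruby. This is an illustration only: first_level_comments is a name I made up, the sample entries and their values are invented, and numeric dates are used just to keep the example short.

```ruby
# Plain-Ruby illustration of the Liquid trick; the sample data is made up.
def first_level_comments(entries)
  props_array = []
  # Iterating over a hash yields [name, data] pairs, so concat adds both
  # the property name and the comment object to the array.
  entries.each { |pair| props_array.concat(pair) }
  # Keep only the comment objects; the names ("entry1", ...) are plain strings.
  comments_array = props_array.select { |item| item.is_a?(Hash) && !item["_id"].nil? }
  # Top-level comments have an empty reply_to; sort them by date.
  comments_array.select { |item| item["reply_to"] == "" }
                .sort_by { |item| item["date"] }
end

# Hypothetical stand-in for site.data.comments[slug].
entries = {
  "entry1" => { "_id" => "c1", "reply_to" => "",   "date" => 1589300000 },
  "entry2" => { "_id" => "c2", "reply_to" => "c1", "date" => 1589200000 },
  "entry3" => { "_id" => "c3", "reply_to" => "",   "date" => 1589100000 }
}

puts first_level_comments(entries).map { |item| item["_id"] }.inspect  # ["c3", "c1"]
```

The reply (entry2) is dropped, and the two top-level comments come out ordered by date.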

There is however still one thing I don’t understand about this problem with Liquid: site.data.comments[slug] after a filter (e.g. where_exp: "item", "item.reply_to == ''") is shown to be an array, just like our first_level array variable above, but I still cannot sort it. It would be nice to know why. Anyway, off to the comment section.