oneohthree/quick-slugify.sh

cavo789 · 2021-05-11T12:11:33Z

It would be better to replace the final tr A-Z a-z with tr "[:upper:]" "[:lower:]" so that accented characters such as É are supported.

pjboro · 2022-05-10T09:52:34Z

> Best to replace tr A-Z a-z at the end by tr "[:upper:]" "[:lower:]" to support accentuated characters like É f.i.

These characters are handled by iconv. I assumed that, if they were not, the sed replacement would catch them, but at least in GNU sed 4.8 most of them fall inside the a-z range, so the bracket expression leaves them alone.

$ echo É | iconv -t ascii//TRANSLIT
E

# not every diacritic is contained in a-z
$ echo "ā, ä, ǟ, ḑ, ē, ī, ļ, ņ, ō, ȯ, ȱ, õ, ȭ, ŗ, š, ț, ū, ž." | sed -r 's/[^a-zA-Z0-9]+/-/g' | sed -r 's/^-+\|-+$//g' | tr A-Z a-z
ā-ä-ǟ-ḑ-ē-ī-ļ-ņ-ō-ȯ-ȱ-õ-ȭ-ŗ-š-ț-ū-
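Whether an accented letter falls inside a-z depends on the collation order of the active locale, which is presumably why most of the letters above survive. As a minimal sketch (not something proposed in this thread, and assuming UTF-8 input), forcing the C locale makes sed compare raw bytes, so a-z covers only the 26 ASCII letters and every byte of an accented character is replaced. The function name is hypothetical and used only for illustration.

```shell
# Hypothetical helper, named only for this example.
# LC_ALL=C makes sed work on raw bytes, so [a-z] means ASCII a..z only
# and each byte of a multibyte UTF-8 character counts as a non-match.
replace_non_ascii_alnum() {
  LC_ALL=C sed -E 's/[^a-zA-Z0-9]+/-/g'
}

# "ā" is two non-ASCII bytes followed by a space: one run, one dash.
printf '%s\n' 'ā b' | replace_non_ascii_alnum   # prints: -b
```

This only shows the locale effect. It does not transliterate anything, so iconv is still needed if ā should become a.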

janosgyerik · 2022-11-09T08:01:33Z

It's good to replace multiple sed processes with a single one using multiple -e parameters.

It's good to use the [:alnum:] character class instead of explicit ranges, i.e. [^[:alnum:]] instead of [^a-zA-Z0-9].

It's good to use tr "[:upper:]" "[:lower:]" instead of tr A-Z a-z as a matter of principle for the goal of lowercasing input. To know that tr A-Z a-z is good enough requires verifying what comes before in the pipeline, and knowing how iconv works. That's added mental burden.

Putting it together:

iconv -t ascii//TRANSLIT | sed -E -e 's/[^[:alnum:]]+/-/g' -e 's/^-+|-+$//g' | tr '[:upper:]' '[:lower:]'
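For convenience, the pipeline above could be wrapped in a shell function that reads from standard input. This is only a sketch: the name slugify is an assumption, and the result for non-ASCII input depends on how the local iconv implements //TRANSLIT. Note that with -E (or -r) the alternation is written as a plain |. The earlier transcript used \| together with -r, which GNU sed treats as a literal pipe character, and that would explain why its output kept a trailing dash.

```shell
# Hypothetical wrapper around the pipeline; reads stdin, writes stdout.
slugify() {
  iconv -t ascii//TRANSLIT |
    sed -E -e 's/[^[:alnum:]]+/-/g' -e 's/^-+|-+$//g' |
    tr '[:upper:]' '[:lower:]'
}

# Runs of spaces and punctuation collapse to one dash; edge dashes are trimmed.
printf '%s\n' '  Hello, World!  ' | slugify   # prints: hello-world
```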
