No, you don't want a dash

Posted at 2024-07-13

Characters have names.

I happen to have some experience with text normalization of Western texts crawled from the World Wide Web, and I have seen the better part of the zodiac of esoteric symbols. As a traumatic carryover from my relationship with LaTeX, I also have a by-product level of interest in typesetting. One of the recurring topics there is the hyphen.

Since people have a tendency to assign any word to any meaning (hello arbitrariness of the sign, hello Saussure!), I don't expect anyone to be aware of the nomenclature concerning the horizontal bar-like symbols. Say, when German people read text aloud for others, they say "minus" for the hyphen. But anyone who did LaTeX - and math - knows that the minus and hyphen symbols are not interchangeable. You are butchering it. Well, historically, the hyphen used to be a replacement for the minus in environments where only ASCII was available. Like your grandma's typewriter. Hence the official name HYPHEN-MINUS.
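To make the "not interchangeable" point concrete, here is a minimal Python sketch. The constant names and the `parses_as_int` helper are my own, made up for illustration; the only assumption is a reasonably recent Python 3 (3.7 or later for `str.isascii`).

```python
HYPHEN_MINUS = "\u002d"  # the ASCII character on every keyboard
MINUS_SIGN = "\u2212"    # the typographic minus used in math typesetting


def parses_as_int(text: str) -> bool:
    """Return True if Python's int() accepts the text."""
    try:
        int(text)
        return True
    except ValueError:
        return False


print(HYPHEN_MINUS == MINUS_SIGN)         # False: two different characters
print(MINUS_SIGN.isascii())               # False: not available in ASCII
print(parses_as_int(HYPHEN_MINUS + "5"))  # True
print(parses_as_int(MINUS_SIGN + "5"))    # False: int() rejects U+2212
```

The two symbols may look nearly identical on screen, but to a program they are unrelated characters.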

Without aiming to be comprehensive, here is a list of hyphen-like symbols.

| symbol | unicode | name | notes |
|---|---|---|---|
| - | 0x002d | plain hyphen | the only one in ASCII |
| ‐ | 0x2010 | unicode hyphen | |
| ‑ | 0x2011 | non-breaking hyphen | |
| ‒ | 0x2012 | figure dash | |
| – | 0x2013 | en-dash | as wide as 'n' |
| — | 0x2014 | em-dash | as wide as 'm' |
| ― | 0x2015 | dash or em-dash | as wide as 'm' |
| − | 0x2212 | minus | minus is not hyphen |
| | 0x2e12 | CJK minus | minus in Japanese texts |
| ⸺ | 0x2e3a | double em-dash | twice 'm' |
| | 0x2e15 | horizontal bar | |
| ﹘ | 0xfe58 | small em dash | |
| ﹣ | 0xfe63 | small hyphen minus | |
| － | 0xff0d | CJK full width hyphen-minus | |
| ｰ | 0xff70 | CJK length mark | half-width |
| ー | 0x30fc | CJK full width length mark | 'the' kana length mark |
| 一 | 0x4e00 | numeral one | |
| ─ | 0x2500 | box drawings light horizontal | may look different |
| ━ | 0x2501 | box drawings heavy horizontal | may look different |

Feel free to check the code point for any of these symbols in Python:

hex(ord("-"))  # '0x2d'
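Going the other way, from code point to character and official name, is just as easy with the standard `unicodedata` module. The sketch below is only an illustration: the `describe` helper and the selection of code points (a handful from the list above) are my own choices.

```python
import unicodedata

# A few code points taken from the list above.
CODE_POINTS = [0x002D, 0x2010, 0x2013, 0x2014, 0x2212, 0x30FC]


def describe(code_point: int) -> str:
    """Format one code point as: glyph, U+XXXX, general category, name."""
    char = chr(code_point)
    name = unicodedata.name(char, "<unnamed>")
    category = unicodedata.category(char)
    return f"{char}  U+{code_point:04X}  {category}  {name}"


for cp in CODE_POINTS:
    print(describe(cp))
```

The general category is a handy hint: the dashes report `Pd` (dash punctuation), while the minus sign reports `Sm` (math symbol) and the kana length mark reports `Lm` (modifier letter).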

So, who cares?

Normally no one cares. I don't care. But when a site explicitly asks for a dash in the input, that is most probably not what it wants. It wants an ASCII hyphen, not a dash. They just don't know. Perhaps "dash" sounds cooler than "hyphen", or is easier to spell. I have never come across any site that accepted a dash - code point 0x2013 - in the input.

[Screenshot: qiita-dash.png]

Yeah, the failed attempt captured in the screenshot was due to the requested dash.
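For what it's worth, a site that wanted to be forgiving could fold the look-alikes into the ASCII hyphen before validating the input. The following is only a sketch of that idea, not how Qiita or any other site actually works; `DASH_LIKE`, `TO_HYPHEN` and `fold_dashes` are hypothetical names, and which characters to fold is my own choice.

```python
# Hypothetical helper: fold common dash look-alikes into the ASCII
# hyphen-minus (0x002d) before validating user input.
DASH_LIKE = (
    "\u2010\u2011\u2012\u2013\u2014\u2015"  # hyphens and dashes
    "\u2212"                                # minus sign
    "\ufe58\ufe63\uff0d"                    # small and full-width forms
)
TO_HYPHEN = str.maketrans({ch: "-" for ch in DASH_LIKE})


def fold_dashes(text: str) -> str:
    """Replace every character in DASH_LIKE with '-'."""
    return text.translate(TO_HYPHEN)


print(fold_dashes("text\u2013normalization"))  # text-normalization
```

The kana length mark (0x30fc) and the numeral one (0x4e00) are deliberately left out: they are ordinary letters in Japanese text, and folding them would damage real words.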
