
Common Lisp hash-table with accented characters as keys

I'm trying to create a hash table in Common Lisp that uses characters as keys, but it doesn't work with accented characters: all of them seem to end up under a single key.

In this example I add 5 keys and the hash table reports 5 elements. Then I add another 5 keys with accents and the table reports 6 elements. Finally I add another 5 "normal" keys and the count goes to 11 (as expected).

What is happening? And how can I solve this?

;; The default test is EQL, which is fine for character keys.
(defparameter *h* (make-hash-table))
(setf (gethash #\A *h*) #\A)
(setf (gethash #\E *h*) #\A)
(setf (gethash #\I *h*) #\A)
(setf (gethash #\O *h*) #\A)
(setf (gethash #\U *h*) #\A)
(hash-table-count *h*) ; 5
(setf (gethash #\á *h*) #\A)
(setf (gethash #\é *h*) #\A)
(setf (gethash #\í *h*) #\A)
(setf (gethash #\ó *h*) #\A)
(setf (gethash #\ú *h*) #\A)
(hash-table-count *h*) ; 6 here, although 10 was expected
(setf (gethash #\a *h*) #\A)
(setf (gethash #\e *h*) #\A)
(setf (gethash #\i *h*) #\A)
(setf (gethash #\o *h*) #\A)
(setf (gethash #\u *h*) #\A)
(hash-table-count *h*) ; 11
Manuel asked Oct 20 '25 22:10



2 Answers

From the SBCL manual:

On non-Unicode builds, the default external format is :latin-1.

You want to use UTF-8. So do what the manual says, and set your environment up when you call SBCL:

$ LANG=C.UTF-8 sbcl --noinform --no-userinit --eval "(print (map 'string #'code-char (list 97 98 246)))" --quit
"abö"
$ LANG=C sbcl --noinform --no-userinit --eval "(print (map 'string #'code-char (list 97 98 246)))" --quit
"ab?"
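
To check the same thing from inside a running image, something like the following sketch may help. It uses only standard functions; the value printed for the external format is implementation-specific, and the variable name *sample* is just for illustration.

```lisp
;; External format of the stream the REPL reads from
;; (the exact value printed depends on the implementation).
(print (stream-external-format *standard-input*))

;; Build "abö" from code points, so this source stays pure ASCII.
(defparameter *sample* (map 'string #'code-char (list 97 98 246)))

;; The code points do not depend on the terminal encoding,
;; only the way they are printed or read does.
(print (map 'list #'char-code *sample*))
;; prints (97 98 246)
```

If the last form prints the expected code points but the string itself shows up as "ab?", the problem is in the external format rather than in the characters.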

If you use SLIME or Sly from Emacs, there is a way to set it up in your init:

(setq sly-lisp-implementations
      '((sbcl ("/opt/sbcl/bin/sbcl") :coding-system utf-8-unix)))

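Then use a sane test function. In my opinion, you should use the most specific predicate whenever possible: for comparing characters directly that is char=, with char-equal as the case-insensitive version. For make-hash-table itself, the standard only allows eq, eql, equal, and equalp as :test; the default eql already distinguishes characters the way char= does, and equalp compares them the way char-equal does.

The sly-lisp-implementations snippet comes from the Sly manual; it works on SLIME too, as slime-lisp-implementations.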

As noted in the comment by @Manuel, if your LANG variable and friends do not use UTF-8, then you are doomed. See this question.
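
To illustrate the case-sensitive versus case-insensitive choice for character keys, here is a minimal sketch using the standard hash-table tests eql and equalp. The variable names are only illustrative.

```lisp
;; EQL (the default) treats #\A and #\a as different keys;
;; EQUALP compares characters case-insensitively.
(defparameter *case-sensitive* (make-hash-table :test 'eql))
(defparameter *case-insensitive* (make-hash-table :test 'equalp))

(dolist (ch (list #\A #\a))
  (setf (gethash ch *case-sensitive*) t)
  (setf (gethash ch *case-insensitive*) t))

(print (list (hash-table-count *case-sensitive*)
             (hash-table-count *case-insensitive*)))
;; prints (2 1)
```

For the table in the question, where #\A and #\a are meant to be separate keys, the default eql test is the one that fits.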

Spenser Truex answered Oct 24 '25 06:10



If, for whatever reason, you cannot change SBCL's default external format, you can always spell the characters by name: #\LATIN_SMALL_LETTER_A_WITH_ACUTE, etc.
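
A related way to keep the source pure ASCII, sketched here as an alternative that is not part of the answer above, is to build the keys from their code points with the standard code-char function. The table name *accented* is hypothetical.

```lisp
;; Keys are built from code points, so no non-ASCII byte appears in the source.
(defparameter *accented* (make-hash-table)) ; default test is EQL

;; Code points for the five accented vowels: á é í ó ú
(dolist (code '(#xE1 #xE9 #xED #xF3 #xFA))
  (setf (gethash (code-char code) *accented*) #\A))

(print (hash-table-count *accented*))
;; prints 5

;; CHAR-NAME shows the name the implementation uses for a character,
;; which is the name that can be written after #\ in source code.
(print (char-name (code-char #xE1)))
```

Each accented vowel now gets its own entry, whatever encoding the file or terminal uses.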

peter.cntr answered Oct 24 '25 07:10



