You want to see if a value contains only alphabetic characters.
A first approximation can be achieved using the standard character classes:
> (regexp-match "^[A-Za-z]+$" "PLTScheme")
("PLTScheme")
> (regexp-match "^[A-Za-z]+$" "123-Have Fun")
#f
> (if (regexp-match "^[A-Za-z]+$" "Soup")
      (printf "I like alphabet soup!")
      (printf "I don't like number soup."))
I like alphabet soup!
>
Unfortunately, this does not properly handle languages whose alphabets include characters outside the standard 26 English letters:
> (regexp-match "^[A-Za-z]+$" "Molière")
#f
If you need to match alternative alphabets (as defined by the user's locale settings), you should use SRFI 14 (Character-set Library) and the char-set:letter character set.
> (require (lib "14.ss" "srfi"))
> (define exp
    (regexp (string-append "^["
                           (char-set->string
                            (char-set-difference
                             char-set:letter char-set:punctuation))
                           "]+$")))
> (regexp-match exp "PLTScheme")
("PLTScheme")
> (regexp-match exp "Molière")
("Molière")
SRFI 14 provides a large collection of character sets and tools for manipulating them. For the purposes of this recipe we can build a suitable regular expression by assembling it from the "beginning of line" anchor (^), the "match one-of" opening bracket ([), the set of letter characters from Unicode (char-set:letter) less the set of punctuation characters (char-set:punctuation), which we convert to a string, the "match one-of" closing bracket (]), the "match at least one" operator (+), and the "end of line" anchor ($).
Here's how you'd use this in a program:
(require (lib "14.ss" "srfi"))
(define (test-alphabetic words)
  ;; exp matches strings made up entirely of letter characters
  (letrec ((exp
            (regexp (string-append "^["
                                   (char-set->string
                                    (char-set-difference
                                     char-set:letter char-set:punctuation))
                                   "]+$")))
           ;; checker accumulates the matching words (in reverse order)
           (checker (lambda (words alphawords)
                      (if (null? words)
                          alphawords
                          (let ((word (car words)))
                            (if (regexp-match exp word)
                                (checker (cdr words) (cons word alphawords))
                                (checker (cdr words) alphawords)))))))
    (checker words '())))
(define test-words
(list "silly" "façade" "coöperate" "niño" "Renée" "Molière"
"hæmoglobin" "naïve" "tschüß" "random!stuff#here"))
> (test-alphabetic test-words)
("tschüß" "naïve" "hæmoglobin" "Molière" "Renée" "niño" "coöperate" "façade" "silly")
Your system's locale (3) manpage
Mastering Regular Expressions
Even though this approach handles most Latin-1 characters, Scheme is not yet fully Unicode compliant. This will be addressed for PLT Scheme in the soon-to-be-released update.
--
BrentAFulgham - 18 May 2004