From: "Wolli Buechel"
Newsgroups: de.comp.lang.python
Subject: [Python-de] Re: Fwd: Keyboard coding
Date: Wed, 24 Jul 2024 15:55:03 -0000

Dear Mr. Schnoor,

The characters that occur more than once in ziffern are not merely "a few
Chinese characters", and they are not just "doubled" either. There are 66
such characters in total, from the following writing systems:

    DEVANAGARI :  6
    BENGALI    :  4
    ORIYA      :  3
    TIBETAN    :  7
    KHMER      : 11
    OL CHIKI   :  4
    GEORGIAN   : 20
    CJK        : 11

You can find this out with the Python module unicodedata:

    import unicodedata
    # Python docs: https://docs.python.org/3/library/unicodedata.html

    # ziffern is the string under discussion in this thread; it is not
    # defined in this post.
    # Extract from ziffern: repeated characters, line breaks, spaces
    mehrfach = sorted({
        x for x in ziffern
        if ziffern.count(x) > 1 or x in "\n\t "
    })

    # Count the characters per writing system, using the first word of
    # the Unicode character name as the label.
    Names = dict()
    for i, ch in enumerate(mehrfach):
        try:
            chName = unicodedata.name(ch)
        except ValueError:
            # Unnamed characters (e.g. control characters): use the category
            chName = unicodedata.category(ch)
        language = chName.split()[0]
        if language == 'OL':
            # "OL CHIKI" is a two-word script name
            language = ' '.join(chName.split()[:2])
        Names[language] = Names.get(language, 0) + 1
        # print("[%2d] >%s<\t%5d\t%s" % (i+1, ch, ord(ch), chName))

    for k, v in Names.items():
        print("%s \t: %2d" % (k, v))

W. Buechel
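
For anyone who wants to try this without the original ziffern string, here is a minimal, self-contained sketch of the same approach. It is not the code from this post: the helper names (repeated_chars, script_label, count_by_script) and the sample string are made up for illustration, and collections.Counter stands in for the repeated ziffern.count(x) calls. As above, the first word of the Unicode character name is only a rough stand-in for the writing system.

```python
import unicodedata
from collections import Counter


def repeated_chars(text):
    """Return, in sorted order, the characters occurring more than once in text."""
    counts = Counter(text)  # one pass instead of text.count(x) per character
    return sorted(ch for ch, n in counts.items() if n > 1)


def script_label(ch):
    """Rough script label: first word of the Unicode name (two for 'OL CHIKI')."""
    try:
        name = unicodedata.name(ch)
    except ValueError:
        # Characters without a name (e.g. control characters such as "\n"):
        # fall back to the general category, e.g. "Cc".
        return unicodedata.category(ch)
    words = name.split()
    return " ".join(words[:2]) if words[0] == "OL" else words[0]


def count_by_script(chars):
    """Count the given characters per script label."""
    return Counter(script_label(ch) for ch in chars)


# Hypothetical sample, NOT the ziffern string from the thread:
# four characters appear twice, two appear only once.
sample = "\u0967\u0967\u09e7\u09e7\u1c51\u1c51\u4e00\u4e00\u4e8c7"

for label, n in count_by_script(repeated_chars(sample)).items():
    print("%-10s : %2d" % (label, n))
```

With the real ziffern in place of sample, count_by_script(repeated_chars(ziffern)) should give the same kind of per-script tally as the loop in the post, apart from the whitespace characters that the post's filter also lets through.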