From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.arch.embedded
Subject: Re: How to add the second (or other) languages
Date: Wed, 12 Feb 2025 20:50:18 +0100
Message-ID: <voiu1q$2f5ap$1@dont-email.me>
References: <voii3i$28jmm$1@dont-email.me> <voioe3.598.1@stefan.msgid.phost.de>
In-Reply-To: <voioe3.598.1@stefan.msgid.phost.de>

On 12/02/2025 18:14, Stefan Reuther wrote:
> Am 12.02.2025 um 17:26 schrieb pozz:
>> #if LANGUAGE_ITALIAN
>> #  define STRING123            "Evento %d: accensione"
>> #elif LANGUAGE_ENGLISH
>> #  define STRING123            "Event %d: power up"
>> #endif
> [...]
>> Another approach is giving the user the possibility to change the
>> language at runtime, maybe with an option on the display. In some cases,
>> I have enough memory to store all the strings in all languages.
> 
> Put the strings into a structure.
> 
>    struct Strings {
>        const char* power_up_message;
>    };
> 
> I hate global variables, so I pass a pointer to the structure to every
> function that needs it (but of course you can also make a global variable).
> 
> Then, on language change, just point your structure pointer elsewhere,
> or load the strings from secondary storage.
> 
> One disadvantage is that this loses you the compiler warnings for
> mismatching printf specifiers.
> 
>> I know there are many possible solutions, but I'd like to know some
>> suggestions from you. For example, it could be nice if there was some
>> tool that automatically extracts all the strings used in the source code
>> and helps managing more languages.
> 
> There's packages like gettext. You tag your strings as
> 'printf(_("Event %d"), e)', and the 'xgettext' command will extract them
> all into a .po file. Other tools help you manage these files (e.g.
> 'msgmerge'; Emacs 'po-mode'), and gcc knows how to do proper printf
> warnings.
> 
> The .po file is a mapping from English to Whateverish strings. So you
> would convert that into some space-efficient resource file, and
> implement the '_' macro/function to perform the mapping. The
> disadvantage is that this takes lot of memory because your app needs to
> have both the English and the translated strings in memory. But unless
> you also use a fancy preprocessor that translates your code to
> 'printf(getstring(STR123), e)', I don't see how to avoid that. In C++20,
> you might come up with some compile-time hashing...
> 
> I wouldn't use that on a microcontroller, but it's nice for desktop apps.
> 
> 
>    Stefan


You don't need a very fancy preprocessor to handle this yourself, if 
you are happy to make a few changes to the code.  Have your code use 
something like :

#define DisplayPrintf(id, desc, args...) \
	display_printf(strings[language][string_ ## id], ## args)

Use it like :

	DisplayPrintf(event_type_on, "Event on", ev->idx);
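
For what it is worth, here is a minimal self-contained sketch of how that macro might expand and run.  The table contents, the `display_buffer`, and the body of `display_printf` are assumptions for illustration (a real system would write to its display driver instead of a buffer).  The `args...` and `, ## args` forms are GNU C extensions, so this assumes gcc or clang.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed stand-ins for what the generated header would provide. */
typedef enum { string_event_type_on, no_of_strings } string_index;
typedef enum { lang_English, lang_Italian, no_of_languages } language_index;

static language_index language = lang_English;

static const char *const strings[no_of_languages][no_of_strings] = {
    { "Event %d: power up" },       /* English */
    { "Evento %d: accensione" },    /* Italian */
};

/* Stand-in for the real display: formats into a RAM buffer. */
static char display_buffer[64];

static int display_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(display_buffer, sizeof display_buffer, fmt, ap);
    va_end(ap);
    return n;
}

/* desc is never expanded; it is there for the reader and for the script. */
#define DisplayPrintf(id, desc, args...) \
    display_printf(strings[language][string_ ## id], ## args)
```

Setting `language = lang_Italian;` before a call to `DisplayPrintf(event_type_on, "Event on", 7);` then picks the Italian format string for the same source line.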


A little Python preprocessor script can chew through all your C files 
and identify each call to "DisplayPrintf".  It can collect together all 
the id's and generate a header with something like :

	typedef enum {
		string_event_type_on, ...
	} string_index;
	enum { no_of_strings = ... };

	typedef enum {
		lang_English, lang_Italian, ...
	} language_index;
	enum { no_of_languages = ... };

	extern language_index language;		// global var :-)
	extern const char* strings[no_of_languages][no_of_strings];

Then it will have a C file :

	#include "language.h"

	language_index language;
	const char* strings[no_of_languages][no_of_strings] = {
	{	// English
		"Event %d: power up",		// Event on
		...
	},
	{	// Italian
		"Evento %d: accensione",	// Event on
	}
	};
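
One possible refinement, not something the scheme requires: if the script emits C99 designated initializers, each string is tied to its enum name rather than to its position in the list, and a C11 `_Static_assert` can catch a table that has fewer rows than there are languages.  In the sketch below the enums are repeated so it compiles on its own, and `string_event_type_off`, its English text, and the helper `get_string` (which falls back to English for an untranslated entry) are all hypothetical additions for illustration.

```c
#include <stddef.h>

typedef enum { string_event_type_on, string_event_type_off, no_of_strings } string_index;
typedef enum { lang_English, lang_Italian, no_of_languages } language_index;

/* First dimension left open so the assertion below actually checks something. */
const char *const strings[][no_of_strings] = {
    [lang_English] = {
        [string_event_type_on]  = "Event %d: power up",
        [string_event_type_off] = "Event %d: power down",
    },
    [lang_Italian] = {
        [string_event_type_on]  = "Evento %d: accensione",
        /* string_event_type_off has no translation here, so it is NULL */
    },
};

_Static_assert(sizeof strings / sizeof strings[0] == no_of_languages,
               "string table must have one row per language");

/* Hypothetical helper: use the English text when a translation is missing. */
const char *get_string(language_index lang, string_index id)
{
    const char *s = strings[lang][id];
    return s ? s : strings[lang_English][id];
}
```

Whether a silent fallback is desirable is a design choice; the script could equally refuse to generate the table until every language file is complete.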

It would generate the strings based on language files:

	# english.txt
	event_type_on : Event %d: power up
	...
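
Note that the translated text can itself contain a colon (as "Event %d: power up" does), so whatever reads these files should split at the first colon only.  The script would most likely be Python as suggested above; purely to keep the examples in one language, here is the splitting logic sketched in C.  The name `split_language_line` and the rule of trimming blanks around the separator are assumptions, since the format shown above does not pin those details down.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Splits "id : text" in place at the FIRST ':'.
   Returns 0 on success, or -1 for blank lines, '#' comment lines,
   lines without a ':', and lines with an empty id. */
int split_language_line(char *line, char **id, char **text)
{
    while (isspace((unsigned char)*line))
        line++;
    if (*line == '\0' || *line == '#')
        return -1;

    char *colon = strchr(line, ':');
    if (colon == NULL)
        return -1;

    /* Trim blanks between the id and the colon. */
    char *end = colon;
    while (end > line && isspace((unsigned char)end[-1]))
        end--;
    if (end == line)
        return -1;
    *end = '\0';

    /* Skip blanks after the colon and drop the line ending. */
    char *t = colon + 1;
    while (*t == ' ' || *t == '\t')
        t++;
    t[strcspn(t, "\r\n")] = '\0';

    *id = line;
    *text = t;
    return 0;
}
```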

If the preprocessor finds a use of DisplayPrintf whose description does 
not match the one already recorded for that id, it should give an error - 
duplicate uses of the same pair are skipped.  (The id can be as long or 
short as you want, but can't have spaces or awkward characters.  You 
could just use an id and no description if you prefer.)

Any ids that are not in the language files will be printed out or put in 
a file; ids that are in the language files but not used in the program 
will give warnings, etc.
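
Both checks are the same set difference run in opposite directions.  In Python it would be a one-liner with sets; a sketch of the logic in C, with the hypothetical name `ids_missing_from` and ids assumed to be held as arrays of strings, could look like this:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if id appears in list[0..n-1], otherwise 0. */
static int contains(const char *const *list, size_t n, const char *id)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], id) == 0)
            return 1;
    return 0;
}

/* Counts the entries of a[] that are absent from b[].
   If out is not NULL, the absent ids are stored there (room for na entries). */
size_t ids_missing_from(const char *const *a, size_t na,
                        const char *const *b, size_t nb,
                        const char **out)
{
    size_t count = 0;
    for (size_t i = 0; i < na; i++) {
        if (!contains(b, nb, a[i])) {
            if (out != NULL)
                out[count] = a[i];
            count++;
        }
    }
    return count;
}
```

Called as `ids_missing_from(used, n_used, in_file, n_in_file, out)` it lists the untranslated ids; with the first two pairs of arguments swapped it lists the ids that are in the language file but unused in the program.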

It can all be done in a manner that makes it easy to get right, hard to 
get wrong, and will not cause trouble as strings are added or removed.

It would be a lot simpler than gettext, and use minimal runtime space 
and time.  And it should be straightforward to change if you want to 
have string tables stored externally or something like that.  (I've made 
systems with string tables in an external serial EEPROM, for example.)
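
As a rough sketch of the external-storage variant: the table of pointers becomes a table of offsets, and a lookup function copies the string out of the external device into a RAM buffer.  Everything here is assumed for illustration.  `eeprom_read` stands in for whatever driver you have (it is simulated with an array so the example runs on a PC), and the image layout (one 16-bit little-endian offset per string, followed by NUL-terminated strings) is just one possibility.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated device contents: offset table first, then the strings. */
static const uint8_t eeprom_image[] = {
    4, 0,                   /* string 0 starts at offset 4 */
    7, 0,                   /* string 1 starts at offset 7 */
    'O', 'n', '\0',
    'O', 'f', 'f', '\0',
};

/* Hypothetical driver call; a real one would talk I2C or SPI. */
static void eeprom_read(uint16_t addr, void *dst, size_t len)
{
    memcpy(dst, &eeprom_image[addr], len);
}

/* Copies string number id into buf, truncating if needed.
   The result is NUL-terminated whenever size > 0.  Returns buf. */
const char *load_string(unsigned id, char *buf, size_t size)
{
    uint8_t raw[2];
    eeprom_read((uint16_t)(id * 2u), raw, sizeof raw);
    uint16_t addr = (uint16_t)(raw[0] | (raw[1] << 8));

    size_t i = 0;
    while (i + 1 < size) {
        eeprom_read((uint16_t)(addr + i), &buf[i], 1);
        if (buf[i] == '\0')
            return buf;
        i++;
    }
    if (size > 0)
        buf[i] = '\0';
    return buf;
}
```

With one such image per language, changing language amounts to changing the base address the lookup uses; the byte-at-a-time read keeps the sketch short, and a real driver would normally read in larger chunks.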