German umlauts in file names not displayed correctly in graphical terminal applications

This is a fresh Manjaro installation on a Thinkpad with a US keyboard layout and English language everywhere (using en_US.UTF-8 locale).

Since I reside in Germany, I use the German locale for other things like number formats and monetary syntax. Apart from Plasma this usually doesn’t create any issues.

This is my current locale setting (as it shows inside the kitty terminal or Konsole application):

$ locale
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_DE.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=en_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=
$ localectl status
System Locale: LANG=en_US.UTF-8
               LANGUAGE=en_US:en_GB:en:de_DE:de
               LC_NUMERIC=de_DE.UTF-8
               LC_TIME=de_DE.UTF-8
               LC_COLLATE=de_DE.UTF-8
               LC_MONETARY=de_DE.UTF-8
               LC_PAPER=de_DE.UTF-8
               LC_NAME=de_DE.UTF-8
               LC_ADDRESS=de_DE.UTF-8
               LC_TELEPHONE=de_DE.UTF-8
               LC_MEASUREMENT=de_DE.UTF-8
               LC_IDENTIFICATION=de_DE.UTF-8
    VC Keymap: us
   X11 Layout: us

Unfortunately, when running the ls command, umlauts in file names like “ä” are displayed as $'\303\244'.
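For readers unfamiliar with this notation: \303 and \244 are the octal values of the two UTF-8 bytes that encode “ä”, so ls is showing the raw bytes of the name rather than the character. A minimal sketch to confirm this in any POSIX shell (the helper name utf8_octal is made up for this illustration):

```shell
# Print the two bytes written as octal escapes; a UTF-8 terminal shows them as one character.
printf '\303\244\n'

# Reverse direction: dump the bytes of a string as octal numbers.
# utf8_octal is a hypothetical helper name, not an existing tool.
utf8_octal() {
    printf '%s' "$1" | od -An -t o1
}

utf8_octal 'ä'
```

The file name itself is stored correctly; only the way ls renders it changes.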

Running the command explicitly with a set LC_ALL environment variable does fix the issue:
$ LC_ALL=en_US.UTF-8 ls

So does using another ls replacement like eza.

Umlauts in file names are also displayed correctly inside the graphical applications, like Dolphin. The ls command also displays them correctly in text-only terminals on the same machine.

There may be other commands with the same issue, but ls is where I noticed it first.

Sometimes, when running pacman, I also get this message in the output, which is probably related to the issue:
bsdtar: Failed to set default locale

I know that Plasma uses its own language settings, and I’ve set them up as I did on my other machines running Manjaro. However, I get this warning message, which is new to me:

I’ve tried using the Manjaro and Arch Wiki entries about locales, but can’t figure this out by myself.

This is my /etc/locale.conf:

LANG=en_US.UTF-8
LANGUAGE=en_US:en_GB:en:de_DE:de
LC_MESSAGES=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_COLLATE=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_NUMERIC=de_DE.UTF-8
LC_PAPER=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_ALL=

I’ve uncommented the en_US.UTF-8, en_GB.UTF-8 and de_DE.UTF-8 locales in the /etc/locale.gen file, generated them with locale-gen multiple times, and restarted the system a couple of times after configuration changes.

AFAIK LC_ALL is supposed to be empty, so I don’t get why the locale command complains about it.

I’ve tried some other things like
$ LANG=en_US.UTF-8 ls
or
$ LC_MESSAGES=en_US.UTF-8 ls
but that doesn’t change anything.

Also, some sources recommend setting the C locale in locale.conf, but the default configuration doesn’t, and it also doesn’t show up in the locale.gen file. I’ve tried that too and it didn’t change anything.

I know that Plasma and locale settings have been a big mess for many years now, but I wonder if there’s a solution for this issue? Usually there was a “set and don’t touch after” solution for these kinds of issues.

The German locale is probably not enabled in /etc/locale.gen
and is therefore not generated
and is therefore not available to be chosen -
or is displayed incorrectly if chosen but not available.

That would be my suspicion.

… enable the locale in that file
and run:
locale-gen (sudo needed)

the red message hints at this - for some reason, automatic locale generation is not supported, it says

ps:
the easy way around all this is:
install:
glibc-locales
(contains all the locales, not just german and english …)
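To test the suspicion that a configured locale was never generated, one could compare the values reported by locale against the list from locale -a. The sketch below is only an illustration: the function names canon and missing_locales are made up, and the name matching (lowercasing, dropping hyphens and quotes) is an assumption about how locale -a spells names such as de_DE.utf8. In the locale output from the first post, LC_MONETARY and LC_MEASUREMENT show en_DE.UTF-8, which is not one of the three locales the poster generated; whether that was the actual cause is not confirmed in this thread.

```shell
# Normalise locale names so "de_DE.UTF-8" and "de_DE.utf8" compare equal.
# (Assumption: lowercasing and dropping hyphens/quotes is sufficient here.)
canon() {
    tr '[:upper:]' '[:lower:]' | sed 's/[-"]//g'
}

# Usage: missing_locales "<list of available locales>" < NAME=value lines
# Prints every LANG/LC_* setting whose value is not in the available list.
missing_locales() {
    available=$(printf '%s\n' "$1" | canon)
    while IFS='=' read -r name value; do
        value=$(printf '%s' "$value" | tr -d '"')
        [ -n "$value" ] || continue              # skip empty settings such as LC_ALL=
        case $name in
            LANG|LC_*) ;;
            *) continue ;;
        esac
        if ! printf '%s\n' "$available" | grep -qxF "$(printf '%s' "$value" | canon)"; then
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}

# Check the current session, if the locale command is available.
if command -v locale >/dev/null 2>&1; then
    locale 2>/dev/null | missing_locales "$(locale -a 2>/dev/null)"
fi
```

No output means every value in use has a matching generated locale.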

This has to be something Plasma-specific, since I have a nearly identical bi(tri)lingual config and it works. On XFCE.
So, you can test ls from TTY. If it works there, then it is something terminal program specific - try another one, check the font in settings, etc.

Thank you for your reply!

The de_DE.UTF-8 locale is definitely uncommented and is also shown when generating the locales manually – I also have glibc-locales already installed (maybe some dependency).

this line in your /etc/locale.conf looks odd to me
but that is all it is - I may not know enough to judge
and:
I do have Plasma/KDE installed - but just in its stock form as a VM
I don’t really use it.

This is what my setup for german looks like

system menus are all in english

but Umlauts are correctly displayed

cat /etc/locale.conf 
LANG=en_US.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_NUMERIC=de_DE.UTF-8
LC_PAPER=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_TIME=de_DE.UTF-8

no LANGUAGE= there
not sure whether it even belongs there - try commenting it out?

Thank you for your reply!

I’ve tried Konsole, Kitty and Alacritty and it behaves exactly the same in each, even with other fonts.

Since the umlauts are displayed correctly, for example when entering them manually on the command line or using another program like eza to list the directory contents, I assume that the terminal app, font, or code page is not the issue here.

Here’s another example of what this looks like:

mkdir test
cd test

$ touch Äpfel Mühle Börse

$ ls
'B'$'\303\266''rse'  'M'$'\303\274''hle'  ''$'\303\204''pfel'

$ LC_ALL=en_US.UTF-8 ls
Äpfel  Börse  Mühle

$ eza
Äpfel  Mühle  Börse

$ find . -type f
./B??rse
./M??hle
./??pfel

$ LC_ALL=en_US.UTF-8 find . -type f
./Börse
./Mühle
./Äpfel

Something weird is going on…
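One possible reading of the output above, stated as an assumption rather than a diagnosis: a program that fails to initialise its locale from the environment stays in the C locale, whose character set is plain ASCII, so every byte of a UTF-8 umlaut counts as non-printable. The sketch below only compares the character set reported for two explicit LC_ALL values; charmap_for is a made-up helper name. Note that the locale tool may initialise categories individually, so running it without LC_ALL in the affected session does not necessarily show what ls sees.

```shell
# Report the character set used for a given LC_ALL value.
# charmap_for is a hypothetical helper name used only in this sketch.
charmap_for() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

printf 'C locale:    %s\n' "$(charmap_for C)"
printf 'en_US.UTF-8: %s\n' "$(charmap_for en_US.UTF-8)"
```

On glibc systems the C locale typically reports ANSI_X3.4-1968 (ASCII), while a generated en_US.UTF-8 locale reports UTF-8.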

… I added to my post - re

Thanks!

I use a similar line on all my systems, and I only added it here after it did not work with the default settings. I’ve tried it with and without this line and it doesn’t seem to matter at all.

I just have never seen something like that
LANGUAGE=en_US:en_GB:en:de_DE:de

I posted the content of my file - all the system, all the menus are in english
but German umlauts are displayed correctly everywhere.

standard question:
is your system fully up to date?

only these two locales are enabled/generated on my system …

grep -v ^# /etc/locale.gen

de_DE.UTF-8 UTF-8
en_US.UTF-8 UTF-8

glibc-locales are not installed

Why is that line needed? I don’t have it anywhere that I can detect.
I use Swedish mixed with US English and my settings are:

$ cat /etc/locale.conf
LANG=en_US.UTF-8
LC_NUMERIC=sv_SE.UTF-8
LC_TIME=sv_SE.UTF-8
LC_MONETARY=sv_SE.UTF-8
LC_PAPER=sv_SE.UTF-8
LC_NAME=sv_SE.UTF-8
LC_ADDRESS=sv_SE.UTF-8
LC_TELEPHONE=sv_SE.UTF-8
LC_MEASUREMENT=sv_SE.UTF-8
LC_IDENTIFICATION=sv_SE.UTF-8

$ localectl status
System Locale: LANG=en_US.UTF-8
               LC_NUMERIC=sv_SE.UTF-8
               LC_TIME=sv_SE.UTF-8
               LC_MONETARY=sv_SE.UTF-8
               LC_PAPER=sv_SE.UTF-8
               LC_NAME=sv_SE.UTF-8
               LC_ADDRESS=sv_SE.UTF-8
               LC_TELEPHONE=sv_SE.UTF-8
               LC_MEASUREMENT=sv_SE.UTF-8
               LC_IDENTIFICATION=sv_SE.UTF-8
    VC Keymap: sv-latin1
   X11 Layout: se
    X11 Model: pc105

That just works for me.

I don’t know why exactly, but I deleted the plasma-localerc file in ~/.config/ and restarted the system, then it worked as expected.

The plasma-localerc file got recreated, but only contains one line now:

[Formats]
LANG=en_US.UTF-8

I’m not 100% sure that this is the reason it works now, but it was the last new thing I tried before restarting the system.
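For anyone repeating this step, moving the file aside instead of deleting it keeps the old contents available for later comparison. A minimal sketch, assuming the file lives in ~/.config as described above; the function name backup_plasma_localerc and the .bak suffix are arbitrary choices for this example:

```shell
# Move plasma-localerc aside instead of deleting it, so the old settings
# can still be inspected afterwards. The optional argument overrides the
# config directory (default: ~/.config).
backup_plasma_localerc() {
    config_dir=${1:-"$HOME/.config"}
    file="$config_dir/plasma-localerc"
    if [ -f "$file" ]; then
        mv -- "$file" "$file.bak" && printf 'moved %s to %s.bak\n' "$file" "$file"
    else
        printf 'no %s found\n' "$file"
    fi
}
```

Calling backup_plasma_localerc without arguments and then logging out and back in (or rebooting, as was done here) should let Plasma recreate the file, and the .bak copy can then be compared against the new one.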

$ locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=

As far as I can tell the output is identical, but this time without the error/warning message.


It would have been slightly interesting to know what it was before, but:
good - it now appears to work as intended

that LANGUAGE= line looked unfamiliar and weird …

Just found out that the shell commands now produce German output messages, although LANG and LC_MESSAGES are set to en_US.UTF-8. But I remember having this issue before. I can live with that for now.
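A possible starting point for tracking this down, not confirmed in this thread: programs that use GNU gettext pick the message language from LANGUAGE first (when the locale is not C), then LC_ALL, then LC_MESSAGES, then LANG. The sketch below simply prints those variables in that order so the values in the graphical session can be compared with those on a TTY; show_message_vars is a made-up name.

```shell
# Print the variables GNU gettext consults for the message language,
# highest priority first. show_message_vars is a hypothetical helper.
show_message_vars() {
    for name in LANGUAGE LC_ALL LC_MESSAGES LANG; do
        # eval is used only to read the variable whose name is in $name
        eval "value=\${$name-}"
        printf '%s=%s\n' "$name" "$value"
    done
}

show_message_vars
```

If LANGUAGE turns out to start with a German entry in the affected session, that would be consistent with German messages appearing despite the LANG and LC_MESSAGES values.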