Goals/Input/2025-02-06 Matrix Log: Keyboard Layouts and Input Methods

From KDE Community Wiki

Logs have been slightly edited for brevity. Discussion points:

  • Recap of Input Goal history with regards to input methods.
  • Unified settings for keyboard layouts + input methods.
  • Logical organization of language vs. layouts vs. input methods on other platforms.
  • csslayer (Weng Xuetian) recommends that System Settings use a flat list of keyboard layouts, as it already does, but with input method(s) optionally associated with a layout.
  • It's worth considering letting universal input methods (e.g. speech to text, emoji selector) exist outside of the layout or language selection concept.

---

[jpetso] (Jakob Petsovits)

csslayer, here's a quick recap of where we're coming from and where I think we currently stand.

1. Around June last year, Duha (Gernot Schiller) approached me with an idea to propose input handling as one of the new KDE Goals, running from Akademy '24 to Akademy '26. We drafted a proposal. redstrate (Joshua Goins) joined the proposal, with a focus on making tablet input great.

2. At Akademy last September, I got to talk to a number of people including d_ed (David Edmundson), apol (Aleix Pol) and Eike (Hein), who taught me about parts of the input method history and current situation. Despite this, I still feel like I've barely scratched the surface.

3. The biggest wishlist items on the input method front seem to be (a) a better on-screen keyboard than what Maliit can currently provide, including for Plasma Mobile, and (b) better integration of input methods, so that a default Plasma installation can appeal to a CJK user.

4. We all agree that input methods are important, but I think only csslayer and Eike have substantial experience using/developing them. Our plan so far was to break down the tasks to improve the Plasma user experience with virtual keyboards and input methods.

[csslayer]

Thanks for the information. I think that sounds great.

Kylin Virtual Keyboard on SteamOS

As for the on-screen keyboard side, fcitx has also developed a D-Bus interface to support a virtual keyboard UI provided by an external process. There is an implementation (even Qt & QML based) that can be found here: https://gitee.com/openkylin/kylin-virtual-keyboard

Because it's developed together with a Chinese distro, openKylin, it has some hardcoded Kylin-related configuration. It also doesn't work very well with Wayland for now (due to the well-known limitations of windowing on Wayland).

If anyone is interested in what it looks like, here's a screenshot from when I compiled it on the Steam Deck. Unlike Maliit, it supports seamless switching between physical keyboard and virtual keyboard (e.g. if you press a physical key, the OSK will simply hide, and there's a button to switch between physical / OSK mode).

I have a fork for it locally with some changes:

  1. port to Qt 6; the original code is currently Qt 5 based (mostly done)
  2. cut all the Kylin-specific code
  3. use layer-shell to support Wayland (not done yet)

If someone is interested, e.g. in building more Plasma tech on top of it or in adding fcitx support to the current plasma-keyboard project, I'd be happy to help.

[csslayer]

As for adding input method support to the keyboard KCM... it's trickier because people might not have a consensus on what it should look like or how it should be handled.

I tried to design a system in fcitx5 for both layout & input method "from my point of view", but I'm not really a keyboard layout user, and I have also received bug reports saying that this keyboard layout switching feature is not what they want, etc.

But judging by the trend on other platforms (Mac, Windows), layout and input method are being managed together with their language settings.

[jpetso]

When talking to different people, everyone had a different emphasis and the suggested solutions may differ depending on who is providing mentorship. So my initial takeaway is that there are lots of moving parts, and we'll probably have to build things iteratively because it's not entirely clear yet what the ultimate outcome will look like.

Here are some fragments:

  • @apol has prototyped a new virtual keyboard based on the Qt Virtual Keyboard library, which works well with Wayland and does not suffer from the architectural or practical concerns of Maliit. Importantly, it outsources a lot of the international expertise that we don't have to Qt. But it needs lots more work if we want to actually ship it. I don't know how it compares to the Kylin vkb.
  • We would like to evolve the Keyboard and Virtual Keyboard KCMs into a unified "Input Languages" KCM, same general idea as what your blog post talks about. To make this happen, we need to provide the right abstractions on an API/configuration level, and also adapt the KCMs accordingly. If we have a good plan and people to work on it, we could go at this both from the bottom (infrastructure) and top (UI) direction simultaneously.
  • I got the sense that we don't want the kind of hard dependency on fcitx that GNOME has on ibus. Both fcitx and ibus, and also KDE-driven Wayland input methods with a narrower scope (e.g. for speech-to-text), should be available to users depending on their needs.
  • There is some motivation to allow quick switching between Wayland input methods or even cascading them for simultaneous use (e.g. fcitx for physical keyboard, plasma-keyboard for on-screen keyboard). I also realize that input method frameworks such as fcitx and ibus already provide similar concepts, and I'm not quite sure what the right answer is, if we don't want to depend on a particular input method framework.

Going forward, we would like to keep existing features working on X11 but new features can be Wayland-only where it makes sense.

[jpetso]

And then there's the whole side discussion about Wayland input methods being able to emit key events rather than UTF-8 text from an on-screen keyboard. @dorotac joined the room a while earlier, and if I'm not mistaken, is taking up work on the input-method-v3 Wayland protocol but no clue where this is all going or when. GNOME conveniently sidestepped this issue by embedding its on-screen keyboard into the compositor process directly.

[jpetso]

csslayer: So that's a lot, and if you have ideas on where an input newbie such as myself (or a future GSoC student) could best make an impact, I'm interested in your opinion.

csslayer: Do you have any pointers to user opinions about what they want in a settings page? (or don't want)

[csslayer]

what does this "they" refer to?

[jpetso]

the user(s) that told you this is not what they want in a keyboard switching feature

You've been reading a lot of reported issues, I figure you may have some good insight on what works and what doesn't.

[csslayer]

well, one main thing is partially related to the fact that global hotkey handling on wayland is missing in fcitx

most keyboard layout users would think they should be able to switch layouts without focusing a text box.

an input method, in most cases, only receives keys when some app needs text to be typed.

so such layout switching is implemented in some non-input-method way

e.g. xcb_grab_key on X11

meanwhile, xdg portal recently added global hotkeys, which are plumbed to the desktop's global hotkey manager

but the feature that I need is not there in xdg portal (I want to implement alt-tab style switching, and modifier-key-only switching)

[jpetso]

I think this is related to the unified input language settings. Regular Plasma keyboard layouts can be switched with a shortcut; ideally we should use this shortcut to switch the whole input language setting instead incl. layout and Wayland input method settings?

[csslayer]

in plasma & xkb there are two styles, one is the layout switching within xkb (xkb supports only up to 4 layouts).

one is the "ctrl.alt k" done with kglobalaccel , which can't do modifier only switching.

so what fcitx does for physical layout switching is, it sends only 1 layout to kwin (under wayland), and when fcitx detects a switching request (hotkey or whatever), it sends a new layout config to kwin.
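
The "send only one layout, replace it on every switch" approach described above can be sketched roughly as follows. This is an illustration only, not fcitx or KWin code: the class name `SingleLayoutSwitcher` and the `send_to_compositor` callback are hypothetical stand-ins for whatever mechanism actually delivers the layout configuration to the compositor.

```python
from typing import Callable, List


class SingleLayoutSwitcher:
    """Keeps the full layout list locally and hands the compositor
    exactly one layout at a time (hypothetical sketch)."""

    def __init__(self, layouts: List[str],
                 send_to_compositor: Callable[[str], None]):
        if not layouts:
            raise ValueError("at least one layout is required")
        self._layouts = list(layouts)
        self._index = 0
        self._send = send_to_compositor
        # Initial configuration: the compositor only ever sees one layout.
        self._send(self.current)

    @property
    def current(self) -> str:
        return self._layouts[self._index]

    def switch_next(self) -> str:
        """Called when a switching request (hotkey or otherwise) is detected."""
        self._index = (self._index + 1) % len(self._layouts)
        # Push a fresh single-layout configuration instead of asking the
        # compositor to change groups within a multi-layout keymap.
        self._send(self.current)
        return self.current


# Record what would be sent to the compositor.
sent: List[str] = []
switcher = SingleLayoutSwitcher(["us", "de", "fr", "jp", "gr"], sent.append)
switcher.switch_next()
switcher.switch_next()
```

In this sketch the local list can hold any number of layouts, because the compositor is only ever given the one that is currently active.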

[jpetso]

> one is the "ctrl.alt k" done with kglobalaccel , which can't do modifier only switching.

That's interesting, because the shortcut button accepts a modifier-only shortcut. So that won't work? Let me try that.

[csslayer]

at least that is a new thing under wayland and doesn't work under X

[jpetso]

Just tried it under Wayland, seems to work here.

[csslayer]

but either way it is not really alt tab style.

alt tab style means you have a stack and switch to the last used ones.

that requires detecting whether you are still holding the modifier key
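
As a rough illustration of the alt-tab style described above, here is a minimal sketch of a most-recently-used stack that only commits a selection once the modifier is released. The class and method names are hypothetical and do not correspond to any fcitx, KWin or portal API; key event delivery is assumed to happen elsewhere.

```python
from typing import List


class MruSwitcher:
    """Alt-tab style switching over a most-recently-used list (sketch)."""

    def __init__(self, items: List[str]):
        if not items:
            raise ValueError("at least one item is required")
        self._mru = list(items)  # index 0 is the currently active item
        self._cursor = 0         # candidate shown while the modifier is held

    @property
    def active(self) -> str:
        return self._mru[0]

    def step(self) -> str:
        """Trigger key pressed while the modifier is still held:
        move the candidate one entry further back in history."""
        self._cursor = (self._cursor + 1) % len(self._mru)
        return self._mru[self._cursor]

    def release_modifier(self) -> str:
        """Modifier released: commit the candidate and move it to the front."""
        chosen = self._mru.pop(self._cursor)
        self._mru.insert(0, chosen)
        self._cursor = 0
        return chosen


s = MruSwitcher(["us", "pinyin", "de"])
s.step()
first = s.release_modifier()   # one step: the previously used entry
s.step()
second = s.release_modifier()  # one step again: back to where we started
s.step()
s.step()
third = s.release_modifier()   # two steps while holding: the oldest entry
```

Without the modifier-release signal, `step()` cannot tell "keep browsing the stack" apart from "start a new switch", which is why holding state has to be detectable.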

[jpetso]

In the "Configure Switching" sub-page of Plasma layout settings, there is a "Switch to last used layout" shortcut (Meta+Alt+L) which isn't quite alt-tab style but switches between the last two layouts. I wonder if this could be changed to go through the complete stack instead of just the last two.

I'd probably need a user report if I want KWin to accept that change, or they're going to ask "why".

assuming this is handled by KWin

So, plan forward:

  1. Switch entire languages / layout+IM settings instead of just layout
  2. Get modifier-only shortcuts for free
  3. Extend "Switch to last used layout" to full alt-tab style behavior

Point (1) is the hard part obviously.

csslayer: How would you feel about an API (D-Bus or Wayland) that lets fcitx enumerate input methods that it supports internally, and then System Settings allows the user to associate one of these to a keyboard layout? We should be able to pass off control from the settings page to fcitx for customizing the respective input method.

[csslayer]

I don't want you to jump to conclusions too fast

There are many different factors that may affect how we want to implement it. For example, how does this work with per-window layouts?

[jpetso]

Fair :)

[csslayer]

If you are using a stack, then once you have more than three items, the most recently used one may not be predictable

Different kinds of users have certain preferences, and I was just pointing out one possible use case

[jpetso]

In the "Configure Switching" settings sub-page, the first setting is "Switching layout affects: [All windows | All windows on current desktop | ... | Current window only]". Is that what you mean by per-window layout, or is there another interpretation?

I feel like part of the problem in this space is that there are already so many options implemented by both Plasma and fcitx, but they're slightly different and combining them into one interface without changes would probably be overwhelming.

[csslayer]

I think we are missing the point and going into too much detail

The more important point is whether or how Plasma will manage layout and input method switching

Let's take Windows as an example

Under Windows there are two dimensions

One is to switch between different languages

The other is to switch between input method and layout under that language

[jpetso]

I think that makes sense. Do you have an opinion on whether or how Plasma should manage the switching?

[csslayer]

Maybe let's just compare all other platforms first

On Android it's also two dimensions, because every keyboard is a standalone app and the Android system will kill the keyboard being switched away from

There are certain keyboard apps that may support multiple languages, for example Gboard

[jpetso]

So the two dimensions on Android are keyboard app -> input method within keyboard, correct?

[csslayer]

And the switching languages within the keyboard app is kind of like the second dimension

[jpetso]

I guess that's in a way similar to what Plasma/Wayland has right now, with Virtual Keyboard as first dimension and e.g. fcitx internal switching as a second dimension.

not considering that fcitx itself may still expose more detail than that

[csslayer]

But Android also provides a flattened option to allow the user to switch to a sub-keyboard within that keyboard app

GNOME is just one dimension, with layout and input method mixed together. I think it is similar to Mac but I don't really have much experience with Mac.

[jpetso]

I was just looking at Mac. It seems they have a flat list of Input Sources, where each Input Source is associated to a language as well as other settings (like layout & more settings for CJK languages than Latin ones).

but don't have a Mac so I can only go by articles on the Internet :)

[csslayer]

as for fcitx itself, it is also two-dimensional: the first dimension is the "group", but it is always associated with a layout so you can think of the first dimension as the layout.

the second dimension is the input methods within the group, but just like many open source things with lots of freedom, it allows you to add a layout as an "input method"

a layout as input method behaves slightly differently compared to a pure layout, but let's not go into those details for now.
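
To make the two dimensions easier to picture, here is a small data-model sketch of the group concept as described above. It is not fcitx's actual data structure or configuration format; the `Group` class, its field names and the example entries are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Group:
    """First dimension: a group, always tied to one layout (sketch)."""
    name: str
    layout: str
    # Second dimension: entries within the group. Note that a layout
    # can itself appear here as an "input method" entry.
    input_methods: List[str] = field(default_factory=list)


# Example entries are illustrative only.
groups = [
    Group("Default", "us", ["keyboard-us", "pinyin"]),
    Group("Japanese", "jp", ["keyboard-jp", "mozc"]),
]


def flatten(gs: List[Group]) -> List[Tuple[str, str, str]]:
    """List every (group, layout, entry) combination a user can reach."""
    return [(g.name, g.layout, im) for g in gs for im in g.input_methods]


combos = flatten(groups)
```

Switching in the first dimension changes the group (and with it the layout), while switching in the second dimension moves between entries inside the current group.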

so you can see fcitx's own design is more similar to windows, but gives you lots of freedom to organize things in whatever way you want...I'm not saying it is a good thing and I know it has created lots of confusion in real users. But let me explain why...

so xkb itself, if you look into evdev.xml in xkeyboard-config

a layout+variant may have multiple language codes associated with one layout.

and it also has many confusing entries, with some nonsensical language codes associated with them

for example, the cn "layout"

its language code is zh (Chinese), but it is identical to the us layout and cannot produce Chinese characters at all.

I think it is called "Chinese" because it is China's standard layout, even though it is just the en-us layout

Similarly, the jp layout cannot produce Kanji (some special ones can produce hiragana or katakana). It is also qwerty, but the locations of symbols/punctuation are different from the us layout.

And many Chinese (Linux) users would just use the US layout as their layout. Very few people would choose that cn layout.
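
For anyone who wants to inspect the language-to-layout mapping themselves, the following sketch lists the language codes attached to each layout. To stay self-contained it parses a small hand-written sample that only mimics the structure of evdev.xml; the sample entries are abbreviated assumptions, and the real file (commonly installed as /usr/share/X11/xkb/rules/evdev.xml, though the path may vary by distribution) contains many more fields and entries.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hand-written, abbreviated sample in the style of evdev.xml.
# Not copied from the real file; real entries may differ.
SAMPLE = """
<xkbConfigRegistry version="1.1">
  <layoutList>
    <layout>
      <configItem>
        <name>us</name>
        <description>English (US)</description>
        <languageList><iso639Id>eng</iso639Id></languageList>
      </configItem>
    </layout>
    <layout>
      <configItem>
        <name>cn</name>
        <description>Chinese</description>
        <languageList><iso639Id>zho</iso639Id></languageList>
      </configItem>
    </layout>
    <layout>
      <configItem>
        <name>ch</name>
        <description>German (Switzerland)</description>
        <languageList>
          <iso639Id>deu</iso639Id>
          <iso639Id>gsw</iso639Id>
        </languageList>
      </configItem>
    </layout>
  </layoutList>
</xkbConfigRegistry>
"""


def layout_languages(xml_text: str) -> Dict[str, List[str]]:
    """Map each layout name to the language codes listed for it."""
    root = ET.fromstring(xml_text)
    result: Dict[str, List[str]] = {}
    for item in root.findall("./layoutList/layout/configItem"):
        name = item.findtext("name")
        langs = [e.text for e in item.findall("./languageList/iso639Id")]
        result[name] = langs
    return result


mapping = layout_languages(SAMPLE)
```

To try it on a real system, the file contents could be read and passed to `layout_languages()` instead of `SAMPLE`.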

This makes me think that making layout the first dimension makes more sense on Linux, because the language -> layout mapping is so messed up.

another reason is that, even for non-text-typing purposes, a layout is necessary for applications to convert from key code to key sym, and applications handle key syms more than key codes.

I think another benefit of using layout as the first dimension is that existing layout users can transition seamlessly.

[jpetso]

So if we use layout as the first dimension, would there be a second dimension with input method settings? Or rather a flat list like Mac, where you can essentially have the same layout twice but with different language settings?

[csslayer]

Yeah, that would be what I prefer to have.
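
A flat list in which the same layout may appear more than once, each time with a different (optional) input method, could be modelled along these lines. This is only a sketch of the idea under discussion: `InputSource`, its fields and the example entries are hypothetical, not an existing Plasma or fcitx API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InputSource:
    """One row in a flat list: a layout plus an optional input method."""
    layout: str
    input_method: Optional[str] = None

    def label(self) -> str:
        if self.input_method is None:
            return self.layout
        return f"{self.layout} + {self.input_method}"


# The "us" layout appears twice: once plain, once with an input method.
sources: List[InputSource] = [
    InputSource("us"),
    InputSource("us", "pinyin"),
    InputSource("de"),
]


def next_index(current: int, items: List[InputSource]) -> int:
    """Cycle through the flat list with a single 'next' shortcut."""
    return (current + 1) % len(items)


labels = [s.label() for s in sources]
```

Existing layout-only configurations map directly onto rows with no input method, which matches the point about letting current layout users transition seamlessly.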

[jpetso]

Hm, I think this would have interesting implications for Wayland input methods that aren't strictly layout based, like e.g. speech to text.

[csslayer]

As for the confusion that I mentioned earlier... I think it comes from:

1. the current presentation in the fcitx5 configtool / kcm is not so good: it is a combobox to choose the group, and a list view that displays what's in the current group.

I would prefer to have a 1-level tree view or sectioned list view to display everything at the same time.

2. allowing adding layout as input method within the group

[csslayer]

as for 1)... the reason that I'm not using treeview or sectioned list view is mainly because I just find it hard to implement drag and drop while qml has no good tree view... mainly some technical reasons 🤣

[jpetso]

Hm, I think this would have interesting implications for Wayland input methods that aren't strictly layout based, like e.g. speech to text.

[csslayer]

If you look at other systems, speech to text is usually managed separately, so I don't think that would be a blocker here.

[jpetso]

It's also related to text input. But perhaps fair to look at it separately. It could also be advantageous to switch speech languages together with keyboard layout settings that are associated to a given language.

Either way, I agree that speech to text is solvable in this model.

[csslayer]

for example, the mic button on android may just scan the whole list and launch the speech-to-text capable one

I think when we implement it, it can be a special item belonging to the group that is not involved in regular switching at all.

prior one accessibility feature that uses system language more without need to switch languages frequently.

I don't really have good experience with speech to text on any platform, especially when it comes to multilingual scenarios.

[jpetso]

I haven't used it much, but I know people who do (edit: although, not multi-language), and I guess the models are getting better.

[csslayer]

I feel that the need to specify an ordered list of preferred languages for speech to text is a technical issue, not what users really want.

For example, it is really painful if I need to switch my input method to Chinese before I can get my voice recognized as Chinese, or vice versa

they are just two different things.

I don't need to ask another human who can understand both Chinese and English to switch to "Chinese" mode before I can start to speak, right 🤣

[jpetso]

STT models like Whisper already transcribe to different languages afaik. Slightly older tech required language selection. I guess by the time we're done with this project, we won't have to worry much about older STT models anymore :P

[csslayer]

I'm feeling optimistic on this given the whole AI thing is improving these a lot

[csslayer]

> STT models like Whisper already transcribe to different languages afaik. Slightly older tech required language selection. I guess by the time we're done with this project, we won't have to worry much about older STT models anymore :P

I don't know about Whisper, but neither Siri nor Google Assistant nor Gboard can get my words right if I need to mix in some English words.

Actually, Google Assistant can hardly understand my English if the language order is 1. Chinese 2. English. Accuracy is almost 0

[jpetso]

So in your model, the Keyboard settings page could look very similar to the current one, but in addition we could add a button such as "Configure input method..." to a given layout row?

[csslayer]

I would prefer it to be flattened in one page, but I'm no HIG expert here.

[jpetso]

Flat list yes, but every input method has additional settings and I think a sub-page or pop-up dialog for those is probably unavoidable. The fcitx KCM also has a "settings" icon for each row under "Input Method Off" and "Input Method On", I imagine something similar.

Also, if the input method is provided by a plugin component from outside of core Plasma, it probably makes sense to give it a separate space for configuring its own settings.

[csslayer]

> Flat list yes, but every input method has additional settings and I think a sub-page or pop-up dialog for those is probably unavoidable. The fcitx KCM also has a "settings" icon for each row under "Input Method Off" and "Input Method On", I imagine something similar.

yeah, engine-specific settings are less commonly touched, even by me

[jpetso]

Apologies if I'm jumping to conclusions early again. I'm happy to change directions at any time, it's just easier to have a picture in my head to work with.

I'd be happy to support this direction. It sounds like a feasible way forward. Now for Eike and other interested parties to read through the chat history and form their own opinions :)

csslayer: The next step I can think of would be, how to integrate fcitx into a model where Plasma keeps ownership of the keyboard layout but fcitx provides an additional input method for it? But it's pretty late here and I should probably go to bed for today.

Thanks for sharing your time and knowledge. Much appreciated :)

[csslayer]

Thank you. I had a wonderful time talking with you. :)