I should have some results available later today. I’ll also provide you with the instructions how to replicate some of the tests, so we can verify the observed behavior in two different environments – if applicable, of course.
Just a brief update… I’ve been busy in the last few days writing a couple of articles for a regional long-lasting printed computer magazine, to meet the magazine’s publishing deadline. Those articles should be completed in a day or two, at which point I’ll happily get back to debugging this issue.
As part of writing one of those articles, I’ve discovered an issue with a Patriot Rage Pro, a high-performance USB flash drive, for which I’ll also work on a fix through USB quirks in the Linux kernel. It would be ideal to fix that issue through a firmware update for the drive, but frankly, I don’t see that happening. However, I’ll try contacting the drive manufacturer, once I’m 100% certain about the actual culprit.
All kinds of issues just keep popping up all over the place.
I see, I see. Let’s see how it unfolds.
I’m finally back to working on this issue. I’ve been on a true rollercoaster ride in my private life for the last few weeks, but thankfully, the things are settling down.
Excited to hear what you might find.
I just added some patch to 6.1.12-2 I’m currently building …
Is this the missing hardware definitions to make cpuidle work or something else? I recognized the commit text to be same as in this sunxi spec page documenting something they call cpuidle but might not be same as kernel cpuidle governor? CPUIDLE - linux-sunxi.org
Most likely as the Megi kernel seems not to have them by default: linux/sun50i-a64.dtsi at orange-pi-6.1 · megous/linux · GitHub linux/sun50i-h5.dtsi at orange-pi-6.1 · megous/linux · GitHub
Just let me know if it works now …
YES, it is working now. Thanks for chasing the issue down.
Basically, that’s the wrong place to add the PSCI CPUIDLE stuff to the device tree. Instead, that’s already performed dynamically by TF-A, BL31 to be precise, when the device is booted, but the additions are actually lost.
It’s BL31’s job to actually perform the PSCI operations, so it detects dynamically on boot whether CRUST is there, etc., and adds the CPUIDLE stuff to the device tree if all of the required “ingredients” are detected to be there.
In other words, having that kernel patch in place is, basically, a wrong workaround. I’d suggest that we keep the topic unresolved, as I’m working on a proper fix.
I did mark it unsolved, but i would love to hear your ideas going forward. Seeing similar arguments as you raised were raised in the linux kernel thread, and no solution was found. Were they even ultimately equivalent, or just my impression?
edit: since megi does it like this, and it was proposed to kernel, is it truly an unwanted method? Is there even a kernel interface to do it as you propose, or why is it “lost” as you say? Do you know that your method will be THE proper fix? Are you saying this build that now works should not be pushed to stable?
Thanks for marking the issue as unresolved.
The final result will eventually be the same as with the workaround kernel patch, it’s just about where the required amendments to the device tree are performed. The right place is the BL31 itself, in a dynamic manner, which is already performed as visible below in a TF-A console message, but the amendments are actually lost.
INFO: PSCI: Suspend is available via SCPI
Of course, we can keep the workaround kernel patch in place for now, until there’s a proper fix available through the U-Boot package.
I’m quite close to the proper fix, but I’m unfortunately having some health issues that have already pinned me to the bed for about a week. I’m having some difficulties even while sitting to write this, but I should get better in about a week or so, according to the doctor. Hopefully.
Yes, it’s truly an unwanted way to fix this. For example, if there’s no CRUST available, the kernel will likely crash or error out while attempting to use the feature defined trough the device tree, but actually unavailable at runtime.
The kernel interface is to have the CPUIDLE “stuff” defined though the device tree, from the kernel’s viewpoint, but the actual device tree nodes aren’t supposed to be coming from the static device tree, but are added dynamically in BL31, as part of the boot loader booting up and the device firmware initializing, as I already described above.
I’m sure that the method I’m working on is the right way to fix this issue, simply because the BL31 already adds the CPUIDLE device tree nodes dynamically, by detecting the required runtime support, etc. Why the performed device tree amendments are actually lost is something I still need to determine and fix.
The patched kernel can be pushed to the stable branch as a temporary workaround. It should work just fine, but that’s simply the wrong place for a fix.
Nice to hear you are somewhat on top of this, but sad to hear you still have no idea wtf happens where things just get lost. Maybe megi wrestled with this also, was unable to decode it, and just hard patched it in?
Please, keep in mind that I’ve had very little opportunity to actually work on this in the last week or so, because I had to remain pinned to my bed and I wasn’t able to think straight most of the time, as a result of my current health isssues. Things happen.
I intend to keep working until there’s a proper fix available. Also, the above-linked kernel patch is actually from Samuel Holland, back from 2021, who, as we know, implemented CRUST and the associated TF-A code. See also commit e2b18771 in the TF-A source code, back from 2022, also coming from Samuel.
I was not trying to criticize you or your work in any way. Maybe saying you have no idea wtf the issue is was too strong of an expression and you took it the wrong way, i did not mean it like that. Just that even if you have no idea what the real issue is at this point, i think this proposed kernel patch/what philm merged is good to push to production, as you stated as temporary fix.
Ah, no worries, everything is perfectly fine. Actually, I was just trying to explain why I haven’t produced any results recently.
Yes, the proposed kernel patch should be fine as a temporary workaround. Once the fixed U-Boot package is available, we’ll have it fixed the right way and remove the workaround; until then, the workaround is fine for the stable branch.
For now the fix only exists in the unstable branch and will get its normal way to stable at some point. The patch is not part of the Megi kernel at all. More or less Mobian is maintaining their own stuff and had added it there.
Yes, Mobian has been using their own set of kernel patches, which were reportedly incomplete and caused quite a few random issues on the PinePhone, including the issues related to the communication with USB chargers.
Though, I don’t know is that still the case, or have the kernel patches applied by Mobian become more complete.
Here’s good news: I think I’ve figured out why the amendments to the device tree performed in BL31 become lost, but I still need to verify all that. Of course, I’ll describe everything in detail once I’ve completed the verification.
The bad news is that having it fixed properly won’t be easy. It may actually require an entire redesign of the way BL31 performs the device tree amendments, which will also be a whole lot of fun to upstream.