Support for mesa with Vulkan update in Raspberry Pi 4

I updated my system to use the RC kernel (5.10.0-rc7-1-MANJARO-ARM) and mesa-git, but the results are nowhere near as impressive as what everyone else is getting. I am using the XFCE4 desktop to test.

Update

I found that I had xf86-video-fbturbo-git installed. I had only checked for xf86-video-fbturbo. After removing it, the numbers improved (compared to the ~300 FPS I was seeing earlier), but they are still not that impressive.

ATTENTION: default value of option vblank_mode overridden by environment.
5625 frames in 5.0 seconds = 1124.936 FPS
5542 frames in 5.0 seconds = 1108.389 FPS
5904 frames in 5.0 seconds = 1180.615 FPS
5933 frames in 5.0 seconds = 1186.443 FPS
5943 frames in 5.0 seconds = 1188.506 FPS
XIO:  fatal IO error 104 (Connection reset by peer) on X server ":0.0"
      after 30151 requests (2027 known processed) with 0 events remaining.
vblank_mode=0 glxgears  9.27s user 5.18s system 55% cpu 26.168 total
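
The -git package mentioned in the update above was missed because only the exact package name was checked. A prefix match over the installed package list would catch both variants. This is a minimal sketch, assuming pacman is the package manager (as on Manjaro ARM); the helper name list_video_drivers is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: filter a list of package names (one per line on stdin)
# down to the X.Org video driver packages, including any -git variants.
list_video_drivers() {
    grep -E '^xf86-video-' || true   # no match is not treated as an error
}

# Usage on a real system (pacman -Qq prints installed package names only):
#   pacman -Qq | list_video_drivers
```

Any xf86-video-fbturbo* or xf86-video-fbdev* entry in the output is a candidate for the conflicts discussed in this thread.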

System Info

glxinfo OpenGL

OpenGL vendor string: Mesa/X.org
OpenGL renderer string: llvmpipe (LLVM 11.0.0, 128 bits)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 21.0.0-devel (git-6df572532d)
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

vulkaninfo

==========
VULKANINFO
==========

Vulkan Instance Version: 1.2.159


Instance Extensions: count = 10
===============================
	VK_EXT_debug_report                    : extension revision 9
	VK_EXT_debug_utils                     : extension revision 2
	VK_KHR_display                         : extension revision 23
	VK_KHR_external_memory_capabilities    : extension revision 1
	VK_KHR_get_physical_device_properties2 : extension revision 1
	VK_KHR_get_surface_capabilities2       : extension revision 1
	VK_KHR_surface                         : extension revision 25
	VK_KHR_wayland_surface                 : extension revision 6
	VK_KHR_xcb_surface                     : extension revision 6
	VK_KHR_xlib_surface                    : extension revision 6

Layers: count = 2
=================
VK_LAYER_MESA_device_select (Linux device selection layer) Vulkan version 1.1.73, layer version 1:
	Layer Extensions: count = 0
	Devices: count = 1
		GPU id = 0 (V3D 4.2)
		Layer-Device Extensions: count = 0

VK_LAYER_MESA_overlay (Mesa Overlay layer) Vulkan version 1.1.73, layer version 1:
	Layer Extensions: count = 0
	Devices: count = 1
		GPU id = 0 (V3D 4.2)
		Layer-Device Extensions: count = 0

Presentable Surfaces:
=====================
GPU id : 0 (V3D 4.2):
	Surface types: count = 2
		VK_KHR_xcb_surface
		VK_KHR_xlib_surface
	Formats: count = 2
		SurfaceFormat[0]:
			format = FORMAT_B8G8R8A8_SRGB
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
		SurfaceFormat[1]:
			format = FORMAT_B8G8R8A8_UNORM
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
	Present Modes: count = 4
		PRESENT_MODE_IMMEDIATE_KHR
		PRESENT_MODE_MAILBOX_KHR
		PRESENT_MODE_FIFO_KHR
		PRESENT_MODE_FIFO_RELAXED_KHR
	VkSurfaceCapabilitiesKHR:
	-------------------------
		minImageCount       = 3
		maxImageCount       = 0
		currentExtent:
			width  = 256
			height = 256
		minImageExtent:
			width  = 256
			height = 256
		maxImageExtent:
			width  = 256
			height = 256
		maxImageArrayLayers = 1
		supportedTransforms: count = 1
			SURFACE_TRANSFORM_IDENTITY_BIT_KHR
		currentTransform    = SURFACE_TRANSFORM_IDENTITY_BIT_KHR
		supportedCompositeAlpha: count = 2
			COMPOSITE_ALPHA_OPAQUE_BIT_KHR
			COMPOSITE_ALPHA_INHERIT_BIT_KHR
		supportedUsageFlags: count = 4
			IMAGE_USAGE_TRANSFER_SRC_BIT
			IMAGE_USAGE_TRANSFER_DST_BIT
			IMAGE_USAGE_STORAGE_BIT
			IMAGE_USAGE_COLOR_ATTACHMENT_BIT


Device Properties and Extensions:
=================================
GPU0:
-----
VkPhysicalDeviceProperties:
---------------------------
	apiVersion     = 4194459 (1.0.155)
	driverVersion  = 84291683 (0x5063063)
	vendorID       = 0x14e4
	deviceID       = 0x002a
	deviceType     = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
	deviceName     = V3D 4.2

VkPhysicalDeviceLimits:
-----------------------
	maxImageDimension1D                             = 4096
	maxImageDimension2D                             = 4096
	maxImageDimension3D                             = 4096
	maxImageDimensionCube                           = 4096
	maxImageArrayLayers                             = 2048
	maxTexelBufferElements                          = 268435456
	maxUniformBufferRange                           = 134217728
	maxStorageBufferRange                           = 134217728
	maxPushConstantsSize                            = 128
	maxMemoryAllocationCount                        = 478999
	maxSamplerAllocationCount                       = 65536
	bufferImageGranularity                          = 0x00000100
	sparseAddressSpaceSize                          = 0x00000000
	maxBoundDescriptorSets                          = 16
	maxPerStageDescriptorSamplers                   = 16
	maxPerStageDescriptorUniformBuffers             = 12
	maxPerStageDescriptorStorageBuffers             = 12
	maxPerStageDescriptorSampledImages              = 16
	maxPerStageDescriptorStorageImages              = 4
	maxPerStageDescriptorInputAttachments           = 4
	maxPerStageResources                            = 128
	maxDescriptorSetSamplers                        = 96
	maxDescriptorSetUniformBuffers                  = 72
	maxDescriptorSetUniformBuffersDynamic           = 8
	maxDescriptorSetStorageBuffers                  = 72
	maxDescriptorSetStorageBuffersDynamic           = 36
	maxDescriptorSetSampledImages                   = 96
	maxDescriptorSetStorageImages                   = 24
	maxDescriptorSetInputAttachments                = 4
	maxVertexInputAttributes                        = 16
	maxVertexInputBindings                          = 16
	maxVertexInputAttributeOffset                   = 4294967295
	maxVertexInputBindingStride                     = 4294967295
	maxVertexOutputComponents                       = 64
	maxTessellationGenerationLevel                  = 0
	maxTessellationPatchSize                        = 0
	maxTessellationControlPerVertexInputComponents  = 0
	maxTessellationControlPerVertexOutputComponents = 0
	maxTessellationControlPerPatchOutputComponents  = 0
	maxTessellationControlTotalOutputComponents     = 0
	maxTessellationEvaluationInputComponents        = 0
	maxTessellationEvaluationOutputComponents       = 0
	maxGeometryShaderInvocations                    = 0
	maxGeometryInputComponents                      = 0
	maxGeometryOutputComponents                     = 0
	maxGeometryOutputVertices                       = 0
	maxGeometryTotalOutputComponents                = 0
	maxFragmentInputComponents                      = 64
	maxFragmentOutputAttachments                    = 4
	maxFragmentDualSrcAttachments                   = 0
	maxFragmentCombinedOutputResources              = 20
	maxComputeSharedMemorySize                      = 16384
	maxComputeWorkGroupCount: count = 3
		65535
		65535
		65535
	maxComputeWorkGroupInvocations                  = 256
	maxComputeWorkGroupSize: count = 3
		256
		256
		256
	subPixelPrecisionBits                           = 6
	subTexelPrecisionBits                           = 8
	mipmapPrecisionBits                             = 8
	maxDrawIndexedIndexValue                        = 16777215
	maxDrawIndirectCount                            = 2147483647
	maxSamplerLodBias                               = 14
	maxSamplerAnisotropy                            = 16
	maxViewports                                    = 1
	maxViewportDimensions: count = 2
		4096
		4096
	viewportBoundsRange: count = 2
		-8192
		8191
	viewportSubPixelBits                            = 0
	minMemoryMapAlignment                           = 4096
	minTexelBufferOffsetAlignment                   = 0x00000100
	minUniformBufferOffsetAlignment                 = 0x00000020
	minStorageBufferOffsetAlignment                 = 0x00000020
	minTexelOffset                                  = -8
	maxTexelOffset                                  = 7
	minTexelGatherOffset                            = -8
	maxTexelGatherOffset                            = 7
	minInterpolationOffset                          = -0.5
	maxInterpolationOffset                          = 0.5
	subPixelInterpolationOffsetBits                 = 6
	maxFramebufferWidth                             = 4096
	maxFramebufferHeight                            = 4096
	maxFramebufferLayers                            = 256
	framebufferColorSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	framebufferDepthSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	framebufferStencilSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	framebufferNoAttachmentsSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	maxColorAttachments                             = 4
	sampledImageColorSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	sampledImageIntegerSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	sampledImageDepthSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	sampledImageStencilSampleCounts: count = 2
		SAMPLE_COUNT_1_BIT
		SAMPLE_COUNT_4_BIT
	storageImageSampleCounts: count = 1
		SAMPLE_COUNT_1_BIT
	maxSampleMaskWords                              = 1
	timestampComputeAndGraphics                     = true
	timestampPeriod                                 = 1
	maxClipDistances                                = 8
	maxCullDistances                                = 0
vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
	maxCombinedClipAndCullDistances                 = 8
	discreteQueuePriorities                         = 2
	pointSizeRange: count = 2
		0
		512
	lineWidthRange: count = 2
		1
		32
	pointSizeGranularity                            = 0
	lineWidthGranularity                            = 0
	strictLines                                     = true
	standardSampleLocations                         = false
	optimalBufferCopyOffsetAlignment                = 0x00000020
	optimalBufferCopyRowPitchAlignment              = 0x00000020
	nonCoherentAtomSize                             = 0x00000100

VkPhysicalDeviceSparseProperties:
---------------------------------
	residencyStandard2DBlockShape            = false
	residencyStandard2DMultisampleBlockShape = false
	residencyStandard3DBlockShape            = false
	residencyAlignedMipSize                  = false
	residencyNonResidentStrict               = false


Device Extensions: count = 6
----------------------------
	VK_EXT_external_memory_dma_buf : extension revision 1
	VK_EXT_private_data            : extension revision 1
	VK_KHR_external_memory         : extension revision 1
	VK_KHR_external_memory_fd      : extension revision 1
	VK_KHR_maintenance1            : extension revision 2
	VK_KHR_swapchain               : extension revision 68

VkQueueFamilyProperties:
========================
	queueProperties[0]:
	-------------------
		minImageTransferGranularity = (1,1,1)
		queueCount                  = 1
		queueFlags                  = QUEUE_GRAPHICS | QUEUE_COMPUTE | QUEUE_TRANSFER
		timestampValidBits          = 64
		present support             = false

VkPhysicalDeviceMemoryProperties:
=================================
memoryHeaps: count = 1
	memoryHeaps[0]:
		size   = 1961981952 (0x74f17800) (1.83 GiB)
		budget = 187651277412376 (0xaaaaf7bb5818) (170.67 TiB)
		usage  = 187651277412488 (0xaaaaf7bb5888) (170.67 TiB)
		flags: count = 1
			MEMORY_HEAP_DEVICE_LOCAL_BIT
memoryTypes: count = 1
	memoryTypes[0]:
		heapIndex     = 0
		propertyFlags = 0x0007: count = 3
			MEMORY_PROPERTY_DEVICE_LOCAL_BIT
			MEMORY_PROPERTY_HOST_VISIBLE_BIT
			MEMORY_PROPERTY_HOST_COHERENT_BIT
		usable for:
			IMAGE_TILING_OPTIMAL:
				color images
				FORMAT_D16_UNORM
				FORMAT_X8_D24_UNORM_PACK32
				FORMAT_D32_SFLOAT
				FORMAT_D24_UNORM_S8_UINT
				(non-sparse)
			IMAGE_TILING_LINEAR:
				color images
				(non-sparse)

VkPhysicalDeviceFeatures:
=========================
	robustBufferAccess                      = true
	fullDrawIndexUint32                     = false
	imageCubeArray                          = true
	independentBlend                        = true
	geometryShader                          = false
	tessellationShader                      = false
	sampleRateShading                       = true
	dualSrcBlend                            = false
	logicOp                                 = true
	multiDrawIndirect                       = false
	drawIndirectFirstInstance               = true
	depthClamp                              = false
	depthBiasClamp                          = false
	fillModeNonSolid                        = true
	depthBounds                             = false
	wideLines                               = true
	largePoints                             = true
	alphaToOne                              = true
	multiViewport                           = false
	samplerAnisotropy                       = true
	textureCompressionETC2                  = true
	textureCompressionASTC_LDR              = false
	textureCompressionBC                    = false
	occlusionQueryPrecise                   = true
	pipelineStatisticsQuery                 = false
	vertexPipelineStoresAndAtomics          = true
	fragmentStoresAndAtomics                = true
	shaderTessellationAndGeometryPointSize  = false
	shaderImageGatherExtended               = false
	shaderStorageImageExtendedFormats       = true
	shaderStorageImageMultisample           = false
	shaderStorageImageReadWithoutFormat     = false
	shaderStorageImageWriteWithoutFormat    = false
	shaderUniformBufferArrayDynamicIndexing = false
	shaderSampledImageArrayDynamicIndexing  = false
	shaderStorageBufferArrayDynamicIndexing = false
	shaderStorageImageArrayDynamicIndexing  = false
	shaderClipDistance                      = true
	shaderCullDistance                      = false
	shaderFloat64                           = false
	shaderInt64                             = false
	shaderInt16                             = false
	shaderResourceResidency                 = false
	shaderResourceMinLod                    = false
	sparseBinding                           = false
	sparseResidencyBuffer                   = false
	sparseResidencyImage2D                  = false
	sparseResidencyImage3D                  = false
	sparseResidency2Samples                 = false
	sparseResidency4Samples                 = false
	sparseResidency8Samples                 = false
	sparseResidency16Samples                = false
	sparseResidencyAliased                  = false
	variableMultisampleRate                 = false
	inheritedQueries                        = true

VkPhysicalDevicePrivateDataFeaturesEXT:
---------------------------------------
	privateData = true
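
For anyone wondering how the apiVersion line in the output above gets from 4194459 to 1.0.155: Vulkan packs the version into one integer, with the major number in bits 22-28, the minor in bits 12-21 and the patch in bits 0-11. A small sketch of that unpacking, with a made-up function name:

```shell
#!/bin/sh
# Hypothetical helper: unpack a Vulkan version integer into major.minor.patch.
vk_version() {
    v=$1
    echo "$(( (v >> 22) & 0x7f )).$(( (v >> 12) & 0x3ff )).$(( v & 0xfff ))"
}

# Usage with the value reported by vulkaninfo in this post:
#   vk_version 4194459
```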

llvmpipe is software rendering. When you have hardware acceleration working you will see

OpenGL renderer string: V3D 4.2
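
A quick way to script that check is to classify the renderer string. This is only a sketch: the function name is made up, and treating everything other than Mesa's llvmpipe and softpipe rasterizers as hardware is a simplification.

```shell
#!/bin/sh
# Hypothetical helper: classify an "OpenGL renderer string" value.
# llvmpipe and softpipe are Mesa's software rasterizers; anything else is
# treated as a hardware driver here, which is a simplification.
renderer_kind() {
    case "$1" in
        *llvmpipe*|*softpipe*) echo software ;;
        *)                     echo hardware ;;
    esac
}

# Usage on a real system (glxinfo comes from mesa-demos):
#   renderer_kind "$(glxinfo | grep 'OpenGL renderer string')"
```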

Hi,

The proper command is: glxgears -info | grep -i render

It's “glxgears -info | …”

I also fail to reproduce @0n0w1c's setup using XFCE:

  1. I changed to “unstable” branch
  2. I updated all packages
  3. I installed linux-rpi4-rc kernel (5.10.0-rc7-1-MANJARO-ARM)

I tried both mesa and mesa-git. As soon as I enable KMS (vc4–km4-v3d-pi4) in config.txt, I am not able to boot into the UI:

  1. without xf86-video-fbdev I get a blinking cursor instead of the login screen, with “no screen found (EE)” in Xorg.0.log
  2. with fbdev installed I get a segfault in Xorg.0.log

@0n0w1c could you try to reproduce your setup in XFCE (e.g. on a second SD card)? I really would like to make this work.

I believe it requires either an HDMI 2.0 cable or the proper hdmi_mode and hdmi_group set in your config.txt. It may also be possible to have working video without the above, but I was not so lucky. With either of those, though, it worked for me.

I have a xfce4 install in the office. I will check it in the morning.

I upgraded to the current arm-stable with xfce4. Then I uninstalled xf86-video-fbturbo-git and yes, I have V3D 4.2 as the renderer. No issue with lightdm starting.

Edit: However, the performance with this version of Mesa is not good; it needs 20.3. mesa-git will probably be fine, but I have not tested it.

Edit 2: I went ahead and tried mesa-git, but performance is not good with it either. Now I am curious what's up.

Does the 20.3.x build include rpi-vulkan support?

The Manjaro Mesa 20.3 package did not because the Arch Arm package build did not include it. There is a pull request to have the support added, so maybe when 20.3 returns to the Manjaro repository it will be included.

linux-rpi4-mainline is newer than linux-rpi4-rc, isn’t it?

Correct. The very latest kernel will be on the unstable branch.

I am also in the process of compiling linux-rpi4-mainline 5.10.4. If it tests out OK, I will push it to the unstable branch in a couple of hours. It was upgraded yesterday and would not compile, but today the RPi people pushed a fix.

I still have no idea how to reach
"
5625 frames in 5.0 seconds = 1124.936 FPS
"
All I get is 300 FPS,
so I am watching this thread and trying to learn.

It says
GL_RENDERER = V3D 4.2

With my testing, the current mesa and mesa-git do not perform well.
I am using mesa-20.3.1-0.1 which had a short life in unstable. It seems to cause issues for the Pinebook Pro but it runs great on my RPi4… so I won’t downgrade.

Hi, thanks for doing the tests. Did you have xf86-video-fbdev installed in your tests? IMHO this is not needed, as Xorg should use its (I assume) built-in KMS driver if all goes well, right? I wonder if I can uninstall this package.

Yes, you can remove it, but you may end up without working video. :slight_smile:

I believe you are correct: Xorg should use KMS even if llvmpipe is available. In theory, it should defer to vc4. I have had differing results with different combinations of mesa/mesa-git and xf86-video-fbturbo/xf86-video-fbturbo-git. That is why I uninstall xf86-video-fbdev*: depending on the combination of the above, it sometimes does not switch to vc4.

boot fresh install of kde plasma
upgrade to arm-unstable
reboot
install linux-rpi4-mainline
reboot
install mesa-demos

$ grep kms /boot/config.txt

dtoverlay=vc4-fkms-v3d

$ glxinfo | grep "OpenGL renderer"

OpenGL renderer string: V3D 4.2

$ vblank_mode=0 glxgears

ATTENTION: default value of option vblank_mode overridden by environment.
4273 frames in 5.0 seconds = 854.513 FPS
4055 frames in 5.0 seconds = 810.872 FPS
3941 frames in 5.0 seconds = 786.039 FPS
4562 frames in 5.0 seconds = 912.349 FPS

@ 1920x1080
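
To compare runs like the ones posted in this thread without eyeballing the individual lines, the FPS figures can be averaged. A minimal sketch, assuming the glxgears line format shown above; the helper name avg_fps is made up, and glxgears is of course only a rough indicator, not a real benchmark.

```shell
#!/bin/sh
# Hypothetical helper: read glxgears output on stdin and print the mean of
# the FPS figures. Lines that do not end in "FPS" are ignored.
avg_fps() {
    awk '$NF == "FPS" { sum += $(NF-1); n++ }
         END { if (n) printf "%.1f\n", sum / n; else print "no samples" }'
}

# Usage on a real system, stopping glxgears after 30 seconds:
#   vblank_mode=0 timeout 30 glxgears | avg_fps
```

The mean is only printed once glxgears exits, which is why the usage line wraps it in timeout.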

@Darksky I am under the impression that fkms is to give way to kms, implying kms is superior and for the long term, while fkms is only temporary. This change was to occur once vc4 was in the kernel. If this is incorrect, would you please correct my wayward thoughts.

Edit: I previously associated fkms with software rendering but this shows that thought to be incorrect.

It seems I copied over a typo from above in “vc4–km4-v3d-pi4”. Using the proper string “vc4-kms-v3d-pi4”, I get roughly 400 FPS with both llvmpipe and V3D on the mainline kernel (5.10) and mesa 2.6.

Switching to mesa-git results in roughly 1000 FPS, but the desktop becomes useless, i.e. applications are missing borders, there is no desktop background …
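
A mistyped overlay name like the one described in this post can be caught by checking config.txt against a list of expected names. This is a rough sketch: the function name is made up, and the list contains only the two overlay names that appear in this thread, so it would need extending for other overlays.

```shell
#!/bin/sh
# Hypothetical helper: print every dtoverlay= value in a config.txt-style
# file whose name starts with "vc4" but is not one of the overlay names
# that appear in this thread. Prints nothing if all entries look fine.
check_vc4_overlays() {
    grep -E '^dtoverlay=vc4' "$1" | while IFS= read -r line; do
        name=${line#dtoverlay=}
        name=${name%%,*}          # drop any overlay parameters
        case "$name" in
            vc4-kms-v3d-pi4|vc4-fkms-v3d) ;;
            *) echo "unrecognised overlay: $name" ;;
        esac
    done
}

# Usage on a real system:
#   check_vc4_overlays /boot/config.txt
```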

I stumbled across a link that had this:

$ cat /proc/device-tree/soc/firmwarekms@7e600000/status
okay

$ cat /proc/device-tree/v3dbus/v3d@7ec04000/status
okay

It was interesting in that the first one returned disabled for me. Yet I had hw acceleration. As a result, I have switched to:

dtoverlay=vc4-fkms-v3d

Now I get okay from both, however no change in glxgears performance.

Edit: In testing on a 4K monitor, much better performance with kms than with fkms. So there is some significant difference between them.
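
The two device-tree status checks quoted in this post can be wrapped in a small helper so that a missing node is reported instead of producing a cat error. A sketch only; the function name is made up, and the node paths are the ones from the post and may differ on other kernels.

```shell
#!/bin/sh
# Hypothetical helper: print the "status" property of a device-tree node.
# Device-tree strings are NUL-terminated, so the NUL byte is stripped.
# Prints "missing" when the node has no readable status file.
dt_status() {
    if [ -r "$1/status" ]; then
        tr -d '\0' < "$1/status"
        echo
    else
        echo missing
    fi
}

# Usage with the two node paths quoted above:
#   dt_status /proc/device-tree/soc/firmwarekms@7e600000
#   dt_status /proc/device-tree/v3dbus/v3d@7ec04000
```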

vulkan-broadcom package version 20.3.2 is now in unstable branch, if you want to test it out.
