<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<html>
|
|
<head>
|
|
<meta name="generator" content=
|
|
"HTML Tidy for Linux/x86 (vers 1 September 2005), see www.w3.org">
|
|
<meta http-equiv="Content-Type" content=
|
|
"text/html; charset=us-ascii">
|
|
<title>Chapter 9. Known Issues</title>
|
|
<meta name="generator" content="DocBook XSL Stylesheets V1.68.1">
|
|
<link rel="start" href="index.html" title=
|
|
"NVIDIA Accelerated Linux Graphics Driver README and Installation Guide">
|
|
<link rel="up" href="installationandconfiguration.html" title=
|
|
"Part I. Installation and Configuration Instructions">
|
|
<link rel="prev" href="commonproblems.html" title=
|
|
"Chapter 8. Common Problems">
|
|
<link rel="next" href="dma_issues.html" title=
|
|
"Chapter 10. Allocating DMA Buffers on 64-bit Platforms">
|
|
</head>
|
|
<body>
|
|
<div class="navheader">
|
|
<table width="100%" summary="Navigation header">
|
|
<tr>
|
|
<th colspan="3" align="center">Chapter 9. Known
|
|
Issues</th>
|
|
</tr>
|
|
<tr>
|
|
<td width="20%" align="left"><a accesskey="p" href=
|
|
"commonproblems.html">Prev</a> </td>
|
|
<th width="60%" align="center">Part I. Installation and
|
|
Configuration Instructions</th>
|
|
<td width="20%" align="right"> <a accesskey="n" href=
|
|
"dma_issues.html">Next</a></td>
|
|
</tr>
|
|
</table>
|
|
<hr></div>
|
|
<div class="chapter" lang="en">
|
|
<div class="titlepage">
|
|
<div>
|
|
<div>
|
|
<h2 class="title"><a name="knownissues" id=
|
|
"knownissues"></a>Chapter 9. Known Issues</h2>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<p>The following problems still exist in this release and are in
|
|
the process of being resolved.</p>
|
|
<div class="variablelist">
|
|
<p class="title"><b>Known Issues</b></p>
|
|
<dl>
|
|
<dt><span class="term">Cache Aliasing</span></dt>
|
|
<dd>
|
|
<p>Cache aliasing occurs when multiple mappings to a physical page
|
|
of memory have conflicting caching states, such as cached and
|
|
uncached. Due to these conflicting states, data in that physical
|
|
page may become corrupted when the processor's cache is flushed. If
|
|
that page is being used for DMA by a driver such as NVIDIA's
|
|
graphics driver, this can lead to hardware stability problems and
|
|
system lockups.</p>
|
|
<p>NVIDIA has encountered bugs with some Linux kernel versions that
|
|
lead to cache aliasing. Although some systems will run perfectly
|
|
fine when cache aliasing occurs, other systems will experience
|
|
severe stability problems, including random lockups. Users
|
|
experiencing stability problems due to cache aliasing will benefit
|
|
from updating to a kernel that does not cause cache aliasing to
|
|
occur.</p>
|
|
</dd>
|
|
<dt><span class="term">Valgrind</span></dt>
|
|
<dd>
|
|
<p>The NVIDIA OpenGL implementation makes use of self-modifying
code. To force Valgrind to retranslate this code after a
modification, you must run Valgrind with the command line
option:</p>
|
|
<pre class="screen">
--smc-check=all
</pre>
|
|
<p>Without this option, Valgrind may execute incorrect code, causing
incorrect behavior and reports of the form:</p>
|
|
<pre class="screen">
==30313== Invalid write of size 4
</pre>
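<p>As an illustration, a complete invocation might look like the
following sketch, where <code class="computeroutput">./my_gl_app</code>
is a placeholder for your own OpenGL application rather than a program
shipped with the driver:</p>
<pre class="screen">
valgrind --smc-check=all ./my_gl_app
</pre>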
|
|
<p></p>
|
|
</dd>
|
|
<dt><a name="msi_interrupts" id="msi_interrupts"></a><span class=
|
|
"term">Driver fails to initialize when MSI interrupts are
|
|
enabled</span></dt>
|
|
<dd>
|
|
<p>The Linux NVIDIA driver uses Message Signaled Interrupts (MSI)
|
|
by default. This provides compatibility and scalability benefits,
|
|
mainly due to the avoidance of IRQ sharing.</p>
|
|
<p>Some systems have problems supporting MSI but work fine with
virtual wire interrupts. These problems manifest as an inability to
start X with the NVIDIA driver, or as CUDA initialization failures.
The NVIDIA driver will then report an error indicating that the
NVIDIA kernel module does not appear to be receiving interrupts
generated by the GPU.</p>
|
|
<p>Problems have also been seen with suspend/resume while MSI is
|
|
enabled. All known problems have been fixed, but if you observe
|
|
problems with suspend/resume that you did not see with previous
|
|
drivers, disabling MSI may help you.</p>
|
|
<p>NVIDIA is working on a long-term solution to improve the
driver's out-of-the-box compatibility with system configurations
that do not fully support MSI.</p>
|
|
<p>MSI can be disabled via the NVIDIA kernel module parameter
"NVreg_EnableMSI=0". This can be set on the command line when
loading the module or, more appropriately, via your distribution's
kernel module configuration files (such as those under
/etc/modprobe.d/).</p>
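<p>As a minimal sketch, the parameter could be passed when loading the
module manually as root, assuming the module is named
<code class="computeroutput">nvidia</code> on your system (some
distributions rename it):</p>
<pre class="screen">
modprobe nvidia NVreg_EnableMSI=0
</pre>
<p>To make the setting persistent, a line such as the following could
be placed in a file under /etc/modprobe.d/. The file name
nvidia-msi.conf is only an example, and some distributions also
require the initial ramdisk to be regenerated before the option takes
effect:</p>
<pre class="screen">
# /etc/modprobe.d/nvidia-msi.conf (hypothetical file name)
options nvidia NVreg_EnableMSI=0
</pre>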
|
|
</dd>
|
|
<dt><a name="console_restore" id="console_restore"></a><span class=
|
|
"term">Console restore behavior</span></dt>
|
|
<dd>
|
|
<p>The Linux NVIDIA driver uses the nvidia-modeset module for
|
|
console restore whenever it can. Currently, the improved console
|
|
restore mechanism is used on systems that boot with the UEFI
|
|
Graphics Output Protocol driver, and on systems that use supported
|
|
VESA linear graphical modes. Note that VGA text, color index,
|
|
planar, banked, and some linear modes cannot be supported, and will
|
|
use the older console restore method instead.</p>
|
|
<p>When the new console restore mechanism is in use and the
|
|
nvidia-modeset module is initialized (e.g. because an X server is
|
|
running on a different VT, nvidia-persistenced is running, or the
|
|
nvidia_drm module is loaded with the <code class=
|
|
"computeroutput">modeset=1</code> parameter), then nvidia-modeset
|
|
will respond to hot plug events by displaying the console on as
|
|
many displays as it can. Note that to save power, it may not
|
|
display the console on all connected displays.</p>
|
|
</dd>
|
|
<dt><a name="vulkan_devices" id="vulkan_devices"></a><span class=
|
|
"term">Vulkan and device enumeration</span></dt>
|
|
<dd>
|
|
<p>Starting with the X.Org X server version 1.20.7, it is possible
|
|
to enumerate all the NVIDIA devices in the system if the
|
|
application is able to open a connection to the X server. However,
|
|
such applications will only be able to create an Xlib or XCB
|
|
swapchain on the device driving the X screen. Such a device can be
|
|
identified by using the vkGetPhysicalDeviceSurfaceSupportKHR()
|
|
API.</p>
|
|
<p>Prior to the X.Org X server version 1.20.7, it is not possible
|
|
to enumerate multiple devices if one of them will be used to
|
|
present to an X11 swapchain. It is still possible to enumerate
|
|
multiple devices even if one of them is driving an X screen, if the
|
|
devices will be used for Vulkan offscreen rendering or presenting
|
|
to a display swapchain. For that, make sure that the application
|
|
cannot open a display connection to an X server by, for example,
|
|
unsetting the DISPLAY environment variable.</p>
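<p>For example, assuming an <code class="computeroutput">env</code>
utility that supports the <code class="computeroutput">-u</code>
option (such as the one in GNU coreutils), the variable can be removed
for a single invocation. The name
<code class="computeroutput">./my_vulkan_app</code> is a placeholder
for your own application:</p>
<pre class="screen">
env -u DISPLAY ./my_vulkan_app
</pre>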
|
|
</dd>
|
|
<dt><a name="profiling" id="profiling"></a><span class=
|
|
"term">Restricting access to GPU performance counters</span></dt>
|
|
<dd>
|
|
<p>NVIDIA Developer Tools allow developers to debug, profile, and
|
|
develop software for NVIDIA GPUs. GPU performance counters are
|
|
integral to these tools. By default, access to the GPU performance
|
|
counters is restricted to root, and other users with the
|
|
CAP_SYS_ADMIN capability, for security reasons. If developers
|
|
require access to the NVIDIA Developer Tools, a system
|
|
administrator can accept the security risk and allow access to
|
|
users without the CAP_SYS_ADMIN capability.</p>
|
|
<p>Wider access to GPU performance counters can be granted by
|
|
setting the kernel module parameter
|
|
"NVreg_RestrictProfilingToAdminUsers=0" in the nvidia.ko kernel
|
|
module. This can be set on the command line when loading the
|
|
module, or more appropriately via your distribution's kernel module
|
|
configuration files (such as those under /etc/modprobe.d/).</p>
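<p>A minimal sketch of such a configuration file entry follows. The
file name is only an example, and the module name
<code class="computeroutput">nvidia</code> is an assumption that may
differ on distributions that rename the module:</p>
<pre class="screen">
# /etc/modprobe.d/nvidia-profiling.conf (hypothetical file name)
options nvidia NVreg_RestrictProfilingToAdminUsers=0
</pre>
<p>The setting takes effect the next time the module is loaded.</p>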
|
|
</dd>
|
|
<dt><a name="RedHat" id="RedHat"></a><span class="term">Driver
|
|
fails to initialize with some versions of RHEL 8</span></dt>
|
|
<dd>
|
|
<p>Some versions of Red Hat Enterprise Linux 8 kernels have a bug
|
|
that causes driver initialization to fail with an error such
|
|
as:</p>
|
|
<pre class="screen">
NVRM: Xid (PCI:0000:09:00): 79, pid=2172, GPU has fallen off the bus.
NVRM: GPU 0000:09:00.0: GPU has fallen off the bus.
NVRM: GPU 0000:09:00.0: RmInitAdapter failed! (0x26:0x65:1239)
NVRM: GPU 0000:09:00.0: rm_init_adapter failed, device minor number 0
</pre>
|
|
<p></p>
|
|
<p>See the Red Hat knowledge base article <a href=
|
|
"https://access.redhat.com/solutions/5825061" target=
|
|
"_top">https://access.redhat.com/solutions/5825061</a> to find the
|
|
specific affected and fixed kernel versions.</p>
|
|
</dd>
|
|
<dt><a name="IBT" id="IBT"></a><span class="term">Driver fails to
|
|
load on Linux kernel versions 5.18 through 5.18.19 with
|
|
CONFIG_X86_KERNEL_IBT enabled</span></dt>
|
|
<dd>
|
|
<p>The NVIDIA driver fails to load on CPUs that support Indirect
Branch Tracking (IBT) when running Linux kernel versions 5.18 through
5.18.19 with IBT enabled, failing with the following error:</p>
|
|
<pre class="screen">
error "traps: Missing ENDBR:"
</pre>
|
|
<p></p>
|
|
<p>This issue is not seen with Linux kernels having the following
|
|
commit:</p>
|
|
<pre class="screen">
commit 3c6f9f77e618 (objtool: Rework ibt and extricate from stack validation)
</pre>
|
|
<p>The aforementioned commit is available in Linux kernel versions
5.19 and later, and the NVIDIA driver's IBT support works with
kernels that contain it. On kernels without that commit, use the
kernel boot parameter "ibt=off" as a workaround.</p>
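<p>How the boot parameter is set depends on the boot loader. As a
sketch for systems that use GRUB 2 with an /etc/default/grub file (an
assumption; other boot loaders are configured differently), the
parameter could be added to the kernel command line variable alongside
any options that are already present:</p>
<pre class="screen">
# /etc/default/grub: keep existing options and add ibt=off
GRUB_CMDLINE_LINUX="ibt=off"
</pre>
<p>After regenerating the GRUB configuration with your distribution's
tool and rebooting, the active kernel command line can be checked
with:</p>
<pre class="screen">
grep -o 'ibt=off' /proc/cmdline
</pre>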
|
|
</dd>
|
|
<dt><span class="term">Notebooks</span></dt>
|
|
<dd>
|
|
<p>If you are using a notebook, see the "Known Notebook Issues" in
|
|
<a href="configlaptop.html" title=
|
|
"Chapter 16. Configuring a Notebook">Chapter 16,
|
|
<i>Configuring a Notebook</i></a>.</p>
|
|
</dd>
|
|
<dt><a name="texture_clamping" id=
|
|
"texture_clamping"></a><span class="term">Texture seams in Quake 3
|
|
engine</span></dt>
|
|
<dd>
|
|
<p>Many games based on the Quake 3 engine set their textures to use
|
|
the <code class="computeroutput">GL_CLAMP</code> clamping mode when
|
|
they should be using <code class=
|
|
"computeroutput">GL_CLAMP_TO_EDGE</code>. This was an oversight
|
|
made by the developers because some legacy NVIDIA GPUs treat the
|
|
two modes as equivalent. The result is seams at the edges of
|
|
textures in these games. To mitigate this, older versions of the
|
|
NVIDIA display driver remap <code class=
|
|
"computeroutput">GL_CLAMP</code> to <code class=
|
|
"computeroutput">GL_CLAMP_TO_EDGE</code> internally to emulate the
|
|
behavior of the older GPUs, but this workaround has been disabled
|
|
by default. To re-enable it, uncheck the "Use Conformant Texture
|
|
Clamping" checkbox in nvidia-settings before starting any affected
|
|
applications.</p>
|
|
</dd>
|
|
<dt><span class="term">FSAA</span></dt>
|
|
<dd>
|
|
<p>When FSAA is enabled (the __GL_FSAA_MODE environment variable is
|
|
set to a value that enables FSAA and a multisample visual is
|
|
chosen), the rendering may be corrupted when resizing the
|
|
window.</p>
|
|
</dd>
|
|
<dt><span class="term">libGL DSO finalizer and pthreads</span></dt>
|
|
<dd>
|
|
<p>When a multithreaded OpenGL application exits, it is possible
|
|
for libGL's DSO finalizer (also known as the destructor, or
|
|
"_fini") to be called while other threads are executing OpenGL
|
|
code. The finalizer needs to free resources allocated by libGL.
|
|
This can cause problems for threads that are still using these
|
|
resources. Setting the environment variable "__GL_NO_DSO_FINALIZER"
|
|
to "1" will work around this problem by forcing libGL's finalizer
|
|
to leave its resources in place. These resources will still be
|
|
reclaimed by the operating system when the process exits. Note that
|
|
the finalizer is also executed as part of dlclose(3), so if you
|
|
have an application that dlopens(3) and dlcloses(3) libGL
|
|
repeatedly, "__GL_NO_DSO_FINALIZER" will cause libGL to leak
|
|
resources until the process exits. Using this option can improve
|
|
stability in some multithreaded applications, including Java3D
|
|
applications.</p>
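<p>For example, the variable can be set for a single run of an
application from the shell. The name
<code class="computeroutput">./my_gl_app</code> is a placeholder for
your own multithreaded OpenGL application:</p>
<pre class="screen">
__GL_NO_DSO_FINALIZER=1 ./my_gl_app
</pre>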
|
|
</dd>
|
|
<dt><span class="term">Thread cancellation</span></dt>
|
|
<dd>
|
|
<p>Canceling a thread (see pthread_cancel(3)) while it is executing
|
|
in the OpenGL driver causes undefined behavior. For applications
|
|
that wish to use thread cancellation, it is recommended that
|
|
threads disable cancellation using pthread_setcancelstate(3) while
|
|
executing OpenGL or GLX commands.</p>
|
|
</dd>
|
|
</dl>
|
|
</div>
|
|
<p>The following problems will not be fixed. Usually, the source of
the problem is beyond the control of NVIDIA.</p>
|
|
<div class="variablelist">
|
|
<p class="title"><b>Problems that Will Not Be Fixed</b></p>
|
|
<dl>
|
|
<dt><span class="term">NV-CONTROL versions 1.8 and 1.9</span></dt>
|
|
<dd>
|
|
<p>Version 1.8 of the NV-CONTROL X Extension introduced target
|
|
types for setting and querying attributes as well as receiving
|
|
event notification on targets. Targets are objects like X Screens,
|
|
GPUs and Quadro Sync devices. Previously, all attributes were
|
|
described relative to an X Screen. These new bits of information
|
|
(target type and target id) were packed in a non-compatible way in
|
|
the protocol stream such that addressing X Screen 1 or higher would
|
|
generate an X protocol error when mixing NV-CONTROL client and
|
|
server versions.</p>
|
|
<p>This packing problem has been fixed in the NV-CONTROL 1.10
|
|
protocol, making it possible for the older (1.7 and prior) clients
|
|
to communicate with NV-CONTROL 1.10 servers. Furthermore, the
|
|
NV-CONTROL 1.10 client library has been updated to accommodate the
|
|
target protocol packing bug when communicating with a 1.8 or 1.9
|
|
NV-CONTROL server. This means that the NV-CONTROL 1.10 client
|
|
library should be able to communicate with any version of the
|
|
NV-CONTROL server.</p>
|
|
<p>NVIDIA recommends that NV-CONTROL client applications relink
|
|
with version 1.10 or later of the NV-CONTROL client library
|
|
(libXNVCtrl.a, in the nvidia-settings-535.161.07.tar.bz2 tarball).
|
|
The version of the client library can be determined by checking the
|
|
NV_CONTROL_MAJOR and NV_CONTROL_MINOR definitions in the
|
|
accompanying nv_control.h.</p>
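<p>One way to check these definitions, assuming nv_control.h is in the
current directory (adjust the path to wherever the header was
unpacked), is:</p>
<pre class="screen">
grep -E 'NV_CONTROL_(MAJOR|MINOR)' nv_control.h
</pre>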
|
|
<p>The only web-released NVIDIA Linux driver that is affected by
|
|
this problem (i.e., the only driver to use either version 1.8 or
|
|
1.9 of the NV-CONTROL X extension) is 1.0-8756.</p>
|
|
</dd>
|
|
<dt><span class="term">CPU throttling reducing memory bandwidth on
|
|
IGP systems</span></dt>
|
|
<dd>
|
|
<p>For some models of CPU, the CPU throttling technology may affect
|
|
not only CPU core frequency, but also memory frequency/bandwidth.
|
|
On systems using integrated graphics, any reduction in memory
|
|
bandwidth will affect the GPU as well as the CPU. This can
|
|
negatively affect applications that use significant memory
|
|
bandwidth, such as video decoding using VDPAU, or certain OpenGL
|
|
operations. This may cause such applications to run with lower
|
|
performance than desired.</p>
|
|
<p>To work around this problem, NVIDIA recommends configuring your
|
|
CPU throttling implementation to avoid reducing memory bandwidth.
|
|
This may be as simple as setting a certain minimum frequency for
|
|
the CPU.</p>
|
|
<p>Depending on your operating system and distribution, this can
often be done by writing to a file in the /sys or /proc filesystems,
or by editing another system configuration file. Consult the
documentation on CPU throttling for your operating system.</p>
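<p>As a sketch, on Linux systems that expose the cpufreq interface in
sysfs, the minimum frequency of one CPU could be inspected and raised
as root. The value 1600000 (in kHz) is purely illustrative, and
whether a higher minimum frequency prevents the reduction in memory
bandwidth depends on the CPU model:</p>
<pre class="screen">
# Show the current minimum and the hardware limits for cpu0, in kHz
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq

# Raise the minimum frequency for cpu0 (illustrative value)
echo 1600000 &gt; /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
</pre>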
|
|
</dd>
|
|
<dt><span class="term">VDPAU initialization failures on supported
|
|
GPUs</span></dt>
|
|
<dd>
|
|
<p>If VDPAU gives the VDP_STATUS_NO_IMPLEMENTATION error message on
|
|
a GPU which was labeled or specified as supporting PureVideo or
|
|
PureVideo HD, one possible reason is a hardware defect. After
|
|
ruling out any other software problems, NVIDIA recommends returning
|
|
the GPU to the manufacturer for a replacement.</p>
|
|
</dd>
|
|
<dt><a name="extension_string_size" id=
|
|
"extension_string_size"></a><span class="term">Some applications,
|
|
such as Quake 3, crash after querying the OpenGL extension
|
|
string</span></dt>
|
|
<dd>
|
|
<p>Some applications have bugs that are triggered when the
|
|
extension string is longer than a certain size. As more features
|
|
are added to the driver, the length of this string increases and
|
|
can trigger these sorts of bugs.</p>
|
|
<p>You can limit the extensions listed in the OpenGL extension
|
|
string to the ones that appeared in a particular version of the
|
|
driver by setting the <code class=
|
|
"computeroutput">__GL_ExtensionStringVersion</code> environment
|
|
variable to a particular version number. For example,</p>
|
|
<pre class="screen">
__GL_ExtensionStringVersion=17700 quake3
</pre>
|
|
<p>will run Quake 3 with the extension string that appeared in the
|
|
177.* driver series. Limiting the size of the extension string can
|
|
work around this sort of application bug.</p>
|
|
</dd>
|
|
<dt><a name="gnome_shell_doesnt_update" id=
|
|
"gnome_shell_doesnt_update"></a><span class="term">gnome-shell
|
|
doesn't update until a window is moved</span></dt>
|
|
<dd>
|
|
<p>Versions of libcogl prior to 1.10.x have a bug which causes
|
|
glBlitFramebuffer() calls used to update the window to be clipped
|
|
by a 0x0 scissor (see <a href=
|
|
"https://bugzilla.gnome.org/show_bug.cgi?id=690451" target=
|
|
"_top">GNOME bug #690451</a> for more details). To work around this
|
|
bug, the scissor test can be disabled by setting the <code class=
|
|
"computeroutput">__GL_ConformantBlitFramebufferScissor</code>
|
|
environment variable to 0. Note that this version of the NVIDIA driver
|
|
comes with an application profile which automatically disables this
|
|
test if libcogl is detected in the process.</p>
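<p>For example, if the application profile is not in effect, the
variable could be set manually in the environment from which
gnome-shell is started:</p>
<pre class="screen">
__GL_ConformantBlitFramebufferScissor=0 gnome-shell
</pre>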
|
|
</dd>
|
|
<dt><a name="Xserver_compares_only_the_matrix_part_of_a_transform"
|
|
id=
|
|
"Xserver_compares_only_the_matrix_part_of_a_transform"></a><span class="term">Some
|
|
X servers ignore the RandR transform filter during a modeset
|
|
request</span></dt>
|
|
<dd>
|
|
<p>The RandR layer of the X server attempts to ignore redundant
|
|
RRSetCrtcConfig requests. If the only property changed by an
|
|
RRSetCrtcConfig request is the transform filter, some X servers
|
|
will ignore the request as redundant. This can be worked around by
|
|
also changing other properties, such as the mode, transformation
|
|
matrix, etc.</p>
|
|
</dd>
|
|
</dl>
|
|
</div>
|
|
<p></p>
|
|
</div>
|
|
<div class="navfooter">
|
|
<hr>
|
|
<table width="100%" summary="Navigation footer">
|
|
<tr>
|
|
<td width="40%" align="left"><a accesskey="p" href=
|
|
"commonproblems.html">Prev</a> </td>
|
|
<td width="20%" align="center"><a accesskey="u" href=
|
|
"installationandconfiguration.html">Up</a></td>
|
|
<td width="40%" align="right"> <a accesskey="n" href=
|
|
"dma_issues.html">Next</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td width="40%" align="left" valign="top">
|
|
Chapter 8. Common Problems </td>
|
|
<td width="20%" align="center"><a accesskey="h" href=
|
|
"index.html">Home</a></td>
|
|
<td width="40%" align="right" valign="top">
|
|
Chapter 10. Allocating DMA Buffers on 64-bit
|
|
Platforms</td>
|
|
</tr>
|
|
</table>
|
|
</div>
|
|
</body>
|
|
</html>
|