Chapter 8. Common Problems

This section provides solutions to common problems associated with the NVIDIA Linux x86 Driver.

My X server fails to start, and my X log file contains the error:

(EE) NVIDIA(0): The NVIDIA kernel module does not appear to
(EE) NVIDIA(0):      be receiving interrupts generated by the NVIDIA graphics
(EE) NVIDIA(0):      device PCI:x:x:x. Please see the COMMON PROBLEMS
(EE) NVIDIA(0):      section in the README for additional information.

This can be caused by a variety of problems, such as PCI IRQ routing errors, I/O APIC problems, conflicts with other devices sharing the IRQ (or their drivers), or MSI compatibility problems.

If possible, configure your system such that your graphics card does not share its IRQ with other devices (try moving the graphics card to another slot if applicable, unload/disable the driver(s) for the device(s) sharing the card's IRQ, or remove/disable the device(s)).

Depending on the nature of the problem, one of (or a combination of) these kernel parameters might also help:

Parameter Behavior
pci=noacpi don't use ACPI for PCI IRQ routing
pci=biosirq use PCI BIOS calls to retrieve the IRQ routing table
noapic don't use I/O APICs present in the system
acpi=off disable ACPI

The problem may also be caused by MSI compatibility problems. See MSI Interrupts for details.

My X server fails to start, and my X log file contains the error:

(EE) NVIDIA(0): The interrupt for NVIDIA graphics device PCI:x:x:x
(EE) NVIDIA(0):      appears to be edge-triggered. Please see the COMMON
(EE) NVIDIA(0):      PROBLEMS section in the README for additional information.

An edge-triggered interrupt means that the kernel has programmed the interrupt as edge-triggered rather than level-triggered in the Advanced Programmable Interrupt Controller (APIC). Edge-triggered interrupts are not intended to be used for sharing an interrupt line between multiple devices; level-triggered interrupts are the intended trigger for such usage. When using edge-triggered interrupts, it is common for device drivers using that interrupt line to stop receiving interrupts. This would appear to the end user as those devices no longer working, and potentially as a full system hang. These problems tend to be more common when multiple devices are sharing that interrupt line.

This occurs when ACPI is not used to program interrupt routing in the APIC. It may also occur when ACPI is disabled, or fails to initialize. In these cases, the Linux kernel falls back to tables provided by the system BIOS. In some cases the system BIOS assumes ACPI will be used for routing interrupts and configures these tables to incorrectly label all interrupts as edge-triggered. The current interrupt configuration can be found in /proc/interrupts.

Available workarounds include: updating to a newer system BIOS, a more recent Linux kernel with ACPI enabled, or passing the 'noapic' option to the kernel to force interrupt routing through the traditional Programmable Interrupt Controller (PIC). The Linux kernel also provides an interrupt polling mechanism you can use to attempt to work around this problem. This mechanism can be enabled by passing the 'irqpoll' option to the kernel.

Currently, the NVIDIA driver will attempt to detect edge triggered interrupts and X will purposely fail to start (to avoid stability issues). This behavior can be overridden by setting the "NVreg_RMEdgeIntrCheck" NVIDIA Linux kernel module parameter. This parameter defaults to "1", which enables the edge triggered interrupt detection. Set this parameter to "0" to disable this detection.

X starts for me, but OpenGL applications terminate immediately.

If X starts but you have trouble with OpenGL, you most likely have a problem with other libraries in the way, or there are stale symlinks. See Chapter 5, Listing of Installed Components for details. Sometimes, all it takes is to rerun ldconfig.

You should also check that the correct extensions are present;

    % xdpyinfo

should show the “GLX” and “NV-GLX” extensions present. If these two extensions are not present, then there is most likely a problem loading the glx module, or it is unable to implicitly load GLcore. Check your X config file and make sure that you are loading glx (see Chapter 6, Configuring X for the NVIDIA Driver). If your X config file is correct, then check the X log file for warnings/errors pertaining to GLX. Also check that all of the necessary symlinks are in place (refer to Chapter 5, Listing of Installed Components).

When Xinerama is enabled, my stereo glasses are shuttering only when the stereo application is displayed on one specific X screen. When the application is displayed on the other X screens, the stereo glasses stop shuttering.

This problem occurs with DDC and "blue line" stereo glasses, that get the stereo signal from one video port of the graphics card. When a X screen does not display any stereo drawable the stereo signal is disabled on the associated video port.

Forcing stereo flipping allows the stereo glasses to shutter continuously. This can be done by enabling the OpenGL control "Force Stereo Flipping" in nvidia-settings, or by setting the X configuration option "ForceStereoFlipping" to "1".

Stereo is not in sync across multiple displays.

There are two cases where this may occur. If the displays are attached to the same GPU, and one of them is out of sync with the stereo glasses, you will need to reconfigure your monitors to drive identical mode timings; see Chapter 18, Programming Modes for details.

If the displays are attached to different GPUs, the only way to synchronize stereo across the displays is with a Quadro Sync device, which is only supported by certain Quadro cards. See Chapter 29, Configuring Frame Lock and Genlock for details.

I just upgraded my kernel, and now the NVIDIA kernel module will not load.

The kernel interface layer of the NVIDIA kernel module must be compiled specifically for the configuration and version of your kernel. If you upgrade your kernel, then the simplest solution is to reinstall the driver.

ADVANCED: You can install the NVIDIA kernel module for a non running kernel (for example: in the situation where you just built and installed a new kernel, but have not rebooted yet) with a command line such as this:

    # sh NVIDIA-Linux-x86-352.79.run --kernel-name='KERNEL_NAME'

Where 'KERNEL_NAME' is what uname -r would report if the target kernel were running.

Installing the driver fails with:

Unable to load the kernel module 'nvidia.ko'.

My X server fails to start, and my X log file contains the error:

(EE) NVIDIA(0): Failed to load the NVIDIA kernel module!

nvidia-installer attempts to load the NVIDIA kernel module before installing the driver, and will abort if this test load fails. Similarly, if the kernel module fails to load when starting the an X server with the NVIDIA X driver, the X server will fail to start.

If the NVIDIA kernel module fails to load, you should check the output of dmesg for kernel error messages and/or attempt to load the kernel module explicitly with modprobe nvidia. There are a number of common failure cases:

  • Some symbols that the kernel module depends on failed to be resolved. If this happens, then the kernel module was most likely built against a Linux kernel source tree (or kernel headers) for a kernel revision or configuration that doesn't match the running kernel.

    In some cases, the NVIDIA kernel module may fail to resolve symbols due to those symbols being provided by modules that were built as part of the configuration of the currently running kernel, but which are not installed. For example, some distributions, such as Ubuntu 14.04, provide the DRM kernel module in an optionally installed package (in the case of Ubuntu 14.04, linux-image-extra), but the kernel headers will reflect the availability of DRM regardless of whether the module that provides it is actually installed. The NVIDIA kernel module build will detect the availability of DRM when building, but will fail at load time with messages such as:

    nvidia: Unknown symbol drm_open (err 0)
    

    If the NVIDIA kernel module fails to load due to unresolved symbols, and you are certain that it was built against the correct kernel source tree (or headers), check to see if there are any optionally installable modules that might provide these symbols which are not currently installed. If you believe that you might not be using the correct kernel sources/headers, you can specify their location when you install the NVIDIA driver using the --kernel-source-path command line option (see sh NVIDIA-Linux-x86-352.79.run --advanced-options for details).

  • Nouveau, or another driver, is already using the GPU. See Q & A 8.1, “Interaction with the Nouveau Driver” for more information on Nouveau and how to disable it.

  • The kernel requires that kernel modules carry a valid signature from a trusted key, and the NVIDIA kernel module is unsigned, or has an invalid or untrusted signature. This may happen, for example, on some systems with UEFI Secure Boot enabled. See the section called “Signing the NVIDIA Kernel Module” for more information about signing the kernel module.

  • No supported GPU is detected, either because no NVIDIA GPUs are detected in the system, or because none of the NVIDIA GPUs which are present are supported by this version of the NVIDIA kernel module. See Appendix A, Supported NVIDIA GPU Products for information on which GPUs are supported by which driver versions.

Installing the NVIDIA kernel module gives an error message like:

#error Modules should never use kernel-headers system headers
#error but headers from an appropriate kernel-source

You need to install the source for the Linux kernel. In most situations you can fix this problem by installing the kernel-source or kernel-devel package for your distribution

OpenGL applications crash and print out the following warning:

WARNING: Your system is running with a buggy dynamic loader.
This may cause crashes in certain applications.  If you
experience crashes you can try setting the environment
variable __GL_SINGLE_THREADED to 1.  For more information,
consult the FREQUENTLY ASKED QUESTIONS section in
the file /usr/share/doc/NVIDIA_GLX-1.0/README.txt.

The dynamic loader on your system has a bug which will cause applications linked with pthreads, and that dlopen() libGL multiple times, to crash. This bug is present in older versions of the dynamic loader. Distributions that shipped with this loader include but are not limited to Red Hat Linux 6.2 and Mandrake Linux 7.1. Version 2.2 and later of the dynamic loader are known to work properly. If the crashing application is single threaded then setting the environment variable __GL_SINGLE_THREADED to 1 will prevent the crash. In the bash shell you would enter:

    % export __GL_SINGLE_THREADED=1

and in csh and derivatives use:

    % setenv __GL_SINGLE_THREADED 1

Previous releases of the NVIDIA Accelerated Linux Graphics Driver attempted to work around this problem. Unfortunately, the workaround caused problems with other applications and was removed after version 1.0-1541.

Quake3 crashes when changing video modes.

You are probably experiencing a problem described above. Please check the text output for the “WARNING” message described in the previous hint. Setting __GL_SINGLE_THREADED to 1 as will fix the problem.

I cannot build the NVIDIA kernel module, or, I can build the NVIDIA kernel module, but modprobe/insmod fails to load the module into my kernel.

These problems are generally caused by the build using the wrong kernel header files (i.e. header files for a different kernel version than the one you are running). The convention used to be that kernel header files should be stored in /usr/include/linux/, but that is deprecated in favor of /lib/modules/RELEASE/build/include (where RELEASE is the result of uname -r. The nvidia-installer should be able to determine the location on your system; however, if you encounter a problem you can force the build to use certain header files by using the --kernel-include-dir option. For this to work you will of course need the appropriate kernel header files installed on your system. Consult the documentation that came with your distribution; some distributions do not install the kernel header files by default, or they install headers that do not coincide properly with the kernel you are running.

Compiling the NVIDIA kernel module gives this error:

You appear to be compiling the NVIDIA kernel module with
a compiler different from the one that was used to compile
the running kernel. This may be perfectly fine, but there
are cases where this can lead to unexpected behavior and
system crashes.

If you know what you are doing and want to override this
check, you can do so by setting IGNORE_CC_MISMATCH.

In any other case, set the CC environment variable to the
name of the compiler that was used to compile the kernel.

You should compile the NVIDIA kernel module with the same compiler version that was used to compile your kernel. Some Linux kernel data structures are dependent on the version of gcc used to compile it; for example, in include/linux/spinlock.h:

        ...
        * Most gcc versions have a nasty bug with empty initializers.
        */
        #if (__GNUC__ > 2)
          typedef struct { } rwlock_t;
          #define RW_LOCK_UNLOCKED (rwlock_t) { }
        #else
          typedef struct { int gcc_is_buggy; } rwlock_t;
          #define RW_LOCK_UNLOCKED (rwlock_t) { 0 }
        #endif

If the kernel is compiled with gcc 2.x, but gcc 3.x is used when the kernel interface is compiled (or vice versa), the size of rwlock_t will vary, and things like ioremap will fail. To check what version of gcc was used to compile your kernel, you can examine the output of:

    % cat /proc/version

To check what version of gcc is currently in your $PATH, you can examine the output of:

    % gcc -v

I have installed the NVIDIA driver with the --multiple-kernel-modules option. I then tried to run nvidia-smi, which resulted in the following error message.

FATAL: Module nvidia not found.
Failed to initialize NVML: Unknown Error

When multiple kernel modules are used, applications that use the NVIDIA GPU, such as nvidia-smi, must be directed to a specific module instance with the __NVIDIA_KERNEL_MODULE_INSTANCE environment variable in order to access the GPUs that are attached to that instance. This type of failure may be seen when running on a system with multiple NVIDIA kernel modules without setting __NVIDIA_KERNEL_MODULE_INSTANCE, and should be resolved by setting __NVIDIA_KERNEL_MODULE_INSTANCE to one of the installed module instance numbers.

    % __NVIDIA_KERNEL_MODULE_INSTANCE=0 nvidia-smi

X fails with error

Failed to allocate LUT context DMA

This is one of the possible consequences of compiling the NVIDIA kernel interface with a different gcc version than used to compile the Linux kernel (see above).

I recently updated various libraries on my system using my Linux distributor's update utility, and the NVIDIA graphics driver no longer works.

Conflicting libraries may have been installed by your distribution's update utility; see Chapter 5, Listing of Installed Components for details on how to diagnose this.

I have rebuilt the NVIDIA kernel module, but when I try to insert it, I get a message telling me I have unresolved symbols.

Unresolved symbols are most often caused by a mismatch between your kernel sources and your running kernel. They must match for the NVIDIA kernel module to build correctly. Make sure your kernel sources are installed and configured to match your running kernel.

OpenGL applications leak significant amounts of memory on my system!

If your kernel is making use of the -rmap VM, the system may be leaking memory due to a memory management optimization introduced in -rmap14a. The -rmap VM has been adopted by several popular distributions, the memory leak is known to be present in some of the distribution kernels; it has been fixed in -rmap15e.

If you suspect that your system is affected, try upgrading your kernel or contact your distribution's vendor for assistance.

Some OpenGL applications (like Quake3 Arena) crash when I start them on Red Hat Linux 9.0.

Some versions of the glibc package shipped by Red Hat that support TLS do not properly handle using dlopen() to access shared libraries which use some TLS models. This problem is exhibited, for example, when Quake3 Area dlopen()'s NVIDIA's libGL library. Please obtain at least glibc-2.3.2-11.9 which is available as an update from Red Hat.

When changing settings in games like Quake 3 Arena, or Wolfenstein Enemy Territory, the game crashes and I see this error:

...loading libGL.so.1: QGL_Init: dlopen libGL.so.1 failed: 
/usr/lib/tls/libGL.so.1: shared object cannot be dlopen()ed:
static TLS memory too small

These games close and reopen the NVIDIA OpenGL driver (via dlopen()/dlclose()) when settings are changed. On some versions of glibc (such as the one shipped with Red Hat Linux 9), there is a bug that leaks static TLS entries. This glibc bug causes subsequent re-loadings of the OpenGL driver to fail. This is fixed in more recent versions of glibc; see Red Hat bug #89692: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=89692

When I try to install the driver, the installer claims that X is running, even though I have exited X.

The installer detects the presence of an X server by checking for the X server's lock files: /tmp/.Xn-lock, where 'n' is the number of the X Display (the installer checks for X Displays 0-7). If you have exited X, but one of these files has been left behind, then you will need to manually delete the lock file. Do not remove this file if X is still running!

Why does the VBIOS fail to load on my Optimus system?

On some notebooks with Optimus graphics, the NVIDIA driver may not be able to retrieve the Video BIOS due to interactions between the System BIOS and the Linux kernel's ACPI subsystem. On affected notebooks, applications that require the GPU will fail, and messages like the following may appear in the system log:

NVRM: failed to copy vbios to system memory.
NVRM: RmInitAdapter failed! (0x30:0xffffffff:858)
NVRM: rm_init_adapter(0) failed

Such problems are typically beyond the control of the NVIDIA driver, which relies on proper cooperation of ACPI and the System BIOS to retrieve important information about the GPU, including the Video BIOS.

OpenGL applications are running slowly

The application is probably using a different library that still remains on your system, rather than the NVIDIA supplied OpenGL library. See Chapter 5, Listing of Installed Components for details.

X takes a long time to start (possibly several minutes).

Most of the X startup delay problems we have found are caused by incorrect data in video BIOSes about what display devices are possibly connected or what i2c port should be used for detection. You can work around these problems with the X config option IgnoreDisplayDevices.

Fonts are incorrectly sized after installing the NVIDIA driver.

Incorrectly sized fonts are generally caused by incorrect DPI (Dots Per Inch) information. You can check what X thinks the physical size of your monitor is, by running:

 % xdpyinfo | grep dimensions

This will report the size in pixels, and in millimeters.

If these numbers are wrong, you can correct them by modifying the X server's DPI setting. See Appendix E, Dots Per Inch for details.

OpenGL applications don't work, and my X log file contains the error:

(EE) NVIDIA(0): Unable to map device node /dev/zero with read and write
(EE) NVIDIA(0):     privileges.  The GLX extension will be disabled on this 
(EE) NVIDIA(0):     X screen.  Please see the COMMON PROBLEMS section in the 
(EE) NVIDIA(0):     README for more information.

The NVIDIA OpenGL driver must be able to map anonymous memory with read and write execute privileges in order to function correctly. The driver needs this ability to allocate aligned memory, which is used for certain optimizations. Currently, GLX cannot run without these optimizations.

X doesn't start, and my log file contains a message like the following:

(EE) NVIDIA(0): Failed to allocate primary buffer: failed to set CPU access
(EE) NVIDIA(0):     for surface.  Please see Chapter 8: Common Problems in
(EE) NVIDIA(0):     the README for troubleshooting suggestions.

The NVIDIA X driver needs to be able to access the buffers it allocates from the CPU, but wasn't able to set up this access. This commonly fails if you're using a large virtual desktop size. Although your GPU may have enough onboard video memory for the buffer, the amount of usable memory may be limited if the IndirectMemoryAccess option is disabled, or if not enough address space was reserved for indirect memory access (this commonly occurs on 32-bit systems). If you're seeing this problem and are using a 32-bit operating system, it may be resolved by switching to a 64-bit operating system.

My log file contains a message like the following:

(WW) NVIDIA(GPU-0): Unable to enter interactive mode, because non-interactive
(WW) NVIDIA(GPU-0): mode has been previously requested.  The most common
(WW) NVIDIA(GPU-0): cause is that a GPU compute application is currently
(WW) NVIDIA(GPU-0): running. Please see the README for details.

This indicates that the X driver was not able to put the GPU in interactive mode, because another program has requested non-interactive mode. The GPU watchdog will not run, and long-running GPU compute programs may cause the X server and OpenGL programs to hang. If you intend to run long-running GPU compute programs, set the Interactive option to "off" to disable interactive mode.

I see a blank screen or an error message instead of a login screen or desktop session

Installation or configuration problems may prevent the X server, a login/session manager, or a desktop environment from starting correctly. If your system is failing to display a login screen, or failing to start a desktop session, try the following troubleshooting steps:

  • Make sure that you are using the correct X driver for your configuration. Recent X servers will be able to automatically select the correct X driver in many cases, but if your X server does not automatically select the correct driver, you may need to manually configure it. For example, systems with multiple GPUs will likely require a PCI BusID in the "Device" section of the X configuration file, in order to specify which GPU is to be used.

    If you are planning to use NVIDIA GPUs for graphics, you can run the nvidia-xconfig utility to automatically generate a simple X configuration file that uses the NVIDIA X driver. If you are not using NVIDIA GPUs for graphics (e.g. on a server system where displays are driven by an onboard graphics controller, and NVIDIA GPUs are used for non-graphical computational purposes only), do not run nvidia-xconfig.

  • Some recent desktop environments (e.g. GNOME 3, Unity), window managers (e.g. mutter, compiz), and session managers (e.g. gdm3) require a working OpenGL driver in order to function correctly. In addition to making sure that the X server is configured to use the correct X driver for your configuration, please ensure that you are using the correct OpenGL driver to match your X driver.

    If you are not using NVIDIA GPUs for graphical purposes, try installing the driver with the --no-opengl-files option on the installer's command line to prevent the installer from overwriting any existing OpenGL installation, which may be needed for proper OpenGL functionality on whichever graphics controller is to be used on the system.

  • Desktop environments, window managers, and session managers that require OpenGL typically also require the X Composite extension. If you have disabled the Composite extension, either explicitly, or by enabling a feature that is not compatible with it, try re-enabling the extension (possibly by disabling any incompatible features). If you are unable to satisfy your desired use case with the Composite extension enabled, try switching to a different desktop environment, window manager, and/or session manager that does not require Composite.

  • Check the X log (e.g. /var/log/Xorg.0.log) for additional errors not covered above. Warning or error messages in the log may highlight a specific problem that can be fixed with a configuration adjustment.

The display settings I configured in nvidia-settings do not persist.

Depending on the type of configuration being performed, nvidia-settings will save configuration changes to one of several places:

  • Static X server configuration changes are saved to the X configuration file (e.g. /etc/X11/xorg.conf). These settings are loaded by the X server when it starts, and cannot be changed without restarting X.

  • Dynamic, user-specific configuration changes are saved to ~/.nvidia-settings-rc. nvidia-settings loads this file and applies any settings contained within. These settings can be changed without restarting the X server, and can typically be configured through the nvidia-settings command line interface as well, or via the RandR and/or NV-CONTROL APIs.

  • User-specific application profiles edited in nvidia-settings are saved to ~/.nv/nvidia-application-profiles-rc. This file is loaded along with the other files in the application profile search path by the NVIDIA OpenGL driver when it is loaded by an OpenGL application. The driver evaluates the application profiles to determine which settings apply to the application. Changes made to this configuration file while an application is already running will be applied when the application is next restarted. See Appendix J, Application Profiles for more information about application profiles.

Settings in ~/.nvidia-settings-rc only take effect when processed by nvidia-settings, and therefore will not be loaded by default when starting a new X session. To load settings from ~/.nvidia-settings-rc without actually opening the nvidia-settings control panel, use the --load-config-only option on the nvidia-settings command line. nvidia-settings --load-config-only can be added to your login scripts to ensure that your settings are restored when starting a new desktop session.

Even after nvidia-settings has been run to restore any settings set in ~/.nvidia-settings-rc, some desktop environments (e.g. GNOME, KDE, Unity, Xfce) include advanced display configuration tools that may override settings that were configured via nvidia-settings. These tools may attempt to restore their own display configuration when starting a new desktop session, or when events such as display hotplugs, resolution changes, or VT switches occur.

These tools may also override some types of settings that are stored in and loaded from the X configuration file, such as any MetaMode strings that may specify the initial display layouts of NVIDIA X screens. Although the configuration of the initial MetaMode is static, it is possible to dynamically switch to a different MetaMode after X has started. This can have the effect of making the set of active displays, their resolutions, and layout positions as configured in the nvidia-settings control panel appear to be ineffective, when in reality, this configuration was active when starting X and then overridden later by the desktop environment.

If you believe that your desktop environment is overriding settings that you configured in nvidia-settings, some possible solutions are:

  • Use the display configuration tools provided as part of the desktop environment (e.g. gnome-control-center display, gnome-display-properties, kcmshell4 display, unity-control-center display, xfce4-display-settings) to configure your displays, instead of the nvidia-settings control panel or the xrandr command line tool. Setting your desired configuration using the desktop environment's tools should cause that configuration to be the one which is restored when the desktop environment overrides the existing configuration from nvidia-settings. If you are not sure which tools your desktop environment uses for display configuration, you may be able to discover them by navigating any available system menus for "Display" or "Monitor" control panels.

  • For settings loaded from ~/.nvidia-settings-rc which have been overridden, run nvidia-settings --load-config-only as needed to reload the settings from ~/.nvidia-settings-rc.

  • Disable any features your desktop environment may have for managing displays. (Note: this may disable other features, such as display configuration tools that are integrated into the desktop.)

  • Use a different desktop environment which does not actively manage display configuration, or do not use any desktop environment at all.

Some systems may have multiple different display configuration utilities, each with its own way of managing settings. In addition to conflicting with nvidia-settings, such tools may conflict with each other. If your system uses more than one tool for configuring displays, make sure to check the configuration of each tool when attempting to determine the source of any unexpected display settings.

My displays are reconfigured in unexpected ways when I plug in or unplug a display, or power a display off and then power it on again.

This is a special case of the issues described in “The display settings I configured in nvidia-settings do not persist.”. Some desktop environments which include advanced display configuration tools will automatically configure the display layout in response to detected configuration changes. For example, when a new display is plugged in, such a desktop environment may attempt to restore the previous layout that was used with the set of currently connected displays, or may configure a default layout based upon its own policy.

On X servers with support for RandR 1.2 or later, the NVIDIA X driver reports display hotplug events to the X server via RandR when displays are connected and disconnected. These hotplug events may trigger a desktop environment with advanced display management capabilities to change the display configuration. These changes may affect settings such as the set of active displays, their resolutions and positioning relative to each other, per-display color correction settings, and more.

In addition to hotplug events generated by connecting or disconnecting displays, DisplayPort displays will generate a hot unplug event when they power off, and a hotplug event when they power on, even if no physical plugging in or unplugging takes place. This can lead to hotplug-induced display configuration changes without any actual hotplug action taking place.

If display hotplug events are resulting in undesired configuration changes, try the solutions and workarounds listed in “The display settings I configured in nvidia-settings do not persist.”. Another workaround would be to disable the NVIDIA X driver's reporting of hotplug events with the UseHotplugEvents X configuration option. Note that this option will have no effect on DisplayPort devices, which must report all hotplug events to ensure proper functionality.

8.1. Interaction with the Nouveau Driver

What is Nouveau, and why do I need to disable it?

Nouveau is a display driver for NVIDIA GPUs, developed as an open-source project through reverse-engineering of the NVIDIA driver. It ships with many current Linux distributions as the default display driver for NVIDIA hardware. It is not developed or supported by NVIDIA, and is not related to the NVIDIA driver, other than the fact that both Nouveau and the NVIDIA driver are capable of driving NVIDIA GPUs. Only one driver can control a GPU at a time, so if a GPU is being driven by the Nouveau driver, Nouveau must be disabled before installing the NVIDIA driver.

Nouveau performs modesets in the kernel. This can make disabling Nouveau difficult, as the kernel modeset is used to display a framebuffer console, which means that Nouveau will be in use even if X is not running. As long as Nouveau is in use, its kernel module cannot be unloaded, which will prevent the NVIDIA kernel module from loading. It is therefore important to make sure that Nouveau's kernel modesetting is disabled before installing the NVIDIA driver.

How do I prevent Nouveau from loading and performing a kernel modeset?

A simple way to prevent Nouveau from loading and performing a kernel modeset is to add configuration directives for the module loader to a file in one of the system's module loader configuration directories: for example, /etc/modprobe.d/ or /usr/local/modprobe.d. These configuration directives can technically be added to any file in these directories, but many of the existing files in these directories are provided and maintained by your distributor, which may from time to time provide updated configuration files which could conflict with your changes. Therefore, it is recommended to create a new file, for example, /etc/modprobe.d/disable-nouveau.conf, rather than editing one of the existing files, such as the popular /etc/modprobe.d/blacklist.conf. Note that some module loaders will only look for configuration directives in files whose names end with .conf, so if you are creating a new file, make sure its name ends with .conf.

Whether you choose to create a new file or edit an existing one, the following two lines will need to be added:

blacklist nouveau
options nouveau modeset=0

The first line will prevent Nouveau's kernel module from loading automatically at boot. It will not prevent manual loading of the module, and it will not prevent the X server from loading the kernel module; see "How do I prevent the X server from loading Nouveau?" below. The second line will prevent Nouveau from doing a kernel modeset. Without the kernel modeset, it is possible to unload Nouveau's kernel module, in the event that it is accidentally or intentionally loaded.

You will need to reboot your system after adding these configuration directives in order for them to take effect.

If nvidia-installer detects Nouveau is in use by the system, it will offer to create such a modprobe configuration file to disable Nouveau.

What if my initial ramdisk image contains Nouveau?

Some distributions include Nouveau in an initial ramdisk image (henceforth referred to as "initrd" in this document, and sometimes also known as "initramfs"), so that Nouveau's kernel modeset can take place as early as possible in the boot process. This poses an additional challenge to those who wish to prevent the modeset from occurring, as the modeset will occur while the system is executing within the initrd, before any directives in the module loader configuration files are processed.

If you have an initrd which loads the Nouveau driver, you will additionally need to ensure that Nouveau is disabled in the initrd. In most cases, rebuilding the initrd will pick up the module loader configuration files, including any which may disable Nouveau. Please consult your distribution's documentation on how to rebuild the initrd, as different distributions have different tools for building and modifying the initrd. Some popular distro initrd tools include: dracut, mkinitrd, and update-initramfs.

Some initrds understand the rdblacklist parameter. On these initrds, as an alternative to rebuilding the initrd, you can add the option rdblacklist=nouveau to your kernel's boot parameters. On initrds that do not support rdblacklist, it is possible to prevent Nouveau from performing a kernel modeset by adding the option nouveau.modeset=0 to your kernel's boot parameters. Note that nouveau.modeset=0 will prevent a kernel modeset, but it may not prevent Nouveau from being loaded, so rebuilding the initrd or using rdblacklist may be more effective than using nouveau.modeset=0.

Any changes to the default kernel boot parameters should be made in your bootloader's configuration file(s), so that the options get passed to your kernel every time the system is booted. Please consult your distribution's documentation on how to configure your bootloader, as different distributions use different bootloaders and configuration files.

How do I prevent the X server from loading Nouveau?

Blacklisting Nouveau will only prevent it from being loaded automatically at boot. If an X server is started as part of the normal boot process, and that X server uses the Nouveau X driver, then the Nouveau kernel module will still be loaded. Should this happen, you will be able to unload Nouveau with `modprobe -r nouveau` after stopping the X server, as long as you have taken care to prevent it from doing a kernel modeset; however, it is probably better to just make sure that X does not load Nouveau in the first place.

If your system is not configured to start an X server at boot, then you can simply run the NVIDIA driver installer after rebooting. Otherwise, the easiest thing to do is to edit your X server's configuration file so that your X server uses a non-modesetting driver that is compatible with your card, such as the vesa driver. You can then stop X and install the driver as usual. Please consult your X server's documentation to determine where your X server configuration file is located.