Developing Windows Drivers Using a QEMU Virtual Device

Key points

Through this theoretical overview, the author demonstrates the underlying theory and makes the case for the functionality of QEMU virtual devices. If you want to learn about driver-device communication, interrupts, and DMA, read this article carefully.

Introduction

 

Stable interactions between operating systems, other computer programs, and hardware are performed via device drivers. However, developing a device driver significantly increases the time to market for peripheral devices. Fortunately, virtualization technologies like QEMU allow developers to emulate a physical device and start software development before the hardware is manufactured.
The QEMU machine emulator and virtualizer allows developers to safely test device drivers and to find and fix defects that could crash the entire operating system. Developing and debugging drivers on an emulator makes working with them similar to working with user-space applications. At worst, bugs can lead to the emulator crashing.
In this article, we explain our approach to developing Windows drivers using a QEMU virtual device. We’ll describe all the benefits and limitations of device emulation for driver development and provide a clear overview on how you can establish communication between a device and its driver.


Developing Windows device drivers and device firmware are difficult and interdependent processes. In this article, we consider how to speed up and improve device driver development from the earliest stages of the project, prior to or alongside the development of the device and its firmware.
To begin, let’s consider the main stages of hardware and software development:

  • Setting objectives and analyzing requirements
  • Developing specifications
  • Testing the operability of the specifications
  • Developing the device and its firmware
  • Developing the device driver
  • Integrating software and hardware, debugging, and stabilizing

To shorten driver development time, we propose using a mock device implemented in a QEMU virtual machine.


Why do we use QEMU?


QEMU has all the necessary infrastructure to quickly implement virtual devices. Additionally, QEMU provides an extensive list of APIs for device development and control. For a guest operating system, the interface of such a virtual device will be the same as for a real physical device. However, a QEMU virtual device is a mock device, most likely with limited functionality (depending on the device's capabilities), and it will definitely be much slower than a real physical device.
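To make this concrete, below is a minimal sketch, in C against the QEMU object model, of how a mock PCI device could be registered with a single memory BAR backed by a small register file. The device name, IDs, register layout, and behavior are invented for illustration and would need to be adapted to your own device specification and QEMU version.

/* mockdev.c - sketch of a mock QEMU PCI device with one MMIO BAR (illustrative only) */
#include "qemu/osdep.h"
#include "hw/pci/pci.h"

#define TYPE_MOCKDEV "mockdev"                        /* hypothetical device name */
#define MOCKDEV(obj) OBJECT_CHECK(MockDevState, (obj), TYPE_MOCKDEV)

typedef struct MockDevState {
    PCIDevice parent_obj;
    MemoryRegion mmio;                                /* BAR0: the device's "I/O memory" */
    uint32_t regs[16];                                /* tiny register file */
} MockDevState;

static uint64_t mockdev_mmio_read(void *opaque, hwaddr addr, unsigned size)
{
    MockDevState *s = opaque;
    return s->regs[(addr / 4) % 16];
}

static void mockdev_mmio_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
{
    MockDevState *s = opaque;
    s->regs[(addr / 4) % 16] = (uint32_t)val;         /* a real mock would act on commands here */
}

static const MemoryRegionOps mockdev_mmio_ops = {
    .read = mockdev_mmio_read,
    .write = mockdev_mmio_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
};

static void mockdev_realize(PCIDevice *pdev, Error **errp)
{
    MockDevState *s = MOCKDEV(pdev);
    memory_region_init_io(&s->mmio, OBJECT(pdev), &mockdev_mmio_ops, s, "mockdev-mmio", 4096);
    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->mmio);
}

static void mockdev_class_init(ObjectClass *klass, void *data)
{
    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
    k->realize   = mockdev_realize;
    k->vendor_id = PCI_VENDOR_ID_QEMU;                /* 0x1234, reserved for QEMU */
    k->device_id = 0x11e9;                            /* hypothetical device ID */
    k->class_id  = PCI_CLASS_OTHERS;
}

static const TypeInfo mockdev_info = {
    .name          = TYPE_MOCKDEV,
    .parent        = TYPE_PCI_DEVICE,
    .instance_size = sizeof(MockDevState),
    .class_init    = mockdev_class_init,
    .interfaces    = (InterfaceInfo[]) { { INTERFACE_CONVENTIONAL_PCI_DEVICE }, { } },
};

static void mockdev_register_types(void)
{
    type_register_static(&mockdev_info);
}

type_init(mockdev_register_types)

Once compiled into QEMU, such a stub can be attached to a guest with -device mockdev, and the Windows guest enumerates it like any other PCI device, so driver development can start against it immediately.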

Pros and cons of using a QEMU virtual device

Let’s consider the pros and cons of this approach, beginning with the pros:

  • The driver and device are implemented independently and simultaneously, provided that there already is a device communication protocol.
  • You get proof of driver-device communication before implementing the device prototype. When implementing a QEMU virtual device and driver, you can test their specifications and find any issues in the device-driver communication protocol.
  • You can detect logical issues in the device communication specifications at early stages of development.
  • QEMU provides driver developers with a better understanding of the logic of a device’s operation.
  • You can stabilize drivers faster due to simple device debugging in QEMU.
  • When integrating a driver with a device, you’ll already have a fairly stable and debugged driver. Thus, integration will be faster.
  • Using unit tests written for the driver and QEMU device, you can iteratively check the specification requirements for a real physical device as you add functionality.
  • A QEMU virtual device can be used to automatically test a driver on different versions of Windows.
  • Using a QEMU virtual device, you can practice developing device drivers without a real device.

Now let’s look at the cons of this approach:

  • It takes additional time to implement a QEMU virtual device, debug it, and stabilize it.
  • Since a QEMU virtual device isn't a real device but only a mock device with limited capabilities, not all features can be implemented. However, it's usually enough to implement stubs for the missing functionality.
  • A QEMU virtual device is much slower than a real physical device, so not everything can be tested on it. In particular, it's impossible to test synchronization and boundary conditions that cause device failure.
  • Driver logic functionality can’t be fully tested. Some parts remain to be finished during the device implementation stage.

Driver implementation stages

To understand when we can use a QEMU virtual device, let's consider the stages of driver implementation:

  • Developing device specifications and functionality, including the device communication protocol
  • Implementing a mock device in QEMU (implementing the real physical device can begin simultaneously)
  • Implementing the driver and debugging it, including writing tests and providing the proof of driver-device communication
  • Integrating and debugging the driver when running on a real device
  • General bug fixing, changing the requirements and functionality of both the device and its driver
  • Releasing the device and its driver

For a Windows guest operating system, a virtual device will have all the same characteristics and interfaces as a real device because the driver will work identically with both the virtual device and the real device (aside from bugs in any of the components). However, the Windows guest operating system itself will be limited by the resources allocated by QEMU.
We've successfully tested this approach on Apriorit projects, confirming its value and effectiveness. Driver profiling can be used in the early stages of working with a QEMU virtual device. This allows you to determine performance bottlenecks in driver code when working with high-performance devices (not all issues can be detected, however, because virtual device performance is several times slower). That's why it's essential to use Driver Verifier and the Windows Driver Frameworks (WDF) Verifier when developing any drivers for any environment.


Communication between a device and its driver

Let's consider how a peripheral component interconnect (PCI) device and its operating system driver communicate with each other. The PCI specification describes all possible channels of communication with a device, while the device's PCI header indicates the resources necessary for communication, and the operating system or BIOS allocates or initializes these specified resources. In this article, we discuss only two types of communication resources:

  • I/O address space
  • Interrupts

We'll take a brief look at these resources, discussing work with them only at the level on which they'll be used to implement communication functionality.

I/O address space

I/O address space is a region of addresses in a device (not necessarily on the physical memory of the device, but simply a region of the address space). When the operating system accesses these addresses, it generates a data access request (to read or write data) and sends it to the device. The device processes the request and sends its response. Access to the I/O address space in the Windows operating system is performed through the WRITE_REGISTER_* and READ_REGISTER_* function families, provided that the data size is 1, 2, 4, or 8 bytes. There are also functions that read an array of elements, where the size of one element is 1, 2, 4, or 8 bytes, and allow you to read or write buffers of any data size in one call.
The operating system and BIOS are responsible for allocating and assigning address regions to a device in the I/O address space. The system allocates these addresses from a special physical address space depending on the address dimension requirements. This additional level of abstraction of the device resource initialization eliminates device resource conflicts and relocates the device I/O address space in runtime. Here's an illustration of the physical address space for a hypothetical system:

Physical address space: 0x00000000 – 0x23FFFFFFF
I/O address space: 0xC0000000 – 0xFFFFFFFF


The operating system reserves a special I/O memory region of varying size in the physical address space. This region is usually located within the 4GB address space and ends at 0xFFFFFFFF. This region doesn't belong to RAM but is responsible for accessing the device address space.


A kernel mode driver in Windows cannot directly access physical memory addresses. To access the I/O region, a driver needs to map this region into the kernel virtual address space with the special functions MmMapIoSpace and MmMapIoSpaceEx. These operating system functions return a virtual system address, which is subsequently used with the WRITE_REGISTER_* and READ_REGISTER_* function families. Schematically, access to the I/O address space looks like this:


RAM isn't used for handling requests that access the virtual I/O addresses mapped to device memory.
Now let's look at how to use this mechanism for communication with the device.
From the driver developer's point of view, the memory of a QEMU virtual device is the device memory. The driver can read this memory to obtain information from the device or write to this memory to configure the device and send commands to it.
There's some magic in working with this type of memory, as the device immediately detects changes to it and responds by executing the required operations. For example, to make the device execute a command, it's sufficient to write it to the I/O memory at a certain offset. After this, the device will immediately detect the change in its memory and begin executing the command.
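As a rough illustration, here is a minimal KMDF sketch of how a driver might map the device's memory resource in EvtDevicePrepareHardware and then send a command by writing a register. The register offset, command value, and all MOCKDEV_* names are hypothetical stand-ins for whatever the device communication protocol actually defines.

/* Sketch only: map the device's memory BAR and write a hypothetical COMMAND register. */
#include <ntddk.h>
#include <wdf.h>

#define MOCKDEV_REG_COMMAND  0x00        /* hypothetical command register offset */
#define MOCKDEV_CMD_START    0x01        /* hypothetical "start" command */

typedef struct _DEVICE_CONTEXT {
    volatile UCHAR *IoBase;              /* virtual address of the mapped I/O region */
    SIZE_T          IoLength;
} DEVICE_CONTEXT, *PDEVICE_CONTEXT;

WDF_DECLARE_CONTEXT_TYPE(DEVICE_CONTEXT);

NTSTATUS
MockDevEvtPrepareHardware(
    WDFDEVICE Device,
    WDFCMRESLIST ResourcesRaw,
    WDFCMRESLIST ResourcesTranslated
    )
{
    PDEVICE_CONTEXT ctx = WdfObjectGet_DEVICE_CONTEXT(Device);
    ULONG i;

    UNREFERENCED_PARAMETER(ResourcesRaw);

    for (i = 0; i < WdfCmResourceListGetCount(ResourcesTranslated); ++i) {
        PCM_PARTIAL_RESOURCE_DESCRIPTOR desc =
            WdfCmResourceListGetDescriptor(ResourcesTranslated, i);

        if (desc->Type == CmResourceTypeMemory) {
            /* Map the device's physical I/O region into kernel virtual address space. */
            ctx->IoBase = (volatile UCHAR *)MmMapIoSpaceEx(desc->u.Memory.Start,
                                                           desc->u.Memory.Length,
                                                           PAGE_READWRITE | PAGE_NOCACHE);
            if (ctx->IoBase == NULL) {
                return STATUS_INSUFFICIENT_RESOURCES;
            }
            ctx->IoLength = desc->u.Memory.Length;
            break;
        }
    }
    return STATUS_SUCCESS;
}

VOID
MockDevStart(PDEVICE_CONTEXT ctx)
{
    /* The device notices the write and starts executing the command. */
    WRITE_REGISTER_ULONG((volatile ULONG *)(ctx->IoBase + MOCKDEV_REG_COMMAND),
                         MOCKDEV_CMD_START);
}

The corresponding reads use READ_REGISTER_ULONG on the same mapped base address, and the region should be unmapped with MmUnmapIoSpace in EvtDeviceReleaseHardware.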


However, this type of memory isn’t suitable for transferring large volumes of data due to the following limitations:

  • The size of the I/O space is limited.
  • Accessing this type of memory is usually slower than accessing RAM.
  • The device must contain a comparable amount of internal memory.
  • While accessing the I/O space, the CPU performs all required operations, slowing down the performance of the entire system when large volumes of memory are processed.               

But such memory can be used to obtain statuses, configure device modes, and do anything else that doesn’t require large amounts of memory.
This is a one-way communication mechanism: the driver can access the device memory at any time and the request will be delivered immediately, but the device can't deliver a message to the driver asynchronously through the I/O memory unless the driver constantly polls the device memory.

 

Interrupts


Interrupts are a special hardware mechanism with which a PCI device sends messages to the operating system when it requires the driver’s attention or wants to report an event.
A device’s ability to work with interrupts is indicated in the PCI configuration space.
There are three types of interrupts:

  • Line-based
  • Message-signaled
  • MSI-X

In this article, we discuss the first two, as we use them for establishing communication between a device and its driver. All these types of interrupts are also well described in other books and articles.


Line-based interrupts

Line-based interrupts (or INTx) are the first type of interrupt and are supported by all versions of Windows. These interrupts can be shared by several devices, meaning that one interrupt line can serve multiple PCI devices simultaneously. When any of these devices uses a dedicated pin to trigger an interrupt, the operating system delivers that interrupt to each device driver in succession until one of them handles it.

The driver, in turn, requires a mechanism that can determine whether this interrupt was actually raised by its device or came from another device that uses the same INTx line. The device’s I/O memory space may contain an interrupt flag, which if set indicates that the interrupt has been raised by this particular device.
Physically, a line-based interrupt is a special contact to which the device sends a signal until the interrupt is processed by the driver. Thus, the driver must not only check the interrupt flag in the I/O memory but also reset it as soon as possible in order to let the device stop sending a signal to the interrupt contact.
Verifying and clearing the interrupt flag is necessary because several devices can simultaneously raise an interrupt using the same INTx. This approach allows processing interrupts from all devices.
The whole process of handling line-based interrupts looks as follows:
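In a KMDF driver, this check-and-clear flow might look roughly like the ISR sketch below. It reuses the DEVICE_CONTEXT and mapped IoBase from the earlier sketch; the interrupt status register offset and the pending bit are again hypothetical.

/* Sketch of a KMDF EvtInterruptIsr for a shared line-based (INTx) interrupt. */
#define MOCKDEV_REG_INT_STATUS  0x04     /* hypothetical interrupt status register */
#define MOCKDEV_INT_PENDING     0x01     /* hypothetical "interrupt pending" bit */

BOOLEAN
MockDevEvtInterruptIsr(
    WDFINTERRUPT Interrupt,
    ULONG MessageID
    )
{
    PDEVICE_CONTEXT ctx =
        WdfObjectGet_DEVICE_CONTEXT(WdfInterruptGetDevice(Interrupt));
    ULONG status;

    UNREFERENCED_PARAMETER(MessageID);   /* unused for line-based interrupts */

    status = READ_REGISTER_ULONG((volatile ULONG *)(ctx->IoBase + MOCKDEV_REG_INT_STATUS));
    if ((status & MOCKDEV_INT_PENDING) == 0) {
        return FALSE;                    /* not our device: the line is shared */
    }

    /* Acknowledge as soon as possible so the device stops asserting the line. */
    WRITE_REGISTER_ULONG((volatile ULONG *)(ctx->IoBase + MOCKDEV_REG_INT_STATUS),
                         MOCKDEV_INT_PENDING);

    WdfInterruptQueueDpcForIsr(Interrupt);  /* defer the real work to a DPC */
    return TRUE;
}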

从物理上讲,基于行的中断是一种特殊的触点,设备向该触点发送信号,直到驱动程序处理该中断为止。因此,驱动程序不仅必须检查I / O存储器中的中断标志,还必须尽快将其复位,以使设备停止向中断触点发送信号。

验证和清除中断标志是必要的,因为多个设备可以使用相同的INTx同时引发中断。这种方法允许处理来自所有设备的中断。

处理基于行的中断的整个过程如下所示:



Line-based interrupts are full of flaws and limitations and require unnecessary references to the I/O memory. Fortunately, all these problems are solved with the following interrupt technique.

Message-signaled interrupts

Message-signaled interrupts, or MSIs, are based on messages written by the device to a specific address. In other words, instead of maintaining the voltage on the interrupt line, the interrupt is sent simply by writing a few bytes to a special memory location. MSIs have many advantages compared to line-based interrupts. Improved performance is the major one, as this type of interrupt is much easier and cheaper to handle. MSIs can also be assigned to a specific processor core.
The major difference between handling MSIs and line-based interrupts in the driver is that MSIs aren’t shared. For instance, if the operating system allocates an MSI interrupt for a device, then this interrupt is guaranteed to be used only by this device provided that all devices in the system work correctly. Because of this, the driver no longer needs to check the interrupt flag in the device I/O space, and the device doesn’t need to wait for the driver to process the interrupt.
The operating system can allocate only one line-based interrupt but multiple MSIs for a single device function (see the PCI function number). A driver can request the operating system to allocate 1, 2, 4, 8, 16, or 32 MSIs. In this case, the device can send different types of messages to the driver, which allows developers to optimize driver code and interrupt handling.


Each MSI contains information about the message number (the interrupt vector, or the logical type of event on the device's side). All MSI message numbers start with 0 in WDF. After the operating system allocates MSIs for a device, it records in the PCI configuration space the number of interrupts allocated and all the information necessary for sending them. The device uses this information to send different types of MSI messages. If the device is expecting 8 MSIs but the operating system allocates only one message, then the device should send only MSI number 0. At the physical level, the operating system tries to allocate the number of sequential interrupt vectors requested by the driver (1, 2, 4, 8, 16, or 32) and sets the first interrupt vector number in the PCI configuration space. The device uses this vector as the base for sending different MSI messages.
When a request is sent by a device to allocate the necessary number of interrupts, the operating system will allocate the requested number only if there are free resources. If the operating system is unable to process this request, then it will allocate only one MSI message, which will be number 0. The device and device driver must be ready for this event. Schematically, MSI interrupt processing looks like this:


MSIs are available beginning with Windows Vista, and the maximum number of MSIs supported by Vista is 16. In earlier Windows versions, it was necessary to use line-based interrupts, and because of this, drivers must support three modes of interrupt handling:

  • Line-based interrupt (if the system doesn’t support MSIs)
  • One MSI interrupt (if the system can’t allocate more than one MSI)
  • Multiple MSIs (if the system can allocate all requested MSIs and more than one is requested)

Interrupts are also a one-way communication instrument. They’re used by a device to send notifications to the driver. At the same time, interrupts received from devices have the highest priority for the operating system. When an interrupt is received, the system interrupts the execution of one of the processor threads and calls the driver interrupt handler, or interrupt service routine (ISR), callback.
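As a rough sketch of how a KMDF driver can cover all three modes, the driver can simply walk the translated resource list in EvtDevicePrepareHardware and create one WDFINTERRUPT per interrupt descriptor, whether that descriptor represents a single line-based interrupt, one MSI, or one of several MSI messages (this mirrors the pattern used in Microsoft's KMDF PCI samples). The ISR is the one from the earlier sketch; MockDevEvtInterruptDpc is a DPC callback not shown here.

/* Sketch: create one WDFINTERRUPT per interrupt resource (INTx or MSI message). */
EVT_WDF_INTERRUPT_DPC MockDevEvtInterruptDpc;       /* DPC callback, not shown */

NTSTATUS
MockDevCreateInterrupts(
    WDFDEVICE Device,
    WDFCMRESLIST ResourcesRaw,
    WDFCMRESLIST ResourcesTranslated
    )
{
    ULONG i;

    for (i = 0; i < WdfCmResourceListGetCount(ResourcesTranslated); ++i) {
        PCM_PARTIAL_RESOURCE_DESCRIPTOR trans =
            WdfCmResourceListGetDescriptor(ResourcesTranslated, i);
        PCM_PARTIAL_RESOURCE_DESCRIPTOR raw =
            WdfCmResourceListGetDescriptor(ResourcesRaw, i);
        WDF_INTERRUPT_CONFIG config;
        WDFINTERRUPT interrupt;
        NTSTATUS status;

        if (trans->Type != CmResourceTypeInterrupt) {
            continue;
        }

        /* CM_RESOURCE_INTERRUPT_MESSAGE in trans->Flags marks a message-signaled (MSI)
           descriptor; a real driver might record this to adjust its ISR logic. */

        WDF_INTERRUPT_CONFIG_INIT(&config, MockDevEvtInterruptIsr, MockDevEvtInterruptDpc);
        config.InterruptRaw = raw;
        config.InterruptTranslated = trans;

        status = WdfInterruptCreate(Device, &config, WDF_NO_OBJECT_ATTRIBUTES, &interrupt);
        if (!NT_SUCCESS(status)) {
            return status;
        }
    }
    return STATUS_SUCCESS;
}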


Working with DMA memory

Some PCI devices need to exchange large volumes of data with the driver (for example, audio, video, network, and disk devices). It’s not the best option to use I/O memory for these purposes because the processor will be directly involved in copying data, which slows down the entire system.
The direct memory access (DMA) mechanism is used to avoid involving the processor when transferring data between a driver and a device. DMA has several operating modes, and which one is used depends on what the device supports. Let's take a look at only one of them: bus mastering.

Bus mastering


Devices with bus mastering support writing to physical memory (RAM) without using a processor. In this case, the device itself locks the address bus, sets the desired memory address, and writes or reads data. Using this mode, it's sufficient for the driver to transfer the DMA memory buffer address to the device (for example, using I/O memory) and wait for it to complete the operation (wait for the interrupt).




The actual device address should be transferred to the device instead of the virtual address that's typically used by programs and drivers. There's plenty of information on the internet about virtual, physical, and device addresses and how the operating system works with them. To work with DMA, it's enough to know the following:

  • The virtual address buffer can usually be described by two values: address and size.
  • The operating system and processor handle the memory pages rather than individual bytes, and the size of one memory page is 4KB. This has to be taken into account when working with physical pages.
  • Physical memory (RAM) can be paged or non-paged. Paged memory can be paged out to the pagefile (swap file), while non-paged memory is always located in RAM and its physical address doesn’t change.
  • The physical pages of RAM for some virtual memory buffers (if the buffer wasn’t allocated in a special way) aren’t usually arranged one after another, meaning they aren’t located in the continuous physical address space.
  • The physical RAM address and device address aren't always the same. The actual device address, which is the address accessible by the device, must be transferred to the device (we'll use the term device address to refer to both the device and physical address unless otherwise specified). To obtain the device address, the operating system provides a special API, while Windows Driver Frameworks uses its own API.


Considering how physical and device memory works, the driver needs to perform some additional actions to transfer DMA memory to the device. Let’s take a look at how it’s possible to transfer a user mode memory buffer to the device for DMA operations.

  • The memory utilized in user mode usually consists of pageable physical pages; therefore, such memory should be fixed in RAM (made non-paged). This ensures that physical pages aren't paged out to the page file while the device is working with them.
  • Physical pages may be located outside a contiguous physical memory range, making it necessary to obtain a device address for each page or for every contiguous region along with its size.
  • After that, all acquired device memory addresses should be transferred to the device. In order to maintain the same address format for a memory page, we'll use a 64-bit address for both the x86 and x64 versions of Windows.

Note that the physical address for 32-bit Windows doesn't equal 32 bits because there's a Physical Address Extension (PAE), and 64-bit Windows uses only 44 bits for the physical address, which allows addressing 2^44 bytes = 16TB of physical memory. At the same time, the first 12 bits describe the offset in the current memory page (the address of one page of physical memory in Windows can be set by using only 44 - 12 = 32 bits).
To simplify our implementation, we won’t wrap the addresses. Each memory page will be described by an address of 64 bits, both for x86 and x64 versions of the driver.

  • There are two ways to transfer addresses of all pages or regions to the device:
    a. Using the I/O memory. In this case, the device must contain enough memory to store the entire array of addresses. The size of the I/O memory is usually fixed, adding some restrictions on the maximum size of the DMA buffer.
    b. Using a common buffer as an additional memory buffer that contains page addresses. If the physical memory of this additional buffer is located in continuous physical or device memory, it will be enough to transfer just the address of the beginning of the buffer and its size to the device. The device can work with this memory as with a regular data array.


Both approaches are used, and each has its pros and cons. Let's consider approach b. Windows has a family of special functions used to allocate contiguous device memory (or a common buffer). Schematically, the user mode buffer transferred to the device for DMA operation looks like this:
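A minimal KMDF sketch of approach b might look like the following: create a DMA enabler, allocate a common buffer to hold the 64-bit page address table, and hand the buffer's device (logical) address to the device through the hypothetical I/O registers from the earlier sketches. The register offsets, table size, and table format are invented for illustration.

/* Sketch: allocate a common buffer for the page address table (approach b). */
#define MOCKDEV_REG_TABLE_ADDR_LO  0x08   /* hypothetical: low 32 bits of the table address */
#define MOCKDEV_REG_TABLE_ADDR_HI  0x0C   /* hypothetical: high 32 bits of the table address */
#define MOCKDEV_MAX_PAGES          256

NTSTATUS
MockDevCreateDmaResources(
    WDFDEVICE Device,
    PDEVICE_CONTEXT ctx
    )
{
    WDF_DMA_ENABLER_CONFIG dmaConfig;
    WDFDMAENABLER dmaEnabler;
    WDFCOMMONBUFFER commonBuffer;
    UINT64 *table;
    PHYSICAL_ADDRESS deviceAddr;
    NTSTATUS status;

    /* Describe the device's DMA capabilities: 64-bit bus-mastering packet mode. */
    WDF_DMA_ENABLER_CONFIG_INIT(&dmaConfig, WdfDmaProfilePacket64,
                                MOCKDEV_MAX_PAGES * PAGE_SIZE);
    status = WdfDmaEnablerCreate(Device, &dmaConfig, WDF_NO_OBJECT_ATTRIBUTES, &dmaEnabler);
    if (!NT_SUCCESS(status)) {
        return status;
    }

    /* Common buffer: contiguous memory visible to both the driver and the device. */
    status = WdfCommonBufferCreate(dmaEnabler, MOCKDEV_MAX_PAGES * sizeof(UINT64),
                                   WDF_NO_OBJECT_ATTRIBUTES, &commonBuffer);
    if (!NT_SUCCESS(status)) {
        return status;
    }

    table = (UINT64 *)WdfCommonBufferGetAlignedVirtualAddress(commonBuffer);
    deviceAddr = WdfCommonBufferGetAlignedLogicalAddress(commonBuffer);
    RtlZeroMemory(table, MOCKDEV_MAX_PAGES * sizeof(UINT64));
    /* Before each transfer, the driver fills "table" with the 64-bit device
       addresses of the data buffer's pages. */

    /* Tell the device where the table lives, via the hypothetical registers. */
    WRITE_REGISTER_ULONG((volatile ULONG *)(ctx->IoBase + MOCKDEV_REG_TABLE_ADDR_LO),
                         deviceAddr.LowPart);
    WRITE_REGISTER_ULONG((volatile ULONG *)(ctx->IoBase + MOCKDEV_REG_TABLE_ADDR_HI),
                         (ULONG)deviceAddr.HighPart);
    return STATUS_SUCCESS;
}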

Windows Driver Frameworks offers a family of functions for working with DMA memory, and only these particular functions should be used. This set of functions takes into account device capabilities, performs the necessary work to provide access to memory from the driver and device sides, configures map registers, and so on.
The same memory can have three different types of addresses, as the short sketch after the following list illustrates:

  • A virtual address for accessing the memory from the driver or a user mode process.
  • A physical address in RAM.
  • A device address (local bus address, DMA address) to access the memory from the device.
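For the common buffer from the previous sketch, the three addresses can be retrieved as shown below; the physical address is obtained here only for illustration, since the device must always be given the logical (device) address reported by WDF:

/* Sketch: the same common buffer seen through its three address types. */
PVOID            va = WdfCommonBufferGetAlignedVirtualAddress(commonBuffer); /* driver access  */
PHYSICAL_ADDRESS pa = MmGetPhysicalAddress(va);                              /* RAM address    */
PHYSICAL_ADDRESS da = WdfCommonBufferGetAlignedLogicalAddress(commonBuffer); /* device address */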

These mechanisms for communicating with the device will be enough to implement a test driver in Windows. All these mechanisms are reviewed here briefly and are described only to simplify the understanding of the device specifications listed below.


Conclusion

Embedded devices are characterized by complex software that should provide stable and secure communication between operating systems and hardware. Embedded software development is one of Apriorit's specialties, and we often use virtualization technologies in our projects to speed up device launches.
