= DMA device API =

[[PageOutline]]

== __A) General principles__ ==

This device allows the kernel to accelerate memory copies from a ''source'' buffer located in a remote cluster to a ''destination'' buffer located in another remote cluster, when the architecture contains dedicated hardware accelerators. A DMA device can be multi-channel, supporting several parallel transfers, and it can be an ''internal'' device, replicated in all clusters.

The "kernel" API contains two operation types, synchronous and asynchronous, detailed in section C below.

The '''asynchronous''' operation is not directly executed by the client thread. The request is registered in the waiting queue rooted in the DMA chdev descriptor. Registered requests are actually handled by a dedicated '''server thread''' running in the cluster containing the chdev descriptor, which calls the blocking ''dma_driver_cmd()'' function for each registered request. The driver is supposed to deschedule the server thread while it waits for the DMA transfer completion.

The '''synchronous''' operation uses neither the waiting queue nor the server thread. The client thread itself calls the blocking ''dma_driver_cmd()'' function. The driver is supposed to use a polling strategy to wait for the DMA transfer completion, without using the DMA_IRQ.

To access the various drivers, the DMA device defines a lower-level "driver" API, detailed in section D below.

All DMA device structures and access functions are defined in the [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/devices/dev_dma.c dev_dma.c] and [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/devices/dev_dma.h dev_dma.h] files.

== __B) Initialisation__ ==

The '''dev_dma_init( chdev_t * chdev )''' function makes the following initializations:
 * it initializes the DMA specific fields of the chdev descriptor,
 * it initializes the implementation specific DMA hardware device,
 * it initializes the specific software data structures required by the hardware implementation,
 * it links the DMA_IRQ to the core executing the server thread,
 * it disables the DMA_IRQ, because most operations are supposed to be synchronous.

It must be called by a local thread.

== __C) The "kernel" API__ ==

Both the synchronous and the asynchronous operations are blocking and return only when the transfer is completed, but the blocking policy depends on the operation type. They have the same arguments as the hal_remote_memcpy() function:

 * The '''dev_dma_sync_memcpy( xptr_t dst_xp , xptr_t src_xp , uint32_t nbytes )''' blocking function synchronously moves ''nbytes'' bytes from a remote source buffer identified by the ''src_xp'' argument to another remote destination buffer identified by the ''dst_xp'' argument. It uses neither the server thread nor the DMA waiting queue, and the driver is supposed to use a polling strategy on the DMA status register to wait for the transfer completion.
 * The '''dev_dma_async_memcpy( xptr_t dst_xp , xptr_t src_xp , uint32_t nbytes )''' blocking function asynchronously moves ''nbytes'' bytes from a remote source buffer identified by the ''src_xp'' argument to another remote destination buffer identified by the ''dst_xp'' argument. It registers the request in the DMA waiting queue, and uses a descheduling policy for both the client and the server threads to wait for the transfer completion signaled by the DMA_IRQ.

== __D) The "driver" API__ ==

All DMA drivers must define three functions:
 * void '''dma_driver_init( chdev_t *chdev )'''
 * void '''dma_driver_cmd( xptr_t thread_xp )'''
 * void '''dma_driver_isr( chdev_t * chdev )'''

The ''dma_driver_cmd()'' function arguments are actually defined in the ''dma_command_t'' structure embedded in the client thread descriptor. One command contains four fields:
 - '''sync''' : operation type (synchronous if true / asynchronous if false).
 - '''size''' : number of bytes to be moved.
 - '''src_xp''' : extended pointer on source buffer.
 - '''dst_xp''' : extended pointer on destination buffer.
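
As an illustration, the four command fields listed above could be grouped as in the following minimal sketch. Only the field names come from this page. The field types, the modeling of ''xptr_t'' as a 64-bit integer, and the ''dma_command_set()'' helper are assumptions made for the example, and the actual declaration in dev_dma.h may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: an extended pointer is modeled here as a plain 64-bit value. */
typedef uint64_t xptr_t;

/* Hypothetical layout: only the four field names are taken from the text. */
typedef struct dma_command_s
{
    bool     sync;     /* synchronous if true / asynchronous if false */
    uint32_t size;     /* number of bytes to be moved                 */
    xptr_t   src_xp;   /* extended pointer on source buffer           */
    xptr_t   dst_xp;   /* extended pointer on destination buffer      */
} dma_command_t;

/* Hypothetical helper (not part of the documented API): fills a command
 * from the arguments of a kernel API call, in the same (dst, src, nbytes)
 * order as the two memcpy functions of section C.                        */
static void dma_command_set( dma_command_t * cmd,
                             bool            sync,
                             xptr_t          dst_xp,
                             xptr_t          src_xp,
                             uint32_t        nbytes )
{
    assert( cmd != NULL );

    cmd->sync   = sync;
    cmd->size   = nbytes;
    cmd->src_xp = src_xp;
    cmd->dst_xp = dst_xp;
}
```

In this sketch the only difference between a synchronous and an asynchronous request is the ''sync'' flag. The driver can therefore select its waiting policy from the command alone.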

For an asynchronous transfer, the ''dma_driver_cmd()'' function must dynamically enable the DMA_IRQ before launching the transfer, and disable the DMA_IRQ when the transfer is completed.

The ''dma_driver_isr()'' function is only used for asynchronous transfers. It acknowledges the DMA_IRQ, reports the transfer status into the ''dma_command_t'' structure, and reactivates the server thread.
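
The two waiting policies can be sketched with a small user-space model. Everything in it is hypothetical: the ''sim_*'' names, the simulated registers, the use of local pointers instead of extended pointers, and the ''status'' field are assumptions made for the example, not the actual driver code. The simulated hardware completes the copy immediately, and the ISR is called directly where a real driver would deschedule the server thread and wait for the interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command: local pointers replace the extended pointers. */
typedef struct sim_command_s
{
    bool         sync;     /* synchronous if true / asynchronous if false */
    uint32_t     size;     /* number of bytes to be moved                 */
    const void * src;      /* source buffer                               */
    void       * dst;      /* destination buffer                          */
    uint32_t     status;   /* assumed field: 0 means success              */
} sim_command_t;

/* Hypothetical model of one DMA channel and of its server thread state. */
typedef struct sim_dma_s
{
    bool     irq_enabled;     /* models the DMA_IRQ enable state          */
    bool     irq_pending;     /* models a raised DMA_IRQ                  */
    bool     done;            /* models the "completed" status bit        */
    uint32_t status;          /* models the hardware transfer status      */
    bool     server_blocked;  /* models the descheduled server thread     */
} sim_dma_t;

/* Launches a transfer: the simulated hardware completes at once. */
static void sim_dma_start( sim_dma_t * dev, sim_command_t * cmd )
{
    dev->done = false;
    memcpy( cmd->dst, cmd->src, cmd->size );
    dev->status = 0;
    dev->done   = true;
    if( dev->irq_enabled ) dev->irq_pending = true;
}

/* ISR model: acknowledge the IRQ, report the status, wake the server. */
static void sim_dma_isr( sim_dma_t * dev, sim_command_t * cmd )
{
    dev->irq_pending    = false;
    cmd->status         = dev->status;
    dev->server_blocked = false;
}

/* Command model: polling if synchronous, IRQ-based if asynchronous. */
static void sim_dma_cmd( sim_dma_t * dev, sim_command_t * cmd )
{
    assert( dev != NULL && cmd != NULL );

    if( cmd->sync )
    {
        /* the DMA_IRQ stays disabled: poll the status bit */
        sim_dma_start( dev, cmd );
        while( dev->done == false ) { /* busy waiting */ }
        cmd->status = dev->status;
    }
    else
    {
        /* enable the DMA_IRQ before launching the transfer */
        dev->irq_enabled    = true;
        dev->server_blocked = true;
        sim_dma_start( dev, cmd );

        /* a real driver would deschedule here; the model runs the ISR */
        if( dev->irq_pending ) sim_dma_isr( dev, cmd );

        /* disable the DMA_IRQ when the transfer is completed */
        dev->irq_enabled = false;
    }
}
```

In both branches the DMA_IRQ is disabled when ''sim_dma_cmd()'' returns, which matches the default state set by the initialization function.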