/alps/pcitool

To get this branch, use:
bzr branch http://darksoft.org/webbzr/alps/pcitool
76 by Suren A. Chilingaryan
Handle correctly reference counting in the driver
Memory Addressing
=================
 There are 3 types of addresses: virtual, physical, and bus. For DMA, a bus
 address is used. However, on x86 physical and bus addresses are the same (on
 other architectures this is not guaranteed). This assumption is still used
 by the xdma driver, which uses the physical address for DMA access. I have
 ported it in the same way. Now we need to additionally provide bus addresses
 in the kmem abstraction and use them in the NWL DMA implementation.

DMA Access Synchronization
==========================
 - At driver level, a few types of buffers are supported:
    * SIMPLE - non-reusable buffers; the usage information can be used for
    cleanup after crashed applications.
    * EXCLUSIVE - reusable buffers which can be mmapped by a single application
    only. There are two modes of these buffers:
        + Buffers in STANDARD mode are created for a single DMA operation. If
        such a buffer is detected while trying to reuse, the last operation
        has failed and a reset is needed.
        + Buffers in PERSISTENT mode are preserved between invocations of the
        control application and cleaned up only after the PERSISTENT flag is
        removed.
    * SHARED - reusable buffers shared by multiple processes. Not really
    needed at the moment.

    KMEM_FLAG_HW - indicates that the buffer can be used by hardware; actually
    this means that DMA will be enabled afterwards. The driver is not able to
    check if it really was enabled and therefore will block any attempt to
    release the buffer until KMEM_FLAG_HW is passed to the kmem_free routine
    as well. The latter should only be called with KMEM_FLAG_HW after the DMA
    engine is stopped. Then the driver can be released by kmem_free if the
    reference count reaches 0.
    
    KMEM_FLAG_EXCLUSIVE - prevents multiple processes from mmapping the buffer
    simultaneously. This is used to prevent multiple processes from using the
    same DMA engine at the same time. When passed to kmem_free, it allows
    cleaning buffers with lost clients even for shared buffers.
    
    KMEM_FLAG_REUSE - requires reuse of an existing buffer. If a reusable
    buffer is found (non-reusable buffers, i.e. those allocated without
    KMEM_FLAG_REUSE, are ignored), it is returned instead of a new allocation.
    Three types of usage counters are used. At the moment of allocation, the
    HW reference is set if necessary. The usage counter is increased by the
    kmem_alloc function and decreased by kmem_free. Finally, the reference is
    obtained and returned during mmap/munmap. So, on kmem_free, we do not clean
        a) buffers with reference count above zero or hardware reference set.
        REUSE flag should be supplied, otherwise an error is returned
        b) PERSISTENT buffers. REUSE flag should be supplied, otherwise an
        error is returned
        c) non-exclusive buffers with usage counter above zero (for an
        exclusive buffer, a usage counter above zero just means that the
        application has failed without cleaning buffers first. There is no
        easy way to detect that for shared buffers, so it is left as a manual
        operation in this case)
        d) any buffer if KMEM_FLAG_REUSE was provided to the function
    During module unload, only buffers with references can prevent cleanup. In
    this case the only possibility to free the driver is to call kmem_free
    passing FORCE flags.
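
    A minimal sketch of how the kmem_free rules above could be evaluated,
    written in C. This is one reading of the notes, not the driver code: the
    kmem_buf_t layout, the kmem_free_check name, the numeric flag values, the
    KMEM_FLAG_FORCE name, and the order in which the rules are tested are all
    assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names follow the text; the numeric values are made up for this sketch. */
enum {
    KMEM_FLAG_REUSE      = 1 << 0,
    KMEM_FLAG_EXCLUSIVE  = 1 << 1,
    KMEM_FLAG_PERSISTENT = 1 << 2,
    KMEM_FLAG_HW         = 1 << 3,
    KMEM_FLAG_FORCE      = 1 << 4   /* hypothetical name for the FORCE flag */
};

typedef enum { KMEM_KEEP, KMEM_CLEAN, KMEM_ERROR } kmem_verdict_t;

/* Hypothetical per-buffer bookkeeping. */
typedef struct {
    bool exclusive;   /* allocated with KMEM_FLAG_EXCLUSIVE */
    bool persistent;  /* buffer is in PERSISTENT mode */
    bool hw_ref;      /* hardware reference is set */
    int  refs;        /* references held through mmap */
    int  usage;       /* raised by kmem_alloc, lowered by kmem_free */
} kmem_buf_t;

/* Decide what a kmem_free call with the given flags does to the buffer. */
kmem_verdict_t kmem_free_check(kmem_buf_t *buf, unsigned flags)
{
    assert(buf != NULL);

    if (flags & KMEM_FLAG_FORCE)
        return KMEM_CLEAN;                /* forceful clean-up */
    if (flags & KMEM_FLAG_HW)
        buf->hw_ref = false;              /* caller states that DMA is stopped */
    if (flags & KMEM_FLAG_PERSISTENT)
        buf->persistent = false;          /* leave PERSISTENT mode */
    if (buf->usage > 0)
        buf->usage--;

    if (flags & KMEM_FLAG_REUSE)
        return KMEM_KEEP;                 /* rule d */
    if (buf->refs > 0 || buf->hw_ref)
        return KMEM_ERROR;                /* rule a */
    if (buf->persistent)
        return KMEM_ERROR;                /* rule b */
    if (!buf->exclusive && buf->usage > 0 && !(flags & KMEM_FLAG_EXCLUSIVE))
        return KMEM_KEEP;                 /* rule c */
    return KMEM_CLEAN;
}
```

    In this sketch a crashed exclusive client that left the HW reference
    behind gets KMEM_ERROR until some caller passes KMEM_FLAG_HW, which matches
    the intent that only a caller who knows the DMA engine is stopped may
    release the buffer.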
    
    KMEM_FLAG_PERSISTENT - if passed to the allocation routine, changes the
    mode of the buffer to PERSISTENT; if passed to the free routine, changes
    the mode of the buffer back to NORMAL (i.e. STANDARD). Basically, if we
    call 'pci --dma-start' this flag should be passed to alloc, and if we call
    'pci --dma-stop' it should be passed to free. In other cases, the flag
    should not be present.

    If the application crashed, munmap will still be called, cleaning the
    software references. However, the hardware reference will stay since it is
    not clear if the hardware channel was closed or not. To lift the hardware
    reference, the application can be re-executed (or dma_stop called, for
    instance).
    * If there is no hardware reference, the buffers will be reused by the next
    call to the application and, for EXCLUSIVE buffers, cleaned at the end.
    SHARED buffers will be cleaned during module cleanup only (no active
    references).
    * The buffer will be reused by the next call, which can result in wrong
    behaviour if the buffer was left in an incoherent state. This should be
    handled on the upper level.
    
 - At pcilib/kmem level synchronization of multiple buffers is performed
    * The HW reference and the following modes should be consistent between
    member parts: REUSABLE, PERSISTENT, EXCLUSIVE (only the HW reference and
    PERSISTENT mode should be checked, the others are handled on driver level)
    * It is fine if only part of the buffers are reused and others are newly
    allocated. However, on a higher level this can be checked and result
    in failure.
    
    Treatment of inconsistencies:
     * Buffers are in PERSISTENT mode, but newly allocated, OK
     * Buffers are reused, but are not in PERSISTENT mode (for EXCLUSIVE buffers
     this means that the application has crashed during the last execution), OK
     * Some of the buffers are reused (not just REUSABLE, but actually reused),
     others - not, OK as long as
        a) either PERSISTENT flag is set or reused buffers are non-PERSISTENT
        b) either HW flag is set or reused buffers do not hold HW reference
     * PERSISTENT mode inconsistency, FAIL (even if we are going to set
     PERSISTENT mode anyway)
     * HW reference inconsistency, FAIL (even if we are going to set
     HW flag anyway)
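
    The treatment of inconsistencies listed above can be sketched as a single
    check over the blocks of one kmem allocation. The kmem_block_t structure
    and the kmem_blocks_consistent function are hypothetical names introduced
    here; the sketch assumes that each block records what was found before the
    requested flags were applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* State of one block as found by the allocation (hypothetical layout). */
typedef struct {
    bool reused;      /* an existing buffer was actually reused */
    bool persistent;  /* it was already in PERSISTENT mode */
    bool hw_ref;      /* it already held the HW reference */
} kmem_block_t;

/* Returns true if the blocks may be used together, false on inconsistency. */
bool kmem_blocks_consistent(const kmem_block_t *blk, size_t n,
                            bool want_persistent, bool want_hw)
{
    bool seen_reused = false, seen_new = false;
    bool persistent = false, hw_ref = false;

    assert(blk != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!blk[i].reused) {
            seen_new = true;
            continue;
        }
        if (!seen_reused) {
            /* The first reused block defines the expected state. */
            seen_reused = true;
            persistent = blk[i].persistent;
            hw_ref = blk[i].hw_ref;
        } else if (blk[i].persistent != persistent || blk[i].hw_ref != hw_ref) {
            return false;   /* PERSISTENT mode or HW reference inconsistency */
        }
    }

    /* Partly reused, partly new: new blocks only get what was requested. */
    if (seen_reused && seen_new) {
        if (persistent && !want_persistent)
            return false;   /* condition a */
        if (hw_ref && !want_hw)
            return false;   /* condition b */
    }
    return true;
}
```

    Blocks that are all new, or all reused in non-PERSISTENT mode, pass the
    check; a mix of reused PERSISTENT blocks and new ones passes only when the
    PERSISTENT flag is requested as well.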
     
    On allocation error at some of the buffers, call the clean routine and
     * Preserve PERSISTENT mode and HW reference if the buffers held them before
     the unsuccessful kmem initialization. Until the last failed block, the
     blocks of kmem should be consistent. The HW/PERSISTENT flags should be
     removed if all reused blocks were in HW/PERSISTENT mode. The last block
     needs special treatment. The flags may be removed for the block if it was
     in HW/PERSISTENT state (and the others were not).
     * Remove REUSE flag, we want to clean if allowed by the current buffer
     status
     * EXCLUSIVE flag is not important for the kmem_free routine.
    
 - At DMA level
    There are 4 components of DMA access:
    * DMA engine enabled/disabled
    * DMA engine IRQs enabled/disabled - always enabled at startup
    * Memory buffers
    * Ring start/stop pointers
    
    To prevent multiple processes from accessing the DMA engine in parallel,
    the first action is buffer initialization, which will fail if the buffers
    are already used
        * Always with REUSE, EXCLUSIVE, and HW flags
        * Optionally with PERSISTENT flag (if DMA_PERSISTENT flag is set)
    If another DMA app is running, the buffer allocation will fail (no dma_stop
    is executed in this case)

    Depending on the PRESERVE flag, kmem_free will be called with the REUSE
    flag, keeping the buffer in memory (this is redundant since the HW flag is
    enough), or with the HW flag, indicating that the DMA engine is stopped and
    the buffer can be cleaned. The PERSISTENT flag is defined by the
    DMA_PERSISTENT flag passed to the stop routine.
    
    PRESERVE flag is enforced if DMA_PERSISTENT is not passed to the dma_stop
    routine and it is either:
        a) Explicitly set by the DMA_PERMANENT flag passed to the dma_start
        function
        b) Implicitly set if the DMA engine is already enabled during
        dma_start, all buffers are reused, and are in persistent mode.
    If the PRESERVE flag is on, the engine will not be stopped at the end of
    execution (and the buffers will stay because of the HW flag).
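
    The PRESERVE decision and the resulting kmem_free flags described above
    can be written down as two small functions. This is a sketch under stated
    assumptions: the dma_start_state_t structure, the function names, and the
    numeric flag values are invented for illustration and are not taken from
    the NWL implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names follow the text; the numeric values are made up for this sketch. */
enum {
    KMEM_FLAG_REUSE      = 1 << 0,
    KMEM_FLAG_PERSISTENT = 1 << 2,
    KMEM_FLAG_HW         = 1 << 3
};

/* What dma_start observed (hypothetical structure). */
typedef struct {
    bool start_preserve;      /* preservation requested explicitly in dma_start */
    bool engine_was_enabled;  /* DMA engine was already on */
    bool all_reused;          /* every buffer was reused */
    bool all_persistent;      /* every buffer was in PERSISTENT mode */
} dma_start_state_t;

/* PRESERVE is enforced only if dma_stop was not asked to drop persistence. */
bool dma_preserve(const dma_start_state_t *s, bool stop_persistent)
{
    assert(s != NULL);
    if (stop_persistent)
        return false;
    return s->start_preserve ||
           (s->engine_was_enabled && s->all_reused && s->all_persistent);
}

/* Flags handed to kmem_free at the end of execution. */
unsigned dma_stop_free_flags(const dma_start_state_t *s, bool stop_persistent)
{
    unsigned flags = dma_preserve(s, stop_persistent) ? KMEM_FLAG_REUSE
                                                      : KMEM_FLAG_HW;
    if (stop_persistent)
        flags |= KMEM_FLAG_PERSISTENT;
    return flags;
}
```

    With PRESERVE on, the HW flag is never passed to kmem_free, so the hardware
    reference stays and the buffers survive the end of the application.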
    
    If buffers are reused and are already in PERSISTENT mode, the DMA engine
    was on before dma_start (PRESERVE flag is ignored, because it can be
    enforced), ring pointers are calculated from LAST_BD and the states of
    ring elements. If the previous application crashed (i.e. buffers may be
    corrupted), two cases are possible:
    * If during the call buffers were in non-PERSISTENT mode, it can be
    easily detected - buffers are reused, but are not in PERSISTENT mode
    (or at least were not before we set them to). In this case we just
    reinitialize all buffers.
    * If during the call buffers were in PERSISTENT mode, it is up to the
    user to check their consistency and restart the DMA engine.
    
    IRQs are enabled and disabled at each call

DMA Reads
=========
standard:               default reading mode, reads a single full packet
multipacket:            reads all available packets
waiting multipacket:    reads all available packets, after finishing the
                        last one waits to see if new data arrives
exact read:             reads exactly the specified number of bytes (should
                        only be supported if it is a multiple of packets,
                        otherwise an error should be returned)
ignore packets:         autoterminates each buffer, depends on engine
                        configuration

 To handle the different cases, the value returned by the callback function
instructs the DMA library how long to wait for the next data to appear before
timing out. The following variants are possible:
terminate:              just bail out
check:                  no timeout, just check if there is data, otherwise
                        terminate
timeout:                standard DMA timeout, normally used while receiving
                        fragments of a packet: in this case it is expected
                        that the device has already prepared the data and only
                        the performance of the DMA engine limits transfer speed
wait:                   wait until the data is prepared by the device, this
                        timeout is specified as an argument to the dma_stream
                        function (standard DMA timeout is used by default)

                        first | new_pkt    | buffer
                        -------------------------------
standard                wait  | term       | timeout
multiple packets        wait  | check      | timeout     - DMA_READ_FLAG_MULTIPACKET
waiting multipacket     wait  | wait       | timeout     - DMA_READ_FLAG_WAIT
exact                   wait  | wait/term  | timeout     - limited by size parameter
ignore packets          wait  | wait/check | wait/check  - just autoterminated
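
 The first three rows of the table can be expressed as a small lookup that a
callback could use to pick its return value. This is only a sketch: the enum
names, the dma_next_wait function, and the numeric values of the two
DMA_READ_FLAG_* flags are assumptions. The "exact" and "ignore packets" rows
are left out because their cells list alternatives that depend on the size
parameter and the engine configuration.

```c
#include <assert.h>

/* How long to wait for the next data (the four variants listed above). */
typedef enum {
    DMA_WAIT_TERMINATE,
    DMA_WAIT_CHECK,
    DMA_WAIT_TIMEOUT,
    DMA_WAIT_WAIT
} dma_wait_t;

/* What the library is about to read next (the three table columns). */
typedef enum {
    DMA_STAGE_FIRST,       /* first data of the call */
    DMA_STAGE_NEW_PACKET,  /* start of a further packet */
    DMA_STAGE_BUFFER       /* next buffer of the current packet */
} dma_stage_t;

/* Flag names follow the table; the values are made up for this sketch. */
enum {
    DMA_READ_FLAG_MULTIPACKET = 1 << 0,
    DMA_READ_FLAG_WAIT        = 1 << 1
};

dma_wait_t dma_next_wait(unsigned flags, dma_stage_t stage)
{
    switch (stage) {
    case DMA_STAGE_FIRST:
        return DMA_WAIT_WAIT;          /* always wait for the first data */
    case DMA_STAGE_BUFFER:
        return DMA_WAIT_TIMEOUT;       /* packet fragments: standard timeout */
    case DMA_STAGE_NEW_PACKET:
        if (flags & DMA_READ_FLAG_WAIT)
            return DMA_WAIT_WAIT;      /* waiting multipacket */
        if (flags & DMA_READ_FLAG_MULTIPACKET)
            return DMA_WAIT_CHECK;     /* multipacket */
        return DMA_WAIT_TERMINATE;     /* standard */
    }
    assert(0 && "unknown stage");
    return DMA_WAIT_TERMINATE;
}
```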

Shall we do special handling in case of overflow?
    
Buffering
=========
 The DMA addresses are limited to 32 bits (~4GB for everything). This means we
 can't really use DMA pages as sole buffers. Therefore, a second thread, with
 a realtime scheduling policy if possible, will be spawned and will copy the
 data from the DMA pages into the allocated buffers. On expiration of the
 duration or number of events set by the autostop call, this thread will be
 stopped, but processing in streaming mode will continue until all copied data
 is passed to the callbacks.

 To avoid stalls, the IPECamera requires data to be read out continuously.
 For this reason, there are no locks in the readout thread. It will simply
 overwrite the old frames if data is not copied out in time. To handle this
 case, after getting the data and processing it, the calling application
 should use the return_data function and check the return code. This function
 may return an error indicating that the data was overwritten meanwhile.
 Hence, the data is corrupted and should be dropped by the application. The
 copy_data function performs this check, and the user application can be sure
 it gets coherent data in this case.
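
 The overwrite check described above can be illustrated with a ring of frame
 slots tagged by event id. This is a single-threaded sketch with hypothetical
 names (frame_ring_t, ring_push, ring_get_data, ring_return_data), not the
 IPECamera code; a real readout thread would additionally need atomic accesses
 or memory barriers around the id field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 4
#define FRAME_SIZE 16

typedef struct {
    unsigned long id;                 /* event id stored in the slot, 0 = empty */
    unsigned char data[FRAME_SIZE];
} frame_slot_t;

typedef struct {
    frame_slot_t slot[RING_SLOTS];
    unsigned long next_id;
} frame_ring_t;

void ring_init(frame_ring_t *r)
{
    memset(r, 0, sizeof(*r));
    r->next_id = 1;
}

/* Writer side: never blocks, silently overwrites the oldest frame. */
unsigned long ring_push(frame_ring_t *r, const void *src, size_t size)
{
    assert(size <= FRAME_SIZE);
    unsigned long id = r->next_id++;
    frame_slot_t *s = &r->slot[id % RING_SLOTS];
    s->id = id;                       /* retag first, so old readers notice */
    memcpy(s->data, src, size);
    return id;
}

/* Reader side: NULL if the frame is already gone. */
const unsigned char *ring_get_data(const frame_ring_t *r, unsigned long id)
{
    const frame_slot_t *s = &r->slot[id % RING_SLOTS];
    return (id != 0 && s->id == id) ? s->data : NULL;
}

/* Reader side: 0 if the frame stayed intact while it was processed,
 * -1 if it was overwritten and the results must be dropped. */
int ring_return_data(const frame_ring_t *r, unsigned long id)
{
    return (id != 0 && r->slot[id % RING_SLOTS].id == id) ? 0 : -1;
}
```

 The reader never delays the writer; it only learns afterwards, from the
 return code, whether the frame it worked on was still valid.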
 
 There is a way to avoid this problem. For raw data, the rawdata callback
 can be requested. This callback blocks execution of the readout thread, and
 the data may be treated safely by the calling application. However, this may
 cause problems for the electronics. Therefore, normally only memcpy should be
 performed on the data.

 The reconstructed data, however, may be safely accessed. As described above,
 the raw data will be continuously overwritten by the reader thread. However,
 reconstructed data, upon the get_data call, will be protected by the mutex.


Register Access Synchronization
===============================
 We need to serialize access to the registers by the different running
 applications and handle the case when registers are accessed indirectly by
 writing PCI BARs (DMA implementations, for instance).

 - Module-assisted locking:
 * During initialization the locking context is created (which is basically
 a kmem_handle of type LOCK_PAGE).
 * This locking context is passed to the kernel module along with the lock
 type (LOCK_BANK) and lock item (BANK ADDRESS). If the lock context already
 owns a lock on the specified bank, just the reference number is increased,
 otherwise we try to obtain a new lock.
 * The kernel module just iterates over all registered lock pages and checks
 if any holds the specified lock. If not, the lock is obtained and registered
 in our lock page.
 * This allows sharing access between multiple threads of a single application
 (by using the same lock page) or protecting it (by using a separate lock page
 for each of the threads)
 * Either on application cleanup or if the application crashed, the memory
 mapping of the lock page is removed and, hence, locks are freed.
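
 A user-space model of the module-assisted locking steps above, to make the
 reference counting concrete. All types and functions here (lock_page_t,
 lock_registry_t, lock_acquire and so on) and the fixed table sizes are
 hypothetical; the actual kernel module interface is not specified in these
 notes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAGES 4   /* registered lock pages */
#define MAX_LOCKS 8   /* locks per page */

typedef enum { LOCK_NONE = 0, LOCK_BANK } lock_type_t;

typedef struct {
    lock_type_t   type;
    unsigned long item;   /* e.g. the bank address */
    unsigned      refs;   /* 0 means the entry is free */
} lock_entry_t;

typedef struct { lock_entry_t entry[MAX_LOCKS]; } lock_page_t;
typedef struct { lock_page_t *page[MAX_PAGES]; size_t n_pages; } lock_registry_t;

static lock_entry_t *page_find(lock_page_t *p, lock_type_t type, unsigned long item)
{
    for (size_t i = 0; i < MAX_LOCKS; i++)
        if (p->entry[i].refs > 0 && p->entry[i].type == type && p->entry[i].item == item)
            return &p->entry[i];
    return NULL;
}

int lock_page_register(lock_registry_t *reg, lock_page_t *page)
{
    assert(reg != NULL && page != NULL);
    if (reg->n_pages == MAX_PAGES)
        return -1;
    reg->page[reg->n_pages++] = page;
    return 0;
}

/* Returns 0 on success, -1 if another page holds the lock or our page is full. */
int lock_acquire(lock_registry_t *reg, lock_page_t *own, lock_type_t type, unsigned long item)
{
    assert(reg != NULL && own != NULL && type != LOCK_NONE);

    lock_entry_t *e = page_find(own, type, item);
    if (e) {                          /* we already own it: just count */
        e->refs++;
        return 0;
    }
    for (size_t i = 0; i < reg->n_pages; i++)
        if (reg->page[i] != own && page_find(reg->page[i], type, item))
            return -1;                /* held through another lock page */
    for (size_t i = 0; i < MAX_LOCKS; i++) {
        if (own->entry[i].refs == 0) {
            own->entry[i].type = type;
            own->entry[i].item = item;
            own->entry[i].refs = 1;
            return 0;
        }
    }
    return -1;
}

int lock_release(lock_page_t *own, lock_type_t type, unsigned long item)
{
    lock_entry_t *e = page_find(own, type, item);
    if (!e)
        return -1;
    e->refs--;
    return 0;
}

/* Models removal of the memory mapping: every lock of the page is freed. */
void lock_page_drop(lock_registry_t *reg, lock_page_t *own)
{
    for (size_t i = 0; i < reg->n_pages; i++) {
        if (reg->page[i] == own) {
            reg->n_pages--;
            reg->page[i] = reg->page[reg->n_pages];
            break;
        }
    }
    memset(own, 0, sizeof(*own));
}
```

 Threads sharing one lock page only raise the reference count, while a second
 page asking for the same bank is refused until the first page releases the
 lock or is dropped.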
 
 - Multiple ways of accessing registers
 Because of reference counting, we can successfully obtain locks multiple
 times if necessary. The following locks protect register access:
  a) Global register_read/write locks the bank before executing the
  implementation
  b) The DMA bank is locked by global DMA functions. So we can access the
  registers using plain PCI BAR read/write.
  c) A sequence of register operations can be protected with the
  pcilib_lock_bank function
 Reading raw register space or PCI bank is not locked.
  * Ok. We can detect banks which will be affected by PCI read/write and
  lock them. But shall we do it?
 
Register/DMA Configuration
==========================
 - XML description of registers
 - Formal XML-based (or non XML-based) language for DMA implementation.
   a) Writing/Reading register values
   b) Wait until <register1>=<value> on <register2>=<value> report error
   c) ... ?

IRQ Handling
============
 IRQ types: DMA IRQ, Event IRQ, other types
 IRQ hardware source: To allow a purely user-space implementation, as a
 general rule, only a single (standard) source should be used.
 IRQ source: The dma/event engines, however, may detail this hardware source
 and produce the real IRQ source based on the values of registers. For
 example, for DMA IRQs the source may represent the engine number and for
 Event IRQs the source may represent the event type.

 Only types can be enabled or disabled. The sources are enabled/disabled
 by enabling/disabling the corresponding DMA engines or Event types. The
 expected workflow is the following:
 * We enable IRQs in user space (normally by setting some registers). Normally
 just the Event IRQs; the DMA IRQs, if necessary, will be managed by the DMA
 engine itself.
 * We wait for the standard IRQ from the hardware (driver)
 * In user space, we check registers to find out the real source of the IRQ
 (the driver reports just the hardware source), generate the appropriate
 events, and acknowledge the IRQ. This is dependent on the implementation and
 should be managed inside the event API.
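
 The last step of the workflow, finding the real source from register values,
 could look roughly like this. The register layout is purely an assumption
 (one status bit per event type, acknowledged by writing the same bits back),
 and irq_dispatch, event_handler_t, and count_event are hypothetical names; the
 real decoding depends on the event engine implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*event_handler_t)(unsigned event_type, void *ctx);

/* Example handler: just counts the generated events. */
void count_event(unsigned event_type, void *ctx)
{
    (void)event_type;
    (*(unsigned *)ctx)++;
}

/* Decode a status register value read after the hardware IRQ arrived.
 * Assumed layout: bit N set means an event of type N is pending.
 * Returns the mask that would be written back to acknowledge the IRQ. */
uint32_t irq_dispatch(uint32_t status, event_handler_t handler, void *ctx)
{
    uint32_t ack = 0;

    assert(handler != NULL);

    for (unsigned type = 0; type < 32; type++) {
        uint32_t bit = (uint32_t)1 << type;
        if (status & bit) {
            handler(type, ctx);   /* generate the event for this source */
            ack |= bit;
        }
    }
    return ack;
}
```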
 
 I.e. the driver implements just two methods: pcilib_wait_irq(hw_source) and
 pcilib_clear_irq(hw_source). Only a few hardware IRQ sources are defined.
 In most circumstances, the IRQ_SOURCE_DEFAULT is used.
 
 The DMA engine may provide 3 additional methods, to enable, disable,
 and acknowledge IRQ.
 
 ... To be decided in detail as the need arises...

Updating Firmware
=================
 - JTAG should be connected to the USB connector on the board (next to Ethernet)
 - The computer should be turned off and on before programming
 - The environment variables should be loaded
    . /home/uros/.bashrc
 - The application is called 'impact'
    No project is needed, cancel initial proposals (No/Cancel)
    Double-click on "Boundary Scan"
    Right click in the right window and select "Init Chain"
    We don't want to select a bit file now (Yes and, then, click Cancel)
    Right click on the second (right) item and choose "Assign new CF file"
    Select a bit file. Answer No, we don't want to attach SPI to SPI Prom
    Select xv6vlx240t and program it
 - Shut down and start the computer
 
 Firmware images are in
    v.2: /home/uros/Repo/UFO2_last_good_version_UFO2.bit
    v.3: /home/uros/Repo/UFO3
        Step5 - best working revision
        Step6 - last revision