Thursday, October 21, 2010

Porting the Linux e1000e Driver to RTnet

My client just switched his hardware to a SOM that comes with an on-board E1000 PCIe chip. The RTnet (or Linux) e1000_new driver did not recognize the card, so I had to hack the e1000e driver.

Here are the steps (mostly in netdev.c):
- borrow Makefile from RTnet e1000_new;
- junk Ethtool code, disable IPv6;
- short out all of the TCP offloading code -- this is a real-time networking stack;
- rip out the Linux stack calls and replace them with RTDM equivalents where applicable;
- ditto for skb->rtskb;
- wrap kmalloc / kcalloc / kzalloc for RTDM (first sketch after this list);
- #define rtskb_copy_to_linear_data and rtskb_copy_to_linear_data_offset as plain memcpy (also in the first sketch);
- pre-allocate a pool of 256 rtskbs for RX and initialise it (second sketch below);
- switch IRQ allocation to legacy INTx (not MSI);
- connect to STACK_manager in e1000_open;
- add an I/O RT task that acts as the RTnet bottom half: it waits on a semaphore, calls e1000_clean() and, if packets were received, signals RTnet to process them (third sketch below);
- change the (legacy) ISR to signal the semaphore above if the IRQ belongs to the driver;
- remove the QoS code;
- remove the VLAN code;
- disable the multicast code -- RTnet's support for it is flaky;
- disable the set-MAC code -- RTnet does not support it;
- disable the skb frags code -- RTnet does not support it;
- disable the change-MTU code -- RTnet does not support it;
- disable the power management code -- RTnet does not support it;
- modify RTnet's configure script to accept "--enable-e1000e".
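
The allocation wrappers and the copy helpers are mechanical. Here is a minimal sketch of what they can look like; the wrapper names are mine and the mapping onto the RTDM heap (rtdm_malloc/rtdm_free) is an assumption about how the port glues them in, not the exact code:

    #include <rtdm/rtdm_driver.h>   /* RTDM driver API (Xenomai 2.x era) */
    #include <linux/slab.h>
    #include <linux/string.h>

    /* kmalloc()/kzalloc() stand-ins backed by the RTDM heap; the GFP flags
     * are accepted but ignored since rtdm_malloc() has no such notion. */
    static inline void *rt_kmalloc(size_t size, gfp_t flags)
    {
        return rtdm_malloc(size);
    }

    static inline void *rt_kzalloc(size_t size, gfp_t flags)
    {
        void *p = rtdm_malloc(size);
        if (p)
            memset(p, 0, size);
        return p;
    }

    /* The "linear data" copy helpers collapse to plain memcpy() on rtskbs. */
    #define rtskb_copy_to_linear_data(skb, from, len) \
        memcpy((skb)->data, (from), (len))
    #define rtskb_copy_to_linear_data_offset(skb, offset, from, len) \
        memcpy((skb)->data + (offset), (from), (len))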
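
For the RX pool, something along these lines; the skb_pool field and the exact rtskb_pool_init()/rtskb_pool_release() semantics are assumptions based on how RTnet's other e1000-class drivers of that era do it:

    #define RT_E1000E_RX_POOL_SIZE 256

    /* Pre-allocate the RX rtskb pool once per adapter (called from probe/open).
     * Assumes rtskb_pool_init() returns the number of buffers it managed to
     * allocate; anything short of the requested size is treated as an error. */
    static int e1000e_setup_rx_pool(struct e1000_adapter *adapter)
    {
        if (rtskb_pool_init(&adapter->skb_pool, RT_E1000E_RX_POOL_SIZE)
                < RT_E1000E_RX_POOL_SIZE) {
            rtskb_pool_release(&adapter->skb_pool);
            return -ENOMEM;
        }
        return 0;
    }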
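
The interesting part is the legacy IRQ plus the bottom-half task. Here is a sketch of the shape it takes; the field names (irq_handle, rx_sem, rx_task, netdev), the task priority and the way e1000_clean() reports received packets are placeholders/assumptions, while rtdm_irq_request, rtdm_sem_*, rtdm_task_init and the rt_stack_connect/rt_mark_stack_mgr hooks are what RTDM/RTnet drivers of that era use:

    /* Hypothetical per-adapter additions:
     *   rtdm_irq_t irq_handle; rtdm_sem_t rx_sem; rtdm_task_t rx_task;
     *   struct rtnet_device *netdev;
     */

    /* Legacy (INTx) interrupt handler: check the cause register and, if the
     * interrupt is ours, wake the bottom-half task via the semaphore. */
    static int e1000_intr_rt(rtdm_irq_t *irq_handle)
    {
        struct e1000_adapter *adapter =
            rtdm_irq_get_arg(irq_handle, struct e1000_adapter);
        struct e1000_hw *hw = &adapter->hw;
        u32 icr = er32(ICR);            /* reading ICR also acks the IRQ */

        if (!icr)
            return RTDM_IRQ_NONE;       /* shared line, not our interrupt */

        rtdm_sem_up(&adapter->rx_sem);
        return RTDM_IRQ_HANDLED;
    }

    /* RTDM task acting as the RTnet bottom half: wait for the ISR, clean
     * the rings, then tell RTnet's stack manager there is work to do. */
    static void e1000_rx_task(void *arg)
    {
        struct e1000_adapter *adapter = arg;

        while (rtdm_sem_down(&adapter->rx_sem) == 0) {
            if (e1000_clean(adapter))                 /* packets received?  */
                rt_mark_stack_mgr(adapter->netdev);   /* wake stack manager */
        }
    }

    /* In e1000_open(): connect to the stack manager, grab the legacy IRQ
     * and start the bottom-half task. */
    static int e1000e_open_rt_bits(struct e1000_adapter *adapter)
    {
        int err;

        rt_stack_connect(adapter->netdev, &STACK_manager);
        rtdm_sem_init(&adapter->rx_sem, 0);

        err = rtdm_irq_request(&adapter->irq_handle, adapter->pdev->irq,
                               e1000_intr_rt, 0, "rt_e1000e", adapter);
        if (err)
            return err;

        return rtdm_task_init(&adapter->rx_task, "rt_e1000e-rx", e1000_rx_task,
                              adapter, RTDM_TASK_HIGHEST_PRIORITY, 0);
    }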

I spent most of the time getting the hacked code to compile. After I got an IRQ firing I spent a lot of time making sure the driver was stable and that inserting/removing its module did not thrash the box.

The ported driver (base kernel version is 2.6.29) is here.

-ulianov

Friday, October 15, 2010

A Very Nice and Very UNIXy Bug

I encountered a very puzzling bug in a user-land app I helped write. I wrote the socket comms library and the database backend (using SQLite3).

Once in a blue moon one of the SQLite3 databases would become corrupted and be written off by the integrity_check pragma. The first thing to blame was SQLite3 used with multithreading (though I had it compiled for that). It ended up not being this.

What happened was so much better: the socket library was used by a thread pool, so when a new request came in it was handed to a worker thread, which had to talk to some innards of the app and formulate a result. In the meantime the accepted socket was kept open.

The comms library was used on the client side by ephemeral CGIs that were called by a YUI2 webapp. The web server was thttpd -- it has a CGI timeout cut-off after which it kills the child CGIs.

The innards of the app could sometimes take longer to respond than the CGI cut-off time, so the CGI vanished and the client socket was closed. But on the app (server) side the TCP socket was still held open. When the innards finally finished doing whatever they were doing, the data would be written to the dangling socket using write(2).

But meanwhile, in another galax^H^H^H^Hthread, an SQLite3 database would be opened, used and closed. UNIX hands out the lowest-numbered free file descriptor, so a freed socket descriptor is immediately up for reuse.

See the conflict? The worker thread would write some late-arriving data to what it thought was a socket but was by then the database's file descriptor!
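
To see the mechanism in isolation, here is a minimal user-land illustration of the descriptor reuse (the fd numbers in the comments are just what a small program typically gets):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);   /* typically fd 3 */
        printf("socket fd = %d\n", s);
        close(s);                                  /* fd 3 is free again */

        int db = open("some.db", O_RDWR | O_CREAT, 0644); /* reuses fd 3 */
        printf("db fd     = %d\n", db);

        /* A stale write(s, ...) issued by another thread after the close()
         * now lands in some.db instead of going out on the network. */
        close(db);
        return 0;
    }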

I fixed this by changing all read/write calls in the comms library to recv/send -- the latter pair works only on sockets. For added paranoia I also sprinkled the comms code with getpeername(2) calls and logged a critical error whenever a descriptor did not look like a socket.
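
A sketch of that defensive check (comms_send is a hypothetical helper, not the actual library function): getpeername(2) fails with ENOTSOCK on a descriptor that has been recycled into a plain file, and send(2) refuses to work on anything that is not a socket:

    #include <errno.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <syslog.h>

    /* Refuse to touch a descriptor that no longer looks like a connected
     * socket, and use send(2) (sockets only) instead of write(2). */
    static ssize_t comms_send(int fd, const void *buf, size_t len)
    {
        struct sockaddr_storage peer;
        socklen_t peerlen = sizeof(peer);

        if (getpeername(fd, (struct sockaddr *)&peer, &peerlen) < 0) {
            /* ENOTSOCK: fd recycled into a plain file; ENOTCONN: peer gone */
            syslog(LOG_CRIT, "comms_send: fd %d is not a connected socket: %s",
                   fd, strerror(errno));
            return -1;
        }
        return send(fd, buf, len, MSG_NOSIGNAL);
    }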

Only took me two days to get to the bottom of this.

-ulianov