Thursday, June 19, 2008

Passing Params thru Registers Woes on x86

One year ago I was asked to look into a problem some colleagues had with a networking framework that lived as a set of Linux kernel modules.

The problem they saw was that when certain framework functions were called a parameter [which was a pointer] contained garbage.

They had the hairy idea to start compiling the kernel with "-mregparm=3", altho until then we had lived happily with stack-based calls.

I looked at the code and the makefile and here is how gcc was invoked:
gcc -ggdb -O2 -pipe \
-mregparm=3 \
-Wall -Wstrict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -fno-common \
-fomit-frame-pointer \
-mpreferred-stack-boundary=2 \
-march=pentium code.c
and here is what the offending code looked like [not a verbatim copy]:
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>

#define ENOMEM -2
#define EBADSLT -3
#define ENODEV 0

typedef int (*func_t)(void*, ...);

struct _S2; // forward declaration
struct _S1 {
    char filler1[3];
    func_t f1;
    int (*f2)(long, long, struct _S2*, long);
};

struct _S2 {
    char filler1[13];
    long filler2;
    struct _S1* s;
    char filler3[7];
};

struct _S1 g_S1;
struct _S2 g_S2;

/* f1() is variadic only when built with -DFORCE_STDARG;
   without it the definition no longer matches func_t */
#ifdef FORCE_STDARG
int f1(struct _S1* s, ...)
#else
int f1(struct _S1* s)
#endif
{
#ifdef FORCE_STDARG
    va_list ap;
    va_start(ap, s);
    va_end(ap); /* every va_start needs a matching va_end */
#endif
    if(s != &g_S1)
        return -1;
    return 1;
}

int f2(long i1, long i2, struct _S2* s, long i3)
{
    if(s != &g_S2)
        return -1;
    return 1;
}

int main(void)
{
    g_S1.filler1[0] = 'A';
    g_S1.f1 = (func_t)f1; /* the cast hides any prototype mismatch */
    g_S1.f2 = f2;

    g_S2.filler2 = 13;
    g_S2.s = &g_S1;

    if(g_S1.f1(&g_S1) < 0)
        return -ENOMEM;
    if(g_S2.s->f2(1, 2, &g_S2, 3) < 0)
        return -EBADSLT;

    return -ENODEV;
}

I noticed, comparing the disassembled code (objdump -dSl code.o), that the call setup for
g_S1.f1(&g_S1)
was the same regardless of "-mregparm" -- this is because the compiler applies the function-pointer type
typedef int (*func_t)(void*, ...);
which is variadic and therefore forces it to put all the arguments on the stack.

However the declaration
int f1(struct _S1* s);
in connection with "-mregparm=3" has f1() looking for its first argument in %eax, which the caller never loaded and which therefore contains some garbage!
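
The cast in `g_S1.f1 = (func_t)f1;` is what let the mismatch slip through without a warning. One possible guard, sketched here under the assumption of a C11 compiler, is to check at build time that a handler really has the variadic pointer type and to assign it without a cast. The names `FN_MATCHES`, `variadic_handler`, `plain_handler` and `register_and_call` are made up for this illustration and are not part of the original framework.

```c
#include <assert.h>
#include <stddef.h>

/* Same callback type as in the post. */
typedef int (*func_t)(void *, ...);

/* Hypothetical helper: 1 if &fn has a type compatible with ptr_type, else 0. */
#define FN_MATCHES(fn, ptr_type) _Generic(&(fn), ptr_type: 1, default: 0)

/* Hypothetical handlers: one repeats the "...", one forgets it. */
int variadic_handler(void *p, ...) { return p != NULL; }
int plain_handler(void *p)         { return p != NULL; }

/* The build breaks here if someone drops the "..." from variadic_handler. */
_Static_assert(FN_MATCHES(variadic_handler, func_t),
               "variadic_handler must match func_t");

/* plain_handler is rejected: int (*)(void *) is not func_t. */
_Static_assert(!FN_MATCHES(plain_handler, func_t),
               "plain_handler unexpectedly matches func_t");

/* Assigning without a cast keeps the compiler's own type check alive. */
int register_and_call(void *arg)
{
    func_t cb = variadic_handler;
    assert(cb != NULL);
    return cb(arg);
}
```

With a check like this in place, an implementation that loses its "..." fails to compile instead of reading garbage from a register at run time.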

Hint: compile the code with -DFORCE_STDARG and without and see the difference in execution.

The moral is two-fold:
1. if you use "..." in a function prototype (or in a function-pointer type) then also use it in the function implementation!
2. in the kernel, passing function arguments thru registers will yield little or no gain in execution speed (as only the leaf functions will fully benefit from the stack-free operation).
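
Moral 1 can be sketched as follows. This is a minimal illustration, not the framework's code: `sum_extra` and `call_through_pointer` are hypothetical names, and the convention that the fixed argument points at a count of extra ints is an assumption made only for the example. The point is that the definition repeats the "..." of the pointer type, so caller and callee agree on where the arguments live and no cast is needed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Same shape as func_t in the post: one fixed argument, then "...". */
typedef int (*func_t)(void *, ...);

/* Hypothetical callback whose definition is variadic, just like func_t. */
int sum_extra(void *ctx, ...)
{
    int *count = ctx;  /* assumption: ctx points at the number of extra ints */
    va_list ap;
    int total = 0;

    assert(ctx != NULL);
    va_start(ap, ctx);
    for (int i = 0; i < *count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Hypothetical caller: the assignment needs no cast because the types match. */
int call_through_pointer(void)
{
    func_t cb = sum_extra;
    int count = 3;
    return cb(&count, 10, 20, 12);
}
```

Here `call_through_pointer()` adds the three extra arguments and returns 42, with or without "-mregparm=3", since a variadic callee always reads its arguments from the stack.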