Sometimes you want to implement classes, but you're using a language such as C rather than C++. While a language like C++ was designed specifically to make it easy to use classes, you can use classes in any language. Here is one way to go about it in C. The general technique works in any language - even assembler. Note that this is NOT a way to get all of the features of C++ in C - if you want that, then just use C++.
Obviously, a class in C must involve a struct, especially as in C++ a class and a struct are the same thing (apart from the default access level). This takes us to the first relevant difference between C and C++ - access control.
C++ provides public, protected, and private fields, whereas C only provides public fields. You need to decide whether this matters at all - note that the meaning of a correct C++ program is not changed by changing the access control to 'public' on all protected and private fields. If you feel that you need a reminder, you can modify the field name to indicate the access control:
class C { int i; protected: int j; public: int k; };
could be implemented as:
typedef struct C { int i_v; int j_t; int k; } C;
using the convention that _v means priVate and _t means proTected. Note that 'public' does not need to be marked, as you don't need to be reminded when using a public field to check that you are using it correctly.
Generally, it is much easier to ignore access control - the vast majority of fields are normally private, so you can simply ensure that you are only accessing fields in the member functions of that class. Hopefully your code is neat and compact enough that this is sufficiently obvious.
this
You will need to pass the 'this' pointer explicitly into member
functions. In C++ the compiler does this for you behind the scenes.
In C++ the only assistance that the language provides over C for
non-virtual member functions is naming. You can have two member
functions, A::foo() and B::foo() visible at the same time whereas in C
all functions must have unique names. Of course A::foo() is unique
when fully qualified, so the obvious way to handle foo() in C is to
have functions A_foo() and B_foo(). This does not
guarantee uniqueness, but any conflict will show up as a multiply-defined
function, so it is easily fixed. The only two functions which
cannot use this scheme are the constructor and destructor. In C, the
function A() is illegal if you have a typedef for A, and ~A() is always
illegal. I suggest using the names A_ctor() and A_dtor(). Again, this
does not absolutely guarantee uniqueness, but any conflicts are easily
sorted out.
class C { int a, b; public: int foo(int c) { return a + c; } };
would be:
typedef struct C { int a, b; } C; int C_foo(C *this, int c) { return this->a + c; }
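The constructor and destructor naming can be sketched the same way. The following is only an illustration: the class Counter, its fields, and Counter_next() are hypothetical names that do not appear in the text above, and the _v suffix follows the optional private-field convention described earlier.

```c
/* Hypothetical class, used only to illustrate the naming scheme. */
typedef struct Counter {
    int value_v;   /* private: current value */
    int step_v;    /* private: amount added by each call to Counter_next() */
} Counter;

/* Stands in for Counter::Counter(int step). */
void Counter_ctor(Counter *this, int step)
{
    this->value_v = 0;
    this->step_v = step;
}

/* Stands in for Counter::~Counter(); this class has nothing to release. */
void Counter_dtor(Counter *this)
{
    (void)this;
}

/* Stands in for int Counter::next(); 'this' is passed explicitly. */
int Counter_next(Counter *this)
{
    this->value_v += this->step_v;
    return this->value_v;
}
```

Unlike C++, nothing calls Counter_ctor() or Counter_dtor() automatically, so the caller has to invoke them at the points where the object's lifetime begins and ends.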
If you don't like function names with capital letters (especially since that conflicts with standard C practice) then you can use the lowercase equivalent, which then means the constructor can simply name the class:
C++ | C |
---|---|
Foo() | foo() |
Foo::bar() | foo_bar() |
~Foo() | foo_dtor() |
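
One possible reading of the table, as a sketch: the struct keeps its capitalised name so that the lowercase foo() is free to be the constructor. The field calls_v and the body of foo_bar() are assumptions made up for this illustration.

```c
/* The type keeps the capitalised name; only the functions are lowercase. */
typedef struct Foo {
    int calls_v;   /* private (hypothetical field): times foo_bar() has run */
} Foo;

/* Foo::Foo() - the constructor simply names the class in lowercase. */
void foo(Foo *this)
{
    this->calls_v = 0;
}

/* Foo::bar() */
int foo_bar(Foo *this)
{
    this->calls_v += 1;
    return this->calls_v;
}

/* Foo::~Foo() */
void foo_dtor(Foo *this)
{
    (void)this;   /* nothing to release in this sketch */
}
```

Note that this only works if the type itself is not also called foo, since a typedef and a function cannot share one name in the same scope.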
It is quite simple to implement derived classes. A derived class starts with its base class, so
class D : B { int x; };
becomes:
typedef struct D { B b; int x; } D;
Of course, this also means that you must explicitly convert an object of the derived class to one of the base class when necessary. The C++ code:
int foo(B *b, int c) { return b->x + c; } int bar(D *d) { return foo(d, 5); }
must translate into C with bar() as:
int bar(D *d) { return foo(&d->b, 5); }
The compiler should provide adequate warnings for any cases you overlook.
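Putting the pieces of this section together gives something like the sketch below. The field y in D and the two constructors are assumptions added so that the example is complete; foo() and bar() are the functions from the text.

```c
typedef struct B { int x; } B;

/* The base class is the first member, so a D starts with a B. */
typedef struct D { B b; int y; } D;   /* y is a hypothetical extra field */

void B_ctor(B *this, int x)
{
    this->x = x;
}

/* The derived constructor constructs its base part first, as C++ would. */
void D_ctor(D *this, int x, int y)
{
    B_ctor(&this->b, x);
    this->y = y;
}

int foo(B *b, int c)
{
    return b->x + c;
}

/* The derived-to-base conversion is written out explicitly as &d->b. */
int bar(D *d)
{
    return foo(&d->b, 5);
}
```

Because b is the first member, a pointer to the D and a pointer to its b field refer to the same address, which is what makes casting back from B * to D * workable when virtual functions are added.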
Since the name of the base class field does not appear at all in C++,
it is fairly obvious that its name is not very important. We could
simply call it b (for 'base') always. For example, if we
have FilledCircle derived from Circle, derived from Ellipse, derived
from Shape, we could use:
typedef struct Shape { ... } Shape; typedef struct Ellipse { Shape b; ... } Ellipse; typedef struct Circle { Ellipse b; ... } Circle; typedef struct FilledCircle { Circle b; ... } FilledCircle;
However, that means we sometimes need to refer to x.b.b.b which is a bit confusing. We could use:
typedef struct Ellipse { Shape shape; ... } Ellipse; typedef struct Circle { Ellipse ellipse; ... } Circle; typedef struct FilledCircle { Circle circle; ... } FilledCircle;
but that requires us to say x.circle.ellipse.shape which
is excessive - especially remembering that C++ does not need to name
these fields at all. A good compromise is to use a very short but
mnemonic abbreviation:
typedef struct Ellipse { Shape sh; ... } Ellipse; typedef struct Circle { Ellipse el; ... } Circle; typedef struct FilledCircle { Circle ci; ... } FilledCircle;
which gives x.ci.el.sh - easy enough to follow without
big words which contribute little to understanding.
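As a small sketch of how the abbreviated names read in practice: the text elides the real fields, so the ones below (x, y, rx, ry, colour) and the function FilledCircle_move() are hypothetical.

```c
/* All fields other than sh, el and ci are made up for this illustration. */
typedef struct Shape { int x, y; } Shape;
typedef struct Ellipse { Shape sh; int rx, ry; } Ellipse;
typedef struct Circle { Ellipse el; } Circle;
typedef struct FilledCircle { Circle ci; unsigned colour; } FilledCircle;

/* A member function of FilledCircle that touches fields inherited from Shape. */
void FilledCircle_move(FilledCircle *this, int dx, int dy)
{
    Shape *sh = &this->ci.el.sh;   /* walk down to the Shape part once */
    sh->x += dx;
    sh->y += dy;
}
```

Taking a pointer to the base part once, as done here, keeps the long path out of the rest of the function body.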
Virtual functions are fairly easy to implement by using a technique used by C++ compilers, but they require a little bookkeeping and a simplifying assumption to make them simple. In C++ when declaring a derived class we can introduce a new virtual member that was not in the base class. It is much simpler to require the base class to have all virtual functions. We then implement this:
struct B { int x; virtual int foo(int z); virtual int bar(); }; int B::foo(int z) { return x + z; } int B::bar() { return x - 1; } struct D : B { virtual int foo(int z); virtual int bar(); }; int D::foo(int z) { return x + z + 1; } int D::bar() { return x - 2; } struct DD : D { virtual int foo(int z); virtual int bar(); }; int DD::foo(int z) { return x + z + 3; } int DD::bar() { return x - 3; }
as:
typedef struct B B;
/* One table type for the whole hierarchy; every function takes a B *. */
struct B_vtbl { int (*foo)(B *this, int z); int (*bar)(B *this); };
/* Each object holds a pointer to the table for its class. */
struct B { const struct B_vtbl *vtbl; int x; };
static int B_foo(B *this, int z) { return this->x + z; }
static int B_bar(B *this) { return this->x - 1; }
const struct B_vtbl B_vtbl = { B_foo, B_bar };
void B_ctor(B *this) { this->vtbl = &B_vtbl; }
typedef struct D { B b; } D;
static int D_foo(B *this_base, int z) { D *this = (D *) this_base; return this->b.x + z + 1; }
static int D_bar(B *this_base) { D *this = (D *) this_base; return this->b.x - 2; }
const struct B_vtbl D_vtbl = { D_foo, D_bar };
/* Construct the base part first, then install the derived class's table. */
void D_ctor(D *this) { B_ctor(&this->b); this->b.vtbl = &D_vtbl; }
typedef struct DD { D d; } DD;
static int DD_foo(B *this_base, int z) { DD *this = (DD *) this_base; return this->d.b.x + z + 3; }
static int DD_bar(B *this_base) { DD *this = (DD *) this_base; return this->d.b.x - 3; }
const struct B_vtbl DD_vtbl = { DD_foo, DD_bar };
void DD_ctor(DD *this) { D_ctor(&this->d); this->d.b.vtbl = &DD_vtbl; }
This scheme requires you to write a call to a virtual function as
a->vtbl->foo(a, 3) instead of the C++ syntax a->foo(3), but this is not
very inconvenient. The
general principle is that every object contains a pointer to the table
of virtual functions. The constructor sets it up, and calls to virtual
functions go through the table, ensuring that the dynamic type of the
object determines the function to be called. In the above example of
classes derived from Shape, if there is a virtual function
draw(int x, int y), and fs is a pointer to an
instance of FilledCircle, then the C++:
fs->draw(a, b);
would be written as:
fs->ci.el.sh.vtbl->draw(&fs->ci.el.sh, a, b);
which is a bit tedious, but workable.
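A fuller sketch of the Shape hierarchy with a virtual draw() might look like the following. The text does not give the real fields or function bodies, so the fields (rx, ry, colour), the constructor arguments, the printed messages, and the helper draw_any() are all assumptions made for this illustration.

```c
#include <stdio.h>

typedef struct Shape Shape;

/* Table of virtual functions; one instance per class in the hierarchy. */
struct Shape_vtbl {
    void (*draw)(Shape *this, int x, int y);
};

struct Shape { const struct Shape_vtbl *vtbl; };
typedef struct Ellipse { Shape sh; int rx, ry; } Ellipse;
typedef struct Circle { Ellipse el; } Circle;
typedef struct FilledCircle { Circle ci; unsigned colour; } FilledCircle;

static void Shape_draw(Shape *this, int x, int y)
{
    (void)this;
    printf("shape at (%d, %d)\n", x, y);
}

static void Ellipse_draw(Shape *this_base, int x, int y)
{
    Ellipse *this = (Ellipse *)this_base;   /* sh is the first member */
    printf("ellipse %dx%d at (%d, %d)\n", this->rx, this->ry, x, y);
}

static void Circle_draw(Shape *this_base, int x, int y)
{
    Circle *this = (Circle *)this_base;
    printf("circle r=%d at (%d, %d)\n", this->el.rx, x, y);
}

static void FilledCircle_draw(Shape *this_base, int x, int y)
{
    FilledCircle *this = (FilledCircle *)this_base;
    printf("filled circle r=%d colour=%06x at (%d, %d)\n",
           this->ci.el.rx, this->colour, x, y);
}

static const struct Shape_vtbl Shape_vtbl = { Shape_draw };
static const struct Shape_vtbl Ellipse_vtbl = { Ellipse_draw };
static const struct Shape_vtbl Circle_vtbl = { Circle_draw };
static const struct Shape_vtbl FilledCircle_vtbl = { FilledCircle_draw };

/* Each constructor runs its base constructor, then installs its own table. */
void Shape_ctor(Shape *this)
{
    this->vtbl = &Shape_vtbl;
}

void Ellipse_ctor(Ellipse *this, int rx, int ry)
{
    Shape_ctor(&this->sh);
    this->sh.vtbl = &Ellipse_vtbl;
    this->rx = rx;
    this->ry = ry;
}

void Circle_ctor(Circle *this, int r)
{
    Ellipse_ctor(&this->el, r, r);
    this->el.sh.vtbl = &Circle_vtbl;
}

void FilledCircle_ctor(FilledCircle *this, int r, unsigned colour)
{
    Circle_ctor(&this->ci, r);
    this->ci.el.sh.vtbl = &FilledCircle_vtbl;
    this->colour = colour;
}

/* Hypothetical helper: draws any shape according to its dynamic type. */
void draw_any(Shape *s, int x, int y)
{
    s->vtbl->draw(s, x, y);
}
```

With a FilledCircle fc set up by FilledCircle_ctor(), a call such as draw_any(&fc.ci.el.sh, 1, 2) goes through the table and ends up in FilledCircle_draw(), even though draw_any() only ever sees a Shape pointer. A small helper like this is one way to hide the tedious call syntax.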