Why I Dislike C++ For Large Projects

By Mark Roulo

12-June-2001


My primary complaint against C++ is that the language is so complicated, and has enough booby-traps, that average and above-average programmers have a difficult time writing code without serious bugs.

In a large program with raw pointers, the quality of the program will largely be driven by the least talented members of the team. This makes it highly dangerous to select a language that requires all the programmers to be in, say, the top 25% of the talent pool. Given a team with 10 developers (and my projects typically have more like 40-50), this seems to be asking for lots of long-term trouble.

Things become even more unstable when one considers that the average turnover of software developers in Silicon Valley (where I live and work) is something like 20-25% and that many large projects also use contractors. The total number of programmers on a project will be much higher than the average (or even the peak) number of developers on the project.

I have worked on a project with both C++ and Java components (roughly 50% each) that communicate via CORBA. Part of my job has been to interview candidates for both halves of the project over a period of several years. This has given me a fair amount of exposure to the C++ developer pool.

As part of my standard interview for C++ candidates I ask them to write me a small class with the intention of evaluating their command of the language. This also gives us a reasonable coding sample to discuss during the interview. I can ask about potential improvements, extensions and testing strategies.

Most of the candidates I interview have already made it to the top of the resume pool -- usually by claiming at least 3 years of professional experience with C++. Since many resumes have this, the candidates tend to have some other plus: large systems experience, degree from a good school, personal recommendation, etc.

The candidates then must survive a phone screen interview whose job is to weed out candidates that can't, for example, describe any of their projects coherently.

My request is to:

Write a Named Point class with three members: two floating point values for the coordinates on an X-Y plane, and a name represented as a 'char *'. Assume that this class will be used for some sort of wargame or simulation program that treats the world as flat and that these named points will be used to represent things like cities, battlefields, etc.

A typical first try looks something like this:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x    = x;
            this->y    = y;
            this->name = name;
        }

        float getX()    {return x;}
        float getY()    {return y;}
        char *getName() {return name;}

        void  setX(float x)       {this->x = x;}
        void  setY(float y)       {this->y = y;}
        void  setName(char *name) {this->name = name;}
    };
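
One hazard of this first attempt is easy to show in isolation: the class stores the caller's pointer instead of copying the characters, so the point's name changes whenever the caller reuses its buffer. The sketch below is a minimal illustration, assuming a trimmed copy of the class above; the helper nameChangesBehindOurBack and the city names are made up for the example.

```cpp
#include <cassert>
#include <cstring>

// Trimmed copy of the first attempt: the constructor keeps the caller's pointer.
class NamedPoint
{
private:
    float x;
    float y;
    char *name;

public:
    NamedPoint (float x, float y, char *name)
    {
        this->x    = x;
        this->y    = y;
        this->name = name;   // no copy: we now share the caller's buffer
    }

    float getX()    {return x;}
    float getY()    {return y;}
    char *getName() {return name;}
};

// Hypothetical helper: returns true if the point's name changed even though
// nobody called a setter on the point.
bool nameChangesBehindOurBack()
{
    char buffer[16];
    std::strcpy (buffer, "Paris");

    NamedPoint city (48.9f, 2.4f, buffer);

    // The caller reuses its buffer for something else...
    std::strcpy (buffer, "Berlin");

    // ...and the point that was created as "Paris" now reports "Berlin".
    return std::strcmp (city.getName(), "Berlin") == 0;
}
```

Calling assert (nameChangesBehindOurBack()); from a test program passes, which is exactly the problem: the object's state depends on memory it does not own.
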

There are several problems with this first attempt. After I point them out, a typical fix is to modify NamedPoint to look like this (the constructor, getName() and setName() change):

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x    = x;
            this->y    = y;
            this->name = new char[strlen(name) + 1];
            strcpy (this->name, name);
        }

        float getX()          {return x;}
        float getY()          {return y;}
        const char *getName() {return name;}

        void  setX(float x)       {this->x = x;}
        void  setY(float y)       {this->y = y;}
        void  setName(char *name) {this->name = new char[strlen(name) + 1];
                                   strcpy (this->name, name);}
    };

This trades one set of bugs for another. The new version has problems of its own. After I point them out, I usually get a third try that looks like this:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x    = x;
            this->y    = y;
            if (name == NULL)
                this->name = NULL;
            else
            {
                this->name = new char[strlen(name) + 1];
                strcpy (this->name, name);
            }
        }

        ~NamedPoint ()
        {
            if (name != NULL)
                delete name;
        }

        float getX()          {return x;}
        float getY()          {return y;}
        const char *getName() {return name;}

        void  setX(float x)       {this->x = x;}
        void  setY(float y)       {this->y = y;}
        void  setName(char *name) {if (this->name != NULL)
                                       delete this->name;
                                   if (name == NULL)
                                       this->name = NULL;
                                   else
                                   {
                                       this->name = new char[strlen(name) + 1];
                                       strcpy (this->name, name);
                                   }}
    };
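
One thing the third attempt leaves untouched is copying. When a class declares no copy constructor, the compiler-generated one copies the pointer member, not the characters it points to, so two objects end up owning the same buffer and each destructor would delete it. The sketch below shows the sharing under an assumption that keeps it safe to run: Label is a hypothetical stand-in for NamedPoint with the destructor deliberately left out, and the buffer is freed exactly once by hand.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for NamedPoint. It allocates in the constructor but,
// unlike the third attempt, has no destructor, so this demo never deletes twice.
class Label
{
private:
    char *name;

public:
    Label (const char *name)
    {
        this->name = new char[std::strlen (name) + 1];
        std::strcpy (this->name, name);
    }

    const char *getName() const {return name;}

    // Frees the buffer by hand; call it on only one of the sharing objects.
    void release() {delete [] name; name = NULL;}
};

// Returns true if a compiler-generated copy shares its buffer with the original.
bool copySharesBuffer()
{
    Label original ("Paris");
    Label copy (original);          // compiler-generated copy constructor

    bool shared = (copy.getName() == original.getName());

    original.release();             // free the single shared buffer exactly once
    return shared;
}
```

With the third attempt's destructor in place, the same two objects would each try to delete that one buffer when they go out of scope.
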

Things are slowly improving ... in the sense that the bugs are getting more and more subtle. It is also worth mentioning that over half of the candidates don't assign NULL to name if the input 'name' value is NULL, leaving the pointer uninitialized. This really isn't a C++ issue. Failing to initialize pointers in C structs is equally bad.

The new problems include the missing copy constructor and assignment operator. After I point these out, the fourth try usually looks like the code below. But not always. Sometimes I need to explain to the candidates what a copy constructor and assignment operator are. Some candidates have strange beliefs about when you need to implement them. One candidate, for example, believed that copy constructors were needed for classes above some size threshold, but not needed for classes below that size threshold. I'll emphasise that I'm interviewing candidates with several years of C++ experience who have already passed a phone screen.

In any event, typical attempt number four:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x    = x;
            this->y    = y;
            if (name == NULL)
                this->name = NULL;
            else
            {
                this->name = new char[strlen(name) + 1];
                strcpy (this->name, name);
            }
        }

        ~NamedPoint ()
        {
            if (name != NULL)
                delete name;             
        }

        // NOTE: Most interviewees start with a signature
        //       like this:
        //           NamedPoint (NamedPoint copy)
        //
        NamedPoint (const NamedPoint & copy)
        {
            this->x = copy.x;
            this->y = copy.y;

            if (copy.name != NULL)
            {
                this->name = new char[strlen (copy.name) + 1];
                strcpy (this->name, copy.name);
            }
        }

        NamedPoint & operator=(const NamedPoint & copy)
        {
            this->x = copy.x;
            this->y = copy.y;
            if (this->name != NULL)
                delete this->name;

            if (copy.name != NULL)
            {
                this->name = new char[strlen (copy.name) + 1];
                strcpy (this->name, copy.name);
            }

            // Note that we haven't nulled out this->name, so
            // we can get a double-delete problem...
        }

        float getX()          {return x;}
        float getY()          {return y;}
        const char *getName() {return name;}

        void  setX(float x)       {this->x = x;}
        void  setY(float y)       {this->y = y;}
        void  setName(char *name) {if (this->name != NULL)
                                       delete this->name;
                                   if (name == NULL)
                                       this->name = NULL;
                                   else
                                   {
                                       this->name = new char[strlen(name) + 1];
                                       strcpy (this->name, name);
                                   }}
    };

This is almost correct! The big problems remaining are that the assignment operator doesn't check for assignment to itself, the copy constructor leaves 'name' uninitialized if 'copy' has NULL for its name, and we still risk a double-delete via the assignment operator (when 'copy' has a NULL name, the old 'name' is deleted but never set to NULL). If a program does try to assign one of these objects to itself, the object deletes its own 'name' value before attempting to copy it onto itself.

I usually stop here (assuming we get this far).
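
For comparison, here is one way a version that avoids the problems discussed above might look. It is a sketch under stated assumptions, not a definitive implementation: it takes const char * parameters, pairs every new[] with delete[], initializes 'name' on every path, returns *this from the assignment operator, and builds the copy before releasing the old buffer so that self-assignment is harmless. The helper copyName is a name made up for this example.

```cpp
#include <cassert>
#include <cstring>

class NamedPoint
{
private:
    float x;
    float y;
    char *name;

    // Hypothetical helper: returns a heap copy of 'source', or NULL for NULL.
    static char *copyName (const char *source)
    {
        if (source == NULL)
            return NULL;
        char *result = new char[std::strlen (source) + 1];
        std::strcpy (result, source);
        return result;
    }

public:
    NamedPoint (float x, float y, const char *name)
        : x (x), y (y), name (copyName (name))
    {
    }

    NamedPoint (const NamedPoint & copy)
        : x (copy.x), y (copy.y), name (copyName (copy.name))
    {
    }

    ~NamedPoint ()
    {
        delete [] name;              // delete [] on a NULL pointer does nothing
    }

    NamedPoint & operator=(const NamedPoint & copy)
    {
        // Copy first, then release: safe even when &copy == this.
        char *newName = copyName (copy.name);
        delete [] name;
        name = newName;
        x = copy.x;
        y = copy.y;
        return *this;
    }

    float getX() const          {return x;}
    float getY() const          {return y;}
    const char *getName() const {return name;}

    void  setX(float x)       {this->x = x;}
    void  setY(float y)       {this->y = y;}
    void  setName(const char *name)
    {
        // Copy first here too, so p.setName (p.getName()) is safe.
        char *newName = copyName (name);
        delete [] this->name;
        this->name = newName;
    }
};
```

Copying before deleting costs one extra allocation on self-assignment, but it removes the need for an explicit self-assignment check and leaves the object unchanged if the allocation throws.
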

Conclusion

Empirically, I have found it very difficult to find even experienced C++ programmers capable of correctly writing a simple C++ class containing a pointer. Since pointers are, because of the C legacy, an important part of the language, this is a fatal flaw for large projects that need to work. The mistakes in my fairly trivial problem ranged from uninitialized pointers and missing copy constructors and assignment operators to self-assignment bugs and double-deletes.

One solution is to rely much more on things like STL string objects and generally try to hide the heap allocation. The auto_ptr<> and similar classes help here, too. But they only help. The fundamental problem still remains -- it is too easy to write subtly wrong code and the language is littered with booby-traps.
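
As a rough illustration of hiding the heap allocation behind a string object, the sketch below swaps the char * member for a std::string. This departs from the interview question, which asks for a char *, and it assumes an empty string is an acceptable stand-in for "no name". With no raw pointer to own, the compiler-generated copy constructor, assignment operator and destructor already do the right thing.

```cpp
#include <cassert>
#include <string>

class NamedPoint
{
private:
    float       x;
    float       y;
    std::string name;    // owns and copies its own characters

public:
    NamedPoint (float x, float y, const std::string & name)
        : x (x), y (y), name (name)
    {
    }

    float getX() const                  {return x;}
    float getY() const                  {return y;}
    const std::string & getName() const {return name;}

    void  setX(float x)                     {this->x = x;}
    void  setY(float y)                     {this->y = y;}
    void  setName(const std::string & name) {this->name = name;}
};
```

Even here a trap remains: passing a NULL char * where a std::string is expected is still undefined behaviour, so the string class helps without removing every hazard.
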

Larger programs encounter even trickier problems. Scott Meyers has written two books on the subject of not getting killed by the C++ language. My point, though, is that most experienced C++ programmers I have interviewed can't get a simple class correct, even after multiple tries. Enough said. It makes me unwilling to risk a large project with the language.


Copyright 2001 by Mark Roulo