Top 3 Articles for Learning Programming

This post features three concise and influential articles on programming languages, each authored by a respected expert in the field. Ideal for beginners and experienced programmers alike, these writings shed light on both the practical and philosophical sides of coding.

★Paul Graham: Being Popular

Original link | Paul Graham

A friend of mine once told an eminent operating systems expert that he wanted to design a really good programming language. The expert told him that it would be a waste of time, that programming languages don’t become popular or unpopular based on their merits, and so no matter how good his language was, no one would use it. At least, that was what had happened to the language he had designed.

What does make a language popular? Do popular languages deserve their popularity? Is it worth trying to define a good programming language? How would you do it?

I think the answers to these questions can be found by looking at hackers, and learning what they want. Programming languages are for hackers, and a programming language is good as a programming language (rather than, say, an exercise in denotational semantics or compiler design) if and only if hackers like it.

◇1 The Mechanics of Popularity

It’s true, certainly, that most people don’t choose programming languages simply based on their merits. Most programmers are told what language to use by someone else. And yet I think the effect of such external factors on the popularity of programming languages is not as great as it’s sometimes thought to be. I think a bigger problem is that a hacker’s idea of a good programming language is not the same as most language designers'.

Between the two, the hacker’s opinion is the one that matters. Programming languages are not theorems. They’re tools, designed for people, and they have to be designed to suit human strengths and weaknesses as much as shoes have to be designed for human feet. If a shoe pinches when you put it on, it’s a bad shoe, however elegant it may be as a piece of sculpture.

It may be that the majority of programmers can’t tell a good language from a bad one. But that’s no different with any other tool. It doesn’t mean that it’s a waste of time to try designing a good language. Expert hackers can tell a good language when they see one, and they’ll use it. Expert hackers are a tiny minority, admittedly, but that tiny minority write all the good software, and their influence is such that the rest of the programmers will tend to use whatever language they use. Often, indeed, it is not merely influence but command: often the expert hackers are the very people who, as their bosses or faculty advisors, tell the other programmers what language to use.

The opinion of expert hackers is not the only force that determines the relative popularity of programming languages — legacy software (Cobol) and hype (Ada, Java) also play a role — but I think it is the most powerful force over the long term. Given an initial critical mass and enough time, a programming language probably becomes about as popular as it deserves to be. And popularity further separates good languages from bad ones, because feedback from real live users always leads to improvements. Look at how much any popular language has changed during its life. Perl and Fortran are extreme cases, but even Lisp has changed a lot. Lisp 1.5 didn’t have macros, for example; these evolved later, after hackers at MIT had spent a couple years using Lisp to write real programs. ☍ Macros very close to the modern idea were proposed by Timothy Hart in 1964, two years after Lisp 1.5 was released. What was missing, initially, were ways to avoid variable capture and multiple evaluation; Hart's examples are subject to both.

So whether or not a language has to be good to be popular, I think a language has to be popular to be good. And it has to stay popular to stay good. The state of the art in programming languages doesn’t stand still. And yet the Lisps we have today are still pretty much what they had at MIT in the mid-1980s, because that’s the last time Lisp had a sufficiently large and demanding user base.

Of course, hackers have to know about a language before they can use it. How are they to hear? From other hackers. But there has to be some initial group of hackers using the language for others even to hear about it. I wonder how large this group has to be; how many users make a critical mass? Off the top of my head, I’d say twenty. If a language had twenty separate users, meaning twenty users who decided on their own to use it, I’d consider it to be real.

Getting there can’t be easy. I would not be surprised if it is harder to get from zero to twenty than from twenty to a thousand. The best way to get those initial twenty users is probably to use a trojan horse: to give people an application they want, which happens to be written in the new language.

◇2 External Factors

Let’s start by acknowledging one external factor that does affect the popularity of a programming language. To become popular, a programming language has to be the scripting language of a popular system. Fortran and Cobol were the scripting languages of early IBM mainframes. C was the scripting language of Unix, and so, later, was Perl. Tcl is the scripting language of Tk. Java and Javascript are intended to be the scripting languages of web browsers.

Lisp is not a massively popular language because it is not the scripting language of a massively popular system. What popularity it retains dates back to the 1960s and 1970s, when it was the scripting language of MIT. A lot of the great programmers of the day were associated with MIT at some point. And in the early 1970s, before C, MIT’s dialect of Lisp, called MacLisp, was one of the only programming languages a serious hacker would want to use.

Today Lisp is the scripting language of two moderately popular systems, Emacs and Autocad, and for that reason I suspect that most of the Lisp programming done today is done in Emacs Lisp or AutoLisp.

Programming languages don’t exist in isolation. To hack is a transitive verb — hackers are usually hacking something — and in practice languages are judged relative to whatever they’re used to hack. So if you want to design a popular language, you either have to supply more than a language, or you have to design your language to replace the scripting language of some existing system.

Common Lisp is unpopular partly because it’s an orphan. It did originally come with a system to hack: the Lisp Machine. But Lisp Machines (along with parallel computers) were steamrollered by the increasing power of general purpose processors in the 1980s. Common Lisp might have remained popular if it had been a good scripting language for Unix. It is, alas, an atrociously bad one.

One way to describe this situation is to say that a language isn’t judged on its own merits. Another view is that a programming language really isn’t a programming language unless it’s also the scripting language of something. This only seems unfair if it comes as a surprise. I think it’s no more unfair than expecting a programming language to have, say, an implementation. It’s just part of what a programming language is.

A programming language does need a good implementation, of course, and this must be free. Companies will pay for software, but individual hackers won’t, and it’s the hackers you need to attract.

A language also needs to have a book about it. The book should be thin, well-written, and full of good examples. K&R is the ideal here. At the moment I’d almost say that a language has to have a book published by O’Reilly. That’s becoming the test of mattering to hackers.

There should be online documentation as well. In fact, the book can start as online documentation. But I don’t think that physical books are outmoded yet. Their format is convenient, and the de facto censorship imposed by publishers is a useful if imperfect filter. Bookstores are one of the most important places for learning about new languages.

◇3 Brevity

Given that you can supply the three things any language needs — a free implementation, a book, and something to hack — how do you make a language that hackers will like?

One thing hackers like is brevity. Hackers are lazy, in the same way that mathematicians and modernist architects are lazy: they hate anything extraneous. It would not be far from the truth to say that a hacker about to write a program decides what language to use, at least subconsciously, based on the total number of characters he’ll have to type. If this isn’t precisely how hackers think, a language designer would do well to act as if it were.

It is a mistake to try to baby the user with long-winded expressions that are meant to resemble English. Cobol is notorious for this flaw. A hacker would consider being asked to write

add x to y giving z

instead of

z = x+y

as something between an insult to his intelligence and a sin against God.

It has sometimes been said that Lisp should use first and rest instead of car and cdr, because it would make programs easier to read. Maybe for the first couple hours. But a hacker can learn quickly enough that car means the first element of a list and cdr means the rest. Using first and rest means 50% more typing. And they are also different lengths, meaning that the arguments won’t line up when they’re called, as car and cdr often are, in successive lines. I’ve found that it matters a lot how code lines up on the page. I can barely read Lisp code when it is set in a variable-width font, and friends say this is true for other languages too.
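The 50% figure can be checked with a quick count. The sketch below is in Python (not a language the essay uses) and counts only the characters in the four operator names, ignoring everything else in a program; the variable names are chosen here for illustration.

```python
# Compare the typing cost of the two naming schemes, counting only
# the characters in the operator names themselves.
short_names = ["car", "cdr"]
long_names = ["first", "rest"]

short_total = sum(len(name) for name in short_names)  # 3 + 3
long_total = sum(len(name) for name in long_names)    # 5 + 4

# Fraction of extra typing needed for the longer names.
extra = (long_total - short_total) / short_total
print(f"{extra:.0%} more typing")

# Names of equal length keep arguments aligned on successive lines.
same_width = len({len(name) for name in short_names}) == 1
long_same_width = len({len(name) for name in long_names}) == 1
print(same_width, long_same_width)
```

The second check mirrors the alignment point: car and cdr have the same width, while first and rest do not.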

Brevity is one place where strongly typed languages lose. All other things being equal, no one wants to begin a program with a bunch of declarations. Anything that can be implicit, should be.

The individual tokens should be short as well. Perl and Common Lisp occupy opposite poles on this question. Perl programs can be almost cryptically dense, while the names of built-in Common Lisp operators are comically long. The designers of Common Lisp probably expected users to have text editors that would type these long names for them. But the cost of a long name is not just the cost of typing it. There is also the cost of reading it, and the cost of the space it takes up on your screen.

◇4 Hackability

There is one thing more important than brevity to a hacker: being able to do what you want. In the history of programming languages a surprising amount of effort has gone into preventing programmers from doing things considered to be improper. This is a dangerously presumptuous plan. How can the language designer know what the programmer is going to need to do? I think language designers would do better to consider their target user to be a genius who will need to do things they never anticipated, rather than a bumbler who needs to be protected from himself. The bumbler will shoot himself in the foot anyway. You may save him from referring to variables in another package, but you can’t save him from writing a badly designed program to solve the wrong problem, and taking forever to do it.

Good programmers often want to do dangerous and unsavory things. By unsavory I mean things that go behind whatever semantic facade the language is trying to present: getting hold of the internal representation of some high-level abstraction, for example. Hackers like to hack, and hacking means getting inside things and second guessing the original designer.

Let yourself be second guessed. When you make any tool, people use it in ways you didn’t intend, and this is especially true of a highly articulated tool like a programming language. Many a hacker will want to tweak your semantic model in a way that you never imagined. I say, let them; give the programmer access to as much internal stuff as you can without endangering runtime systems like the garbage collector.

In Common Lisp I have often wanted to iterate through the fields of a struct — to comb out references to a deleted object, for example, or find fields that are uninitialized. I know the structs are just vectors underneath. And yet I can’t write a general purpose function that I can call on any struct. I can only access the fields by name, because that’s what a struct is supposed to mean.
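To make the wish concrete, here is a minimal sketch of that kind of general-purpose function. It is written in Python rather than Common Lisp, using the standard `dataclasses.fields` call, and it treats `None` as the marker for "uninitialized". The `Account` type and both helper names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional


@dataclass
class Account:  # hypothetical record type, for illustration only
    owner: Optional[str] = None
    balance: Optional[int] = None
    parent: Any = None


def uninitialized_fields(record: Any) -> List[str]:
    """Return the names of fields still set to None, for any dataclass instance."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def drop_references(record: Any, dead: Any) -> None:
    """Replace every field that refers to the object `dead` with None."""
    for f in fields(record):
        if getattr(record, f.name) is dead:
            setattr(record, f.name, None)


old = Account(owner="old", balance=0)
acct = Account(owner="ann", parent=old)
print(uninitialized_fields(acct))  # ['balance']
drop_references(acct, old)
print(uninitialized_fields(acct))  # ['balance', 'parent']
```

Neither helper knows the field names in advance; both work on any dataclass instance, which is the property the paragraph above is asking for.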

A hacker may only want to subvert the intended model of things once or twice in a big program. But what a difference it makes to be able to. And it may be more than a question of just solving a problem. There is a kind of pleasure here too. Hackers share the surgeon’s secret pleasure in poking about in gross innards, the teenager’s secret pleasure in popping zits. [2] For boys, at least, certain kinds of horrors are fascinating. Maxim magazine publishes an annual volume of photographs, containing a mix of pin-ups and grisly accidents. They know their audience.

☍ In When the Air Hits Your Brain, neurosurgeon Frank Vertosick recounts a conversation in which his chief resident, Gary, talks about the difference between surgeons and internists ("fleas"):
Gary and I ordered a large pizza and found an open booth. The chief lit a cigarette. "Look at those goddamn fleas, jabbering about some disease they'll see once in their lifetimes. That's the trouble with fleas, they only like the bizarre stuff. They hate their bread and butter cases. That's the difference between us and the fucking fleas. See, we love big juicy lumbar disc herniations, but they hate hypertension...."
It's hard to think of a lumbar disc herniation as juicy (except literally). And yet I think I know what they mean. I've often had a juicy bug to track down. Someone who's not a programmer would find it hard to imagine that there could be pleasure in a bug. Surely it's better if everything just works. In one way, it is. And yet there is undeniably a grim satisfaction in hunting down certain sorts of bugs.

Historically, Lisp has been good at letting hackers have their way. The political correctness of Common Lisp is an aberration. Early Lisps let you get your hands on everything. A good deal of that spirit is, fortunately, preserved in macros. What a wonderful thing, to be able to make arbitrary transformations on the source code.

Classic macros are a real hacker’s tool — simple, powerful, and dangerous. It’s so easy to understand what they do: you call a function on the macro’s arguments, and whatever it returns gets inserted in place of the macro call. Hygienic macros embody the opposite principle. They try to protect you from understanding what they’re doing. I have never heard hygienic macros explained in one sentence. And they are a classic example of the dangers of deciding what programmers are allowed to want. Hygienic macros are intended to protect me from variable capture, among other things, but variable capture is exactly what I want in some macros.
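The one-sentence description of classic macros can be modeled in a few lines. The following is a toy sketch in Python, not real Lisp: expressions are nested Python lists, a macro is an ordinary function from argument expressions to a new expression, and `expand` splices the result in place of the call. All names here (`MACROS`, `defmacro`, `expand`, `unless`, `aif`) are assumptions made for illustration, and nothing is evaluated; only the expansion step is shown.

```python
# Toy model of classic (non-hygienic) macro expansion.
# Expressions are nested lists such as ["if", test, then, otherwise].

MACROS = {}


def defmacro(name):
    """Register a plain function as the expander for `name`."""
    def register(fn):
        MACROS[name] = fn
        return fn
    return register


def expand(expr):
    """Replace each macro call with whatever its expander returns."""
    if not isinstance(expr, list) or not expr:
        return expr  # atoms (and the empty list) expand to themselves
    head, *args = expr
    if isinstance(head, str) and head in MACROS:
        # Call the function on the macro's arguments, then keep expanding,
        # since the result may itself contain macro calls.
        return expand(MACROS[head](*args))
    return [expand(part) for part in expr]


@defmacro("unless")
def unless(test, body):
    return ["if", test, None, body]


@defmacro("aif")
def aif(test, then, otherwise):
    # Deliberate variable capture: `then` is allowed to refer to "it".
    return ["let", [["it", test]], ["if", "it", then, otherwise]]


print(expand(["unless", "done", ["aif", ["find", "x"], ["use", "it"], None]]))
```

The `aif` expander binds the name `it` on purpose so that the caller's code can refer to it. That is the kind of intentional capture the paragraph above says a hygienic system is designed to prevent.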

A really good language should be both clean and dirty: cleanly designed, with a small core of well understood and highly orthogonal operators, but dirty in the sense that it lets hackers have their way with it. C is like this. So were the early Lisps. A real hacker’s language will always have a slightly raffish character.

A good programming language should have features that make the kind of people who use the phrase “software engineering” shake their heads disapprovingly. At the other end of the continuum are languages like Ada and Pascal, models of propriety that are good for teaching and not much else.

◇5 Throwaway Programs

To be attractive to hackers, a language must be good for writing the kinds of programs they want to write. And that means, perhaps surprisingly, that it has to be good for writing throwaway programs.

A throwaway program is a program you write quickly for some limited task: a program to automate some system administration task, or generate test data for a simulation, or convert data from one format to another. The surprising thing about throwaway programs is that, like the “temporary” buildings built at so many American universities during World War II, they often don’t get thrown away. Many evolve into real programs, with real features and real users.
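For a sense of scale, here is a sketch of one such program: a format converter that turns CSV text into JSON. It is written in Python using only the standard library; the function name and the sample data are made up for illustration.

```python
import csv
import io
import json


def csv_to_json(text: str) -> str:
    """Turn CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows, indent=2)


sample = "name,lang\nada,Ada\ndmr,C\n"  # made-up sample data
print(csv_to_json(sample))
```

A script like this has no error handling and no options, which is typical of something written in a few minutes for a single task.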

I have a hunch that the best big programs begin life this way, rather than being designed big from the start, like the Hoover Dam. It’s terrifying to build something big from scratch. When people take on a project that’s too big, they become overwhelmed. The project either gets bogged down, or the result is sterile and wooden: a shopping mall rather than a real downtown, Brasilia rather than Rome, Ada rather than C.

Another way to get a big program is to start with a throwaway program and keep improving it. This approach is less daunting, and the design of the program benefits from evolution. I think, if one looked, that this would turn out to be the way most big programs were developed. And those that did evolve this way are probably still written in whatever language they were first written in, because it’s rare for a program to be ported, except for political reasons. And so, paradoxically, if you want to make a language that is used for big systems, you have to make it good for writing throwaway programs, because that’s where big systems come from.

Perl is a striking example of this idea. It was not only designed for writing throwaway programs, but was pretty much a throwaway program itself. Perl began life as a collection of utilities for generating reports, and only evolved into a programming language as the throwaway programs people wrote in it grew larger. It was not until Perl 5 (if then) that the language was suitable for writing serious programs, and yet it was already massively popular.

What makes a language good for throwaway programs? To start with, it must be readily available. A throwaway program is something that you expect to write in an hour. So the language probably must already be installed on the computer you’re using. It can’t be something you have to install before you use it. It has to be there. C was there because it came with the operating system. Perl was there because it was originally a tool for system administrators, and yours had already installed it.

Being available means more than being installed, though. An interactive language, with a command-line interface, is more available than one that you have to compile and run separately. A popular programming language should be interactive, and start up fast.

Another thing you want in a throwaway program is brevity. Brevity is always attractive to hackers, and never more so than in a program they expect to turn out in an hour.

★Bruce Eckel: A Career in Computing

Original link | Bruce Eckel

The question that people ask is usually the wrong one: “should I learn C++ or Java?” In this essay, I shall try to lay out my view of the true issues involved in choosing a career in computing.

Note that I am not talking here to the people who already know it is their calling. You’re going to do it regardless of what anyone says, because it’s in your blood and you can’t get away from it. You know the answer already: C++ AND Java AND shell scripting AND Python AND a host of other languages and technologies that you’ll learn as a matter of course. You already know several of these languages, even if you’re only 14.

The person who asks me this question may be coming from another career. Or perhaps they are coming from a field like web development and they’ve figured out that HTML is only kind of like programming, and they’d like to try building something more substantial. But I especially hope that, if you are asking this question, you’ve realized that to be successful in computing, you need to teach yourself how to learn, and never stop learning.

The more I do this, the more it seems to me that software is more akin to writing than anything else. And we haven’t figured out what makes a good writer, we only know when we like what someone writes. This is not some kind of engineering where all we have to do is put something in one end and turn the crank. It is tempting to think of software as deterministic – that’s what we want it to be, and that’s the reason that we keep coming up with tools to help us produce the behavior we desire. But my experience keeps indicating the opposite, that it is more about people than processes, and the fact that it runs on a deterministic machine becomes less and less of an influence, just like the Heisenberg principle doesn’t affect things on a human scale.

My father built custom homes, and in my youth I would occasionally work for him, mostly doing grunt labor and sometimes hanging sheet rock. He and his lead carpenter would tell me that they gave me these jobs for my own good – so that I wouldn’t go into the business. It worked.

So I can also use the analogy that building software is like building a house. We don’t refer to everyone who works on a house as if they were exactly the same. There are concrete masons, roofers, plumbers, electricians, sheet rockers, plasterers, tile setters, laborers, rough carpenters, finish carpenters, and of course, general contractors. Each of these requires a different set of skills, which requires a different amount of time and effort to acquire. House-building is also subject to boom and bust cycles, like programming. If you want to get in quick, you might take a job as a laborer or a sheet rocker, where you can start getting paid without much of a learning curve. As long as demand is strong, you have steady work, and your pay might even go up if there aren’t enough people to do the work. But as soon as there’s a downturn, carpenters and even the general contractor can hang the sheet rock themselves.

When the Internet was first booming, all you had to do was spend some time learning HTML and you could get a job and earn some pretty good money. When things turned down, however, it rapidly became clear that there is a hierarchy of desirable skills, and the HTML programmers (like the laborers and sheet rockers) went first, while the highly-skilled code smiths and carpenters were retained.

What I’m trying to say here is that you don’t want to go into this business unless you are ready to commit to lifelong learning. Sometimes it seems like programming is a well-paying, reliable job – but the only way you can make sure of this is if you are always making yourself more valuable.

Of course you can find exceptions. There are always those people who learn one language and are just competent enough and perhaps savvy enough to stay employed without doing much to expand their abilities. But they are surviving by luck, and they are ultimately vulnerable. To make yourself less vulnerable, you need to continuously improve your abilities, by reading, going to user groups, conferences, and seminars. The more depth you have in this field, the more valuable you will be, which means you have more stable job prospects and can command higher salaries.

Another approach is to look at the field in general, and find a place where you already have talents. For example, my brother is interested in software, and dabbles with it, but his business is in installing computers, fixing them and upgrading them. He’s always been meticulous, so when he installs or fixes your computer you know that it will be in excellent shape when he’s done; not just the software, but all the way down to the cables, which will be bundled neat and out of the way. He’s always had more work than he could do, and he never noticed the dot-com bust. And needless to say, his work cannot be offshored.

I stayed in college a long time, and managed to get by in various ways. I even began a Ph.D. program at UCLA, which was mercifully cut short – I say mercifully because I no longer loved being in college, and the reason I had stayed in college for so long was that I enjoyed it so much. But what I enjoyed was typically the off-track stuff: art and dance classes, working on the college newspaper, and even the handful of computer programming classes that I took (which were off-track because I was a physics undergrad and a computer engineering graduate student). Although I was far from exceptional academically (a delightful irony is that many colleges that would not have accepted me as a student now use my books in their courses!), I really enjoyed the life of the college student, and had I finished a Ph.D. I probably would have taken the easy path and ended up a professor.

But as it turns out, some of the greatest value that I got from college was from those same off-track courses, the ones that expanded my mind beyond “stuff we already know.” I think this is especially true in computing because you are always programming to support some other goal, and the more you know about that goal the better you’ll perform (I’ve encountered some European graduate programs that require the study of computing in combination with some other specialty, and you build your thesis by solving a domain-specific problem in that other specialty).

I also think that knowing more than just programming vastly improves your problem-solving skills (just as knowing more than one programming language vastly improves your programming skills). On multiple occasions I have encountered people, trained only in computer science, who seem to have more limits in their thinking than those who come from some other background, like math or physics, which requires more rigorous thinking and is less prone to “it works for me” solutions.

In one session of a conference that I organized, one of the topics was to come up with a list of features for the ideal job candidate:

Learning as a lifestyle. For example, you should know more than one language; nothing opens your eyes more to the strengths and limitations of a language than learning another one.
Know where and how to get new knowledge.
Study prior art.
We are tool users.
Learn to do the simplest thing.
Understand the business (Read magazines. Start with Fast Company, which has very short and interesting articles. Then you can see if you want to read others)
You are personally responsible for errors. “It works for me” is not an acceptable strategy. Find your own bugs.
Become a leader: someone who communicates and inspires.
Who are you serving?
There is no right answer … and always a better way. Show and discuss your code, without emotional attachment. You are not your code.
It’s an asymptotic journey towards perfection.

Take whatever risks you can – the best risks are the scary ones, but in trying you will feel more alive than you can imagine. It’s best if you don’t plan for a particular outcome, because you will often miss the true possibilities if you’re too attached to a result. My best adventures have been ones that have started with “let’s do a little experiment and see where it takes us.”

Some people will be disappointed by this answer, and reply “yes, that’s all very interesting and useful. But really, what should I learn? C++ or Java?” I’ll fend these off by repeating here: I know it seems like all the ones and zeroes should make everything deterministic, so that such questions should have a simple answer, but they don’t. It’s not about making one choice and being done with it. It’s about continuous learning and sometimes, bold choices. Trust me, your life will be more exciting this way.

★Peter Norvig: Teach Yourself Programming in Ten Years

Original link | Peter Norvig

◇Why is everyone in such a rush?

Walk into any bookstore, and you’ll see how to Teach Yourself Java in 24 Hours alongside endless variations offering to teach C, SQL, Ruby, Algorithms, and so on in a few days or hours. An Amazon advanced search for [title: teach, yourself, hours, since: 2000] found 512 such books. Of the top ten, nine are programming books (the other is about bookkeeping). Similar results come from replacing “teach yourself” with “learn” or “hours” with “days.”

The conclusion is that either people are in a big rush to learn about programming, or that programming is somehow fabulously easier to learn than anything else. Felleisen et al. give a nod to this trend in their book How to Design Programs, when they say “Bad programming is easy. Idiots can learn it in 21 days, even if they are dummies.” The Abstruse Goose comic also had their take.

Let’s analyze what a title like Teach Yourself C++ in 24 Hours could mean:

Teach Yourself: In 24 hours you won’t have time to write several significant programs, and learn from your successes and failures with them. You won’t have time to work with an experienced programmer and understand what it is like to live in a C++ environment. In short, you won’t have time to learn much. So the book can only be talking about a superficial familiarity, not a deep understanding. As Alexander Pope said, a little learning is a dangerous thing.
C++: In 24 hours you might be able to learn some of the syntax of C++ (if you already know another language), but you couldn’t learn much about how to use the language. In short, if you were, say, a Basic programmer, you could learn to write programs in the style of Basic using C++ syntax, but you couldn’t learn what C++ is actually good (and bad) for. So what’s the point? Alan Perlis once said: “A language that doesn’t affect the way you think about programming, is not worth knowing”. One possible point is that you have to learn a tiny bit of C++ (or more likely, something like JavaScript or Processing) because you need to interface with an existing tool to accomplish a specific task. But then you’re not learning how to program; you’re learning to accomplish that task.
in 24 Hours: Unfortunately, this is not enough, as the next section shows.

◇Teach Yourself Programming in Ten Years

Researchers (Bloom (1985), Bryan & Harter (1899), Hayes (1989), Simon & Chase (1973)) have shown it takes about ten years to develop expertise in any of a wide variety of areas, including chess playing, music composition, telegraph operation, painting, piano playing, swimming, tennis, and research in neuropsychology and topology. The key is deliberate practice: not just doing it again and again, but challenging yourself with a task that is just beyond your current ability, trying it, analyzing your performance while and after doing it, and correcting any mistakes. Then repeat. And repeat again. There appear to be no real shortcuts: even Mozart, who was a musical prodigy at age 4, took 13 more years before he began to produce world-class music. In another genre, the Beatles seemed to burst onto the scene with a string of #1 hits and an appearance on the Ed Sullivan show in 1964. But they had been playing small clubs in Liverpool and Hamburg since 1957, and while they had mass appeal early on, their first great critical success, Sgt. Pepper’s, was released in 1967.

Malcolm Gladwell has popularized the idea, although he concentrates on 10,000 hours, not 10 years. Henri Cartier-Bresson (1908-2004) had another metric: “Your first 10,000 photographs are your worst.” (He didn’t anticipate that with digital cameras, some people can reach that mark in a week.) True expertise may take a lifetime: Samuel Johnson (1709-1784) said “Excellence in any department can be attained only by the labor of a lifetime; it is not to be purchased at a lesser price.” And Chaucer (1340-1400) complained “the lyf so short, the craft so long to lerne.” Hippocrates (c. 400BC) is known for the excerpt “ars longa, vita brevis”, which is part of the longer quotation “Ars longa, vita brevis, occasio praeceps, experimentum periculosum, iudicium difficile”, which in English renders as “Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgment difficult.” Of course, no single number can be the final answer: it doesn’t seem reasonable to assume that all skills (e.g., programming, chess playing, checkers playing, and music playing) could all require exactly the same amount of time to master, nor that all people will take exactly the same amount of time. As Prof. K. Anders Ericsson puts it, “In most domains it’s remarkable how much time even the most talented individuals need in order to reach the highest levels of performance. The 10,000 hour number just gives you a sense that we’re talking years of 10 to 20 hours a week which those who some people would argue are the most innately talented individuals still need to get to the highest level.”
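The arithmetic behind Ericsson's remark is easy to check. The sketch below, in Python, assumes steady practice for all 52 weeks of every year (an assumption made here, not something the quotation states) and shows that 10 to 20 hours a week puts 10,000 hours roughly one to two decades away.

```python
TOTAL_HOURS = 10_000
WEEKS_PER_YEAR = 52  # assumes practice every week, with no breaks


def years_needed(hours_per_week: float) -> float:
    """Years required to accumulate TOTAL_HOURS at a steady weekly rate."""
    return TOTAL_HOURS / (hours_per_week * WEEKS_PER_YEAR)


for rate in (10, 20):
    print(f"{rate} hours/week: {years_needed(rate):.1f} years")
# 10 hours/week: 19.2 years
# 20 hours/week: 9.6 years
```

At the upper rate the result lands close to the ten-year figure; at the lower rate it is nearly twice that.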

◇So You Want to be a Programmer

Here’s my recipe for programming success:

Get interested in programming, and do some because it is fun. Make sure that it keeps being enough fun so that you will be willing to put in your ten years/10,000 hours.
Program. The best kind of learning is learning by doing. To put it more technically, “the maximal level of performance for individuals in a given domain is not attained automatically as a function of extended experience, but the level of performance can be increased even by highly experienced individuals as a result of deliberate efforts to improve.” (p. 366) and “the most effective learning requires a well-defined task with an appropriate difficulty level for the particular individual, informative feedback, and opportunities for repetition and corrections of errors.” (p. 20-21) The book Cognition in Practice: Mind, Mathematics, and Culture in Everyday Life is an interesting reference for this viewpoint.
Talk with other programmers; read other programs. This is more important than any book or training course.
If you want, put in four years at a college (or more at a graduate school). This will give you access to some jobs that require credentials, and it will give you a deeper understanding of the field, but if you don’t enjoy school, you can (with some dedication) get similar experience on your own or on the job. In any case, book learning alone won’t be enough. “Computer science education cannot make anybody an expert programmer any more than studying brushes and pigment can make somebody an expert painter,” says Eric Raymond, author of The New Hacker’s Dictionary. One of the best programmers I ever hired had only a high school degree; he’s produced a lot of great software, has his own news group, and made enough in stock options to buy his own nightclub.
Work on projects with other programmers. Be the best programmer on some projects; be the worst on some others. When you’re the best, you get to test your abilities to lead a project, and to inspire others with your vision. When you’re the worst, you learn what the masters do, and you learn what they don’t like to do (because they make you do it for them).
Work on projects after other programmers. Understand a program written by someone else. See what it takes to understand and fix it when the original programmers are not around. Think about how to design your programs to make it easier for those who will maintain them after you.
Learn at least a half dozen programming languages. Include one language that emphasizes class abstractions (like Java or C++), one that emphasizes functional abstraction (like Lisp or ML or Haskell), one that supports syntactic abstraction (like Lisp), one that supports declarative specifications (like Prolog or C++ templates), and one that emphasizes parallelism (like Clojure or Go).
Remember that there is a “computer” in “computer science”. Know how long it takes your computer to execute an instruction, fetch a word from memory (with and without a cache miss), read consecutive words from disk, and seek to a new location on disk. (Answers here.)
Get involved in a language standardization effort. It could be the ANSI C++ committee, or it could be deciding if your local coding style will have 2 or 4 space indentation levels. Either way, you learn about what other people like in a language, how deeply they feel so, and perhaps even a little about why they feel so.
Have the good sense to get off the language standardization effort as quickly as possible.
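
One way to act on the advice above about knowing how long your computer takes to do things is to measure it yourself. The following is a minimal sketch, not a rigorous benchmark: the helper names `ns_per_item` and `measure` are made up for illustration, and because Python's interpreter overhead dominates each read, the numbers it prints are far larger than raw hardware latencies. It only compares walking a list in order against walking it in a shuffled order.

```python
import random
import time


def ns_per_item(func, n_items, repeats=5):
    """Return the best observed wall-clock time per item, in nanoseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func()
        elapsed = time.perf_counter_ns() - start
        best = min(best, elapsed)
    return best / n_items


def measure(n=200_000):
    """Time in-order versus shuffled reads over the same list (illustrative helper)."""
    data = list(range(n))
    in_order = list(range(n))
    shuffled = in_order[:]
    random.Random(0).shuffle(shuffled)  # fixed seed so runs use the same order

    def walk(indices):
        # Sum the elements of data in the order given by indices.
        total = 0
        for i in indices:
            total += data[i]
        return total

    return {
        "sequential_ns": ns_per_item(lambda: walk(in_order), n),
        "shuffled_ns": ns_per_item(lambda: walk(shuffled), n),
    }


if __name__ == "__main__":
    for name, value in measure().items():
        print(f"{name}: {value:.1f} ns per read")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes. The size of any gap between the two figures depends on the machine, the Python version, and the list size, so treat the output as a starting point for your own experiments.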

With all that in mind, it’s questionable how far you can get just by book learning. Before my first child was born, I read all the How To books, and still felt like a clueless novice. 30 months later, when my second child was due, did I go back to the books for a refresher? No. Instead, I relied on my personal experience, which turned out to be far more useful and reassuring to me than the thousands of pages written by experts.

Fred Brooks, in his essay No Silver Bullet, identified a three-part plan for finding great software designers:

Systematically identify top designers as early as possible.
Assign a career mentor to be responsible for the development of the prospect and carefully keep a career file.
Provide opportunities for growing designers to interact and stimulate each other.

This assumes that some people already have the qualities necessary for being a great designer; the job is to properly coax them along. Alan Perlis put it more succinctly: “Everyone can be taught to sculpt: Michelangelo would have had to be taught how not to. So it is with the great programmers”. Perlis is saying that the greats have some internal quality that transcends their training. But where does the quality come from? Is it innate? Or do they develop it through diligence? As Auguste Gusteau (the fictional chef in Ratatouille) puts it, “anyone can cook, but only the fearless can be great.” I think of it more as willingness to devote a large portion of one’s life to deliberative practice. But maybe fearless is a way to summarize that. Or, as Gusteau’s critic, Anton Ego, says: “Not everyone can become a great artist, but a great artist can come from anywhere.”

So go ahead and buy that Java/Ruby/JavaScript/PHP book; you’ll probably get some use out of it. But you won’t change your life, or your real overall expertise as a programmer, in 24 hours or 21 days. How about working hard to continually improve over 24 months? Well, now you’re starting to get somewhere…
