I'm not entirely sure that the title is what I'm looking for. What I'm really asking for is intuition as to why $\bar{\mathcal{M}}_g$ is the compactification of $\mathcal{M}_g$. I'm sure this is covered in the classic papers (such as Deligne and Mumford's), but I still find those hard to penetrate.
You can test "compactness" of $\bar M_g$ by the valuative criterion of properness, and this in turn comes down to the stable reduction theorem. In more down-to-earth terms, given a map of a punctured disk $D^*$ into the above space, one wants it to extend uniquely to a map from the whole disk $D$. The essential case is when the map is given by a family of smooth projective curves over $D^*$. The stable reduction theorem says that this extends to a family of stable curves over $D$, provided one passes to a ramified cover of the disk. Thus one gets the desired extension.
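To see why a ramified cover can be unavoidable, here is a small sketch in genus 1 (so it illustrates the analogous statement for elliptic curves with the point at infinity marked, not $\bar M_g$ with $g \ge 2$ itself). The family and the coordinates $s, u, v$ are chosen only for illustration.

$$
\begin{aligned}
&\text{Family over } D^*: && y^2 = x^3 + t, \quad t \neq 0 \ \text{(smooth)}; \qquad t = 0:\ y^2 = x^3 \ \text{(a cusp, not a node)}.\\
&\text{Base change: } && t = s^6, \qquad x = s^2 u, \qquad y = s^3 v.\\
&\text{Result: } && s^6 v^2 = s^6 u^3 + s^6 \;\Longrightarrow\; v^2 = u^3 + 1 .
\end{aligned}
$$

After pulling back along $s \mapsto s^6$, the family becomes constant, so it extends over $s = 0$ with a smooth fiber. Over the original $t$-disk no such change of coordinates is available, which is the role of the ramified cover in the theorem.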

This part is really the miracle of the construction for me: the condition that makes a pointed nodal curve have finitely many automorphisms is also exactly the condition that makes the valuative criterion work. (E.g., if we allow rational components with two markings, we can get more than one extension to $D$ by blowing up, etc.) Apr 12 '10 at 19:40
I think a good first step to understanding how the Deligne-Mumford compactification works is to understand the moduli space of genus $0$ curves with marked points. This can be worked out very concretely. I recommend reading the first couple of chapters of "An Invitation to Quantum Cohomology" by Kock and Vainsencher, which has an inspired discussion of this.
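As a sketch of the simplest case, take four marked points on $\mathbb{P}^1$. The normalization $(0, 1, \infty, \lambda)$ below is one common choice, made here only for illustration.

$$
\begin{aligned}
\mathcal{M}_{0,4} &\cong \mathbb{P}^1 \setminus \{0, 1, \infty\}, \qquad (p_1, p_2, p_3, p_4) \sim (0, 1, \infty, \lambda),\\
\overline{\mathcal{M}}_{0,4} &\cong \mathbb{P}^1,\\
\lambda \to 0 &: \ \{p_1, p_4 \mid p_2, p_3\}, \qquad
\lambda \to 1 : \ \{p_2, p_4 \mid p_1, p_3\}, \qquad
\lambda \to \infty : \ \{p_3, p_4 \mid p_1, p_2\}.
\end{aligned}
$$

Each of the three added points is a nodal curve: two copies of $\mathbb{P}^1$ meeting at one node, with the marked points split as indicated. Instead of letting two marked points collide, the limit puts them on a new component, and each component carries three special points (two markings and the node), so it has no nontrivial automorphisms.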
One perspective is geometric. A nonsingular closed Riemann surface of genus at least 2 has a unique complete hyperbolic metric by uniformization, with a decomposition into thin and thick regions: the thin regions are collar neighborhoods of short geodesics, and the thick regions are compact, with each component having uniformly bounded diameter (depending only on the genus). In the case of a noncompact Riemann surface of finite hyperbolic area, one also has cusp components of the thin part. One also sees that the geometry of a Riemann surface is uniquely determined by the geometry of the thick part, together with which pairs of boundary components bound collar neighborhoods. One may take a limit of a sequence of surfaces with geodesics whose lengths approach zero: the thick parts converge geometrically to a finite-area hyperbolic surface, with a pair of cusps for each geodesic whose length has gone to zero. These surfaces with paired cusps are naturally identified with noded Riemann surfaces, giving the Deligne-Mumford compactification. The fact that it is compact follows from the compactness of the thick regions in the Gromov-Hausdorff topology.
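The pinching picture can be made quantitative with the collar lemma. The following is a sketch of the standard statement; the symbols $\ell$ and $w(\ell)$ are notation introduced here, not taken from the discussion above.

$$
w(\ell) = \operatorname{arcsinh}\!\left(\frac{1}{\sinh(\ell/2)}\right), \qquad w(\ell) \sim \log\frac{4}{\ell} \to \infty \quad \text{as } \ell \to 0 .
$$

Here $\ell$ is the length of a simple closed geodesic and $w(\ell)$ is the half-width of an embedded collar around it. As the geodesic shrinks, the collar becomes arbitrarily long, so in the limit it looks like a pair of cusps, one on each side of the vanishing geodesic.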
It is an interesting result of Masur that the compactification is also given by the metric completion of the Weil-Petersson metric on moduli space.
As Donu already pointed out, the key result in this sense is the stable reduction theorem. It says that any punctured curve in $\mathscr M_g$ has a unique limit in $\overline{\mathscr M}_g$, so the latter is compact by the valuative criterion of properness. Let me add that this also includes the statement that it is separated!
As far as understanding what stable reduction means and why it should even hold, it might be helpful to think about the more general case of compactifying moduli spaces of canonically polarized varieties in arbitrary dimension. This still works for curves, but in higher dimension you need to say different words, and it is clearer what's happening on account of our lesser knowledge in general. (By this last sentence I mean that since we know less, we ought to talk about it the "right" way, whereas in the case of curves there are in general several ways to get the same result.)
From the higher-dimensional perspective a stable reduction is a relative canonical model. We need the base change to get rid of multiple fibers and then we need to find a well-defined relative model. The canonical model is the model to use if we want unique limits (which means separatedness of the moduli space). It turns out that if the total space is a canonical model then the fibers are stable.
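For concreteness, one standard way to write a relative canonical model is as a relative Proj. This is a sketch under the assumption that $f \colon X \to B$ is a family, already base-changed, with mild singularities and a finitely generated relative canonical ring; the names $f$, $X$, $B$ are notation introduced here.

$$
X^{\mathrm{can}} = \operatorname{Proj}_B \bigoplus_{m \ge 0} f_* \mathcal{O}_X\!\left(m K_{X/B}\right).
$$

The right-hand side depends only on the family away from the special fiber up to birational modification, which is one way to see why this model gives unique limits.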