Thursday, October 25, 2012

LaTeX Notes: Structuring Large Documents


As soon as you start to produce documents which have multiple chapters, or documents which are of any decent size, keeping all the source text in one file becomes unmanageable. There are two basic methods you can use for managing large documents; the first is very easy to use but limited in usefulness, so we'll get that out of the way first. In each case the idea is that you have some top level document file and a number of files that get included in this file automatically when you run LaTeX.

1 The Simple Method

Using the simple method, you set up a top level document like this:
\documentclass{...}
...
\begin{document}
\input{firstfile}
\input{secondfile}
...
\input{lastfile}
\end{document}
You then segment your text into chunks, which you keep in the files firstfile.tex, secondfile.tex, and so on; each of these might be a chapter or a major section. So, firstfile.tex might look like the following (note the use of the macro \programname here; this isn't standard LaTeX but a user-defined macro, and it's an instance of declarative formatting):
\section{Introduction}
Way back in the beginning of time,
people used a text formatter
called \programname{nroff} ...
When you run LaTeX on the top-level file, the contents of firstfile.tex, secondfile.tex and so on will then be read in at the specified points. This simply makes it easier to handle your text by breaking it into smaller chunks. Note the following:
  • The name of each included file must actually be something.tex, since LaTeX will automatically add the .tex ending when it looks for the file.
  • You can have nested calls of this sort: for example, the file firstfile.tex could itself simply be something like:
    \chapter{Introduction}
    There are two sections to this introductory chapter.
    \input{firstsection}
    \input{secondsection}
    
  • From the previous example you can see that:
    • An inputted file isn't a standalone LaTeX file (in particular, it doesn't have a \documentclass{...} line, which LaTeX 2.09 called \documentstyle{...}, or the \begin{document} and \end{document} lines).
    • You can intersperse calls to input with other arbitrary text and LaTeX commands.
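Since \programname is not defined by LaTeX itself, the top-level file has to define it before the first \input that uses it. Here is a minimal sketch of such a file, written for LaTeX2e (where \documentclass replaces the older \documentstyle). The article class, the file name main.tex and the typewriter rendering are all assumptions made for illustration:

```latex
% main.tex: hypothetical top-level file for the simple method
\documentclass{article}

% Declarative markup: the text says what something is, and the look is
% decided once, here. Typewriter type is only an assumed choice.
\newcommand{\programname}[1]{\texttt{#1}}

\begin{document}
\input{firstfile}   % LaTeX reads firstfile.tex at this point
\input{secondfile}  % LaTeX reads secondfile.tex at this point
\end{document}
```

Changing how every program name is printed then means editing one line in the top-level file, not every chunk.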
This method is limited, in the following ways:
  1. If your intention is to avoid printing the whole file every time you format it, you have to explicitly comment out (or delete) the \inputs you don't want, and this has the consequence that page numbers, section numbers and so on will only take account of the non-commented-out input files (i.e., if you comment out the \input{firstfile} in the example above then secondfile will start on page 1 as Section 1).
  2. Worse, if you have cross-references between the different input files (e.g., suppose secondfile includes a \ref that refers to a \label in firstfile), then LaTeX won't be able to resolve references into any file you have commented out.
As a result, this method is best suited to keeping stuff like figures and pictures in separate files, thus making the editing of the actual text less distracting. So, if you have a very complex figure constructed using the LaTeX picture environment, you might put it into a separate file and then include it in the following way:
You can see from the really complex figure 
in Figure~\ref{complex-figure} that my theory
is better than yours.
\begin{figure}
\input{myfigure1}
\caption{My really complex figure}\label{complex-figure}
\end{figure}
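The file myfigure1.tex then holds only the drawing itself, with no preamble and no figure environment. What follows is a made-up stand-in for a "really complex" figure, just to show the shape of such a file; the dimensions and labels are assumptions:

```latex
% myfigure1.tex: hypothetical contents, read in by \input{myfigure1}
\setlength{\unitlength}{1mm}
\begin{picture}(60,10)
  \put(0,0){\framebox(20,10){input}}    % left box
  \put(20,5){\vector(1,0){20}}          % arrow from left box to right box
  \put(40,0){\framebox(20,10){output}}  % right box
\end{picture}
```

Because the \caption and \label stay in the main text, the figure file can be redrawn or swapped out without touching the cross-reference.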

2 The More Complex Method

The more complex method of managing large documents is similar to the above, except that you use the \include command:
\documentclass{...}
...
\begin{document}
\include{firstfile}
\include{secondfile}
...
\include{lastfile}
\end{document}
Again, LaTeX looks for firstfile.tex, and so on. Now, however, each included file gets its own .aux file. This is very important, because you can then do the following:
\documentclass{...}
...
\includeonly{secondfile}
...
\begin{document}
\include{firstfile}
\include{secondfile}
...
\include{lastfile}
\end{document}
This tells LaTeX to consult the aux files corresponding to every included file, but to typeset only the files listed in the \includeonly line. Because LaTeX reads the other aux files, it knows about section and page numbers, cross-references, and so on. This means that the output will start at the appropriate page for the text in secondfile.tex, with the appropriate section numbers. Simply by changing the \includeonly line and reformatting, you can get different parts of the document printed, with all the numbering being what you would get had you printed the entire document.
One potential disadvantage of this method is that, unlike \input, each included file automatically begins on a new page. So you don't want to use it for small arbitrary bits of a document (such as the inputted figure in the previous section), but probably only for individual sections, or, if they are pretty large and deserve to start on a new page, individual subsections. In a document that consists of multiple chapters, each chapter will start on a fresh page anyway, so you can use this method to keep the text of individual chapters in separate files.
For texts where you want to keep individual sections in separate files, one approach is to develop a large document using \includes and then, for the final printing, change them all to \inputs. Bear in mind that \include cannot be nested: a file that is itself included may contain \inputs but not further \includes. So, while a text is being written, each section might be a file included directly from the top-level file; when particular sections are printed as a result of being mentioned in the \includeonly line, each section will start on a new page. For the final text, the section \includes can be replaced by \inputs so that only the chapters start on new pages.
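Here is a sketch of a layout in which chapter files are \included and their sections are \input, which stays within LaTeX's rule that \include cannot be nested. All file names are hypothetical, and the filecontents environment is used only so that the example fits in a single file (it writes each chunk to disk if the file does not already exist):

```latex
% book.tex: self-contained sketch with hypothetical file names
\begin{filecontents}{intro.tex}
\chapter{Introduction}
There are two sections to this introductory chapter.
\input{intro-motivation}
\end{filecontents}
\begin{filecontents}{intro-motivation.tex}
\section{Motivation}
Why the problem matters.
\end{filecontents}

\documentclass{report}
\begin{document}
% Chapter-level file: it gets its own intro.aux. Inside intro.tex only
% \input may be used; a nested \include there would be an error.
\include{intro}
\end{document}
```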
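Note that you can have multiple files specified in the \includeonly line, separated by commas. Old versions of LaTeX treat a space after a comma as part of the file name, so it is safest to leave no intervening spaces (as far as I know, current LaTeX2e releases strip such spaces, but don't rely on it). So this is always okay:
\includeonly{firstbit,lastbit}
but this may silently skip lastbit on an old installation:
\includeonly{firstbit, lastbit}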
As the previous example shows, you can format discontinuous parts of the text. Don't forget, though, that LaTeX can only use the aux file of an included file that is not mentioned in the \includeonly line if that aux file already exists. So you have to format each bit at least once first, either one at a time or all at once by listing them all in the \includeonly line (or by leaving the \includeonly line out altogether).
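The two-step workflow can be tried out with a small self-contained sketch. The label name and the section text are made up, and filecontents is used only to create the two chunk files from within one source file:

```latex
% demo.tex: self-contained sketch; label and text are hypothetical
\begin{filecontents}{firstbit.tex}
\section{Background}\label{sec:background}
Some background material.
\end{filecontents}
\begin{filecontents}{lastbit.tex}
\section{Results}
This builds on Section~\ref{sec:background}.
\end{filecontents}

\documentclass{article}
% Step 1: format with the next line commented out, so that firstbit.aux
%         and lastbit.aux are both written.
% Step 2: uncomment it and format again. Only lastbit is typeset, but the
%         \ref still resolves, because firstbit.aux is still read.
%\includeonly{lastbit}
\begin{document}
\include{firstbit}
\include{lastbit}
\end{document}
```

In step 2 the Results section should keep the section and page numbers it had in the full document, since those are recorded in firstbit.aux.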
