ACLPUB HOWTO: Putting Together an ACL Proceedings Volume

Who Should Read This Document

This guide to the aclpub software will help you create a single proceedings volume for an ACL-affiliated conference. Presumably, you are one of the following:

In short, you are a "book chair." This document will assume that you are specifically the program chair of a workshop, since most workshops are run like this.

Note: For our purposes, the student research workshop and the short paper session are considered to be separate logical books, even though they may be physically bound together (along with tutorial program info) into a single "companion volume."

Note: There is an additional HOWTO aimed at the publications chair of the overall conference.

Overview

We have tried to automate the process as much as possible. However, it still requires manual input. You should allocate 2-3 hours for this process. The final look of your proceedings is your responsibility!

By following these instructions, you will produce the following:

Here is a gratuitous flowchart to illustrate the process. You will be using a makefile that knows all about this flowchart and will rebuild files whenever necessary.

Who's Responsible

Acknowledgments: The aclpub package and documentation were built in 2005 by Jason Eisner and Philipp Koehn, based in part on scripts by David Yarowsky that had been used for several years previously.

Get Software

You'll want to use a Unix or Cygwin machine that offers all the "usual" Unix commands, including CVS, GNU make, and a not-too-old TeX distribution that includes the pdfpages package.

You'll also want to install the xpdf package in order to have the tool pdftops.

First go to your home directory and grab the latest version of the software. This will create a directory ~/aclpub:

cvs -d :pserver:anoncvs.clsp.jhu.edu:/aclpub checkout aclpub
** WARNING ** The official package above produces Letter-size proceedings. ACL-2013 will generate A4-size proceedings, so please download the special version of ACLPUB instead. It is tailored for A4-size proceedings and includes templates customized for your workshop: http://wwwusers.di.uniroma1.it/~faralli/acl2010/publication/tools/aclpub/aclpub+a4+acl2013.tgz

If you'd rather put the software somewhere else, go ahead. Just set the ACLPUB environment variable to the directory where you put it:

export ACLPUB=~/editing/myworkshop/aclpub   # in sh or bash
setenv ACLPUB ~/editing/myworkshop/aclpub   # in csh or tcsh
To check whether you're running bash or tcsh, run
echo $0
(that's a zero, not the letter O).

Set Up Working Directory

Create a directory somewhere, called proceedings. You will do all your work in this directory. Eventually, you'll send a copy of this directory to the publications chair.

Copy the file .../aclpub/make/Makefile_bookchair to .../proceedings/Makefile.

Depending on how your system is configured, you may soon get a message asking you to install some less common Perl modules. If you would like to check for this message now, type the following:

make perl-modules
If you don't get any instructions, your system is okay. If you do, following them will download and install the necessary Perl modules (Text::PDF, and Compress::Zlib, which it depends on). You can choose to install these system-wide or in the local directory. If you install them in the local directory, make sure to set the PERL5LIB environment variable to include the lib directories for each package:
export PERL5LIB=pathto/Compress-Zlib-1.34/lib:pathto/Text-PDF-0.29/lib:$PERL5LIB   # in sh or bash
setenv PERL5LIB pathto/Compress-Zlib-1.34/lib:pathto/Text-PDF-0.29/lib:$PERL5LIB   # in tcsh

Get Data from START

Assuming you used START to collect camera-ready copy, now you'll want to get the camera-ready PDF files and metadata provided by your authors.

The tarball will soon be unpacked into a directory called final. (Type make final if you want to make that happen now.) If your authors upload any changes to START after this step, you can either download a new tarball or else make manual changes in the final directory.

Check Copyright Signatures

Your first job is to check that your authors signed their copyright forms via START. To create the copyright-signatures file, type

make copyright-signatures

DO NOT EDIT THIS FILE; it is a legal record. Please review it ASAP to make sure that appropriate signatures have been provided for all papers. Then email it to the publications chair, pointing out any papers with missing or inappropriate signatures (e.g., "Mickey Mouse"). You or the publications chair will need to extract hardcopy signatures from those papers' authors in time for their papers to be included.

If you did not use START to gather the papers, you will need to get hardcopy forms from your authors. See our copyright webpage and contact the publications chairs.

Create Metadata

Almost all the data about the papers will be maintained in a file called db (for "database"). To create a first version from the information in the final directory, type:

make db

You will be editing the db file below, so take a look. (Here is a description of its FORMAT.) Entries are separated by blank lines. Make sure that the filename lines do not have trailing whitespace. The first entry for the ACL 2005 main proceedings looks like this:

P: 7                                                                  # submission ID of this paper
T: Coarse-to-fine n-best parsing and MaxEnt discriminative reranking  # title
A: Charniak, Eugene                                                   # author
A: Johnson, Mark                                                      # another author
F: final/7/7_paper.pdf                                                # filename of PDF file
L: 8                                                                  # length in pages

If you did not use START to gather the papers and the metadata, then you have to manually create a directory of the camera-ready PDF papers, called final. The PDF filenames do not need to have the form final/7/7_paper.pdf; for example, you could use final/charniak-johnson.pdf if you prefer. You must also create a db file of metadata in the above format. Required fields in the db file are P:, T:, A:, F:, and L:.

If you did use START, but want to add additional papers (invited talk abstracts, shared task descriptions, ...), then you can add entries to the db file similarly.

At present, each db entry including the last must be followed by a single blank line.

Make sure to get the L: fields right. At present, the L: field in a db entry is only checked against the actual paper length when and if the entry is created from START metadata. You will not get a second warning if you don't fix it, or fix it incorrectly, or create the db file manually without help from START.
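
Because the db file is easy to break by hand, a small checker can catch the problems named above before you run make. The following is a minimal sketch in Python, not part of aclpub: the function names parse_db and check_db are made up for illustration, and the sketch assumes every db line has the form "X: value" with a single-letter key, as in the example entry.

```python
import re

REQUIRED = ("P", "T", "A", "F", "L")  # required fields named in this HOWTO


def parse_db(text):
    """Split db text into entries; each entry is a list of (key, value) pairs."""
    entries, current = [], []
    for line in text.splitlines():
        if not line.strip():              # a blank line ends the current entry
            if current:
                entries.append(current)
                current = []
            continue
        match = re.match(r"^([A-Za-z]):\s?(.*)$", line)
        if match is None:                 # assumption: all lines look like "X: value"
            raise ValueError(f"unrecognized db line: {line!r}")
        current.append((match.group(1), match.group(2)))
    if current:
        entries.append(current)
    return entries


def check_db(text):
    """Return a list of human-readable problems found in the db text."""
    problems = []
    for number, entry in enumerate(parse_db(text), start=1):
        keys = [key for key, _ in entry]
        for key in REQUIRED:
            if key not in keys:
                problems.append(f"entry {number}: missing {key}: field")
        for key, value in entry:
            if key == "F" and value != value.rstrip():
                problems.append(f"entry {number}: trailing whitespace after filename")
            if key == "L" and not value.strip().isdigit():
                problems.append(f"entry {number}: L: is not a page count: {value!r}")
    return problems
```

To use it, read the db file into a string and print whatever check_db returns; an empty list means none of these particular problems were found. The sketch does not compare L: against the real page count of each PDF, so that check remains manual.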

Get Rough Cut

The following command extracts the first two pages from each paper, adds margin markings, and generates a first rough cut of the proceedings, book.pdf:

make draft

As a convenience, the following command will update book.pdf and fire up Acrobat:

make view

The rough cut is not a full volume yet (e.g., no author index, and only the first couple of pages of each paper will be present). But it will reveal a number of potential problems in the automatically produced db file or the submitted PDF files.

It is possible that your system won't be able to make book.pdf if your version of pdflatex and pdfpages is too old. For example, you may get a message that it cannot handle some submitted file that is PDF version 1.5. You could (1) upgrade your software, (2) run on a machine with more recent software, or (3) ask the author to send you the source .tex of the submitted file so that you can run it through your older version of pdflatex yourself, producing a PDF version 1.4 file.
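
To find out in advance which submissions might trigger this problem, you can read the version from the header at the start of each PDF file (for example, %PDF-1.5). The following Python sketch is an illustration only and is not part of aclpub. The function names are hypothetical, and the default limit of 1.4 is taken from the example above; the real limit depends on your pdflatex version. It also looks only at the file header, not at any version override inside the PDF.

```python
import re


def pdf_version(data):
    """Return the version string from a PDF header such as b'%PDF-1.5', or None."""
    match = re.match(rb"%PDF-(\d+\.\d+)", data[:16])
    return match.group(1).decode("ascii") if match else None


def too_new(data, limit=(1, 4)):
    """True if the header version is above `limit` (default 1.4, as in the text)."""
    version = pdf_version(data)
    if version is None:                   # no recognizable header: cannot tell
        return False
    major, minor = (int(part) for part in version.split("."))
    return (major, minor) > limit
```

Pass in the first few bytes of a file, for example open(path, "rb").read(16), for each PDF listed in an F: line of db.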

Use Rough Cut to Check and Fix the PDF files

Flip through the rough cut. Do all the fonts in the papers display and print correctly? If not, you will need to ask the author to email you a corrected version of the PDF file.

The publications chair has probably put up a webpage with instructions for authors, based on the version at .../aclpub/doc/authors. Remind the author to consult that page.

You can substitute the author's corrected PDF file for the bad version in the final directory. If the length, title, or author list has changed for some reason, you will have to edit or recreate the db file as well.

Also ask the author for the LaTeX or Word sources, so that you can produce the corrected PDF file yourself on your machine, if necessary.

There are three common problems:

Use Rough Cut to Check and Fix the db File

Now you will check that the metadata in the db are correct and correctly formatted.

The db file does not currently permit comments.

Use Rough Cut to Check and Adjust Margins

If you are satisfied with author and title information after making appropriate changes in the db file, regenerate the rough cut by typing the following again:

make draft           # or "make view"

The top/bottom margins of the paper are very often wrong. Unfortunately we have not yet found an automatic method to fix this. However, we have made it relatively easy for you to fix manually.

The rough-cut version of book.pdf includes a margin frame and rulers that should make it easy to detect how much each paper must be shifted to fit. Each paper is also stamped with its submission ID.

Look especially at page 2 of each paper. If the text doesn't fit within the frame, look at the top of the text to see how many millimeters it should be moved, and add an M: line for the paper into db. The format of the line is

M: x-axis-movement y-axis-movement[,more-options]

Positive values move up and to the right. Negative values move down and to the left. For example, if the top of the text is 14mm below the top of the frame, according to the ruler, then enter the values 0 14.

Common values are:
M: 0 6
M: 0 -12

Horizontal correction is rarely needed. In an exceptional case, you may need to shrink a page or clip it against a specified bounding box or something. You can do this by appending some options that will be passed to the LaTeX includegraphics command, which slurps in the page as a graphical object:

M: 0 6,scale=0.95
M: 0 10,bb=0 90 612 792,clip
Ideally, you could use multiple M: lines to handle different pages of a paper differently, but this is not implemented yet.

After entering the margin movement information, you can regenerate the rough cut (just type make draft again) and check that everything is at the right place. Iterate until convergence.
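
The M: format and its sign convention can be made concrete with a short Python sketch. It is an illustration of the format described above, not code from aclpub; the names parse_margin and margin_line are hypothetical.

```python
def parse_margin(line):
    """Parse 'M: x y[,options]' into (x_mm, y_mm, options_string)."""
    if not line.startswith("M:"):
        raise ValueError(f"not an M: line: {line!r}")
    body = line[2:].strip()
    shift, _, options = body.partition(",")   # options start at the first comma
    x_text, y_text = shift.split()            # exactly two numbers are expected
    return float(x_text), float(y_text), options


def margin_line(mm_below_frame_top):
    """Build an M: line for text whose top sits this many mm below the frame top.

    Positive y moves the page up, so text that sits too low needs a positive
    shift; pass a negative number if the text starts above the frame.
    """
    return f"M: 0 {mm_below_frame_top}"
```

For instance, margin_line(14) gives the line from the example above, M: 0 14, and parse_margin splits M: 0 10,bb=0 90 612 792,clip into a shift of 0 and 10 millimeters plus the option string bb=0 90 612 792,clip.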

Note: If you want the rough cut to include more than 2 pages of each paper, look in the Makefile to see where the number 2 was specified.

Set Schedule (or at least order of papers)

In an ACL proceedings, papers should be ordered chronologically by their time in the program. This makes the proceedings volume useful at the conference itself.

But the scripts you are using simply use the order of the papers in the db file. Therefore, it is necessary to reorder the db file chronologically.

We provide another mechanism to help you do this with low risk of error. This mechanism also helps you add special H: and X: lines to the db file. Those lines help determine the workshop program that will appear at the front of the proceedings.

Your first step is to create a draft order file from the current db. Type

make get-order

Here (roughly) is the start of the draft order file generated for the ACL 2005 main conference:

* Wednesday, June 29, 2005
+ 8:45--9:00 Opening Remarks
+ 9:00--10:00 Invited Talk by John Doe
= Session 1: Important Matters Resolved
7 10:00--10:30 # Charniak: Coarse-to-fine n-best parsing ...
20 10:00--10:30 # Liu: Log-linear Models for Word Ali...
36 10:00--10:30 # Boulis: A Quantitative Analysis of Lex...
57 10:00--10:30 # Sasaki: Question Answering as Question...
60 10:00--10:30 # Nivre: Pseudo-Projective Dependency P...
61 10:00--10:30 # Stevenson: A Semantic Approach to IE Patt...
62 10:00--10:30 # Hutchinson: Modelling the substitutability...

You should manually reorder this file and insert additional information about days, sessions, and extra events. Your final order file might begin like this:

* Sunday, June 26, 2005
+ 8:45--9:00 Opening
+ 9:00--10:00 Invited Talk by Justine Cassell
+ 10:00--10:30 Break
= Session M1R: Machine Learning and Statistical Models
215 10:30--11:00 # Ando: A High-Performance Semi-Superv...
304 11:00--11:30 # Trevor: Scaling Conditional Random Fie...
382 11:30--12:00 # Smith: Logarithmic Opinion Pools for ...
= Session M1M: Word Sense Disambiguation
228 10:30--11:00 # Curran: Supersense Tagging of Unknown ...
124 11:00--11:30 # Kohomban: Learning Semantic Classes for ...
240 11:30--12:00 # Dang: The Role of Semantic Roles in ...
= Session M1B: Generation
245 10:30--11:00 # Di Eugenio: Aggregation improves learning:...
417 11:00--11:30 # Paiva: Empirically-based Control of N...
305 11:30--12:00 # Soricut: Towards Developing Generation ...
+ 12:00--1:30 Lunch
= Session M2R: Parsing
403 1:30--2:00 # Matsuzaki: Probabilistic CFG with latent ...
177 2:00--2:30 # Miyao: Probabilistic disambiguation m...
65 2:30--3:00 # McDonald: Online Large-Margin Training o...
60 3:00--3:30 # Nivre: Pseudo-Projective Dependency P...

The order file can contain these kinds of lines:

Once you have edited the order file, you can again try

make draft           # or "make view"

You are free to edit the db and order files whenever you want. The db file still controls the look of the proceedings. However, whenever you change the order file, the db file will be rearranged to match it, and this in turn will affect the proceedings. (If the order file is inconsistent with the db file, you'll get an error message about extra or missing entries.)
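
If you want to spot extra or missing entries before running make, the comparison can be sketched in a few lines of Python. This is not the check that the makefile performs; it is a rough stand-in built on assumptions read off the examples above: paper lines in the order file start with a numeric submission ID, lines starting with *, + or = are days, extra events and session titles, and # begins a comment. The function names are hypothetical.

```python
def order_ids(order_text):
    """Return submission IDs from an order file, in order of appearance."""
    ids = []
    for line in order_text.splitlines():
        line = line.split("#", 1)[0].strip()      # drop trailing "# ..." comments
        if not line or line[0] in "*+=":          # day, extra event, session title
            continue
        first = line.split()[0]
        if first.isdigit():                       # assumption: IDs are numeric
            ids.append(first)
    return ids


def db_ids(db_text):
    """Return the values of all P: lines in a db file."""
    return [line[2:].strip() for line in db_text.splitlines() if line.startswith("P:")]


def compare(order_text, db_text):
    """Return (IDs missing from the order file, IDs in the order file but not in db)."""
    in_order, in_db = order_ids(order_text), db_ids(db_text)
    missing = [i for i in in_db if i not in in_order]
    extra = [i for i in in_order if i not in in_db]
    return missing, extra
```

Two empty lists from compare mean that the two files mention the same set of papers.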

Warning: Don't be tempted to directly edit the H: or X: lines in the db file. (Such edits will indeed affect your proceedings, but those lines in db will be completely replaced when and if you change order again. A safer scheme would make db by combining human-editable files metadata and order; a human should not edit db in place. This is an easy change and would also simplify the makefile.)

Recreate the Proceedings

Once you're satisfied with the rough cut, have a look at what the final proceedings will look like:

make shipout                         # create "final" version of book.pdf

You can switch between make draft and make shipout as often as you like. The target file is always called book.pdf.

For convenience, make book.pdf will update whichever version of book.pdf you are currently working with, and make view will both update and display it.

Edit Workshop-Specific Pages

A number of *.tex files now have to be manually edited for workshop-specific information:

Final Onscreen Check

Regenerate the proceedings, and continue to edit the .tex files until everything looks good:
make shipout           # or "make view"

Make sure that all names are spelled correctly, etc.

Just in case, check the following production issues:

Final Printed Check

Try printing the PDF file on a black-and-white Postscript printer. Just because everything works onscreen does not guarantee that it will work in the printout.

Make sure everything looks fine onscreen and prints fine! Look through the online and printed proceedings carefully. It is the responsibility of the workshop chair to make sure that everything is correct.

Create a Spine Text for the Proceedings

Edit the file spine.tex to reflect the text that you want to appear on the binding of the volume. (Note: If the volume is too thin to support spine text, this file will be ignored.)

Now type

make spine
and check the result with
gv spine.ps

Make Your CD-ROM Directory

Now it's time to build your contribution to the CD-ROM, including HTML pages and BibTeX entries. For this, you will need to create a file called meta with some metadata about your workshop as a whole. Here is an example of the format (but this format will change soon):

abbrev		EduApp
type		WORKSHOP
title 		Second Workshop on Building Educational Applications Using NLP
url		http://www.ets.org/research/conferences/nlp.html
booktitle	Proceedings of the Second Workshop on Building Educational Applications Using NLP
month		June
year		2005
location	Ann Arbor, Michigan
chairs		Jill Burstein (Educational Testing Service)
chairs		Claudia Leacock (Personal Knowledge Technologies)
bib_url		http://www.aclweb.org/anthology/W/W05/W05-02%02d

Contact the publications chair for the correct value of the bib_url field for your workshop. This determines its permanent URL in the ACL Anthology.
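
The %02d at the end of bib_url looks like a printf-style placeholder for a two-digit, zero-padded paper number. The Python sketch below shows how such a pattern would expand and how the meta file could be read. It is an assumption-laden illustration, not the aclpub implementation: the function names are hypothetical, and how aclpub actually assigns paper numbers is not stated in this document.

```python
def parse_meta(text):
    """Parse 'key<whitespace>value' lines; repeated keys (e.g. chairs) accumulate."""
    meta = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)   # key, then the rest of the line
        if len(parts) == 2:
            meta.setdefault(parts[0], []).append(parts[1])
    return meta


def paper_url(bib_url, paper_number):
    """Fill a printf-style pattern such as '...W05-02%02d' with a paper number."""
    return bib_url % paper_number
```

With the example meta file above, paper_url applied to the bib_url value and the number 7 yields http://www.aclweb.org/anthology/W/W05/W05-0207.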

To create the cdrom directory, try

make cdrom

To create an advertisement webpage for your workshop (basically a copy of cdrom/program.html that has been stripped of actual links to the papers), try

make advertisement

Wrap Everything Up and Send It Off!

You should still be in the "proceedings" directory that you created at the beginning. Type the following:

make all

This will check that everything is up to date, and package up your proceedings directory as a tarball file proceedings.tgz in the parent directory.

Send ../proceedings.tgz by the deadline to the publications chair. Use an email attachment if necessary, but it would be nicer just to email an http: or ftp: URL where the proceedings.tgz file can be downloaded. One day, the makefile should know how to upload to a standard ftp site or something ...

While we want the whole tarball so that we can rerun the scripts in case of problems, it is especially essential that we receive:

Congratulations and thank you!


This document has been derived from the instructions written by the NAACL-HLT 2009 publications chairs (Eric Ringger and Christy Doran).