FAQ Proofreading Guidelines

From DPCanadaWiki

This is version 4.0 of the Proofreading Guidelines, issued September 2012. They will become the basis for P3 Qualifications as of October 1, 2012. The rules for Greek characters were updated effective May 1, 2014.

Proofreading Guidelines refers to a document which contains all the "default" instructions and standards for proofreading (such as how to handle hyphenated words and letters with diacriticals) in rounds P1, P2, and P3. These standards apply to all projects, unless specifically over-ridden by instructions from the Project Manager in the Project Comments or Project Discussion.

NB. If you came here looking for the Formatting Guidelines, please follow this link: Formatting Guidelines

For complete clarity, the PM may make any exceptions to these Guidelines in the Project Comments, and the PPer can make any changes deemed appropriate in the PP stage. The only principle for the PPer to follow is consistency--either make the same consistent change throughout the book, or consistently stick to the original treatment of the author.

Please note that, where it may seem that the Guidelines encourage something other than a strict "match the scan" approach, it is because we are converting paper pages into unpaginated and rewrapped e-text.

A clarification to Tables is effective immediately, as of October 7, 2022.

You can access the Proofreading Guidelines from FAQ Central and from any Proofing Interface window.

IMPORTANT: this is a reference document--beginning proofers do not need to memorize the entire document, but everyone should consult it when problems or questions arise. If the matter is still not clear, raise an inquiry in the Project forum.

The Primary Rule

"Don't change what the author wrote!"

The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author puts italics, bold text or a footnote every third word, we mark them italicized, bolded or footnoted. We are proofreaders, not editors. (See Printer Errors/Misspellings for proper handling of obvious misprints.)

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (see End-of-line Hyphenation). Changes such as these help us produce a consistently proofed version of the book. The proofreading rules we follow are designed to achieve this result. Please carefully read the rest of the Proofreading Guidelines with this concept in mind. These guidelines are intended for proofreading only. There is a separate set of Formatting Guidelines for the volunteers who work on the formatting of the text.

To assist the next proofreader, the formatter, and the Post-Processor, we also preserve Line Breaks. This allows them to easily compare the lines in the text to the lines in the image.

About This Document

This document is written to explain the proofreading rules we use to maintain consistency when proofreading a single book that is distributed among many proofreaders, each of whom is working on different pages. This helps us all do proofreading the same way, which in turn makes it easier for the formatter and for the Post-Processor who will complete the work on this e-book.

It is not intended as any kind of a general editorial or typesetting rulebook.

We've included in this document all the items that new users have asked about while proofreading. If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.

This document is a work in progress. Help us to progress by posting your suggested changes in the Suggestions for Proofreading Guidelines Revisions II.

The original version of the Proofreading Guidelines was created in May of 2008, and revised substantially in November 2009. This document is the fourth version, dated September 2012.

General Comment: if in doubt about making a change while proofing, always leave a [**note], and inquire in the Project Forum, rather than making "silent" corrections.

Project Comments

On the main Project Page, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start proofreading pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. There may also be instructions in the project comments that apply to the formatting phase, which do not apply during proofing. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project.

Please also read the Project Thread (Forum). The Project Manager may clarify project-specific guidelines here, and proofreaders often use it to alert one another to recurring issues within the project and how they can best be addressed. (See the next section.)

On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other proofreaders have made changes.

Forum/Discuss This Project

On the proofreading interface page (Project Page) where you start proofreading pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other proofreaders who are working on this book.

Fixing your own errors on Previous Pages

When you select a project for proofreading, the main Project Page is loaded. This page contains links to pages from this project that you have recently proofread. (If you haven't proofread any pages yet, there will be no links shown.)

Pages listed under either "DONE" or "IN PROGRESS" are available to make proofreading corrections or to finish proofreading. Just click on the link to the page. So if you discover that you made a mistake on a page, or marked something incorrectly, you can click on that page here and re-open it to fix the error.

You can also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Comments page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.

For more detailed information, refer to either the Standard Proofreading Interface Help or the Enhanced Proofreading Interface Help, depending on which interface you prefer.

Marking Errors and/or Comments

Throughout this document you will see instructions about marking possible errors and/or your comments with [** comment]. Unless the Project Manager instructs otherwise, the best place to insert your comment is directly after the word or character or other item that you have a concern with or question about; that will allow the next person who sees the page as well as the Post-Processor to know precisely what you are commenting on.

At the very least, put your comment on the same line as the item in question, but be clear as to where, in the line, the problem is. Placing your comment on the line above or below or somewhere else on the page can be confusing for others unless your comment relates to the whole page—in which case, place it at the top of the page.

Even when you post a note in the Project Discussion Forum, it's a good idea to mark the item that you would like someone to look at for you.

There is a [** ] button at the bottom of the Proofreading Interface screen that you can use for inserting your comments. Start your note with a square bracket and two asterisks [** and end it with another square bracket ]. This clearly separates it from the Author's text and signals the Post-Processor to stop and carefully examine this part of the text and the matching image to address any issues. You may add your agreement or disagreement with another proofreader's comment, but even if you know the answer, you absolutely must not remove the comment of a previous proofreader. If you have found a source which clarifies the problem, please cite it so the Post-Processor can also refer to it.

Proofers should not change typos in the text file, but make a note, like [**typo: text].
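
As an illustration only, a Post-Processor could collect every note on a page with a short script. This is a minimal sketch, not part of any DPC tool: the function name find_notes and the sample page text are assumptions made up for the example.

```python
import re

# Matches a proofreader note such as [**typo: the] or [** unclear].
NOTE_PATTERN = re.compile(r"\[\*\*[^\]]*\]")


def find_notes(page_text):
    """Return (line_number, note) pairs for every [** ...] note on a page."""
    found = []
    for line_number, line in enumerate(page_text.splitlines(), start=1):
        for match in NOTE_PATTERN.finditer(line):
            found.append((line_number, match.group(0)))
    return found


# Hypothetical page text, invented for this example.
page = "He saw teh[**typo: the] cat\nand ran away.[** missing quote?]"
for line_number, note in find_notes(page):
    print(line_number, note)
```

Because each note sits on the same line as the item it refers to, the line number is enough to locate the spot in the page image.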

How To Proof

Paragraph and Line Spacing

Paragraph Spacing/Indenting

Use a blank line to separate paragraphs. You should not indent the start of paragraphs, but if all paragraphs are already indented, don't bother removing those spaces—that can be done automatically during post-processing.

If the page starts with a new paragraph, place a blank line at the top of the page. However, remove any extra blank lines between paragraphs or at the top of the pages.

See the Page Headers/Page Footers image/text for an example.

Line Breaks

Leave all line breaks in so that later in the process other volunteers can easily compare the lines in the text to the lines in the image. If the previous proofreader removed the line breaks, please replace them so that they once again match the image.

Extra Spaces or Tabs Between Words

Extra spaces between words are common in OCR output. You don't need to bother removing these—that can be done automatically during post-processing.

However, extra spaces around punctuation, em-dashes, quote marks, ellipses, etc. do need to be removed when they separate the punctuation mark from the word.

For example, in the sentence A horse ;   my kingdom for a horse, the space between the word "horse" and the semicolon should be removed. But the 2 spaces after the semicolon are fine—you don't have to delete one of them.

In addition, if you find any tab characters in the text you should remove them.

Trailing Space at End-of-line

Do not bother inserting spaces at the ends of lines of text. It is a waste of your time for something that we can take care of automatically later. Similarly do not waste your time removing extra spaces at the ends of lines.

Blank Line at end of page

Do not add or remove blank lines at the end of a page. (minor revision November 2021)

Single Word at Bottom of Page

Proofread this by deleting the word, even if it's the second half of a hyphenated word.

In some older books, the single word at the bottom of the page (called a "catchword", usually printed near the right margin) indicates the first word on the next page of the book (called an "incipit"). It was used to alert the printer to print the correct reverse (called "verso"), to make it easier for printers' helpers to make up the pages prior to binding, and to help the reader avoid turning over more than one page.

Punctuation

Punctuation

In general, there should be no space before punctuation characters except opening quotation marks. If scanned text has a space before punctuation, remove it.

Spaces before punctuation sometimes appear because books typeset in the 1700s & 1800s often used partial spaces before punctuation such as a semicolon or comma. As well, many OCR applications frequently interpret partial spaces as full spaces which will result in spaces appearing where they shouldn't be.

For example, proofread:

and so it goes ; ever and ever.

as:

 and so it goes; ever and ever.
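
The rule above can be sketched as a small check. This is an assumption-laden illustration, not a DPC tool: the function name close_up_punctuation is made up, and the sketch deliberately handles only ; : , ! ? so that spaced ellipses and opening quotation marks are left alone. It would not suit a LOTE project whose Project Comments ask for spaces before punctuation to be kept.

```python
import re

# One or more spaces or tabs directly before ; : , ! or ?
SPACE_BEFORE_PUNCTUATION = re.compile(r"[ \t]+([;:,!?])")


def close_up_punctuation(line):
    """Remove spaces that separate a punctuation mark from the preceding word."""
    return SPACE_BEFORE_PUNCTUATION.sub(r"\1", line)


print(close_up_punctuation("and so it goes ; ever and ever."))
```

Spaces after the punctuation mark are not touched, in line with the guidance that extra spaces between words can be removed automatically during post-processing.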

End of Sentence Periods

Proofread periods that end sentences with a single space after them.

You do not need to remove extra spaces after periods if they're already in the scanned text—that can be done automatically during post-processing. See the Sidenotes or paragraph Side-Descriptions image and text for an example.

Dashes, Hyphens, and Minus Signs

There are generally four such marks you will see in books (see the examples below):

Hyphens

These are used to join words together, or sometimes to join prefixes or suffixes to a word. Leave these as a single hyphen, with no spaces on either side. Note that there is a common exception to this shown in the second example below.

En-dashes

These are just a little longer, and are used for a range of numbers, or for a mathematical minus sign. Proofread these as a single hyphen, too. Spaces before or after are determined by the way it was done in the book: generally there are no spaces in number ranges; usually there are spaces around mathematical minus signs, sometimes both sides, sometimes just before.

Em-dashes & long dashes

These serve as separators between words—sometimes for emphasis like this—or when a speaker gets a word caught in his throat——! Proofread these as two hyphens if the em-dash is short and four hyphens if the em-dash is long. Don't leave a space before or after, even if it looks as if there was a space in the original book image. An easy "rule of thumb", to use if you are unsure if the dash should be "two" or "four" hyphens long, is to look at the lines above and/or below the dash: if the dash appears to be at least 3 letters long, then it is a long dash and you should use four hyphens.

Note: If an em-dash appears at the start or end of a line of your OCR'd text, move the em-dash and/or the first word of the lower line to the end of the upper line so that there are no spaces or line breaks around it. Only if the author used an em-dash to start or end the paragraph or line of poetry or dialog should you leave it at the start or end of a line; and do not leave a space between the em-dash and the word next to it.

Occasionally, you may see an em-dash at the end of a sentence which is immediately followed by another sentence. There may be a space following that em-dash; in such cases, it is best to close up the space and leave a [**space?] note for the PPer who will then determine if a space is actually required. Example: "I wish you would--[**space?]Is there any point in telling you what I wish for you?" [June 2019: recommendation for dealing with sentence-ending dashes.]

Deliberately Omitted or Censored Words or Names or Letters

Proofread these as 4 hyphens. When it represents a word, we leave appropriate space around it as if it's really a word. If it's only part of a word, then no spaces--join it with the rest of the word. If the dash looks as if it is the size of the rest of the smaller em-dashes, then proof it as a single em-dash, i.e., two hyphens. If the dash looks as if it is the size of a long dash, then proof it as two em-dashes, i.e., four hyphens.
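
The conversions described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes the OCR text uses the Unicode en-dash (U+2013) and em-dash (U+2014), and that a long dash appears as two or more em-dash characters in a row. The function name ascii_dashes is hypothetical. Spaces are left alone, because whether they stay depends on the kind of dash.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"


def ascii_dashes(line):
    """Replace en-dashes with one hyphen, em-dashes with two, long dashes with four."""
    # Runs of two or more em-dashes are treated as one long dash.
    line = re.sub(EM_DASH + "{2,}", "----", line)
    # Any remaining single em-dash becomes two hyphens.
    line = line.replace(EM_DASH, "--")
    # En-dashes (number ranges, minus signs) become a single hyphen.
    return line.replace(EN_DASH, "-")


print(ascii_dashes("See pages 21\u201325"))
print(ascii_dashes("I am hurt;\u2014A plague"))
print(ascii_dashes("As the witness Mr. S\u2014\u2014 testified,"))
```

A script like this cannot judge how long a dash looks in the image, so the "rule of thumb" above still needs a human eye.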

 Original Text:                                 Correctly Proofread Text:                         Type:
 semi-detached                                  semi-detached                                     Hyphen
 three- and four-part harmony                   three- and four-part harmony                      Hyphens
 discoveries which the Crus-                    discoveries which the Crusaders                   Hyphen
 aders made and brought home with               made and brought home with
 factors which mould char-                      factors which mould character--environment,       Hyphen & Em-dash       
 acter—environment, training and                training and
 heritage,                                      heritage,
 See pages 21–25                                See pages 21-25                                   En-dash
 –14° below zero                                -14° below zero                                   En-dash
 X – Y = Z                                      X - Y = Z                                         En-dash
 I am hurt;—A plague                            I am hurt;--A plague                              Em-dash
 on both your houses!—I am dead.                on both your houses!--I am dead.
 sensations—sweet, bitter, salt and sour        sensations--sweet, bitter, salt and sour--if      Em-dash
 —if even all of these are simple tastes.       even all of these are simple tastes.
 senses—touch, smell, hearing, and              senses--touch, smell, hearing, and                Em-dash
 sight—with which we are here concerned,        sight--with which we are here concerned,
 It is the east, and Juliet is the sun!—        It is the east, and Juliet is the sun!--          Em-dash
 "Three hundred———" "years," she was            "Three hundred----" "years," she was              Longer Em-dash
 going to say, but the left-hand cat            going to say, but the left-hand cat
 interrupted her.                               interrupted her.
 As the witness Mr. —— testified,              As the witness Mr. ---- testified,                 long dash
 As the witness Mr. S—— testified,             As the witness Mr. S---- testified,                long dash
 the famous detective of ——B Baker St.         the famous detective of ----B Baker St.            long dash
 "You —— Yankee", she yelled.                  "You ---- Yankee", she yelled.                     long dash
 "I am not a d———d Yankee", he replied.         "I am not a d----d Yankee", he replied.            Longer Em-dash

End-of-page or Start-of-page Em-dashes

If an em-dash appears at the very start or end of a page of your OCR'd text, place an * at the open end to alert the Post-Processor that the dash probably needs to be joined to another line on the previous or next page.

For example, proofread:

  smell, hearing and sight—   (appearing at the end of a page)

and:

  —no matter what inducements (appearing at the beginning of a page)

as:

  smell, hearing and sight--*

and:

  *--no matter what inducements
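
The page-edge marking above can be sketched as follows. This is a minimal illustration, assuming the em-dashes have already been proofread as two hyphens; the function name mark_page_edge_dashes is made up for the example.

```python
def mark_page_edge_dashes(page_text):
    """Put * on the open side of a "--" that starts or ends the page text."""
    lines = page_text.split("\n")
    # Blank lines at the top or bottom of the page are kept as they are.
    filled = [index for index, line in enumerate(lines) if line.strip()]
    if not filled:
        return page_text
    first, last = filled[0], filled[-1]
    if lines[last].rstrip().endswith("--"):
        lines[last] = lines[last].rstrip() + "*"
    if lines[first].lstrip().startswith("--"):
        lines[first] = "*" + lines[first].lstrip()
    return "\n".join(lines)


print(mark_page_edge_dashes("smell, hearing and sight--"))
print(mark_page_edge_dashes("--no matter what inducements"))
```

Running the function twice gives the same result as running it once, because a dash that already carries its * no longer starts or ends the text with "--".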

End-of-line Hyphenation

Where a hyphen appears at the end of a line, join the two halves of the hyphenated word back together. If it is really a hyphenated word like "well-meaning", join the two halves leaving the hyphen in between. But if it was just hyphenated because it wouldn't fit on the line, and is not a word that is usually hyphenated, then join the two halves and remove the hyphen. Keep the joined word on the top line, and put a line break after it to preserve the line formatting—this makes it easier for volunteers in later rounds. See the Dashes, Hyphens and Minus Signs section above for examples of each kind (nar-row turns into narrow, but low-lying keeps the hyphen). If the word is followed by punctuation, then carry that punctuation onto the top line, too.

Words like to-day and to-morrow, that we don't commonly hyphenate now, were often hyphenated in the older books we are working on. Leave them hyphenated the way the author did. If you are not sure whether the author hyphenated a word or not, leave the hyphen, put an * after the hyphen, and join the word together. Like this: to-*day. [This is an exception to the "rule" for putting [** explanation] when you are unsure of something.] The asterisk will bring it to the attention of the Post-Processor, who has access to all the pages, and can determine how the author typically wrote this word.
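
The rejoining steps can be sketched in code. This is a rough illustration, not how any DPC tool works: the function name rejoin_hyphenated and the two word lists are assumptions, standing in for the proofreader's own judgement about which words the author hyphenated. Anything not found in either list keeps its hyphen and gets the * marker.

```python
import string


def rejoin_hyphenated(lines, hyphenated_words=(), plain_words=()):
    """Move the second half of a word split by an end-of-line hyphen to the upper line."""
    result = list(lines)
    i = 0
    while i < len(result) - 1:
        upper = result[i].rstrip()
        lower = result[i + 1].lstrip()
        # A trailing "--" is an em-dash, not a hyphen, so it is skipped here.
        if upper.endswith("-") and not upper.endswith("--") and lower:
            second_half, _, rest = lower.partition(" ")
            stem = upper[:-1]
            first_half = stem.rpartition(" ")[2]
            # Trailing punctuation travels with the word but is ignored for lookup.
            bare = second_half.rstrip(string.punctuation)
            if (first_half + bare).lower() in plain_words:
                joiner = ""      # e.g. nar-row becomes narrow
            elif (first_half + "-" + bare).lower() in hyphenated_words:
                joiner = "-"     # e.g. low-lying keeps its hyphen
            else:
                joiner = "-*"    # unsure: leave it for the Post-Processor
            result[i] = stem + joiner + second_half
            if rest:
                result[i + 1] = rest
            else:
                del result[i + 1]  # the lower line held only the moved word
        i += 1
    return result


page = ["discoveries which the Crus-", "aders made and brought home with"]
print(rejoin_hyphenated(page, plain_words={"crusaders"}))
print(rejoin_hyphenated(["until to-", "day, at least"]))
```

The line breaks stay where they were apart from the moved word, so the lines can still be compared with the image.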

End-of-page Hyphenation

Proofread end-of-page hyphens by leaving the hyphen at the end of the last line, and mark it with an * after the hyphen.

For example, proofread:

 something Pat had already become accus-

as:

 something Pat had already become accus-* 

The * will indicate to the Post-Processor that the word must be rejoined when the pages are combined to produce the final e-book.

Of course, at the top of the next page, the proofer would insert an * to indicate the partial word needs to be rejoined (if it is obvious that it is a partial word), such as a continuation of the example above.

 *tomed to.

Quotation Marks

Double Quotes

Unless otherwise instructed, for English language projects, you can still proofread these as plain ASCII " double quotes. However, if the PM used "curly" double quotes in the OCR'd text, please leave them as is. Do not change double quotes to single quotes; leave them as the Author wrote them: as double quotes.

Unless instructed otherwise, for LOTE projects, match the quote style as shown in the text images. French guillemets, «like this», are available from the Character Picker menu (+) in the proofreading interface. Quotation marks used in many German texts, „like this“, are also available in the Character Picker menu (+). These LOTE double quote marks are also available as buttons on the Proofing Interface along with Italian double quotes and reversed double guillemets.

The Project Manager may instruct you in the Project Comments to proofread double quotation marks differently for a particular book. (Revised September 2021)

Single Quotes

Unless otherwise instructed, for English language projects, you can still proofread these as the plain ASCII ' single quote (e.g., apostrophe). However, if the PM used "curly" single quotes in the OCR'd text, please leave them as is. Do not change single quotes to double quotes; leave them as the Author wrote them: as single quotes.

Unless instructed otherwise, for LOTE projects, match the quote style as shown in the text images. Single guillemets, ‹like this›, are available from the Character Picker menu (+) in the proofreading interface. Single quotation marks used in German texts, ‚like this‘, are also available in the Character Picker menu (+). Please do not confuse them with commas or apostrophes, which can look different depending on the font you use.

The Project Manager may instruct you in the Project Comments to proofread single quotation marks differently for a particular book. (Revised September 2021)

Quote Marks on each line

Proofread quotation marks at the beginning of each line of a quotation by removing all of them except for the one at the start of the first line of the quotation.

If the quotation goes on for multiple paragraphs, each paragraph should have an opening quote mark on the first line of the paragraph.

Often there is no closing quotation mark until the very end of the quoted section of text, which may not be on the same page you are proofreading. Leave it that way—do not add closing quotation marks that are not in the page image. However, if you are concerned about the lack of closing quotes, you can always leave a [** comment] for the PPer.
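
As a rough sketch of the rule above, the following removes running quote marks inside each paragraph. It is a heuristic only, with assumptions worth stating: the function name strip_running_quotes is made up, paragraphs are taken to be separated by one blank line, and a paragraph is treated as a running quotation only when every one of its lines starts with the quote mark. Ordinary dialog could trip it, so the result would still need checking against the image.

```python
def strip_running_quotes(page_text, mark='"'):
    """Keep the quote mark on the first line of each paragraph, drop the repeats."""
    cleaned = []
    for paragraph in page_text.split("\n\n"):
        lines = paragraph.split("\n")
        if len(lines) > 1 and all(line.startswith(mark) for line in lines):
            # Only the first line of the paragraph keeps its opening quote.
            lines = [lines[0]] + [line[len(mark):].lstrip() for line in lines[1:]]
        cleaned.append("\n".join(lines))
    return "\n\n".join(cleaned)


sample = (
    '"Once upon a time\n'
    '"there lived a king\n'
    '"who had three sons.\n'
    '\n'
    '"The eldest was\n'
    '"very proud.'
)
print(strip_running_quotes(sample))
```

No closing quotation marks are added, which matches the instruction to leave them out when they are not in the page image.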

Missing Quote Marks

Watch out for a missing opening quote at the start of the first paragraph of a chapter or section, which some publishers did not include, or which the OCR missed, due to a large capital (or drop cap) in the original. If the author started the paragraph with dialog, insert a double quote (or single quote, if that is the style of the book). If you are unsure whether a quote mark should be added, you can always leave a note [** " ?] or [** ' ?]. [2019]

Period Pause "..." (Ellipsis)

The guidelines are the same for English and Languages Other Than English (LOTE).

Use the general rule "Follow closely the style used in the printed page." In particular, insert spaces, if there are spaces before or between the periods, and use the same number of periods as appear in the image. Sometimes the printed page is unclear: in that case, insert [**unclear] to draw the attention of the Post-Processor.

Note: As in every other case, the Post-Processor may make a final determination to insert non-breaking spaces between the word and the ellipsis, or between the dots of the ellipsis, or to format the ellipses throughout the book in some differing style.

To decide whether there should be spaces between dots in an ellipsis, look at how much room there is between the last letter of a sentence and a normal sentence-ending period (full stop). That defines "no space" for that project. If the spacing between the dots of an ellipsis is greater than "no space", add a space between the dots.

When an ellipsis is followed immediately by punctuation (exclamation point, question mark, colon or possibly semi-colon or comma), leave a space if there is one in the scan. If there is (also) an extra long space before the start of the next sentence (a 2en space), use only a normal space, to avoid confusing the PPing software. Do NOT leave a space if the ellipsis is next to any other "container", such as brackets, braces or parentheses.

Spelling and related subjects

Accented/Non-ASCII Characters

Please proofread these using the proper accented characters, where possible, using the character drop-down menus available on the Proofing Interface. See Characters with Diacritical marks for ways to proof some characters that may not be on the menus.

Project Gutenberg Canada requires ISO-8859-1 versions of texts, but welcomes versions using other character encodings which can preserve more of the information from the original text. PGC will normally post HTML and UTF-8 text versions. DPC expects Post-Processors to commit to producing, or arrange for someone else to produce, an HTML version.

For Windows:

If you find the drop-down lists cumbersome, you can use these alternative techniques:

(a) Use the Character Map program (Start: Run: charmap) to select an individual letter, and then cut & paste.

(b) Use an on-line program, such as Edicode.

(c) Type the Alt+NumberPad shortcut codes for these characters.

- This is faster than using cut & paste, once you get used to the codes.

- Hold down the Alt key and type the four digits on the Number Pad (the number row over the letters won't work).

- You must type all 4 digits, including the leading 0 (zero).

- These shortcut or key codes are for the US-English keyboard layout. They may not work for other keyboard layouts.

The table below shows the codes we use.

Windows Shortcuts for common UTF-8 characters
 ` grave         ´ acute (aigu)  ^ circumflex    ~ tilde         ¨ umlaut        ° ring          Æ ligature
 à Alt-0224      á Alt-0225      â Alt-0226      ã Alt-0227      ä Alt-0228      å Alt-0229      æ Alt-0230
 À Alt-0192      Á Alt-0193      Â Alt-0194      Ã Alt-0195      Ä Alt-0196      Å Alt-0197      Æ Alt-0198
 è Alt-0232      é Alt-0233      ê Alt-0234                      ë Alt-0235
 È Alt-0200      É Alt-0201      Ê Alt-0202                      Ë Alt-0203
 ì Alt-0236      í Alt-0237      î Alt-0238                      ï Alt-0239
 Ì Alt-0204      Í Alt-0205      Î Alt-0206                      Ï Alt-0207      / slash         Œ ligature
 ò Alt-0242      ó Alt-0243      ô Alt-0244      õ Alt-0245      ö Alt-0246      ø Alt-0248      œ Alt-0156
 Ò Alt-0210      Ó Alt-0211      Ô Alt-0212      Õ Alt-0213      Ö Alt-0214      Ø Alt-0216      Œ Alt-0140
 ù Alt-0249      ú Alt-0250      û Alt-0251                      ü Alt-0252
 Ù Alt-0217      Ú Alt-0218      Û Alt-0219                      Ü Alt-0220      currency        mathematics
                                                 ñ Alt-0241      ÿ Alt-0255      ¢ Alt-0162      ± Alt-0177
                                                 Ñ Alt-0209      Ÿ Alt-0159      £ Alt-0163      × Alt-0215
 çedilla         Icelandic       marks           accents         punctuation     ¥ Alt-0165      ÷ Alt-0247
 ç Alt-0231      Þ Alt-0222      © Alt-0169      ´ Alt-0180      ¿ Alt-0191      $ Alt-0036      ¬ Alt-0172
 Ç Alt-0199      þ Alt-0254      ® Alt-0174      ¨ Alt-0168      ¡ Alt-0161      ¤ Alt-0164      ° Alt-0176
 superscripts    Ð Alt-0208      ™ Alt-0153      ¯ Alt-0175      « Alt-0171                      µ Alt-0181
 ¹ Alt-0185      ð Alt-0240      ¶ Alt-0182      ¸ Alt-0184      » Alt-0187      ordinals        ¼ Alt-0188
 ² Alt-0178      sz ligature     § Alt-0167                      · Alt-0183      º Alt-0186      ½ Alt-0189
 ³ Alt-0179      ß Alt-0223      ¦ Alt-0166                      * Alt-0042      ª Alt-0170      ¾ Alt-0190
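
The four-digit codes in the table line up with the Windows-1252 code page, so they can be looked up rather than memorized. The sketch below shows that relationship; the helper names alt_code and character_for are made up for this example, and the link to Windows-1252 is stated here as an observation about the US-English setup the table assumes, not as something this document specifies.

```python
def alt_code(character):
    """Return the Alt+NumberPad code for one character, e.g. Alt-0233 for é."""
    # Raises UnicodeEncodeError if the character is not in Windows-1252.
    value = character.encode("cp1252")[0]
    return "Alt-{:04d}".format(value)


def character_for(code):
    """Return the character typed by a code written like Alt-0233."""
    digits = code.split("-")[1]
    return bytes([int(digits)]).decode("cp1252")


print(alt_code("\u00e9"))         # é
print(character_for("Alt-0153"))  # trademark sign
```

A character that raises an error here has no four-digit code of this kind and has to come from the drop-down menus or the Character Map program instead.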


Do not use other special characters (for example, astrology symbols) unless the Project Manager tells you to in the Project Comments.

For Apple Macintosh:

If you find the drop-down lists cumbersome, you can use these alternative techniques:

(a) Use the "Key Caps" program as a reference.

- In OS 9 & earlier, this is located in the Apple Menu; in OS X through 10.2, it is located in Applications, Utilities folder. This brings up a picture of the keyboard, and pressing shift, opt, command, or combinations of those keys shows how to produce each character. Use this reference to see how to type that character, or you can cut & paste it from here into the text in the proofreading interface.

- In OS X 10.3 and higher, the same function is now a palette available from the Input menu (the drop-down menu attached to your locale's flag icon in the menu bar). It's labeled "Show Keyboard Viewer." If this isn't in your Input menu, or if you don't have that menu, you can activate it by opening System Preferences, the "International" panel, and selecting the "Input Menu" pane. Ensure that "Show input menu in menu bar" is checked. In the spreadsheet view, check the box for "Keyboard Viewer" in addition to any input locales you use.

(b) Type the Apple Opt- shortcut codes for these characters.

- This is a lot faster than using cut & paste, once you get used to the codes.

- Hold the Opt key and type the accent symbol, then type the letter to be accented (or, for some codes, only hold the Opt key and type the symbol).

- These instructions are for the US-English keyboard layout. They may not work for other keyboard layouts.

The table below shows the codes we use.

Apple Mac Shortcuts for common UTF-8 characters
 ` grave         ´ acute (aigu)  ^ circumflex    ~ tilde         ¨ umlaut        ° ring          Æ ligature
 à Opt-`,a       á Opt-e,a       â Opt-i,a       ã Opt-n,a       ä Opt-u,a       å Opt-a         æ Opt-'
 À Opt-~,A       Á Opt-e,A       Â Opt-i,A       Ã Opt-n,A       Ä Opt-u,A       Å Opt-A         Æ Opt-"
 è Opt-~,e       é Opt-e,e       ê Opt-i,e                       ë Opt-u,e
 È Opt-~,E       É Opt-e,E       Ê Opt-i,E                       Ë Opt-u,E
 ì Opt-~,i       í Opt-e,i       î Opt-i,i                       ï Opt-u,i
 Ì Opt-~,I       Í Opt-e,I       Î Opt-i,I                       Ï Opt-u,I       / slash         Œ ligature
 ò Opt-~,o       ó Opt-e,o       ô Opt-i,o       õ Opt-n,o       ö Opt-u,o       ø Opt-o         œ Opt-q
 Ò Opt-~,O       Ó Opt-e,O       Ô Opt-i,O       Õ Opt-n,O       Ö Opt-u,O       Ø Opt-O         Œ Shift-Opt-Q
 ù Opt-~,u       ú Opt-e,u       û Opt-i,u                       ü Opt-u,u
 Ù Opt-~,U       Ú Opt-e,U       Û Opt-i,U                       Ü Opt-u,U       currency        mathematics
                                                 ñ Opt-n,n       ÿ Opt-u,y       ¢ Opt-4         ± Opt-+
                                                 Ñ Opt-n,N       Ÿ Opt-u,Y       £ Opt-3         × Note 1
 çedilla         Icelandic       marks           accents         punctuation     ¥ Opt-y         ÷ Opt-/
 ç Opt-c         Þ Note 1        © Opt-g         ´ Opt-E         ¿ Opt-?         $ Shift-4       ¬ Opt-I
 Ç Opt-C         þ Shift-Opt-6   ® Opt-r         ¨ Opt-U         ¡ Opt-1         ¤ Shift-Opt-2   ° Opt-*
 superscripts    Ð Note 1        ™ Opt-2         ¯ Shift-Opt-,   « Opt-\                         µ Opt-m
 ¹ Note 1        ð Note 1        ¶ Opt-7         ¸ Opt-Z         » Shift-Opt-\   ordinals        ¼ Note 1
 ² Note 1        sz ligature     § Opt-6                         · Opt-8         º Opt-0         ½ Note 1
 ³ Note 1        ß Opt-s         ¦ Note 1                        * Note 1        ª Opt-9         ¾ Note 1

Note 1: there is no Apple Mac Opt code for this symbol.

Do not use other special characters (for example, astrology symbols) unless the Project Manager tells you to in the Project Comments.

Characters with Diacritical marks

In some projects, you will find characters with special marks either above or below the normal A..Z character. These are called diacritical marks and indicate a special pronunciation for this character. In almost all cases, these characters can be entered using the drop-down menus on the proofing interface.

In rare cases, there may be letters with diacritical marks that are not available on the menus. In these cases, we proofread the letters by using a specific coding, as follows:

Proofreading Symbols for Diacritical Marks
 diacritical mark             sample   above             below
 macron (straight line)                [=x]              [x=]
 2 dots (diaeresis, umlaut)   ¨        [:x]              [x:]
 1 dot                                 [.x]              [x.]
 grave accent                 `        [`x] or [\x]      [x`] or [x\]
 acute accent (aigu)          ´        ['x] or [/x]      [x'] or [x/]
 circumflex                   ^        [^x]              [x^]
 caron (v-shaped symbol)               [vx]              [xv]
 breve (u-shaped symbol)               [)x]              [x)]
 tilde                        ~        [~x]              [x~]
 cedilla                      ¸        [,x]              [x,]


For example, a̮ (an a with a breve below) becomes [a)].

The "x" in the table represents a character with a diacritical mark. When proofreading, use the actual character from the text, not the x shown in the examples. Be sure to include the square brackets ([ ]), so the Post-Processor knows to which letter it applies. He or she will eventually replace these with whatever symbol works in each version of the text they produce, like 7-bit ASCII, ISO-8859-1, UTF-8, HTML, etc.
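
To illustrate how a Post-Processor might later expand this markup for a UTF-8 version, here is a minimal sketch using Unicode combining characters. The names ABOVE, BELOW and decode_diacritics are assumptions made up for the example, and the choice of combining characters is my own mapping rather than anything specified by DPC. A cedilla above the letter ([,x]) is left untouched because this sketch has no combining character assigned to it.

```python
import re
import unicodedata

# Mark symbol -> combining character placed above the letter.
ABOVE = {
    "=": "\u0304", ":": "\u0308", ".": "\u0307", "`": "\u0300", "\\": "\u0300",
    "'": "\u0301", "/": "\u0301", "^": "\u0302", "v": "\u030c", ")": "\u0306",
    "~": "\u0303",
}
# Mark symbol -> combining character placed below the letter.
BELOW = {
    "=": "\u0331", ":": "\u0324", ".": "\u0323", "`": "\u0316", "\\": "\u0316",
    "'": "\u0317", "/": "\u0317", "^": "\u032d", "v": "\u032c", ")": "\u032e",
    "~": "\u0330", ",": "\u0327",
}
MARKUP = re.compile(r"\[(.)(.)\]")


def _expand(match):
    first, second = match.group(1), match.group(2)
    if first in ABOVE and second.isalpha():
        combined = second + ABOVE[first]   # mark comes first: it sits above
    elif first.isalpha() and second in BELOW:
        combined = first + BELOW[second]   # mark comes last: it sits below
    else:
        return match.group(0)              # not diacritical markup, keep as is
    # NFC swaps in a single precomposed character where Unicode has one.
    return unicodedata.normalize("NFC", combined)


def decode_diacritics(text):
    """Expand markup such as [=a] or [a)] into accented characters."""
    return MARKUP.sub(_expand, text)


print(decode_diacritics("t[=a]ne and [a)]"))
```

Any two-character text in square brackets that happens to look like markup, for instance an abbreviation starting with v, would also be converted, so a real tool would need a review step.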

Greek characters

Effective May 1, 2014, DPC no longer requires Greek transliteration. Any/all Greek is to be treated as any other language and only the actual Greek characters are to be input: no transliteration is to be included.

The Character Picker on the Proofing Interface includes ALL standard and accented Greek characters. There is no Transliteration Tool on the Proofing Interface.

If the PM requires transliteration, they must request it in the Project Comments.

The actual Greek characters must be entered directly into the proofed text, at the latest in the P3 round. P2 proofers should note that, though they won’t be docked in P3 qualification for attempting to insert Greek and making errors, they will be docked for simply using [**Greek: ] and deleting the OCR material. Because we have to produce both an HTML version and a text version, we need both the Greek characters and a transliteration, entered like this [Greek: transliterated Greek words]. Use the normal Greek transliteration tool in the Proofing interface, or insert the Greek characters from an alternative keyboard map. Ask for help in the project forum if you’re not sure how to proceed.

Exception: If a substantial part of the text is Greek, (enough that Greek is listed as a language for the project), then only the accented Greek characters are needed and no transliteration should be included.

"Non-Standard" Characters

Some projects contain text printed in non-standard or non-Latin characters--that is, characters other than those that appear on your standard keyboard, or that may not be included in the Character Picker on the Proofing Interface.

Examples include Cyrillic (used in Russian, Slavic and other languages), Hebrew, Arabic, Turkish, etc.

Follow these basic rules:

#1: Do not transliterate.

#2: Transcribe the text if you can. If you can't, post in the Project Discussion at once so it can be dealt with as soon as possible.

#3: If the text contains a lot of a language you can't deal with, don't work on this text, especially if you are a newer proofer working in P1.

If a large portion of the document is written entirely in a non-standard script, it is best to install a keyboard driver that supports the language. Consult your operating system manual for instructions on how to do that. We will be setting up a forum to deal with these matters. [link needed here]

If you are uncertain about a specific character or an accent, mark it with [** accent? / character?] to bring it to the attention of the next proofreader, a formatter or the Post-Processor.

For scripts which cannot be so easily entered, such as Arabic, surround the text with appropriate markers: [Arabic: **] and leave it as scanned. Include the ** so the Post-Processor can address it later.

Very old characters and letter formats

On occasion, you will see books that use letter/character formats that have no equivalent character in modern fonts. Examples are "long s", yogh, thorn, special symbol and abbreviation characters, etc. Currently DP-INT has several wiki pages discussing such old letters and characters; "Proofing Old Texts" and "Early English Text" are two such pages. Project Managers will note the existence of any such characters in the Project Comments along with instructions on how to deal with them.

Usage of u/v and i/j/y:

In the 1600s and earlier, the letters u, v, i, and j (and sometimes y) were used differently than they are today. Often v was used only at the beginning of a word, while u was used in the middle and end, as in these words: vpon, vntil, haue, giue. In addition, occasionally i was used where we have a j today (or vice versa): iudge, obiect, jl. The letter j may also occur in Roman numerals such as iij. In capital letters, there were usually no U or J; only V and I were used. Unless the Project Manager gives special instructions otherwise, proof these just the way they appear in the image. Do not modernize the spelling.

Latitude and Longitude

When geographical positions--that is, latitude and/or longitude--are given in text books, they are generally shown using the degree (°) symbol and often with the prime (′) symbol for the minutes and, rarely for the books that DPC does, the double prime (″) symbol for seconds. You will find all three of the symbols on the Character Picker.

Prime chars3.png

Click on the "+" symbol (marked with a red box) of the Character Picker; that will open the appropriate symbol listing. The prime symbols are on the bottom row at the far left (illustration shows "prime" underneath); the degree symbol is also on the bottom row (marked with an arrow).

Be careful to use the correct symbols; do not use other characters or accents in place of the prime or degree symbols--see the comparison of the degree and ordinal symbols below.

Latitude and longitude positions will generally have the hemisphere specified, N or S, E or W; sometimes two hemispheres will be used. Follow the instructions in the Project Comments regarding any spaces between the hemisphere, degree and/or prime coordinates. If nothing is specified, please match the scan.

Examples (shown, randomly, with and without spaces; and more precise than the old books we see):

  • North Pole: 90° N
  • Toronto, Canada: 43°44′N 79°22′W
  • Cannes, France: 43°33′N 7°E
  • Greenwich, England: 51° 28′ 38″ N, on the prime meridian (0° longitude); its antipodal meridian is at both 180°W and 180°E
  • Quitsato Sundial, Ecuador: 00°00′00″N
  • Sydney, Australia: 33°51′54″S 151°12′34″E
  • Goonellabah, New South Wales: 28° 49′ 12″ S 153° 19′ 41″ E (or 28.8167° S 153.3167° E)

Ordinals and Degree Symbols--which is which on the Character Picker?

When looking at a page image, it is generally very easy to determine from the context whether that little circular symbol next to the number is an ordinal or a degree symbol. For example: "1º" in a listing is obviously an ordinal and the circle generally looks like a little "0" (zero); whereas "latitude 55°" is a degree symbol which appears more like a true circle. But since one or both are not necessarily on your keyboard, you will need to use either computer keycodes or the Character Picker on the Proofing Interface. (See above instructions for location of the degree symbol in the image of the Character Picker "page". The ordinal is the other little circle to the left and next to the little "a".)

Depending on the font that you are using for the proofing interface, the ordinal may show with a little line underneath it—then you know you've got the right one. If you use the recommended DPCustomMono2 font, there will be that small line. Sometimes, however, it can be difficult to figure out which of those 2 little circles on the Character Picker is which.... So, if you are uncertain which to use, you can proof it as a superscript: ^{o} and the next proofer, formatter, or the PPer will know that the correct symbol needs to be input.
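If you work on your text outside the Proofing Interface and want to double-check which of the look-alike symbols you actually typed, a short script can name them for you. The following is only a sketch using Python's standard unicodedata module; the function name describe_symbols and the LOOKALIKES set are made up for this example.

```python
import unicodedata

# The four easily confused symbols discussed in this section.
LOOKALIKES = {
    "\u00b0",  # degree sign
    "\u00ba",  # masculine ordinal indicator
    "\u2032",  # prime (minutes)
    "\u2033",  # double prime (seconds)
}


def describe_symbols(text):
    """Return (character, official Unicode name) for each look-alike found."""
    return [(ch, unicodedata.name(ch)) for ch in text if ch in LOOKALIKES]


# A degree sign and a prime, as in the Toronto example above.
for char, name in describe_symbols("43\u00b044\u2032N"):
    print(char, name)
```

A degree symbol reports as DEGREE SIGN and an ordinal as MASCULINE ORDINAL INDICATOR, so a mix-up is easy to spot.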

Superscripts

Older books often abbreviated words as contractions, and printed them as superscripts. Proofread these by inserting an up-arrow (^) followed by the superscripted text; then surround the text--even if only a single character--with curly braces { and } as well. For example:

  Genʳˡ Washington defeated Lᵈ Cornwall's army.

should be proofed like this:

  Gen^{rl} Washington defeated L^{d} Cornwall's army.

In scientific & technical works, format superscripted characters with curly braces { and } surrounding them even if there is only one character superscripted:

  a² + b² = c²

would be proofread as:

  a^{2} + b^{2} = c^{2}.

Generally, if you see a complicated scientific or mathematical equation and you are unsure how to deal with it, mark it as [**math] or [**equation] and let the next proofer, the formatters or Post-Processor handle it.

In some poetry books, stanza and line numbers are represented by superscripts and curly brackets. Proof them this way, unless the Project Manager requests that some other notation be used.

  3^{3,4}  to mean stanza 3, lines 3 and 4
  or 2^{1-4}  to mean stanza 2, lines 1 to 4
  or 1^{2}  to mean stanza 1, line 2

If the superscript represents a footnote marker, then see the Footnotes/Endnotes section instead.

Subscripts

Subscripted text is often found in scientific or mathematical works, but is not uncommon in other material. Proofread all subscripted text by inserting an underline character _ before the letter, number or other character that is subscripted. As with superscripts, use curly brackets {} to enclose the subscripted characters.

For example:

  H₂O.

would be proofread as:

  H_{2}O.

And:

  The general formula for a simple acyclic alcohol is CₙH₂ₙ₊₁OH

would be proofread as:

  The general formula for a simple acyclic alcohol is C_{n}H_{2n+1}OH
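
The curly braces are what make this notation easy to convert later. As a rough sketch only (the function name scripts_to_html is hypothetical, and the Post-Processor's real tools may work quite differently), the proofed superscript and subscript markup could be turned into HTML like this:

```python
import re


def scripts_to_html(text):
    """Convert ^{...} to <sup>...</sup> and _{...} to <sub>...</sub>."""
    # The braces mark exactly where the raised or lowered text ends.
    text = re.sub(r"\^\{([^{}]*)\}", r"<sup>\1</sup>", text)
    text = re.sub(r"_\{([^{}]*)\}", r"<sub>\1</sub>", text)
    return text


print(scripts_to_html("C_{n}H_{2n+1}OH"))
print(scripts_to_html("a^{2} + b^{2} = c^{2}"))
```

Without the braces, a script could not tell whether "2n+1" or only "2" was subscripted, which is why they are required even for a single character.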

Contractions

Remove any extra space (or partial space) in contractions. This was often an early printers' convention, where the space was retained to indicate that 'would' and 'not' were originally separate words.

For example, proofread:

 would n't

as:

 wouldn't. 

Some Project Managers may specify in the Project Comments not to remove extra spaces in contractions, particularly in the case of texts that contain slang, dialect, or are written in languages other than English. Or, the Project Manager may stress that "match the scan" is the overall guideline, in any case of doubt. In either case, follow the Project Manager's instructions.
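
For anyone who likes to check a page offline, spaced contractions can be found with a simple pattern search. This is a sketch under assumptions: the name find_spaced_contractions and the list of endings are invented for the example, the list is not exhaustive, and the function only reports candidates so that a human can decide, since the Project Comments may ask you to keep the spaces.

```python
import re

# A word, one space, then a common contraction ending. Both the straight (')
# and the curly (’) apostrophe are accepted.
SPACED = re.compile(r"\b(\w+) (n['’]t|['’](?:ll|ve|re|d|s|m))\b")


def find_spaced_contractions(text):
    """Return every 'word ending' pair that looks like a spaced contraction."""
    return [match.group(0) for match in SPACED.finditer(text)]


print(find_spaced_contractions("He would n't say he 'll come."))
```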

See also comments under the section Very old characters and letter formats.

Abbreviations

Unless otherwise noted in the Project Comments, please proof abbreviations as in the images: for example, do NOT put spaces in e.g. or i.e. unless shown in the image.

Formatting and related comments

Any formatting tags, such as <i>…</i>, <b>…</b>, or <sc>…</sc> that have come into the proofing rounds from the project prep or OCR stages should be removed in proofing, because they can mask other errors in characters, alignment or spacing.


Font size changes

Do not mark changes in font size. The formatters will take care of this later in the process.

Large, Ornate opening Capital letter (Drop Cap)

Proofread large and ornate graphic first letters of a chapter, section, or paragraph as just the letter.

The rest of the first word and the next word or two of a new chapter may be printed in small caps but those are treated as ordinary text, unless the project comments say otherwise. Leave the addition of small caps markup for the Formatting rounds.

Italic and Bold Text

Italicized text may occasionally appear with <i> inserted at the start and </i> at the end of the italics. Bold text (text printed in a heavier typeface) may occasionally appear with <b> inserted before the bold text and </b> after it. Please remove these formatting tags, as they may conceal errors in the actual text itself. Do not add such formatting where it does not appear. The formatters will do that later in the process.

Words in Small Capitals or in ALL CAPITALS

Small caps (capital or upper-case letters that are smaller than the standard capitals) may occasionally appear with <sc> inserted before the Small Caps and </sc> after the Small Caps.

Or, you may see Mixed-Case Small Caps. The presence of small caps can usually be determined IF there is regular-sized text on the same line (small caps are generally similar in height to lower-case letters); therefore a chapter/section title on a line by itself that looks like all small caps is really ALL CAPS.

If the scan shows small-capped words, you should use the ABC/Abc/abc buttons at the bottom of the Proofing Interface to change the case of the affected text to match the scan.

To be specific, words that are in full-sized ALL CAPS and those that are in all small caps should be proofed in upper case (that is, in ALL CAPS). For Mixed-Case Small Caps, large capitals should be upper case and small capitals, lower case.

For example, the text image shows:
Small Caps.jpg

Please proof as follows:

 This is ALL CAPS.
 This is ALL SMALL CAPS.
 This is Mixed-Case Small Caps.

Old books often printed the first word or two of every chapter in ALL CAPS or Mixed-case Small Caps; replace the upper-case or small caps so the letter case is what you would see in a normal sentence, unless the project comments give other instructions. (Clarification: February 2021)

Example:

Original text: MANY PEOPLE eat cakes for afternoon tea. 
....would be correctly proofed as:
Many people eat cakes for afternoon tea.

Fractions

Proofread fractions as follows: use the drop-down menus for ¼, ½ and ¾. All other fractions are proofed as follows: whole number, followed immediately by a hyphen, followed immediately by the numerator, then a forward slash, then the denominator. For example 2-5/8, or 11-15/16. The hyphen prevents the whole and fractional part from becoming separated when the lines are rewrapped during post-processing. Improper fractions follow the same rules: 29/7 or 17/16.
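
The rule for "all other fractions" can be written out as a tiny function. This is a sketch for illustration only; the name proof_fraction and its parameters are hypothetical, and it deliberately does not cover ¼, ½ and ¾, which come from the drop-down menus.

```python
def proof_fraction(numerator, denominator, whole=None):
    """Build the proofed form of a fraction other than 1/4, 1/2 or 3/4."""
    fraction = f"{numerator}/{denominator}"
    if whole is None:
        # Simple or improper fraction: numerator, slash, denominator.
        return fraction
    # Mixed number: the hyphen keeps the two parts together on rewrap.
    return f"{whole}-{fraction}"


print(proof_fraction(5, 8, whole=2))
print(proof_fraction(29, 7))
```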


If the text has more than a simple fraction and you are uncertain how to proof it, leave a comment explaining the problem or concern; for example [**unsure how to proof this number].

Mathematics, Equations & Scientific Notations

Generally, proofers do not deal with formatting math or scientific equations or notations except for ensuring that the numbers and/or letters are correct. If there have been no specific instructions noted in the Project Comments, and/or if you are unsure how to deal with something that seems complicated, mark it as [**math] or [**equation] or a similar comment and leave it for the formatters or Post-Processor to handle.

Line Numbers

Line numbers are numbers in the margin for each line, or sometimes every fifth or tenth line, and are common in books of poetry or drama. Since poetry or metred drama will not be reformatted in the e-book version, the line numbers will be useful to readers.

[December 2012 updated] Proof the line numbers by using six spaces to separate them from the other text on the line so that the formatters can easily find them. If the line numbers are located on the left side of the column of poetry or drama, please move them to the right side--on the same line.
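
To make the six-space rule concrete, here is a small sketch of moving a left-hand line number to the right side of the same line. The function name move_line_number is made up for the example, and the sketch assumes that any number at the very start of a line is a line number, which will not be true of a verse that happens to begin with a numeral.

```python
import re

SEPARATOR = " " * 6  # six spaces, as the guideline asks


def move_line_number(line):
    """Move a leading line number to the end of the line."""
    match = re.match(r"^\s*(\d+)\s+(\S.*?)\s*$", line)
    if not match:
        # No leading number: leave the line as it is.
        return line
    number, verse = match.groups()
    return f"{verse}{SEPARATOR}{number}"


print(repr(move_line_number("5  Whose fleece was white as snow")))
```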

Bracketed Lists

When multiple lines in a list are bracketed (to indicate ‘spoken by multiple actors’ in a drama, for example) either at the start or end of the line, proofers are asked to use curly brackets to indicate this inter-line linking. These will be preserved by the PPer in text versions, and possibly replaced by a “big moustache” or large bracket in the HTML version table.

Parts of a book, special text forms, or other situations

Front/Back Title Page

Proofread all the text, just as it was printed on the page, whether all capitals, upper and lower case, etc., including the years of publication or copyright.

Sample Image:

Title Page


Correctly Proofread Text:


GREEN FANCY

BY

GEORGE BARR McCUTCHEON

AUTHOR OF "GRAUSTARK," "THE HOLLOW OF HER HAND,"
"THE PRINCE OF GRAUSTARK," ETC.

WITH FRONTISPIECE BY

C. ALLAN GILBERT

NEW YORK

DODD, MEAD AND COMPANY

1917


Table of Contents

Proofread the Table of Contents just as it is printed in the book, whether all capitals, upper and lower case, etc. Page numbers should be retained.

Ignore any periods or asterisks (leaders) used to align the page numbers. These will be removed later in the process.

Sample Image:

Title Page

Correctly Proofread Text:

CONTENTS

CHAPTER PAGE

I. The First Wayfarer and the Second Wayfarer
   Meet and Part on the Highway . . . . . 1

II. The First Wayfarer Lays His Pack Aside and
    Falls in with Friends . . . . . . . . 15

III. Mr. Rushcroft Dissolves, Mr. Jones Intervenes,
   and Two Men Ride Away . . . . . 35

IV. An Extraordinary Chambermaid, a Midnight
    Tragedy, and a Man Who Said "Thank You" . 50

V. The Farm-boy Tells a Ghastly Story, and an
    Irishman Enters . . . . . . . . . . 67

VI. Charity Begins Far from Home, and a Stroll in
    the Wildwood Follows . . . . . . . . 85

VII. Spun-gold Hair, Blue Eyes, and Various Encounters . . . . . . . . . . . . 103

VIII. A Note, Some Fancies, and an Expedition in
    Quest of Facts . . . . . . . . . . 120

IX. The First Wayfarer, the Second Wayfarer, and
    the Spirit of Chivalry Ascendant .... 134

X. The Prisoner of Green Fancy, and the Lament of
    Peter the Chauffeur . . . . . . . . 148

XI. Mr. Sprouse Abandons Literature at an Early
    Hour in the Morning . . . . . . . . 167

XII. The First Wayfarer Accepts an Invitation, and
    Mr. Dillingford Belabors a Proxy . . . . 183

XIII. The Second Wayfarer Receives Two Visitors
    at Midnight . . . . . . . . . . . . 199

XIV. A Flight, a Stone-cutter's Shed, and a Voice
    Outside . . . . . . . . . . . . . 221

Note: [Sep 12] this table appears to be in small caps, so the text has been changed to upper and lower case for later conversion to SC by the Formatters.

Chapter Headings and Other Major Headings

A Chapter Heading will start a bit farther down the page than the page header and won't have a page number on the same line. Treat all Chapter Headings as "stand-alone paragraphs" even though they may contain only a single word. And as paragraphs they will require a blank line before them even when they are the first, or only, item on the page. [Clarification effective December 2020.]

Proofread Chapter Headings as they appear in the text. Chapter Headings are often printed in ALL CAPS; if so, leave them as ALL CAPS and start them at the left margin.

Other major headings or divisions in the text such as Preface, Foreword, Table of Contents, List of Illustrations, Introduction, Prologue, Epilogue, Appendix, References, Conclusion, Glossary, Summary, Acknowledgements, Bibliography, etc., including section headings, should be proofed in the same way as Chapter Headings, including placing a single blank line before the heading as though it is a "stand-alone" paragraph. [Addition effective December 2020.]

Page Headers/Page Footers

Remove page headers and page footers, but not footnotes [see below], from the text.

The page headers are normally at the top of the image and often have a page number opposite them. Page headers may be the same all through the book (often the title of the book and the author's name), they may be the same for each chapter (often the chapter name), or they may be different on each page (describing the action on that page). Remove them all, regardless, including the page number.

Page footers usually consist of a page number (if it is not part of the page header). Sometimes you will see other letter or number codes on various pages throughout the book--they are generally for the printer's use. Very old books may have "folio" numbers at the bottom of the page. All such footers are to be removed.

Sample Image:

Header1.jpg

Correctly Proofread Text:

In the United States?[*] In a railroad? In a mining company?
In a bank? In a church? In a college?

Write a list of all the corporations that you know or have
ever heard of, grouping them under the heads public and private.

How could a pastor collect his salary if the church should
refuse to pay it?

Could a bank buy a piece of ground "on speculation?" To
build its banking-house on? Could a county lend money if it
had a surplus? State the general powers of a corporation.
Some of the special powers of a bank. Of a city.

A portion of a man's farm is taken for a highway, and he is
paid damages; to whom does said land belong? The road intersects
the farm, and crossing the road is a brook containing
trout, which have been put there and cared for by the farmer;
may a boy sit on the public bridge and catch trout from that
brook? If the road should be abandoned or lifted, to whom
would the use of the land go?

CHAPTER XXXV.

Commercial Paper.

Kinds and Uses.--If a man wishes to buy some commodity
from another but has not the money to pay for
it, he may secure what he wants by giving his written
promise to pay at some future time. This written
promise, or note, the seller prefers to an oral promise
for several reasons, only two of which need be mentioned
here: first, because it is prima facie evidence of
the debt; and, second, because it may be more easily
transferred or handed over to some one else.

If J. M. Johnson, of Saint Paul, owes C. M. Jones,
of Chicago, a hundred dollars, and Nelson Blake, of
Chicago, owes J. M. Johnson a hundred dollars, it is
plain that the risk, expense, time and trouble of sending
the money to and from Chicago may be avoided,

* The United States: "Its charter, the constitution . . . Its flag the
symbol of its power; its seal, of its authority."--Dole.
(Note the substitution of an ellipsis for the asterisks in this footnote)

Footnotes/Endnotes

Footnotes are placed out-of-line; that is, the text of the footnote is left at the bottom of the page and a tag placed where it is referenced in the text.

The number, letter, or other character that marks a footnote location should be surrounded with square brackets ([ and ]) and placed immediately following the word being footnoted[1] or its punctuation mark,[2] as shown in the text and the two examples in this sentence.

When footnotes are marked with a series of special characters (*, †, ‡, §, etc.) we replace them all with [*] in the text, and * next to the footnote itself.

Proofread the footnote text as it is printed, preserving the line breaks. Leave the footnote text at the bottom of the page. Be sure to use the same tag in the footnote as you used in the text where the footnote was referenced.

Place each footnote on a separate line in order of appearance. Place a blank line between each footnote if there is more than one.

If a footnote or endnote is referenced in the text but does not appear on that page, keep the footnote/endnote number or marker and don't be concerned. This is common in scientific and technical books, where footnotes are often grouped at the end of chapters. See "Endnotes" below.

If a footnote/endnote continues past the end of the page, do nothing special. End the first page just as if it were ordinary body text, with an asterisk only if the last word is hyphenated. Begin the next page as if it were normal body text, with an asterisk only if needed.

For example, proofread:

The principal persons involved in this argument were Caesar¹, former military
leader and Imperator, and the orator Cicero². Both were of the aristocratic
(Patrician) class, and were quite wealthy.
¹ Gaius Julius Caesar.
² Marcus Tullius Cicero.

as:

The principal persons involved in this argument were Caesar[1], former military
leader and Imperator, and the orator Cicero[2]. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

1 Gaius Julius Caesar.
2 Marcus Tullius Cicero.


In some books, footnotes are separated from the main text by a horizontal line. We don't keep this so please just leave a single blank line between the main text and the footnotes.

Endnotes are just footnotes that have been located together at the end of a chapter or at the end of the book, instead of at the bottom of each page. These are proofread in the same manner as footnotes. Where you find an endnote reference in the text, retain the number or letter. If you are proofreading one of the chapter or book ending pages with the endnotes text on it, put a blank line after each endnote so that it is clear where each begins and ends.

Footnotes in Poetry should be treated the same as other footnotes.

Footnotes in Tables should remain where they are in the original text.

Original Footnoted Poetry:

Mary had a little lamb¹
Whose fleece was white as snow
And everywhere that Mary went
The lamb was sure to go!

¹This lamb was obviously of the Hampshire breed, well known for the pure whiteness of their wool.

Correctly Proofread Text:

Mary had a little lamb[1]
Whose fleece was white as snow
And everywhere that Mary went
The lamb was sure to go!

1 This lamb was obviously of the Hampshire breed, well known for the pure whiteness of their wool.

Sidenotes or paragraph Side-Descriptions

Some books will have short descriptions of the paragraph along the side of the text. These are called sidenotes. Proofread the sidenote text as it is printed, preserving the line breaks. Leave a blank line before and after the sidenote, so that it can be distinguished from the text around it. The OCR may place the sidenotes anywhere on the page, and may even intermingle the sidenote text with the rest of the text. Separate them so that the sidenote text is all together, but don't worry about the position of the sidenotes on the page. The formatters will move them to the correct locations.

Sample Image:

Sidenotes

Correctly Proofread Text:

Burning
discs
thrown into
the air.

that such as looked at the fire holding a bit of larkspur
before their face would be troubled by no malady of the
eyes throughout the year.[1] Further, it was customary at
Würzburg, in the sixteenth century, for the bishop's followers
to throw burning discs of wood into the air from a mountain
which overhangs the town. The discs were discharged by
means of flexible rods, and in their flight through the darkness
presented the appearance of fiery dragons.[2]

The Midsummer
fires in
Swabia.

In the valley of the Lech, which divides Upper Bavaria
from Swabia, the midsummer customs and beliefs are, or
used to be, very similar. Bonfires are kindled on the
mountains on Midsummer Day; and besides the bonfire
a tall beam, thickly wrapt in straw and surmounted by a
cross-piece, is burned in many places. Round this cross as
it burns the lads dance with loud shouts; and when the
flames have subsided, the young people leap over the fire in
pairs, a young man and a young woman together. If they
escape unsmirched, the man will not suffer from fever, and
the girl will not become a mother within the year. Further,
it is believed that the flax will grow that year as high as
they leap over the fire; and that if a charred billet be taken
from the fire and stuck in a flax-field it will promote the
growth of the flax.[3] Similarly in Swabia, lads and lasses,
hand in hand, leap over the midsummer bonfire, praying
that the hemp may grow three ells high, and they set fire
to wheels of straw and send them rolling down the hill.
Among the places where burning wheels were thus bowled
down hill at Midsummer were the Hohenstaufen mountains
in Wurtemberg and the Frauenberg near Gerhausen.[4]
At Deffingen, in Swabia, as the people sprang over the mid-*

Omens
drawn from
the leaps
over the
fires.

Burning
wheels
rolled
down hill.

1 Op. cit. iv. 1. p. 242. We have
seen (p. 163) that in the sixteenth
century these customs and beliefs were
common in Germany. It is also a
German superstition that a house which
contains a brand from the midsummer
bonfire will not be struck by lightning
(J. W. Wolf, Beiträge zur deutschen
Mythologie, i. p. 217, § 185).

2 J. Boemus, Mores, leges et ritus
omnium gentium (Lyons, 1541), p.226.

3 Karl Freiherr von Leoprechting,
Aus dem Lechrain (Munich, 1855),
pp. 181 sqq.; W. Mannhardt, Der
Baumkultus, p. 510.

4 A. Birlinger, Volksthümliches aus
Schwaben (Freiburg im Breisgau, 1861-1862),
ii. pp. 96 sqq., § 128, pp. 103
sq., § 129; id., Aus Schwaben (Wiesbaden,
1874), ii. 116-120; E. Meier,
Deutsche Sagen, Sitten und Gebräuche
aus Schwaben (Stuttgart, 1852), pp.
423 sqq.; W. Mannhardt, Der Baumkultus,
p. 510.

Blank Page

Most blank pages, or pages with an illustration but no text, may already be marked with [Blank Page]. Leave this marking as is. If the page is blank, and [Blank Page] does not appear, use the [Blank Page] button at the bottom of the Proofing Interface screen.

If there is text in the proofreading text area and a blank page image, or if there is an image with text but the text in the proofreading area does not match the image, follow the directions for Bad Images or Wrong Image for Text.

Typing in Missing Text

On occasion, all or part of a page may not OCR very well, especially if the image is poor or a script or other non-standard font has been used for the text. This sometimes occurs with title pages or the verso (the page after the title page) which usually has publishing, printing and/or copyright information in small, often italicized, text. In such a case, you can type in the missing information yourself.

It's not uncommon for the image to be good, but for the OCR scan to be missing the first line or two of the text. Please just type in the missing line(s). If nearly all of the lines are missing in the scan, then either type in the whole page (if you are willing to do that), or just click on the "Return Page to Round" button and the page will be reissued to someone else. If there are several pages like this, you might post a note in the Project Comments forum to notify the Project Manager.

If the title page of the book is substantially an illustration, please type in the text items, to form part of the text version at PP stage.

A Project Manager will generally advise if there are any pages that will require typing in.

Please DO NOT type in comments or inscriptions like "digitized by...", sometimes found on pages harvested from The Internet Archive, Google Books or similar sources.

Also see Type-in Projects.

Multiple Columns

Proofread ordinary text that has been printed in two columns as a single column.

Spans of multiple-column text within single column sections should be proofread as a single column by placing the text from the left-most column first, the text from the next one after it, and so on. You do not need to mark where the columns were split, just re-join them.

See also the Indexes and Tables sections of the Proofreading Guidelines.

Indexes

Please retain page numbers in index pages. You don't need to align the numbers as they appear in the scan; just make sure that the numbers and punctuation match the scan and retain the current line breaks. [Sep 12] You do not need to remove leaders, join broken rows of page numbers, nor add commas or extra line breaks between items.

Specific formatting of indexes will occur later in the process. The proofreader's job is to be sure that all the text and numbers are correct.

Illustrations

Proofread any caption text as it is printed, preserving the line breaks. If the captioned illustration falls in the middle of a paragraph, do not move it, but do use blank lines to set the caption apart from the rest of the text. Formatters will deal with the placement of illustrations when they format the captions. (Revised February 2022)

If the captioned illustration is located at the top of a page, do not add a blank line before it; but ensure that there is a blank line between the illustration and the rest of the text on the page, if any. If the captioned illustration is alone on a page, do not add a blank line before it. (Revised February 2022/Oct 2023, struck-out words to be removed; IonaV)

If there is no caption in the original text, then the mark-up of the illustration is left to the formatters.

Most pages with an illustration but no text will already be marked with [Blank Page]. Leave this marking as is.

In the case of illustrations that contain significant amounts of text, the contained text should be proofed as a separate paragraph immediately following the illo, for the benefit of those with different abilities. The PPer will make the ultimate decision whether to use this material. Use judgement--the letters identifying parts of a diagram do not need to be proofed. If there are a large number of illos in a project, the PM may wish to consider including extra, more specific instructions on the Project Comments page.

Sample Image: (Simple illustration)

Illustrations

Correctly Proofread Text:

Martha told him that he had always been her ideal and
that she worshipped him.

Frontispiece
Her Weight in Gold

Sample Image: (Illustration in middle of paragraph)

Middleimage.jpg

Correctly Proofread Text:

such study are due to Italians. Several of these instruments
have already been described in this journal, and on the present

Fig. 1.--APPARATUS FOR THE STUDY OF HORIZONTAL
SEISMIC MOVEMENTS.

occasion we shall make known a few others that will
serve to give an idea of the methods employed.

For the observation of the vertical and horizontal motions
of the ground, different apparatus are required. The

Tables

A proofreader's job is to be sure that all the information in a table is correctly proofed. Details of formatting will be handled later in the process. Provide enough space (minimum of 2 spaces) between entries on a line to clearly indicate where each item ends and begins. Retain line breaks within each entry.

Heading words in vertical text are to be proofed horizontally. When one or more of the heading items requires multiple rows, the words of the second (third, etc.) row start at the left margin; do not use extra spaces between the words.

Remove any trailing, extraneous punctuation in the columns--e.g., dots/dashes/underlines (i.e., leaders)--except those in a cell which may indicate that there is no data or information.

Tables are much easier to proof when you use a monospaced font such as DPSansMono (updated version), DPCustomMono2 (earlier version) or Courier.

Footnotes in tables should remain where they are in the original.

It is important to read the Project Comments as the PM may provide specific instructions for proofing tables. (Revised October 2022)
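The spacing and leader rules above can be sketched in code. This is only an illustration, not a DP tool: the function names `clean_cell` and `format_row` are hypothetical, and the sketch assumes the entries of one table row have already been separated by hand. It does not deal with line breaks inside an entry.

```python
import re

# Leader characters named in the guideline: dots, dashes, underlines.
LEADER_CHARS = "._-"

# Trailing leaders: either a spaced run of two or more leader characters,
# or two or more leader characters glued to the end of the entry.
TRAILING_LEADERS = re.compile(r"(?:\s+(?:[.\-_]\s*){2,}|[.\-_]{2,}[.\-_\s]*)$")


def clean_cell(cell: str) -> str:
    """Strip trailing leaders from one entry (hypothetical helper)."""
    text = cell.strip()
    # An entry made only of leaders (e.g. "...") may mean "no data": keep it.
    if text and all(ch in LEADER_CHARS or ch.isspace() for ch in text):
        return text
    return TRAILING_LEADERS.sub("", text)


def format_row(cells: list[str]) -> str:
    """Join the entries of one row with the minimum of two spaces."""
    return "  ".join(clean_cell(cell) for cell in cells)


print(format_row(["Pembroke . . . .", "36,000", "41", "..."]))
```

A single closing period, as in "Copper.", is left alone because only runs of two or more leader characters are treated as leaders. Always check the result against the page image; the PM's instructions in the Project Comments take precedence.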

Sample Image 1:

Table1.jpg

Correctly Proofread Text:

Deg. C.  Millimeters of Mercury.  Gasolene.
Pure Benzene.

-10°  13·4  43·5
0°  26·6  81·0
+10°  46·6  132·0
20°  76·3  203·0
40°  182·0  301·8

Note that the decimal points in this table are actually represented by "center dots" (·), which are available from the drop-down menu on the Proofing Interface. The PPer may choose to replace them with ordinary decimal points, but you should match the scan.

Sample Image 2:

Table2.jpg

Correctly Proofread Text:

TABLE II.

Flat strips compared   Copper.   Iron.   Parallel wires 30 cm. in   Copper.   Iron.
with round wire 30 cm.   length.
in length.

Wire 1 mm. diameter  20  100  Wire 1 mm. diameter  20  100

STRIPS.  SINGLE WIRE.

0·25 mm. thick, 2 mm.
wide   15   35   0·25 mm. diameter   16   48
Same, 5 mm. wide   13   20   Two similar wires   12   30
"   10   "   "   11   15   Four   "   "   9   18
"   20   "   "   10   14   Eight   "   "   8   10
"   40   "   "   9   13   Sixteen   "   "   7   6
Same strip rolled up in   Same, 16 wires bound
the form of a wire   17   15   close together   18   12


Sample Image 3:

Ex5 vertical heading.jpg

Correctly Proofread Text:

Catholics  Secular Priests  Regulars  College and  Churches and  Hospitals and  Parishes and  Academies  University  Juniorates and  Religious  Convents
Seminary  Chapels  Homes  Missions  Scholasticates  Communities

Ottawa  168,300  137  162  2  135  12  136  9  1  11  26  13
Pembroke  36,000  41  ...  ...  68  2  68  ...  ...  ...  ...  4
Timiskaming  22,584  18  13  ...  38  3  38  ...  ...  ...  ...  6

226,884  196  175  2  241  17  242  9  1  11  26  23


Sample Image 4:

Tableex3.png

Correctly Proofread Text:

                         Agents.  Objects.
         { 1st person,   I,       me,
         { 2d   "       thou,    thee,
Singular { 3d    "  mas. { he,      him,
         {       "  fem. { she,     her,
         {             it,      it.

         { 1st person,   we,      us,
Plural   { 2d    "       ye, or you,    you,
         { 3d    "       they,    them,
                         who,     whom.

Poetry/Epigrams

Insert a blank line at the start of the poetry or epigram and another blank line at the end, so that the formatters can clearly see the beginning and end, but do not insert a blank line at the bottom of a page.

If you are not sure whether a stanza starts at the beginning of a page, please leave a note such as: [**unsure if stanza starts here].

Leave each line left-justified and maintain the line breaks. Do not try to center or indent the poetry. The formatters will do that part.

Do insert a blank line between stanzas.

Footnotes in poetry should be treated the same as regular footnotes during proofreading.

Line Numbers in poetry should be kept. Separate them from the main text with six spaces [Dec 2012]. See instructions on Line Numbers.

Check the Project Comments for the specific text you are proofreading for any special proofing instructions.
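As an illustration of the six-space rule for line numbers, here is a minimal sketch. The function name `space_line_number` is hypothetical, and the sketch assumes that a number at the very end of a verse line is a line number. A line of verse that genuinely ends in a numeral would be misread, so the result must still be compared with the page image.

```python
import re

SIX_SPACES = " " * 6


def space_line_number(line: str) -> str:
    """Put exactly six spaces between a verse line and its trailing number."""
    match = re.match(r"^(.*\S)\s+(\d+)$", line.rstrip())
    if match is None:
        return line  # no trailing number found: leave the line untouched
    return f"{match.group(1)}{SIX_SPACES}{match.group(2)}"


print(space_line_number("Round the elm-tree bole are in tiny leaf,  5"))
```

Lines without a trailing number are returned unchanged, so the function can be applied to every line of a stanza.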

Sample Image:

Poetry1.jpg

Correctly Proofread Text:

to the scenery of his own country:

Oh, to be in England
Now that April's there,
And whoever wakes in England
Sees, some morning, unaware,
That the lowest boughs and the brushwood sheaf
Round the elm-tree bole are in tiny leaf,
While the chaffinch sings on the orchard bough
In England--now!

And after April, when May follows,
And the whitethroat builds, and all the swallows!
Hark! where my blossomed pear-tree in the hedge
Leans to the field and scatters on the clover
Blossoms and dewdrops--at the bent spray's edge--
That's the wise thrush; he sings each song twice over,
Lest you should think he never could recapture
The first fine careless rapture!
And though the fields look rough with hoary dew,
All will be gay, when noontide wakes anew
The buttercups, the little children's dower;
--Far brighter than this gaudy melon-flower!

So it runs; but it is only a momentary memory;
and he knew, when he had done it, and to his

Plays: Actor Names/Stage Directions

In dialogue, treat a change in speaker as a new paragraph, with one blank line between speakers.

Stage directions are reproduced as they are in the original text. If the stage direction is on a line by itself, proofread it that way; if it is at the end of a line of dialogue, leave it there. When closing up a hyphenated word leaves a formerly right-justified stage direction on its own line, leave it there--do not move it up to the end of the previous line; do not put a blank line before it.

Stage directions often begin with an opening bracket and omit the closing bracket. This convention is retained; do not close the brackets.

Sometimes, especially in metrical plays, a word is split due to page-size constraints and the remainder is placed above or below the line, following a large opening parenthesis "(", rather than being given a line of its own. Please treat this as a normal end-of-line reattachment.

Sample Image:

Drama1.jpg

Correctly Proofread Text:

Am. Sure you are fasting;
Or not slept well to night; some dream (Ismena?)

Ism. My dreams are like my thoughts, honest and innocent,
Yours are unhappy; who are these that coast us?
You told me the walk was private.

Sample Image:

Drama2.jpg

Correctly Proofread Text:

Has not his name for nought, he will be trode upon:
What says my Printer now?

Clow. Here's your last Proof, Sir.
You shall have perfect Books now in a twinkling.

Lap. These marks are ugly.

Clow. He says, Sir, they're proper:
Blows should have marks, or else they are nothing worth.

La. But why a Peel-crow here?

Clow. I told 'em so Sir:
A scare-crow had been better.

Lap. How slave? look you, Sir,
Did not I say, this Whirrit, and this Bob,
Should be both Pica Roman.

Clow. So said I, Sir, both Picked Romans,
And he has made 'em Welch Bills,
Indeed I know not what to make on 'em.

Lap. Hay-day; a Souse, Italica?

Clow. Yes, that may hold, Sir,
Souse is a bona roba, so is Flops too.

Anything else that needs special handling or that you're unsure of:

While proofreading, if you encounter something that isn't covered in these guidelines and that you think needs special handling or that you are not sure how to handle, post your question, noting the png (page) number, in the Project Discussion thread (a link to the project-specific forum is in the information "table" above the Project Comments); also put a note in the proofread text (at the point where you have your question or concern) explaining the problem. Your note will explain to the next proofreader, formatter or Post-Processor what the problem or question is.

Start your note with a square bracket and two asterisks [** and end it with another square bracket ]. This clearly separates it from the Author's text and signals the Post-Processor to stop and carefully examine this part of the text and the matching image to address any issues. Agreement or disagreement with another proofreader's comment can be added, but even if you know the answer, you absolutely must not remove the comment of a previous proofreader. If you have found a source which clarifies the problem, please cite it so the Post-Processor can also refer to it.

If you are proofreading in a later round and come across a note from a proofreader in a previous round that you know the answer to, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.

Previous Proofreaders' Notes/Comments

Proofers' notes must be enclosed within square brackets with 2 asterisks to draw the PPer's attention, as in [**content of note]. Proofers should not change typos in the text file, but make a note, like [**typo: text]. Short notes should be left immediately next to the material referenced. While ideally notes are placed next to the referenced text, long notes that could disguise the original format of the text can be moved to a separate line, as long as it's clear what they reference.

Any notes or comments put in by a previous volunteer must be left in place. You may add agreement or disagreement to the existing note but even if you know the answer, you absolutely must not remove or move the comment. If you have found a source which clarifies the problem, please cite it so the Post-Processor can also refer to it.

If you are proofing in a later round and come across a note from a volunteer in a previous round that you know the answer to, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.
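The note format lends itself to a simple check. The sketch below is an illustration only, not part of the proofing interface: the names `find_notes` and `notes_preserved` are hypothetical, and it assumes that a note contains no closing square bracket of its own. It lists the notes on a page and checks that none from an earlier round has been removed, while still allowing agreement or disagreement to be added inside a note.

```python
import re

# A note starts with "[**" and runs to the next "]".
NOTE_PATTERN = re.compile(r"\[\*\*([^\]]*)\]")


def find_notes(text: str) -> list[str]:
    """Return the content of every [**...] note in the text."""
    return [m.group(1).strip() for m in NOTE_PATTERN.finditer(text)]


def notes_preserved(before: str, after: str) -> bool:
    """True if each earlier note still appears inside some note afterwards."""
    later_notes = find_notes(after)
    return all(
        any(old in new for new in later_notes)
        for old in find_notes(before)
    )


page = 'honour [** spelled "honor" above] and colour'
print(find_notes(page))
```

A later-round text in which a note was extended, for example from [**typo: text] to [**typo: text; agreed], passes the check; a text in which the note was deleted does not.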

Common Problems

OCR Problems: 1-l-I

OCR commonly has trouble distinguishing between the digit '1' (one), the lowercase letter 'l' (ell), and the uppercase letter 'I'. This is especially true for books where the pages may be in poor condition. Old style fonts may also use what appears to be a small upper case "I" rather than a "1" and it often OCRs as a lower case "i".

Watch out for these. Read the context of the sentence to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading. Noticing these is much easier if you use a mono-spaced font such as DPCustomMono2 or Courier.

OCR Problems: 0-O

OCR commonly has trouble distinguishing between the digit '0' (zero), and the uppercase letter 'O'. This is especially true for books where the pages may be in poor condition.

Watch out for these. Normally the context of the sentence is sufficient to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.

Noticing these is much easier if you use a mono-spaced font such as DPCustomMono2 or Courier.
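One rough way to find likely 1-l-I and 0-O confusions is to list every token that mixes digits with the look-alike letters. The sketch below is a hedged illustration under that assumption; the function name `suspicious_tokens` is hypothetical. It will miss errors inside ordinary words and numbers, so it supplements careful reading and does not replace it.

```python
import re

LOOKALIKE_LETTERS = "lIO"  # letters often confused with the digits 1 and 0


def suspicious_tokens(text: str) -> list[str]:
    """Return tokens that contain both a digit and a look-alike letter."""
    flagged = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        has_digit = any(ch.isdigit() for ch in token)
        has_lookalike = any(ch in LOOKALIKE_LETTERS for ch in token)
        if has_digit and has_lookalike:
            flagged.append(token)
    return flagged


print(suspicious_tokens("In the l9th century, 2O men and 10 oxen"))
```

Each flagged token still has to be compared with the page image, since some books do print letters and digits together.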

OCR Problems: Scannos

Another common OCR issue is misrecognition of characters. We call these errors "scannos" (like "typos"). This misrecognition can create two kinds of errors in the text:

1. A word that appears to be correct at first glance, but is actually misspelled. This can usually be caught by running the spellcheck from the proofreading interface.
2. A word that is changed to a different but otherwise valid word that does not match what is in the page image. These are subtle because they can only be caught by someone actually reading the text.

Possibly the most common example of the second type is "and" being OCR'd as "arid." Other examples: "eve" for "eye", "Torn" for "Tom", "train" for "tram". This type is harder to spot and we have a special term for it: "Stealth Scannos." DP-INT has a very valuable collection of examples in [Stealth Scannos].

Spotting scannos is much easier if you use a mono-spaced font such as DPCustomMono2 or Courier.
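Because stealth scannos are valid words, a spellcheck will not catch them, but a short watch list can point a proofer to the places worth a second look. The sketch below uses only the four examples given above; the dictionary `STEALTH_SCANNOS` and the function `flag_stealth_scannos` are hypothetical names, and the match is case-sensitive. A flagged word is only a candidate, since "arid" or "train" may well be what the author wrote.

```python
import re

# Word as it may appear in the OCR text -> word it may really be on the page.
STEALTH_SCANNOS = {"arid": "and", "eve": "eye", "Torn": "Tom", "train": "tram"}


def flag_stealth_scannos(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, word found, possible intended word) for each hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            if word in STEALTH_SCANNOS:
                hits.append((line_no, word, STEALTH_SCANNOS[word]))
    return hits


sample = "Torn ran\nto the train arid waved"
for line_no, found, maybe in flag_stealth_scannos(sample):
    print(f"line {line_no}: '{found}' (check the image; could be '{maybe}')")
```

Never change a flagged word without checking the page image: the Primary Rule still applies.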

Handwritten Notes in Book

Do not include handwritten notes in a book (unless they overwrite faded printed text to make it more visible). Do not include handwritten marginal notes made by readers, etc.

Bad Images

If an image is bad (not loading, chopped off, unable to be read), please put a post about this bad image in the Project Thread. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.

Note that some page images are quite large, and it is common for your browser to have difficulty displaying them, especially if you have several windows open or are using an older computer. Before reporting this as a bad page, try clicking on the "Image" line on the bottom of the page to bring up just the image in a new window. If that brings up a good image, then the problem is probably in your browser or system.

It's fairly common for the image to be good, but the OCR scan is missing the first line or two of the text. Please just type in the missing line(s). If nearly all of the lines are missing in the scan, then either type in the whole page (if you are willing to do that), or just click on the "Return Page to Round" button and the page will be reissued to someone else. If there are several pages like this, you might post a note in the Project Thread to notify the Project Manager.

Wrong Image for Text

If there is a wrong image for the text given, please put a post about this bad image in the Project Thread. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.

Previous Proofreader Mistakes

If the previous proofreader made a lot of mistakes or missed a lot of things, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation so that they will know how to deal with the issue in the future.

Please be nice! Everyone here is a volunteer and presumably trying their best. The point of your feedback message should be to inform them of the correct way to proofread, rather than to criticize them. Give a specific example from their work showing what they did, and what they should have done.

If the previous proofreader did an outstanding job, you can also send them a message about that—especially if they were working on a particularly difficult page.

Printer Errors/Misspellings

Correct all of the words that the OCR has misread (scannos), but do not correct what may appear to you to be misspellings or printer errors that occur on the scanned image. Many of the older texts have words spelled differently from modern usage and we retain these older spellings, including any accented characters.

If you are unsure, place a note in the text [**typo for text?] and ask in the Project Thread. Include the two asterisks ** so the Post-Processor will notice it. [May 2012: removed comments related to correcting text as DPC does not want proofers to make corrections to printer errors or typos.]

This same comment applies to discrepancies in spelling, which are very common in old books. If you should note differences in the spellings of words on the same or different pages, place a [** comment] next to the word(s) to explain. For example: honour [** spelled "honor" above] or chateau [** spelled "château" on previous page]. Each spelling may be perfectly correct; the printer or typesetter may simply have been inconsistent. The Post-Processor will determine which spelling to use or whether a Transcriber's errata note will be included in the posted version.

Factual Errors in Texts

In general, don't correct factual errors in the author's book. Many of the books we are proofreading have statements of fact in them that we no longer accept as accurate. Leave them as the author wrote them. This also includes "politically incorrect" terms or other words or phrases we no longer use or do not consider acceptable to use.

A possible exception is in technical or scientific books, where a known formula or equation may be given incorrectly, especially if it is shown correctly on other pages of the book. Notify the Project Manager about these, either by sending them a message via the Forum, or by inserting [**note sic explain-your-concern] at that point in the text.

Uncertain Items

If you should come across a situation where none of the above instructions or suggestions clarifies what you are to do and/or you have not received a satisfactory answer to your question(s) in either the Project Thread or one of the other forum threads, the best thing to do is to ask a direct question on the subject in the Project Forum. Make sure you fully explain the problem, concern and/or question (as much as you are able to) and then go on to the next page. Do not worry overly long about something that is trivial or minor; perhaps a more experienced proofer in the next round (who has not seen the Project Thread) will know the answer.

If you are so concerned that you'd like direct feedback, mention the specific page where the problem occurred and ask the next proofer of that page to send you a private message about how you could/should handle a given situation in the future.

You can always send a message to the project manager asking how you could/should handle a given situation.

See also Formatting Guidelines.