March/April 1998
How Accurate Are Your Archives?
Newspapers by Bruce William Oakley
Oakley is editor of Arkansas Online, the Internet edition of the Arkansas Democrat-Gazette in Little Rock.

John Brummett knew he'd made a mistake, and now it was staring him in the face. Brummett, a political columnist at the Arkansas Democrat-Gazette in Little Rock, couldn't let this error get published. He had written that a public figure served time, when in fact the conviction was overturned and the man had never been behind bars during his appeal. Fortunately, Brummett realized his error early, grabbed the page proof, and fixed the incorrect passage. He gave the proof to the page designer, the corrections were made and all was well.

Or so he thought. Several days later, the public figure's attorney wrote a letter that quoted the mistaken passage and demanded corrective action. Brummett was stunned to learn that his mistake, which never appeared in newsprint, had made it into the paper's archive in the Lexis-Nexis database. Eventually, the issue was settled amicably.

But how had the problem occurred? The Democrat-Gazette had a computer problem. In this case, page-proof corrections had been made in the design stage but did not go back to the editorial version captured for the archive. The paper in effect transmitted a draft to Nexis, and inaccurate material that was never printed became readily available in cyberspace.

Is this the rarest of disasters or dangerously common? I studied the question for four months at the University of North Carolina, under a grant from the John S. and James L. Knight Foundation, and found that misunderstood technology, misguided assumptions, poor planning, and plain inattention all play roles in dirtying newspapers' electronic archives. And the situation is worse than you probably think.

For four newspapers, I compared articles in commercial electronic archives, such as Lexis-Nexis or DataTimes, with the printed versions from their national and local fronts on arbitrarily chosen dates.
Not one archived version flawlessly matched newsprint. The errors ranged from incorrect punctuation to incorrect headlines and bylines. There were also more substantial errors. A search of the Democrat-Gazette's electronic archives for corrections showed that not one of five corrections published in mid-March 1997 was in the electronic archive in mid-April, either standing alone or attached to the inaccurate article.

One librarian, at the Nashville Tennessean, sighed and said that microfilm or microfiche of the real, printed newspaper is the archive of record; the digital version is secondary. But rare is the reporter -- or law firm, student, or ordinary citizen -- who will turn to microfilm if a digital record exists. This librarian lamented that there is no way for her small staff to take about 100 articles a day and "edit them line by line and get it right."

My study uncovered problems at every step -- from the first capture of information to the last connection between a commercial database and a searcher. Some problems involve elementary typography, but they can create dangerous misimpressions. In the Tennessean's archive, for example, parentheses around explanatory information in quotations were missing, so the archive attributed to speakers words they had never spoken. The parentheses presumably were lost in sloppy computer translation. In some cases, errors were introduced in archives when information had to be retyped, rather than copied and pasted from existing files. Even at The News & Observer in Raleigh, North Carolina -- which has a large library staff and strong quality control -- minor errors slip through.

Archival quality control is a never-ending task that extends beyond the newsroom. Jackie Chamberlain, library director at The Press-Enterprise in Riverside, California, began wrestling in mid-1996 with a software problem resulting in truncation of articles after they were transmitted to the Lexis-Nexis commercial archive.
In mid-1997, the problem -- a coding glitch -- was finally resolved. Chamberlain kept newsroom librarians informed as she worked through the problem so that papers with similar software could benefit from her findings.

At the very least newspapers should check their assumptions: Do you assume the archival capture comes after final page proof corrections? Better check. Assume headlines, captions, and corrections are electronically cut and pasted where they belong rather than retyped before archiving? Better check. Assume corrected versions sent to a commercial database supplant incorrect originals? Better check. Assume that an article retrieved from a commercial database matches newsprint? Better double check.