In the early years of computing, machines were so expensive that nobody had more than one of them. Douglas Hartree speculated in 1951 that the United Kingdom might need a total of four computers [Corn]. We now have a world-wide digital library, the Web. Computers are cheap, and libraries have many of them. How should a library be organized today?
The digital library contains both many different media and many different subjects. In a traditional library, it is common for the top level organization to be by form: photographs, sound recordings, maps, and so on are stored separately, rather than placing each photograph of an individual next to that person's biography in the bookshelves. The materials in a digital library, however, are all in equivalent formats. A library may use various digital media: disk, tape, cartridges, and so on; but these are interchangeable for storage of bibliographic citations, text, images, or sounds. Thus, the conventional special collections in a library might well be unified with the book collections in similar subject areas.
Digital libraries focus less on collections than on access. Since material is likely to be available with almost equal speed from any online site, librarians will be using their expertise much more to find material outside their physical location, as well as knowing what is directly stored in their building or under their control. The organization of a digital library will reflect what users want rather than what the librarian has been able to afford to buy. A digital library will try to provide support for searching and acquiring information, and get more involved in the overall user information needs. Whether libraries will charge for any of the information fetching services is not yet known and will depend on the individual library and individual information requests.
Many undergraduates today rely heavily on the World Wide Web as their
information source. With about two terabytes of text on line, or the equivalent
of perhaps two million books, this is a full-text indexed free library.
It is increasing in size by a factor of ten every year, and is likely
to catch the Library of Congress next year.
As a collection, it is strong in areas like disk drive prices, current gossip, new technical reports, and corporate ads; while it is weak in traditional scholarship (and anything before 1995). But it is open 24 hours a day, convenient, and the exclusive source of information for some people; to quote one Cornell undergraduate asked to look something up in the traditional way, ``I don't do libraries.'' Given the Web, we no longer can ask whether we should have a digital library; we're just arguing about its content. What libraries may find themselves doing is giving more help and advice to those who are less competent to evaluate the content and utility of what they have found so easily.
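The size figures quoted above can be checked with a little arithmetic. The following is only a back-of-envelope sketch: the constants are the rough estimates given in the text, the function names are invented for illustration, and the tenfold growth rate is simply assumed to hold for one more year.

```python
# Back-of-envelope arithmetic using only the rough figures quoted in the text.
WEB_TEXT_BYTES = 2 * 10**12    # "about two terabytes of text"
BOOK_EQUIVALENTS = 2 * 10**6   # "perhaps two million books"
ANNUAL_GROWTH_FACTOR = 10      # "a factor of ten every year"


def bytes_per_book(total_bytes, books):
    """Average size of one book implied by the two estimates."""
    return total_bytes / books


def projected_size(current_bytes, years, factor=ANNUAL_GROWTH_FACTOR):
    """Size after `years` of constant multiplicative growth (an assumption)."""
    return current_bytes * factor ** years


per_book = bytes_per_book(WEB_TEXT_BYTES, BOOK_EQUIVALENTS)
next_year = projected_size(WEB_TEXT_BYTES, 1)

print(f"Implied size of one book: {per_book / 10**6:.1f} MB")
print(f"Projected Web text after one year: {next_year / 10**12:.0f} TB")
```

Under these assumptions a book comes to about one megabyte of text, and a Web that catches the Library of Congress "next year" implies a Library of Congress of roughly ten times the current Web, on the order of twenty terabytes. That last figure is an inference from the growth claim, not a measurement.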
Three major new subjects are imposing themselves on libraries: they are technology, law and economics. Each will have effects on digital libraries and how they are organized.
The new technology is being used for a variety of functions, which range from finding aids to video delivery. From the familiar OPAC to the online journal delivery system, people are accustomed to using workstations to obtain information. What does the library have to do?
The library must maintain descriptions of the information it holds, in whatever format. This is typically stored in the form of MARC records and delivered via a commercial OPAC. Sometimes integrating these systems with actual information delivery is hard; they are designed for, and specialize in, the description of printed books. It is more likely that the catalog system will be linked to a book circulation system than that it will be linked to an online delivery system. The input to the description system is more and more likely to come from central sources such as national cataloging authorities; cataloging, as a skill, is exercised less frequently in libraries now that records are routinely retrieved from central cataloging files.
Online delivery is often done off the premises of the library. Reference libraries need seating for their users; a typical guideline in the United States is that the number of seats in the library should be at least 1/4 the number of students on the campus. Nowadays it is expected that most students have computers in their dormitory rooms and they are connected to the campus network. The number of people physically present in the library is no longer a good measure of how many people are using the library's services. Not only may electronics help with the need for ever more bookstack space, it might alleviate the need for chairs and tables as well.
Whether the demand for seats actually goes down depends on whether access to primary material on the desktop starts to replace reading in the library. Until now most people have done bibliographic work on their terminals but then read on paper. Don King suggests that searching takes perhaps 1/5 of the time engineers spend with information, with 4/5 of the time spent reading. Thus, the improved bibliographic search capabilities of computer catalogs have often increased the time users spend in libraries, since they find more things to read. However, we now see people reading more and more material on their desktop (or at least printing it outside the library). As this spreads, the need for libraries to provide places to read will decline.
Thus, what does the library need technologically? It needs above all networking capacity. A future organization calling itself a library might have no physical site with shelves and reading rooms. Instead, a physical complement of routers, and a personnel complement of people knowing where to find things, could make a virtual library effective. The online bookstore Amazon.com does this for retail sales; there is no shop, no stock, and only remote access.
The machines involved can be located anywhere; there is little reason to use an expensive central-campus building with impressive architecture to hold machines only a few staff ever see. Nor do reference personnel working mostly over the telephone need to be anywhere central; some computer companies provide their telephone support for all of Europe from a single site in Ireland. So the digital library needs less attention paid to its physical space (admittedly today this function is often provided elsewhere in an academic organization), and considerably more to its technology -- disk drives, computers, routers, tape storage units, and the like.
In the past, a major problem with digital library implementation was the need to deal with the variety of computer hardware possessed by different users. Today, the Web has defined the standard interface which everyone must support, and we no longer have arguments about whether or not we can expect users to have graphic capability. Nevertheless, many OPACs are still designed for extremely dumb terminals; these will be phased out in favor of more attractive Web-based systems.
There are still incompatibility questions, of course. Some models of the digital library involve image scanning of printed material, particularly for retrospective conversion; others involve ASCII-based text. In principle, much of the image-scanned material can be converted through OCR, and such projects as JSTOR do this; but there are still other materials which are provided as images and which may be hard to search. With time, though, the advent of standards will decrease the various conversion problems that plague us today.
If a digital library needs to obtain much of its material from other sites, it needs to know what it can do with such material. Various rules such as ``fair use'' have meant that for traditional printed books most use in research libraries is permitted without additional charge beyond the original purchase. This situation is likely to get much more confusing, partly because digital information may not have the permissions associated with paper, partly because multi-media information already has different rules from printed paper, and partly because there are many revisions in the law under consideration.
The purchase of digital information often involves a license rather than a traditional purchase, even though it looks like a purchase to the library. In a recent case (ProCD v. Zeidenberg) the U. S. courts have upheld shrink-wrap licenses, even when the material protected by the license would not normally have been protectable by copyright. Licenses may limit the people allowed to access the material, and they may require accounting of how many accesses are made, or even payments proportional to either the number of users or the time required.
Libraries need to worry about some of the requirements vendors may try to place upon the use of their material. In particular, U. S. libraries protect the privacy of the records of which patrons use which information resources. Information vendors are often very anxious to find out who their users are, and libraries may have to be careful about contracts which might ask for delivery of user data. The most difficult issue to date has been the definition of the library's user community. Many libraries serve an extended community which may, for an academic library, include alumni and even members of the public (often state universities or land-grant universities have obligations to serve the general public in some area). Publishers are particularly anxious that remote use be limited to some reasonable extension of students, staff and faculty.
What this means to the organization of a library is that considerable effort must be spent tracking the library's activities with respect to the contracts signed with information vendors. The library will be trying to persuade publishers to agree to unlimited use site licenses, and the publisher may prefer some kind of pay-by-use (although many publishers also prefer flat-rate licenses to avoid the administrative costs and to get their money ahead of use rather than afterwards). The library may prefer to join a consortium for purchasing digital material as a way of decreasing administrative costs and increasing bargaining leverage.
The problems with non-print material are already difficult. There is no ``fair use'' on recorded music, for example, so that libraries should not in general be making copies of published sound recordings without obtaining permission. Video material is often involved in a great many complex copyright difficulties, and little of it is actually old enough to be in the public domain. Perhaps the best known example is the movie ``It's a Wonderful Life'' whose copyright holder neglected to renew it after the first 28-year term. When television stations, realizing that this movie could be played free, broadcast it so frequently that it turned into a cult favorite, the studio investigated and found that although the movie copyright had not been renewed, the publisher of the story on which the movie was based had kept the story in copyright, and so it was possible to recapture control. Each new technology has created new legal problems, since the courts have typically held that contracts to publish in one format do not transfer the right to future formats. The result has been a rash of language in copyright forms about every format now known, imagined, or to be invented, some of which may be valid.
Proposed changes to the United States law raise further difficulties. Among the important issues to libraries are digital fair use, registration, digital preservation, moral rights, data base protection, and liability for plagiarism, libel, and incorrect information.
Digital fair use. The White Paper proposed last year for copyright revisions would have declared there to be no fair use of digital materials (and also clarified that digital transmissions were indeed an infringement of copyright). Potentially, this requires negotiations with copyright holders for virtually all digital use, imposing a heavy burden on libraries and their users for what may now be thought of as commercially insignificant uses. The proposal was not adopted (partly because Congress simply ran out of time), but its proponents have not gone away.
Copyright notice and registration. Under the Berne convention, already in force in the United States, published works no longer need to be registered with the Copyright Office nor need they carry a statement identifying the copyright owner and date. Any material is protected from the date of creation. This means that in the future a librarian may be faced with a printed document which contains no author name or date, and yet need to know whether the author has been dead for fifty (or perhaps 70) years. Fortunately this problem will not arise until about 2038; few works going out of copyright before then escaped the earlier notice requirements, with an exception for certain European materials.
Digital preservation. The current copyright law allows libraries to copy deteriorating materials for preservation purposes, but only to make facsimile copies. It is proposed that the next change in the law permit digital copying for preservation, but this is of course uncertain. There are actually two separate issues, both now requiring permission: one is the scanning of deteriorating paper material to convert it to digital form, and the other is the conversion of one digital form to another as technologies become obsolete and we need to convert, let us say, CD-Rs to DVD.
Moral rights. The United States has started to adopt the principles of ``moral rights'' in which even after selling the copyright, the creator of a work retains certain rights, particularly an entitlement to be identified as the author and to prohibit some kinds of changes. This is already true for visual works in the US. The law will have to develop before it is clear whether this constrains libraries. As an example of what the adoption of extended European concepts might mean to U. S. libraries, consider the effect on an architecture school library of the French rule that you cannot copy a picture of a building without the permission of the architect.
Data base protection. The WIPO treaties proposed last year, but not yet ratified, would have provided that factual information in databases would again be subject to copyright, restoring the situation somewhat to the rules that prevailed before 1991. The treaty did provide for exemptions for excerpts from data bases, but nevertheless it was attacked by many who felt that academic research would be severely constrained. With many revisions likely before anything like these treaties become law, it is not clear what the impact on libraries will be.
Liability. Traditionally, libraries have not been responsible for the content of the books they give their patrons. In a digital world it is not clear whether this will change. There are at least three issues: plagiarism, libel and tort liability. The plagiarism risk is that a change in the law may make people other than the original publisher responsible for copyright violations. Publishers are asking for this as a way of dealing with the amount of material that may be placed in digital form by individuals with minimal financial resources, who are not worth suing; they hope to place a responsibility on intermediaries to at least block plagiarized material, even if not to pay damages. The libel risk is the possibility that distributors of information will be sued as well as originators. In the United Kingdom, John Major recently recovered damages from the newsstand operators W. H. Smith and John Menzies for a libel published in a magazine with few financial resources. So far, this is not a problem in the United States. Tort liability is the risk that someone who gets bad information from a publication (e.g. bad investment advice or health advice) will try to sue the library as well as the author or publisher. An example is the lawsuit brought against the publisher of an aeronautical chart with a claim that the chart was badly designed and contributed to an accident. Again, this is not a problem for libraries today. However, as libraries become more ambitious in their provision of services, they may start to look more like publishers (recommending what to read, putting together packages for students in classes, etc). As they do this they will run the risk of increased legal responsibility for what is chosen.
The message from all of this is that the digital library involves a larger legal effort than does a more traditional library, and the organization of the library will have to reflect that.
A digital library, able to obtain material from remote sources, will be regularly purchasing things on-demand rather than entirely by payment in advance (what is called ``just in time'' rather than ``just in case'' obtaining of information). This will certainly mean a great deal of transactional purchasing. Whether the library also needs to do transactional selling, i.e. chargebacks to its users, is not clear. Many libraries believe strongly that charging users is a bad idea, whether for reasons of tradition, or because they fear discouraging the use of information, or because they feel that centralized provision is a more efficient practice.
However, as libraries become involved in more and more specifically priced services, some of which are very expensive, it will be tempting to do chargebacks. This is certainly standard economic wisdom. Experience is, however, that ordinary users place a value on predictability; they shy away from services with per-minute pricing where they cannot know what they will wind up spending. Telephone companies have found in the past that people often choose flat rate service rather than measured service even if measured service would be cheaper; they fear that some month they will make an unusual number of calls and wish that they had flat rate pricing. Libraries are large organizations, and may be a good place to provide for averaging of these costs.
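The telephone example can be made concrete with a small sketch. Everything numeric here is hypothetical (the monthly usage, the per-minute rate, and the flat fee are invented for illustration and come from no real tariff); the point is only that measured service can be cheaper on average while still producing the occasional alarming bill.

```python
from statistics import mean

# Hypothetical figures, chosen only to illustrate the argument.
# Amounts are in cents so the arithmetic stays exact.
MONTHLY_MINUTES = [100, 120, 90, 110, 400, 100]  # one unusually busy month
RATE_CENTS_PER_MINUTE = 10
FLAT_FEE_CENTS = 2000


def measured_bills(minutes_per_month, rate_cents):
    """Bill for each month under per-minute (measured) pricing."""
    return [minutes * rate_cents for minutes in minutes_per_month]


bills = measured_bills(MONTHLY_MINUTES, RATE_CENTS_PER_MINUTE)
average_measured = mean(bills)
worst_measured = max(bills)

print(f"Average measured bill: {average_measured / 100:.2f}")
print(f"Worst measured bill:   {worst_measured / 100:.2f}")
print(f"Flat-rate bill:        {FLAT_FEE_CENTS / 100:.2f}")
```

With these invented numbers the measured plan costs less over the six months, but the one busy month costs twice the flat fee. An individual may reasonably pay extra to avoid that month; a library, averaging over thousands of users, is much less exposed to any single user's busy month.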
In fact, the new economics of publishers offers an opportunity for libraries or consortia of libraries. The typical publisher is small, selling 1.2 journals. Our familiarity with (and the high public profile of) Reed Elsevier and Wiley causes us to overlook the large number of small societies that publish one magazine. These societies are as frustrated as libraries by their inability to understand how they can thrive in the electronic environment of the future, and it may make sense for libraries to act as the agent for managing their distribution. The Stanford University Library, for example, has set up HighWire Press as an electronic printer for several journals, most notably the Journal of Biological Chemistry.
Here are some numbers from the American Economic Association, which publishes three journals:
The future economics of libraries are very unclear. If libraries are going to obtain information on demand from other sites, and deliver it to people on their office desktops, they begin to sound as if they will compete with bookstores. If they are going to provide detailed advice on what kinds of things to read, they become more like teachers. And if they provide descriptions of information, they become more like publishers. Libraries have always had these different roles, but they had in the past been entirely subordinate to the basic job of choosing and accumulating printed matter. When the basic primary information may be available online, these other functions will become more important. This will affect how libraries are funded and how they budget, and may force them to consider charging for some of the new services and reorganizing to emphasize support over storage and selection.
In all of the discussion about charging and copyright, do not forget that much of the material in libraries is not published for a fee. University dissertations, technical reports, government documents and other sources make up much of what scholars use today and will continue to use. Libraries should not, however, be encouraging their patrons to use second-rate or erroneous material just because it is available without charge. Again, some degree of judgment and assistance will be required in the digital library, just as it is today.
In summary, economics is another topic that will occupy more and more attention in libraries, and require a greater presence in the organization chart.
Libraries need to cooperate for several reasons. First, cooperation spreads the need for expertise and resources over a larger group of institutions and thus lowers the burden on all of them. Secondly, it increases the bargaining power of the libraries against others who may try to capture the benefits, particularly the commercial benefits, of the new technology exclusively for themselves. Thirdly, it encourages the development and adoption of standards to make everybody's training problems easier. The new technology not only pushes towards cooperation for these reasons, but also makes it feasible.
The same technology, of course, is encouraging cooperation among other
groups. In the first half of the 1990s the fraction of papers with a British
first author and a second author from some other country doubled, according
to Derek Law. Another study is shown in the chart below, which was made
by examining multi-author papers in one particular
journal, and counting the fraction of times all authors were
from the same institution. This measure dropped in the 1990s.
If groups are going to work together over longer ranges, will they be unified or cooperating? Will we find that larger libraries buy up smaller ones, the way chains of drugstores or hardware stores replace individual businesses? Or will we find a set of cooperating but still independent operations? Even within one university, will we find that staff costs and operations costs make it less reasonable to operate many small libraries, and instead start to concentrate effort in a few large places?
In academic libraries, I expect cooperation to win out. Most university administrators still consider their libraries to be a particular institutional strength, and are unlikely to be willing to forfeit this distinction in favor of simply being a branch location of Library-Mart or some such vendor. Librarians also have a strong tradition of cooperation, dating from interlibrary loan and cooperative cataloging. I hope that academic libraries can build on this history, and develop a sufficiently smooth and rapid cooperation so that administrators above them see no reason to try to outsource the library.
Centralization within a campus, however, is another matter. Here there may well be cost savings without any perceived impact on the reputation or independence of the organization, and the value to the users of a small local branch library may not be defensible as electronics makes more services available at a distance. Furthermore, to the extent that separate libraries are supporting different kinds of collections (photographs, sound recordings, etc.), the electronic storage devices that support all of them remove some of the reasons for keeping such libraries independently housed.
A particular point where centralization or cooperation may be very important is preservation. In the digital world, just as in the traditional paper world, libraries will have to keep material around. Sometimes publishers may provide this service, but sometimes libraries will find themselves wishing to maintain digital files. This involves some degree of simple systems maintenance, and some reformatting for new software. Libraries may well find that it is more efficient to do this only once per document throughout at least the nation if not the world, and need to organize how preservation responsibilities will be handled. A likely strategy, for example, is that only one library converts a document to a new format and then distributes copies to a few other sites to be safe against earthquakes, floods and hurricanes. The recipient sites would verify the copies and then absorb them. This saves effort at the expense of organization.
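The "verify the copies" step could be as simple as comparing checksums. The sketch below is one possible way to do it, not a description of any existing library system: the choice of SHA-256, the function names, and the sample data are all assumptions made for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Hex digest that the converting library would publish with the file."""
    return hashlib.sha256(data).hexdigest()


def verify_copy(received: bytes, expected_digest: str) -> bool:
    """True if a recipient site's copy matches the published digest."""
    return sha256_digest(received) == expected_digest


# Hypothetical example: one site converts a document and publishes its digest.
converted_document = b"Converted document contents"
published_digest = sha256_digest(converted_document)

# One recipient gets an intact copy; another gets a copy damaged in transit.
good_copy = converted_document
damaged_copy = converted_document[:-1] + b"?"

print("Intact copy accepted: ", verify_copy(good_copy, published_digest))
print("Damaged copy accepted:", verify_copy(damaged_copy, published_digest))
```

A recipient site would absorb only copies that pass the check and request a fresh copy otherwise. The same stored digests could be rechecked periodically to detect later decay of the storage medium.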
One example of cooperation is the United Kingdom system of purchasing site licenses for the entire academic community through the UK Office of Library Networking (UKOLN). This has given the UK libraries greater bargaining power with the commercial publishers, albeit at the expense of some flexibility for individual libraries. Cooperatives such as this also save administrative costs, and are likely to grow.
Here are the top-level organizational units of the Cornell library, as posted on the Web:
|Technical Services: cataloging, ordering, receiving|
|Development and Public Affairs|
|Technical Services Support|
Even smaller libraries tend to have acquisitions, cataloging, circulation, and reference. Here are some other sample library organizations, sometimes with a more subject-oriented structure.
|Victoria University, Wellington, NZ||Davidson Library, University of California Santa Barbara||Peking University, Beijing, China||HK UST, Hong Kong|
|Circulation||Arts||Ancient Books||Archives & Special Collections|
|Periodicals||Library Personnel||Branch Libraries||Collection Development|
|Reference||Map and Imagery||Cataloging||Document Supply|
|Technical Services||Reference||Circulation||Media Resources & Microfilm|
|Sciences and Engineering||Document Services||Reference|
Certainly we will still have accounting, human resources, shipping/receiving, and development/public affairs. Human resources are likely to be more important as the range of skills needed in libraries increases. Acquisitions and Cataloging may change their focus, to emphasize evaluation and helping users, rather than operating entirely behind the scenes. Collection development may become a more difficult task, given the need to consider the wide variety of material on the Internet and help users judge what is useful. Facilities, Photocopies, and Access Services (library cards) are likely to become less important, along with circulation, as more and more people use the library without walking in the door. Preservation will have a whole new set of problems, but will still be there. Library Technology and support will similarly face new, and larger, problems. And as mentioned before, efforts in legal matters and economics will probably be added to the plate, along with greater needs for reference services and training.
Cornell maintains a variety of specialized libraries in such topics as the Hotel School, Industrial and Labor Relations, Africana and Veterinary Medicine. Whether or not it makes sense to keep separate buildings for these subjects, the staff still needs expertise in the areas.
Since no one foresees any lessening of the economic pressures on universities in general, libraries will still see severe budget limits. Yet the digital library movement adds a need for staff trained in computer technology, who are expensive, and any staff reductions seem most likely to occur among the relatively lower-paid staff who do physical materials handling. Reference work, and other interactions with patrons, will remain important. Libraries today are seeing increased traffic and circulation as a result of online catalogs; this also puts general burdens on buildings and staff.
Libraries may wonder where they can find technically trained staff. The number of computer science bachelor's degrees given in the United States dropped from 39,000 in 1985 to 24,000 in 1994. In fact the general level of computer literacy is increasing greatly, and people can be found with the necessary skills, but not at the salaries traditionally paid for library assistants. As libraries become part of a vaguely defined `information industry' they will find themselves competing for staff with telecommunications companies, broadcasters, and publishers, plus new industries such as Website maintenance. So they will need to pay higher salaries at a time when total budgets will continue to feel pressure.
What can libraries do about this? One possibility is charging for services as a way of boosting revenues; another is to try to cut costs further. Charging for services is disliked by a great many librarians (and some university administrators). There are new services in the digital world that don't correspond directly to services now provided free (e.g. delivery of information to dormitory rooms), but it seems unlikely that university students or university research grants are suddenly going to turn into a source of funds for library support.
Reducing costs seems more practical. If a few computer nerds can replace a great many shelving clerks, their higher salaries may still cost less in total. More practically, libraries might be able to cooperate in the provision of support services as well as in sharing collections. If library consortia could develop shared software packages, and perhaps maintain shared computer centers, this might help keep down costs at individual libraries. We might see a world in which what we now think of as a library is really just a staffed outlet, a branch, to some kind of backroom operation shared among several universities. The hope would be to minimize the amount spent on computer and backroom staff, in favor of money on reference staff and others providing direct service to the patrons.
An alternative to providing technical support to many libraries at once would be to try to share technical efforts with other parts of the university (as libraries now share building maintenance). Even today university administrative computing often supports libraries. The special needs of libraries (and for that matter of administration) are likely to render this rarer in the future. As the cost of computing becomes less and less hardware, and more and more software, the advantages of sharing co-located hardware decrease, while the advantages of sharing software with other libraries increase.
The easiest form of sharing is of course among branch libraries in the same university system. As more and more work is done in student rooms, and less and less in library buildings themselves, and as the need for special collections and special handling decreases, we can expect that the effort spent in maintaining many branches will decline. Branches might become bookless `clinics' devoted only to help and assistance, or might be supplanted by telephone help-lines, perhaps using groupware programs to simplify the job of assisting a student. Again, as the physical location of more and more material becomes irrelevant, libraries will save money by not buying multiple copies of material, but by buying access rights and actually storing materials only in a few sites, located where land is cheap. The purpose of central campus buildings will increasingly be to provide personal contact for patrons, not storage of objects.
The organizational structure of the digital library thus looks more centralized in some areas and more distributed in others. Acquisitions and storage merge, into a general capacity for information provision. That is probably managed centrally for groups of libraries, perhaps spreading across universities. Some of the provision is obtained by buying new electronic publications; some by converting older materials in the possession of the individual library; and some by licensing access from other libraries. Technical services becomes a computer program, run perhaps at a distance. Reference and training, however, are still local functions, and more important than ever, as it will be a while before the use of electronic resources is as transparent as that of books. Libraries will have to train their users in searching, in judging the value of what is found, and in the debugging of computer network problems.
Library budgets and administration are thus entering a time of flux. We can imagine a future in which virtually everything is digital, stored at sites related not to where they are used but where they can be found. Libraries spend their efforts helping patrons and not taking care of books, since the care and feeding of disk drives has been relegated to the remote centers. More of what librarians do will look like teaching, and it will deal more with people and less with buildings and materials stored in the buildings. Moving from this world to that one will be difficult in a time of limited budgets; substitution of electronic purchasing, closing of branches, cooperative agreements, and innovative ideas for services will all be needed.
The traditional library selects, stores and supplies information. Those functions are still around in a digital world. However, the selection will be increasingly on-demand, the storage may be off-site, and the supply may be electronic Web page delivery. These new techniques are more complex for the user, and will require extra training, assistance, and help. Perhaps the most difficult change libraries will face is persuading the administrators above them that they need to be valued and supported in different ways. Success as an academic library in the future is not a matter of piling up books, but of satisfying readers. The traditional metrics of books and chairs (``bums on seats'', in the British phrase) won't make sense. Libraries need to be encouraged and funded to provide assistance to readers rather than raw material. They need to be evaluated on their success in answering reference questions and in training, not on acquisitions. Changing the administrative view that universities have of their libraries may well be harder than anything to do with CPUs or bytes.
[Corn] Joseph Corn, Imagining tomorrow: history, technology, and the American future, MIT Press, Cambridge, Mass. (1986); pages 58 and 190.
[Follett] Sir Brian Follett et al., Joint Funding Council's Libraries Funding Review Group: Report, HEFCE (Higher Education Funding Council for England), Coldharbour Lane, Bristol BS16 1QD (1993).
[Mellon] Anthony Cummings et al., University Libraries and Scholarly Communication, Association of Research Libraries for the Andrew W. Mellon Foundation, Washington, DC (1992). Chapter 4.
[SFPL] Elizabeth Reveal et al., San Francisco Public Library Strategic Audit, Coda Partners, Washington, DC (1997).