Computers and the Quest for Access
1853 – Poole’s Index to Periodical Literature published
1873 – Shepard’s Citations initiated when Frank Shepard began collecting legal citations (Shepardizing)
1901 – Wilson’s Reader's Guide to Periodical Literature is first published
1958 – KWIC Indexing devised
1960 – Eugene Garfield and the Institute for Scientific Information publish the Science Citation Index
1965 – “Hypertext” term is coined by Ted Nelson.1
1971 – ERIC and MEDLINE databases first become available online
1972 – Dialog offers first publicly available online research service
1975 – Ohio State University develops first large-scale online catalog (telnet access)
1981 – The PC is born
1985 – Wilson’s first electronic project, a version of the Reader's Guide, debuts
1992 – 30 online publications considered scholarly
1993 – World Wide Web becomes publicly available and first Web browsers developed
Mid 1990s – Emergence of Web search engines and Web-based OPACs
2004 – Google Book Search initially launches as “Google Print”
2006 – Debut of next-generation online catalogs
2007 – WorldCat Local launches local discovery and delivery services
2007 – iPhone released
2010 – iPad released
2010 – 78,130 publications considered academic/scholarly by Ulrich’s Periodical Directory; 34,723 of these are online
Christine L. Borgman delineates the history of early online catalogs (OPACs).
The first generation of online catalogs followed either of two query-oriented design models: Online “card” catalog models, emulating the familiar card catalog, or Boolean searching models, emulating information retrieval systems such as DIALOG or Medline. Second-generation online catalogs merged these two design models and improved access points, search capabilities, and display options (Hildreth, 1987, 1993). Most online catalogs currently in use provide second-generation functionality.2
Evolution of OPACs
- First Generation: computerized access to catalog records using the MARC bibliographic format
- Second Generation: keyword searching, Boolean operators, and eventually graphical Web interfaces
- Next Generation: linkages to external sources such as book images, partnerships with bookstores and other outside providers, meaningful relevance rankings, community tagging, and useful post-search refinement aids such as facets
The emphases justifying these indexes were the speed with which they could be produced and their low cost of production, all of it motivated, of course, by the fact that the technology made them possible. These indexes began to appear in the mid- to late 1950s.
Why We Can’t Find It
We have libraries filled with hundreds of thousands to millions of books. Yet students continue to approach the academic reference desk saying, “Why doesn’t your library have anything on my topic?” No wonder they say this: our access points are deficient.
Students seem to be a bit happier when they search for journal articles. Why the difference? I call this problem “the information access anomaly.” It becomes visible when we compare the size, structure, and extent of the surrogate bibliographic record for each information type with the full text of the item it describes.
|        | Online Catalogs | Online Article Databases |
|--------|-----------------|--------------------------|
| 1990s  | First-gen OPACs replicate card catalogs; searching left-anchored | Text-based searching moving from mediated to public CD access |
| 2000s  | Second-gen OPACs offer keyword and Boolean searching with Web interface | Web-based searching of many descriptors and lengthy abstracts, but generally no full text (FT) |
| 2010s  | Next-gen OPACs offer enhanced searching, but no FT searching; Google Books searches FT of books | Google Scholar searches FT of articles |
It’s no wonder we are losing library users to Google and other search engines. They can actually find things there. You may have heard the phrase, “Information wants to be found.” But our library computer systems are working at cross-purposes with that. The chart below helps to illustrate why this is the case.
The Information Access Anomaly: Books vs. Periodicals
|                                | Book (average)                  | Journal Article (average)      |
|--------------------------------|---------------------------------|--------------------------------|
| Typical length, full text (FT) | 200 pages × 400¹ = 80,000 words | 15 pages × 400¹ = 6,000 words  |
| Surrogate record (SR)          | 50-100 words (75 avg.)          | 300-500 words (400 avg.)       |
| SR to FT ratio                 | 1 to ~1,067                     | 1 to 15                        |

¹Avg. 400 words per page (http://www.writersservices.com/wps/p_word_count.htm)
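The ratios in the table follow directly from the assumed word counts. A minimal sketch of the arithmetic (the averages are the table’s assumptions, not measured values):

```python
# Average word counts assumed in the table above
book_full_text = 200 * 400        # pages * words per page = 80,000 words
book_surrogate = 75               # average catalog surrogate record
article_full_text = 15 * 400      # = 6,000 words
article_surrogate = 400           # average abstract plus descriptors

# Surrogate-to-full-text ratios, expressed as "1 to N"
book_ratio = book_full_text / book_surrogate        # ~1,067
article_ratio = article_full_text / article_surrogate  # 15

print(f"Book:    1 to {book_ratio:,.0f}")
print(f"Article: 1 to {article_ratio:,.0f}")
```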
These differences in the ratio of surrogate-record words to full-text words should inform our search strategies. When we search for extended full text by means of rather brief surrogate records (as is the case with most library online catalogs), we must use very broad, conceptual thinking. If we don’t, we are not likely to retrieve many results. This is why users so often express displeasure with our public and academic library holdings. “Why doesn’t your library have anything I want to read?” is a common refrain heard at library help desks.
The situation is quite a bit better in the case of journal and periodical literature. There are two reasons for this: journal articles are shorter than books, and the surrogate records that describe articles typically contain abstracts as well as a wealth of subject access terms. Thus, fewer complaints are usually heard about journal content.
As we will see in the chapter on the various Google products, the ratio problem is generally solved by Google’s indexing, since Google indexes the entire full text: a 1-to-1 ratio for both books and journal articles. But this raises another problem: with fewer constraints, how can relevancy be enhanced?
A generalization can be made at this point. One type of search strategy is needed when searching surrogate records, whereas another strategy should be employed when searching the full text of items.
| Search Type       | Recommended Strategy                          |
|-------------------|-----------------------------------------------|
| Surrogate records | Expanding strategies (e.g., Boolean OR)       |
| Full text         | Constraining strategies (proximity operators) |
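The distinction between the two strategies can be sketched in a few lines of code. The functions below are illustrative stand-ins, assuming simple whitespace tokenization rather than the syntax of any real catalog or database:

```python
def matches_any(record: str, terms: list[str]) -> bool:
    """Expanding strategy (Boolean OR): a brief surrogate record
    matches if it contains ANY of the broad concept terms."""
    words = record.lower().split()
    return any(t.lower() in words for t in terms)

def near(full_text: str, a: str, b: str, window: int = 5) -> bool:
    """Constraining strategy (proximity): two terms must occur
    within `window` words of each other in the full text."""
    words = full_text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

# A brief surrogate record calls for broad OR searching...
record = "Voting behavior United States elections statistics"
print(matches_any(record, ["elections", "polling"]))   # True

# ...while full text rewards tight proximity constraints
text = "The report gives election statistics broken down by county for Colorado"
print(near(text, "election", "statistics", window=3))  # True
print(near(text, "election", "colorado", window=3))    # False
```

With full text, an unconstrained OR search would match nearly everything; proximity narrows the flood of hits to passages where the terms actually relate to one another.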
When searching index records for periodicals or MARC records for books, the records contain fields such as author, title, some kind of subject or keyword analysis, and often an abstract. Thus searching will need to be extremely broad and generous, focusing on the big picture of the work, rather than minute points that may be covered.
It’s all about what you are searching and how you are searching for it.
We go to Google and type in election statistics for colorado and we get 223,000 results. We type the same words in an academic library’s online catalog and we get 10 results. We conclude that libraries are not helpful.