John’s article on Open Source Software in Libraries

Modus Operandi for 21st Century Library Technology:
Using open source software to become the library triumphant

By John Brice
Executive Director, Meadville Public Library
System Administrator, Crawford County Federated Library System

“Open source” allows independently developed modules to work together in a large system due to mutually agreed principles. It insures integration and interoperability because: the system as a whole provides a framework -an architecture- that allows for both independence of structure and integration of function.
MIT Open Source Building Alliance White Paper

Introduction

Open Source Software (OSS) is any program whose source code is made available for use or modification by whoever wants to modify it. Historically, the makers of proprietary software have generally not made source code available. Open source software is usually developed as a public collaboration and then distributed freely.

OSS can be an operating system, like Linux, or an application program, like Firefox. The Firefox browser is an excellent example of OSS. It runs on all major operating systems, it has easily added components called plug-ins, and it continues to develop without being owned by a company.

OSS is based on the principle of the scientific method. Since Renaissance times science has improved because all significant scientific principles can be tested and retested. The scientific method requires that everyone involved have free and open access to all principles, methods and written instructions. The open source process allows software to evolve as developers rewrite and improve previous programmers’ efforts. With OSS, programmers can focus on writing the most efficient software.

Companies have recently discovered the power and sophistication of OSS. For example, Google runs an OSS operating system, Linux, on its servers. What Google has discovered is that standard PC equipment and OSS can be integrated into a system that exploits the best capabilities of both.

Approximately eight years ago the library I work for, the Crawford County Federated Library System (CCFLS), stumbled upon OSS. Once we completed our first project and discovered that the software worked, could be installed on any type of equipment, was easily supported, didn’t crash and was essentially free, we were hooked. Since then, every time we have added technology to our library system we have evaluated OSS. In most cases OSS wins out and we begin developing our next project.

Below is an article that describes the journey we have followed using OSS and where we plan to take it in the future. Part one describes the difference between proprietary (closed source) and free (OSS) software. The second part describes how the CCFLS used OSS to implement our technology infrastructure, including how we installed our new state-of-the-art ILS. The final part lays out a future in which libraries do not have to follow the predetermined paths of outside parties such as vendors but can develop their own technology infrastructure by embracing the OSS methodology.

Part 1 Open vs. closed source technology

…high-tech companies—stop messing with us on your treadmill of upgrades while making the old stuff obsolete. It may be that any software company that didn’t routinely upgrade its product would go out of business. But what if the rest of the world worked this way? Oh, I lost a sock. I need to get a whole new wardrobe because the replacement sock is version 2.0.1, and the stores now only sell version 2.0.3.
Jamie Hyneman – MythBusters

In 1981 I started working as a per-diem clerical worker at the Erie County Library System. One of my duties was backing up the CLSI computer before we began operating. The computer was the size of a large desk, with the hard drives housed in drawers large enough to fit the paperwork for an LSTA project. Everything was prepackaged and worked out of the box, so to speak. Had a problem? Call Boston. Need more speed? Call Boston. Need a catalog module? Wait for Boston to write one and then send a check.

Of course, as the years progressed, circulation systems added OPACs, acquisitions modules, periodicals modules, etc. The original vendors went out of business or merged with other vendors. As more libraries became customers, it became impossible to tailor the product to your own policies or circumstances. Libraries learned to change policies to fit the requirements of the program. Libraries also learned that if a system went obsolete it had to be changed when the vendor said so, often at a significant price. And library directors learned that they had to budget an ever-increasing sum every year for “support”. Vendors learned that patron information did not have a standard, so it was fair game to organize it into proprietary formats. The more proprietary the system, the harder it is for a library to migrate. Wall Street hedge funds learned that if you merge two ILS companies together you can fire half the staff and move all of the customers onto one platform.

While all the above was happening to circulation systems, libraries discovered that they needed personal computers, patron kiosks, administrative software, e-mail, web servers, etc. Of course, when confronted with these needs, libraries relied on strategies from past IT projects. We purchased canned software from a vendor and then figured out a way to integrate it into our operations or, more likely, rearranged operations to fit how the software works. One mainframe circulation computer morphed into Local Area Networks, then Wide Area Networks, and then Virtual Private Networks. All the time more software was being purchased, and of course licenses had to be purchased, and of course there were the support costs. In fact, many libraries had to purchase software just to manage the digital rights of all the software they had purchased.

The Crawford County Federated Library System (CCFLS) did all of the steps described above. By 1998 we had multiple vendors supplying us with software to do all of the tasks required to run a 21st century library. I was not happy, though, for two reasons. First, the vendors controlled the architecture and purchasing decisions for my network. I was constantly throwing away good serviceable hardware because some new piece of software needed a faster CPU or more RAM. Second, I was sick and tired of having to work around the software. In 1999 we decided to start using open source in our operations.

Part 2 Implementing Open Source

Talk is cheap. Show me the code.
Torvalds, Linus (2000-08-25).

The CCFLS is now very comfortable in implementing Open Source Solutions in our nine member libraries. We learned to build upon success. We view our open source projects as building infrastructure. We started implementing small projects that were not necessarily mission critical. As we became more comfortable rolling our own systems we developed the confidence to tackle more difficult tasks.

It all started innocently enough. In 1999 we needed a router/firewall for our seven rural libraries. One of my rural directors was married to a retired meteorology professor, Ben Bullock. Ben had been using Linux for five years to run his home-brewed weather station. Ben suggested we could do the whole project for the cost of one new computer. We purchased 10 old Compaq 486s for $20.00 each, ripped out the hard drives, installed some network cards and had the whole thing boot off a floppy using a Linux OS called LTSP. The computers worked great. As an administrator I learned that, with OSS, capability does not necessarily correspond with high cost. The old routers were gradually replaced with newer, higher-performance equipment when the libraries switched over to broadband. Of course, all of the new routers ran an open source operating system.

System-wide (nine libraries) we use open source tools for our web site (Apache, WordPress, the Scout Portal Toolkit), e-mail (qmail, SquirrelMail, SpamAssassin), and Internet firewalling, proxying, and filtering (OpenBSD’s pf, squid, squidGuard, and DansGuardian). We also use a thin-client kiosk system based on LTSP for public Internet access. Of course, we have standalone Linux workstations for both staff and the public. Even on Windows computers we still use OSS tools such as Firefox, Thunderbird, and OpenOffice.

We finally converted our old circulation system to the open source Koha ILS in May 2007. Koha has worked well for CCFLS. It provides us with superior quality software that does not crash, that is tailorable, and is very well supported by an international community of developers who are deeply passionate about the product.

The common underlying theme of successfully installing open source software is an active community of users and developers. With an active community of developers and a large number of users, open source projects become self-sustaining. With a large group of users, there is incentive for the developers to improve and support the program. With a large group of programmers, the users can expect to receive continual upgrades and can easily find support.

Getting Koha installed in Crawford County took over four years and cost over $50,000. So much for the idea of open source software being free. When we first began to investigate replacing our old DOS-based system, we actually looked at another proposed open source ILS called OpenBook. Unfortunately, OpenBook became vaporware. We originally did not consider Koha a viable option because it lacked support for MARC (MAchine Readable Cataloging).

In 2002 Nelsonville Public Library funded the project that added MARC to Koha. When we visited Nelsonville we were concerned about Koha’s response time. Nelsonville was about 80% of our size and their response time was just barely acceptable. Koha had what is commonly called a scalability problem. What worked fine for a library with fewer than 50,000 items would not work for a library system with more than 300,000 items.

After CCFLS evaluated commercial software in 2005, we decided to go the OSS route and selected the Koha ILS. However, before implementing Koha we hired LibLime to create a version of the program that used the Zebra indexing engine. This modification significantly increased the speed of the program for large libraries and library systems. The new version of Koha, called Koha Zoom, became operational in early 2007 and was in fact implemented in a public library in western Provence, France, in January of 2007.

While Koha’s index engine was being rewritten, CCFLS designed a new user interface for the program. After a series of meetings with system librarians we rewrote the Nelsonville templates to give the program our own look and feel. Since Koha is completely web based, the client computers can be of any type as long as they can open a web page. We used what some describe as a riot of colors on our pages. This was done to help support our many small libraries, which use volunteers and part-time staff to run the libraries. It is very easy to tell someone to click on the big red button.

We began using Koha at our largest library, Meadville Public Library, in May of 2007. We shut the old circulation system down at closing on a Saturday and had the new system and all of the data up and running on Monday morning when the staff came in. Not only did we change the software, but we also reconfigured the computers from running Windows locally to a thin client architecture using Ubuntu as the operating system.

The staff, as a whole, has accepted Koha well. There are a number of things that the staff did not like about the program. In most cases we could fix the problems ourselves or write a new script which added capability to the program. For example, some of the staff did not like the search function for titles. So we added a new script which gave them the ability to list books in the same manner as our old circulation system. In fact, the biggest problem now is that the staff knows we can alter the program, so they are regularly proposing new things to add.

As for costs, we spent almost exactly the same amount on Koha as we would have spent on a commercial vendor's system. The cost of the upgrade to the Koha software was over $35,000. We also spent slightly over $15,000 on new computer servers and barcode scanners. We did reuse our old Windows machines by converting them over to thin clients. The real gain from using an OSS solution was that what we paid for the upgrade benefited all current and future libraries using Koha. In a way it is a version of paying it forward. Libraries such as Howard County Library System in Maryland are planning to use Koha; that library could not have used Koha without the work that the Nelsonville Public Library and the Crawford County Federated Library System paid to have done to the program.

The CCFLS Network Administrator, Cindy Murdock, has written a pretty good analogy concerning open source software and cost. Cindy explains it this way:

...it’s like renting a house vs. buying one… If you’re renting, you usually can’t do much to the house; your landlord isn’t likely to let you tear walls down or build an addition. However, if the plumbing breaks…he’ll send someone to fix it. If you own your own house, you can do whatever you want to it, but you’re responsible for either doing the work yourself or paying someone else to do it. So OSS is like owning the house…

Two common complaints about OSS are that there is no support available and that commercial support is much superior. We have found both complaints to be false.

Usually a commercial software company has a department called customer support which answers customer problems. Customer support bases its answers on a database of questions that was developed by the programmers and on previously answered questions. If a customer has a previously unreported problem or bug, then that problem will have to go from the customer support department to the programming department. The programming department may not have the time to fix the problem, so they will get to it when they can.

With OSS we make sure that the program has an active community with a Frequently Asked Questions (FAQ) list that is updated on a regular basis. Most problems can be answered through the FAQ lists. If you have an uncommon problem or discover a bug, you usually can contact the person who wrote the program or who is currently writing code for the next upgrade. In most cases the problem can be fixed through e-mails. If the programmer doesn’t have time to do it right now, you can always ask a local programmer to help with the problem. We have done this with our staff programmer, Kyle Hall.

In our experience, we have found OSS to be more reliable, and we receive faster support than we do from commercial software firms. However, you have to realize that you need access to someone, a staff member or consultant, who understands Linux and is comfortable using a command line. Fortunately, these skills can be learned.

Another option is to hire a commercial service to help install, manage, and support your OSS. For example, Red Hat supports its own version of the Linux operating system. CCFLS has purchased a year of support from a commercial Koha service, LibLime. We hired LibLime because we will need help as we migrate. However, once the migrations end we may decide that we have enough in-house skill to manage the program without a commercial support contract.

From the perspective of the 1980s, using a circulation system from a third-party closed source vendor was a smart solution. Programming computers was difficult, computer hardware was very expensive, support for hardware in a third-tier city was next to impossible, and there really was only one computer to support and manage.

From today’s perspective, using third-party closed source vendors to provide libraries with ILS software is no longer the simple solution it once was. Software can be easily written in modern scripting languages, running on hardware that is cheap and easily supported by individuals who can be trained at local community colleges. A big difference between the eighties and now is that the ILS is the primary point of entry for the public to access library resources. This makes the relationship between the library and the ILS vendor extremely important, in fact critical, for the successful operation of the library.

As Meadville Public Library has shown, any library, no matter what the size, can be a significant player in developing OSS solutions. Any library, no matter what the size, could fund developers to write installation packages and user manuals and to provide help to participating libraries.

OSS may not be the solution for everyone. However, every library should have a viable open source option available to it, if for no other reason than to help negotiate the best possible price from an IT vendor.

Part 3 Open Source and the 21st century library

“The librarian must be the library militant before he can be the librarian triumphant.”
Melvil Dewey

Melvil Dewey was a library militant. His classification scheme was extremely controversial. When Dewey implemented his scheme at Columbia University, the professors were so outraged that books were no longer organized by size that they forced him out as University Librarian.

Dewey understood that the industrial revolution created large quantities of book titles. The result of the new book technology was large libraries. Large libraries acted differently than small libraries. One person could no longer remember what was in a large library. Dewey was daring enough to publish his classification scheme so that it became the standard. Dewey’s genius was not just classification and cataloging but also realizing that libraries needed to follow the same standards. His work was so successful that today the public knows how libraries are organized, no matter what the size or type.

Or, I should say, they did know how libraries are organized. In the past ten years technology has usurped the card catalogs, Dewey/LC classification schemes, the Periodical Guide, encyclopedias, etc. This transition has happened swiftly and right under our collective noses. We can thank one law for all of this, Moore’s Law.

Moore’s Law is named after its author, Gordon E. Moore, co-founder of Intel. Moore’s Law was first published in Electronics magazine on April 19, 1965. The law states that the number of transistors that can be inexpensively placed on an integrated circuit is increasing exponentially, approximately doubling every two years. This law is the reason the cost of computing power continually decreases year after year after year. Moore’s Law explains how we all can purchase a $500 laptop that is more powerful than a 1998 supercomputer which cost millions of dollars.
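To make the doubling rule concrete, here is a minimal Python sketch of the arithmetic. The function name and the sample time spans are illustrative assumptions, and the two-year doubling period is simply the figure quoted above, not a measured value.

```python
# Sketch of the doubling rule: growth multiple = 2 ** (years / doubling period).
# The default two-year period is the approximate figure quoted in the text.

def moores_law_factor(years: float, doubling_period: float = 2.0) -> float:
    """Return how many times larger the transistor count would be after
    `years`, assuming it doubles every `doubling_period` years."""
    return 2.0 ** (years / doubling_period)


if __name__ == "__main__":
    # Hypothetical time spans chosen only to show how fast doubling compounds.
    for years in (2, 10, 20):
        print(f"after {years:>2} years: about {moores_law_factor(years):,.0f}x")
```

Under these assumptions, ten years of doubling gives a 32-fold increase and twenty years gives roughly a thousand-fold increase, which is why hardware that once cost millions can end up in a cheap laptop.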

In my opinion, the most important influence on librarianship for the past thirty years has been Moore’s Law. Why is Moore’s Law so important? Because society as a whole, and librarians in particular, failed to properly plan for the wholesale unleashing of ever cheaper digital technology on our society. Society and libraries have done nothing but react to the latest technology.

Instead of anticipating Moore’s Law we have adapted to it. As technology has spread over the past generation we have incorporated it into our operations. Libraries have been very good at stretching, adapting and incorporating the latest digital wonder to fit into the standards and practices that past generations developed. Of course, libraries have been clever enough to add to and expand the old standards to make them work a little more easily with digital technology. However, the bottom line is that we have grafted twenty-first century technology onto nineteenth century standards.

The results have been startling to see. While libraries cling to old standards and practices, companies not tied to library school thinking have begun to access information through databases, not catalogs. When storing a full book digitally costs mere cents, why not access information through a large electronic concordance index instead of a catalog? Dewey couldn’t catalog every word in every book manually, but we can digitally.
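As a rough illustration of what an electronic concordance looks like in practice, the Python sketch below builds a word-to-title index over full texts and answers simple all-words queries. The function names, the tokenizing rule and the two sample "books" are hypothetical; this is a toy under those assumptions, not a description of any particular product.

```python
import re
from collections import defaultdict

# Assumption: a "word" is any run of lowercase letters, digits or apostrophes.
WORD_PATTERN = re.compile(r"[a-z0-9']+")


def build_concordance(books: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of titles whose full text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for title, text in books.items():
        for word in WORD_PATTERN.findall(text.lower()):
            index[word].add(title)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the titles that contain every word in the query."""
    words = WORD_PATTERN.findall(query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    # Hypothetical sample texts, invented purely for illustration.
    sample_books = {
        "Book A": "The whale surfaced near the ship.",
        "Book B": "The ship sailed at dawn.",
    }
    concordance = build_concordance(sample_books)
    print(sorted(search(concordance, "whale ship")))
```

The design point is that the index is built from every word of the text rather than from a cataloger's chosen subject headings, so a search can reach any passage in any book.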

Because libraries failed to anticipate the consequences of Moore’s Law, libraries have lost the monopoly on information. In the past, no matter what the type of library (academic, public, special, school), libraries had the largest concentration of information. Patrons, customers and students had to come to the library to get the information they needed. Now society has changed and so has the perception of accessing information.

A large portion of society feels that when they need information they can get it first off the web; they only come to the library if they cannot find what they need on the internet. As any librarian well knows, the perception is not the reality. The web may be an ocean of information, but it is only four inches deep. However, the public, in general, believes that the web can answer any and all questions. Why travel to a physical space when cyberspace has all the answers?

Of course, within that last thought there is a germ of truth. Cyberspace can provide useful and easily accessible information. The quote from Melvil Dewey above did not come from a print resource but was the result of a Google search. It wasn’t because I did not know how to look up a quote in print; it was because I was too lazy to go to the reference department to look it up. Same result, less effort.

Libraries need to develop new thinking in acquiring, spreading and disseminating information. Instead of thinking how to incorporate the latest technology into the library, we should be thinking how technology can be incorporated into the idea of the library. In other words, we should have the standards, practices, and services in place to be able to distribute reliable and accurate information no matter how it is provided. On top of that, we have to provide the information just as easily as commercial providers.

Libraries cannot move forward by saying that we will educate the public on the importance of using libraries to do research. Frankly, we are not as convenient, flexible, and nondescript as a Google search. Google just provides a web page where the answer may or may not be. And as hard as it may be for librarians to accept, we do not have the influence or marketing power to change that convenience.

As a profession, librarians need to look at our standards, policies and services and develop new ones that meet the reality of today’s cheap information technology and widespread distribution methods. We need to think 5, 10, 20 years out about what will happen when internet access is available in the air and computers are the size of a piece of paper and can be folded into your pocket.

CCFLS is looking at these options right now and trying to investigate solutions to these questions. Within the next few months CCFLS will be opening a research and development lab in some unoccupied space at one of our libraries. The idea will be to see what can be done now, practically, and what is still pie in the sky. And even if it is pie in the sky today, we can reasonably guess when it will be practicable by using Moore’s Law.
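The kind of guess described here can be sketched as a short calculation. The Python below assumes, loosely following Moore’s Law, that the price of a fixed amount of computing capability halves every two years; the function name, the halving period and the dollar figures are all hypothetical and serve only to show the reasoning.

```python
import math


def years_until_affordable(current_cost: float, budget: float,
                           halving_period: float = 2.0) -> float:
    """Estimate the years until a technology costing `current_cost` fits
    within `budget`, assuming its price halves every `halving_period` years."""
    if current_cost <= 0 or budget <= 0:
        raise ValueError("costs must be positive")
    if current_cost <= budget:
        return 0.0  # already affordable today
    # Solve current_cost / 2 ** (t / halving_period) = budget for t.
    return halving_period * math.log2(current_cost / budget)


if __name__ == "__main__":
    # Hypothetical figures: an $8,000 device and a $1,000 budget.
    print(years_until_affordable(8000, 1000))
```

With these made-up numbers the price has to halve three times, so the estimate is about six years. Real prices will not follow the curve exactly, so the result is a planning horizon rather than a forecast.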

Some of the services we are investigating include a universal electronic user interface that answers questions rather than just pointing users to information resources; electronic databases of our complete non-fiction collections that our patrons can access and search through the universal interface; and access to all materials, in whatever format, throughout our facility.

Of course, any one of these ideas is a huge challenge and, considering the size of CCFLS, could be considered quite grandiose. However, by using modern development and communication tools we can reach out and develop a community of like-minded libraries that can break huge tasks into many small steps. Many libraries have the time to do a little project now and again. By using the open source idea, libraries of all sizes can come together and build the 21st century library. CCFLS asks all libraries, software developers, and anyone else interested to help incorporate the idea of the library into the latest technology.