Thanks to Kevin for collecting a number of useful links that inspired me to write on this topic. I imagine that most readers here are familiar with Google Print, the project that Google has launched to scan a very large percentage of the published works in print, index them, and serve up sections of content in response to queries. It's an ambitious project, and it's controversial -- at least as far as publishers are concerned. Google claims that what they're doing does not violate copyright, because they will only serve up portions of the book. Kevin pointed to some good essays about this.
I'll come right out and say it: I'm all in favor of Google Print. Why? I was ambivalent until a couple of days ago when my father called to tell me something that I initially found absolutely amazing. Upon his retirement a number of years ago, I had finally convinced him to get a computer, and I've been his primary tech support ever since. (And unlike the case with certain other family members, I never get "it doesn't work" reports from my dad.) He's turned into an avid web user over the past few years, using it to research topics of interest to him. His latest find was a digitized phone book from Warsaw, Poland, from the time when he still lived there -- nearly 70 years ago. He found it in a collection on the Library of Congress web site. He found the listing for his family as well as listings for various relatives and acquaintances.
When my dad told me about this, my initial reaction was amazement, but my next reaction was "why not?" Why shouldn't this type of information be out there? Why shouldn't it be easy for anyone who is interested in Warsaw in 1939 to find it? Of course it should! That's when I came firmly to the conclusion that Google Print is a good thing, and that the effort to digitize the world's printed material (or the efforts, as I understand that Google's isn't the only project) may one day be recognized as one of the biggest contributions to the advancement of human knowledge since the printing press.
This site is licensed under Creative Commons, but I am still a firm believer in respect for copyright. I do recognize that what Google is doing involves some thorny questions with respect to fair use, but what I contend is that these questions aren't actually new. I contend that Google is actually opening up no new ground in the copyright arena.
How can it be true that Google is opening no new questions in copyright when they are clearly doing something that nobody else has done before by scanning so many books? I say that it's just a matter of scale, and I'll explain why.
What is the difference between what Google Print is doing and what every bricks-and-mortar bookstore does? It's just scale. What is the difference between what Google Print is doing and what every bricks-and-mortar library does? It's just scale. In the case of both bookstores and libraries, it's just the scale of works available for search. In the case of libraries, it's the scale of material available to searchers -- and libraries give away more than Google Print will. It all boils down to the fact that Google Print will make excerpts of more books available to more people than any bookstore, chain of bookstores, or library ever will.
I can walk into the local Barnes and Noble and do a search. I can do the same in the local library. It's not a particularly cost-effective search mechanism for broad topic areas, but it does work well when the topic area is narrow -- and if time were no object I could do it very well in broad topics, too. I may be hampered a bit because a small percentage of the books in Barnes and Noble are shrink-wrapped, but none of them are in the library. It's clear from the lack of shrink-wrap on most books in most bookstores that publishers don't have a problem with potential buyers being able to check out the contents of their books before purchasing.
I can walk over to the section of the store or library that contains the bulk of the books that are likely to be relevant, examine the titles and leaf through the pages looking for relevant information. I've done this several times when looking for a good explanation and example of some obscure Windows programming area that I haven't been able to find in a web search. I can quickly narrow down the search to a half dozen or so books, check the indexes for API calls that are related to what I'm interested in, and in ten minutes or so I might decide to check out or purchase one or more of these books.
But Google makes a scan! That's a copy, so that's the difference. Bookstores don't make a copy. In fact, they've already paid for (or contracted to pay for, at least) their stock. That's what the publishers say and want you to believe. Let's consider, however, the typical setup in a different type of store. Consider your local video store. No, they don't make copies of the DVDs and show you excerpts, but if they operate the way the ones I go to do, they do something that is in a way analogous. They don't put the actual disks on display. They just put out empty boxes. Those boxes have the cover art and notes, and that is material chosen by the publisher to act like an excerpt. Sure, it's material chosen by the publisher so it's not analogous to material chosen by Google's algorithms, but the video stores use that material because it's convenient, not because that's the only material they're allowed to use! In theory, any video store could design their own alternative cover art and notes for the boxes that they put out on their shelves. If they think they can do a better job of marketing the material than the publisher, they are free to do so -- and while I'm not a lawyer, I see no reason why that cover art couldn't include a screen shot of the actual video. In fact, if the technology existed to do it cost effectively, I see no reason why the box on the shelves in video stores couldn't actually show the customer the trailer for the video on a small LCD screen. It could even show a few excerpts that weren't chosen by the publisher, and nobody's copyright would be violated by that. In addition, although I don't think I've seen this at video stores, there are many music stores where you can pick up a pair of headphones and push a button -- or wave a bar code, even! -- and hear excerpts from CDs. Again, nobody's copyright is violated by this!
I don't see, therefore, how the fact that Google makes a scan and shows excerpts has any bearing on whether or not the use of that scan violates anyone's rights.
So, who profits from this? The library doesn't. The bookstore does. The folks that the bookstore pays its advertising dollars to in order to entice me to come into their shop will profit from it. The Starbucks corporation, whose franchise within the bookstore I might patronize while browsing, will profit from it, too. For that matter, the oil companies I buy the gasoline from will profit from the three mile drive to Barnes and Noble, and New Balance will profit from the wear and tear on my sneakers from walking around the bookstore. And, yes, the publisher profits and the author profits, too. (The author is probably the one who profits the least, but that's a whole different matter, isn't it!?)
I have to ask, therefore, why should publishers treat Google Print any differently than they would treat a mobile library that comes to your door, shows you a variety of books on topics that you've asked to see, and lets you browse them? Imagine that my friend "Gu Gill" (A little license here, please. Yes, I really did have a college friend who was called "Gu", and I do have another friend whose last name is "Gill", but they are not the same person) took calls from people who told him what they were interested in, stocked up his mobile library van and drove to their houses, and found a way to make money at it. How could it possibly be a violation of copyright? To make the analogy to Google Print more complete, what if Gu's mobile library made money solely through the advertisements written on the sides of the van and through affiliate programs with booksellers that he would happily refer you to if you decided you wanted to buy a book? He would be showing you the entire book, close to the comfort of your own home, and there would be no copyright problems at all. Why does the fact that Gu Gill carries around the actual books, but Google uses a scan of the book -- a book that somebody paid for, by the way, before it was scanned -- matter in this?
I say that it doesn't. It doesn't matter that Google will profit from this any more than it would matter if Gu Gill does. Google Print is acting in the same capacity as the mobile bookstore that doesn't actually sell the books, and they're actually going to give me less access to the content of the book than I get at the bricks-and-mortar library or Barnes and Noble, or that I would get from the hypothetical service run by Gu Gill.
The only difference is scale. Google (or someone) will pay (or will have paid) for one copy of the book, and the scan will be available to loads of people all around the world. Clearly, for Gu Gill to reach the scale of Google, he would need a fleet of vans and he would have to purchase many more copies of each book in order to make the service work, so the publisher would make a bit more money from Gu's effort even if nobody bought a single book after browsing in the mobile library. One could imagine a FedEx-like just-in-time shipment to get the right books to the right mobile units at the right time, so Gu actually wouldn't have to have one copy of every book for every mobile unit, but he would definitely need more than one. Not as many as you might think, though, because if Gu just showed excerpts, like Google Print, he would only need to be flying around five or ten pages at a time, ripped out of each book. He could probably make do with just one -- or a few -- copies of most books. I'll concede, however, that he probably will need more than one for the more frequently accessed books.
Given that, perhaps there should be some formula for measuring the scale of use that a book within Google Print gets and establishing a virtual number of copies that the publisher should be paid for. The Nicholas Carr article (one of the links above) suggests that putting some sort of organization like ASCAP in place to measure usage and extract royalties from search engines might be the answer, and indeed that thought might lead to a fair way to settle the dispute. I somewhat doubt that the publishers are actually thinking along lines anywhere close to the analogies that I've drawn here, so I kind of doubt that they'd be satisfied with something like that.
Google will probably also not like that type of solution, however, and I think they actually would have a pretty good argument against paying for multiple copies based on access if they simply took my analogy and said that if Gu simply drove faster he'd never need more than one copy of the book. That's what Google Print does: it takes a legally purchased book, sitting on a shelf somewhere, and moves it virtually, very fast. Google Print breaks the speed limits of the bricks and mortar world. If publishers were profiting from the speed limits of the bricks and mortar world, that was an accident of that world. It's a different world now.
A generation or two from now, somebody is going to go searching for something. It will be something one might have never thought would be available on line. Something contemporary to our time now, and as unlikely in our current mindset for somebody to find as the 1930's phone directory for Warsaw was to me until a few days ago. Now, there's really nothing remarkable about a 1930's Warsaw phone book on its own, but it really does show the possibilities. If the economics of Google Print work out, it can only lead to greater demand for making non-book content available on line. If all the books in print in Warsaw in 1939, and all the newspapers and magazines, too, and in fact all the books, newspapers, and magazines in print everywhere in 1939 were easily available on line for that matter, imagine how much primary source research could be done so much more easily than it can be done now. Google Print will change our mindset about what we should be able to find easily and what we shouldn't, and in the course of doing so they'll make some money and they'll help a lot of publishers sell their books. Where's the downside?
1. Ben Langhinrichs 10/23/2005 11:42:47 PM
While I am not entirely sure how I feel about Google Print, I would take issue with one part of your logic. Your analogy falls short on two fronts, in my opinion.
The first is the issue of copying. Barnes and Noble, or for that matter your pal Gu Gill, do not let you copy the parts of the book, but just look at them. Granted, you could copy books from the library, but they go to some effort to prevent it and post prominent notices to prevent copying books wholesale (and will stop such behavior if they see it). As we have seen with Napster and Kazaa and elsewhere, copying on-line is just easier, and so different.
The second issue is that of a sample not being the whole thing. You may think the video store has a right to pick and choose, but I doubt you are right. Imagine if a video store posted a screen shot that gave away the plot, or a bookstore used quotes from the book that were the most important part? The publishers would try to stop them. Similarly, a supermarket may give away samples, but they have a choice about when and where and how. Just because a sample of cheese may amount to a twentieth of the amount of cheese the normal person buys does not mean that anyone can go into a store and take a twentieth of any product they like any time they like. That is what Google is saying is OK, and I think that part is just wrong. Part of the problem is that Google is taking over the decision about which part is OK to sample. If I have an important strategy that I publish in a book, I don't want to have the core algorithm published, even if it is just on one page. I'd fight a newspaper printing the algorithm, but I would have a hard time fighting Google's printing that excerpt if it came up in a search.
So, no, I think the analogies you make do NOT work, and do NOT absolve Google of its basic violation of copyright. I still am not sure whether this will work to the good or bad of humankind, but I also don't know whether removing all historical land ownership and splitting all lands up equally would be to the good or bad of humankind, but that doesn't make me in favor of it.
2. Richard Schwartz 10/24/2005 03:22:57 AM
@Ben: I don't think your objections hold up. Your supermarket analogy doesn't hold up because copyright law definitely does allow anyone to take a small sample of any published work at any time -- subject to certain conditions. It is not like a supermarket at all. The publisher does not get to exercise control over where and how fair use samples are taken.
As for Google taking over the decision of which part is ok to sample, the publisher doesn't have that right, so Google can't be taking it over! Under copyright law, the publisher doesn't get to dictate in advance what constitutes infringement. The publisher gets the right to challenge post facto any alleged infringement. The publishers may indeed try to stop a video store that shows excerpts that give away the plot, but unless there is a contract between the publisher and the store that gives the publisher rights over and above what copyright law allows, the publisher cannot impose prior restraints on the video store. What they could do is charge infringement on the grounds that such an excerpt would exceed the "amount and substantiality of the portion used in relation to the copyrighted work as a whole" that is allowed for fair use. That language is taken from http://www.copyright.gov/fls/fl102.html
Google Print, by presenting a few pages chosen as a result of a search, and making them available for copying, may indeed facilitate some infringements when the portion that they present for copying would in fact constitute "an amount and substantiality... etc." Most of the time, however, what they present will not be facilitating infringement. Are you familiar with the Sony case? The more recent Grokster case did not overturn it. Sony was not guilty in their case because their BetaMax technology's purpose went far beyond simply facilitating infringement, and indeed it was heavily used for non-infringing purposes. The Grokster case, on the other hand, was lost because Grokster was clearly designed to facilitate infringement; anything else it could do was mere window dressing, and non-infringing use constituted a tiny minority of its use. Google Print is clearly more analogous to Sony than to Grokster. The legitimate use for research clearly dwarfs the potential for infringement.
As far as Google allowing copying because "copying on-line is just easier", that might be true -- but I've tried out Google Print and in my opinion it isn't. They present images, not text that can be copied to the clipboard. They disable right-click so directly copying the images isn't easy. You can screen-shot them, of course, but you can carry a cameraphone into a bookstore and get page images that way, too. Is Barnes and Noble liable for that copyright violation? If they are, why aren't publishers insisting that booksellers must shrink-wrap all books? And Google Print does not make it easy at all to copy an entire book. You have to log in, and they track the images that they show you. They won't show you more than a few images from any one book in a short time. You could, of course, create lots of accounts -- just as you could send 400 people into a library and have each one take a photocopy of one page of a book; but if the library isn't liable for the infringement that you and your gang of 400 perpetrated, why would Google be liable for you and your gang of accounts? (And for all I know, Google is tracking IPs, as well, in order to make it even harder.)
As for your desire to protect a core algorithm that you publish in a book, even if it is just one page, I'm sorry but I don't think you have an enforceable right there. By the letter of the law you do have that right, but no library that I know of will stop someone from photocopying one page out of a book -- or from transcribing it by hand. They are not obligated to watch over their patrons and make a legal decision about whether each individual photocopy constitutes an excerpt that exceeds the allowable "amount and substantiality..." If there is an infringement, the liability for the offense lies entirely with the library's patron, not with the library. Nor would a bookstore be liable if someone came into their shop and ripped that particular page out of the book, pocketed it, took it home, and made lots of copies. That's been possible ever since there have been bookstores, but publishers haven't insisted that bookstores shrink-wrap their books to prevent it. Again, the offense in this case lies entirely with the person who does the ripping and copying.
Why, then, should Google Print be subject to liability for infringement that might be committed by their users? You might claim that Google Print allows this infringement right from within the privacy of one's home, so there is no jeopardy of being caught ripping the pages out of the book or parading 400 people up to the copier in the library; but the Sony case was about technology that enabled potential infringement in the privacy of one's own home as well, and that point wasn't enough for the courts to rule against Sony, so I don't see it applying to Google Print either.
3. Alan Bell 10/24/2005 08:07:23 AM
I think that it is pretty clear that the end users of the Google service are not exceeding what they could do with the "fair use" provisions applying to books they could have obtained from libraries or bookshops. The question is whether Google is making fair use of the book they have purchased (in fact whether or not it was purchased is pretty irrelevant). The detail of what you can and can't do without permission is actually in article 10 of the Berne Convention.
(1) It shall be permissible to make quotations from a work which has already been lawfully made available to the public, provided that their making is compatible with fair practice, and their extent does not exceed that justified by the purpose, including quotations from newspaper articles and periodicals in the form of press summaries.
(2) It shall be a matter for legislation in the countries of the Union, and for special agreements existing or to be concluded between them, to permit the utilization, to the extent justified by the purpose, of literary or artistic works by way of illustration in publications, broadcasts or sound or visual recordings for teaching, provided such utilization is compatible with fair practice.
(3) Where use is made of works in accordance with the preceding paragraphs of this Article, mention shall be made of the source, and of the name of the author, if it appears thereon.
If they are going for the teaching provision then I think they have a weak case, and anyhow this is done by country level legislation, the convention just suggests there should be some.
I think they have to argue that their quotations do not exceed that justified by the purpose. This is going to be tricky because the purpose or intent of a Google Print user is hard to prove. Other bits of Berne that could apply are reproduction, broadcast, recitation and adaptation, none of which seem to help Google.
I wish I knew less about this subject.
4. Ben Langhinrichs 10/24/2005 09:06:53 AM
I just think it is a weak argument, Rich.
5. Alan Bell 10/24/2005 09:11:53 AM
Perhaps I should clarify my position: I think the whole concept is a great idea. It should be supported and open to competition. I just think it is legally questionable under the current provisions, so the lawyers will go to court to get the question answered in the USA, and perhaps a clarification to Berne will be required at some stage.
6. Richard Schwartz 10/24/2005 09:55:12 AM
@Alan: I don't know nearly as much about Berne as I do about US copyright law and precedents. In particular, I don't know whether Berne supports the right of libraries and individuals to make copies for archival and preservation purposes. Under US law, that right is recognized. I think Google is relying heavily on that right, plus the Sony BetaMax decision. Clearly their scanning goes beyond simple archiving and preservation, but if the purpose of that scanning is to facilitate fair use by end-users it seems to me that it has the same underlying benign intent as the scanning that libraries are allowed to do (in the US). It's just a matter of scale, in that Google opens up their archive to people on the net, not just to people who walk in the door, and thus the argument against them is that even though their intent is analogous to the scanning that a bricks-and-mortar library is allowed to do, the effect on the copyright owner is much greater. IMHO, if Google loses on this, it will be new law in the US, not an extension of existing rulings. It will be a new limit on previous rulings that allow copying, and the limit will be based solely on scale and convenience. It will be like saying "it's ok to be a library, but you just can't be a really, really big library that's really, really convenient". Whether Berne is any more clear about this, I can't say.
@Ben: I'm not an attorney, as you well know. I've never even worked for a law firm and I may even agree with you on a certain level -- that there's something suspect about Google Print from some idealistic or moral point of view; but from what I do understand about how copyright law has been applied in the US, I think they do have a strong legal position. And if there is a ruling against them that boils down to "you're too big and too convenient to claim the rights that smaller libraries have", I would consider that very worrisome. I realize that it's a balance between the rights of the user, the rights of the library, and the rights of the publisher, and scale matters when balancing rights -- but the Sony decision involved very large scale, and that precedent is pretty well established so I don't think scale would be the true deciding factor if a ruling goes against Google. It would have to be substantially based on convenience, and IMHO it will be quite chilling if a ruling comes down that says that a system that has substantial non-infringing intent is illegal because it makes fair use too convenient for too many people.
7. Alan Bell 10/24/2005 10:04:32 AM
I expect US law trumps Berne convention in the US, but I think this is an international issue, if they can stay within Berne without relying on country specific provisions then they are safer. You are right that conceptually they are equivalent to a very fast travelling library, however in copyright things that are conceptually equivalent might not be legally equivalent. e.g. reading a book aloud over the radio or internet is conceptually the same as visiting all the listeners in their homes and reading it to them, just more convenient. One is fair use, the other is broadcasting which requires separate permission from the rights holder.
8. Richard Schwartz 10/24/2005 11:05:22 AM
@Alan: Very good point there. That's a clear way of looking at the issue apart from all the analogies I can throw at it. But here's the thing: a broadcaster doesn't need permission to broadcast excerpts as long as those excerpts fall within fair use. Thus the analogous question would be: does US law or Berne allow a broadcaster to make a copy of the full work in order to facilitate their right to do many individual broadcasts of excerpts, each of which would qualify as fair use, and which even taken together would not constitute a violation? Better yet, consider a broadcast network that wants each of their affiliate stations to be able to broadcast their own choice of non-infringing excerpts from some book in response to listener requests that come into each station. Could the broadcast network buy just one copy of the book, make a full recording of it, break that recording into excerpts, and send packages of those excerpts down to their stations for them to use as they so choose and be covered by fair use? Or is the broadcast network obligated to buy individual copies of the book for each of their affiliates in order to earn the right to send them their package of fair use excerpts? Is there precedent one way or the other on the question when formulated that way? I doubt it, frankly, and here's why: the bigger broadcasters generally don't pay for their copies of the copyrighted materials that they excerpt. They get them free from publishers who are only too eager to have their works excerpted on the air. So I doubt any case like this has ever come up in the broadcast industry, and if one does come up I think there's ample evidence that years and years of standard practice in the industry establishes that publishers have relinquished any broad claims to royalties for copies made for the purpose of facilitating fair use.
For example, local broadcasters routinely make tapes of network feeds and play the tape at a later time, and there's never to my knowledge been a challenge of that act of copying by a publisher on the grounds that they should have been paid for that copy. Evidence, of course, doesn't constitute proof, but even in this line of reasoning I think Google has a pretty firm set of legs to stand on.
9. Alan Bell 10/24/2005 11:26:40 AM
Broadcasters do pay for broadcast rights. There are agencies which collect royalties in bulk for music broadcasts, and book readings are absolutely done with the explicit permission of the rights holder. Generally this is children's stories read on channels like CBeebies, or within programs aiming at the Sesame Street audience. Bigger publishers have rights departments with people specifically to handle such requests. I think you are correct that not a lot of money changes hands, and it could be in any direction, but I don't see it as a relaxed standard practice which would help Google. I think Google are balancing on one leg, and they have a bunch of lawyers eager to have fun with this case. They know it is coming and they think they can afford it, and probably win it.
10. Richard Schwartz 10/24/2005 01:36:11 PM
Broadcasters pay for broadcast rights to works played for entertainment purposes. It's been years since I was involved with this, but my recollection is that playing/reading short excerpts within non-entertainment programming (news, public affairs, education) is outside of the purview of the royalties that they pay.
11. Alan Bell 10/24/2005 05:57:26 PM
Yeah, I guess you are right about broadcasts of excerpts. That's the intersection of two activities. I bet some lawyers would argue the rights and wrongs of it either way anyhow.
The USA tends to have more restrictive laws on copyright than other places, mostly because of Disney & Hollywood, I guess if Google can win in the US then they can win anywhere.
I bet they will make it a constitutional issue:
To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;
They could argue that they are promoting the progress of science and useful arts (as long as they exclude Kansas biology text books from their searches) and that progress is more in the objective and spirit of the constitution, the exclusive rights granted are a means to the end of progress. Basically copyright and patents are intended to let people profit from creativity so that they have an incentive to be creative. If Google argue they are not discouraging creativity then they have a case which would probably go to the supreme court with the lawyers they can afford.
The US constitution would trump Berne in the US, but I think it would be unfortunate if they can't do it from Berne because it wouldn't legitimise and open up international competition.
12. Richard Schwartz 10/24/2005 06:50:33 PM
We'll see. I probably ought to get back to reading Lessig's blog. He probably has a thing or two to say about it.
13. Alan Bell 10/25/2005 10:50:08 AM
just saw this article on the subject:
and here is some more from Lessig:
I think we all agree that Google should win, the interesting part will be how they win.
14. Ben Langhinrichs 10/25/2005 11:18:11 AM
One other point to clarify. US law does not trump international treaties. If we have signed an international treaty (and I don't know whether the Berne Convention counts as such or not), it actually trumps US law and is basically at the level of the US Constitution. We may not always act that way, but that is the way the US Constitution reads, if I recollect correctly.
15. Richard Schwartz 10/25/2005 10:23:55 PM
@Ben: Technically you are completely right about that, but reality is a lot more cloudy. Article VI of the constitution binds the judiciary to give treaties the status of supreme law of the land -- the same level as the constitution itself. If the legislature passes laws that are contrary to treaties, the courts can and should strike them down. The laws, not the legislators, or maybe...
The thing is, there are a lot of catches.
First, the judiciary can't do anything unless someone brings an action, and to get a court to rule that a US law is in violation of a treaty you've first got to get a court to accept jurisdiction over the case. Issues of jurisdiction in international law are very hairy, and getting a US court to actually accept jurisdiction on a matter of a treaty could turn out to be very, very hard -- maybe prohibitively hard -- for any litigant to do.
Second, if a case actually did go to court it could go either way. The judiciary doesn't just have the right to void a law because it is contrary to a treaty. They also have the right to void any provision of a treaty that is contrary to the constitution. The treaty's status as "supreme law" is actually "supreme law, except for the constitution". (Without that interpretation, the US could enter into a treaty that repealed the bill of rights.) Wikipedia says that's never happened, but one could easily imagine that if the DOJ really wanted to get the US out of a provision of Berne being enforced in a court case, they could probably find a way to argue that the provision of Berne is contrary to the constitution -- and they might win.
Third, treaties are essentially contracts and generally include provisions that allow a country to withdraw at any time, and that power rests with the executive. So Berne is supreme law in the US until the President says it isn't.
Finally, if the US courts go along with it, the only things that prevent the US from signing a treaty and then ignoring whatever clauses it wants yet claiming that it hasn't withdrawn are (a) the possibility of being taken before the UN or ICJ or facing sanctions, and (b) the desire to keep other countries from ignoring whatever clauses they want. The former is not much of a realistic threat to the US, so it's really only the latter that applies, and if the powers that be in Washington decided that what they really want is to force change in the provisions of Berne, they could just set up a fait accompli and let the dust settle where it may.