Copyright Access Center
 
 

COCOA FAQ

By Dr. Andrew Burt,
Chair, The COCOA Association

(All opinions expressed herein are those of the author)

Table of Contents [version 1.53, 10/18/06]:





Definitions


What's a "CDS"?

CDS stands for Content Display Sites -- such as Google Print, Amazon.com's Search Inside the Book, Microsoft/Yahoo! and the Open Content Alliance. It gets tedious writing "Google and Amazon and..." so I'll call them "CDS" in here.


Who's a "copyright owner"?

An author or a publisher. Authors are the original owners of the copyright for works they create (except in certain cases, such as when they write a book for hire for someone else). Authors typically license publishers the right to print their work via contracts. Electronic rights (to make/sell electronic versions) are a separate right. (Other rights include the right to make a film of a book, translate the book into another language, create sequels, etc.)

Some works have no copyright, such as works in the public domain. Copyright eventually expires, meaning anyone can then do what they want with the work. Works created prior to 1923, for example, are highly likely to be public domain. Shakespeare, Dickens, etc. are in the public domain.

Some rights are also granted to others by law. See Fair Use below.

Note that a CDS has to either acquire the rights to a work, or have some other legal right to it, such as if the work is in the public domain, or the use is covered under fair use provisions.


What's "Fair Use"?

Fair use refers to rights that copyright law grants to others and that a copyright owner has no control over. The "Fair use" provision of the copyright law is here (but note that a lot of what is considered fair use is determined by case law, i.e. lawsuits that someone won or lost).



About COCOA


What is COCOA?

COCOA stands for "Copyright Owners' Control of Access." The "COCOA standard" or "COCOA Protocol" is a technical description of how copyright owners would describe what visibility they would like for scanned page images of their work. (It also works for other media -- music, film, etc.) COCOA is administered by the COCOA Association, a non-profit established to implement and operate the COCOA protocol (such as authenticating members), disseminate information about COCOA, and promote its use.


Who created COCOA?

Here is a list of the original committee members. Note particularly that the design team are spread across the far ends of the copyright spectrum, from "copyright conservative" to "copyright liberal." This gives COCOA broad appeal to people no matter how they feel about copyrights.


How does COCOA work?

COCOA is simple: a one-stop shop where copyright owners, from individual authors to huge publishing conglomerates, can specify the visibility of scanned pages and other content. Thus, if one needs to block the last three pages of a certain short story in an anthology for contractual reasons, one quick visit to a secure web page and blink, all CDS's will be informed and can block pages 214-216 of that ISBN. Likewise, it lets publishers establish default visibility standards - for example, "unless otherwise specified, block the last 25% of novel pages from view" or "block every 4th page of our textbooks."
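To make the three kinds of rules just mentioned concrete, here is a minimal Python sketch of how a CDS might evaluate them. The `VisibilityRule` class and its field names are made up for illustration; they are not the actual COCOA record format, which is described elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class VisibilityRule:
    """Hypothetical per-book rule; NOT the real COCOA record format."""
    total_pages: int
    blocked_ranges: list = field(default_factory=list)  # inclusive (first, last) pairs
    block_last_percent: int = 0   # e.g. 25 blocks the last 25% of pages
    block_every_nth: int = 0      # e.g. 4 blocks pages 4, 8, 12, ...

    def is_visible(self, page: int) -> bool:
        """Return True if the 1-based page number may be displayed."""
        if not 1 <= page <= self.total_pages:
            return False
        if any(first <= page <= last for first, last in self.blocked_ranges):
            return False
        # Number of pages at the end of the book that are blocked.
        tail = self.total_pages * self.block_last_percent // 100
        if page > self.total_pages - tail:
            return False
        if self.block_every_nth and page % self.block_every_nth == 0:
            return False
        return True


# The anthology example: block pages 214-216 only.
anthology = VisibilityRule(total_pages=300, blocked_ranges=[(214, 216)])
print(anthology.is_visible(215))  # False

# Publisher defaults: last 25% of a novel, every 4th page of a textbook.
novel = VisibilityRule(total_pages=400, block_last_percent=25)
textbook = VisibilityRule(total_pages=100, block_every_nth=4)
print(sum(textbook.is_visible(p) for p in range(1, 101)))  # 75
```

The point of the sketch is only that each rule is a simple, machine-checkable test on a page number, so every CDS that reads the same rule blocks the same pages.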

It's easy for publishers and authors to use, and easy for CDS's to implement.

COCOA was designed with other media in mind, so that music and video files can be described in terms of what amount could be played for free (say, the first 30 seconds of all songs from a certain publisher or 100% of a specific video).

COCOA is flexible, easy to use, comprehensive, and adheres to the copyright law precept that it's the copyright owners who specify what's visible. With COCOA, publishers and authors can place their work into CDS's with full confidence that only what makes sense to show of their work is what will be shown.

Details of how COCOA works are here.


What are the problems COCOA solves?

There are four primary problems with displaying scanned page images that aren't obvious:

(1) Many copyright owners are concerned that even if security limited one person to looking at five pages per book, there are millions of people using "peer-to-peer" (P2P) file sharing programs, so if a handful of people look at five pages each, it only takes 40 people to download a 200 page book and share the assembled results, undetectable by Amazon. This can be done undetectably with as few as 10 people, and copyright owners are concerned this process may become automated. Amazon actually limits one person to viewing approximately 15% of the pages of one book, not five pages; Google blocks out approximately a random 15% of the pages of copyrighted books from view by anyone; nobody can see more than 85%. Some copyright owners are concerned that people have the ability to view 85% of a book from Google and the remaining 15% from Amazon. They worry this may become easier as more sites display pages, such as Yahoo! and Microsoft and the Open Content Alliance.

(2) Often a few pages have value. Thousands of short stories in anthologies can be read in full on Amazon and Google. Some copyright owners are concerned that, for example, teachers could assign readings this way to avoid having students purchase textbooks, or that reference books, cookbooks, etc. could be used like a free online library, without spending time and money going to the library or bookstore (which may cost $8 or more in time and expenses). Entire books have had to be removed because one author of a work inside had already sold electronic rights elsewhere. Even with 15% per-book page limits, some copyright owners are concerned that many books could lose value when too many pages can be read online. An author or publisher may need to block only a small number of pages, such as to block a poem or photograph or song lyric that they do not have rights to put online. Today, to block just one page from view, Google and Amazon require taking the entire book out of their program.

(3) Copyright owners have the right to control digital publication of their work. Rightly or wrongly, the law is written so that online publication of copyrighted works (beyond fair use) must be authorized by the owner of the rights. Some copyright owners prefer not to have their work online, as is their right. (For whatever reason -- be it that they are concerned about piracy or any other reasons; the law gives them this right.) Any desired changes to this law should be made via Congress.

(4) The "All-e-book" future. It is conceivable that in the not-too-distant future a large percentage of book content will be read electronically. If not via an ebook reader of today, perhaps via some sort of digital paper that allows creation of reading devices that look and feel exactly like ordinary paper but display computerized content. Philips (who invented the CD) and E-Ink are already mass-producing a promising technology. While the economic cost from book piracy today is largely in the form of counterfeit copies of books produced in third world countries, and there is little evidence that online piracy of books causes economic harm, authors and publishers are concerned that an all-e-book future could significantly change that.
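The arithmetic behind problem (1) is easy to check. The sketch below assumes a simple model that is mine, not a description of how any CDS actually behaves: each viewer is limited to a fixed number of pages, and viewers coordinate so that nobody's pages overlap. The function names are hypothetical.

```python
import math


def viewers_needed(total_pages: int, pages_per_viewer: int) -> int:
    """Fewest viewers that can cover a book if each sees a disjoint set of pages."""
    return math.ceil(total_pages / pages_per_viewer)


def assign_pages(total_pages: int, pages_per_viewer: int) -> list:
    """Split pages 1..total_pages into consecutive chunks, one chunk per viewer."""
    pages = list(range(1, total_pages + 1))
    return [pages[i:i + pages_per_viewer]
            for i in range(0, total_pages, pages_per_viewer)]


# The example from problem (1): a 200 page book at five pages per person.
print(viewers_needed(200, 5))     # 40
chunks = assign_pages(200, 5)
print(len(chunks))                # 40

# Together the chunks cover every page of the book.
covered = {page for chunk in chunks for page in chunk}
print(covered == set(range(1, 201)))  # True
```

A per-person limit, however strict, divides the book among more people; it does not by itself stop the pieces from being reassembled.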

These problems -- and several others -- are explained in greater detail here.


How is it different for Google to index a book vs. a web page?

Two ways come to mind:

As far as just creating a searchable index of a book vs. a web page, for a web page, the author put it up on a web site for everyone in the world to come visit. Authors of books for sale generally write them with the hope of getting paid when people buy the book -- they didn't post them on a web site for all the public to read for free. (See also "Isn't browsing a book on a CDS the same as free bookstore browsing or library borrowing?") As stated elsewhere in this FAQ, it's currently being litigated whether it's legal to create a searchable index.

The second issue, completely aside from the first, is that the CDS's have gone beyond a mere searchable index. They display the actual scanned images of the pages, from which it's very simple to extract an entire book. Even if Google wins the lawsuits about creating an index, it will still be illegal for them to allow entire books to be extracted (and that would be a lawsuit I doubt they could win). COCOA solves this problem.


What do you mean when you say current CDS's fail because they're "one-size-fits-few"?

Basically, they're trying for a "one-size-fits-all" solution, and failing.

The CDS's have each offered up a system where they say, "If you don't like how our system works, tough, don't put up your books." They offer no flexibility for a world where each book, and certainly each kind of book (novels vs. textbooks vs. poetry vs. cookbooks etc.), has different needs. Nor are they respectful of different views of authors and publishers about how visible page images should be (whether that be more or less than their rigid systems offer). They all say, "our way or the highway." And besides just being arrogant, that doesn't work for a lot of books.

Today, if you wanted all the pages of your book indexed, searchable, and page-displayable, except for one page, which has, say, a photograph or poem that you have no right to put up on the net for all to see, you have to remove the entire book from the CDS. And not just the page images: removal from a CDS means the book is no longer indexed, no longer searchable, no text snippets, the whole works.

With COCOA, the whole book could be indexed, entirely searchable, including text snippets, and all the page images could be displayed, except for the one page with the problematic photograph or poem.

To block one short story in an anthology, or even part of a story, requires the entire book be removed from all CDS's, indexing/searching and all.

Textbooks, cookbooks, reference books are a similar situation: It may not make sense to the copyright owner to allow free reading of 15% of such a work, whereas it might make sense to the copyright owner to block every 4th page (75% of pages available to any one reader instead of 15%). Current CDS operators don't allow this. Their attempts to use one-size-fits-all "security" systems are really "one-size-fits-a-few."

Adopting COCOA would greatly expand the number of titles in CDS systems, and the number of pages visible in each title.


How will COCOA greatly expand the number of titles and pages in CDS systems?

More titles:

Currently many copyright owners are hesitant to have their books in CDS's because of several concerns:

(1) that CDS's show too much of their work (because of their one-size-fits-few rigidity) thus reducing the value (such as using a reference book online and then not needing to pay for it);

(2) that their works might be stolen (whether it causes economic harm or not, the fear is real, and keeps books out of CDS's);

(3) that they have no rights to display certain pages for contractual reasons (such as a certain poem, photograph, etc.); or

(4) concerns that the CDS's are illegally using their work (authors are notorious for getting angry at illegal rights grabs; understandably, since rights are what ensure authors can pay for food).

COCOA addresses all of these concerns:

(1) Only those pages that the copyright owner is comfortable with would be shown in page image form. If it makes sense to block every 4th page of a textbook to deter online use without paying, they can do that. Whereas today, that book would not appear at all. Consider O'Reilly books, the technical books with the funky animals on the cover. Most O'Reilly books are not present in Amazon's Search Inside, even though Tim O'Reilly is a visionary and generally liberal with electronic rights. It seems likely that a lot of his titles (if not all) would become available if they could exercise page-level control over what's shown.

(2) Some people believe COCOA inhibits piracy from CDS's since all CDS's would have the same pages -- it wouldn't be possible to assemble a book from any or all CDS's (if the copyright owner didn't permit it; they could, of course, choose to make 100% of their pages available if they wanted).

(3) COCOA allows specifying "all pages of this book except for page 42", if page 42 had a photograph or poem which the copyright owner lacked permission to display on a CDS. Today, if just one photograph or poem can't be shown, the whole book has to be pulled from a CDS. COCOA solves that silliness.

(4) Lastly, COCOA is the legal, copyright-oriented means by which a copyright owner would willingly convey the rights for their work to be displayed on a CDS -- it makes the display 100% legal, with their permission. Authors who have control over permissions are enormously more likely to permit use in a CDS than authors who feel their rights are being stolen from them.
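Point (2) above is really a statement about set unions, and a small Python sketch can illustrate it. This is a simplified model of my own, not a description of any CDS's real behavior: two sites each hide their own randomly chosen 15% of a 200 page book, versus two sites that both apply the same owner-specified block list. All names and page numbers here are hypothetical.

```python
import random

TOTAL_PAGES = 200
ALL_PAGES = set(range(1, TOTAL_PAGES + 1))


def random_hidden_pages(hidden_percent: int, seed: int) -> set:
    """Pick a site's own random set of hidden pages (illustrative model only)."""
    rng = random.Random(seed)
    hidden_count = TOTAL_PAGES * hidden_percent // 100
    return set(rng.sample(range(1, TOTAL_PAGES + 1), hidden_count))


def combined_visible(visible_sets) -> set:
    """Pages obtainable by visiting every site: the union of what each shows."""
    return set().union(*visible_sets)


# Uncoordinated: each site hides a different random 15%.
hidden_a = random_hidden_pages(15, seed=1)
hidden_b = random_hidden_pages(15, seed=2)
uncoordinated = combined_visible([ALL_PAGES - hidden_a, ALL_PAGES - hidden_b])
# Only pages hidden by BOTH sites stay hidden.
print(len(uncoordinated) >= TOTAL_PAGES - len(hidden_a))  # True

# Coordinated: both sites apply the same owner-chosen block list.
owner_blocked = set(range(151, 201))  # hypothetical: block the last 50 pages
coordinated = combined_visible([ALL_PAGES - owner_blocked,
                                ALL_PAGES - owner_blocked])
print(len(coordinated))  # 150
```

When every site shows the same pages, visiting more sites gains a reader nothing beyond what the copyright owner chose to show.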

Thus, with all of these concerns addressed, copyright owners will feel comfortable putting a great deal more of their titles up on CDS's.

More pages:

Beyond the obvious fact that more titles means more pages (all those pages from the books that are currently not in a CDS at all), there is the increase because customers would be freed from the existing (largely ineffective) page viewing limits that each CDS imposes.

That is, Amazon, for example, puts the following roadblocks in your way to viewing pages:

  • Requires you to have a credit card to view pages (ostensibly a credit card that has made a purchase on Amazon). That keeps out a lot of people.

  • Once you can see a page image, you're restricted to seeing two pages forward and back.

  • Amazon tracks how many pages you've viewed in a book, and once you hit around 15%, they prevent you from seeing any more pages in that book.

  • If you've seen "too many" pages in "too many" books, they prevent you from seeing any pages in any book, ever again. Zero pages, forever.

  • Google likewise requires logins and uses a convoluted cookie scheme to track what you've seen. They have the same two forward/back roadblock. Again, when you've seen "too many" pages in a book, you're locked out. Google also prevents anyone from seeing about 15% of randomly selected pages in copyrighted books; those pages just aren't viewable; they aren't in the page image database, as it were. It may be that they're the wrong 15% of pages, but the copyright owner has no control over that.

  • Right now you're not allowed to share, download, or print any of the page images.

So, if you follow the rules (which are, alas, easily broken, as noted elsewhere), you can't see more than five pages in a row, and not more than, say, 15% of the pages in any given book. Hit that 15% limit too often, and you're locked out for good. That's a very severe limit on what pages you can see.

Via COCOA, copyright owners can specify what makes sense to them, for their books. They could authorize seeing 100% of pages without limitation, as many authors would like to do (but aren't allowed to!). They could offer 99.9% of pages of a book, blocking just a few that are problematic.

More to the point, the credit card, login, five-pages-in-a-row, and "15% max" limits can be eliminated. You could see every page in a book that a copyright owner has authorized. If they've authorized 100%, you can read the whole thing online (and probably share it with others). Why? Because the copyright owner authorized it.

A survey conducted among authors showed that the vast majority of authors would permit more pages per book than CDS's currently allow.

Moreover, consider if Google wins the lawsuits against it: Google will still be limiting the number of pages you can see, five consecutive, etc. With COCOA, those restrictions are unnecessary, and you'll be able to see more pages. If Google loses the lawsuits, copyright owners will have to authorize indexing, and COCOA makes that easier than each copyright owner having to make a separate arrangement with each CDS. That only results in more titles and more pages than now.

No matter how you slice it, the net result is that you will be able to see not just pages in books you can't see any of now, you'll be able to see far more pages in existing books. You'll be free to see every single page the copyright owner has authorized.

At present, you are severely limited in how many pages you can see. COCOA increases the number of pages you can see dramatically.

Bottom line: In all scenarios, COCOA considerably increases the number of titles and pages you can see.


How does COCOA increase the amount of indexing and searching of copyrighted materials?

First, as noted elsewhere, COCOA does not prevent indexing or searching or the display of text snippet search results. In fact, it will increase them. Here's how:

Currently, indexing/searching is inextricably tied to displaying page images. Thus, if a copyright owner keeps their work out of a CDS (for example, because they feel it shows too many pages, not because of the indexing), that book is no longer indexed or searchable at all.

With COCOA, a book could be entirely indexed, entirely searchable in terms of reporting what page a word or phrase appears on, and even showing that in a text snippet, but, if it's the copyright owner's desire, not have the page image be displayed.

In fact, it's possible books could have 0% of page images available but be made searchable. COCOA allows decoupling indexing/searching vs. page image display in a way that is not now possible.
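Here is a minimal sketch of that decoupling, assuming a CDS keeps two separate things per book: a word index and a set of pages whose images the owner allows. The `BookEntry` class and the function names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BookEntry:
    """Hypothetical CDS record: the index and the image policy are separate."""
    index: dict = field(default_factory=dict)          # word -> list of page numbers
    viewable_pages: set = field(default_factory=set)   # pages the owner allows as images


def search(book: BookEntry, word: str) -> list:
    """Report which pages a word appears on; never consults the image policy."""
    return book.index.get(word.lower(), [])


def can_show_image(book: BookEntry, page: int) -> bool:
    """Page images are shown only where the copyright owner has allowed them."""
    return page in book.viewable_pages


# A book with 0% of page images available that is still fully searchable.
book = BookEntry(index={"aardvark": [23, 42, 77]}, viewable_pages=set())
print(search(book, "Aardvark"))   # [23, 42, 77]
print(can_show_image(book, 23))   # False
```

Because `search` and `can_show_image` do not depend on each other, an owner who objects to page images has no reason to pull the book out of the index.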

Thus, substantially more books could be indexed and made searchable, if only the copyright owner had control of page image display rules -- which is exactly what COCOA does.

COCOA does not inhibit indexing and searching -- indeed, COCOA means more indexing and searching of copyrighted material.


How does COCOA help the blind and visually impaired?

Unfortunate victims in all this are the blind and visually impaired. There's a non-profit, charitable organization called BookShare.org that provides books in digital form to those with visual or other print disabilities. BookShare.org operates under the "Reproduction for the blind" section of US copyright law, which says they can offer such a service if they meet certain criteria.

BookShare.org offers text-to-speech software, etc. Having access to the books all these CDS's have scanned would increase BookShare.org's collection more than ten-fold -- a gigantic improvement for the visually impaired.

The problem is that Google and Amazon won't provide their scanned images to BookShare.org because of all the problems flying around. I've been told that if we can solve this problem -- to which COCOA is the solution -- then the roadblocks in BookShare's path are removed.

(In the meantime, copyright owners can use COCOA to provide explicit grants to BookShare.org; but the real benefit is when the CDS's adopt COCOA.)


How does COCOA compare to Creative Commons, etc.?

Below is a table comparing COCOA to Creative Commons, GNU GPL, and Sun's open DRM (DReaM). In brief, Creative Commons and GNU GPL are used to grant certain, very specific rights, primarily focused on making works freely copyable; DRM (such as DReaM) is a means to enforce rights; whereas COCOA is a means to (a) specify any rights (not just specific ones), (b) distribute that information to users, and (c) verify the authenticity of a license. COCOA can be used to specify CC or GPL rights, but can also specify many other kinds of rights, both commercial and non-commercial. COCOA is flexible enough to be used as a DRM system, though that is only one small aspect of it.

Here is a point-by-point comparison:

| Feature | COCOA | Creative Commons | GNU GPL | Open DRM |
|---|---|---|---|---|
| Purpose | Framework to specify & distribute rights and license information. | Set of licenses; many, but targeted to specific tasks, centered around free copying for non-commercial purposes | Set of requirements to ensure work (typically software) can be freely used and modified | DRM is a means to enforce rights to protect content (generally commercial). Open Media Commons DReaM is an attempt to make DRM more widely available; also a set of software to help others create reusable content. DReaM is not yet implemented. |
| Flexible rights / Works with any rights | Yes, can specify any kind of rights; has some predefined ones for common purposes | No, each CC license grants specific rights, cannot be extended/modified | No, each GPL license grants specific rights, cannot be extended/modified | Yes(?), since is software it should allow one to control any digital rights. |
| Requires software | No, can be implemented by humans off-line or via software | No, can be implemented by humans off-line or via software | No, can be implemented by humans off-line or via software | Yes, requires software to enforce rights; intended for use embedded in devices |
| Operational | Yes | Yes | Yes | No, still being designed |
| Helps locate owner (to negotiate other uses) | Yes, maintains contact database for rightsholders | No; identifies author but offers no contact information | No; identifies author but offers no contact information | Not a designed purpose, but may be able to use for this |
| Maintains searchable database of content | Yes; any content can be included | Only CC content | No | Not a designed purpose |
| Sampling | Yes, allows fine-grained control and offers machine-readable definitions of how to create samples | Limited; no definition of what constitutes a "sample" | No | Should be possible to write software to allow sampling |
| Can use to make "public domain" grants | Yes | Yes | No | Should be able to |
| Jurisdictions supported | Can specify any country, list/group of countries, or global. Works with laws of any country or could be used to specify country-specific rights | Applicability of detailed contractual language may vary by country; has some licenses for a limited number of specific countries | No country-specific wording | Unclear; difficult to implement in software |
| Can authenticate licenses (prevent "forged" licenses) | Yes, includes credential information and has server(s) one can use to verify authenticity of a license | No | No | Yes; this is the primary purpose of DRM |
| Layered rights / covers multiple works at once as well as individual works | Yes; an author and/or publisher can specify rights for all their works, then layer overrides on top, for specific groups of works or individual works (example: all books by a publisher - allow reading 75% for free; all textbooks by same publisher - deny reading every fourth page; specific book: allow reading 100%) | No; individual work only | No; individual work only | No; individual work only |
| Open standard | Royalty free - Yes; Modifiable / extensible - Yes | Royalty free - Yes; Modifiable / extensible - No (creators cannot modify rights language) | Royalty free - Yes; Modifiable / extensible - No (creators cannot modify rights language) | Royalty free - Yes; Modifiable / extensible - Yes |
| Allows time-limited grants | Yes | No | No | Yes |
| Requires creator to share work non-commercially | No, can be used for both free and/or commercial licences | Required (is the intent); sampling licences can be used to limit free sharing to just samples | Required (is the intent) | No, intended for commercial use, could be used for non-commercial licenses |
| Requires allowing users to make/distribute derivative works | Not required; terms set at owner's choice | Not required; terms set at owner's choice (can choose "no derivatives" licenses) | Required (is the intent) | Not required; terms set at owner's choice |
| Requires inclusion of license in content itself | At owner's choice (licenses can be embedded in content if desired; or granted separately, such as to avoid altering existing content or physical products) | Yes, license must accompany content, per the license | Yes, license must accompany content, per the license | Yes, must be present to be processed by software |
| Target property | Any property (geared toward digital/intellectual property but could be used for tangible property as well) | Artistic/digital creations (text/video/audio/software; inapplicable to other kind of property) | Software, documentation (some applicability to other artistic creations, inapplicable to other property) | Any (though unclear how applies to tangible goods without software) |
| Both machine and human readable license | Yes, both human and machine readable, in XML and plain-text forms | Yes, both human and machine readable, in XML and plain-text forms | No, human readable only | Unclear - Machine readable, unclear if produces human readable licenses |
| Ability to enforce content access restrictions | Yes, at owner's choice, not required | No | No | Yes, Required |
| Specific grantees | Yes, can specify who license is granted to, or grant to the public | No, can only grant to the public | No, can only grant to the public | Yes, can specify who license is granted to, or grant to the public |
| Allow multiple grantors | Yes, content with multiple rightsholders can grant rights at separate times, and multiple rightsholders can be authenticated via multiple digital signatures on the license | No, all rightsholders must act as if one single rightsholder | No, all rightsholders must act as if one single rightsholder | Yes (assumedly) |
| Defines specific algorithms for sampling | Yes, includes many useful algorithms to define what a "sample" is for a given work; other algorithms may be added | No; no definitions of what constitutes a "sample" of any kind | No; no definitions of what constitutes a "sample" of any kind | Not addressed, but assumedly could, since is software-based |
| Unique ID for each work for tracking purposes | Yes (a COCOA-ID) | No | No | Not addressed, but assumedly could, since is software-based |
| Includes product ID (ISBN, SKU, etc.) for tracking outside the rights system | Yes, can include ISBN, SKU, etc. | No, generally not included, would have to add manually | No, generally not included, would have to add manually | Not addressed, but assumedly could, since is software-based |
| Source of grant | Works with any (the COCOA record itself can be the legal source of the grant, or it can contain embedded copies of contracts, pointers to existing contracts, Creative Commons/GPL/etc. licenses, etc.) | Only works for Creative Commons licenses | Only works for GNU licenses | Not addressed, but assumedly could, since is software-based |
| Includes content description | Yes, can include description of content (and/or sample, to help identify work) | Yes, XML record can include a description | No | Unknown |
| Media types supported | Any | Any | Primarily for software and documentation | Any |
| Facilities to link to external systems (such as other rights control systems) | Yes | No | No | Yes |
| Can use license to locate content itself | Yes; a license that is separate from its content can point to an authoritative source for the content | Limited; has URL to source but generally doesn't permit licenses separated from content and lacks authentication to ensure authoritative source for content | Very Limited; creator can include location of source but generally doesn't permit licenses separated from content and lacks authentication to ensure authoritative source for content | Yes |
| Allows confidential grants (to allow private grants to be distributed via public databases, or keeping contractual terms private, etc.) | Yes; if desired, any/all fields in a record can be encrypted so are only readable by grantee | No | No | Unknown |
| Searchable database | Yes, searchable for author, title, type of work, use; entire database public, to enable any kind of search desired (via own search engine as well as Google, etc.) | Yes, searchable for author, title, certain specific uses (via Google) | No | Not addressed, but assumedly could, since is software-based |
| Authentication/enforcement at time of use | Yes, if desired | No | No | Yes, is primary purpose |
| Allows rights to vest with person, not just device | Yes | Yes | Yes | Yes (typical DRM vests with a device, DReaM can vest with a person) |
| Level of legal binding | Can bind legally, voluntarily, or combination | Legally binding only | Legally binding only | Not addressed, but assumedly could allow either, since is software-based |
| Dynamic licenses (change over time, e.g. to allow subscriptions or other conditions) | Yes | No | No | Yes |
| Work with rights other than copyright (e.g. patents, contracts) | Yes, any rights | No | No | Not addressed, but assumedly could, since is software-based |
| Interoperability | Can work with other licensing systems so long as it can be described in plain text (such as Creative Commons licenses, GNU); can work with others e.g. DReaM, if users have the other systems' necessary software installed. | No | No | Unclear; may allow writing modules for other license frameworks. |
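The "Layered rights" row in the table is the one that benefits most from an example. Below is a minimal Python sketch of the override idea, using the same three layers as that row. The lookup scheme, the key shapes, and the names ("Acme Press", "BOOK-001") are all invented for illustration and are not taken from the COCOA specification.

```python
def resolve_rule(layers: dict, publisher: str, category: str, book_id: str):
    """Return the most specific rule: book first, then category, then publisher."""
    candidates = (
        ("book", book_id),
        ("category", (publisher, category)),
        ("publisher", publisher),
    )
    for key in candidates:
        if key in layers:
            return layers[key]
    return None  # no grant recorded at any layer


# Hypothetical layers matching the example in the table.
layers = {
    ("publisher", "Acme Press"): "allow reading 75% for free",
    ("category", ("Acme Press", "textbook")): "deny reading every fourth page",
    ("book", "BOOK-001"): "allow reading 100%",
}

print(resolve_rule(layers, "Acme Press", "novel", "BOOK-777"))
# allow reading 75% for free
print(resolve_rule(layers, "Acme Press", "textbook", "BOOK-555"))
# deny reading every fourth page
print(resolve_rule(layers, "Acme Press", "textbook", "BOOK-001"))
# allow reading 100%
```

The design choice is simply "most specific wins": a publisher sets a default once, and only the exceptions need records of their own.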

For more information, visit...


How would I use COCOA in conjunction with a Creative Commons, GNU, or other kind of license?

Very simply: When filling out the form to create a COCOA record, (1) choose Grant Source as "Per Creative Commons or GNU license" and (2) include a copy of the license text or URL pointing to it.



Common Misconceptions


Won't COCOA prevent a CDS from indexing and searching books?

Basic answer: NO.

Adopting COCOA will not prevent indexing and searching. (It will, in fact, cause an increase.)

This is a slightly more complex question because the AAP and AG have sued Google over this, so you'll have to pay close attention here: Because of these lawsuits, it's now for the courts to decide if it's legal for Google to (a) make an index of all the words in a book, (b) make that index available online, then (c) present (say) 20-word "snippets" of the original text as part of the search results. (How the courts will rule I can't guess. As Charlie Petit notes, "One of the rights of copyright holders is the preparation of indices of their works." So the courts could rule that Google can't index copyrighted material without permission. However, the courts might also find that they can -- VCR taping was considered illegal until Sony challenged and won. So who knows?)

This is NOT a statement of desirability, whether I want that to be legal or not -- it's a statement of fact that this is being litigated, and the resolution is not clear.

Now.

COCOA has nothing per se to do with preventing indexing. Let's suppose, purely hypothetically, that the courts ruled that Google's indexing/searching was legal. Play along with me. The problem still exists, because COCOA is about who has complete copies of books, and indexing and searching don't require storing complete copies of books:

You don't need to make a copy of a book to build an index. You just write down every page where every word occurs. Maybe you write down a few words of context around each word. But you don't have to store all those words in order of the book itself. This is not splitting hairs: A file with all the words that appear in this web page, listed in sorted order, "a a a a a ... aardvark aardvark ... about about about", is NOT the same as this web page.

You don't need to make a copy of a book to return a search result, such as, "the word 'aardvark' appears on page 23 of this book." You could do that entirely without ever making a copy of the book. (You could make a list of words and what page they appear on: 'aardvark', page 23, 42, 77... 'about', page 18... 'the'...)
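The word-to-page list just described can be sketched in a few lines of Python. This is only an illustration of the idea, not a claim about how Google or Amazon build their indexes; the function name and the three sample pages are made up.

```python
import re
from collections import defaultdict


def build_page_index(pages: dict) -> dict:
    """Map each word to the sorted list of page numbers it appears on.

    `pages` maps page number -> page text. The result keeps no word order
    within a page, so it is not a copy of the text.
    """
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(page_number)
    # Sort words alphabetically and page numbers numerically.
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}


# Made-up sample pages for illustration.
sample_pages = {
    23: "Aardvarks are fun and frolic in the sun",
    42: "He gave the aardvark a cookie",
    77: "The aardvark ambled about the apartment",
}
page_index = build_page_index(sample_pages)

print(page_index["aardvark"])   # [42, 77]
print(page_index["the"])        # [23, 42, 77]
print(list(page_index)[:4])     # ['a', 'aardvark', 'aardvarks', 'about']
```

Answering "what page is 'aardvark' on?" needs only this dictionary.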

You could return "snippets" (say 20 words of text) around every word as search results without storing a verbatim copy of the book. How? You could have a database that looks like this:

		...
		aardvark:
			page 23
			"aardvarks are fun and frolic in the sun" [20 words max]

			page 42
			"he gave the aardvark a cookie"

			page 77
			"the aardvark ambled about the apartment"
		...
		about:
			page 18:
			"Bob said, "That's about it," and the gate opened"
			...

		...
		the:
			...

That doesn't involve storing a copy of the book. That's an index.

So a CDS could do all the above without making a copy of a book. (Whether Google or Amazon etc. do it that way is another question. But remember we're working on the hypothetical assumption for the moment that indexing/searching itself is found legal.)

The serious problem arises when a CDS allows others to use its system to make a copy of copyrighted works without permission.

It was originally possible to extract a complete text copy of a book from these systems by searching on certain words then building a chain of all the words in a row -- extracting the whole book. When we demonstrated this, they made changes to prevent it. So that hole is closed, and not an issue as far as I'm concerned. (Any future CDS would have to likewise prevent plain text copies from being built, of course.)

However, CDSs also let you see the scanned images of those pages. As documented elsewhere, it's extremely simple, and easily automated via software, to extract all those page images. That gets you the whole book -- a copy that wasn't authorized by the copyright owner. (And it has nothing to do with indexing or searching.)

THIS is what COCOA addresses. Page images. COCOA lets the copyright owner specify which page images can be shown. From 0% to 100%, in assorted convenient ways, which should have the net effect of making more pages of more books available today. But nothing per se to do with indexing or searching.

Now, that's not quite true, as COCOA does relate to the lawsuits over scanning/indexing/searching/snippets in the following two ways:

(1) If scanning-to-index-and-providing-snippets is ruled illegal [that's IF! folks, IF!], then Google will have to obtain permissions to scan-to-snippet, and copyright owners could choose to use COCOA to grant scan-to-snippet rights.

(2) With the legality of scan-to-snippet up in the air, copyright owners could use COCOA today as a simple way to grant that right, if they wanted Google to scan/index their work without any doubt about whether Google has the right; if scan-to-snippet is later ruled to be fair use, then the grant is moot and harmless.

But, by and large, COCOA has nothing to do with indexing, searching, or showing text snippets. COCOA has to do with showing scanned page images. Indeed, COCOA will increase the amount of copyrighted material that can be searched.

[Top]

Isn't browsing a book on a CDS the same as free bookstore browsing or library borrowing?

It's different for these reasons:

(1) The library paid for the copy you're borrowing. (Or somebody paid for it, in case the book was donated to the library.) Thus the author was paid for that copy. If you read a whole copyrighted book via a CDS and never buy the book, the author wasn't paid. Copyright law is about creating new copies; you're not creating a new copy when you read in a store or from a library.

(2) Browsing in a bookstore is pretty inconvenient. You can't take the copy with you to look at any time you want. (Unless you buy it! That's sort of the point.) Bookstores know that few people really read entire books in the store -- else they'd go out of business. However, reading a book from a CDS doesn't have that limitation: You can take it with you, on your laptop, etc. This is particularly critical in light of digital paper, when the digital copy is the paper copy.

(3) Library and bookstore reading isn't anywhere near free: You have to move your physical body to the library or bookstore to read. For one thing, you likely can't do that at 3am. (And certainly not in your pajamas.) You can't do it from your bed, couch, or desk, without getting up. You have to spend time to move your body down there, which might be 10min-30min each way; 20-60min round trip, plus say 10min to find the book, a place to sit, etc; call it 30-70min. If you value your time at, say, $10/hr, that's $5-12. Then there's the cost of transportation. If the library/bookstore is three miles away, 6mi. round trip, and gas costs $2.50/gal., and you get 20mi/gal., that's another $.75. The IRS figures driving a car costs $.405/mile in repairs, wearing it out, etc., so that's another $2.43. So you're at something like $8-15 to go read a "free" book.

Really -- if it were that free, people would do a lot more of it.

Yet reading a free copy from a CDS doesn't have those limitations. It is much closer to $0, actually and truly free. THAT's the problem.

(4) Your library or bookstore might not have a physical copy on hand. Then you're either spending more time/money to move your body to another library/bookstore that has a copy, or you don't get one. Whereas, a copy can always be available online. Another example of libraries "costing" more to borrow from.

(5) You can't pass on a "free" copy you read in the store or from the library. You have to leave the book at the bookstore (or buy it); you have to return the book to the library. Reading a book in digital form that was stolen from a CDS, you could pass that copy on to others by email, via a web page, P2P software, etc.

So, bottom line, bookstore/library reading is fine and dandy, since it isn't really free. CDS copies are essentially free, and that's the problem. They're too convenient to read free. While this may not be an economic harm today (and may in fact generate sales of print copies of books, today), the concern is for the future, should most reading be done digitally, when illegally obtained free copies are then as valuable as legally obtained ones.
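For anyone who wants to check the back-of-the-envelope estimate in point (3), here is the same arithmetic as a short Python sketch. The figures are the ones given in the text; the constant and function names are made up for the example, and gas and the per-mile rate are kept as separate line items because that is how the text adds them up.

```python
HOURLY_VALUE = 10.00       # dollars per hour you value your time at
SEARCH_MINUTES = 10        # finding the book, a place to sit, etc.
ROUND_TRIP_MILES = 6       # three miles each way
GAS_PRICE = 2.50           # dollars per gallon
MILES_PER_GALLON = 20
IRS_RATE_PER_MILE = 0.405  # per-mile figure quoted in the text


def trip_cost(one_way_minutes):
    """Estimated dollar cost of one trip to read a "free" book."""
    minutes = 2 * one_way_minutes + SEARCH_MINUTES
    time_cost = minutes / 60 * HOURLY_VALUE
    gas_cost = ROUND_TRIP_MILES / MILES_PER_GALLON * GAS_PRICE
    wear_cost = ROUND_TRIP_MILES * IRS_RATE_PER_MILE
    return time_cost + gas_cost + wear_cost


low = trip_cost(10)   # 10 minutes each way
high = trip_cost(30)  # 30 minutes each way
print(f"${low:.2f} to ${high:.2f}")  # $8.18 to $14.85
```

That lands on the "something like $8-15" range quoted in point (3); change any of the constants to match your own commute.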



Critics Addressed

[Top]

McDaid's humorous parody -- funny, but misses the point

I enjoyed John McDaid's humorous parody of the COCOA call to action, though of course it entirely misses the point :-). Having a copy of a book to read is precisely the legal right paid for with the purchase (whether you purchased it, a library purchased it, or somebody else did), whereas there's no purchase and no conveyance of rights if someone uses software to obtain a copy from a CDS's database of images.

[Top]

BoingBoing bounces off-target

BoingBoing perpetuates the McDaid link, and goes further astray by decrying COCOA for preventing indexing and searching. As noted above, COCOA does not prevent indexing or searching or the display of text snippet search results. Cory Doctorow said in private email that he would amend his statement (though he hasn't yet).



Miscellaneous

[Top]

What's a "copyright liberal" vs. a "copyright conservative"?

A "copyright conservative" would be someone who is highly protective of copyrights and defends them vigorously; I'd say Harlan Ellison is an example. A "copyright liberal" is someone who is open to trying things and not as worried about piracy, etc.; I'd say someone like Eric Flint of Baen Books (creator of the Baen Free Library) is an example. (As for the makeup of the group that drafted COCOA, Harlan Ellison's attorney and Eric Flint are both on the committee.)

[Top]

Isn't it legal to read a CDS copy if I own a print copy?

That's an interesting question. I don't know if it's been litigated yet. I see both sides of this. Pro: If I already own a copy, this is like me making a portable, personal copy for quick use elsewhere. Con: If you buy a paperback copy, you aren't entitled to a hardback copy for free.

A CDS copy of a book adds a lot of features, like searchability, cut&paste, and reading via an ebook reader (or future digital paper books), that you didn't pay for when you bought a paperback. Some of those might be ruled fair use by the courts, but, as I said, I don't think these have been litigated yet. Amazon.com is allowing access to online copies of books you've bought, for a price. So all this is up in the air right now.

Regardless, a CDS can't make an online copy of a whole copyrighted book available to those who haven't paid for it, without permission of the copyright owner.

[Top]

Is COCOA hard for CDS's to implement?

It should be really simple for them. Basically COCOA identifies books and pages within those books -- what can be shown, what can't be.

It should be a simple matter to keep the scanned JPEG images of blocked pages from being served out of the CDS's database of page images. (In pseudocode: "if (page_is_not_blocked(page_number, book_id)) show_page()...")
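To make that check concrete, here is a minimal Python sketch. It only illustrates the idea and is not taken from the COCOA protocol itself: the `BLOCKED_PAGES` table, the `book-123` identifier, and the image path format are all hypothetical, and treating a book with no entry as fully viewable is an assumption made for the example, not a statement of COCOA's default.

```python
# Hypothetical table of blocked pages: {book_id: set of blocked page numbers}.
# In practice this would be filled in from the copyright owner's instructions.
BLOCKED_PAGES = {
    "book-123": {5, 6, 7},
}


def page_is_not_blocked(page_number, book_id, blocked=BLOCKED_PAGES):
    """Return True if the page image may be shown.

    Assumption for this sketch: a book with no entry has no blocked pages.
    """
    return page_number not in blocked.get(book_id, set())


def show_page(page_number, book_id):
    """Return the (made-up) image path for a viewable page, else None."""
    if page_is_not_blocked(page_number, book_id):
        return f"{book_id}/page-{page_number}.jpg"
    return None


print(show_page(4, "book-123"))  # book-123/page-4.jpg
print(show_page(5, "book-123"))  # None
```

The point is only that the gate is a single lookup per page request; everything else about how a CDS stores and serves images stays as it is.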

[Top]

What rights are involved in CDS displays?

Well, here's a problem area, as this is not clear. All rights originate with the author of a work. The author then signs a contract conveying certain rights to a publisher; but rarely do they convey all rights. Typically a book contract will call for "electronic rights", meaning the publisher has the right to produce an electronic edition of the book. However, what rights are actually conveyed to whom is in no way standard. For example, an author of a short story in an anthology may not have sold electronic rights to the publisher of the anthology, yet the publisher may have conveyed electronic rights to the entire anthology to a CDS for display.

Another avenue some CDSs have mentioned is the use of marketing rights. That is, using small portions of a work in order to market the whole. This could be problematic for a CDS when the majority or entirety of the work can be viewed by multiple users.

Some publishing contracts with authors spell out specific rights for CDS usage.

Authors may wish to include specific language in their contracts so there are no miscommunications. References to COCOA-created grants may also make sense in this context.

[Top]

How come people signed the petition multiple times?

They really like it!

But seriously, it's a function of browsers and the petition site. If you go 'back' in your browser or reload the signing page, you risk getting signed up twice. There aren't that many duplicates, and they're very obvious if you take the whole list and sort it by name.

[Top]

What can I do to help?

Urge Google, Amazon, Microsoft, Yahoo!, the Open Content Alliance, etc. to adopt COCOA. The two best ways to do this are:

1) Please SIGN THE PETITION at:

http://new.petitiononline.com/cocoa/petition.html

It's worded for brevity; the details are at the COCOA web site: http://www.CopyrightAccess.com

2) Please SPREAD THE WORD: Urge others to sign the petition, learn about COCOA, and likewise encourage others to sign the petition, spread the word, and urge yet others to..........

Please post on your blogs, tell journalists you know, put links on your web pages, etc.

THIS LINK has a brief 'call to action' you can copy and spread around. Thanks!


Content © 2005 copyrightaccess.com | Design © 2002 webkits4u.com