A perspective from a data publisher
Trevor Fenwick, Chairman, Data Publishers Association
Abstract: Data publishing, where professional content is created, selected, collated and delivered as databases to professional and academic users, has been a major adopter of digital technology.
Data publishers are now also business service providers and software application designers. User agreements typically rely in the first instance on copyright law to protect that investment, but increasingly contracts include user-specific terms, conditions and licences reflecting the bespoke nature of the relationship.
At the same time, digitally enabled copying of music and audiovisual content has led to pressure to recognise this by modifying existing copyright legislation.
Data publishers find themselves in a dilemma: on the one hand seeking additional protection for investment in database structures, content and delivery platforms, yet on the other seeking to allow legitimate users access to and use of technical and professional information.
New business models are emerging which reflect the disintermediating impact of digital technology on traditional intermediaries including publishers, booksellers and libraries.
Why copyright remains important
The debate on copyright has been brought about and dominated by a discussion of the issues surrounding consumer use and re-use of content, usually music, film or video, more often than not obtained digitally.
There is, however, a significant and no less important sector of the UK creative economy where copyright is as relevant but one which has had little recognition or consideration in the discussion on the importance of copyright in the digital economy.
This sector is part of the publishing universe occupied by data publishing, specifically professional, academic and business information publishing, the world of “need to know” rather than “nice to know”.
In this part of the publishing world content is created and published to inform rather than entertain the user. This is a changing market. Publishers now create and own the content and are increasingly software developers, service providers and systems integrators. The role of librarians, as the purchasers and gatekeepers but rarely the final users of the content, is threatened by the growth of near-universal internet access and the impact of search engines.
Data publishers have invested heavily in the transition to digital publishing and distribution of content but are caught in the crossfire of a debate which takes little account of the complexities, legal and technical, of a market where content is delivered to be used and copied.
However, the debate remains largely focussed on whether restrictions on copying by consumers are “criminalising” the millions of downloaders and file sharers. These activities are only “illegal” if they lead to commercial piracy. The emotion deliberately generated by the “copy-left” protagonists ignores the damage which any general relaxation of copyright would do to most of the creative industries. In focussing on these populist aspects of the debate, many important issues do not get the serious discussion they deserve.
Those promoting this debate are not without their own economic self-interest. The advent of the digital economy has created more opportunities than threats. Internet service providers (ISPs) and search engines are dependent on access to, or the distribution of, free-flowing content for the success of their businesses. The often unauthorised diversion of remuneration from the rightsholder to the service provider or ISP, as a result of reduced content sales or advertising revenues, is not sustainable if new, quality content is to continue to be produced and made available. Libraries are also increasingly active in this debate as they seek to protect their role as information disseminators while users stay away from physical establishments. The actions of libraries in seeking further IP exceptions should be seen in this context, as part of a self-preservation policy to compete with the increased use of search engines, which have not respected IP rights to the same extent as libraries, as a means of finding and accessing information.
What is being protected? Creativity and expressions of ideas, yes; but the creation of factual data and of the indexes, schemas and metadata structures that organise it is no less creative, and involves no less an investment, than the creation of a piece of literature or music. Are the investments in these essential parts of the creative process of data publishing properly protected?
Data Publishing: Copyright, creativity and investment in factual content
It is important not to confuse the business challenges facing “infomediaries” (publishers, printers, libraries and search engines) with a failure of copyright.
Data publishers are at the leading edge of a shift in the value chain. Whilst content creators are and continue to be preeminent, it is now publishers and other elements in the distribution chain who are challenged. The history of publishing can be viewed as the story of the impact of changing technologies on how the written word is communicated. The Statute of Anne in 1710 recognised the rights and claims of content creators as opposed to the owners of the then dominant technology, printing, and in doing so enabled publishing to prosper. As new technologies evolve they invariably cause stresses in existing business models. In the last twenty years technology has been in the ascendancy. There is now a growing realisation that technology without content, or technology which destroys content creation, is unsustainable. Technology companies recognise this and have started to acquire content by any means, initially with disregard to the rights holders’ interests, to satisfy the appetite of their applications for deep and rich content.
Business models are rapidly changing to reflect the impact of those technologies. In this digital world those involved in the “traditional” publishing value chain have all been, to a greater or lesser extent, affected by the impact of new technology: content creation (authors); selection and the adding of value through the payment of advances and guarantees and the application of financial and intellectual capital such as branding, editorial interaction, design, printing and marketing (publishers); and distribution (retailers and librarians). The current debate is part of the process by which rights holders and users establish those new business models and achieve a fair balance of rights to access and use which does not undermine the legitimate economic rights of the rights holders.
Authors have been empowered by these changes. Self-publishing, digital marketing and promotion in the print and music sectors have enabled authors to build their own businesses and achieve advancement in their professions. Some of these content creators have taken a less rigid approach to copyright, taking the view that building a reputation by allowing copying will allow revenue streams to be developed in other areas. These dynamic new business models are challenging the traditional publishing models.
Traditional publishing has been severely challenged by digital technology. Lack of anticipation and lethargic reaction have meant that many consumer-facing publishers (fiction, non-fiction, magazines and newspapers) are struggling to adapt to and rebuild revenue streams in a world where content is accessed online (legally and illegally); advertisers are paying search engines rather than content owners; traditional routes to market (high street booksellers) are challenged by supermarkets and online retailers; and broadcasters, some using public funding, are offering comprehensive alternative content free at the point of access.
Publishers of business, technical and academic content, although subject to the same macro changes, have turned these changes to their advantage. The changes brought about by the development of digital technology to create, organise and deliver content direct to users are an early indicator of the benefits that can be derived in this exciting new world.
In this information sector content is still created, usually by authors but increasingly by the selection and arrangement of empirical data systematically collected from external data sources. This is combined with other data and content, licensed or freely available from public sources, then normalised, aggregated and delivered to users with software and access interfaces that allow them to select and arrange the content to meet their specific needs.
Copyright protects the exclusive right of creators in their creative work. Ideas are not protected by copyright, only the expression of an idea. Facts are held to be beyond copyright. Yet what is a fact? Some are obvious, such as names and addresses; others are less so. Statistics and data, other than those established by empirical measurement, are often held to be facts. This is rarely the case. In reality most such “facts” are estimations based on samples adjusted and calibrated by an analyst, and the production of statistics and data is no less a creative process than the production of any other form of copyright material. It is ironic that the most successful copyright defence tool data publishers have been able to deploy is that of implanting erroneous data, a false name or made-up address, into a database.
In the data publishing world the investment costs of data collection, aggregation, normalisation and analysis are high. Specialist data sets are expensive to produce yet are often targeted at a small user community.
Databases are nowadays rarely simple collections of otherwise published materials bundled together, as in an anthology, to which the publisher adds some search and retrieval software. Increasingly databases are sophisticated amalgams of data (historic, compiled, newly created, licensed) from selected sources. The data is validated, normalised and aggregated. Active, dynamic links to external data sources (text, audiovisual) are often included. Metadata is produced, allowing the information to be precisely selected in order to meet the users’ specific needs. Usage and access monitoring technologies are embedded to control subscriber access according to the business model (free access or subscription) and specific user requirements. The data is then compiled into a database which works with specific delivery platforms to enable access.
Data publishers are increasingly working directly with users to determine what features of the database and its functionality are required. The interfaces and access applications provided by the data publisher are growing in importance as part of the reason that users select and pay for access to and use of a publisher’s content and service offering. Customer-defined selections and applications allow additional user-generated content and third-party applications to be integrated into a common interface precisely tailored to each individual subscriber organisation’s needs and requirements.
In these situations simple reliance on copyright by itself is no longer realistic. Copyright, of course, remains the basis of the relationship between creator and user, because it is the right to use that copyright which is being contracted and licensed. A rightsholder may take action against a user for breach of contract or licence, but this is only valid if there is an enforceable right in copyright. Users pay for access to a database because of the added value its use gives to their organisation. That relationship is normally contract-based, typically reflecting copyright but further recognising the specific uses to which the purchaser wants to put that content. Increasingly the delivery and maintenance of these services are ongoing business relationships. Unlike the sale of a book or a journal subscription, where a simple invoice and reliance on the general provisions of copyright law would usually suffice, these new business relationships involve long-term contracts, service level agreements, and usage terms and conditions. Contract and licensing law is now as important here as copyright law.
The growth in the use of research data by commercial enterprises was recognised in the Copyright and Related Rights Regulations 2003. Copyright law had accrued a number of exceptions which allow copying to take place under certain limited and specified circumstances. For UK data publishers the most relevant is “fair dealing for research and private study” (not to be confused with its distant and rather different US cousin “fair use”). The 2003 Regulations, which implemented the EU Information Society Directive, recognised the abuse of this fair dealing provision and amended Section 29 of the Copyright, Designs and Patents Act 1988 to include the words “non-commercial”:
“Fair dealing with a literary, dramatic, musical or artistic work for the purposes of research for a non-commercial purpose does not infringe any copyright in the work provided that it is accompanied by a sufficient acknowledgement.”
Added protection for databases
Today’s data products deliver considerable added value and user benefits over and above print products. Data services are accessed, read, used and reused. They are as much a part of commercial investment as the financial and human resources which are essential to modern investment decisions. Business, reference and academic information is by nature different in that it has the potential and intention to be used commercially. The more it is used the greater the value the user attaches to it and the greater the ability of the content provider to invest and develop further successful services.
The need to protect this investment, that is the investment over and above that of the creation of the underlying content, was recognised by the European Union with the publication of the Database Directive in 1996. The implementing legislation protects databases in two ways. Firstly, a database may qualify for protection as a copyright work. Secondly, a sui generis right recognises that some databases will be protected as the result of the substantial investment in obtaining, verifying or presenting their contents, even though they have no copyright. The value of this legislation in recognising the investment in the database as a whole, through the sui generis right, has been questioned as the result of complex and apparently conflicting judicial decisions. This is unfortunate, as it is an important legislative recognition of the need to protect investment in collecting and arranging content which may not of itself have the protection of copyright.
As “data publishers” become “information providers” they generate intellectual property beyond that of original content (factual or not). The database as a whole may enjoy either or both of the rights recognised by the Database Directive, but the creation of schemas and metadata which describe and help to order data is in itself a creative act resulting from significant investment, and these structures are worthy of protection from unauthorised copying and utilisation for commercial purposes.
Access to and reuse of information through libraries
The new order is challenging the traditional role of libraries. Just as publishers are having to adopt new business models, so too are traditional libraries: wider societal internet access keeps an ever-increasing number of users away from libraries, and screen-based communication erodes reading both as a leisure activity and as a means of accessing factual and reference information.
The traditional role of the librarian as benevolent custodian of publisher materials is increasingly being questioned by publishers of high value products as libraries strive to maintain their role by giving access to digital content. This is particularly the case in libraries which serve the research and academic community. At a time when copyright is generally disregarded by users who have become accustomed to the ease of copying, albeit usually illegal, of digital consumer content, libraries are faced with the difficulty of policing access to high value content. The potential for conflict of interest is unavoidable and results in libraries now being both the source of much of the illegal copying of high value content and at the same time the source of the greatest pressure for the further relaxation of copyright by the increase in the scope of library exceptions and moves to prevent the use of licences to regulate user access.
This part of the debate is brought into sharp focus by the deployment by publishers of increasingly sophisticated processes, technological and contractual, to monitor and control access to their services to ensure use is compliant with the publishers’ user agreements.
The use of this technology is a bone of contention for librarians, yet it is libraries which are, for high value data publishers, the increasingly leaky bucket as the source of data being misappropriated for purposes which damage the economic rights of the publisher. For example:
A rival publisher accesses a competitor’s content by registering as a business user at a business school library. The publisher’s researchers systematically download large sections of the database and republish it as their own. They are exposed when the original publisher’s clients recognise the content through seeded data and deliberate misspellings which have been reproduced.
A publisher’s usage monitoring system identifies high-volume usage under a European university student’s user ID from an internet cafe in India. The university library confirms that the user access account is registered to a foreign student whose home address is in India. The student does not respond to e-mails and fails to return to the university to continue studies.
These cases highlight the difficulties faced by librarians and publishers alike in dealing with this problem. The reality is that, in a world where the disregard of copyright is now commonplace in all user communities, technological measures to control, measure and regulate access are necessary.
In competing with the internet availability of content and the increasing functionality of search engines, libraries are having to address their own raison d’être and are developing new business models which risk a breakdown in the trust relationship long enjoyed with publishers. The concerns of the publishing community over access to and use of digital publications through the extension of legal deposit reflect this risk.
At a Creative Economy Conference in London in 2006 the importance of copyright was recognised:
“Copyright is crucial. In this new era, everything becomes a subset of Intellectual Property. We believe that copyright has been a highly effective mechanism to generate creative wealth in the industrial mechanical age, and the concepts of copyright will continue to do so as they adapt to the online era.”
If Marshall McLuhan were writing today he might well have come to a different conclusion: not that it is the “Medium which is the Message” but rather that it is the “Medium which needs the Message”.
The continued growth in digital media is dependent on content. The higher the quality the better the chances of market success through user demand. New technologies come and are in turn replaced by newer technologies which will thrive until the next new technology arrives. These technologies only succeed because of the content they facilitate access to. The demand for high quality content has never been greater and the need for protection of the investment of the creative industries to enable that high quality content to continue to be produced is, equally, as necessary as it ever was.
Copyright remains fit for purpose in enabling that which Queen Anne intended: “the encouragement of learning” and “to enable learned men to compose and write useful books”.
Data publishers surely “do” just that: create “useful” content sets and facilitate access to data.
Copyright does not lock up content; it protects the right of the content creator to secure a return on the investment in creating that content. It is an economic right and the redress allowed under UK law is the loss of profit suffered as a result of the infringement. If the damage is not substantial the redress will not be substantial, but further, greater infringements may be prevented.
Content remains king but technology is, as it ever was, the kingmaker. Creators of content must have the right to protect their investment. Copyright ensures a universally applicable “ground zero”, a safety net for all content creators, whilst still protecting users’ rights. Yet in a fragmented and increasingly technologically enabled media world, specialist applications and uses need further protection.
Data publishers combine high quality content, data structures, software and access systems creatively engineered to meet users’ specific needs. If this activity is not protected by copyright, then publishers and users must be free to negotiate licences, contracts and user agreements that meet the needs of all parties whilst recognising the fundamental and important protections of copyright.
Trevor Fenwick. August 2010
Trevor Fenwick has over 30 years’ experience of the global business information market as Managing Director of Euromonitor International, the leading data publisher of market research analysis and reference databases. Based in the UK, Euromonitor has offices in Chicago, Singapore, Shanghai, Vilnius, Dubai, Cape Town, Santiago, Sydney and Tokyo.
Chairman of the Data Publishers Association and an ex-President of the European Association of Directory and Database Publishers, he has been directly involved in representing the business information sector’s views on intellectual property and data protection to UK government and the European Commission since 1990. Trevor is a member of the Publishers Content Forum, and a past member of the Legal Deposit Advisory Panel and the Advisory Panel on Public Sector Information. He represents the DPA on the Advertising Association’s Council.
He holds a degree in Economics and Government, postgraduate qualifications in marketing and is a Fellow of the Chartered Institute of Marketing. He is a Liveryman of The Worshipful Company of Stationers and Newspaper Makers.