Is content scraping legal?

We ask whether the website scraping that underpins the emerging aggregator industry falls foul of the law.

01 May 2008


 A text transcription follows.

This transcript is for anyone with a hearing impairment or who for any other reason cannot listen to the MP3 audio file.

The following is the text spoken by OUT-LAW journalist Matthew Magee.


Hello and welcome to OUT-LAW Radio, the weekly podcast that keeps you up to date on all the twists and turns in the world of technology law.

Every week we bring you the latest news and in-depth features that help you to make sense of the ever-changing laws that govern technology today.

My name is Matthew Magee, and this week we look at the legal uncertainties surrounding the website scraping that underpins an entire new industry.

But first, the news: Swiss company claims loophole protection for UK TV re-broadcasting, and EU orders special funds for session musicians.

An internet start-up is re-broadcasting UK television from Switzerland without the stations' permission. It is re-broadcasting all five UK terrestrial channels online but claims it is not breaking the law.

One legal expert, though, said that the company's re-broadcasting could be short-lived, and that the Government could close the loophole at any time.

Zattoo.com claims that it is not in breach of the UK's copyright legislation, and that a loophole allows for live re-broadcast of material from public service broadcasters.

Kim Walker, head of intellectual property at Pinsent Masons, the law firm behind OUT-LAW, said that even if Zattoo has found a legitimate loophole, it may not be open for long.

He said that the Copyright Act gives the Secretary of State the power to withdraw the permission at any time, and that if this is a loophole that the Government is not happy with, the Secretary of State could say that the right does not apply to re-transmission online.

Record labels would have to set up special funds to pay royalties to session musicians under copyright reforms proposed by EU Commissioner Charlie McCreevy.

The money for the funds would come from McCreevy's proposed extension of the term of copyright in performances from 50 to 95 years. McCreevy said that a percentage of record companies' increased revenues from this change should be paid into the fund to go directly to session musicians.

Session players are hired hands who, unlike members of bands, are generally paid a flat fee for a recording session and do not benefit from any publishing royalties unless they have helped to write a song. Though they are entitled to performance royalties, these tend to be far lower.

McCreevy wants extra money earned from a copyright extension on performances to be earmarked for them. He said it was to prevent the sudden withdrawal of royalty income late in a musician's life after the 50-year period has elapsed.

That was this week's OUT-LAW news.


When Ryanair complained about Lastminute.com's reselling of its flights last year it objected to that company scraping its website. This odd-sounding practice does not involve taking a razor to a computer screen but it might involve database rights infringement or the Computer Misuse Act, or it might not.

Scraping may inhabit a confused, grey legal area but it is at the heart of a multi-billion-dollar new industry that is changing the face of online business. We are entering the world of the aggregator.

The biggest names in aggregation are Confused.com and Moneysupermarket.com. Their business is to find consumers the best deals by checking tens or even hundreds of suppliers to see who has the cheapest prices.

Their pre-internet equivalents would be brokers, but aggregation is different because of its scale. They can quickly scan all internet-published deals so that you do not have to, increasing the range of your search enormously.

This is where the scraping comes in. It is the term given to the practice of automatically going to supplier websites to get prices for a certain product or service.

Richard Mason is the managing director for insurance and home services for Moneysupermarket.com. He explains how it started scraping.

Mason:   Initially when we first started the business we were screen scraping the data back. We would build a script that would go onto an insurance company's website and put in all the information necessary for that insurance company to produce a price, and then, and that is where the term screen scraping comes from, we would scrape the price off and bring it back to our results table. We did that initially with about 23 different insurance companies.
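The process Mason describes (load an insurer's quote page, pull the price off it, add it to a results table) can be illustrated with a minimal Python sketch. It is not Moneysupermarket's actual code. The insurer names, the sample HTML and the `quote-price` class name are all hypothetical, and the step of filling in and submitting the insurer's form over HTTP is left out so that the example runs without touching any real website.

```python
from decimal import Decimal
from html.parser import HTMLParser


class PriceScraper(HTMLParser):
    """Captures the text of the first element with class="quote-price".

    The class name is a hypothetical marker; a real insurer's page would
    use whatever markup that insurer happened to choose.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price_text = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if self.price_text is None and ("class", "quote-price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.price_text = data.strip()
            self._in_price = False


def scrape_price(html):
    """Return the quoted price as a Decimal, or None if no price is found."""
    parser = PriceScraper()
    parser.feed(html)
    if parser.price_text is None:
        return None
    return Decimal(parser.price_text.lstrip("£").replace(",", ""))


def build_results_table(pages):
    """Turn {insurer: quote page HTML} into (insurer, price) rows, cheapest first."""
    rows = []
    for insurer, html in pages.items():
        price = scrape_price(html)
        if price is not None:
            rows.append((insurer, price))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical quote pages standing in for real insurer responses.
SAMPLE_PAGES = {
    "Insurer A": '<html><body><p>Your quote: <span class="quote-price">£412.50</span></p></body></html>',
    "Insurer B": '<html><body><div class="quote-price">£1,050.00</div></body></html>',
    "Insurer C": "<html><body><p>Sorry, no quote available.</p></body></html>",
}

results = build_results_table(SAMPLE_PAGES)
for insurer, price in results:
    print(f"{insurer}: £{price}")
```

A scraper like this depends entirely on the layout of the page it reads, so a redesign of the insurer's site can break it. That fragility is one practical reason to prefer the direct system access Mason goes on to describe.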

Mason said that these days most companies let his firm into the guts of their system so that they do not have to scrape the public-facing website.

But is the practice legal? Ryanair seemed not to think so, telling the Advertising Standards Authority last year that its terms and conditions specifically forbade the use of its information for commercial purposes.

Struan Robertson is a technology lawyer for Pinsent Masons, the law firm behind OUT-LAW. He said that there are two kinds of laws that might apply.

Robertson:  The site that is being scraped might say that its terms and conditions forbid any form of scraping. The site that is doing the scraping might say, well, those terms and conditions do not count. They are not incorporated into any contract we have with you. It may be a fair argument if those terms of use are just an optional link like they are on Ryanair's website. The other thing that could apply is the database rights. There is a set of regulations that gives special protection to databases. Nobody is quite sure how much protection databases have. There was a case a few years ago involving William Hill against the British Horseracing Board and another case called the Fixtures Marketing case. These cases kind of cast doubt on the strength of the database rights. It now means that we are not really sure how it would apply to a scraping situation. The site that is being scraped that is asserting these database rights probably has the upper hand but really only a court can decide. The Computer Misuse Act is basically the UK's anti-hacking law and it says if there is unauthorised access to somebody's material, then there is an offence committed. The trouble with that is whether you can get the prosecutors excited about it, whether there really is unauthorised access. I think that if you are talking about scraping data from a site that is password protected you have a much more powerful argument for saying that there is criminal scraping going on than if you have scraping from a site where all the pages are open to the public.

Mason has first-hand experience of the legal difficulties. Most businesses are delighted to appear on aggregator sites because they bring customers, but insurance company Direct Line ordered Moneysupermarket to stop scraping its site.

Mason:   They sent us a letter which threatened that what we were doing was in breach of the 1990 Computer Misuse Act, because that basically states that if you go into somebody's computer without their permission it is a criminal offence, and also that any data that we pulled back was in breach of the Copyright and Databases Act. So we were threatened with criminal action if we continued to do it and with civil action if we continued to do it. The firm in London that we were going to, to get legal advice, basically said that the British Horseracing Association and William Hill had been the only case ever tested where somebody was pulling information from somebody's website, and that was going through some sort of additional course in Europe. Because the outcome of it was still unknown, it was likely to be a very, very expensive court case if we were going to fight Direct Line in the courts and the outcome of it was completely, you know, in the lap of the gods. So, given that we were a small business at the time and we probably could not afford several million pounds' worth of costs, we decided that it was probably safer to just leave Direct Line off.

The stakes are high: aggregation is big business. Aggregator sites earn money from the companies whose databases they search. If you have an insurance business, for example, you will probably spend a lot of money on Google AdWords, trying to get your advert to appear when people search for insurance. But it is a hit-and-miss game. Mason explains why many see paying aggregators as a better bet.

Mason:   If an insurance company like Direct Line is bidding on Google for traffic, where the majority of its online traffic will now come from, it will be paying £4 or £5 per click to its website. But it pays £4 or £5 for every click, whether that person is too old, too young or, you know, somebody who is totally inappropriate for Direct Line. So the traffic that comes to our site, people then fill in all the information and get 60 prices. So they are getting prices even if they have got high numbers of claims, convictions, they are too young, too old, they are still seeing results. And then when somebody clicks on our website to an insurance company, the insurance company know that person has pretty much viewed almost every insurance company in the market and has chosen them to buy from. So a link from Moneysupermarket to an insurance company is a much more high-quality link than a simple link from Google. Most insurance companies are quite happy to pay for the traffic when they realise they have got somebody who is twenty times more valuable, because there is a twentyfold increase in the conversion from somebody who is clicking to get a quote to somebody who has now seen a quote and wants to buy.

And a change to Google's policies is about to make life even harder for companies. Robertson explains.

Robertson:  At the moment if you have a trade mark you can say to Google, do not sell my trade mark as a key word to anyone else. So that gives you a virtual monopoly on adverts appearing when people are typing in your key word, provided you send that instruction to Google. It is something that applies everywhere outside North America. Later this month Google is going to change that policy for the UK and Ireland. It is going to change the policy to reflect the position in the US and Canada, whereby anyone can bid on any key word. So trade mark holders will be bidding to sponsor their own trade mark as a key word alongside others who will be able to sponsor that trade mark, including aggregators.

Mason said that his company will be taking full advantage of the opportunity to drive traffic to his site when people search for specific brands.

Mason:   On the 5th of May Google are removing that policy, so anybody can bid on anybody's brand name and we will be doing so, yes.

Mason said that new businesses in the insurance market now spend up to a year pre-launch in contact with Moneysupermarket making sure their systems are compatible with its own. It is a sign that regardless of the objections of some firms and some legal uncertainty, the business of scraping is here to stay.


That's all we have time for this week, thanks for listening.

Why not get in touch with OUT-LAW Radio? Do you know of a technology law story? We'd love to hear from you on radio@out-law.com.

Make sure you tune in next week; for now, goodbye.


OUT-LAW Radio was produced and presented by Matthew Magee for international law firm Pinsent Masons.
