Realtor.com operator Move Inc. says it’s blocking automated Web bots from accessing more than 1 million pages a day on the listing site, and that such efforts have resulted in a sharp decline in attempts at "scraping."
The company says it monitors about 137 million interactions on the site per day, looking for patterns that indicate illicit behavior such as the systematic copying of listing data, and blocks suspicious IP addresses from accessing Realtor.com.
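Move has not published how its monitoring works, so the following is only an illustrative sketch of one common approach to this kind of pattern detection: counting requests per IP address over a sliding time window and blocking addresses that exceed a threshold. The class name `RateMonitor`, its parameters, and the threshold values are all hypothetical and are not drawn from Realtor.com's systems.

```python
from collections import defaultdict, deque


class RateMonitor:
    """Hypothetical sliding-window counter; not Move's actual method."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        # Assumed thresholds, chosen only for illustration.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()            # IPs that exceeded the threshold

    def record(self, ip, timestamp):
        """Record one request. Return True if allowed, False if blocked."""
        if ip in self.blocked:
            return False
        window = self.hits[ip]
        window.append(timestamp)
        # Drop requests that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        # Too many requests inside the window: block this IP from now on.
        if len(window) > self.max_requests:
            self.blocked.add(ip)
            return False
        return True


# Usage: a client making rapid requests is cut off; a slow one is not.
monitor = RateMonitor(max_requests=3, window_seconds=10)
fast = [monitor.record("203.0.113.5", t) for t in (0, 1, 2, 3)]
slow = [monitor.record("198.51.100.7", t) for t in (0, 20, 40, 60)]
```

A production system would likely weigh many more signals than request rate alone, such as the order in which pages are visited, since a scraper can spread its requests across many addresses.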
Realtor.com gets its listing data directly from multiple listing services (MLSs). According to a survey of MLS executives by the Council of Multiple Listing Services (CMLS), most are concerned about scraping — the unauthorized copying and reuse of their listing data.
"Scraping happens everyday, and it’s something that’s surprisingly inexpensive for cybercriminals to do," Move Creative Director Amit Kulkarni said in a blog post detailing the company’s anti-scraping efforts.
As an example, Kulkarni posted a solicitation from a website where companies and individuals put programming projects up for bid.
"I am looking for someone to write a program to scrape data from Realtor.com," the bid request said. "The program would take U.S. ZIP codes (from a ‘.txt’ file), input them into Realtor.com, and then take the output from the pages that follow and create a ‘.csv’ file for each ZIP code. Ultimately, I will be running the program for thousands of different ZIP codes. I am willing to pay up to $50 for this program."
The post attracted 12 bids and closed on March 26, 2011, with the status "awarded."
But Kulkarni said Realtor.com has been so successful at thwarting such efforts that scraping attempts are on the decline, falling from 385 million attempts in October 2011 to 75 million attempts in February, a drop of about 80 percent.
Brian Larson, president of consulting firm Larson/Sobotka Business Advisors LLC, told Inman News in December that he expects more MLSs will go after "data pirates" who redistribute MLS data without authorization, and that his firm was working with clients to evaluate ways of shutting them down.