Category: Technology | Author: Steven Hodson | Posted: February 22, 2010
Tags: content, google, Google Buzz, google reader, scraper
Google Buzz: the newest scraper on the block
Technically Buzz isn’t a scraper in the truest sense of the word. However, the way that Buzz presents our blog posts is identical to all those scraper sites that the tech blogosphere goes ape-shit over. Now, a friend pointed out to me in a private message on Twitter that this is exactly how Google Reader Shared Items does it. My reply to him was that Reader and Buzz are two totally different beasts, and the content being shared via Buzz should be treated differently.
The thing about Reader is its reach. In contrast to Buzz, which is meant to reach as many people as possible, Reader Shared Items typically reaches a much smaller segment of people. Items are far more likely to go viral in Buzz than those same items are in Reader. The problem is that there is no benefit for blog owners in having their content piped into Buzz, as there is literally no chance of any traffic finding its way to the originating site.
This is the one thing that Friendfeed did right. With Friendfeed only the post title was posted, and if you wanted to read what the post was about you had to click through. With Buzz, on the other hand, the entire contents of the post are right there, so there is no incentive for click-through traffic. Buzz gets all the buzz and the content provider gets none.
There will be those really irritating people who say it doesn’t matter where the content is as long as it is being read and talked about. Bullshit. Period. For bloggers who are doing this either professionally or as part of building themselves a home base on the web, our blogs are our headquarters. Everything we do emanates outward from there, but with the way that Buzz handles our posts, Google is claiming de facto ownership over them. Then on top of that, just as with Gmail, you can be assured that at some point they’ll be running ads on Buzz, against your content as well as everyone else’s.
Now before anyone jumps down my throat on this (which will happen anyway), I have no problem with Buzz carrying my content. After all, I’m the one who agreed to pipe it in there. The solution to the scraping problem, however, is an easy one: either go the same route as Friendfeed did and post just the headline, or post a short blurb from the content for context. I do, however, draw the line at making the complete contents available.
There are two other points that I want to raise about how Buzz displays content, especially when it comes to the blog posts we pipe in there.
The first point I want to make is the whole URL thing. As you can see in the above graphic, it is extremely hard to tell exactly what the blog post headline is. Unlike the rest of the items that show up in Buzz, there is no way to visually distinguish the post headline, and, also unlike the rest, it isn’t permalinked to the original post.
Sure, the name of the blog the post comes from is linked to the original source. The only problem is that it is displayed in such a way that it blends into the page. Even I didn’t think that the link headed to the original post; I assumed it was just a default link to the blog itself. Oh, and yes, there is a handy dandy link option in the drop-down menu, but again this is a secondary option and not one that the average user will think of.
The second point that really needs to be addressed is the display of images. There are two definite problems with the way that Buzz currently handles it. First off, in some cases, even if there is only one image in the post, Buzz for some reason displays multiple copies of that image. This is just bad design, folks. Seriously. It makes Buzz look like an amateur effort.
Sticking with images, the second problem I have is that when there are actually multiple images within a post, Buzz will display all of them (or, as is the case above, display the same image once for each actual image within the post). The thing here is that multiple images are often set up in such a way as to entice the potential reader to click through to see them all. Sure, the way Buzz is doing it favors the reader and keeps them on Buzz, but once more this is taking traffic away from the originating site.
In both cases I believe the Google Buzz team really needs to rethink how it is displaying our content. As it stands, our content is being used to the benefit of Buzz, not the content producer.
Now don’t take this post as coming out against Buzz, because I am not against it. In fact I hold out a lot of hope for Buzz, because I believe that there is a hell of a lot of potential for all parties involved. I just want some equality for the content producers so that we don’t end up just being the fodder for their cash cows.

Feb 22, 2010
I am concerned about the same thing. Is there something the publishers/bloggers could do to bring this to Google’s attention? After all, Friendfeed did it correctly. Maybe suggest this as an improvement or feature request.
Is changing the full feeds to partial feeds the only option left for the publishers? Any suggestions?
Feb 23, 2010
By the way, your post at Braincell Soup was awesome.
Feb 24, 2010
I think that Google will find this out soon and strive to rectify the errors. After all, given the way they are marketing the product, it doesn’t seem like it will stay hidden from the public for long. However, it remains to be seen how well Google Buzz fares once the public has been using it for some time.
Feb 24, 2010
thanks
Feb 24, 2010
I really don’t like the idea of having to change to a partial text feed just because of what Google is doing with Buzz. I know one blogger who has pulled his feeds from Buzz and is adding them manually so that he has some control over what is being shared, but that can be a bit of a pain, to say the least.
Feb 26, 2010
Steven, I am ambivalent about this one. I have my stuff so scraped at this point I just can’t care. It is amazing to me to think that on average every article I write ends up scraped in some fashion at least 4 times (thanks to fair share I now know this). Many don’t link back, and many have ads.
Ads are such a poor way of making money that I turned them off. If someone is willing to work for pennies, then great, go for it. But for some it is worth it.
I just don’t have the time to hunt down everyone who scrapes content, whether it is Google or the scammer down the electronic street. It is great to take a stand, yet is it the right stand? Or do we need to do something bigger and bolder?