A little over a week ago, Ed posted about someone who was re-publishing his RSS feed. Scoble, who is a passionate advocate of providing full feeds instead of excerpts and who maintains a link blog in addition to his regular blog, also recently commented on re-publishing and linked to Richard McManus' article RSS Rip-off Merchants, about software whose sole purpose is RSS re-publishing. These articles have convinced me to speak up about something that has been bugging me for a while.

I've been aware for a couple of months of a Blogsphere-based site (which I will not link to here) that consists of something like 80-90% re-published excerpts from pages on the IBM site and from other blogs. I'm not referring to News4Notes, which isn't based on Blogsphere, and I don't have a problem with News4Notes even though the content there is close to 100% re-publishing. I think I do have a problem, though, with this other site, where the "Written By" header leads off each article and gives the name of the site owner, but most articles consist entirely of a paragraph or two taken from a technote or from someone else's blog, followed by a link labeled "More @" and a link to a hostname. I repeat: a hostname. Not the name of the site. Not the name of the article. Although technically this is an attribution, IMHO it is a weak attribution. One thing I would do if I were the owner of the blog in question would be to go into the translations configuration and change "Written By" to "Posted By". That would immediately help to reduce confusion. But I would do more.

Scoble's link blog does attribution right. He ends the title of each post with "From" followed by the name of the blogger whose post he has re-posted as an excerpt and link. Most other bloggers do attribution right, too, most of the time. Here's a recent post of mine that re-posts from Koranteng Ofosu-Amaah. Notice that I take the minimal trouble to start with my own introductory statement "Highly recommended reading by", followed by his name and a link to his blog, and then I put the link to the actual article before the excerpt. Then I do what I always do for extensive quotes: I use the <blockquote> element, which I have styled to indent, draw a box, change background color, and use italics. For shorter quotations, I sometimes just use ordinary quotation marks (with or without italics). And when I'm posting something that someone else has already re-posted, I will almost always use "Via" as a way to credit the original reposter in addition to crediting the original poster. I'm perhaps doing a little bit of overkill to make absolutely certain that there's a clear indication that I am doing a fair use excerpt of someone else's material, but I'd much prefer to see overkill rather than ambiguous or missing attributions.
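For anyone who wants to see that pattern spelled out, here is a minimal sketch in Python of how a re-posted excerpt could be assembled: an introductory line naming the author with a link to the blog, a link to the actual article, the excerpt inside a <blockquote>, and an optional "Via" credit. The function name, its parameters, and the example.com addresses are my own illustrative assumptions; this is not how Blogsphere or any other blogging tool actually builds its posts.

```python
from html import escape


def attributed_excerpt(author, blog_url, article_title, article_url,
                       excerpt, via_name=None, via_url=None,
                       intro="Highly recommended reading by"):
    """Build an HTML fragment for a re-posted excerpt (hypothetical helper).

    All text and URLs are escaped so that quoted material cannot
    break the surrounding markup.
    """
    parts = [
        # Introductory statement: who wrote it, with links to blog and article.
        f'<p>{escape(intro)} <a href="{escape(blog_url)}">{escape(author)}</a>: '
        f'<a href="{escape(article_url)}">{escape(article_title)}</a></p>',
        # The excerpt itself; the cite attribute records the source URL.
        f'<blockquote cite="{escape(article_url)}">{escape(excerpt)}</blockquote>',
    ]
    if via_name and via_url:
        # Credit the person who re-posted it first, if there is one.
        parts.append(
            f'<p>Via <a href="{escape(via_url)}">{escape(via_name)}</a></p>'
        )
    return "\n".join(parts)


if __name__ == "__main__":
    # Placeholder values only; none of these URLs are real posts.
    print(attributed_excerpt(
        author="Koranteng Ofosu-Amaah",
        blog_url="https://example.com/blog",
        article_title="An Article",
        article_url="https://example.com/blog/an-article",
        excerpt="A paragraph or two of quoted material.",
        via_name="Another Blogger",
        via_url="https://example.org/",
    ))
```

The point of the sketch is only that the author's name, the article link, and the quoted material are each marked up separately, so a reader can never mistake the excerpt for the poster's own writing.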

Anyhow, I'd like to know what others feel about proper attribution. How do you think you're doing as far as making attribution clear goes? What do you do, and what do you recommend?

And specific to Blogsphere, should the "Written By" header be selectable on a per-post basis, such that it could be either "Written By" or "Posted By" depending on whether the material is original or re-published?

1. Koranteng Ofosu-Amaah 06/07/2005 07:58:44 AM

One thing though: typically the default text editors (and certainly Blogger) don't have icons for blockquote indentation or for the cite tag for 'via' attribution. The lack of those simple smarticons might be contributing to the impedance in attribution, and posts such as yours will sensitize people. I've written enhancement requests to Blogger and Bloglines to allow users to customize the buttons in their editors and to have a wider set of default icons, beyond bold, italic, underline and strikethrough. To take an example, the comment box on this blog has those four and a bunch of smileys. If I had wanted to quote or cite someone I would be out of luck, and who knows if blockquote or cite would pass your HTML sanitizer in any case. But that's the terrain that we're living in.

I happen to use Notetab as my HTML editor, and that has a clipbook full of all the obscure HTML I use (abbr tag, anyone?).

And while on the topic of malicious reposting to sell insurance or mortgages, I noticed that a few weeks after my first few pieces on technology, beginning with the one on Gmail and DHTML architecture, started to get a little traction (as evidenced by delicious/popular), a lot of those spam blogs and sites lifted my crusty copy.

The good thing is that a few months of sustained posting coupled with a smidgen of PageRank tweaking meant that my Google juice was restored and those pages have slid into the long tail of search engine results - and we know how many people are "feeling lucky". That means that the system works itself out eventually.

As Cory Doctorow says, all complex ecosystems have parasites, and for now spam on blogs is relatively benign for the authors who are consistent. David Sifry, however, seems to be losing sleep about this very issue, but that's another story and one for the toli.

