This is an omnibus follow-up to my earlier posts on copyright, fair use, and plagiarism.

1. A while ago I found a site that appeared to be scraping, or boxing, the content of my blog hosting site and displaying it below an advertisement. It took not only my blog hosting provider’s own content but all the other bloggers’ content as well. How did I handle it? I reported my concerns to my blog hosting provider’s technical support team. My interpretation of their response is that the site is “mostly harmless.”

Hmmm, no brouhaha? Nope, just passing my concerns to the proper folks who have more experience in such matters.

2. Bloggers syndicate their work through RSS (Really Simple Syndication). Once an article is syndicated, it can become separated from the copyright notice on the blog. Some people fail to recognize that this simple statement of ownership applies to the content as well as to the blog itself.

It is wise to include the copyright statement as part of the post itself, or of the article if it sits on a separate page of the blog. The idea is similar to what the Associated Press, Reuters, and other wire services do with their articles. The start of each piece says something like “AP — …”.

A copyright statement inside the RSS-transmitted article alerts a reader on a different website that the article is, in fact, not that website’s original content.

I also find it handy, since I have multiple blogs, to include which blog I’m writing on as part of the copyright statement. This statement serves to alert a reader elsewhere that the material comes from a specific and findable place.
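For anyone who maintains their own feed, here is a minimal sketch of the idea in Python. It assumes a plain RSS 2.0 feed with channel, item, and description elements. The sample feed, the owner name, the year, and the function name add_copyright_notice are all made up for illustration, and most blogging platforms have their own setting or plugin for this, which I have not checked.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, used only for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <item>
      <title>First post</title>
      <link>https://example.com/first-post</link>
      <description>Body of the first post.</description>
    </item>
  </channel>
</rss>"""


def add_copyright_notice(feed_xml: str, owner: str, year: int) -> str:
    """Append a copyright line, naming the source blog, to every item description."""
    root = ET.fromstring(feed_xml)
    blog_title = root.findtext("channel/title", default="")
    notice = f" Copyright {year} {owner}. Originally published on {blog_title}."
    for item in root.iter("item"):
        description = item.find("description")
        if description is None:
            # An item without a description still gets the notice.
            description = ET.SubElement(item, "description")
        description.text = (description.text or "") + notice
    return ET.tostring(root, encoding="unicode")


# The owner and year are placeholders, not real values.
stamped_feed = add_copyright_notice(SAMPLE_FEED, owner="Jane Blogger", year=2024)
print(stamped_feed)
```

Because the notice travels inside each item, it stays with the article wherever the feed is displayed, and the blog title in it tells the reader where the original lives.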

3. As a last note, for those who treat a missing copyright notice as permission to copy: ignorance is not a valid legal defense against copyright infringement.


3 Tools to Check Your Writing

A blogger I’m familiar with recently claimed that some of his writing was ripped off by a “splogger.” A couple of academics on the East Coast recently traded barbs over alleged plagiarism by one of them. Are copyright infringement and plagiarism a serious problem on the Internet? Or is it just a matter of whose words are whose? I can’t recall the specific source, but someone once said that writing is just the alphabet mixed up, or some such thing.

The blogger, I’m afraid, may have cried wolf; but since there are no details …. The academics seemed to be concerned more with ideas, and who said what first, than with anything else. The concept of “idea plagiarism” is weird. Some people seem to think ideas are as unique to people as their fingerprints. Nothing could be further from the truth. The U.S. Copyright Office (copyright.gov) has a Copyright Basics document available, which explicitly states that “ideas, procedures, methods, systems, processes, concepts, principles, discoveries, or devices, as distinguished from a description, explanation, or illustration” are not copyrightable (Copyright Basics, p. 3). It is best to leave it to a court to decide what is copyright infringement.

If you care about the uniqueness of your writing, however, there are a number of web applications that can help you decide whether you are plagiarizing or copying someone else’s words. I’ll briefly look at a few of them here.

The Writing Checker Tools

The first one, which I’ve used extensively, is CopyScape (copyscape.com). It has a very simple submission process to find out whether your writing, or someone else’s, is duplicated on the web. I tried it again, recently, with a couple of pages of my older writing. The samples I chose use direct quotes from another page, so I expected some hits. The CopyScape search didn’t find those other pages, however, since they’ve been taken down.

The second tool, DocCop (doccop.com), is a more in-depth search tool. It can compare files as well as submitted text to existing web pages. I ran the same samples as web checks, but it also didn’t find any significant matches.

The third tool, PlagiarismChecker (plagiarismchecker.com), is similar to DocCop and CopyScape, but limits the input to a short string of text. Unlike DocCop, it doesn’t check files. The results, as expected, were nil.


DocCop’s search did find a number of common phrases from one of my samples in other pages: “most people do not know how to access the best” and “and satisfactory service to clients may be just as important as credentials.” These strings can hardly be called plagiarism, though, since they are either generic (“satisfactory service,” “most people”) or slogan-like (“access the best”). The latter phrase could be called a non-copyrightable term. Did I plagiarize these other writers? I hardly think so, because the overall subject of my writing was different from theirs. Hmmm.
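To see why generic phrases show up as hits, here is a rough sketch of phrase matching in Python. I don’t know how DocCop or the other tools work internally; this assumes a simple approach that compares overlapping five-word runs. The two sample sentences are invented around the phrase quoted above, and the function names are mine.

```python
import re


def word_ngrams(text: str, n: int = 5) -> set[str]:
    """Return the set of n-word phrases in text, lowercased, punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(mine: str, other: str, n: int = 5) -> set[str]:
    """Return the n-word phrases that appear in both texts."""
    return word_ngrams(mine, n) & word_ngrams(other, n)


# Invented sentences on two different subjects that share a generic phrase.
my_text = "Unfortunately, most people do not know how to access the best records."
other_text = "Most people do not know how to access the best doctors in town."

matches = shared_phrases(my_text, other_text)
for phrase in sorted(matches):
    print(phrase)
```

The two sentences are about different things, yet they share several five-word runs. A checker that only counts matching strings flags them anyway, and a human still has to decide whether the overlap means anything.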

One consideration with these tools is that they only check publicly available, indexed web content. DocCop uses Microsoft’s Bing, and the others use Google and Yahoo. If a document hasn’t been on the web for very long, or the web indexers haven’t found it, then it won’t get picked up by these tools. Some web sites also tell the indexers to ignore certain pages, which can produce a false negative: the tool reports no match when duplicate content does in fact exist.
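Sites usually tell indexers what to skip with a robots.txt file. As a small sketch, Python’s standard urllib.robotparser module can show how a well-behaved crawler would read such a file. The robots.txt content, the crawler name ExampleBot, and the example.com addresses are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay out of /drafts/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" stands in for any crawler that honors robots.txt.
public_ok = parser.can_fetch("ExampleBot", "https://example.com/2010/copyright-post")
drafts_ok = parser.can_fetch("ExampleBot", "https://example.com/drafts/unfinished-post")

print("public post may be crawled:", public_ok)
print("draft may be crawled:", drafts_ok)
```

A page the crawler is told to skip never reaches the search index, so a checker that relies on that index cannot find it, even if it is a word-for-word copy.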

Another consideration with these text checkers is that, like the Google, Yahoo, and Bing web indexers, they can’t understand the context and meaning of what’s being checked. I used a plagiarism checker several years ago that came up with a kindergarten teacher’s class plan as a potential source of my genealogical text. Was that plagiarism, or was it just the mixed-up alphabet appearing in a similar sequence? It takes humans to fit those pieces together. As I mentioned earlier, it’s best to have a court decide.


