Pigeon holing behaviorists?
Philosophy




I am not sure what to make of this. Its relevance seems minimal...just how far can one take statistical data in describing human development?

"Rare words 'author's fingerprint'"

December 19th, 2009

BBC NEWS

Analyses of classic authors' works provide a way to "linguistically fingerprint" them, researchers say.

The relationship between the number of words an author uses only once and the length of a work forms an identifier for them, they argue.

Analyses of works by Herman Melville, Thomas Hardy, and DH Lawrence showed these "unique word" charts are specific to each author.

The work is published in the New Journal of Physics.

Researchers also suggest each author pulls their works from a hypothetical "meta book". One description of this concept might be a framework for the way an author uses language. It is from this framework that all their works are ultimately derived.

In 1935, the Harvard University linguist George Kingsley Zipf demonstrated a mathematical relationship between the frequency of a word in a text and its rank in the list of an author's most used words.

So, the second most frequent word in a book occurs half as often as the first, the third most frequent occurs one-third as often, and so on.
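As a rough illustration of that rule, here is a minimal Python sketch. It assumes the simplest possible tokenization (lowercasing and splitting on whitespace), and the function names `rank_frequency` and `zipf_prediction` are made up for this example; they are not taken from Zipf or from the researchers.

```python
from collections import Counter


def rank_frequency(text):
    """Return (word, count) pairs ordered from most to least frequent.

    Assumption: words are whatever remains after lowercasing and
    splitting on whitespace; real studies tokenize more carefully.
    """
    words = text.lower().split()
    return Counter(words).most_common()


def zipf_prediction(top_count, rank):
    """Idealized Zipf count at a given rank: the top count divided by the rank."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    return top_count / rank
```

Comparing the counts from `rank_frequency` against `zipf_prediction` at each rank gives a quick sense of how closely a text follows the idealized rule.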

The rule laid the groundwork for many mathematical analyses of words, in which the Zipf law seemed to be a universal property of English - and by extension, of language itself.

Building on that idea, researchers at Umeå University in Sweden have found that language use isn't as universal as Zipf's law might suggest.

They have used a related approach that comes up with a unique identifier for each author.

Hardy measure

Clearly, a longer written work has more unique words - words that appear just once in the text.

However, even the best writer's vocabulary will at some point run out of words that have not yet been used.

The researchers gathered together the complete works of Hardy, Melville, and Lawrence, and measured that dependence - counting the number of new unique words as a particular author's works get longer and longer.
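A curve of this kind could be sketched as follows. This is only one reading of the article's description, counting words that have appeared exactly once so far as the text grows; the function name `unique_word_curve` and the `step` parameter are hypothetical, and the researchers' actual procedure may well differ.

```python
from collections import Counter


def unique_word_curve(words, step):
    """Track how many words have been used exactly once as the text grows.

    Returns (text_length, once_only_count) pairs, sampled every `step` words.
    Assumption: `words` is an already tokenized list of strings.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    counts = Counter()
    once = 0
    curve = []
    for position, word in enumerate(words, start=1):
        counts[word] += 1
        if counts[word] == 1:
            once += 1   # first appearance: a new once-only word
        elif counts[word] == 2:
            once -= 1   # second appearance: no longer once-only
        if position % step == 0:
            curve.append((position, once))
    return curve
```

Plotting the second value against the first gives one curve per author.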

They used sections from books of varying lengths, randomly pulled from novels, alongside shorter works and short stories.
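Pulling a random section of a given length from a longer text might look something like this. It is a sketch under the assumption that a "section" is a contiguous run of words; the article does not say how the sections were chosen, and `random_section` is a made-up name.

```python
import random


def random_section(words, length, rng=None):
    """Return a contiguous run of `length` words starting at a random position.

    Assumption: a section is contiguous. Pass a seeded random.Random
    as `rng` to make the choice repeatable.
    """
    if length < 0 or length > len(words):
        raise ValueError("length must be between 0 and the number of words")
    rng = rng or random.Random()
    start = rng.randrange(len(words) - length + 1)
    return words[start:start + length]
```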

They found that the authors had distinctly different "unique word" curves.

The team suggests that a work by an unknown author could therefore be compared to prior works, with the curve acting as a linguistic "fingerprint".
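One naive way to make such a comparison is sketched below. The distance measure (mean absolute difference between curves sampled at the same text lengths) is purely an assumption for illustration, not the team's method, and both function names are hypothetical.

```python
def curve_distance(curve_a, curve_b):
    """Mean absolute difference between two curves sampled at the same lengths.

    Each curve is a list of once-only word counts; only the points
    both curves share are compared.
    """
    shared = min(len(curve_a), len(curve_b))
    if shared == 0:
        raise ValueError("curves must share at least one point")
    return sum(abs(a - b) for a, b in zip(curve_a, curve_b)) / shared


def closest_author(unknown_curve, known_curves):
    """Return the name whose reference curve lies nearest to the unknown curve.

    `known_curves` maps an author name to that author's reference curve.
    """
    return min(
        known_curves,
        key=lambda name: curve_distance(unknown_curve, known_curves[name]),
    )
```

A real attribution would need far more care than this, such as accounting for sampling noise and for works of very different lengths.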

Source material

The meta book concept proposed by the authors is not simply the list of all the words they know, but also the "distribution" of those words produced by an author, whether in drafting an e-mail or writing War and Peace.

"It doesn't matter if I pull out 10,000 words from a book of 100,000 or from a book of 200,000, I get the same behaviour; you always simply pull a piece out of your very, very big 'meta book', which is just a representation of your style," said Sebastian Bernhardsson, who led the work.

"That story you're writing right now is a piece of that big book and that is what you're pulling out," he told BBC News.

The team will continue the analyses with different English authors, and with authors in different languages. As their collection of fingerprints grows, Mr Bernhardsson said, they will try to identify the authors of anonymous works.

But not every result is a happy one, he added.

"It's a fun and interesting exercise, but I've plotted my own thesis in this sense and it was kind of discouraging comparing to some more famous authors."
