Wednesday, November 24, 2004

Interesting spelling...

Saw an article today that included the word 'segueways'. Gave me rather a start, it did. But it does make good sense: why on earth should 'segue' be a two-syllable word, when words like 'tongue' and 'vogue' aren't? (Other than the insignificant fact that it comes from Italian, and fairly recently too.) And then there's confusion with that 'Segway' transporter contraption. But everyone knows there's a 'ue' in there, so why not just put 'em all together and spell it 'segueway'? After all, 2,260 other people have already.

Monday, November 22, 2004

Autodidacts among "the masses"

I've just read an amazing article, "Classics in the Slums" by Jonathan Rose (link via Teleread). If you're a book-lover, you ought to read it and admire what amazing intellectual curiosity drove 18th-, 19th- and early 20th-century British working class autodidacts to discover and devour classic, intellectual, literary books:

Will Crooks (b. 1852), a cooper living in extreme poverty in East London, once spent tuppence on a secondhand Iliad, and was dazzled: "What a revelation it was to me! Pictures of romance and beauty I had never dreamed of suddenly opened up before my eyes. I was transported from the East End to an enchanted land. It was a rare luxury for a working lad like me just home from work to find myself suddenly among the heroes and nymphs of ancient Greece."
While studying Greek philosophy at night, Joseph Keating performed one of the toughest and worst-paid jobs in the mine: shoveling out tons of refuse. One day, he was stunned to hear a co-worker sigh, "Heaven from all creatures hides the book of fate." "You are quoting Pope," Keating exclaimed. "Ayh," replied his companion, "me and Pope do agree very well."

And just look at these statistics:

Even more impressive is a 1940 survey of reading among pupils at nonacademic high schools, where education terminated at age 14. This sample represented something less than the working-class norm: the best students had already been skimmed off and sent to academic secondary schools on scholarship. Those who remained behind were asked which books they had read over the past month, excluding required texts. Even in this below-average group, 62 percent of boys and 84 percent of girls had read some poetry: their favorites included Kipling, Longfellow, Masefield, Blake, Browning, Tennyson, and Wordsworth. Sixty-seven percent of girls and 31 percent of boys had read plays, often something by Shakespeare. All told, these students averaged six or seven books per month. Compare that with the recent NEA study Reading at Risk: A Survey of Literary Reading in America, which found that in 2002, 43.4 percent of American adults had not read any books at all, other than those required for work or school. Only 12.1 percent had read any poetry, and only 3.6 percent any plays.

Welsh mining towns had libraries, the books being paid for by subscriptions out of the miners' meagre wages; members of the working class could go to night classes held at the university level that gave no recognition, only holding out the enticement of knowledge for knowledge's sake. The point of the article is that the classics aren't just for academics; they are for everyone; that literary historians should not only be concentrating on the reactions of academic readers, but also the "ordinary" reader - though these people weren't ordinary at all.

Thursday, November 18, 2004

Ooh, Google's done it again

I was just wishing the other day for a search engine that would turn up academic papers and theses, and look what Google just unveiled: Google Scholar [via]. NYT article here [reg req'd].

Google Scholar enables you to search specifically for scholarly literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research. Use Google Scholar to find articles from a wide variety of academic publishers, professional societies, preprint repositories and universities, as well as scholarly articles available across the web.

I just tried it out. It's pretty good. The only syntax that they talk about in the About section is the author: syntax; it would be nice if there were a few others, like year of publication or journal. That would be particularly useful for people without access to an academic library (like me) - you would still be able to see what new articles are coming out and then try to scout around for them online. But if you know what you're looking for, you're pretty much guaranteed to find it anyway (if it's online). This is how they rank results:

This relevance ranking takes into account the full text of each article as well as the article's author, the publication in which the article appeared and how often it has been cited in scholarly literature.

Tried it with a few papers from Language and other papers that popped into my mind. Of five articles in the latest volume of Language, I found three that had preprints online. The other two gave no useful results.

Example: Searching for "author:walker typology consonant agreement" gave me this as the first link:

[PDF] A typology of consonant agreement as correspondence
S Rose, R Walker - View as HTML - Cited by 10
Page 1. 1. A Typology of Consonant Agreement as Correspondence. Sharon Rose. ... 6. 2.
A typology of long-distance consonant agreement. 2.1 Cross-linguistic overview ...
Ms, University of California, San Diego and University of …, 2001 - - - -

You can see that several links will be given and clustered under the same paper - wonderful!

I tried a few other papers that I knew were online and Google returned all of them as first link. Great resource. I know I'll be using this a lot.

Update: this article from Resourceshelf is well worth a read:
+ In a nutshell, Google has built an algorithm that makes a calculated guess as to *what it thinks* is scholarly content mined from the OPEN WEB, and then makes it accessible via the Google Scholar interface.
+ Precisely what makes something "scholarly" enough to be included in Google Scholar, Google will not say. And this is not an insignificant omission.

Tuesday, November 16, 2004

Malcolm Gladwell's "Blink"

I like Malcolm Gladwell's stuff - it's always thought-provoking. He's most famous, of course, for The Tipping Point, but he also has a whole lot of essays on his website that are worth reading. His recent New Yorker article about plagiarism and copyright is a good one too. Also on his website are excerpts from his upcoming book Blink, which certainly looks like it'll be a worthwhile buy.

From "The Second Mind":
[Gladwell describes a psychological experiment]...right around the time their palms started sweating, their behavior began to change as well. They started favoring the good decks, and taking fewer and fewer cards from A and B. In other words, the gamblers figured the game out before they figured the game out: they began making the necessary adjustments long before they were consciously aware of what adjustments they were supposed to be making.


...our brain uses two very different strategies to make sense of the situation. The first is the one we're most familiar with. It's the conscious strategy. We think about what we've learned, and eventually come up with an answer. This strategy is logical and definitive. But it takes us eighty cards to get there. It's slow. It needs a lot of information. There's a second strategy, though. It operates a lot more quickly. It starts to kick in after ten cards, and it's really smart because it picks up the problem with the red decks almost immediately. It has the drawback, however, that it operates--at least at first--entirely below the surface of consciousness. It sends its messages through weirdly indirect channels, like the sweat glands on the palms of our hands. It's a system in which our brain reaches conclusions without immediately telling us that it's reaching conclusions.
This sounds interesting because of its potential relationship to language acquisition. After all, a lot of language acquisition is done subconsciously - the vast majority of us never formalize our knowledge of, say, the phonotactics of our mother tongue or the syntactic rules of agreement. When asked to explain why I thought a certain construction was ungrammatical when I was little, I would just shrug and say "It sounds wrong" (this was in an English class where not everyone was a native speaker of English - that's how it is in Singapore). The knowledge is there but it's subconscious. And moreover our brains acquire this language with what seems like insufficient evidence - the classic "poverty of the stimulus". But what if we actually need very little information to construct theories of grammar on a subconscious level? Even if our "logical" brains would be unable to derive the rules of syntax based on what evidence was presented to us in the first five years of life, perhaps this subconscious level of learning just takes what it gets and manages to produce something approaching the correct answer anyway.

All this is just speculation, of course, and there are probably serious problems with the comparison. Not least that, in all probability, this subconscious level of reasoning is something that belongs to most species, while the conscious level is what sets humans apart. (On the other hand, it could be that we're just better at the subconscious reasoning, and therefore we have language, and therefore we can reason at the conscious level...but I'm starting to get out of my depth here.) The excerpt just got me thinking, which of course is why I like reading Gladwell's work in the first place. I encourage you to go take a gander at this excerpt in particular and then wander around his website a little. You're bound to find plenty of food for thought.

Three amusing links

Cycle facility of the month: "This site is dedicated to highlighting examples of how innovative design and outstanding engineering offer safety, utility, and comfort to cyclists." Must-read for cyclists and good for a few guffaws [via].

The future is big: fortune-telling is the fastest growing industry in India, with computers and technology being harnessed to its purposes [via World-changing].

And: can you tell me how to get, how to get to Mordor?

Monday, November 15, 2004

Nothing new under the sun

Since just about everyone I know reads Arts & Letters Daily, you will probably have read this review of "The Seven Basic Plots of Literature" by Christopher Booker. Basically, he claims, all storylines can be reduced to seven plots (or combinations of plots): rags-to-riches, the quest, voyage and return, overcoming the monster, rebirth, comedy and tragedy. I find this view delightfully reductionist, but couldn't help thinking of it from the perspective of copyright (I'm trying to read up on copyright right now, so my mind's full of it) - suppose there were such a thing as unlimited copyright; then the copyright holders of a book like (say) the Bible (which pretty certainly has all the plotlines given above) should be able to sue every writer of every fictional story/novel/whatever. Or, come to think of it, if you view your life story as some intricate plot unfolding in real time, then everyone should be sued for living - unless you have the dullest possible life, just going to the office every day and going straight back home again. Then again, I guess that might qualify as "tragedy".

Anyway, something else I realised when reading "Lord of the Rings" - there really isn't anything new under the sun, is there? I mean, think about something like a (non-human) speech recognition system. That seems pretty modern, doesn't it? After all, it still isn't something we can do at all well. But the idea was there before - think of Ali Baba and the 40 thieves and how to get into the cave you had to say "Open Sesame". Well, how did the cave know to open the door? Automatic speech recognition! [LOTR equivalent: say "friend" to enter at the gates of Moria.] And what about video-conferencing? Well what about things like the Palantíri, the seeing-stones that could be used, among other things, to communicate between distant places. I guess true genius lies not in coming up with new things, but in making what was once rare, commonplace and what was once magic, mundane.

And in other news: I guess some Republicans weren't too pleased about this site, which publishes apologies from Americans to the rest of the world for having re-elected Bush, and so they created their own rival sites. I think they could have come up with a better URL for one of them, don't you? My automatic parse of the URL was "were not sorry", which automatically conjures up the continuation "but will be soon!". Just the message they wanted to send, I don't think!

Well, that's all for today. My mood today is best characterised by the Singlish (well, originally Hokkien) phrase boh liao, which explains the rambling.

Tuesday, November 09, 2004

Strict rules of idiolectal word usage

Mark Liberman at Language Log comments on "snoots" and their propensity to make up words at will. He makes a good point, I think:

I've noticed over the years that snoots often like to make up words, and I've wondered why people who value traditional usage so highly are also so open to lexical innovation. The paradox evaporated when I realized that the snootish impulse is not a defense of the community's traditions, it's an assertion of linguistic ego. And what could be more egocentric than inventing new words? ... Snoots don't check lexicographic and grammatical facts because their complaints are about subjective pain, not about objective facts of usage. Though they masquerade as defense of social norms, such screeds are really the howls of a wounded self, demanding primacy.

I've noticed in myself a tendency to "make up" rules of usage of certain words based on how I myself perceive the distribution of these words and use them myself.

Exhibit 1: indexes/indices

I only use "indexes" when referring to the sorts of things you find at the backs of books, and "indices" with stocks and in mathematics, i.e. "raising to the power of". So I've carved up the singular "index" into several meanings and strictly assigned to each meaning a certain plural, although I expect (I've no data on this whatsoever) that most people would use them interchangeably or just stick to a certain one depending on their idiolect. To me, it's anathema to say "there are four indices in the Lord of the Rings trilogy" - yuck! I mean, bluck!

Exhibit 2: dubious/doubtful

I myself have very strict rules about how to use these two words, but I know for sure that my views on this aren't shared by the majority - of Americans, at least. I would say, "I'm doubtful of his claims", NEVER "I'm dubious about his claims" - because dubious to me has a negative connotation and means something like shady in "a shady character". So "I'm dubious about his claims" first of all makes no sense because dubious can't subcategorize for any PPs or anything, and second of all seems to be saying, "I'm a shady character." I remember first encountering dubious in the sense of my "doubtful" during a linguistics class and I nearly laughed out loud at the professor for calling himself that. But since then I've heard many Americans use dubious in that way and while I may smile inwardly, I know now what they mean.

Tuesday, November 02, 2004

Why I love the BBC

(1) the Creative Archive: pilot to start next year. This article discusses lots of cool aspects of the project, including:
- P2P distribution, "so that the public become not just their creative partners, but distribution partners too"
- metatagging of content, with layering of metadata "so that content will be searchable in many different ways"
- Creative Commons licences for the material
- Unfortunately, it looks like content is not supposed to be distributed out of the UK :-( but I'm sure there'll be ways to get access to it nevertheless
- no DRM scheme will be put on the material

(2) this Welsh-English word-by-word translation interface, called Vocab/Geirfa: mouse over Welsh words to see the English translation [via]. Apparently, the program is being offered as open source code to Welsh-language sites, and it was designed to be usable for translating words between any two languages. Oddly enough, not all words are highlighted and therefore translated: I wonder why words like sbon and gyfer receive short shrift compared to ones like Tacteg and llwyfan. Looking at a Welsh-English lexicon, it seems that:

- sbon is an adverb used in expressions such as newydd sbon 'brand new/span new', and doesn't have an independent meaning.
- gyfer, ddrama, cefnogwch are inflected forms - so the dictionary is basically looking up base forms, and can't handle such forms yet, it appears.
- gyda "is used predominantly in South Wales", and the standard form (or so it seems) is cyda.

And of course proper names don't get a translation. Some kinks, but this is cool stuff.
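Just to make my guess about base forms concrete, here's a toy sketch in Python. Everything in it is my own assumption, not how Vocab/Geirfa actually works: the tiny lexicon, the glosses and the function names are all made up for illustration. The idea is that a plain lookup only finds base forms, while a lookup that first tries to undo the Welsh soft mutation on the initial consonant (d -> dd, c -> g and so on) would also catch forms like ddrama and gyfer.

```python
# Hypothetical mini-lexicon of base forms; the glosses are illustrative only.
LEXICON = {
    "drama": "drama",
    "llwyfan": "stage",
    "newydd": "new",
    "cyfer": "(as in 'ar gyfer', 'for')",
}

# Pairs of (mutated initial, original initial) for Welsh soft mutation.
# "dd" is listed before "d" so the longer match is tried first.
# The g -> (nothing) mutation is deliberately left out of this sketch.
UNDO_SOFT_MUTATION = [
    ("dd", "d"), ("g", "c"), ("b", "p"), ("d", "t"),
    ("f", "b"), ("f", "m"), ("l", "ll"), ("r", "rh"),
]


def candidates(word):
    """Yield the word itself, then guesses at its unmutated base form."""
    word = word.lower()
    yield word
    for mutated, radical in UNDO_SOFT_MUTATION:
        if word.startswith(mutated):
            yield radical + word[len(mutated):]


def lookup(word):
    """Return (base form, gloss) for the first candidate in the lexicon."""
    for form in candidates(word):
        if form in LEXICON:
            return form, LEXICON[form]
    return None  # e.g. sbon, or a conjugated verb like cefnogwch


for w in ["llwyfan", "ddrama", "gyfer", "sbon", "cefnogwch"]:
    print(w, "->", lookup(w))
```

Undoing mutations only gets you part of the way, of course: a conjugated verb like cefnogwch would need proper morphological analysis, and a word like sbon needs the whole expression to be listed.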

I remember listening to the BBC when I was a kid, usually in the car. We'd hear the announcer saying in his RP accent (that was before they began diversifying to regional accents), "This is the World Service of the BBC, broadcasting from Bush House, London" and then the start of Lilliburlero [recording, lyrics], then the beeps to the hour. Sometimes, we'd get the Chimes of Big Ben and then "This is London" and the news. It always gave me a thrill. But nowadays they don't seem to use Lilliburlero anymore, and anyway I listen to all my BBC on the net. Radio 4 for intelligent programmes on current affairs, history, etc., BBC 7 for comedy (especially old episodes of Just a Minute) and Radio 3 for music.

We get the BBC especially strongly here (88.9 FM) because we have a shortwave retransmitter tucked in the backwoods somewhere. I went there once as part of a car rally. We were supposed to count the number of transmitting towers they had but they were hidden by the trees, so we just asked the watchman who promptly told us the answer. He knew it off the top of his head because another car had already come by and asked him the same question.

During the First Gulf War, my mum bought a little handheld receiver so we could get immediate news about all the developments in Kuwait - especially important to us because she worked in a Kuwait-based bank at the time. I remember going to her office after school and playing there and listening to the radio.

And then we'd listen to the Brain of Britain quiz and try to guess the answers (never got more than 5 in an episode), or enjoy the cleverness of the contestants on Just a Minute, and I remember scrounging around for tapes to record the science programmes (I've forgotten their names now). Gosh, I can't wait to get my hands on the stuff in the Creative Archive. One of the first things I'd probably look for would be the broadcast by Michael Ventris of his decipherment of Linear B. If it still exists. I hope it still exists.

Other links:
Here's a nostalgic look back at 50 years of the history of the BBC World Service.
Wikipedia article on the BBC World Service

Update: I heard "Lilliburlero" today on the World Service, so it's still in use (I hadn't heard it for a long time before that). They have a new "modern" recording of it that sounds very nice. I think I might try to listen for a while to the Internet stream on the hour to catch it again.


Another post on a neologism, this time two days old: enblogment (coined by Larry Lessig). Basically a bloggable (that's another neologism for sure, but an inevitable one once "blog" became a verb - -able is such a nice, hardworking, productive suffix) method of endorsement of a candidate. Hmm, I wonder when nasal place assimilation will occur and make this emblogment instead?

BTW, someone already pointed this out in the comments to Lessig's blog. Late off the blocks as usual.

I seem to remember that there're some other English words that have a nasal place assimilation clash but I can't seem to find any common ones.
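For the curious, the assimilation I have in mind is simple enough to write down as a toy rule in Python. This is only a spelling-level sketch under my own assumptions (the function name and the letter list are made up for illustration, and real phonology works on sounds, not letters): a prefix-final n turns into m when the stem begins with a bilabial p, b or m.

```python
# Letters standing in for bilabial consonants (a spelling-level approximation).
BILABIALS = ("p", "b", "m")


def attach_prefix(prefix, stem):
    """Join prefix and stem, turning a final 'n' into 'm' before a bilabial."""
    if prefix.endswith("n") and stem.startswith(BILABIALS):
        prefix = prefix[:-1] + "m"
    return prefix + stem


print(attach_prefix("en", "blogment"))   # n meets b, so it assimilates
print(attach_prefix("en", "dorsement"))  # n meets d, so nothing changes
print(attach_prefix("in", "possible"))   # the familiar in- / im- alternation
```

By this rule enblogment is exactly the kind of form that ought to come out as emblogment, just as in- comes out as im- in impossible.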

All about libraries

I am reading Nicholas A. Basbanes' Patience and Fortitude right now, and I feel like I need to share this passage with the world:

...As I was signing the books out at the front desk - the Athenaeum did not yet use a scanning device to record loans to its members, although that quaint practice was about to change as well - I confirmed by the blank cards tucked inside the rear pastedowns my assumption that they were, in fact, leaving the library for the first time. "Eighty-one years," I said aloud, shaking my head with amused gratitude. "You wonder who they bought these books for anyway." James P. Feeney, the silver-haired circulation librarian who was checking me out, paused momentarily and fastened his unblinking eyes on mine. "We got them for you, Mr. Basbanes," he replied evenly, and resumed his work.

What Feeney did not say - what he did not have to say - was that the books had been set aside by his predecessors for the better part of a century on the off chance that one day somebody in need might want to see them. Fortunately, the fact that nobody had requested the titles before me was not considered sufficient grounds for discarding them, a practice employed by so many other libraries in these days of reduced shelf space, stretched operating budgets, and shifting paradigms. It was as if the collective hands of Aristophanes of Byzantium, Petrarch, Robert Cotton, Christina of Sweden, Thomas Jefferson, Arthur Alfonso Schomburg - every temporary custodian of the world's gathered wisdom - had reached out through the swirling eddy of the ages and placed in my hands the precious gift of a book. It was an act of faith fulfilled, and we, their heirs, owe no less a compact to the readers of the third millennium.

Unfortunately, the national library system here tends to go through books rather quickly - I've seen books from 2002 listed as being in 'RU' - 'Repository Used' - and I've even bought books at their annual sales that nobody's ever checked out and that are only about 2-3 years old. Kind of sad really. And to think a previous Chief Executive had the audacity to 'crow' that "We don't get people complaining about service quality or about our collections any more". Guess I'll have to start complaining! Just because you can always find books that you want to read doesn't mean that you can find any particular book that you wanted to read a priori - i.e. I always fill up the slots on my library card without too much trouble, but on the other hand about 50% of the books I look for aren't in the national library system. I guess I tend to be too(?) understanding of the budget problems of libraries - after all, these libraries do have to cater to a certain common denominator, and I know my interests are pretty esoteric for Singapore. But then...knowing tax money has bought books that are being put into storage (to get them out costs S$1.50) or sold off (OK, so I'm not that mad about this part - after all, it means that I'll be able to buy them for S$2 or so) at the tender age of 2...sigh.

Had an interesting experience at the library today. For those of you who don't know, Singapore's library books are all RFID-tagged. This means that we check ourselves out - no standing in line waiting for a librarian, although occasionally you have to stand in line to wait for a machine - and returned books are cleared instantly. No waiting around for 15 minutes for your books to clear the system. Anyway, I was trying to check out a book and after several swipes I gave up and took it to the counter (labelled 'Concierge' - interesting). While waiting for the librarian to laminate some library cards, I began reading the book and found it really interesting. And then finally the librarian took the book to see what was wrong with it and told me it was due for maintenance!!! This for a book that looked like it had never been touched. She mumbled something about the labelling being wrong or something. I didn't feel like arguing, so I just gave up on the book and left. I'm such a pushover.

Jobsworths, etc.

I'm not really the type to find individual words all that interesting - etymologies and such are fun to look at from time to time but they're not something I'd devote a whole lot of time to. As you might guess, I'm pretty lazy about checking dictionaries (especially English ones - I'm more diligent with foreign languages). But when I come across a word I've never heard before, and come across it twice (maybe even thrice) in one day, I take notice. The word in question is jobsworth, and it's not in the dictionary I checked. I typed "what is a jobsworth" into Google and got this answer: "someone who is getting paid to do a job that might be unpopular with some people". Alternative answer: "somebody who sticks to the work rules very rigidly". Going by the first 20 or so sites for 'jobsworth' on Google, it's a very British word. Well, you learn something new everyday.

Speaking of 'everyday', I saw a sign yesterday that stopped me in my tracks, for a Japanese restaurant here in Singapore: 'Open all day. 11 am - 10 pm.' Hmm, somebody has their quantifiers in a twist - or perhaps omitted an 's', though it wouldn't be very elegant to put it back in anyway.

Then there was this other signboard in Queenstown today: 'No. 1 Coffeeshop 268', which induced momentary confusion. How could something that's number one also be number 268? Upon further reflection, 268 was probably the number of the coffeeshop on the road, but hey, it's a genuine parsing ambiguity.