Wednesday, May 18, 2005

Time scale modification for language learning purposes - the $29.95 version

I don't know about you, but I'm a really impatient person. I read fast because I don't like to spend too much time reading. Now, I would really like to be able to listen to more lectures and interviews and other audio online, because there's really good stuff out there, and coming from a college town back to Singapore, where there's not much in the way of public lectures and all that sort of thing (well, none that I'm interested in hearing, anyway), it's the next best thing.

The trouble is, I'm too impatient to sit down and listen to it. When it's not live, when you can't see the person who's talking, you realise that people actually speak really slowly. They are careful to enunciate their words - all well and good, but I've a pretty good language model in my head, so I can stand some speeding up.

Now, supposing you were listening to a lecture and pressed the fast forward button - what do you hear? Chipmunks? Mickey Mouse? Well, something along those lines anyway. That's because playing a recording twice as fast doubles every frequency in it, and frequency is what we hear as pitch, so the speaker's voice rises - hence the "chipmunk effect". I'm not kidding, that's what people call it. Now, to speed things up *without* creating the chipmunk effect, you'd probably need to take each vowel and truncate it, etc., etc. I haven't looked into how it works, but there are tools out there to do this, and it's called time scale modification.
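To make the chipmunk effect concrete, here is a minimal sketch in Python. It only shows the naive fast-forward, not time scale modification itself. The 8000 Hz sample rate, the 220 Hz test tone and the function names are all assumptions made up for this illustration, and the pitch estimate is a crude zero-crossing count that only works for a clean tone.

```python
import math

RATE = 8000  # samples per second; an arbitrary choice for this sketch


def tone(freq_hz, seconds, rate=RATE):
    """Generate a pure sine tone as a list of samples."""
    return [math.sin(2 * math.pi * freq_hz * i / rate)
            for i in range(int(seconds * rate))]


def estimate_pitch(samples, rate=RATE):
    """Rough pitch estimate in Hz: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * rate / len(samples)


def fast_forward(samples, factor=2):
    """Naive speed-up: keep every factor-th sample, play at the same rate."""
    return samples[::factor]


original = tone(220, 1.0)          # one second of a 220 Hz tone
sped_up = fast_forward(original)   # the same tone, "fast-forwarded" 2x

# Print duration in seconds and estimated pitch in Hz for each version.
print(len(original) / RATE, round(estimate_pitch(original)))
print(len(sped_up) / RATE, round(estimate_pitch(sped_up)))
```

The sped-up version lasts half as long and its estimated pitch comes out at roughly double the original. A time scale modification tool has to get the first result without the second.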

So I looked on the Web and found that a company called Enounce makes tools to speed up your listening. The software is a plug-in to RealPlayer and Windows Media Player, and works whether you're playing something you've downloaded or streaming from the Net. Although, of course, if it's live you can't play it twice as fast - it would be really funny if you could, but you can't. There's a free 7-day demo, so I tried it out on a talk from IT Conversations by Cory Doctorow. It worked pretty well! No squeaky voices (though you can alter the properties to get the chipmunk effect, if you need a laugh). I found 2x was a bit fast for me, since I wasn't used to it yet. 1.5x was perfect this time round. If I practise with it I can probably work up to 2x or higher, I guess.

Apparently there are other benefits besides the time factor to this software: students learn better because playing it faster makes you concentrate harder, for example. [Link to BYU paper on this, pdf format] But what interested me more was seeing that you could not only speed up, you could also slow down - to as low as 0.3x the regular speed.

Many language-learning advice websites suggest listening to online broadcasts in your target language. These are available for an incredible array of languages. Deutsche Welle has 30 languages, the BBC World Service broadcasts in 43, and Radio Canada in 8. (These are my favourites.) Each of these three has broadcasts in Arabic, my target language - the only thing is, the presenters speak too fast for me. I'm not yet at the stage where I can listen to broadcast news read at normal speed, though when I know the story in question I can usually work it out. What I really need is for them to go just a little bit slower. The BBC knows this, and Deutsche Welle knows this - both have programmes for language learners where they speak a little more slowly and clearly, using less difficult words. But I haven't found anything comparable for Arabic.

But all I have to do is to put Enounce and those broadcasts together, which I tried today. I found that playing at 0.7 - 0.8x the normal speed worked pretty well for me. I had time to work out how the more complex words parse, and to recall what they meant. I think this will be a big help for my Arabic-learning efforts.

The thing is, Enounce costs $29.95. I think this is a pretty reasonable price, but tomorrow let's see if we can go one step better and do this for free.


hh said...

I sat next to a blind guy on a plane last year, and he had his text-to-audio pocket device in hand. He was listening to the day's NYTimes read in automatically generated speech. That was the first time I learned about the possibility of speeding up audio speech for faster listening -- apparently blind people do it as a matter of course, and get amazingly good at it. He played me a bit of what he was listening to and it was completely incomprehensible -- and he said that his skills were mediocre compared to some of his friends and associates. I didn't ask him what the rate was, but I'd be willing to bet it was faster than 2x. Cool!

3:55 PM

C. Callosum said...

Neat, I had no idea!

Let's see now. According to this NYT article, we can understand speech at rates of up to 400 words per minute. And according to this Language Log post by Mark Liberman, 150-200 wpm is the normal range for speaking rates, though of course it depends on many factors. So 2x, which takes normal speech to 300-400 wpm, seems a pretty good speed-up.

Since you couldn't understand it, hh, I guess it was probably faster than 2x as you say. Today I managed to get up to 2.5x on Enounce with a lecture by Clayton Christensen, but since it was a lecture he was speaking pretty slowly. I wonder how far up you can go, even with training and practice, until it's just total gibberish to everyone. Apparently the world record for fastest speaker belongs to one Fran Capo, at 603.2 wpm, so people can certainly produce words at that rate, but I don't know whether you'd understand them!
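For anyone who wants to play with these numbers, here is a small back-of-envelope sketch in Python. It uses only the figures quoted above (150-200 wpm for normal speech, 400 wpm as the comprehension limit). The function names are made up for this illustration, and it assumes that understanding depends only on words per minute, which is surely too simple.

```python
NORMAL_WPM = (150, 200)      # normal speaking range quoted above
COMPREHENSION_LIMIT = 400    # upper limit quoted above, in words per minute


def effective_wpm(speaking_wpm, speedup):
    """Words per minute the listener hears at a given playback speed."""
    return speaking_wpm * speedup


def max_speedup(speaking_wpm, limit=COMPREHENSION_LIMIT):
    """Largest playback speed that keeps the speaker under the limit."""
    return limit / speaking_wpm


for wpm in NORMAL_WPM:
    print(wpm, "wpm speaker:",
          effective_wpm(wpm, 2.0), "wpm at 2x,",
          "limit reached at", round(max_speedup(wpm), 2), "x")
```

By this arithmetic, 2.5x stays within 400 wpm only for someone speaking at 160 wpm or slower, which fits with a slow lecturer.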

Oh, and another thing - I wonder if synthesized voices vs natural human voices would make a difference in speeding things up. Perhaps with synthesized voices, there's less variation, and it's easier to shorten and smooth them together? Just a thought.

8:09 AM  

