
Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

1482.0. "Changing the Speed of Voice and Music?" by GAUSS::ASFOUR () Fri Aug 16 1991 11:17

    I'm not sure which notesfile to post this in, so I figured
    I'd start here.
    
    I've scanned a couple of papers describing several methods to
    change the speed of speech without changing its spectral
    characteristics. 
    
    e.g. play back speech slower without making it sound lower in pitch
    or distorting it.
    
    Does anyone know anything about any such techniques? Does anyone
    know if these techniques (or their equivalents) would work equally 
    well with other forms of sound such as CD quality music? 
    
    I'm working on a multi-media A/D project and we would like to be able 
    to synchronize several sources of audio and video together. Sometimes,
    the audio could take a longer or shorter time than the video, and we
    would like to be able to "shrink" or "expand" it so that they would 
    fit.
    
    Any pointers to "cookbook" methods to implement these (as opposed
    to translating the math in these papers to programs) would be 
    a great help. 
    
    		thanks.
    			Yousif.
    
    P.S. The papers I'm referring to are found in "Speech Enhancement",
    edited by Jae S. Lim, Prentice-Hall Signal Processing Series, (c) 1983.
    One paper in the book is "Time-Scale Modification of Speech Based on 
    Short-Time Fourier Analysis", M.R. Portnoff, taken from the IEEE 
    Transactions on Acoustics, Speech, and Signal Processing, Vol. 29,
    No. 3, June 1981.

T.R	Title	User	Personal Name	Date	Lines
1482.1		ALLVAX::JROTH	I know he moves along the piers	`Fri Aug 16 1991 15:06`	16
	If you have access to the usenet newsgroups, it may be worth posting
	a query in the comp.dsp group.

	Pitch/time shifting is well known, but does involve some trickery:
	you have to "splice" pieces of waveforms together at the zero
	crossings to avoid glitches. Companies like Lexicon are experts at
	this.

	Another source (possibly) would be the application notes for DSP
	products from Motorola, TI, and Analog Devices. There are public
	bulletin boards with code online as well.

	Actually, with DSP add-ons becoming so widely available even for PCs,
	I expect shareware to become available for a whole gamut of signal
	processing tasks. Lots of people are working in this area!

	- Jim
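The zero-crossing splice mentioned above can be illustrated with a small helper that moves a proposed cut point to the nearest sign change, so that two pieces joined there meet near zero amplitude. This is only a sketch in Python, not code from any of the sources named in this reply; the function name and the list-of-numbers sample format are assumptions made for illustration.

```python
def nearest_zero_crossing(samples, index):
    """Return the position closest to `index` where the signal changes sign.

    A crossing at position i means samples[i-1] and samples[i] lie on
    opposite sides of zero. If the signal never crosses zero, `index`
    is returned unchanged.
    """
    best = None
    for i in range(1, len(samples)):
        rising = samples[i - 1] < 0 <= samples[i]
        falling = samples[i - 1] >= 0 > samples[i]
        if rising or falling:
            # Keep the crossing closest to the requested cut point;
            # on a tie the earlier crossing wins.
            if best is None or abs(i - index) < abs(best - index):
                best = i
    return index if best is None else best


if __name__ == "__main__":
    wave = [1, 2, -1, -2, 3, 4]
    print(nearest_zero_crossing(wave, 5))
```

A splicing routine would call this on each intended cut point before joining pieces. Real products presumably do more (crossfading, matching slope as well as sign), which this helper does not attempt.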
1482.2	comp.dsp	HGRD01::CLCHEUNG		`Mon Aug 19 1991 01:29`	7
	There has been some discussion of this topic in comp.dsp in the past
	week. I can post it or send mail to you if you need it.

	-CL
1482.3	try DNEAST::COMMUSIC conference	BRSTR1::SYSMAN	Dirk Van de moortel	`Tue Aug 20 1991 03:12`	11
	I know that with a pitch-to-MIDI converter you can (with a microphone)
	record a (not too complicated) voice into a sequencer (MIDI recorder).
	From there on you should be able to play back at any speed without
	the pitch being distorted.

	Have a look in the DNEAST::COMMUSIC conference... this will open a
	whole new world for you. If you post your question in COMMUSIC, you'll
	surely get lots of hints...

	Good luck!
	Dirk
1482.4	Could you please post comp.dsp discussion?	GAUSS::ASFOUR		`Tue Aug 20 1991 09:43`	8
	re .2

	Thanks. Could you please post the comp.dsp discussion, or you can
	send me mail to 3D::ASFOUR.

	re others

	Thanks for the pointers. I'll follow up on them.

	Yousif.
1482.5	comp.dsp - 1	HGRD01::CLCHEUNG		`Tue Aug 20 1991 23:04`	71
	Here is the first extraction from comp.dsp.
	---------------------------------------------------------
	Dear comp.dsp readers,

	Thanks to everyone who responded to my questions about slowing down
	speech files! Here's my summary. Sorry about the delay...

	---
	The 2nd of July I wrote:

	>I need algorithms or source code for a routine to slow down
	>a speech file WITHOUT affecting the frequency domain.

	---
	Here are some replies I got:

	>From: Greg Sandell ([email protected])
	>
	>What you want is a phase vocoder. There is C code for such a
	>program in F. R. Moore's "The Elements of Computer Music",
	>Prentice Hall, 1990.

	>From: Peter Silsbee ([email protected])
	>
	>You might want to check out an article by Keith Lent in the Dec. 1989
	>(I think) Computer Music Journal. It describes an algorithm for pitch-
	>shifting which maintains the spectral envelope of a sound.

	>From: Brett Ninness ([email protected])
	>
	>I'm very interested in trying to time scale speech without altering
	>its spectral content. The main papers in the area are:
	>
	>M. R. Portnoff, "Time-Scale Modification of Speech Based on Short-Time
	>Fourier Analysis", IEEE ASSP-29 No 3 June 1981
	>
	>and
	>
	>T. F. Quatieri, "Speech Transformations Based on a Sinusoidal
	>Representation" in IEEE ASSP-34 No 6 December 1986.

	---
	I also got a reply describing a simple, straightforward method to do
	this. It goes like this:

	One plays a portion of, say, 20-100 ms, repeats it, and skips to the
	next 20-100 ms chunk. One could probably play 50 ms, repeat the last
	20 ms, and so on, to slow down just a little bit.

	I tried this method and it worked. Of course, it introduces quite a
	lot of noise and distorts the sound when you try to slow it down too
	much. I think it sounded pretty good at half the speed though.

	Thanks again to all who responded!

	- Anders Ohlsson - [email protected]
	--------------------------------------------------------------------------
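The play, repeat, and skip method summarized at the end of this extraction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code anyone in the thread actually used: the function name, the parameter names, and the representation of audio as a plain list of samples are all hypothetical.

```python
def slow_down_by_repetition(samples, sample_rate, chunk_ms=50, repeat_ms=50):
    """Stretch audio in time by replaying the tail of every chunk.

    Each chunk of `chunk_ms` milliseconds is played once, then its last
    `repeat_ms` milliseconds are played again before moving on.
    repeat_ms == chunk_ms gives roughly half speed; a smaller repeat_ms
    slows the audio down only a little.
    """
    chunk_len = max(1, sample_rate * chunk_ms // 1000)
    repeat_len = min(chunk_len, max(0, sample_rate * repeat_ms // 1000))
    out = []
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        out.extend(chunk)
        if repeat_len:
            # Replay the tail of the chunk just played (the whole chunk
            # if it is shorter than repeat_len).
            out.extend(chunk[-repeat_len:])
    return out


if __name__ == "__main__":
    one_second = [0] * 8000  # placeholder for 1 s of 8 kHz audio
    stretched = slow_down_by_repetition(one_second, 8000,
                                        chunk_ms=50, repeat_ms=20)
    print(len(stretched) / 8000)  # duration in seconds after stretching
```

The joins between chunks are abrupt here, which is consistent with the noise and distortion reported above. Cutting at zero crossings or crossfading the joins would be the obvious refinements, but they are not part of the method as described.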
1482.6	comp.dsp - 2	HGRD01::CLCHEUNG		`Tue Aug 20 1991 23:05`	31
	The second extraction
	-----------------------------------------------------------------
	Subject: Slow it down...

	Someone recently asked in comp.dsp:

	  Is there a way to slow down the rate of (e.g.) speech, without
	  altering the pitch?

	Several decades ago, the Radiophonic Workshop of the BBC used a
	modified tape machine to do this. They recorded the speech normally
	onto the tape, then played it back with a rotating head. The speed of
	delivery of the speech depends on the "absolute" rate at which the
	tape moves (i.e. the rate at which it comes off the spools), whereas
	the pitch depends on the "absolute" rate and also on the rotation
	speed of the playback head. Since you can control these speeds
	independently, you have independent control over the speed & pitch.

	There are one or two obvious problems with this approach, but I think
	it's very clever.

	----
	Tony Fisher
	Dept. of Computer Science, The University of York, York YO1 5DD, U.K.
	Tel. +44 904 432738 or 432722
	Janet: [email protected]
	Internet: fisher%[email protected]
	UUCP: [email protected] (..!uunet!mcsun!reading!minster!fisher)
1482.7	comp.dsp - 3	HGRD01::CLCHEUNG		`Tue Aug 20 1991 23:06`	35
	The third extraction.
	-----------------------------------------------------------------------------
	In comp.dsp, [email protected] writes:
	> Is there a way to slow down the rate of (e.g.) speech, without altering the
	> pitch?
	>
	> Several decades ago, the Radiophonic Workshop of the BBC used a modified tape
	> machine to do this. They recorded the speech normally onto the tape, then
	> played it back with a rotating head. The speed of delivery of the speech
	>
	> ----
	> Tony Fisher

	Back in the 1960's, there was an article in Popular Electronics
	describing a system like this with the trademarked name "Squeech".
	You could call a phone number and listen to a recording of it.

	We now return to the 1990's! You can buy a cassette recorder from GE
	called the Fastrak which allows a continuously variable control of
	speech from 15% slower than normal up to 100% faster than normal
	without varying the pitch. There is a note on the unit that says the
	IC that does it (no rotating heads!) is licensed from Variable Speech
	Corp. It appears to do the same thing as the rotating heads, only
	electronically (probably uses a Reticon analog delay line, or
	something like that).

	Also, our company's voice mail system (ASPEN by Octel) has a feature
	where you can speed up playback of your messages. The ASPEN
	implementation is considerably better than the GE Fastrak system
	(i.e. it sounds better). It ought to be fairly easy to implement it
	with a DSP chip.

	Rick Karlquist
	[email protected]
1482.8	comp.dsp - 4	HGRD01::CLCHEUNG		`Tue Aug 20 1991 23:07`	35
	The final one. Please follow comp.dsp for any further replies.
	--------------------------------------------------------------------------------
	In article <[email protected]> [email protected] (Rick Karlquist) writes:
	>In comp.dsp, [email protected] writes:
	>> Is there a way to slow down the rate of (e.g.) speech, without altering the
	>> pitch?
	> There is a note on the unit that says the IC that does it (no
	>rotating heads!) is licensed from Variable Speech Corp. It appears to
	>do the same thing as the rotating heads, only electronically (probably
	>uses a Reticon analog delay line, or something like that).

	Back in 1974-75, Electronics Magazine had a writeup on the VSC
	circuit, with a demonstration on one of those floppy plastic
	phonograph records. It sounded weird, with a lot of low frequency
	hum/modulation, but better than most voice synthesizers I've played
	with.

	They used an analog delay line with a (exponentially?) swept clock to
	stretch portions of the signal. The rest of the signal got thrown
	out. Each sample was fairly long, 20 to 50 milliseconds. There was
	some mention of detecting the start of a syllable, and using it to
	select that sample portion. To speed up slower speech, I guess they
	recycle the samples through the delay.

	Digitally this could be done with a (variable clocked) A/D feeding a
	fifo which fed a D/A, where the fifo would hold a contiguous block of
	samples with a duration of 20 milliseconds or so.

	This only works because speech has a low baud rate (20 Hz or so) and
	you can tolerate throwing out half of the signal (in the time domain).

	Mark Zenier
	[email protected]
	[email protected]
	---------------------------------------------------------------------------
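The remark above, that speech tolerates throwing out half of the signal in the time domain, suggests a very simple way to speed playback up without changing pitch: keep a contiguous block of about 20 ms, discard the next stretch, and repeat. The Python sketch below shows only that net time-domain effect. It is not the VSC circuit or the variable-clocked FIFO arrangement, and the function name, parameters, and list-of-samples format are assumptions made for illustration.

```python
def speed_up_by_discarding(samples, sample_rate, block_ms=40, keep_ms=20):
    """Shorten audio by keeping only the start of every block.

    The input is divided into blocks of `block_ms` milliseconds; the
    first `keep_ms` milliseconds of each block are kept and the rest is
    thrown away. The defaults keep 20 ms out of every 40 ms, which
    roughly doubles the playback speed while each kept piece plays at
    its original pitch.
    """
    block_len = max(1, sample_rate * block_ms // 1000)
    keep_len = min(block_len, max(1, sample_rate * keep_ms // 1000))
    out = []
    for start in range(0, len(samples), block_len):
        # Keep a contiguous run of samples; skip the rest of the block.
        out.extend(samples[start:start + keep_len])
    return out


if __name__ == "__main__":
    one_second = [0] * 8000  # placeholder for 1 s of 8 kHz audio
    faster = speed_up_by_discarding(one_second, 8000)
    print(len(faster) / 8000)  # duration in seconds after speeding up
```

Nothing here looks for syllable starts or smooths the joins, so audible artifacts at the block rate are to be expected.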