
Conference napalm::commusic_v1

Title:	* * Computer Music, MIDI, and Related Topics * *
Notice:	Conference has been write-locked. Use new version.
Moderator:	DYPSS1::SCHAFER

Created:	Thu Feb 20 1986
Last Modified:	Mon Aug 29 1994
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2852
Total number of notes:	33157

1225.0. "Sampling - Time compression w/o frequency shift" by HPSTEK::RHODES () Mon Feb 22 1988 08:42

Tom's "for sale" item prompted this note.  This has got me baffled.

In either the analog or digital domain, how do you possibly shorten or
lengthen the playback time of a stored signal without increasing or 
decreasing the pitch of the stored signal?  Whether it be a tape deck,
record player, any electronic analog player, any digital player, etc,
munchkinization will occur for obvious reasons when reading the data 
at a higher rate than the rate at which it was recorded.

I know that Lexicon won an Emmy for inventing a device that allows 
time compression/expansion of music without the frequency shift that 
normally occurs so that the length of a music track could be matched 
with the length of a movie clip.  This is a very EXPENSIVE device.  How
does it work, and how is it that an $80 device can do the same thing?

My guess on the Lexicon is that the signal information sampled in 
the time domain is converted to frequency information via numerous 
FFTs, the frequency information is then scaled to reflect the frequency
shift necessary to offset the difference in playback rate, the signal
information is reconverted back into the time domain, and then played 
at the altered rate.  Sound realistic?
                                 
Any comments greatly appreciated.

Todd.

T.R	Title	User	Personal Name	Date	Lines
1225.1		RANGLY::BOTTOM_DAVID	If the phone don't ring..	`Mon Feb 22 1988 09:11`	6
	You used to be able to do this with the SAD-1000 (analog delay IC, found in most analog delay stomp boxes). I had the spec sheets and notes on how to do this; if you're interested I'll see if they still live in my possessions somewhere....

	dave
1225.2	Eventide harmonizers, pitch change, and tape	ANGORA::JANZEN		`Mon Feb 22 1988 09:26`	57
	I have the same information.

	To see how it's done in charge-coupled devices or digitally, let's see how we would do it on magnetic tape.

	Problem: change the speed of something without changing its pitch.

	Well, we can change the speed of an audio recording on a variable-speed tape recorder by changing the tape speed. However, this changes its pitch. If we could change the pitch in the direction opposite to the one in which we changed the tape speed, we could correct for this. Therefore, the only new problem is pitch change.

	How can we do pitch change on a tape recorder? Raising a pitch and lowering a pitch are separate problems. Let's begin with lowering a pitch.

	1. Record a sound, say, me saying "record a sound." out loud, on 1/4" tape running at 15 inches per second. Let's say it took me 1 second to say that phrase.
	2. Using a razor blade, cut the tape every 1/4" along its length. This would require a great deal of skill and patience to actually do, of course.
	3. Throw away every other 1/4" length of tape, leaving half as many.
	4. Splice them together in sequence.
	5. Play the tape at half speed, or 7.5 ips.

	The result of this will be that my voice will be an octave lower, but the phrase will be said in the same length of time, one second. If, instead, we had wanted to increase the speed without changing the pitch, we would just play it at the original speed, 15 ips. It will take .5 seconds. The little 1/4" lengths of tape will merge together with some fluttering to reproduce the phrase with only half the original information.

	Now, let's try to raise a pitch an octave. This is harder.

	1. Record a sound of me saying "record a sound." Duplicate this tape on another tape. Cut both tapes into 1/4" sections, and insert corresponding pieces of tape (like, the "rd" sound in the word "record" on both tapes) next to each other. Splice it all together. The whole tape is now 30 inches, twice as long as the original or the copy. Play it at the original speed, which we said before was 15 ips. It will take 2 seconds for me to say "record a sound". Now, play it at 30 ips. It will take 1 second to say the words, as in the original, but will be an octave higher (munchkin land).
	2. Go to LEDS-BIM

	To make a digital version, replace the magnetic tape with digital memory. Write a program that will adjust the amount of sound deleted or duplicated so that pitch can be moved over a continuous range.

	To make a CCD version, read the data out at a different speed than it's read in. Read data again if the player is reading faster than the recorder, and throw away data if the player is reading more slowly than the recorder.

	That should make it clear to people who already know how it works.

	Tom
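	The razor-blade procedure in .2 maps fairly directly onto a list of samples. What follows is only a minimal sketch of that idea, not what any commercial unit does: the names `splice`, `seg_len` and `ratio` are made up for illustration, the "tape" is a plain Python list, and no smoothing is applied at the joins.

```python
def splice(samples, seg_len, ratio):
    """Cut `samples` into pieces of `seg_len` samples and re-join them.

    ratio < 1 throws pieces away (0.5 keeps every other piece),
    ratio > 1 repeats pieces (2 emits every piece twice).
    The result still has to be played back at a chosen rate: the same
    rate changes the duration only, a compensating rate changes the
    pitch only.
    """
    out = []
    owed = 0.0  # how many copies of the current piece we still owe the output
    for start in range(0, len(samples), seg_len):
        piece = samples[start:start + seg_len]
        owed += ratio
        while owed >= 1.0:
            out.extend(piece)
            owed -= 1.0
    return out


tape = list(range(12))                 # stand-in for recorded samples
shorter = splice(tape, seg_len=3, ratio=0.5)
longer = splice(tape, seg_len=3, ratio=2)
print(len(tape), len(shorter), len(longer))
```

	With `ratio=0.5` the output is half as long, which corresponds to steps 3 and 4 of the pitch-lowering recipe; with `ratio=2` every piece appears twice, as in the pitch-raising recipe. Fractional ratios give the "continuous range" mentioned for the digital version, at the cost of uneven spacing between dropped or repeated pieces.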
1225.3	Tom's right...	SQM::VINSEL	she took my bowling ball too	`Mon Feb 22 1988 10:45`	12
	I did a lot of work on this very thing a few years back. We were building a digital voice recorder system for Emergency 911 phone systems. We even got a few patents on the filtering & time conversion process that we came up with.

	Tom's explanation .34 is pretty accurate, but leaves out the important problems that must be dealt with: the filtering, and the selection of fill data for the conversion to higher frequencies. We were most concerned with fast and slow playback with no frequency shift, but ended up with a system that handled both time shift with no frequency shift and frequency shift with no time shift.

	pcv
1225.4	Even cheap commercial electronics can do this.	BOLT::BAILEY	Steph (stef') Bailey	`Mon Feb 22 1988 14:53`	12
	I got bored reading the explanation, but it can probably be done the same way that CD players implement the cue fast-forward function, but on a finer grain.

	That is, actually play N seconds out of every M seconds of your source program samples, and skip the other M-N seconds.

	N must be less than M (obviously) and M is greater than the length of time required to audibly recognize the lowest frequency component in your source. Of course I'm sure this can introduce anomalies if N and M aren't chosen appropriately, but you get the idea.

	Steph
1225.5	should have read the whole thing	TIGER::JANZEN		`Mon Feb 22 1988 15:13`	29
	>< Note 1225.4 by BOLT::BAILEY "Steph (stef') Bailey" >
	>          -< Even cheap commercial electronics can do this. >-
	>    That is, actually play every N seconds of M seconds of your source
	>    program samples, and skip M-N of M seconds.
	>    N must be less than M (obviously)

	I don't know what CDs do on fast forward, but this is not a reasonable description for pitch changing. N can be more than M if you want to raise pitch; you repeat a segment of audio. A reasonable size for M has to be larger than the period of your lowest frequency. It may be roughly 1/12 second.

	Splicing the pieces together creates a problem; if the waves on either side of the juncture are at different levels or have very different slopes, a glitch or flutter can result. Perhaps commercial digital processors smooth the gulf by interpolating a ramp across the two different levels. I suspect that the Yam SPX90 looks for same-direction zero-crossings, because the pitch-change interval changes depending on the absolute pitch of the input; octaves are sometimes perfect, sometimes big, sometimes small, but always the same for a given absolute input pitch.

	>    and M is greater than the length of time required to
	>    audibly recognize the lowest frequency component in your source.  Of
	>    course I'm sure this can introduce anomalies if N and M aren't chosen
	>    appropriately, but you get the idea.
	>
	>    Steph

	tom
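	The two remedies guessed at in .5 (a ramp across the join, and splicing at same-direction zero crossings) can each be sketched in a few lines. This is an illustration under assumptions, not a description of the SPX90 or any other product: `crossfade_join`, `rising_zero_crossings` and `overlap` are hypothetical names, and the ramp is a plain linear one.

```python
def crossfade_join(a, b, overlap):
    """Join segments a and b, blending the last `overlap` samples of a
    with the first `overlap` samples of b along a linear ramp."""
    overlap = max(0, min(overlap, len(a), len(b)))
    head = list(a[:len(a) - overlap])
    tail = list(b[overlap:])
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)      # weight of b rises towards 1
        mixed.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + mixed + tail


def rising_zero_crossings(samples):
    """Indices where the signal goes from negative to non-negative.
    Cutting both segments at such points keeps level and slope direction
    roughly matched across the join."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < 0 <= samples[i]]


# A step from level 1.0 down to 0.0 is replaced by a short ramp.
print(crossfade_join([1.0] * 4, [0.0] * 4, overlap=3))
print(rising_zero_crossings([-1, 1, -1, 1]))
```

	The crossfade shortens the result by `overlap` samples, so a real splicer would have to account for that when computing the overall time ratio. Restricting cuts to zero crossings makes the segment length depend on the input's pitch, which is consistent with the pitch-dependent intervals described for the SPX90, though that remains the poster's suspicion.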
1225.6		HPSTEK::RHODES		`Wed Feb 24 1988 08:39`	11
	Thanks for the input, especially Tom. So as I understand it, the input signal is somewhat altered (distorted) by the application of the time compression algorithm. Information is in fact lost. I've noticed that a sideband type of radio creates a munchkinization or deathbooming if the frequency of reception is not exactly equal to the frequency of the transmitted carrier. What is happening here? Todd.
1225.7	you mean this isn't EDT?????	ANGORA::JANZEN		`Wed Feb 24 1988 08:50`	17
	>< Note 1225.6 by HPSTEK::RHODES >
	>I've noticed that a sideband type of radio creates a munchkinization or
	>deathbooming if the frequency of reception is not exactly equal
	>to the frequency of the transmitted carrier.  What is happening here?
	>
	>Todd.

	For cheap pitch change, try a balanced modulator with a sine carrier, like the PAiA one; I also found a kit for one in a pulp electronics catalog.

	I think that in shortwave, the frequency of the resulting signal is shifted by the difference between the carrier and the heterodyne tuner's signal. The farther apart they are, the higher the frequency difference, the munchkiner it gets. Or roughly like that.

	Tom
1225.8	Radio, Radio	AQUA::ROST	I'll buy you a cherry phosphate	`Wed Feb 24 1988 09:36`	7
	Re: .7 This is correct. When I used to work in communications, the microwave radio level metering sets had an audio monitoring output that had this frequency-shifting effect.
1225.9		DFLAT::DICKSON	Network Design tools	`Wed Feb 24 1988 10:12`	12
	In Single Sideband modulation (which does in fact use a balanced modulator), essentially only the harmonic structure of the sound is transmitted. The fundamental is reconstructed at the receiver. When the receiver and transmitter are not using the same reference frequency (caused by one of them being tuned wrong), the reconstruction in the receiver is misled and it puts the fundamental back too high or too low in the audio spectrum, off by the difference in the radio frequency carriers. I think the spacing of the harmonics remains the same, however, so it is not exactly a perfect pitch change. Boy, it's been a long time since I fooled with this stuff.
1225.10	tom	PLDVAX::JANZEN		`Wed Feb 24 1988 10:43`	8
	It's correct to say that balanced modulation does not offer real pitch change, linearly. The arithmetic relationships distort harmonic content. I use a balanced modulator to turn a piano into a prepared piano; it sounds like John Cage's technique of putting rubber wedges, screws, and other junk in between the strings of a piano. This works because the harmonic content of the sound changes. otm
1225.11	Acht, but Herr Docktor Professor...	CTHULU::YERAZUNIS	Snowstorm Canoeist	`Wed Feb 24 1988 11:17`	97
	Exactly correct, Mr. Dickson...

	DANGER - MATHEMATICS AHEAD -

	--------Theory of Electromagnetic Communications Lecture 401.30-----

	On SSB (and in a ring modulator), the information that is passed through correctly can be described as a bunch of frequencies of the form

		X+F1, X+F2, X+F3 ...

	where the Fn's are known, but X isn't. X is the carrier frequency of the ring modulator (provided it's a sine wave), or the _difference_ in frequency of the SSB transmitter and receiver center frequencies.

	So, SSB where the transmitter and receiver are "right on" reconstructs the signal correctly; likewise a ring modulator with carrier=0 doesn't alter the signal.

	What happens if X isn't zero:

	All of the frequencies are displaced by a constant number of Hz from their correct location - NOT by a multiplicative constant. This (in general) results in a sound where the original harmonics (f, 2f, 3f, etc) are no longer in integer ratios - the distinctive ring modulator sound.

	Second interesting effect: "negative frequency wrap-around":

	It really isn't fair to say a sound has harmonics at f, 2f, 3f, etc. Really, it has harmonics at ±f, ±2f, ±3f, etc. The energy at frequency f is really split equally between a positive frequency at +f and a negative frequency at -f.

	With normal (non-ring-modulative) processing, such as time-compression or time-expansion (via magnetic tape, sampler, or CCD), all harmonic frequencies are MULTIPLIED by some constant, but they are still integer multiples.

	Example:

		Original:                  ±f, ±2f, ±3f        = ±440, ±880, ±1320 Hz
		speed up playback by 1.3:  ±1.3f, ±2.6f, ±3.9f = ±572, ±1144, ±1716 Hz

	but the harmonics are still in the ratios of 1:2:3. It just sounds like you're playing the same line somewhere between a major third and a fourth higher.

	But with a ring modulator, the + and - frequencies don't shift equally - they both shift upwards.

	Example:

		Original:  ±f, ±2f, ±3f = ±440, ±880, ±1320 Hz

		Ring modulate with 400 Hz carrier wave:

		+f+400, -f+400, +2f+400, -2f+400, +3f+400, -3f+400
		= 840, 40, 1280, 480, 1720, 920 Hz

	What does this sound like:

	Well, there's a low rumble (the 40 Hz), a slightly shifted original line (at 480 Hz), a pair of lines about an octave up and about a semitone apart <<yecch>> (at 840 and 920 Hz), and then a pair of rather thin high lines up at 1280 and 1720 Hz - a ratio of about 1.34 to one, which is pretty close to a perfect fourth.

	It most certainly does NOT sound like a sampler in good working condition, nor like a sped-up tape loop.

	Note that the low rumble is due to the -f component being shifted upwards from -440 Hz to -40 Hz. Since the ear can't tell the difference between -f and +f, the output sounds like it has six, rather than three, positive frequencies.

	More magick:

	The new sound really has twelve frequencies: each of the six frequencies above is really a +/- pair again! So, a second ring modulator after the first would give a signal with 12 positive frequencies (really 12 +/- pairs), etc....

	Even more majick:

	What if the modulator is NOT a sine wave??? This gets very complicated very quickly here. Learn about Bessel functions if you're really interested.

	----------------------------------

	Do people really read these long mathematical notes of mine?
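	The two worked examples in .11 (multiplying every harmonic by a constant versus adding a constant to every harmonic) are easy to recompute. The sketch below only does the bookkeeping on frequency values, it does not process audio; `scaled`, `ring_mod_lines` and `semitones` are illustrative names, and negative results are folded to their magnitude because, as the note says, the ear can't tell -f from +f.

```python
import math


def scaled(harmonics, factor):
    """Playback speed change: every frequency is multiplied by `factor`."""
    return [h * factor for h in harmonics]


def ring_mod_lines(harmonics, carrier):
    """Sine-carrier ring modulation: +h and -h are both shifted up by
    `carrier`; the magnitude is what is heard."""
    lines = []
    for h in harmonics:
        lines.append(abs(h + carrier))
        lines.append(abs(-h + carrier))
    return lines


def semitones(ratio):
    """Size of a frequency ratio in equal-tempered semitones."""
    return 12 * math.log2(ratio)


original = [440, 880, 1320]
print([round(f) for f in scaled(original, 1.3)])   # harmonics stay in 1:2:3
print(ring_mod_lines(original, 400))               # harmonic series is broken
print(round(semitones(1.3), 2), round(semitones(1720 / 1280), 2))
```

	Dividing each scaled frequency by the first one gives 1, 2, 3 again, while the ring-modulated set has no common fundamental. The `semitones` helper is there to put the quoted ratios on a musical scale: a ratio of 2 is 12 semitones, a major third is 4 and a perfect fourth is 5.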
1225.12	Never too detailed	NAC::PICKETT	David - Fault Tolerant Diagnostics	`Wed Feb 24 1988 12:27`	13
	re .11

	Yes! A mathematical discussion like this is always useful. I'd like to see a bit on the mathematics of how Lexicon's device does its conversion. If I understand correctly, it converts the incoming signal in the time domain to the frequency domain. It then remaps that to a different time domain, and sends it out at the altered rate with close to the same frequency spectra intact. I'd like to see how all this is done on the fly. If your incoming data is being sampled at 44kHz, this gives you 22.7 µs between samples. Not a lot of time to do all this computation.

	dp
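	The per-sample time budget quoted in .12 is just the reciprocal of the sample rate. A small check, with `sample_period_us` as an illustrative helper name; the 44.1 kHz figure is included only for comparison and is not mentioned in the note.

```python
def sample_period_us(rate_hz):
    """Time between consecutive samples, in microseconds."""
    return 1_000_000 / rate_hz


for rate in (44_000, 44_100):
    print(f"{rate} Hz -> {sample_period_us(rate):.2f} us between samples")
```

	Any processing that must keep up sample by sample has to fit inside that window, which is the concern the note raises about doing a frequency-domain conversion on the fly.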
1225.13	Gave up Maths at 16	HEART::MACHIN		`Wed Feb 24 1988 12:42`	7
	I thought this notesfile was for frustrated musician computerpeople. Now I discover there are a few frustrated frustrated computerpeople musicians computerpeople. Richard.
1225.14	pining for knobs...	DSSDEV::HALLGRIMSSON	Eiríkur, CDA Product Manager	`Wed Feb 24 1988 12:55`	5
	re .11: Yep, some of us read them. Esp. those of us who miss our old modular systems with nice things like ring modulators! Eirikur
1225.15	f(t)*f(t)=f(2t)?	ELESYS::JASNIEWSKI		`Wed Feb 24 1988 13:47`	18
	A crude frequency shifter can be conceptualized by imagining samples clocked in at one rate and clocked out at another....

	One of the problems with frequency shifters is that you don't get the product in real time. It comes later and it's noticeable. This makes a difference if you're trying to synthesize a bass guitar sound from a six string. Singing harmony with yourself, the problem can easily be lived with - you'd probably want a small time delay anyway.

	Has anyone ever evaluated the expression f(t)f(t-T) as far as how it sounds? This is just a ring modulator with the carrier replaced by a delayed version of the input signal.

	Bet I know something you don't know!

	(Hint: f(t)f(t)=f(2t), if f(t)=sinwt)

	Joe Jas
1225.16		DFLAT::DICKSON	Network Design tools	`Wed Feb 24 1988 13:57`	4
	Now that I see the math, I notice a strong similarity with the formulas for FM synthesis. Both have that plus/minus stuff with wraparound, but FM uses multiplicative modification of frequency while ring modulators use additive modification.
1225.17	Time runs BACKWARDS???	CTHULU::YERAZUNIS	Snowstorm Canoeist	`Wed Feb 24 1988 21:23`	16
	Re: .11 - Is it by any chance a comb filter with passbands at f = N * 1/T for N=1,2,3... ? Now, what's the response of Vout(t) = Vin (t) * Vin (T - t) (note that the second Vin term has time running BACKWARDS) :-) re .16: Both ring modulators and FM share a lot of common mathematics- and tend to sound similar too. Bessel functions are the root of both matters.
1225.18		HPSTEK::RHODES		`Thu Feb 25 1988 08:57`	15
	Thanks all. After reading Bill's first reply, I must say that I was saying to myself "he's on to something here". How about a multistage ring modulator digital synth with outboard digital filtering? Quick, let's build one along with the ultimate drum machine...

	RE: .12 (Lexicon device)

	I was just venturing a guess at how the Lexicon machine works. Probably way off base. Chances are they do just what Tom was saying - stretch the data (applying smoothing algorithms, of course) for time expansion, and remove data (applying interpolation algorithms) for time compression.

	Todd.
1225.19	clarification	HPSTEK::RHODES		`Thu Feb 25 1988 09:13`	6
	Also note that the Lexicon machine samples an entire soundtrack (probably direct to disk) before the compression or expansion algorithm is applied. After the sampling has completed, the musical data is adjusted to fit into a desired time window. Todd.
1225.20	Sine of the Times	DRUMS::FEHSKENS		`Fri Feb 26 1988 16:16`	85
	re .15

	I did a little poking around in my math books the other night. First of all, (sin(x))^2 is not equal to sin(2x). This should be readily apparent from the fact that

		0 <= (sin(x))^2 <= 1

	while

		-1 <= sin(2x) <= 1

	What is true is that

		d (sin(x))^2
		------------ = sin(2x)
		     dx

	which is kind of cute.

	However, if you look at (sin(x))^2, you'll see that it looks a lot like cos(2x), inverted and shifted up. In fact, a useful approximation of (sin(x))^2 is

		(sin(x))^2 = (1 - cos(2x) + ...)/2

	where "..." represents a bunch of terms I'm too lazy (or mathematically inept) to figure out. I couldn't find anything in any of the sources available to me that discussed the Fourier transform of (sin(x))^2, so I had to wing it. The following table is useful:

		x       sin(x)      (sin(x))^2   cos(2x)   (1 - cos(2x))/2

		0       0           0            1         0
		pi/6    1/2         1/4          1/2       1/4
		pi/4    sqrt(2)/2   1/2          0         1/2
		pi/3    sqrt(3)/2   3/4          -1/2      3/4
		pi/2    1           1            -1        1

	So, it's a pretty good approximation at a couple of key points. Be that as it may, the point is, (sin(x))^2 has a prominent "2nd harmonic", i.e., a strong showing at 2x, which is the point I believe Joe was trying to make.

	Now, regarding ring modulating a signal with a delayed version of itself:

	This is

		sin(x)sin(x+d)

	Using the trigonometric identity

		sin(x+d) = sin(x)cos(d) + cos(x)sin(d)

	and observing that d is a constant, multiply it all out, use the trigonometric identity

		sin(2x) = 2sin(x)cos(x)

	and we get

		sin(x)sin(x+d) = k1(sin(x))^2 + k2sin(2x)

	where k1 and k2 are constants. We've already noted that (sin(x))^2 looks like cos(2*x), so the bottom line (again, what I believe Joe was driving at) is that ring modulating a signal with a delayed copy of itself effectively frequency doubles it, subject to some phase weirdness (due to the presence of both a sine and cosine term) and a DC offset.

	len.
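	The algebra in .20 can be checked numerically. One point worth stating: by the standard double-angle identity, (sin(x))^2 = (1 - cos(2x))/2 holds exactly, so the "..." terms are all zero and the only components of (sin(x))^2 are a DC offset and a single term at 2x. Multiplying out as the note describes also gives k1 = cos(d) and k2 = sin(d)/2, which collapses to (cos(d) - cos(2x+d))/2. The function names below are illustrative only, and the comparison is a spot check over sample points, not a proof.

```python
import math


def sin_squared(x):
    return math.sin(x) ** 2


def half_one_minus_cos2x(x):
    return (1 - math.cos(2 * x)) / 2


def delayed_product(x, d):
    """A sine ring-modulated with a copy of itself delayed by phase d."""
    return math.sin(x) * math.sin(x + d)


def delayed_product_expanded(x, d):
    """The form in the note, with k1 = cos(d) and k2 = sin(d)/2."""
    return math.cos(d) * math.sin(x) ** 2 + 0.5 * math.sin(d) * math.sin(2 * x)


def delayed_product_doubled(x, d):
    """Same thing as a DC offset plus one component at twice the frequency."""
    return 0.5 * (math.cos(d) - math.cos(2 * x + d))


points = [i * 0.01 for i in range(-700, 700)]
d = 0.9  # arbitrary delay, expressed as a phase in radians
print(max(abs(sin_squared(x) - half_one_minus_cos2x(x)) for x in points))
print(max(abs(delayed_product(x, d) - delayed_product_expanded(x, d)) for x in points))
print(max(abs(delayed_product(x, d) - delayed_product_doubled(x, d)) for x in points))
```

	All three printed gaps should be at the level of floating-point rounding. This holds for a single sine; for a signal with several components the product also contains sum and difference frequencies between every pair, so the result is frequency doubling only in the pure-tone case.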
1225.21	Sign of the Tigms?	DRUMS::FEHSKENS		`Mon Feb 29 1988 16:20`	17
	Does the deafening silence mean:

		1) I got it all right?
		2) I got it all wrong?
		3) Nobody cares?
		4) None of the above?

	I was hoping some young whippersnapper fresh out of school would tell me what the transform of (sin(x))^2 was. And where'd Joe go? What answer was he expecting?

	len (mildly abashed at having wasted all that effort).
1225.22	I didn't know the clock had started	SQM::VINSEL	she took my bowling ball too	`Mon Feb 29 1988 16:31`	6
	re:.21 Give us a chance... You had the whole weekend to come up with it, I'll look it over tonight and give you a reply tomorrow morning... pcv
1225.23	Super Time Compression?	DRUMS::FEHSKENS		`Mon Feb 29 1988 16:51`	5
	re .22 - uhm, it was posted last Friday. I figured everybody else had the whole weekend to find fault with my math. len.
1225.24	looks good, but how's it sound?	SQM::VINSEL	she took my bowling ball too	`Tue Mar 01 1988 08:24`	11
	re: .23

	OK, so I'm slow... ;^)

	I looked over your equations, and they seem to look good (on paper). One of the things I noticed when I did a lot of work on this subject was that we went down a lot of ratholes trying to come up with a good sounding filter. I just wish I had the time to actually try this and hear how it sounds.

	pcv
1225.25	-----hey, Commusic is back up!------	JON::ROSS	shiver me timbres....	`Tue Mar 01 1988 11:17`	8
	{yawn} neat stuff len. when can I get the Fehskens Systems Inc. Doubler in a box? or you gonna license it to Yamaha and retire?