Conference hydra::amiga_v1

Title:	AMIGA NOTES
Notice:	Join us in the NEW conference - HYDRA::AMIGA_V2
Moderator:	HYDRA::MOORE

Created:	Sat Apr 26 1986
Last Modified:	Wed Feb 05 1992
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5378
Total number of notes:	38326

1401.0. "Voice Recognition??" by 45384::MASON () Wed May 04 1988 06:51

    Hi,
    
    I read this notes conference more regularly than I write to it,
    since most of my questions seem to be quite common.  However, there
    is one area I am becoming very confused about.
    
    This area is music on the Amiga, or to be more specific, digitising
    sounds.  I am not particularly interested in making music on my
    Amiga, but I am very interested in the concept of digitising sounds,
    storing these on disk and then creating a security system on my
    Amiga through voice recognition.  Sounds like something out of a
    movie, doesn't it?  I know that this concept is perfectly possible
    provided I have two crucial items.  Firstly, the correct software to
    write the sounds to disk.  My questions arise here as to how many
    seconds of sound you can store in a 1 megabyte A500.  If I want
    to store more, is it possible to write each megabyte of sound to
    disk and then merge the whole lot into one file?  The second piece
    of software, which I think is going to be the real killer, is one
    that will store the user's input and compare it with that in the
    file.  Are there any relatively generic pieces of software which
    can compare input sounds with sounds on disk??  That's what I thought.
    Any suggestions on how I could write one??  Going back to the first
    piece of software, I would want it to be able to manipulate input
    and distort it in a large number of ways.  You know, the usual reverse
    etc etc.   No price limit, I just want the best of the best.
    
    All help gratefully received,
    
    Thanks,
    
    Paul.

T.R	Title	User	Personal Name	Date	Lines
1401.1		WJG::GUINEAU		`Wed May 04 1988 07:34`	27
	I had thought of this a while back (before I even knew about the Amiga; in fact, probably before the Amiga was even born!). Voice recognition is no easy task! The human voice is extremely flexible and variable. Even the same person speaking the same word will have trouble exactly matching a pre-digitized recording. The question becomes: "Is this new input close enough to the stored one?" But "close enough" is a hard thing to determine. (There are people at major universities doing just this kind of thing, and they don't make it look easy!) I had originally thought of doing "slope analysis" on the waveform. This is my own dreamed-up screwball method: select a sample rate, then compare the relative slopes between successive sample points with those of the original waveform. If all the slopes of the whole sample are within some tolerance of the original, it's a match. The problem here is finding a common starting point. Phase shifting the sample back and forth may help solve this (i.e. Didn't match? Hmm, shift the whole sample x number of sample periods 'left' and try again...). This must sound ridiculous! John
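The "slope analysis" idea in .1 can be sketched roughly as below. This is only an illustration of the description, not code from the thread: the function names, the use of simple sample-to-sample differences as "slopes", and the left-to-right search over shifts are all assumptions.

```python
def slopes(samples):
    """Slope between each pair of neighbouring sample points (unit spacing)."""
    return [b - a for a, b in zip(samples, samples[1:])]


def find_match(reference, candidate, tolerance, max_shift):
    """Slide the reference along the candidate, one sample period at a time.

    Returns the first shift at which every slope of the reference is within
    `tolerance` of the corresponding slope of the candidate, or None if no
    shift up to `max_shift` works. All names here are hypothetical.
    """
    ref = slopes(reference)
    cand = slopes(candidate)
    for shift in range(max_shift + 1):
        window = cand[shift:shift + len(ref)]
        if len(window) < len(ref):
            break  # ran off the end of the candidate
        if all(abs(r - c) <= tolerance for r, c in zip(ref, window)):
            return shift
    return None
```

Calling `find_match(stored, new_input, tolerance, max_shift)` with a stored recording and a longer new recording returns the offset where they line up, if any. Requiring every slope to agree is very strict, which is one reason "close enough" is hard to pin down.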
1401.2	...	LEDS::ACCIARDI		`Wed May 04 1988 08:34`	35
	Didn't DEC have a whole group dedicated to a product called 'DECTalk' or some such? I remember reading articles about their work long before I joined the company. I don't know squat about voice recognition, but I do perform a little signal analysis now and then on an HP dynamic signal analyzer. We attempt to characterize spindle ball bearings by their unique defects. We do this by passing the signal from a capacitive displacement transducer, differentiated twice, through the analyzer and performing an FFT on the signal. We can now identify specific spindles by their frequency content. Unfortunately, the phase information, although not completely lost, is difficult to extract without performing an inverse FFT. The gist of this is that the Amiga needs a good, fast FFT algorithm. I haven't seen any yet, although I believe one of the waveform editing packages has 'FFT' as one of its drop-down menu choices. In fact, it kind of irks me that all the decent signal analysis software exists for Pee Cees. With great graphics, a super fast expansion bus, and a reasonably fast processor, the Amiga could be a killer signal processing system. Anyway, I don't know if voice recognition should be performed in the time domain or the frequency domain. Maybe both? Re: .0 There are lots of sound sampling hardware/software packages available. The sampled sounds are stored as IFF standard files to allow interchange with other packages and music programs. The size and duration of the sampled sound depend on the sampling rate you select. They can get very huge in a hurry. Ed.
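As a rough answer to the "how many seconds in 1 megabyte" question in .0, the arithmetic behind "size depends on the sampling rate" can be sketched like this. The defaults of one byte per sample and one channel are assumptions (8-bit mono), as is the idea that the whole megabyte is free for sound, which it would not be on a real machine.

```python
def seconds_of_audio(buffer_bytes, sample_rate_hz, bytes_per_sample=1, channels=1):
    """Seconds of uncompressed sound that fit in `buffer_bytes`.

    Each second costs sample_rate_hz * bytes_per_sample * channels bytes.
    The defaults (8-bit mono) are an assumption for illustration.
    """
    return buffer_bytes / (sample_rate_hz * bytes_per_sample * channels)


ONE_MEGABYTE = 1024 * 1024

# Compare a low and a high sampling rate for the same buffer.
low_rate_seconds = seconds_of_audio(ONE_MEGABYTE, 8_000)
high_rate_seconds = seconds_of_audio(ONE_MEGABYTE, 22_050)
```

Under these assumptions a megabyte holds about two minutes at 8 kHz and well under a minute at 22 kHz, which is the sense in which samples "get very huge in a hurry".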
1401.3		45384::MASON		`Wed May 04 1988 09:37`	36
	Yes they did, I have one. But from what I know about the DECTalk (I didn't get a manual with it), the DECTalk only responds to data input via the keyboard. There is no input for a microphone, only for headphones. You can change any one of the seven on-board voices inside the DECTalk to be any voice you like, and you can change the way certain words sound. I am sorry, but I found your method incredibly complex to the eye and got completely lost when trying to understand your method for comparing input. Would you like to explain this in some naive user sort of way? Please. Re .1 This is quite an interesting idea. A way to find the correct starting point for comparison could be to take, say, three or four points, and as soon as you find a match for these in both patterns, you could line the two waves up and do your comparison. A way to be sure that the same person is speaking, regardless of variations in speech, could be to only accept the input if, say, 85% plus of the points in the input are the same. This would not require 100% matches and would still allow the user access. Obviously the machine is going to require some kind of specific input. If a person speaks in a high pitched voice on purpose the machine is bound to recognise their input as valid. This is a good way of doing this and would not be too difficult to code, provided I could interpret the file containing the wave. Any thoughts?? It looks like I have stepped onto some dodgy ground here which is more difficult than I thought. Never mind, I still think I will give it a shot even if I do end up facing a brick wall in every direction. Keep the info coming in, Paul.
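The "accept if 85% plus of the points are the same" rule in .3 could look something like the sketch below. It assumes the two waves have already been lined up, and it treats "the same" as "within a tolerance", since two recordings will rarely agree exactly. The function names and the tolerance parameter are hypothetical.

```python
def fraction_matching(reference, candidate, tolerance):
    """Fraction of aligned sample pairs that differ by at most `tolerance`."""
    pairs = list(zip(reference, candidate))
    if not pairs:
        return 0.0
    hits = sum(1 for r, c in pairs if abs(r - c) <= tolerance)
    return hits / len(pairs)


def accept(reference, candidate, tolerance, threshold=0.85):
    """Grant access when at least `threshold` of the points match."""
    return fraction_matching(reference, candidate, tolerance) >= threshold
```

The threshold trades convenience against security: lowering it lets the owner in more reliably, but it also lets more impostors through.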
1401.4	HearSay	ANGORA::SMCAFEE	Steve McAfee	`Wed May 04 1988 09:41`	10
	You might want to find a couple of papers on the HearSay voice recognition system. This uses a blackboard approach to solving the problem. I don't think they ever got it working in real time, but I believe it did work... Sorry, I don't have any references at hand. Pick up any recent AI textbook and try the index/bibliography. - steve
1401.5	lots of false starts in this area	SAUTER::SAUTER	John Sauter	`Wed May 04 1988 10:31`	34
	I worked at an AI lab in the 1960s, and one of the sub-groups there was working on voice recognition. They were not very successful, as I recall. The problem is to extract cues from the speech waveform that can be used to match against the model. My memory is hazy, but I think they were using frequency-domain analysis: they had chosen three frequency bands, and they measured the amplitude in each band every few milliseconds. They then tried to match this pattern with the patterns recorded earlier, to recognize the sentence. They tried to recognize sentences rather than words because a sentence has lots more cues than a word. Shortly before I left I got a copy of their data base and tried to synthesize a waveform that would produce the same cues. I wanted to see if I could recognize the sentence by hearing it, reasoning that if I couldn't then they weren't using the right information to create their patterns. I played the waveforms through the digital music interface developed for John Chowning's experiments in computer music. It took a lot of imagination to understand the sentence from the sound I produced, so I concluded that they weren't gathering the right information. It may be that my synthesis program wasn't working correctly--I was never sure what constants they were multiplying the amplitudes by in each frequency band. Also, such programs are very hard to debug. I should have run its output back through the recognizer to see if it "recognized" the sentence I was synthesizing, but I never did. Voice recognition is not a simple task. I'm sure it's made progress since my experience in it, but if you start from scratch you will have to make all of the early mistakes over again. As one or two of the previous replies said, start from an AI textbook. John Sauter
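The cue extraction recalled in .5 (amplitude in three frequency bands, measured every few milliseconds) might be sketched as follows. Nothing here is the lab's actual method: the band edges, the frame length, and the use of a plain (slow) discrete Fourier transform are all assumptions made for illustration.

```python
import cmath
import math

# Hypothetical band edges in Hz; the reply does not say which bands were used.
EXAMPLE_BANDS = [(0, 1000), (1000, 2000), (2000, 4000)]


def band_amplitudes(frame, sample_rate_hz, bands):
    """Sum of DFT magnitudes falling in each band, for one short frame."""
    n = len(frame)
    totals = [0.0] * len(bands)
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        for b, (low, high) in enumerate(bands):
            if low <= freq < high:
                totals[b] += abs(coeff)
    return totals


def cue_pattern(samples, sample_rate_hz, frame_len, bands):
    """One list of band amplitudes per consecutive, non-overlapping frame."""
    return [band_amplitudes(samples[start:start + frame_len], sample_rate_hz, bands)
            for start in range(0, len(samples) - frame_len + 1, frame_len)]
```

At an 8 kHz sampling rate, a `frame_len` of 80 gives one measurement every 10 ms. The resulting list of triples is the kind of pattern that would then be compared against stored ones. It throws away a great deal of the original signal, which is consistent with the synthesized sentence being hard to understand.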
1401.6	Yes it's been done	MQFSV2::DESROSIERS	Tout est possible	`Wed May 04 1988 11:31`	12
	There are a number of chips that do just that. I saw an article in a French magazine (Micro Systemes) that used some uPD series chips to do voice recognition, and Steve Ciarcia of Byte magazine had an article on voice rec. using a different chip, the whole thing hooked up to an Apple II or to a C64. Mind you, these things could not do speech to text and had a limited number of words that could be learned and recognized, but the price they were going for, and the fact that it was done on such lowly machines, made the whole thing worthwhile. Jean
1401.7	This sounds promising	45384::MASON		`Wed May 04 1988 11:53`	24
	Really!! You haven't got any more information, have you, about the issue number that the article appeared in for Byte Magazine?? This would be fantastic if you have. What about the French magazine?? It doesn't matter that it is in French, I would be able to get it translated. If I could find out who supplies either of these chips it could greatly reduce my task, and maybe the company has developed the chip even further by now. Any help would be very welcome indeed. RE .5 If I cannot succeed with the assistance of these chips mentioned then I may just do that. I realise that this is going to be a very big task, but I am under no pressure as this is for the pure job satisfaction of trying to complete such a task. It is inevitably going to be a long process, but what have I got to lose apart from a few restless nights? Thanks everybody for the input. Regards, Paul.
1401.8	Just don't try using it with a cold...	TEACH::ART	Art Baker, DC Training Center (EKO)	`Wed May 04 1988 13:21`	22
	I have the relevant issues of BYTE at home; I'll upload the publication info for you tonite. The software they use for recognition is pretty simple-minded; it takes the output of some LPC chips and compares the speech parameters generated by new input against the stored parameters of the words it has been trained with. When it finds a match, it assumes that's what you must have said. (For their purposes, "match" is defined to be whatever is closest in speech-parameter space; that takes care of some of the fuzziness associated with human speech.) Usual restrictions: limited vocabulary, speaker dependent, has to be trained to hear everything you plan to say, discrete utterances (i.e. no connected speech). Circuit Cellar Inc sells the whole thing as a kit; unfortunately, it only comes with C64 or Apple II interfaces. They might be able to help you rework it a little. ("No, choose, Doctor!" ... "Snowshoes, Doctor?")
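The "closest in speech-parameter space" matching described in .8 amounts to a nearest-neighbour lookup. The sketch below is not the BYTE software: the template dictionary, the use of short lists of numbers as features, and the choice of Euclidean distance are all assumptions for illustration.

```python
import math


def closest_word(templates, features):
    """Return the trained word whose stored parameters are nearest to `features`.

    `templates` maps each trained word to a list of numbers (hypothetical
    speech parameters); distance is plain Euclidean distance.
    """
    return min(templates, key=lambda word: math.dist(templates[word], features))


# Hypothetical stored parameters for a two-word vocabulary.
trained = {
    "yes": [1.0, 0.2, 0.1],
    "no": [0.1, 0.9, 0.4],
}
heard = closest_word(trained, [0.9, 0.3, 0.1])
```

This version always returns some trained word, however poor the fit, which is how "No, choose" can come back as "Snowshoes". Rejecting inputs whose best distance exceeds a limit would be one way to let it say "no match" instead.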
1401.9	Just a thought...	DYO780::WILDER	U comes before V in the alphabet	`Wed May 04 1988 14:07`	21
	A more economical and reliable approach might be to consider using touch-tone hardware for your audio input. I don't know if this would fit in with how you wish your security system to work, but I'd suggest it. I can well appreciate that the challenge of voice recognition might be an overriding consideration. Sounds like fun. When I worked at General Motors, they had a country-wide operator network that allowed you to call a local number in most cities, give an access number, and then be connected to any long-distance number. GM then eliminated most of the human operators by installing a voice recognition system to enter your access code and destination number. It ran on a large IBM system and only had to understand 12 words (zero thru nine, yes, and no), and would usually work, although not always on the first try. If it just couldn't understand anything you said, as a fallback it would ring you thru to a human operator. It's a tough problem. Of course, in this scenario it may have only had to understand 12 words, but it had to understand anyone who said those 12 words. A little different problem than trying to match the same word spoken by the same person. dan
1401.10	Voice-recognition phone story	OLIVER::OSBORNE	Blade Walker	`Wed May 04 1988 14:59`	24
	Just a little story about voice recognition: A friend of mine (Bob) has a voice recognition telephone. He has to "train" it by speaking the name to be dialed many times, and then entering the phone number associated with the spoken name. So one day he wanted to demonstrate it to me and another friend. He spoke my name several times, and the phone ignored it. In my usual intrusive way, I asked to try it. Bob said it wouldn't recognize my voice, since I hadn't "trained" it. I tried anyway, and my other friend said, "Nah, Bob's voice is more nasal". So I pinched my nose and said my name again, and the phone dialed my number. Voice recognition has a way to go. Distinguish between two people? I wouldn't try to get it past Rich Little... In a book titled "Making your own Robot", or something similar, Tod Loofbourrow describes a voice-recognition system implemented on a KIM-1. This is a pretty primitive computer; I think it did frequency comparison to an averaged set of samples, picking the closest. If you're interested in the book, I'll see if I can find it at home. John O.
1401.11		WJG::GUINEAU		`Wed May 04 1988 18:57`	9
	Funny. I just realized why you want to use voice recognition - security for the Amiga... Well, seeing as I can walk up to your machine, boot it (or hit C-A-A) and then bang on Control-C till the startup-sequence aborts... John
1401.12		45384::MASON		`Thu May 05 1988 05:32`	25
	Yes, you could do that, but I am not using it so much to stop somebody gaining access to the machine, full stop; I want to be able to limit what a user of the machine can do. Since MS-DOS doesn't have any way of entering a password like VMS, and it doesn't have a key on the front like an IBM, what else can I do to stop nasty little people using my machine?? There is no sure way to stop anybody getting into an MS-DOS system. I mean, even on the Rainbow, when people thought they had it fixed, all you have to do is stick an MS-DOS disk in drive A:, boot MS-DOS off drive A: and then swap over to either E: or F:, and there you go, you are straight into the hard disk and can have a field day. There is no secure MS-DOS system, but unless you know about things like Ctrl-C on startup you shouldn't be able to get into the system. Besides, some security is better than no security, isn't it?? Please do get me a copy of that article; I would really appreciate it. I think I am going to drop the idea of a voice recognition chip. It doesn't seem flexible enough for my needs. Maybe I should consider something like telling a user to put their middle finger onto a template and then comparing their fingerprint with one of the authorised ones on disk. Who could forge something like a fingerprint (apart from James Bond)?? Paul.
1401.13	are you out in the jungle or what?	YIPPEE::GOULNIK	OogaboogaBox type	`Thu May 05 1988 06:35`	16
	It seems you're into pattern recognition, one way or another. Another idea then is to use the mouse, a lightpen or any such, and ask the would-be user to draw his signature, or any stored drawing for that matter. It's not any easier than speech recognition, but at least it does not require additional equipment. In any case you might want to have a go at some neural-net stuff, which potentially exhibits the features you're looking for: learning by examples, reasonable output given a noisy/partial/distorted input, and fast response time. The major problem is in the training, which can take quite a long time, but I suppose it can be fun. There is a conference on that topic, by the way, at TLE::NEURAL_NETS. Having said that, I think you'd be better off buying a secure case, or locking the door. Iv.
1401.14	hardware protection	WJG::GUINEAU		`Thu May 05 1988 07:49`	12
	I don't know much about the Amiga bus (yet), but I imagine you could put some security stuff in a ROM that took over and made you do something before allowing the machine to boot (like pressing the mouse buttons in some order, or hitting certain keys...). Although without the OS up, you'd be on your own as far as getting input from devices... But that would sure keep people guessing... (Hmmm, poor guy. His Amiga won't even boot!) John
1401.15		45384::MASON		`Thu May 05 1988 08:41`	12
	Yes, I thought about locking all the peripherals, RAM and ROM unless the user had gone through the correct channels and been given clearance. However, as mentioned, if the user Control-Cs out of the routine before this is performed, then there is no security. Unless of course I did something to the machine every time I turned it off. This would be one hell of a feature and save a lot of money and time. The mouse clicking idea is very good. Or even a password typed in but not echoed. What would happen if there was a bug in it or you forgot the password?? I mean, how could you ever get back into your AMIGA again? It could cost a fortune developing this. Anybody got any suggestions on how I could approach this??
1401.16	Qmouse,timesaver,or fool them	TRUMAN::LEIMBERGER		`Thu May 05 1988 09:11`	12
	There is a small utility I pulled off Usenet a long time ago. It is called QMOUSE. When you boot the Amiga, it polls the mouse, and if it sees the left(?) button depressed it uses an alternate startup routine. I have an Amiga 1000 with Timesaver, and this allows one to use a four-letter password to disable the keyboard. I don't know if it can be used on the 2000. Of course, it is connected to the keyboard cable, so it won't work on a 500. While not in the realm of pattern recognition, one could always set the foreground and background colors to the same color, and this may confuse the unsuspecting. (Confuses me if I do it by mistake.) bill
1401.17	Idea?	MQFSV2::DESROSIERS	Tout est possible	`Thu May 05 1988 09:24`	12
	Paul, I'll rummage through my clippings for the article in the French magazine. Even if you don't build it, it's a nice circuit and could give you ideas for other projects. Now I don't know if this is a good idea, but since some weird persons made viruses that lived in the boot blocks, could the boot blocks be infected with a password program? Jean
1401.18	Password signature.	THEONE::PARSONS	Down-under computing...	`Thu May 05 1988 18:43`	8
	A recent short piece on Australian TV mentioned a password protection scheme that involved measuring the time between keypresses when typing the password, so not only do you have to have the right password, but it has to be typed in the same manner as the owner of the password. Sort of a typing recognition idea, as opposed to voice recognition. I guess computing after partying would be out, in that case. Regards Guy.
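The keypress-timing scheme in .18 could be sketched along these lines. The details are guesses, since the TV piece is only summarised: here the owner's typing rhythm is stored as a list of gaps between keypresses, and an attempt passes only if every gap is within a fixed tolerance. The names and the tolerance rule are hypothetical, and the password text itself would still be checked separately.

```python
def intervals(timestamps):
    """Gaps in seconds between successive keypress times."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def typing_matches(enrolled_intervals, attempt_timestamps, tolerance_s):
    """True if the attempt's rhythm is within `tolerance_s` of the enrolled one.

    `enrolled_intervals` is the owner's stored rhythm; `attempt_timestamps`
    are the times at which each key of the attempt was pressed.
    """
    attempt = intervals(attempt_timestamps)
    if len(attempt) != len(enrolled_intervals):
        return False  # different number of keypresses
    return all(abs(e - a) <= tolerance_s
               for e, a in zip(enrolled_intervals, attempt))
```

A tight tolerance makes the rhythm harder to imitate but also locks out the owner on an off day, which is the "computing after partying" problem.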
1401.19	I think I've got it	45384::MASON		`Fri May 06 1988 04:38`	10
	Yes, well, I think I have come up with the best solution. It might not be high tech or look very flash, but fitting a padlock onto the cover over my Amiga is one hell of a way to stop people hacking. Thanks for the info, but I think I'll let the professionals deal with Voice/Keyboard/Password recognition. I'm not up to it. Thanks everybody, Paul.