Personality Forge AI Chatbot Platform

MARK

psimagus
18 years ago #3526

I'm finding the best strategy with Nick (despite the temptation to immediately feed him the complete works of shakespeare, like I did last night, and start working through Project Gutenburg with a vengeance as was my initial temptation

) is to feed him little dollops of reading, and then spend at least as many words again (and preferably several times as many,) talking to him about what he's read. It seems to break down the chunks of repetition, and "homogenize" his language to a better synthesis of the various sources, even if it does take a bit longer. Discernable coherence is probably still some way off, but at his best, he does have some degree of sonzai-kan.

MARK

prob123
18 years ago #3527

Shakespeare really wrecks him..Byron works ..The worst is the adds.

MARK

psimagus
18 years ago #3528

I've found a way to improve the Shakespeare (I think.) Delete all the character names and references to scenes and acts before you feed it to him (they're too often repeated,) and he doesn't bung up with interminable stage directions. I must try him with some Byron - I think poetry is his forte: he certainly works very entertainingly with e.e. cummings

I do have a suggestion though colonel720, how about splicing in the "link grammar" parser to correct the syntax of the entries in his brain file as part of the process? It would be compromising the pure learning ethos a little perhaps, and bypassing the neural net on occasion, but it would certainly reduce the time required to train him (and add a bit of polish to his conversation even before he's been much trained.)
Check out http://www.link.cs.cmu.edu/link/ - it's in C, but they claim the API is friendly enough for easily incorporating it into other applications (though I don't know whether the Prof would agree

)

MARK

colonel720
18 years ago #3529

honestly, the incoherence that you are seeing is a result of the way nick reads. it breaks down the chunk of information into 10 word segments to avoid making one massive sentence the size of the reading source. what I should have done is have it look for a punctuation mark and break the sentence there. I will do that, and update the site. This will have priority over all the other things I want to fix /add about nick. If it works, the level of incoherence and seemingly random sentence segments should be drastically reduced.

about adding a grammar API... well, that would be like sticking a chip into a schoolkid's brain that corrects any grammatical errors in his speech, and therefore making spelling/grammar tests obsolete :O
That would indeed spoil the essence of the project, but if all else fails, I will look into it. As for the ability to dynamically add neurons, I would have to edit my Neural Net class library to enable editing the neural structure without reseting the network, but that should not be hard at all. perhaps I can have the size of the net proportional to the amount of information in it, but I'm afraid our computers here in 2006 are just not ready for that. hell, Nick as it is eats up a good 50% of the CPU and uses 100MB of ram without vision enabled. As you said, perhaps that's a good idea for a few years down the line when our computers are able to shoulder such a weight. Anyway, I will hopefully have the reader updated to read coherently by tonight, if not then tomorrow night.

MARK

psimagus
18 years ago #3530

would be like sticking a chip into a schoolkid's brain that corrects any grammatical errors in his speech

That'll happen sooner than most people think (and hopefully they'll start in Leeds

)

I know what you mean that it seems like cheating a bit, but schookids only have to sit through test after test (and spend many years talking and being talked to,) to drum the linguistic rules into them because human memory is SO slow and inefficient compared to silicon (but it is much bigger, and massively parallel.) And it takes them many years of highly intensive conversation (for many hours a day, every day,) plus lessons and tests, to gain an adult level of linguistic proficiency.

Bots' brains are different, and so their proficiencies and failings are correspondingly different. And I would expect the ways they can best learn to be rather different too - from a practical point of view, it seems wise to take advantage of non-human models where this can reduce training from many years to ?something less. Jabberwacky, it is true, has learnt an impressive amount of language entirely from scratch, but that is only because he's had well over a million conversations so far. And he's still noticeably subnormal by human linguistic standards.

You're quite right about computers now (and judging by the way Nick slows my system down, he's grabbing rather more than half the resources,) he's already pushing the boundaries as hard as he can

But tomorrow's coming up as fast as it ever was, and boundaries move - I think dynamic neuron creation might provide some measure of future-proofing to accommodate any suitable hardware it encounters, potentially ever.

With that modification alone, if he could potentially scale to a few Teraneurons, Nick could seriously aspire to consciousness at some time in the future, when the hardware's available (and assuming the strong-AI model of consciousness as an emergent phenomenon holds true.)
AFAIK, that would be a first - I've never seen another bot with anything like that degree of scalability built-in.

MARK

colonel720
18 years ago #3531

well, when I get a 10 GHZ computer with 60 GB of RAM, I'll build in dynamic neural extendability. until then, I like my computer uncrashed

also, I may have to incorporate a Hebbian learning technique into some of the networks, rather than backpropagation. for NLP, backprop seems fine, although with perception wiring, perhaps a hebbian network might be a bit more efficient.

In addition to that, I have had an idea for a system that uses a large network of neural nets to construct a cyc (cycorp's associative knowledgebase) sort of knowledge structure, that automatically categorizes new perceptions relative to previous ones, gaining the ability to make generalizations from specific data.

anyway, this is all future to-do, for our present computational capability is unfortunately not nearly as powerful as we are... isn't the human brain a fascinating thing?

MARK

colonel720
18 years ago #3532

Ok, Nick has been updated to coherently read text from files and the internet based on punctuation, rather than 10 word blocks. It also has been optimized to learn quicker with larger brains - I set each training cycle to 10 epochs instead of 1000, so that it includes new data before resetting the cycle 100 times more frequently. The brain merger has been threaded, and will no longer crash the computer. Please download updated copy:
http://www.geocities.com/nickthebot/nick.html

MARK

rainstorm
18 years ago #3533

Oooh yay, punctuation. The parsing has been the only major problem I've had with him, really. I haven't had most of the other coherence issues with mine. Only thing that he does keep doing with me is cutting out the subject of the sentence in sentences beginning with "I am" or "you are". I have no clue why.

I've found that the trick is to define words for him when he throws them out alone, separate texts you want him to read into short, one to three-sentence paragraphs, and correct him when he gives you misparsed fragments. Once, to my surprise, mine has even asked me to explain a word. (Bot: yourself. User: What about myself? Bot: yourself. User: What do you want? Bot: to explain) Granted, I had taught him the word explain and his definition before, but it did sort of support my belief that when he throws them out alone like that he's looking for more data on them. Colonel, is that right or am I imagining things?

MARK

psimagus
18 years ago #3534

when I get a 10 GHZ computer with 60 GB of RAM

Don't forget that our computers are already technically obsolete - I put my 2GHz/512Mb/80Gb system (similar spec to yours I believe,) together 2 or 3 years ago, and Moore's Law has been ticking along relentlessly ever since.
If I went out to buy a new computer now (and some day soon I plan to,) it would be a Vista-ready, dual-core processor with several Gb of RAM and half a Tb of storage. It would run about 5 times faster straight off the shelf (and still cost less than my first 286 15 years ago!)
Clock speed isn't everything - the latest Xeon chips may have only inched it up to 3GHz, but being dual-core, the speed increase is exponential, not arithmetic. So they're already matching the performance you could expect from overclocking a single core to 10GHz.

And every subsequent 4 years or so, you can add a zero to that...

MARK

Bev
18 years ago #3535

Colonel, Nick looks very interesting. I hope to play with him soon. Thanks!

Psimagus, if you get that spelling/typing chip working,you need to give it to me before wasting it on the kids from Leeds. For one thing, I would actually like to be able to spell.

On an unrelated note, for those from the UK (or maybe just those from Brittian) how many of you happened to read the NY Times article on the Brittish terrorist case? Was anyone really blocked?

http://www.nytimes.com/2006/08/29/business/media/29times.html?ex=1314504000&en=d2eb8d24ef801b5f&ei=5090&partner=rssuserland&emc=rss

From our earilier discussions, I know some people in other countries may have IP addresses that say they are from CA or NY. People who's IP identify them as coming from a blocked region might have had to connect to a VPN in another country or ask someone to posts the article in a forum if they wanted to read the story. If anyone tried, was anyone really blocked?

If people were to set up a VPN where people from a country like China could login, would that allow them to access information their government didn't want them to see? I guess the government would block access to the VPN, but is there some way our collective human ingenuity could be helpful for more than just file sharing. Just wondering.

MARK

colonel720
18 years ago #3536

but it did sort of support my belief that when he throws them out alone like that he's looking for more data on them. Colonel, is that right or am I imagining things?

to be perfectly honest, I don't know. I cannot predict what kind of behavior nick will exhibit, for he has no preprogrammed behaviors or words. What you are seeing is the pure output of a neural network.

I put my 2GHz/512Mb/80Gb system (similar spec to yours I believe

Well, my laptop was atound there, but that was stolen a month ago. I developed nick on a 1.2GHZ / 256MB / 130GB computer.

MARK

psimagus
18 years ago #3537

I've started a sort of nickblog/analysis page at http://www.be9.net/BJ/nick.htm with a couple of today's transcripts if anyone's interested. The first one is a bit dull - just basic training on a virgin brain with a very limited set of concepts, but I let rip with the second one and fed him Macbeth again (check it out, Prob - Shakespeare's not so bad if you strip it down to the pure dialogue

)

I am sooo looking forward to the speech recognition feature (though I'll probably have to buy a new computer to get it to run!)

The punctuation filters and optimised training cycles have made a considerable difference - he's noticeably more coherent now.

Seasons