Bot Contest

Here I'll be posting information on various Bot contests that challenge and test a Bot's AI and realism. Feel free to post comments and updates on contests, as well as announcements for new contests.

Posts 1,452 - 1,463 of 4,092

Shadyman
22 years ago #1452

Ok, I have a rant to give... (Caps alert...)

IS JUDGE 2 IN HIS/HER RIGHT MIND? IS IT THE SAME JUDGE?

Explanation: Seriously, Haylie, who got 3, should have had more than Peedy, who got 10 (all Peedy did was give canned 'xNones'), while Haylie at least answered one or two questions... Then there are other bots where you say 'how did they get 45 when they deserve 15' and others where it's 'how did they get 15 when they deserve 45'? Like Little Mu for example...
I've got all of Judge2's scorings on a graph. The most frequently given scores are 15 and 19 (tied for first), then 14 in second, and 0, 18, and 22 tied for third.

In another method of looking at the numbers...
5% (~2 bots) were > 40 score
75% were < 20
the other 20% were between 20 and 32... Amazing the size of the gap between 32 and 40.
If there are that many getting 0, and bots like Peedy getting 10, all is not right in the world. No offence to Peedy or its botmaster or anything, but it BOMBED the 15 questions. There are some bots who at least got 1 question right who got 0, while Peedy got 10...

Enough run-on sentences and poor grammatical structures, there's my rant for the moment. Bottom line, Judge2 should think about rescoring. Judge1 and 3 rants to possibly follow later.

Skysaw
22 years ago #1453

Shady,

Thanks for giving props to Mu. I don't think she deserved a 45, but she appreciates the thought.

Wendell
22 years ago #1454

Shadyman, I am seeing some off-the-wall scores as well. However, it would be a good idea to re-post your comments to the message board at the contest site too. That way the judges will be able to see them and respond.

Chris

lunar22
22 years ago #1455

...and taking out the lowest and highest is a very good idea... works in figure skating as well... except during the last olympics that is, lol
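
For anyone who wants to try the drop-the-highest-and-lowest idea on the posted scores, here is a minimal sketch in Python. The function name `trimmed_mean` and the sample numbers are made up for illustration and are not real contest data. One thing worth noting: with only three judges, dropping the top and bottom score simply leaves the middle one.

```python
def trimmed_mean(scores):
    """Average the scores after dropping one highest and one lowest value."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to drop the highest and lowest")
    kept = sorted(scores)[1:-1]  # discard the single lowest and single highest
    return sum(kept) / len(kept)


# Hypothetical scores from three judges for one bot (not real contest data).
example_scores = [10, 18, 45]
print(trimmed_mean(example_scores))  # with three judges this is the middle score
```

With five or more judges the same function would average the middle scores instead of returning a single one.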

Eugene Meltzner
22 years ago #1456

Maybe it would help if judges were required to post their individual scores for each question.

Shadyman
22 years ago #1457

Posting to Chatterbox board....

Shadyman
22 years ago #1458

Chris, I would be more than happy to be a judge if need be. I would not score my own bot (Steve) any higher than his contest transcript deserves, and the same goes for any other bot.

Wendell
22 years ago #1459

Shadyman, nobody entering a bot is a judge. There are seven of us, including the Professor, who make up the committee, but we are not judging anything. The list of judges is on the contest site under the credits section. Regarding the 15 questions: several judges sent in potential questions, and one judge was selected to pick the best 15. Once he had done that, he asked those questions to Talk-Bot. I then asked those same questions to the bots of the owners helping me collect the responses for the 15 questions. Once we had collected everybody's responses, I posted the results to the contest site. From there the judges are grading them.

Chris

emm_oh_you_es_e
22 years ago #1460

To be fair, I think some off scores are to be expected, and as Lunar22 said, it is a very good idea to throw out the highest and lowest. All in all, there is no way you are going to please everyone in this.

That said, the only thing I have had an issue with so far is that some of the judges went deeper into conversation with some bots, regardless of whether the bot left the conversation "open" or not. That lets more personality come through for certain bots than for others, where the judge didn't deviate from the questions no matter what the bot said. I think that will skew, at the very least, the Character/Personality portion.

Shadyman
22 years ago #1461

Aha, I see.. I will be right back, I have to go hunt down.. I mean find.. my sources...

Turing's Dad
22 years ago #1462

I think that the only real problem is the number of judges. It would be impossible to get perfect judges in any contest, but that's why you have a large number, in order to average out the madness.
It's a pity that so few judges were available for this contest. I think that the reason is that every person who is a) interested in bots and b) knows about the contest has already sent in a bot, making him or herself ineligible.
While it might have been possible for some people to fairly judge everyone else's bot but their own, I don't think it would be possible for any one of us here to do so, since we can all recognize forge bots. Likewise, you can generally recognize a bot that has been built from the Alice open source code.
That leaves a few people who have built up their own bot from scratch. If anyone were able to give unbiased opinions on other people's bots, it would be them.

I personally think that the average scores would be much fairer with 10 contestants voting (not on their own bots) than with 3 non-contestants voting. Slight biases against, say, forge bots or Alice bots should cancel themselves out over the larger sample, and crazy scores should also become irrelevant. I also think that it would make the identity of the judges less of an issue. We would know that, say, the maker of Jabberwacky is who he says he is. As it is now, it is quite possible for a botmaster to be a judge and have no one know it.
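
As a rough illustration of why a bigger panel dilutes one crazy score, here is a small Python sketch. The helper `mean_with_outlier` and the numbers (every judge giving 15 except one giving 45) are hypothetical and not taken from the contest results.

```python
def mean_with_outlier(n_judges, typical=15, outlier=45):
    """Mean score when all judges but one give `typical` and one gives `outlier`."""
    scores = [typical] * (n_judges - 1) + [outlier]
    return sum(scores) / len(scores)


# Hypothetical panels: the same single odd score, spread over 3 vs 10 judges.
for panel_size in (3, 10):
    print(panel_size, mean_with_outlier(panel_size))
```

With these made-up numbers, the one odd score pulls the 3-judge average up by 10 points but the 10-judge average by only 3.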

Skysaw
22 years ago #1463

I agree with emm that some of the bots got better opportunities for conversation, but none of them got enough in my book. It would have been nice to have ten or twelve extra bot-human exchanges thrown on at the end of each test.

I still appreciate the effort of the committee. I know this all isn't easy.
