Tuesday, December 13, 2005

Amazon Training Session

I participated in the online presentation by Jeff Barr this evening. I found it interesting, although the initial presentation was geared more towards management and offered only a high-level overview of the Mechanical Turk service.

Of much more interest was the question and answer session at the end. I'm not sure who besides Jeff Barr and John Hsia was answering questions, since they weren't introduced, but it was said they were development team members. I had a few questions prepared before I signed on, and I asked a few more that I thought of later.

1. Alan asks: Approximately how many individual turkers are regularly submitting hits on a daily basis?

Answer: That is currently confidential information.

2. Alan asks: Has the MTurk team considered any methods of verifying worker identity and vetting worker backgrounds in order to allow them to view more sensitive data?

Answer: Yes, we have, but nothing in that regard has been implemented yet. You can add your own system using qualifications by granting quals only to people who have passed a particular test.

--What I was really getting at here was whether Amazon had any interest in providing a pool of workers who could be trusted to view private or confidential information. From what I gathered tonight, I think they're more interested in providing the interface and letting Requesters work out this sort of thing for themselves. Apparently nothing prevents you from restricting the HITs you submit to the MTurk system to exactly the workers you choose.
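To make the "grant quals only to people who pass a test" idea concrete, here is a minimal sketch of how a Requester might track it on their own side. This is a toy in-memory model, not the MTurk API: the `Qualification` class, `record_test`, `can_view`, the worker IDs, and the passing score are all hypothetical names and values I made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Qualification:
    """Hypothetical requester-side record of who holds a qualification."""
    name: str
    passing_score: int
    granted: set = field(default_factory=set)

    def record_test(self, worker_id: str, score: int) -> bool:
        """Grant the qualification only if the worker passed the test."""
        if score >= self.passing_score:
            self.granted.add(worker_id)
        return worker_id in self.granted


def can_view(required: Qualification, worker_id: str) -> bool:
    """A HIT that requires `required` is visible only to workers who hold it."""
    return worker_id in required.granted


# Example with made-up workers and scores.
receipts = Qualification(name="receipt-review", passing_score=8)
receipts.record_test("worker-a", 9)   # passes, qualification granted
receipts.record_test("worker-b", 5)   # fails, nothing granted
```

With this in place, `can_view(receipts, "worker-a")` is true and `can_view(receipts, "worker-b")` is false, which is the whole point: the vetting is whatever test the Requester decides to set.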

3. Alan: I can imagine ways to use MTurk where you might only pay for a hit if a worker happens to find something. This would lead to a high rejection rate that might affect their ability to do other hits. Any thoughts on how to handle that?

Answer: We are considering exposing info about the requesters, such as their approval and rejection rates for different types of HITs. This would give workers insight into whether they want to work on those particular types of HITs or not.

--Again, I didn't explain what I was getting at very clearly. I might want workers to look through a set of scanned receipts and flag any that have problems. If I only wanted to pay when they found an erroneous receipt, I'd have to reject every submission that didn't find one. It occurred to me I could account for this by adjusting the rate I paid for each submitted HIT, but it might be useful to have a way to pay selectively for a HIT without rejecting the rest.
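The rate adjustment is simple arithmetic: if I would have paid some bounty per problem receipt found, and I expect a certain fraction of receipts to have problems, then paying every submitted HIT the bounty times that fraction costs the same on average. A rough sketch, where the function name and all of the numbers are my own assumptions and not figures from Amazon:

```python
def flat_rate_per_hit(bounty_per_find: float, expected_find_rate: float) -> float:
    """Per-HIT rate that matches, on average, paying a bounty only on finds.

    bounty_per_find: what I would pay for each erroneous receipt found.
    expected_find_rate: my guess at the fraction of receipts with problems (0 to 1).
    """
    if not 0.0 <= expected_find_rate <= 1.0:
        raise ValueError("expected_find_rate must be between 0 and 1")
    return bounty_per_find * expected_find_rate


# Hypothetical numbers: a 50-cent bounty and a guess that 4% of receipts are bad.
rate = flat_rate_per_hit(bounty_per_find=0.50, expected_find_rate=0.04)
total_for_batch = rate * 1000  # expected cost of a 1,000-receipt batch
print(f"pay about ${rate:.2f} per HIT, about ${total_for_batch:.2f} per 1,000 receipts")
```

The catch is that this only works if my guess at the error rate is close, and it pays the same whether or not a worker actually looked carefully, which is why a built-in way to pay selectively would still be nice.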

4. Alan: How long do you anticipate MTurk staying in Beta?

Answer: Until we are ready to launch!

-- A bit of a flippant answer, but it was an off-the-cuff question. What I should have asked is if anything will change once MTurk leaves Beta or if there's any reason to wait until it leaves Beta status to submit requests.

5. Alan: Will work always have to be done through MTurk's website?

Answer: Today we don't have a set of APIs, but we are actively looking at ways to incorporate MT into the worker side of your applications. We would love to hear of your requirements in that space.

-- Someone else asked basically the same question, but it didn't make it to the chat window. The answer was to their question. It sounds like they're open to other interfaces being developed, but don't have the tools available yet.

6. Rob: I've noticed you've added CAPTCHA support to the site. Is this feature going to be automatically applied to HITs that requesters submit?

Answer: Yes, it is applied for all new users after the first 5 HITS and then progressively thereafter, for all hits in the system, not just Amazon ones.

-- The image CAPTCHAs are here to stay, which is a great thing and a good first step in causing the script kiddies as much grief as possible.


captnkurt said...


Thanks for sharing some of the points covered for those of us who didn't/couldn't catch the presentation.

I do wish that TPTB at MTurk would be a little more communicative on their site. I am guessing a lot of people each day are hearing buzz about MTurk, but when they go there, there are no available HITs and no communication as to why, or what to expect. The FAQ is okay as long as there is actual work out there to do, but for all the newbies just arriving, it is useless.

If they just had some kind of a "News" section that had semi-current updates. Even if they just said something along the lines of, "MTurk currently has limited HITs available, but more will be coming soon", or something, anything...

Suburban Wolf said...

Hi, I'm the "Rob" who asked about CAPTCHA support. I did get a kick out of their response to the question about committal of resources to the project - the 'chuckle of contempt' from the team, a la 'What kind of question is that?'.

I was simply asking from a logistical standpoint - Amazon have lots of money, but we don't know if they've got an entire rack already dedicated to this task, or a couple of 486's in the corner, y'know. :)