Turk Lurker

Friday, December 23, 2005

Casting Words Style Guide

Rachel at Castingwords.com has started a Style Guide to help people better understand how to complete these types of hits on MTurk.

Wednesday, December 21, 2005

Rachel at Casting Words has agreed to moderate a forum at Turker Nation where Turkers can easily communicate with her and other Turkers interested in podcast transcription work. Casting Words was one of the early requestors outside of Amazon to post hits on MTurk and it's been interesting to follow their progress so far.

Low User Activity

I was unable to attend last Thursday night's training session hosted by Jeff Barr due to my father passing away that night. Hopefully there will be a transcript available.

From what I can tell, things are slow currently on MTurk. A few new types of hits have shown up, but honestly I find the Top Three hits to be completely uninteresting. If MTurk moves towards becoming just a new way of doing survey work I'd be greatly disappointed.

The new types of hits also lead me to make a suggestion to the MTurk team. It would be a great thing to be able to sort the available hits just by clicking on column headers instead of having to choose from a dropdown list and click a button. I realize this would involve changing the layout of how hits are presented to users, but now is a great time to do so while MTurk is still in Beta status. It would also be very desirable to have an easy way to exclude hits from the list, so I wouldn't have to scroll through a lot of Top Three hits to see if there's anything more interesting to work on.

Tuesday, December 13, 2005

Amazon Training Session

I participated in the online presentation by Jeff Barr this evening. I found it interesting, although the initial presentation was geared more towards management and only offered a high-level overview of the Mechanical Turk service.

Of much more interest was the question and answer session at the end. I'm not sure who besides Jeff Barr and John Hsia was answering questions since they weren't introduced, but it was said they were development team members. I had a few questions prepared before I signed on, and I asked a few more that I thought of later.

1. Alan asks: Approximately how many individual turkers are regularly submitting hits on a daily basis?

Answer: That is currently confidential information.

2. Alan asks: Has the MTurk team considered any methods of verifying worker identity and vetting worker backgrounds in order to allow them to view more sensitive data?

Answer: Yes we have, nothing in that regard has been implemented yet. You can add your own system using qualifications by only granting people quals who have passed a particular test.

--What I was really getting at here was whether Amazon had any interest in providing a pool of workers who could be trusted to view private or confidential information. From what I gathered tonight I think they're more interested in providing the interface and letting Requestors work out this sort of thing for themselves. There is apparently nothing preventing you from giving access to the Hits you provide to the MTurk system only to exactly the workers you want.

3. Alan: I can imagine ways to use MTurk where you might only pay for a hit if a worker happens to find something. This would lead to a high rejection rate that might affect their ability to do other hits. Any thoughts on how to handle that?

Answer: We are considering exposing info about the requesters such as their approval and rejection rates for different types of hits. This would give workers insight into if they want to work on those particular types of HITS or not.

--Again I didn't really explain what I was getting at very clearly. I might want to have workers look through a set of scanned receipts and flag any that have problems. If I only wanted to pay when they found an erroneous receipt, I'd have to reject the hits where they didn't. It occurred to me I could account for this by adjusting the rate I paid for each submitted hit, but it might be useful to have a way to selectively pay for hits without having to reject the rest.
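To put rough numbers on the "adjust the rate" idea, here is a minimal sketch. The bounty and the share of bad receipts are made-up figures for illustration, and `flat_rate_cents` is a hypothetical helper, not anything MTurk provides. The idea is simply to spread the bounty you would have paid per find across every submitted hit, so nobody has to be rejected.

```python
from fractions import Fraction


def flat_rate_cents(bounty_cents: int, problem_rate: Fraction) -> Fraction:
    """Per-hit rate that costs the same, on average, as paying
    `bounty_cents` only for hits where a problem is found.

    `problem_rate` is the assumed fraction of receipts with a problem.
    """
    if not 0 <= problem_rate <= 1:
        raise ValueError("problem_rate must be between 0 and 1")
    return bounty_cents * problem_rate


# Hypothetical figures: a 50 cent bounty, 1 receipt in 10 has a problem.
rate = flat_rate_cents(50, Fraction(1, 10))
print(f"Pay {float(rate):.1f} cents for every submitted hit")
```

The catch is that a flat rate pays the same whether or not the worker looked carefully, which is why a way to pay selectively without rejecting would still be nice to have.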

4. Alan: How long do you anticipate MTurk staying in Beta?

Answer: Until we are ready to launch!

-- A bit of a flippant answer, but it was an off-the-cuff question. What I should have asked is if anything will change once MTurk leaves Beta or if there's any reason to wait until it leaves Beta status to submit requests.

5. Alan: Will work always have to be done through MTurk's website?

Answer: Today we don't have a set of APIs, but we are actively looking at ways to incorporate MT into the worker side of your applications. We would love to hear of your requirements in that space.

-- Someone else asked basically the same question, but it didn't make it to the chat window. The answer was to their question. It sounds like they're open to other interfaces being developed, but don't have the tools available yet.

6. Rob: I've noticed you've added CAPTCHA support to the site, is this feature going to be automatically applied to HITs requestors submit?

Answer: Yes, it is applied for all new users after the first 5 HITS and then progressively thereafter, for all hits in the system, not just Amazon ones.

-- The image captchas are here to stay, which is a great thing and a good first step in causing the script kiddies as much grief as possible.

Friday, December 09, 2005

Still Hittin' It

I've still been doing about 200 or so hits a day, just to keep up with things and add a little extra money to my Amazon account. At that rate I'd make about $100 a month, which is enough to at least make me smile.

One thing that doesn't make me smile is still having to fight for hits. Clicking accept 20 times in a row and not getting a hit to work on is the only thing that still just makes me go red when I'm on the MTurk site. The Monolith is still silent about it of course. Meh.

I'm signed up for an online training session/overview this Tuesday evening with host Jeff Barr. It should be interesting and I hope to get a few questions in and I promise not to heckle Jeff too much. I'm pretty sure I'm bigger than him, but he may work out more. Heh, heh, heh.

Monday, December 05, 2005

5k Hits

I recently submitted my 5000th hit since I started participating in the MTurk project about a month ago. In that month I made a little over $131.00. Not too bad for piddling around in my spare time I suppose.

I've also recently broken down and done a few more Artist Confirmation hits. I still don't really like them, but I have a stubborn, pitiful technique of doing them so if I make three bucks a day doing them I'm happy. I basically skip the hits until I see something like "Pat Boone" or "The Temptations" or something that is obviously "Various Artists." Combining that with having to hit Accept about 10 times before I actually get a hit makes for very slow progress.

I think Amazon's stubborn refusal to do anything about the hits so easily being given to someone else before you have a chance to accept them is soon going to turn me off of the entire project. It's too frustrating. Apparently hits are just shown to as many people as possible until someone clicks accept. It's like they've put you in one of those Vegas "money chambers" where the dollar bills are blowing all around you and you've only got so much time to grab as many as you can.

In the past few days people have also been posting on the message boards that they always choose "None of the Others" when doing IA hits. I personally find this distasteful, but it's hard to not expect it. It's the best way to make the highest amount of money in the shortest amount of time so until they implement some method to discourage it, IA hits are probably dead.

Wednesday, November 30, 2005

I've Been Hit Hiking

So I guess I spoke too soon about the Podcast Transcription hits. I saw a few today and they were back down to four cents apiece. I broke down and did about twenty artist confirmation hits, but none have been processed as of yet. I don't think I'll do many more since I just honestly don't like doing them.

There have been rumors of a new Product Description hit that pays upwards of seventy-five cents, but I haven't seen one on the hit list to confirm it yet. As soon as I see one listed I'll try it out.

I managed to snag about 40 IA hits earlier this evening, but it was a frustrating process. I guess I'm going to have to admit I'm a stubborn old goat and try out some of the auto-accept scripts. I really hope the MTurk guys change the process soon so that you get at least a few seconds to accept a hit before someone else can. It would also be great to have a Submit and Assign button that just accepts the next hit in the group, but there seems to be some indication from the Monolith that they want us to be more selective in what hits we accept. Check out this thread on Turker Nation for more information on that.

Tuesday, November 29, 2005

A Matter of Trust

There's an old saying that "Trust must be earned." As it stands right now on Amazon's Mechanical Turk, everyone starts out being trusted. That's very noble, but it causes problems when the script kiddie barbarians show up at the gate and they get waved in with the rest of us.

My experience as a parent and what I learned when studying for my education degree tells me that trust must be given to a child as they show more responsibility. If they stay within the box that defines the rules of the adult/child relationship, the box expands and they are given more freedom. If they step outside that box, it shrinks down to a more restrictive level until the child earns the trust to expand it again.

My computer science degree and my experience in the IT field allow me to define a way to reflect this in a system to govern hit acceptance in the MTurk environment. As I understand it, the method of accepting or rejecting hits is left to the client, so these are my thoughts on how to do that in a way that rewards success and limits the damage that auto-accept and submit scripts can do. I'm going to use the A9 Image Adjustment (IA) hits for this, but the method should work with any of the hit types seen so far on MTurk.

If we extrapolate my example of the adult/child relationship, then we have Amazon as the adult and the workers as the child. I'm not implying of course that the Turk Wurkers are children, but the relationship is similar because the adult/A9 has no reason to trust the child/worker when the relationship begins. If a method is put into place that allows trust by A9 to be measured and modified for each worker, I think many of the current problems with IA hits could be eliminated.

To begin, a variable has to be assigned to each Turker that measures Trust. This would be a simple integer value and would begin at a low number of say five (5) for new workers. A value of zero would imply that there is a neutral level of Trust between A9 and the worker. The value would increase as trust is gained and decrease as trust is removed. Any negative value would imply lack of trust and will be discussed later.

In order to determine whether you trust a new worker, you have to have a basis to judge them against. This requires a seed of absolute trust. For each group of images that you plan to test in the MTurk system, you have internal workers or admins select a small percentage of them and choose the definite correct answer to establish them as Trust Markers (TM). These sets of images can be placed semi-randomly in the work flow. Each correct answer will raise the Trust level of a worker by one point. The higher the Trust value is for a given worker, the less often these Trust Markers need to show up for that worker. In this way A9 would have to spend less on these types of images as more trust is gained. This is analogous to a company spending more on a worker when they are first hired in order to train them.

An incorrect answer to a TM would result in one point being subtracted from that worker's Trust value. Once Trust goes negative, the worker would no longer be allowed to accept any hits until Trust reaches zero again. Negative points could decay at a rate set by A9, so it could take an hour or a day before the new user could try again. The value could also be allowed to increment back to the starting point of five after a certain amount of time. This would allow for a bit more leniency.
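For the programmers in the audience, the bookkeeping described so far could be sketched roughly like this. It's a minimal illustration only: the `Worker` class and its method names are hypothetical, and nothing here reflects how MTurk or A9 actually track workers.

```python
from dataclasses import dataclass

START_TRUST = 5  # starting value suggested in the post


@dataclass
class Worker:
    name: str
    trust: int = START_TRUST

    def can_accept_hits(self) -> bool:
        # Zero is neutral; any negative value locks the worker out.
        return self.trust >= 0

    def record_trust_marker(self, correct: bool) -> None:
        # One point up for a correct Trust Marker, one point down otherwise.
        self.trust += 1 if correct else -1

    def decay_penalty(self, points: int = 1) -> None:
        # Negative Trust drifts back toward zero at whatever rate is chosen.
        # Calling this on a timer (hourly, daily) is left to the caller.
        if self.trust < 0:
            self.trust = min(0, self.trust + points)


johnny = Worker("Johnny Turker")
johnny.record_trust_marker(correct=False)
print(johnny.trust, johnny.can_accept_hits())
```

The optional "increment back to five" leniency would just be a second timer that keeps adding points until `trust` reaches `START_TRUST`.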

Just the addition of this functionality would severely hamper how much damage a scripter could do to the results, but one more feature is needed to eliminate the need to pay them for the random hits they did get correct before being locked out. I call this feature a Trust Lock.

A Trust Lock is created by taking a standard set of A9 images, including the "None of the Others" image, and changing one of the images so that it reads "Submit This Image" in the same style as the NotO image. The worker would obviously be required to submit that particular image to answer the hit correctly. The Trust Lock would be dropped in much less frequently than the TMs, but answering the Trust Lock image set incorrectly would lead to an immediate Trust value of negative one (-1). This sounds harsh, but only a script or someone not paying attention would miss one of these.

In addition, all hits submitted since the last time you correctly answered a Trust Lock hit would be automatically rejected, whether they were answered correctly or not. Again, this sounds harsh but there's no reason to ever answer one incorrectly unless you're running an auto-accept script or working at a pace that is too high.
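The hold-and-release behavior of a Trust Lock might look something like the sketch below. The `TrustLockLedger` class, its field names and the hit IDs are all made up for illustration; this is one way to read the rules above, not a description of any real MTurk mechanism.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrustLockLedger:
    trust: int = 5
    # Hits submitted since the last correctly answered Trust Lock.
    pending: List[str] = field(default_factory=list)
    eligible: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)

    def submit_hit(self, hit_id: str) -> None:
        self.pending.append(hit_id)

    def answer_trust_lock(self, correct: bool) -> None:
        if correct:
            # Everything held so far becomes eligible for normal processing.
            self.eligible.extend(self.pending)
        else:
            # Everything held so far is rejected, right answers included,
            # and Trust drops straight to -1.
            self.rejected.extend(self.pending)
            self.trust = -1
        self.pending.clear()


ledger = TrustLockLedger()
ledger.submit_hit("hit-001")
ledger.submit_hit("hit-002")
ledger.answer_trust_lock(correct=True)
ledger.submit_hit("hit-003")
ledger.answer_trust_lock(correct=False)
print(ledger.eligible, ledger.rejected, ledger.trust)
```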

So let's use a few real world examples to walk through the methodology I just covered. First, let's say Johnny Turker heard about MTurk from his roommate and logs in and creates an account. He's assigned an initial Trust value of five (Trust = 5). Johnny then goes off and selects a group of IA hits to work on and starts turking.

At some point within the first 20 hits, Johnny is unknowingly presented a Trust Marker hit, and being inexperienced, gets it wrong. This drops his Trust value to four. At this point the MTurk system may decide to assign the next TM within 10 to 15 hits, since the trust level is less. If he had answered the first TM correctly, the MTurk system may wait until 20 to 25 more hits, depending on the algorithm used to determine how often these hits are presented to the worker. The system could also decide to immediately drop in a Trust Lock hit since missing the first TM that was presented could raise suspicions of him using a script.

Regardless, within the first 50 hits Johnny is presented with his first Trust Lock hit and he answers it correctly. At that point all the hits he submitted before the Trust Lock are eligible to be processed, while all the hits after this point will be processed when the next Trust Lock is answered correctly.

After doing about 100 hits, Johnny calls it a night and logs out with a Trust value of say six (6) since he improved how often he answered the TMs correctly. Later that night, Johnny falls under the influence of an evil script kiddy buddy down the hall in his dorm, who tells him he has a script that will randomly answer IA hits and make him lots of money while he sleeps.

Johnny installs the script and it starts running. Within 20 hits or so it encounters its first TM hit. It has a 1 in 7 chance of answering this correctly, which is quite possible but somewhat unlikely. Since Johnny's Trust value is still relatively low, he will also be presented a Trust Lock hit soon as well. Between the two types of Trust hits, it is unlikely the script will run for very many hits before Johnny is locked out of accepting any more hits. This also makes it unlikely that A9 will have to pay him for the submitted hits since they can know with some confidence that they're likely junk submittals.
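To see how quickly the odds turn against a random-clicking script, here is a small sketch. It assumes every Trust hit (Trust Markers and Trust Locks alike) offers seven choices and that the script picks uniformly at random; the function name is hypothetical.

```python
from fractions import Fraction


def chance_of_guessing_all(checks: int, options: int = 7) -> Fraction:
    """Probability that uniform random guessing gets `checks`
    Trust hits in a row right, with `options` choices on each."""
    if checks < 0 or options < 1:
        raise ValueError("checks must be >= 0 and options >= 1")
    return Fraction(1, options) ** checks


for n in (1, 2, 3):
    p = chance_of_guessing_all(n)
    print(f"{n} Trust hit(s) in a row: {p} (about {float(p):.2%})")
```

Under those assumptions the chance of getting even two in a row is 1 in 49, so a script would rarely get past its first Trust Lock.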

The Trust value also allows Amazon and A9 to remove more of these restrictions after Trust reaches a certain level. The restriction on hits being processed until a Trust Lock is passed could be removed after Trust reaches a value of 25 or whatever level is determined to be appropriate. The Trust level could also be used to allow access to other new types of hits that pay better or that are more sensitive to script manipulation.

A possible algorithm for how often a turker is presented with TMs could be (TL × 20) - random(1 ... TL × 10), where TL is the turker's Trust Level. So at a Trust Level of 10 a turker would see a TM within the next 100 to 199 hits, which is 200 minus a random number between 1 and 100. Trust Locks could occur more randomly, but one should probably be presented to the turker within a few hits of a TM being answered incorrectly.
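Written out as code, that spacing rule is only a couple of lines. This is a sketch of the formula exactly as stated, reading it as Trust Level times 20 minus a random whole number from 1 to Trust Level times 10; the function name is my own.

```python
import random
from typing import Optional


def hits_until_next_trust_marker(trust_level: int,
                                 rng: Optional[random.Random] = None) -> int:
    """Number of hits before the next Trust Marker is dropped in."""
    if trust_level < 1:
        raise ValueError("formula only makes sense for a Trust Level of 1 or more")
    rng = rng or random.Random()
    # randint is inclusive on both ends, matching "between 1 and TL x 10".
    return trust_level * 20 - rng.randint(1, trust_level * 10)


# At a Trust Level of 10 every draw lands between 100 and 199.
draws = [hits_until_next_trust_marker(10) for _ in range(5)]
print(draws)
```

One thing the code makes obvious: at the starting Trust Level of five the same formula gives 50 to 99 hits between TMs, so a real system would probably want tighter spacing for brand-new workers than the bare formula provides.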

The existing qualifications can be combined with the Trust level to create new seeds of TM hits. If enough people with a certain level of Trust and a high level of accuracy agree on a certain image, it could be turned into a new TM hit. The Trust value could also be used to reduce the number of workers a hit has to be shown to in order to verify it. The higher the Trust value of a turker, the more weight is given to their response, so one Turker with a Trust value of 100 and an accuracy of 90% could replace multiple submittals by turkers with lower values.
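One simple way to weight responses is a trust-weighted vote, sketched below. It assumes the weight is just the raw Trust value (accuracy is left out to keep it short), and the answer labels and the `weighted_answer` name are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def weighted_answer(votes: Iterable[Tuple[str, int]]) -> str:
    """Pick the answer with the largest total Trust behind it.

    Each vote is (answer, trust). Votes from workers whose Trust
    is zero or negative carry no weight.
    """
    totals: Dict[str, int] = defaultdict(int)
    for answer, trust in votes:
        if trust > 0:
            totals[answer] += trust
    if not totals:
        raise ValueError("no votes from trusted workers")
    return max(totals, key=totals.get)


votes = [("image_3", 100), ("none_of_the_others", 5), ("none_of_the_others", 8)]
print(weighted_answer(votes))
```

Here the single high-Trust turker outweighs the two newer ones, which is the effect described above. Whether raw Trust is the right weight, or whether it should be capped or multiplied by accuracy, is an open question.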

This all leads me to a method where A9 could get more value out of their IA hits, but I'll leave that for the next gigantipost. Please feel free to comment on anything I missed or ways this method could be abused. I'll edit this post with any changes we come up with.

Transcription Hits Up to a Nickel

I saw a few of the Transcription Hits today at lunch. They quickly disappeared, but I did notice they were paying five cents each instead of two cents. That's a great improvement, but IA hits at three cents each are a much better pay out. Of course, I haven't been able to actually receive an IA hit for the last two days, so the difference is a moot point.

Monday, November 28, 2005

Turk Props

There was a recent post on the Amazon Web Services Blog that I missed during my holiday down time. Thanks to them for mentioning this blog and the Turker Nation message board. There wasn't a lot of information revealed in the blog post, but they seem to indicate that they're actively seeking new types of hits to enter into the system from outside customers. Personally, I think they're going to have to do a lot more testing on ideas to discourage script kiddies and other dillholes before they start trying to get paying customers.

Sunday, November 27, 2005

Breaks

I've taken a break since Tuesday from work and MTurk for the most part. I wanted to do a lot of hits last night but was only able to do about 200 since the Image Adjustment hits available were running low and I don't like doing the Confirm Artist Name hits.

There are rumors on the Turker Nation boards that people are running scripts just to accept hits and randomly click an image and submit them as rapidly as possible. There has been no word from the Monolith at Amazon of course, so there's no way to confirm or deny it. Greedy people running scripts with 25 tabs open at once on their browser could also account for some of it.

Hopefully Amazon will seriously consider putting a hard limit of less than 10 hits that can be checked out at once. I'm sure this will slow some people down, but doing 1000 hits an hour is not really providing valuable work anyway. It's just taking advantage of the fact that the A9 images are skewed so heavily towards the "None of the Above." If "None of the Above" were only the correct option about 20% of the time, you'd have to spend more than 4 seconds on a hit or your accuracy would suffer greatly.

I suspect that Amazon might not care about getting "real" work done with IA hits so much right now. I see the same IA hits over and over again so I think they're recycling a lot in order to maintain load on the system and get more data they can use to optimize things.

Wednesday, November 23, 2005

Transcription Hits

A new type of MTurk Hit appeared recently. For these you listen to a small audio podcast and transcribe what you hear. While generating some interest because they are new, these hits will not be very popular if they continue to pay only two cents per hit. In the time it takes you to do one of these, you could finish multiple Image Adjustment hits.

Tuesday, November 22, 2005

Artist Name Hits

I was recently able to do a few of the latest hit type, where you are asked to look at the cover of an album and then type in the correct name of the artist. They give you some samples of potential artist names as well. These hits have been paying either two or three cents, but usually two cents per approval.

Honestly, I don't like doing them. Maybe I'm just too used to doing Image Adjustment hits. I also don't want to take a 33% pay cut for doing hits that actually take longer for me to do. I think the combination of typing and using the mouse slows me down to about four per minute where I usually do about six IA hits a minute.

Six submits a minute on IA Hits puts me in the geezer league on hits, by the way. I can't even maintain that rate for more than about 10 minutes before I have to take a break. I think getting rich off the Mechanical Turk is not in my destiny. As it is I've made about $100 in the last three weeks by just spending a few hours a night.

There still seem to be some issues with negative pending hits showing up in the dashboard, but various people have received emails from Amazon support stating this should not affect your earnings. There also seems to be a delay in moving money into your account. I've earned enough to order my first goodie from Amazon, but it hasn't shown up in my account yet for me to spend. Patience.

Monday, November 21, 2005

Turk Loot

As of last night, with my average 9 to 10% rejection rate, I have submitted enough hits to be able to order my first piece of Turk Loot once the hits are processed. After a long hiatus over the weekend, the Mechanical Turk is back and a lot of the previously pending hits have been processed. Although many Turkers were disappointed that the system was down for so long, I'm sure their wrists and clicking fingers are appreciative of the rest.

Amazon Web Services Evangelist Jeff Barr also stopped by the message boards at Turker Nation recently and gave some nice input and asked for suggestions from the user community. Make sure you stop by when you take your next break from processing hits. He confirmed that new types of hits will be showing up eventually and that they're always looking for more ideas for new types of hits.

Saturday, November 19, 2005

Wounded Turk

I stayed up way too late last night doing IA Hits on MTurk, and about 2am cst the Turk Masters were merciful to me and took the system down. Almost 12 hours later it appears to still be down, causing much consternation among the Turker Nation, at least among those who don't have sporting events to watch.

I'm very close to getting the helmet I want for free, although I guess I could use what I have now and spend about three dollars to go ahead and order it. Since I do all the Turk Wurk I do in my free time, I'm treating the funds I earn as extra spending money so I can get more of the toys I crave. I've even gone and updated my wishlist at Amazon so I can drool over what I want occasionally.

Friday, November 18, 2005

Friday Morning

Not a whole lot to report about Mechanical Turk lately. The new "None of the others" image seems to be fully implemented now and I didn't see any of the previous two types yesterday. Because of a school function last night, I only submitted 159 hits yesterday. There were also a lot of system errors which slowed down the pace since I couldn't get in a good rhythm so I just went to bed early. I'm still being stubborn and not using the scripts which would probably solve a lot of that for me.

The processing of hits submitted still seems to be hit or miss. They processed a lot of them sometime in the early morning hours yesterday, then only a few got processed from yesterday. This is probably the one thing that is frustrating the majority of Turkers that I've heard from. Some are also frustrated about the account balance not staying up to date recently, but this hasn't affected me yet since I haven't cashed anything out yet.

At this point I'm down to needing about 800 more approvals to get the helmet I've been craving. I'll probably take a break next week during the holidays since I don't want to do anything resembling work for a few days other than getting the big meal ready.

Thursday, November 17, 2005

Getting There

I had planned to write about how troubled and frustrating the Mechanical Turk was yesterday. It was troubled and frustrating yesterday, there's no doubt about that. I submitted over 300 hits by the time I went to sleep last night and only about 25 had been processed. I was getting close to having over 1000 hits still in Pending status. The MTurk systems seemed to be overloaded yesterday as well. There were lots of page errors and the Image Adjustment hits were a strange conglomeration of all three of the None of The Above options we've seen so far.

I was prepared to write a fairly harsh blog entry this morning, but when I looked at my account this morning there were over 500 of my pending hits processed while I slept. Bliss! My rejection rate is down to almost 10% and I'm now over halfway in my quest to acquire a new Petzl Ecrin Roc helmet.

So in short, MTurk has some problems, but it's still one of the coolest projects on the Internet in my own humble opinion. If I had to fault them for anything, it would be that they're not communicating with their userbase effectively yet. There's no official MTurk announcement board or blog, and the only way we're getting much official news right now is from the emails from customer service that we're sharing with each other over on Turker Nation.

Wednesday, November 16, 2005

Black Box Fading Away

As first reported on Turker Nation by Jake, a new None of the Above has appeared on MTurk and can currently be seen in Image Adjustment hits for Fort Worth. This is very likely the latest attempt to stifle the script kiddies out there who were likely scanning images for lots of black space or smaller file size, since black images compress more easily.

I think we're going to have to get used to the fact that having a script pre-enable None of the Above answers is not going to be feasible as long as the majority of IA Hits should correctly be answered with None of the Above.

Hell Yeah You Get Paid

I'm sure this will get smacked down by the Amazon lawyers very soon, but someone has put up a hilarious spoof site. Go check it out and then get back to submitting hits. No more breaks for you!