Welcome to Turk Lurker. Here you will find my thoughts and observations on the recently unveiled Mechanical Turk project from Amazon. I've long been a fan of distributed work projects and this blog will detail my experiences on this exciting new opportunity brought to us from Amazon. Come join the Turker Nation. You too will be assimilated.
Friday, December 23, 2005
Casting Words Style Guide
Rachel at Castingwords.com has started a Style Guide to help people better understand how to complete these types of hits on Mturk.
Wednesday, December 21, 2005
Casting Words Forum
Rachel at Casting Words has agreed to moderate a forum at Turker Nation where Turkers can easily communicate with her and other Turkers interested in podcast transcription work. Casting Words was one of the early requestors outside of Amazon to post hits on MTurk and it's been interesting to follow their progress so far.
Low User Activity
I was unable to attend last Thursday night's training session hosted by Jeff Barr due to my father passing away that night. Hopefully there will be a transcript available.
From what I can tell, things are slow currently on Mturk. A few new types of hits have shown up, but honestly I find the Top Three hits to be completely uninteresting. If MTurk moves towards becoming just a new way of doing survey work I'd be greatly disappointed.
The new types of hits also lead me to make a suggestion to the Mturk team. It would be a great thing to be able to sort the available hits just by clicking on column headers instead of having to choose from a dropdown list and clicking a button. I realize this would involve changing the layout of how hits are presented to users, but now is a great time to do so while MTurk is still in Beta status. It would also be very desirable to have an easy way to exclude hits from the list, so I wouldn't have to scroll through a lot of Top Three hits to see if there's anything else more interesting to work on.
Tuesday, December 13, 2005
Amazon Training Session
I participated in the online presentation by Jeff Barr this evening. I found it interesting although the initial presentation was geared more towards management and only offered a high level overview of the Mechanical Turk service.
Of much more interest was the question and answer session at the end. I'm not sure who besides Jeff Barr and John Hsia were answering questions since they weren't introduced, but it was said they were development team members. I had a few questions prepared before I signed on, and I asked a few more that I thought of later.
1. Alan asks: Approximately how many individual turkers are regularly submitting hits on a daily basis?
Answer: That is currently confidential information.
2. Alan asks: Has the MTurk team considered any methods of verifying worker identity and vetting worker backgrounds in order to allow them to view more sensitive data?
Answer: Yes we have, nothing in that regard has been implemented yet. You can add your own system using qualifications by only granting people quals who have passed a particular test.
--What I was really getting at here was whether Amazon had any interest in providing a pool of workers who could be trusted to view private or confidential information. From what I gathered tonight I think they're more interested in providing the interface and letting Requestors work out this sort of thing for themselves. There is apparently nothing preventing you from only allowing exactly who you want to have access to the Hits that you provide to the MTurk system.
3. Alan: I can imagine ways to use MTurk where you might only pay for a hit if a worker happens to find something. This would lead to a high rejection rate that might affect their ability to do other hits. Any thoughts on how to handle that?
Answer: We are considering exposing info about the requesters such as their approval and rejection rates for different types of hits. This would give workers insight into if they want to work on those particular types of HITS or not.
--Again I didn't really explain what I was getting at very clearly. I might want to have workers look through a set of scanned receipts and flag any that have problems. If I only wanted to pay when they found an erroneous receipt, I'd have to reject the hits where they didn't. It occurred to me I could account for this by adjusting the rate I paid for each submitted hit, but it might be useful to have a way to pay selectively for hits without having to reject the rest.
4. Alan: How long do you anticipate MTurk staying in Beta?
Answer: Until we are ready to launch!
-- A bit of a flippant answer, but it was an off-the-cuff question. What I should have asked is if anything will change once MTurk leaves Beta or if there's any reason to wait until it leaves Beta status to submit requests.
5. Alan: Will work always have to be done through MTurk's website?
Answer: Today we don't have a set of APIs, but we are actively looking at ways to incorporate MT into the worker side of your applications. We would love to hear of your requirements in that space.
-- Someone else asked basically the same question, but it didn't make it to the chat window. The answer was to their question. It sounds like they're open to other interfaces being developed, but don't have the tools available yet.
6. Rob: I've noticed you've added CAPTCHA support to the site, is this feature going to be automatically applied to HITs requestors submit?
Answer: Yes, it is applied for all new users after the first 5 HITS and then progressively thereafter, for all hits in the system, not just Amazon ones.
-- The image CAPTCHAs are here to stay, which is a great thing and a good first step in causing the script kiddies as much grief as possible.
Friday, December 09, 2005
Still Hittin' It
I've still been doing about 200 or so hits a day, just to keep up with things and add a little extra money to my Amazon account. At that rate I'd make about $100 a month, which is enough to at least make me smile.
One thing that doesn't make me smile is still having to fight for hits. Clicking accept 20 times in a row and not getting a hit to work on is the only thing that still just makes me go red when I'm on the MTurk site. The Monolith is still silent about it of course. Meh.
I'm signed up for an online training session/overview this Tuesday evening with host Jeff Barr. It should be interesting and I hope to get a few questions in and I promise not to heckle Jeff too much. I'm pretty sure I'm bigger than him, but he may work out more. Heh, heh, heh.
Monday, December 05, 2005
5k Hits
I recently submitted my 5000th hit since I started participating in the MTurk project about a month ago. In that month I made a little over $131.00. Not too bad for piddling around in my spare time I suppose.
I've also recently broken down and done a few more Artist Confirmation hits. I still don't really like them, but I have a stubborn, pitiful technique of doing them so if I make three bucks a day doing them I'm happy. I basically skip the hits until I see something like "Pat Boone" or "The Temptations" or something that is obviously "Various Artists." Combining that with having to hit Accept about 10 times before I actually get a hit makes for very slow progress.
I think Amazon's stubborn refusal to do anything about the hits so easily being given to someone else before you have a chance to accept them is soon going to turn me off of the entire project. It's too frustrating. Apparently hits are just shown to as many people as possible until someone clicks accept. It's like they've put you in one of those Vegas "money chambers" where the dollar bills are blowing all around you and you've only got so much time to grab as many as you can.
In the past few days people have also been posting on the message boards that they always choose "None of the Others" when doing IA hits. I personally find this distasteful, but it's hard to not expect it. It's the best way to make the highest amount of money in the shortest amount of time so until they implement some method to discourage it, IA hits are probably dead.
Wednesday, November 30, 2005
I've Been Hit Hiking
So I guess I spoke too soon about the Podcast Transcription hits. I saw a few today and they were back down to four cents apiece. I broke down and did about twenty artist confirmation hits, but none have been processed as of yet. I don't think I'll do many more since I just honestly don't like doing them.
There have been rumors of a new Product Description hit that pays upwards of seventy-five cents, but I haven't seen one on the hit list to confirm it yet. As soon as I see one listed I'll try it out.
I managed to snag about 40 IA hits earlier this evening, but it was a frustrating process. I guess I'm going to have to admit I'm a stubborn old goat and try out some of the auto-accept scripts. I really hope the MTurk guys change the process soon so that you get at least a few seconds to accept a hit before someone else can. It would also be great to have a Submit and Assign button that just accepts the next hit in the group, but there seems to be some indication from the Monolith that they want us to be more selective in what hits we accept. Check out this thread on Turker Nation for more information on that.
More Turk Links
I've been slow to update the links to other MTurk web sites, and for that I apologize. Life is getting more hectic and is only going to get worse I suspect when the baby gets here.
Turking.com is a great place for more information and they have extensive message boards, although you do have to register to view them.
A Hong Kong Mechanical Turk Addict is another personal blog that you can check out as well.
Tuesday, November 29, 2005
A Matter of Trust
There's an old saying that "Trust must be earned." As it stands right now on Amazon's Mechanical Turk, everyone starts out being trusted. That's very noble, but it causes problems when the script kiddie barbarians show up at the gate and they get waved in with the rest of us.
My experience as a parent and what I learned when studying for my education degree tells me that trust must be given to a child as they show more responsibility. If they stay within the box that defines the rules of the adult/child relationship, the box expands and they are given more freedom. If they step outside that box, it shrinks down to a more restrictive level until the child earns the trust to expand it again.
My computer science degree and my experience in the IT field allow me to define a way to reflect this in a system to govern hit acceptance in the MTurk environment. As I understand it, the method of accepting or rejecting hits is left to the client, so these are my thoughts on how to do that in a way that rewards success and limits the damage that auto-accept and submit scripts can do. I'm going to use the A9 Image Adjustment (IA) hits for this, but the method should work with any of the hit types seen so far on MTurk.
If we extrapolate my example of the adult/child relationship, then we have Amazon as the adult and the workers as the child. I'm not implying of course that the Turk Wurkers are children, but the relationship is similar because the adult/A9 has no reason to trust the child/worker when the relationship begins. If a method is put into place that allows trust by A9 to be measured and modified for each worker, I think many of the current problems with IA hits could be eliminated.
To begin, a variable has to be assigned to each Turker that measures Trust. This would be a simple integer value and would begin at a low number of say five (5) for new workers. A value of zero would imply that there is a neutral level of Trust between A9 and the worker. The value would increase as trust is gained and decrease as trust is removed. Any negative value would imply lack of trust and will be discussed later.
In order to determine whether you trust a new worker, you have to have a basis to judge them against. This requires a seed of absolute trust. For each group of images that you plan to test in the MTurk system, you have internal workers or admins select a small percentage of them and choose the definite correct answer to establish them as Trust Markers (TM). These sets of images can be placed semi-randomly in the work flow. Each correct answer will raise the Trust level of a worker by one point. The higher the Trust value is for a given worker, the less often these Trust Markers need to show up for that worker. In this way A9 would have to spend less on these types of images as more trust is gained. This is analogous to a company spending more on a worker when they are first hired in order to train them.
An incorrect answer to a TM would result in one point being subtracted from that worker's Trust value. Once Trust goes negative, the worker would no longer be allowed to accept any hits until Trust reaches zero again. Negative points could decay at a rate set by A9, so it could take an hour or a day before the new user could try again. The value could also be allowed to increment back to the starting point of five after a certain amount of time. This would allow for a bit more leniency.
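Here is a minimal sketch of that bookkeeping in Python. Nothing here is part of MTurk or A9; the class and method names are made up for illustration, and the decay step is just one way the "rate set by A9" could be applied.

```python
from dataclasses import dataclass

STARTING_TRUST = 5  # starting value suggested in this post


@dataclass
class WorkerTrust:
    """Hypothetical per-worker Trust record (not an MTurk API)."""
    trust: int = STARTING_TRUST

    def record_marker(self, correct: bool) -> None:
        # A correct Trust Marker answer adds one point; a miss subtracts one.
        self.trust += 1 if correct else -1

    def can_accept_hits(self) -> bool:
        # A worker is locked out only while Trust is negative.
        return self.trust >= 0

    def decay_penalty(self) -> None:
        # Run on whatever schedule the requester chooses (hourly, daily, ...).
        # Moves a negative score one step back toward zero.
        if self.trust < 0:
            self.trust += 1


worker = WorkerTrust()
worker.record_marker(correct=False)   # Trust drops from 5 to 4
print(worker.trust, worker.can_accept_hits())
```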
Just the addition of this functionality would severely hamper how much damage a scripter could do to the results, but one more feature is needed to eliminate the need to pay them for the random hits they did get correct before being locked out. I call this feature a Trust Lock.
A Trust Lock is created by taking a standard set of A9 images, including the "None of the Others" (NotO) image, and changing one of the images so that it reads "Submit This Image" in the same style as the NotO image. The worker would obviously be required to submit that particular image to answer the hit correctly. The Trust Lock would be dropped in much less frequently than the TMs, but answering the Trust Lock image set incorrectly would lead to an immediate Trust value of negative one (-1). This sounds harsh, but only a script or someone not paying attention would miss one of these.
In addition, all hits submitted since the last time you correctly answered a Trust Lock hit would be automatically rejected, whether they were answered correctly or not. Again, this sounds harsh but there's no reason to ever answer one incorrectly unless you're running an auto-accept script or working at a pace that is too high.
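The Trust Lock rules can be sketched the same way. This is only an illustration under my own assumptions: the ledger class, its field names and the string hit ids are hypothetical, and "released" just means eligible for normal processing, not automatically paid.

```python
class TrustLockLedger:
    """Hypothetical bookkeeping for one worker's submissions."""

    def __init__(self, trust: int = 5):
        self.trust = trust
        self.pending = []    # hits submitted since the last passed Trust Lock
        self.released = []   # hits now eligible for normal processing
        self.rejected = []   # hits thrown out by a failed Trust Lock

    def submit(self, hit_id: str) -> None:
        # Ordinary hits wait in the pending pile until a Trust Lock is answered.
        self.pending.append(hit_id)

    def answer_trust_lock(self, correct: bool) -> None:
        if correct:
            # Passing the lock releases everything submitted before it.
            self.released.extend(self.pending)
        else:
            # Failing it rejects the whole pending pile and sets Trust to -1.
            self.rejected.extend(self.pending)
            self.trust = -1
        self.pending.clear()


ledger = TrustLockLedger()
ledger.submit("hit-1")
ledger.submit("hit-2")
ledger.answer_trust_lock(correct=True)    # hit-1 and hit-2 are released
ledger.submit("hit-3")
ledger.answer_trust_lock(correct=False)   # hit-3 is rejected, Trust becomes -1
```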
So let's use a few real world examples to walk through the methodology I just covered. First, let's say Johnny Turker heard about MTurk from his roommate and logs in and creates an account. He's assigned an initial Trust value of five (Trust = 5). Johnny then goes off and selects a group of IA hits to work on and starts turking.
At some point within the first 20 hits, Johnny is unknowingly presented a Trust Marker hit, and being inexperienced, gets it wrong. This drops his Trust value to four. At this point the MTurk system may decide to assign the next TM within 10 to 15 hits, since the trust level is less. If he had answered the first TM correctly, the MTurk system may wait until 20 to 25 more hits, depending on the algorithm used to determine how often these hits are presented to the worker. The system could also decide to immediately drop in a Trust Lock hit since missing the first TM that was presented could raise suspicions of him using a script.
Regardless, within the first 50 hits Johnny is presented with his first Trust Lock hit and he answers it correctly. At that point all the hits he submitted before the Trust Lock are eligible to be processed, while all the hits after this point will be processed when the next Trust Lock is answered correctly.
After doing about 100 hits, Johnny calls it a night and logs out with a Trust value of say six (6) since he improved how often he answered the TMs correctly. Later that night, Johnny falls under the influence of an evil script kiddy buddy down the hall in his dorm, who tells him he has a script that will randomly answer IA hits and make him lots of money while he sleeps.
Johnny installs the script and it starts running. Within 20 hits or so it encounters its first TM hit. It has a 1 in 7 chance of answering this correctly, which is quite possible but somewhat unlikely. Since Johnny's Trust value is still relatively low, he will also be presented a Trust Lock hit soon as well. Between the two types of Trust hits, it is unlikely the script will run for very many hits before Johnny is locked out of accepting any more hits. This also makes it unlikely that A9 will have to pay him for the submitted hits since they can know with some confidence that they're likely junk submissions.
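To put a rough number on the script's odds, here is a small sketch. It assumes every trust check offers seven equally likely choices (the 1 in 7 figure above) and that a random script's answers are independent of each other; the function name is made up for illustration.

```python
from fractions import Fraction


def chance_script_passes(checks: int, choices: int = 7) -> Fraction:
    """Probability that purely random answers get `checks` trust hits right in a row."""
    return Fraction(1, choices) ** checks


# One TM alone, then a TM followed by a Trust Lock.
print(chance_script_passes(1))  # 1/7
print(chance_script_passes(2))  # 1/49
```

So under those assumptions a random script gets past a TM and a Trust Lock back to back only about 2% of the time.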
The Trust value also allows Amazon and A9 to remove more of these restrictions after Trust reaches a certain level. The restriction on hits being processed until a Trust Lock is passed could be removed after Trust reaches a value of 25 or whatever level is determined to be appropriate. The Trust level could also be used to allow access to other new types of hits that pay better or that are more sensitive to script manipulation.
A possible algorithm for how often a turker is presented with TMs could be (TL x 20) - random(1 ... TL x 10), where TL is the worker's Trust Level. So at a Trust Level of 10 a turker would see a TM within the next 100 to 199 hits, which is 200 minus a random number between 1 and 100. Trust Locks could occur more randomly, but one should probably be presented to the turker within a few hits of a TM being answered incorrectly.
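As a sketch, that spacing formula could look like this in Python. The function name is hypothetical, and refusing Trust Levels below 1 is my own assumption, since the formula says nothing about zero or negative values.

```python
import random
from typing import Optional


def hits_until_next_marker(trust_level: int,
                           rng: Optional[random.Random] = None) -> int:
    """Number of hits before the next Trust Marker: TL*20 - random(1..TL*10)."""
    if trust_level < 1:
        # Assumption: the formula only makes sense for positive Trust Levels.
        raise ValueError("trust_level must be at least 1")
    rng = rng or random.Random()
    return trust_level * 20 - rng.randint(1, trust_level * 10)


# At Trust Level 10 the result always falls between 100 and 199.
print(hits_until_next_marker(10))
```

Because the gap grows with the Trust Level, a long-trusted worker sees far fewer marker hits than a new one.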
The existing qualifications can be combined with the Trust level to create new seeds of TM hits. If enough people with a certain level of Trust and a high level of accuracy agree on a certain image, it could be turned into a new TM hit. The Trust value could also be used to reduce the number of workers a hit has to be shown to in order to verify it. The higher the Trust value of a turker, the more weight is given to their response, so one Turker with a Trust value of 100 and an accuracy of 90% could replace multiple submissions by turkers with lower values.
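One simple way to weight responses is to add up the Trust values behind each answer and take the largest total. This is only a sketch of that idea: the function name is hypothetical, it ignores accuracy entirely, and counting negative Trust as zero weight is my own assumption.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def weighted_answer(votes: Iterable[Tuple[str, int]]) -> str:
    """Pick the answer backed by the most total Trust.

    `votes` is a sequence of (answer, trust_value) pairs, one per worker.
    """
    totals = defaultdict(int)
    for answer, trust in votes:
        # Assumption: a worker with negative Trust contributes no weight.
        totals[answer] += max(trust, 0)
    return max(totals, key=totals.get)


# One Trust-100 turker outweighs three low-Trust turkers who disagree.
votes = [("image 3", 100), ("image 5", 10), ("image 5", 12), ("image 5", 8)]
print(weighted_answer(votes))  # image 3
```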
This all leads me to a method where A9 could get more value out of their IA hits, but I'll leave that for the next gigantipost. Please feel free to comment on anything I missed or ways this method could be abused. I'll edit this post with any changes we come up with.
My experience as a parent and what I learned when studying for my education degree tells me that trust must be given to a child as they show more responsibility. If they stay within the box that defines the rules of the adult/child relationship, the box expands and they are given more freedom. If they step outside that box, it shrinks down to a more restrictive level until the child earns the trust to expand it again.
My computer science degree and my experience in the IT field allows me to define a way to reflect this in a system to govern hit acceptance in the MTurk environment. As I understand it, the method of accepting or rejecting hits is left to the client, so this is my thoughts on how to do that in a way that rewards success and limits the damage that auto-accept and submit scripts can do. I'm going to use the A9 Image Adjustment (IA) hits for this, but the method should work with any of the hit types seen so far on MTurk.
If we extrapolate my example of the adult/child relationship, then we have Amazon as the adult and the workers as the child. I'm not implying of course that the Turk Wurkers are children, but the relationship is similar because the adult/A9 has no reason to trust the child/worker when the relationship begins. If a method is put into place that allows trust by A9 to be measured and modified for each worker, I think many of the current problems with IA hits could be eliminated.
To begin, a variable has to be assigned to each Turker that measures Trust. This would be a simple integer value and would begin at a low number of say five (5) for new workers. A value of zero would imply that there is a neutral level of Trust between A9 and the worker. The value would increase as trust is gained and decrease as trust is removed. Any negative value would imply lack of trust and will be discussed later.
In order to determine whether you trust a new worker, you have to have a basis to judge them against. This requires a seed of absolute trust. For each group of images that you plan to test in the MTurk system, you have internal workers or admins select a small percentage of them and choose the definite correct answer to establish them as Trust Markers (TM). These sets of images can be placed semi-randomly in the work flow. Each correct answer will raise the Trust level of a worker by one point. The higher the Trust value is for a given worker, the less often these Trust Markers need to show up for that worker. In this way A9 would have to spend less on these types of images as more trust is gained. This is analogous to a company spending more on a worker when they are first hired in order to train them.
An incorrect answer to a TM would result in one point being subtracted from that workers Trust value. Once Trust goes negative, the worker would no longer be allowed to accept any hits until Trust reaches zero again. Negative points could decay at a rate set by A9, so it could take an hour or a day before the new user could try again. The value could also be allowed to increment back to the starting point of five after a certain amount of time. This would allow for a bit more leniency.
Just the addition of this functionality would severely hamper how much damage a scripter could do to the results, but one more feature is needed to eliminate the need to pay them for the random hits they did get correct before being locked out. I call this feature a Trust Lock.
A Trust Lock is created by taking a standard set of A9 images, including the "None of the Others" image and changing one of the images so that it reads "Submit This Image" in the same style as the NotO image. The worker would obviously be required to submit that particular image to answer the hit correctly. The Trust Lock would be dropped in much less frequently than the TMs, but answering the Trust Lock image set incorrectly would lead to an immediate Trust value of negative one (-1). This sounds harsh, but only a script or someone not paying attention would miss one of these.
In addition, all hits submitted since the last time you correctly answered a Trust Lock hit would be automatically rejected, whether they were answered correctly or not. Again, this sounds harsh but there's no reason to ever answer one incorrectly unless you're running an auto-accept script or working at a pace that is too high.
So let's use a few real world examples to walk through the methodology I just covered. First, let's say Johnny Turker heard about MTurk from his roommate and logs in and creates an account. He's assigned an initial Trust value of five (Trust = 5). Johnny then goes off and selects a group of IA hits to work on and starts turking.
At some point within the first 20 hits, Johnny is unknowingly presented a Trust Marker hit, and being inexperienced, gets it wrong. This drops his Trust value to four. At this point the MTurk system may decide to assign the next TM within 10 to 15 hits, since the trust level is less. If he had answered the first TM correctly, the MTurk system may wait until 20 to 25 more hits, depending on the algorithm used to determine how often these hits are presented to the worker. The system could also decide to immediately drop in a Trust Lock hit since missing the first TM that was presented could raise suspicions of him using a script.
Regardless, within the first 50 hits Johnny is presented with his first Trust Lock hit and he answers it correctly. At that point all the hits he submitted before the Trust Lock are eligible to be processed, while all the hits after this point will be processed when the next Trust Lock is answered correctly.
After doing about 100 hits, Johnny calls it a night and logs out with a Trust value of say six (6) since he improved how often he answered the TMs correctly. Later that night, Johnny falls under the influence of an evil script kiddy buddy down the hall in his dorm, who tells him he has a script that will randomly answer IA hits and make him lots of money while he sleeps.
Johnny installs the script and it starts running. Within 20 hits or so it encounters it first TM hit. It has a 1 in 7 chance of answering this correctly, which is quite possible but somewhat unlikely. Since Johnny's Trust value is still relatively low, he will also be presented a Trust Lock hit soon as well. Between the two types of Trust hits, it is unlikely the script will run for very many hits before Johnny is locked out of accepting any more hits. This also makes it unlikely that A9 will have to pay him for the submitted hits since they can know with some confidence that they're likely junk submittals.
The Trust value also allows Amazon and A9 to remove more of these restrictions after Trust reaches a certain level. The restriction on hits being processed until a Trust Lock is passed could be removed after Trust reaches a value of 25 or whatever level is determined to be appropriate. The Trust level could also be used to allow access to other new types of hits that pay better or that are more sensitive to script manipulation.
A possible algorithm for how often a turker is presented with TMs could be (TL x 20) - random(1 ... TL x 10), where TL is the Trust Level. So at a Trust Level of 10 a turker would see a TM within the next 100 to 199 hits, which is 200 minus a random number between 1 and 100. Trust Locks could occur more randomly, but one should probably be presented to the turker within a few hits of a TM being answered incorrectly.
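Here is that formula as a small Python sketch. I'm reading it as the Trust Level times 20, minus a random whole number between 1 and the Trust Level times 10; the function name is made up, and none of this reflects how Amazon actually schedules anything.

```python
import random


def hits_until_next_tm(trust_level, rng=random):
    """Hits before the next Trust Marker: (TL * 20) - random(1 .. TL * 10)."""
    if trust_level < 1:
        raise ValueError("trust_level must be at least 1")
    return trust_level * 20 - rng.randint(1, trust_level * 10)


# At a Trust Level of 10 every gap lands between 100 and 199 hits.
print([hits_until_next_tm(10) for _ in range(5)])
```

Because both terms scale with the Trust Level, a more trusted turker sees Trust Markers less often, while the random part keeps the exact spot unpredictable.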
The existing qualifications can be combined with the Trust level to create new seeds of TM hits. If enough people with a certain level of Trust and a high level of accuracy agree on a certain image, it could be turned into a new TM hit. The Trust value could also be used to reduce the number of workers who have to see a hit in order to verify it. The higher the Trust value of a turker, the more weight is given to their response, so one Turker with a Trust value of 100 and an accuracy of 90% could replace multiple submittals by turkers with lower values.
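One way the weighting might look in code is below. The weight formula (Trust times accuracy) and the acceptance threshold are my own assumptions for the sake of an example, not anything Amazon or A9 has described.

```python
from collections import defaultdict


def weighted_answer(votes, threshold):
    """Pick the answer with the most weight, or None if it falls short.

    votes: list of (answer, trust, accuracy) tuples.
    Assumed weight per vote: trust * accuracy.
    """
    totals = defaultdict(float)
    for answer, trust, accuracy in votes:
        totals[answer] += trust * accuracy
    if not totals:
        return None
    best = max(totals, key=totals.get)
    return best if totals[best] >= threshold else None


# One highly trusted turker clears a threshold of 50 alone.
print(weighted_answer([("image 3", 100, 0.9)], threshold=50))

# Three low-Trust turkers agreeing still fall short of it.
low_trust = [("image 3", 10, 0.8)] * 3
print(weighted_answer(low_trust, threshold=50))
```

With these made-up numbers the single Trust 100 turker carries a weight of about 90, while the three Trust 10 turkers together carry about 24, so the hit would need more low-Trust confirmations before it counts as verified.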
This all leads me to a method where A9 could get more value out of their IA hits, but I'll leave that for the next gigantipost. Please feel free to comment on anything I missed or ways this method could be abused. I'll edit this post with any changes we come up with.
Transcription Hits Up to a Nickel
I saw a few of the Transcription Hits today at lunch. They quickly disappeared, but I did notice they were paying five cents each instead of two cents. That's a great improvement, but IA hits at three cents each are a much better pay out. Of course, I haven't been able to actually receive an IA hit for the last two days, so the difference is a moot point.
Monday, November 28, 2005
Turk Props
There was a post on the Amazon Web Services Blog recently that I missed during my holiday downtime. Thanks to them for mentioning this blog and the Turker Nation message board. There wasn't a lot of information revealed in the blog post, but they seem to indicate that they're actively seeking new types of hits to enter into the system from outside customers. Personally, I think they're going to have to do a lot more testing on ideas to discourage script kiddies and other dillholes before they start trying to get paying customers.
Sunday, November 27, 2005
Breaks
I've taken a break since Tuesday from work and MTurk for the most part. I wanted to do a lot of hits last night but was only able to do about 200 since the Image Adjustment hits available were running low and I don't like doing the Confirm Artist Name hits.
There are rumors on the Turker Nation boards that people are running scripts just to accept hits and randomly click an image and submit them as rapidly as possible. There has been no word from the Monolith at Amazon of course, so there's no way to confirm or deny it. Greedy people running scripts with 25 tabs open at once on their browser could also account for some of it.
Hopefully Amazon will seriously consider putting a hard limit of less than 10 hits that can be checked out at once. I'm sure this will slow some people down, but doing 1000 hits an hour is not really providing valuable work anyway. It's just taking advantage of the fact that the A9 images are skewed so heavily towards the "None of the Above." If "None of the Above" were only the correct option about 20% of the time, you'd have to spend more than 4 seconds on a hit or your accuracy would suffer greatly.
I suspect that Amazon might not care about getting "real" work done with IA hits so much right now. I see the same IA hits over and over again so I think they're recycling a lot in order to maintain load on the system and get more data they can use to optimize things.
Wednesday, November 23, 2005
Transcription Hits
A new type of MTurk Hit appeared recently. For these you listen to a small audio podcast and transcribe what you hear. While generating some interest because they are new, these hits will not be very popular if they continue to pay only two cents per hit. In the time it takes you to do one of these, you could finish multiple Image Adjustment hits.
Tuesday, November 22, 2005
Artist Name Hits
I was recently able to do a few of the latest hit type, where you are asked to look at the cover of an album and then type in the correct name of the artist. They give you some samples of potential artist names as well. These hits have been paying either two or three cents, but usually two cents per approval.
Honestly, I don't like doing them. Maybe I'm just too used to doing Image Adjustment hits. I also don't want to take a 33% pay cut for doing hits that actually take longer for me to do. I think the combination of typing and using the mouse slows me down to about four per minute where I usually do about six IA hits a minute.
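For the curious, here's the back-of-the-envelope math behind that. It assumes three cents per IA hit and two cents per Artist Name hit, uses my own rates of six and four hits a minute, and ignores rejections and breaks.

```python
def cents_per_minute(pay_cents, hits_per_minute):
    """Gross earnings per minute, before any rejected hits."""
    return pay_cents * hits_per_minute


ia_rate = cents_per_minute(pay_cents=3, hits_per_minute=6)
artist_rate = cents_per_minute(pay_cents=2, hits_per_minute=4)
print(f"IA hits: {ia_rate} cents/min, Artist Name hits: {artist_rate} cents/min")
```

That works out to 18 cents a minute for IA hits against 8 cents a minute for Artist Name hits, so the slower pace hurts more than the lower price per hit.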
Six submits a minute on IA Hits puts me in the geezer league on hits, by the way. I can't even maintain that rate for more than about 10 minutes before I have to take a break. I think getting rich off the Mechanical Turk is not in my destiny. As it is I've made about $100 in the last three weeks by just spending a few hours a night.
There still seem to be some issues with negative pending hits showing up in the dashboard, but various people have received emails from Amazon support stating this should not affect your earnings. There also seems to be a delay in moving money into your account. I've earned enough to order my first goodie from Amazon, but it hasn't shown up in my account yet for me to spend. Patience.
Monday, November 21, 2005
Turk Loot
As of last night, with my average 9 to 10% rejection rate, I have submitted enough hits to be able to order my first piece of Turk Loot once the hits are processed. After a long hiatus over the weekend, the Mechanical Turk is back and a lot of the previously pending hits have been processed. Although many Turkers were disappointed that the system was down for so long, I'm sure their wrists and clicking fingers are appreciative of the rest.
Amazon Web Services Evangelist Jeff Barr also stopped by the message boards at Turker Nation recently and gave some nice input and asked for suggestions from the user community. Make sure you stop by when you take your next break from processing hits. He confirmed that new types of hits will be showing up eventually and that they're always looking for more ideas for new types of hits.
Saturday, November 19, 2005
Wounded Turk
I stayed up way too late last night doing IA Hits on MTurk, and at about 2 a.m. CST the Turk Masters were merciful to me and took the system down. Almost 12 hours later it appears to still be down, causing much consternation among the Turker Nation, at least among those who don't have sporting events to watch.
I'm very close to getting the helmet I want for free, although I guess I could use what I have now and spend about three dollars to go ahead and order it. Since I do all the Turk Wurk I do in my free time, I'm treating the funds I earn as extra spending money so I can get more of the toys I crave. I've even gone and updated my wishlist at Amazon so I can drool over what I want occasionally.
Friday, November 18, 2005
Friday Morning
Not a whole lot to report about Mechanical Turk lately. The new "None of the others" image seems to be fully implemented now and I didn't see any of the previous two types yesterday. Because of a school function last night, I only submitted 159 hits yesterday. There were also a lot of system errors which slowed down the pace since I couldn't get in a good rhythm so I just went to bed early. I'm still being stubborn and not using the scripts which would probably solve a lot of that for me.
The processing of hits submitted still seems to be hit or miss. They processed a lot of them sometime in the early morning hours yesterday, then only a few got processed from yesterday. This is probably the one thing that is frustrating the majority of Turkers that I've heard from. Some are also frustrated about the account balance not staying up to date recently, but this hasn't affected me yet since I haven't cashed anything out yet.
At this point I'm down to needing about 800 more approvals to get the helmet I've been craving. I'll probably take a break next week during the holidays since I don't want to do anything resembling work for a few days other than getting the big meal ready.
Thursday, November 17, 2005
Getting There
I had planned to write about how troubled and frustrating the Mechanical Turk was yesterday. It was troubled and frustrating yesterday, there's no doubt about that. I submitted over 300 hits by the time I went to sleep last night and only about 25 had been processed. I was getting close to having over 1000 hits still in Pending status. The MTurk systems seemed to be overloaded yesterday as well. There were lots of page errors and the Image Adjustment hits were a strange conglomeration of all three of the None of The Above options we've seen so far.
I was prepared to write a fairly harsh blog entry this morning, but when I looked at my account this morning there were over 500 of my pending hits processed while I slept. Bliss! My rejection rate is down to almost 10% and I'm now over halfway in my quest to acquire a new Petzl Ecrin Roc helmet.
So in short, MTurk has some problems, but it's still one of the coolest projects on the Internet in my own humble opinion. If I had to fault them for anything, it would be that they're not communicating with their userbase effectively yet. There's no official MTurk announcement board or blog, and the only way we're getting much official news right now is from the emails from customer service that we're sharing with each other over on Turker Nation.
Wednesday, November 16, 2005
Black Box Fading Away
As first reported on Turker Nation by Jake, a new None of the Above has appeared on MTurk and can currently be seen in Image Adjustment hits for Fort Worth. This is very likely the latest attempt to stifle the script kiddies out there who were probably scanning images for lots of black space or for a smaller file size, since black images compress more easily.
I think we're going to have to get used to the fact that having a script pre-enable None of the Above answers is not going to be feasible as long as the majority of IA Hits should correctly be answered with None of the Above.
Hell Yeah You Get Paid
I'm sure this will get smacked down by the Amazon lawyers very soon, but someone has put up a hilarious spoof site. Go check it out and then get back to submitting hits. No more breaks for you!
Tuesday, November 15, 2005
Turk Wurk
I've been reading what some others in the blogosphere have been saying about Amazon's Mechanical Turk lately and I thought I'd make some comments on their comments. Phil Wainewright speaks about it on Loosely Coupled and makes a point of quoting someone's slashdot comment about MTurk having "ominous shades of The Matrix..."
I can see the point of that comparison, but I think there's a bit of grandstanding for the audience in there too. I think a better analogy is to compare it to the street corners in almost every US city where immigrants congregate to wait for someone to stop by with a truck and hire them for the day or a few hours.
There's nothing ominous in that. It's just that you need some work done and you don't necessarily need a full time employee to do it. Now I'm not advocating paying illegal immigrants like I'm sure someone will point out, but if you need a basement cleaned out quickly you hire 5 guys for a few hours and get it done. MTurk hires thousands of us to quickly scan through hundreds of city blocks each day. You may do 100 or so a day like I tend to do, or you may do 4000 a day like some of the more dedicated Turkers have managed.
Dividing up work and managing it wisely to get it done is when humanity really shines. You can build a barn by yourself, but it's much easier when a community comes together and gets it done quickly and efficiently. MTurk has already started a new community of people who are sharing ideas and experiences and helping the barn get built. This type of distributed work could be the next highly disruptive innovation, which is something that Phil Wainewright and I would completely agree on I suspect.
The Old Guy
Since I'm the old guy among all the Turkers I know, I'm not really going for speed on doing hits. Last night I managed to do 436 hits which brings my total up to 1641. It goes without saying, I suppose, that only about 1/3 of them have been processed by this morning. My total towards the caving helmet is now about $25.00, but if all my hits had been processed I'd be sitting at around $42.00. I feel another break coming on until they get this settled.
The rate at which I can submit hits varies with the type of show I'm watching on the Tivo. When I was watching Law and Order last night I slowed down a lot because I had to pay attention to the plot. Later when I was watching That 70's Show I was going much faster since the plot is much easier to follow.
Monday, November 14, 2005
Change in Instructions
Thanks to Sandoz for pointing out a change in the wording of the instructions for Image Adjustment hits. It now asks you to choose the correct image matching the Business Name OR the Street Address. This takes care of situations where it asks you to find Joe's Tacos at 100 Main Street and an image clearly shows 100 Main Street but the business name is Nancy's Hair Salon.
It's great they made the change, but I completely failed to notice it. They may want to think about having an announcements area or even having workers click on something stating they understand the new instructions for Image Adjustment hits before they are allowed to continue working on them.
New Script Available
A new script was released recently that should help speed up Image Adjustment Hit submissions. Check it out at http://userscripts.org/scripts/show/2117.
Hopefully the great guys at Mechanical Turk will look at some of the features of this script and roll them into the actual MTurk website. There are some great improvements in usability just from this one script alone.
Feel free to discuss this script here or at the new Turkers Message Board.
Feeling Snarky
Turkers Message Board
Due to a lot of requests, I've started a Turkers Message Board. Give it a try and feel free to make suggestions for improvement. If the site doesn't work out well I'll try again somewhere else.
http://turkers.proboards80.com/
Changes
So while I was away the Black Box made its appearance as shown below. It seems to have already caused controversy among all the Turkers out there. I guess I see it as a current necessary evil to prevent script kiddies from ruining everything for the rest of us.
This doesn't address the real issue, which needs to be handled by the A9 team that takes the pictures. I couldn't guess at exact percentages, but there are way, way too many hits where the business is not in any of the pictures. Often you can tell by the address and the numbers in the images that the group of pictures being shown are off by an entire city block or more.
I guess before anyone gets too angry about this or the Pending issue, we should all take a deep breath and remember that this is a project still in the early stages of Beta testing. I would advise against making plans to pay rent or buy anything important with your Turk Tips. Just enjoy it for now and let them get the kinks worked out so other companies will be inclined to use the API to provide more and possibly better paying work for us.
Slow Progress
I was out of town most of the weekend, so I didn't submit many hits. I submitted a single "None of the Above" hit from my mom's computer yesterday and it actually got approved by this morning. The remaining Pending hits I have are slowly being processed. Out of 1205 hits I've submitted, 384 are still pending. My rejection rate has moved down to 12.9% now, so that's a good thing.
I still enjoy doing work on MTurk, but I'm very disappointed that they're not addressing the problem with so many Pending hits. I think they could at least leave us a note on the website or send out an email saying that they're working on it. When the very first hit I worked on NINE days ago is still in pending status it can be discouraging.
Saturday, November 12, 2005
Drivers Wanted
So do you crave driving a cool white SUV with a camera mounted on top all over different cities in the USA? If so then A9 wants to talk to you. Thanks to Fan Wang for the link.
If I was younger and single I'd love doing this job. I did something similar over ten years ago when I drove a white panel van all over Alabama and Georgia trying to find Nextel phones to make modifications to and upgrade their software.
The more I think about this, the more I wish there had been something like this when I was in college. I had a student job at the library where I sat in the copy center for hours at a time either studying or trying not to fall asleep. Of course there was barely an Internet back then but it would have been nice to throw a couple extra dollars an hour onto a job where I was basically doing nothing in the first place. Of course I had to walk uphill both ways through the snow to get to this job. You kids today with your Greasemonkey scripts, you don't know how easy you've got it. /grumble
Oops, I Hit it Again
I'm glad it's Saturday. I stayed up too late last night watching whatever crap my Tivo happened to grab during the week and doing Baltimore Image Adjustment hits. I was being very casual about it last night and still processed 268 hits. I was quite proud of that until I read on the Turk Monitor that he had done 3000 hits in one day! You guys are insane! The next thing you know I'm going to read about someone developing a neural cortex device that allows them to jack in directly to the Mechanical Turk so they can process a hit every 0.5 seconds. At least it will be driving new technology forward. Heh.
I've tried to do some hits this morning, but I've had no luck. They seem to be having severe technical difficulties. Has anyone noticed the same images coming around since earlier this week? I wonder if perhaps they're just throwing the same hits up over and over just to get the quirks worked out before they start processing real hits. Another reason I say this is because a lot of the businesses and areas of town I've seen so far are not really good candidates for people really looking them up in a search engine.
I also finally caught a reflection of the image capture vehicle. I'm sure I skipped a lot of them before, and it took me a while once I finally started looking for them. Now I see them all the time. Here's the first one I saw. I know it makes no logical sense, but it struck me as odd at first because I've always pictured the vehicle going from left to right just like the picture series do in a hit.
Friday, November 11, 2005
Scary Driving
Is it just me or are a lot of the pictures in the Image Adjustment hits from some seedy parts of town? I hope they're paying the guys driving the SUVs a decent wage for driving around in those areas. I guess they'd be called Turklings? If you appear in one of the pictures have you been Turked? Maybe a Turkee? A drive-by Turking? Ok, I'm getting goofy now so it's time for bed. Maybe just one more hit though...
Yes, I'm a Turker
I guess I've become a real Turker now. I swore I was going to hold off on doing any more hits until my pending hits started being processed, but I still did 100 of them tonight after I got home. It's like having a huge bowl of M&Ms in front of you while you're lying on the couch. You swear you're going to stop eating them, but then you grab another one.
There are some great comments being posted on the blog and I thank you guys for contributing. If you haven't done so yet, check out another great blog by a fellow Turker over at the Mechanical Turk Monitor. He's got some great information about using a script to automate Hit acceptance and some cool pictures he's seen while doing image adjustment, including the reflection of the vehicle taking the pictures.
I'm surprised they're using an SUV since that's not very economical for driving around a city all day taking pictures. I'm a little bit of a snob I guess since I drive a Honda Civic that gets almost 40 mpg. Maybe they need more height to get better shots. One thing I would recommend would be to take as many pictures as possible on the weekend to avoid a lot of the UPS and FedEx vans which seem to show up and block a lot of the pictures I've seen so far.
New Email
I received a new kind of email from MTurk today. It was entitled "Summary of Payment Activity today." As a QA Analyst I have to point out that "today" should be capitalized like all the other words in the title. It looks odd otherwise.
The email appears to be a summary of how many hits were approved and credited to me in the last 24 hours. It was a grand total of $1.23. To be honest I'm not really interested in receiving this information in the form of an email, but I don't currently see any way to inform Amazon that I don't wish to receive the email. To me it's just easier to log in and see my account balance. There's no need to send an email out every day.
Pending
I'm slowing down on the number of hits I work on for now. I've done over 850 now and about 45% of them are still in pending status. This includes the very first hit I did back on November 5th. With 390 pending hits that I haven't been paid for, I'm not inclined to spend too much time doing hits until I see that there's some resolution to this.
I also noticed this morning that they're not showing a percentage for Pending status anymore on the dashboard. This is actually a good thing since it lets you see your actual rejection rate, which for me is at 14.4% now.
In lieu of doing large numbers of hits, I'm going to do a few select ones to try to understand better which ones are getting rejected. If anyone else wants to share data on this feel free to leave comments.
Human Algorithms
I thought I'd spend a little bit of time discussing the techniques I use when doing work on Amazon's Mechanical Turk. I'm a QA Analyst by profession so it's in my nature to consider process improvement and usability. I'm only going to consider Image Adjustment Hits at this time simply because these are the only hits I've worked on and they're also the only hits available now on MTurk, at least recently.
First of all, I've found I prefer to work on cities like New York or San Francisco that have a higher population density. This is because they tend to have smaller store fronts that have lots of signs and seem to have their street address displayed more often. I could be completely wrong about this and have no data to back it up, but my internal neural net is wired this way so that's how I approach it currently. Working on images from a city like Phoenix also seems to produce a lot of "None of the Above" answers, which some other users and I feel are being held in "Pending Status" too long, so they don't lead to quick payment.
So once I choose a city to work in, I just start working. I take the first assignment that comes up and don't worry too much about pre-scanning it before I accept. The reason for this is that if you wait more than a few seconds to accept it, there's a good chance it will be assigned to someone else. If you were quick enough you could probably get a quick idea if it's a potential "None of the Above" and skip it, but I think that would take only slightly less time than actually just completing the hit anyway.
As soon as I accept the hit, I look at the business name and just the numbers of the address. So if it was "Joe's Tacos" at "113 West Smith St" I'd just mouth "Joe's Tacos 113" to myself then start scrolling down to look at the images. If I see a good image I click on the radio button and keep scrolling. If I don't see a good image by the time I hit the bottom I click on "None of the Above" and submit and immediately start on the next hit. A lot of the time, at least recently, it seems that it's very obvious very quickly that it's "None of the Above." When you see five images of an empty lot you can probably finish it in less than five seconds.
Things slow down often when there are multiple images of the correct store front. This is also where I suspect I get most of my rejections. If there's a choice between two images that both clearly show the store front, but from slightly different angles then it becomes difficult to decide which to choose. I've generally just gone with the first one in that case since I assume that's what others will do. Since no guidance is given from the requestor regarding their preferences I don't know how else to approach it. Maybe they prefer shots that are angled from one side or the other. Who knows? It's likely just a popularity contest at that point.
I haven't really timed myself on how fast I can do Image Adjustments but I know I could go faster if I was sitting down at a table and really concentrating on it. Currently I watch things on my Tivo while I'm doing it and I often stop to click back for some dialogue I missed. I also tend to stop about every 10 hits and look at the total I've done and calculate how many more I need to do to get the Petzl helmet that is my goal from all of this.
Some people have asked if you could make a living doing this. I think you could probably do somewhere around 800 to 1000 a day if you did it at least 8 hours a day and were really dedicated. I'm sure I couldn't. Regardless though, if you submitted 800 a day and did this for 22 days a month, that would be 17,600 hits a month submitted. Assuming they were all processed and you had a rejection rate of about 15%, you'd be approved for 14,960, which at three cents a hit would leave you with $448.80. That means you would make about $2.55 an hour for the 176 hours you put in. These are very rough numbers of course, so your mileage may vary.
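If you want to plug in your own numbers, the rough math above can be sketched in a few lines of Python. The function name and its parameters are made up for illustration, and the defaults are just the guesses from this post, not anything Amazon publishes.

```python
def monthly_earnings(hits_per_day=800, days_per_month=22, hours_per_day=8,
                     rejection_pct=15, pay_per_hit_cents=3):
    """Rough monthly estimate. Every default is an assumption from this post."""
    submitted = hits_per_day * days_per_month
    # Integer math keeps the approved count exact (no float rounding).
    approved = submitted * (100 - rejection_pct) // 100
    total_dollars = approved * pay_per_hit_cents / 100
    hours_worked = hours_per_day * days_per_month
    hourly_dollars = total_dollars / hours_worked
    return submitted, approved, total_dollars, hourly_dollars


submitted, approved, total, hourly = monthly_earnings()
print(f"{submitted} submitted, {approved} approved")
print(f"${total:.2f} a month, about ${hourly:.2f} an hour")
```

With the defaults this reproduces the figures above: 17,600 submitted, 14,960 approved, $448.80 a month, and about $2.55 an hour. Bumping hits_per_day to 1000 or changing rejection_pct shows how sensitive the hourly rate is to those two guesses.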
So I would say no, you can't really make much of a living doing this. I treat it as a way to get a discount on merchandise from Amazon by doing some work for them. It would be similar to washing a few dishes at your favorite restaurant before you sit down to eat. I have a specific goal that I'm working for, plus the time of day when I work on hits is usually almost completely unproductive for me anyway, so why not make a buck while I watch an episode of Law and Order from 10 years ago?
Thursday, November 10, 2005
Progress
I ended up submitting more than 400 hits on Mechanical Turk by the time I went to sleep last night. The approvals seem to have slowed down for some reason and out of the 700+ hits I've submitted so far, over 300 are still in pending status. This is slightly worrisome, so I'll probably hold off on doing more for a while until I start seeing them processed. I'm loath to continue doing work without getting paid, even if it is only about nine dollars we're talking about.
Last night I was working on Image Adjustment hits mostly from Miami and a few from Dallas and Philadelphia. I think part of the appeal for doing this is just to see the pictures. I enjoy driving around just to drive around, and the pictures you look at for these tasks represent that. It's interesting to see what's going on in some of the pictures. I haven't seen anything weird yet, but you get to see people going about their daily business, unaware that someone is driving by taking a series of pictures.
Wednesday, November 09, 2005
Initial Impressions
I joined Amazon's Mechanical Turk program on November 5, 2005 after reading about it on Slashdot. As could be expected, the Turk site was inundated at the time from being slashdotted, but I was able to get signed up and complete a single Image Adjustment Hit. I liked what I saw, so I gave it a few days and tried again on the night of the 8th.
So while I lay on the couch with my laptop and watched old episodes of Law and Order off the Tivo, I proceeded to submit 287 hits over the course of about three hours. I chose to work exclusively on Image Adjustment Hits. This type of Hit is the lowest paid, but can be completed the quickest. I could have completed more, but Amazon still seems to be having some performance issues which made it necessary to go through a lot of extra steps in the process. Once this gets cleared up I think it might be possible to increase the number of Image Adjustment Hits returned to 500 or more in a three-hour span of time.
There seemed to be a delay of approximately 10 minutes last night from the time I submitted a hit to when it was approved or rejected. As of this morning, 45 hits are still pending so I'm not sure what the delay is there. I suspect that a certain image is shown a predetermined number of times to different users and once that number is reached, the results are tallied and if the choice you made matches up to the majority it is accepted and paid. This is just a guess of course based on my observations, but it reminds me of some of the techniques used in standardized testing to correct for errors.
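To make that guess concrete, here is a minimal sketch of how majority-vote grading could work. This is only an illustration of my theory, not Amazon's actual process: the function name, the threshold of three answers, and the worker labels are all hypothetical.

```python
from collections import Counter


def grade_hit(answers, required_answers=3):
    """Grade one image set by majority vote (a guess at the process).

    answers maps a worker name to the choice that worker submitted.
    required_answers is a made-up threshold; the real number is unknown.
    """
    # Not enough submissions yet, so everyone stays in pending status.
    if len(answers) < required_answers:
        return {worker: "pending" for worker in answers}

    # Find the most popular choice. Ties are broken arbitrarily here.
    majority_choice, _ = Counter(answers.values()).most_common(1)[0]

    # Workers who matched the majority get paid, the rest are rejected.
    return {
        worker: "approved" if choice == majority_choice else "rejected"
        for worker, choice in answers.items()
    }


example = {
    "worker_a": "image 2",
    "worker_b": "image 2",
    "worker_c": "None of the Above",
}
print(grade_hit(example))
```

If something like this is going on, it would also explain why some hits sit in pending for so long: nothing can be graded until enough other people have answered the same image set.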
Of the remaining hits I returned last night, 34 were rejected and 208 were approved. This is a rejection rate of 14% of the hits that were processed. I don't know how well that rate compares to others, but at the speed I was doing them I suspect it is not too bad. I accepted any hit that showed up without looking at it first. I think it would be possible to get a better acceptance rate by only accepting the Image Adjustment hits that are obviously "None of the Above."
I was working on images mostly from Phoenix, Arizona last night and often there would be a series of images that pictured an empty lot or just the side of a warehouse. I could scan these in about 5 seconds and choose "None of the Above" and click submit. I found that if the series of images actually contained the business in question, the process slowed down and could take 30 seconds or more. I suspect these cases are also where most of the rejected hits originated. If you have two images in the series that are very similar, there's a good chance that my choice might not agree with the majority of those submitting the hit.
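For anyone checking my math, the rate is figured only on the hits that have actually been processed, so the 45 pending ones are left out. A quick sketch using the counts from this post (the variable names are just for illustration):

```python
submitted = 287
pending = 45
rejected = 34
approved = 208

# Only hits that have been graded count toward the rate.
processed = submitted - pending
assert processed == rejected + approved

rejection_rate = rejected / processed
print(f"{processed} processed, rejection rate {rejection_rate:.1%}")
```

That works out to 242 processed hits and a rejection rate of about 14.0%.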
I think another possible source of rejection on a hit could come from business names not matching what is on the front of the store. So if I'm looking for "Frank's Taco Stand" and the pictures contain a business with a big sign that says "Hot Tacos" I'd be inclined to say that was a match, but I'm not sure if MTurk would accept that.
At 3 cents a hit, I'm not going to make big amounts of money doing this, but since I spend an hour or two just vegging out on the couch most nights anyway, I might as well make a little extra cash while I'm doing it. So far I've earned about 7 bucks which is way less than minimum wage, but how many jobs let you make money by lying on the couch? My goal is to make enough to order a new caving helmet from Amazon so I need about 10 more nights.