Friday, August 1, 2014

Some things I learned from running a big Mechanical Turk study

I'm not a big crowd researcher, but Mechanical Turk can be a great platform. The key words are "can be". It sounds great: pay a thousand people a dollar to do your survey, and for $1k, you have a huge amount of data overnight! But it's not really that simple. Here are some things I've learned. (there are a lot.)

Human/study design:
  • Pay people enough. This is maybe the #1 thing I hear on Turk forums, and the #1 piece of advice I can give to make the whole thing a good experience. These are people doing work. It's not just screwing around for fun. HITs are hard. Turking is hard. See also: Jeff Bigham's experiences Turking for a day.
  • $8/hour is enough. Sometimes a little less, but you're not getting by on $2/hour like you may have thought you could back in the early days. Or, if you're getting by on $2/hour, you are paying people sweatshop wages (and by the way, probably getting sweatshop level work). Why $8? People want to make about US minimum wage. It's a nice round benchmark, at least. It's still way cheaper than you could get it done any other way.
  • Turkers are good people. At least, the ones that you get for $8/hr are. They are not, for the most part, trying to scam you. Maybe 2% are. Accept that as a cost of doing business (that's what, 2 extra precious dollars?) and don't get too defensive about your task. 
  • 98%/1000 is a good threshold. That is, require turkers to have 98% acceptance and 1000 HITs completed. When we tried 95%, we got a few more rejects (though not dramatically more). When we restricted it to 5000 HITs completed, there were only about 300 qualified people who would do our task.
  • ACs (attention checks) are tough to get right. In our survey, we included three simple math questions: "what is two plus three?" etc. But even these are not perfect indicators of whether people are paying attention. We had 7-point Likert scale answer options, so people would try to click 5, and just miss and click 6 by mistake (or they'd be using a touchscreen or something). Also, dashing 15 minutes of work just because of one question seemed pretty cruel. I started accepting people even if they got one of our ACs wrong, so long as they only missed one question and were just off by one. Relatedly:
  • Spell out exactly what will make you reject people. Our HIT had a list: "You will be rejected if..." This makes it much easier to deal with somewhat-angry Turkers who write to you. Many times, Turkers will get really mad if you reject them for something you didn't warn them about. I think that's a feature of mturk, not a bug.
  • Verifying they did surveys is mostly easy but not completely. You can't get their Turker number into your system. My standard approach is: you do our study, at the end we give you a number, then you enter the number into Mturk and we correlate your records with ours. About 1% of people don't understand this, or otherwise screw it up.
  • Reputation matters. If you're a crummy requester, folks can rate you on Turkopticon (TO) and talk about you on Turker Nation (TN). But if you're good, they'll rate you up on TO/TN, post you on Reddit's HITs Worth Turking For (hwtf), follow you with TurkAlert, and occasionally really get into it.
  • Engage. Get on Turker Nation and hwtf. (you can post your own HITs on hwtf, and in the requester forums on TN.) Respond to Turkers like you'd respond to people you hired to do a job, because they are people you hired to do a job.
  • Conflict is tough. You have only blunt weapons, and so do they. Because most HITs are accepted, and because the difference between being a 95% turker and a 98% turker is so big, every rejection really hurts workers. Also, they can trash you on sites like Turkopticon; not sure how much that matters, but it doesn't help. One disgruntled worker can make things difficult for you. So if someone's getting all angry at you, you're kind of incentivized to just pay them and get them off your back, before they go reviewing you all over the place.
  • Performance-based bonuses are good. We structured it as 30 cents base, plus 15 cents per Set you find. (our task was the game Set, where you try to find a bunch of sets of cards that fulfill certain criteria.) This meant that we ended up paying about $1.40 per person, but the best people would post in forums "I got $3.50 for this 15 minute HIT" or whatever. In general, most people ended up doing pretty well at the game, which I guess is a good sign that they were paying attention to it. We were worried that people would overlook our HIT because the base pay was low, but it was helpful that we could list it as a 30-cent HIT, and then say in the title "+ average $1.10 performance bonus!" (figure out that average value through pilot testing.)
  • If you let them, workers will repeat your HIT. We started off posting 40 assignments of a HIT at a time, but noticed that about 3/4 of our users each time would be repeaters. They use stuff like TurkAlert. So if you want not to have repeaters, make sure you just post one big HIT. (or use other 
  • There's no official Python API, but boto is pretty good. Its documentation can be sparse, but supplement it with the MTurk API documentation, and you can usually figure out what you need to do.
  • The main Manage Hits page is mostly garbage. The one you want for most things, especially if you use the API too, is Manage Hits Individually. Looks like they wanted to replace the MHI page with the MH page, because it's got shiny new progress bars and stuff, but the shiny bars don't update very fast. At least MHI is up to date. Also, you can see how much you bonused each worker on MHI, and you can email workers without bonusing them.
  • Except! Rejecting workers is best on the main MH page. It lets you easily republish those HITs to other people. (this is an option that you don't even get in the API. What a mess.)
  • Also: The only way to approve an Assignment that you previously rejected is via the download/upload CSV, which you get to through the main MH page. Yes, this is pretty wonky.
  • You can't change much about the HIT after you post it. You can't change the price or the content, but you can change the qualifications and other minor details. To change the qualifications, you have to use an API call that is sort of obscure: ChangeHITTypeOfHIT. (first you have to register your new qualifications as a HIT type by calling RegisterHITType.) Wah! If you need help with this, let me know, I've got a script I can send you.
  • You're debugging while you're gathering data. When someone says "your site failed and that's why my data isn't complete"... are you going to reject them? Over forty five cents? Are you that sure your site is working perfectly? Are you a fool?
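The attention-check leniency rule and the bonus structure from the list above can be sketched in a few lines of Python. This is only an illustration, not the script we actually ran: the function names and the idea of storing answers as integers are assumptions, while the 30-cent base, the 15 cents per Set, and the "one miss, off by one" rule come from the post.

```python
BASE_PAY_CENTS = 30        # base reward per HIT (from the post)
BONUS_PER_SET_CENTS = 15   # bonus per Set found (from the post)


def passes_attention_checks(answers, expected):
    """Hypothetical helper: accept if every check is right, or if exactly
    one check is wrong and that answer is off by one scale point."""
    if len(answers) != len(expected):
        raise ValueError("answers and expected must have the same length")
    # Distance from the right answer for every check that was missed.
    misses = [abs(a - e) for a, e in zip(answers, expected) if a != e]
    return not misses or (len(misses) == 1 and misses[0] == 1)


def total_pay_cents(sets_found):
    """Hypothetical helper: base pay plus the per-Set bonus, in cents."""
    if sets_found < 0:
        raise ValueError("sets_found cannot be negative")
    return BASE_PAY_CENTS + BONUS_PER_SET_CENTS * sets_found
```

Working in whole cents avoids floating-point rounding when the bonuses get added up later. A worker who clicks 6 instead of 5 on one check still passes; a worker who is off by two, or who misses two checks, does not.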

Wednesday, May 7, 2014

Public Relations

If corporations are people, what are their personalities? Can we have relationships with them? What is that like?

So I set off to explore the ways relationships between people and corporations could develop. As a result, I ended up studying the corporations themselves on social media. A corporation on social media is a strange new entity: it's sort of the corporation, but as it must be controlled in real time by a person, it's sort of one person too. It's a very public, very immediate spokesperson.

A person tweeting behind a corporate handle is neither that person (who would tweet about personal things) nor the company (which would tweet a boring party line, much like if you called a company on the phone and got a recording). It seems closer to the company identity, but I wanted to see if I could get to the human involved.

I started trying to find existing human conversations with companies, much like Chip Zdarsky and Applebee's. (or this, this, this, or this.) This proved fruitless, because most conversations with companies were someone complaining at a company, or else companies talking with other companies to try to appear fun; it was all business. So I went to create my own conversations.

I didn't want any of our conversations to be biased by anything particular about my identity, so I created @MarioLoweystro. Mario is intentionally as bland as possible. But he's obviously human; I didn't want anyone to think he's a bot. I then tweeted at some companies, trying to get to know the people behind them.

Here are my experiences.

And here's a guide showing how to make friends with corporations.

I started off with popular companies, including top-100 Twitter accounts, companies that are supposedly good at Twitter, and really big companies. This turned out not to be super fruitful: they tend to have millions of followers, and therefore, they can't talk with me. Their accounts are just broadcasts. But then I started tweeting at smaller companies, more local companies, less "sexy" companies. (from a list of Pittsburgh corporations.) They tended to be friendlier. As I went, I varied my styles and talked about different topics; to see what happened, check out my slides above.

Who were my best friends? Probably @WholeFoods, @Tesco, @Huntington_Bank, @Kennametal, and @Zappos. We had some good talks.

We had an exhibit at CMU too for all our final works:
I displayed the record of my conversations, the guide I created, and a bunch of name tags for the companies I met. Plus my computer, so you could try making friends with corporations too. (in fact, go log on to Twitter and try it yourself now!)

Monday, April 14, 2014

Thinking about predicting relationships and protests from Twitter

(There are some similarities.) I had a quick talk with Kenny Joseph the other day which got me to thinking about a couple things. Incidentally, we've got a project we'll be working on in Design Fiction that could use some thinking about the future consequences of these things.

On the protest front, there's this: Can Twitter predict major events such as mass protests?
What if a "they" could predict when you'll protest next, just based on your tweets? Or rather, if they could predict when someone would protest next?

On the relationships front, there's an idea in sociology that your conversation topics and your relationships co-evolve. I'm linking to this paper, even though I cannot say I understood it. But the idea is that you talk with your weak ties about pop culture things, and with your close friends about more niche things. They're not saying which way this evolves, which causes which, but it's a good marker at least of how someone's relationship status is now.

So the obvious dystopia is: The Government, The NSA, is watching all of us and they identify and Guantanamo all probable protesters. (or Turkey's government, or Egypt's, etc.)

But what about the corporate angle? One theme in Kenny's work is: how can we identify and change people's biases? In his case, it's to reduce violence against women and children, or potentially racial violence. What if they develop a powerful new technique to modify biases using Twitter, and Applebee's gets a hold of it? Do you get more people becoming friends of Applebee's? More conversations like this?

Sunday, March 30, 2014

Fulfillment Fitness

Related to the aforementioned "design fiction" project, but much more successful, has been Fulfillment Fitness.

A reflection on the fulfillment centers that drive all your online purchases. Not really trying to be all exposing-cruelty about it, because it's hard to say how bad it is. News articles (one, two, three) would have you believe it's terrible, but people who work there (one two three four five six) seem more sanguine about it. Regardless, it is weird: people become basically robots stuck in a video game all day. (you could imagine SimFactory, where you manage a group of "workers", or a ripoff of Starcraft.) At the same time, sometimes people want to be robots stuck in a video game, like at the gym.

Anyway, it's been fun. Working with Angela and John has been great; they're both wizards, in their own ways. I'm learning how it's hard to get across a point, both to distill the point and to technically get it done. (I've learned to make many kinds of things, but they tend not to be the kind of things that tell stories or stir people's emotions.)

Monday, February 24, 2014

Tell Me Your Life Story, part 3

continued from part 2

It's up!

You can now publish only part of your life story, if you want. (chunks you leave out will show up as "private".) That's kind of neat. Adds a little mystery to it.

A little clunkier, more words, for better usability.

I hope it makes sense. I have no idea if it will. Also I have no idea if people will think this is in any way cool or makes them think.

I don't know that a website is a good medium for this. It feels too cheap. It's like an *application*, something that you have to do, to be more efficient or store some data or something. I think if I had to do it again, I might print it out as a deck of (large) laminated cards, and give it out (or sell it, in the far future...) with dry erase pens. As a small-party-game, or a Coffee Table Thing, it's kind of fun and personal; as a website, it's not.

Preliminary results with a few friends and a few Turkers: friends find it interesting but don't want to share much. I guess that's fine, as with so few users there is little anonymity. Turkers mostly tell the school-college-work story. Also reasonable, as they're (I'm assuming) mostly trying to make a few bucks.
A couple things people posted are neat:
"1 - 8: Don't remember, had a trampoline."
"15 - 25: I still wanted to be a singer, but I made sure I got high grades and got accepted into a "good" college. I ended up dropping out many times."

Enough talk, try it out!

Wednesday, February 19, 2014

Tell Me Your Life Story part 2

continued from part 1
... partial implementation and further design.

I've been thinking of a website from the start, just because it's the easiest way to make an interactive thing that people can use. But if I'm going with a website, then horizontal (the way I've drawn out life stories on paper) is not so good:
There's no way you can space everything out sort of equally and still leave room to type in each box. So I went vertical.
and you can sort of type stuff in here and save it; functional, not yet pretty. So now a couple of questions:
1. how should it look and feel?
2. what should it do when you're done?

For look and feel, it's got to be expansive and welcoming. This should be a space for people to creatively re-imagine their lives. I'm not going to tell them their re-imagination is wrong, or they'll retreat back into the boring school-school-work story. So none of this:
And none of this:
And nothing computery and cold, like I usually dig:
But I don't want it to be new-agey woo anything-goes; no pastel blues and greens and handwriting:
How about cartography? You're mapping your life. Map-making is a good analogy here: you have to take an expanse of time or space that exists, but the way you draw your map (even the projection you use) incorporates your current bias about it.
Plus, I like this aesthetic. I think the old-fashioned look makes it appear valuable. Still, it's not imposing to draw on old-fashioned paper.
shout out to ("rice paper 3") for the background.

For what it should do when you're done... well, I think if it just says "okay, thanks", that is not perhaps as provocative as it could be. There should be some way to display your story, I think, but it should also make you reflect on it.
"Is there anything you would change?"
"What did you learn from telling this story?"
"How do you feel about this story?"
"What happens in the next 5 years?"
(this is after you finish it; the 0-5, 5-8, etc would be filled in with whatever you entered)

Still to do: it would be neat to be able to compare this to some pre-existing milestones; maybe get your school/move/work dates in there too. Or maybe get common culturally-accepted milestones in there to compare your story to those.

A thing I've already learned from this: don't make websites for class projects. I spend more time debugging than I do thinking about the design or the overall experience of the thing.

Thursday, February 13, 2014

Tell me your life story

A class project for Design Fiction.
The prompt: "Our formal critical selves" - do something looking at your past in a counterfactual kind of way. See how your life could have been if you had done something differently, or if something different had happened to you.
The timeframe: two weeks. (due the Tuesday after next)

Early thoughts:
1. Take all the time I've "wasted" throughout the years, and count up the hours I could have been doing something "productive", in this creepy totalitarian sense.

Then I could transform it into N masteries, just by dividing by 10,000! If I've slacked for 20,000 hours by now, I could be an expert in piano and chess. Easy as that. The humor comes from the fact that of course it is not as easy as that.

2. 23 and Me is weird. I have all this data about what risks for diseases I might have. I can't act on it at all. What does it mean that I have a 1.2% chance to get kidney disease? Should I eat more prunes?
I could make it a little more real by making a "wheel of Dan" where you spin a wheel and it tells you "okay, with this lifetime where you started with Dan's genes, you have Alzheimer's and gout." And then I could let you upload your 23 and Me data, and then spin your own wheel, and compare with me, and maybe ask you if you'd trade with me. Make it personal. And maybe compare these risks to other risks, like the odds of being struck by lightning. Whatever.

The downside is that very few people have done 23 and Me, so it's hard to make it actually personal.
Another downside: what am I trying to say? I think there are a lot of interesting things to say about 23 and Me, but others say them better than I do, or else I don't really care about them.

3. Something about browser history or email relationships over time. Meh. Been done. I tried to mine all the neat personal data I have about myself, my sleep logs and fitbits and stuff, but I ended up with the same old personal-informatics gripe: what the hell is this data good for?

4. Life stories. I started looking for meaning in Flappy Bird, and then started looking at how people look for meaning in Flappy Bird, which is kind of funny because there is probably not that much meaning to find in Flappy Bird. But all the game developers want there to be some meaning in Flappy Bird, so they tell all kinds of stories, like "really polish a simple game mechanic and it'll shine" or "make sure you can restart quickly" or whatever.

We tell these stories about our lives.
"You may ask yourself, well, how did I get here?"
And most of the time, there's no real story, you just did.
I started looking at the story I tell myself. It was very broken up by where I lived and where I worked. New school, new job, new box in my life story. But if you mix up those boxes, draw the lines really arbitrarily, you might hit some much more interesting stories. I drew a bunch of random boxes with arbitrary lines. The second box above is "what I liked at various ages", the fifth is "how I felt at various ages", and both are much more interesting than "where I worked."
I want to make a tool to help people retell their life story, but give them a little twist: let the system arbitrarily decide where the boundaries end.
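One way the "arbitrary boundaries" idea could work, as a rough sketch: pick random cut points between age 0 and your current age, and turn them into boxes. The function name and parameters here are hypothetical, and this is not necessarily how the tool itself will do it.

```python
import random


def random_life_boxes(current_age, num_boxes, rng=None):
    """Hypothetical sketch: split ages 0..current_age into num_boxes
    contiguous ranges whose boundaries are chosen at random."""
    if not 1 <= num_boxes <= current_age:
        raise ValueError("need 1 <= num_boxes <= current_age")
    if rng is None:
        rng = random.Random()
    # Distinct interior cut points, so every box spans at least one year.
    cuts = sorted(rng.sample(range(1, current_age), num_boxes - 1))
    edges = [0] + cuts + [current_age]
    # Pair up neighboring edges: (0, a), (a, b), ..., (z, current_age).
    return list(zip(edges[:-1], edges[1:]))


if __name__ == "__main__":
    # Print empty prompts for a 30-year-old, one line per box.
    for start, end in random_life_boxes(30, 5):
        print(f"{start} - {end}: ")
```

Each run gives different boxes, which is the point: the person telling the story doesn't get to fall back on the school/job boundaries they already know.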

Messing around with the flow here a little bit, or what happens after you tell your story.

More to come!