No longer active

Comments have been turned off because of spam. Questions/comments: I'm at dantasse.com

Saturday, May 21, 2016

ICWSM 2016 neat things

Back from ICWSM! It was my first time there. It felt more like Ubicomp than anything else I'd been to. Lots of people finding some correlation with p<.00001 and r^2=0.3, so what does it mean? I mean, it definitely means something, but I'm kind of frustrated by how difficult it would be to turn it into an application. I think the social scientists were frustrated too, by people's lack of social science training. I think the computer scientists were all keenly aware of the weaknesses of their data and methods... but they still found something. (And it still got published.) Lots of interesting data, a lot of scraping things, and a lotttttt of Twitter.

Less polite than CHI or CSCW, which is a double-edged sword: on one hand, I was kind of taken aback by some blunt questions. On the other hand, if we're not disagreeing, how are we getting anywhere? I had two conversations end with "ok, let's agree to disagree," and they didn't feel great, but I'm open to the possibility that that's a sign of intellectual diversity or something else good.

Lots of shared data sets!

Neat things applicable to cities:
City Dashboard - kind of overwhelming, but a start!

LikeWays - recommend the most interesting path to a thing, not just the quickest. Someone with an iPhone, try this out and tell me what it's like.

"Will check-in for badges", Gang Wang - basically, Foursquare doesn't represent real mobility (of course); it's really only good for applications where it doesn't really matter if you get them wrong (like recommending restaurants).

Emotions, demographics, and sociability in Twitter interactions, Kristina Lerman - I had wanted to do a study like this: correlate a ton of stuff in different geo areas, see what comes out. People in higher income places have more weak ties. (There's a lot going on there, though; it's kind of hard to interpret, or to know why that would be.)

Other neat papers:

Identifying platform effects in social media data, Momin Malik - uses regression discontinuity to identify sudden changes in social media data that happen because of something the platform did, not because of real effects. For example, Netflix changed the labels on their reviews (something like "I somewhat like it" to "I sort of like it"), which caused review scores to jump suddenly.

When a movement becomes a party, Pablo Aragon - there was a bunch of grassroots talk around elections in Spain, so they followed one party, Barcelona en Comú, to see if they stayed all grassroots and decentralized, or if they evolved into a hierarchical organization. They found two groups: one for the movement (which stayed decentralized) and one for the party (which got hierarchical).

"Blissfully Happy" or "Ready to Fight", Hannah Miller - you've probably seen this on the news, it's super popular. Some emoji look different on different platforms. I use :D a lot but I guess on an iPhone it looks angry. Some emoji are hard to interpret even within a platform. (Those raised hands! What does that mean!) This can be a problem.
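
To make the regression discontinuity idea from the platform effects paper a bit more concrete, here's a minimal sketch. This is not the paper's actual method or data, just the basic idea: fit a line to the points right before a known change date, fit another line to the points right after, and see how far apart the two lines are at the change date. The ratings are made up, and `fit_line`, `rd_jump`, the cutoff day, and the bandwidth are all names and numbers I picked for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rd_jump(times, values, cutoff, bandwidth):
    """Estimate the jump at `cutoff` from separate linear fits on each side.

    Times are centered at the cutoff, so each fit's intercept is that
    side's predicted value exactly at the cutoff.
    """
    left = [(t - cutoff, v) for t, v in zip(times, values)
            if cutoff - bandwidth <= t < cutoff]
    right = [(t - cutoff, v) for t, v in zip(times, values)
             if cutoff <= t <= cutoff + bandwidth]
    _, left_at_cutoff = fit_line(*zip(*left))
    _, right_at_cutoff = fit_line(*zip(*right))
    return right_at_cutoff - left_at_cutoff


# Made-up daily average ratings: a slow upward trend, plus a jump of 0.2
# on day 100 (standing in for the day the platform changed its labels).
random.seed(0)
days = list(range(200))
ratings = [3.5 + 0.001 * d + (0.2 if d >= 100 else 0.0) + random.gauss(0, 0.02)
           for d in days]

jump = rd_jump(days, ratings, cutoff=100, bandwidth=30)
print(f"estimated jump at day 100: {jump:.3f}")
```

The slow trend gets absorbed by the slopes of the two lines, so what's left over at the cutoff is the sudden part. In real data you'd also want to worry about how wide the window is and whether anything else changed on the same day.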

Other useful tools:
Bot or Not - is this Twitter account a bot?
(another quick heuristic: if (# of followers)/(# you follow) < 0.1, you're likely a bot)
Face++: face recognition tools
Gender detector - Is this name male or female? (python) (a different one in ruby)
IBM Watson Personality Insights Service - give it text, it gives you Big 5 personality scores
Complex Contagion models: these model something that you have to be exposed to N times before you get it too.
CommonCrawl - if you ever need a huge crawl of the web.
Want to find a set of ppl with known ages on Twitter? Just search for tweeters wishing each other "Happy (N)th Birthday!" Similarly, want to know what time ppl wake up (to track daylight savings or something), just search for people saying "Good morning!" Twitter is big, and there are at least a few people who say almost anything.
To find what people say more in one place than in others: score each term by the probability that it appears there minus the probability that it appears anywhere. From a paper about #foodporn.
I finally learned what a tensor is: an n-dimensional array, i.e. a matrix generalized to more dimensions. And there are tools like PARAFAC decomposition, which is similar to matrix factorization and is useful in some cases.
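
The "what people say more in a place than others" score from above is simple enough to sketch in a few lines. This is my own reading of it, not code from the #foodporn paper: I'm assuming "probability" means a word's share of all words (rather than, say, the share of posts that contain it), and the cities, posts, and the function name `place_specific_scores` are all made up.

```python
from collections import Counter


def place_specific_scores(posts_by_place, place):
    """Score each word seen in `place` by P(word | place) - P(word | everywhere)."""
    local = Counter(
        word for post in posts_by_place[place] for word in post.lower().split()
    )
    overall = Counter(
        word
        for posts in posts_by_place.values()
        for post in posts
        for word in post.lower().split()
    )
    local_total = sum(local.values())
    overall_total = sum(overall.values())
    return {
        word: local[word] / local_total - overall[word] / overall_total
        for word in local
    }


# Made-up posts for two hypothetical cities.
posts_by_place = {
    "city_a": ["pierogi for lunch", "pierogi and fries"],
    "city_b": ["tacos for lunch", "breakfast tacos"],
}

scores = place_specific_scores(posts_by_place, "city_a")
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}: {score:+.3f}")
```

Words that show up everywhere (like "for" and "lunch" here) end up near zero or slightly negative, and words that are distinctive to the place float to the top.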
