Not to fan the flames on the social justice warriors.” I suppose I am one of those “social justice warriors” he’s talking about.

Rather, we should highlight this episode as one among the growing list of big data research projects that rely on some notion of “public” social media data, yet ultimately fail to stand up to ethical scrutiny.

The Harvard “Tastes, Ties, and Time” dataset is no longer publicly accessible. And it appears Kirkegaard, at least for the time being, has removed the OkCupid data from his open repository.

But the socio-technological revolution in how we meet our romantic partners has turned the dating scene upside down, and getting our heads around the new code of dating ethics can be tough going at times!

Do As You Would Be Done By

Because of the anonymity it offers, online dating can lead us to kiss goodbye to our usual moral standards and start behaving in ways we normally wouldn’t dream of.

Concerns over consent, privacy, and anonymity do not disappear simply because subjects participate in online social networks; rather, they become even more important. The OkCupid data release reminds us that the ethical, research, and regulatory communities must work together to find consensus and minimize harm.
And it appeared again in 2010, when Pete Warden, a former Apple engineer, exploited a flaw in Facebook’s architecture to amass a database of names, fan pages, and lists of friends for 215 million public Facebook accounts, and announced plans to make his database of over 100 GB of user data publicly available for further academic research.

The final methodology used to access the data is not fully explained in the article, and the question of whether the researchers respected the privacy intentions of the 70,000 people who used OkCupid remains unanswered. I contacted Kirkegaard with a set of questions to clarify the methods used to gather this dataset, since internet research ethics is my area of study.

A group of Danish researchers publicly released a dataset of nearly 70,000 users of the online dating site OkCupid, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site.

When asked whether the researchers attempted to anonymize the dataset, Aarhus University graduate student Emil O. Kirkegaard, who was lead on the work, replied bluntly: “No.”

As Kirkegaard stated: “Data is already public.” No harm, no ethical foul, right? Many of the basic requirements of research ethics (protecting the privacy of subjects, obtaining informed consent, maintaining the confidentiality of any data collected, minimizing harm) are not sufficiently addressed in this scenario.