Momin M. Malik

I bring statistics and machine learning together with critical perspectives from social science to consider when, how, and why data and modeling succeed in their aims—and when, how, and why they can fail. I work to improve practice towards more responsible, robust, effective, and just uses of data and modeling. I have engaged in collaboration and outreach with practitioners in policy, public health, medicine, law, education, government, journalism, social science, industry, civil society, and elsewhere, helping them understand and decide if and how to adopt machine learning and data science.

My 2020 preprint, A hierarchy of limitations in machine learning, is my major individual work.


I am currently Senior Data Science Analyst - AI Ethics at the Mayo Clinic’s Center for Digital Health. Previously, I was Director of Data Science at Avant-garde Health, a healthcare data startup born out of Harvard Business School’s value-based healthcare research.

I am also a fellow and senior investigator at the Institute in Critical Quantitative, Computational, & Mixed Methodologies, and an instructor at the University of Pennsylvania’s School of Social Policy and Practice.

Download my CV. Last updated 15 May 2024.

Published work

Tracey A. Brereton, Momin M. Malik, Lauren M. Rost, Joshua W. Ohde, Lu Zheng, Kristelle A. Jose, Kevin J. Peterson, David Vidal, Mark A. Lifson, Joe Melnick, Bryce Flor, Jason D. Greenwood, Kyle Fisher, and Shauna M. Overgaard. 2024. AImedReport: A prototype tool to facilitate research reporting and translation of Artificial Intelligence technologies in health care. Mayo Clinic Proceedings: Digital Health 2 (2): 246–251. doi: 10.1016/j.mcpdig.2024.03.008. [MCP link]

The core of artificial intelligence (AI) research in health care is carried out by AI data scientists, AI engineers, and clinicians; however, successfully evaluating and translating AI technologies into health care requires cross-collaboration beyond this group. Throughout ideation, development, and validation, successful translation requires engaging with many domains, including AI ethicists, quality management professionals, systems engineers, and more. We found through a scoping review that the prioritization of proactive evaluation of AI technologies, multidisciplinary collaboration, and adherence to investigation and validation protocols, transparency and traceability requirements, and guiding standards and frameworks are expected to help address present barriers to translation. However, as Lu et al identified through a systematic review assessing clinical prediction model adherence to reporting guidelines, no consensus exists regarding model details that are essential to report, with some reporting items being commonly requested across reporting guidelines yet other reporting items being unique to specific reporting guidelines. Unless there are clear, consistent, and unified best practices and communication and collaboration across domains, there will be gaps in development, accountability, and implementation. Documentation is a crucial part of reporting and translation, but its coordinated maintenance throughout the AI lifecycle remains a challenge.
We have established a proof-of-concept team-based documentation strategy for AI translation to simplify compliance with evaluation and research reporting standards through the development of AImedReport, a reporting guideline documentation repository. AImedReport organizes available reporting guidelines for different phases of the AI lifecycle, consolidating reporting items from different guidelines, assigning specific roles to team members, and guiding relevant information to capture when knowledge is generated.

Sayash Kapoor, Emily Cantrell, Kenny Peng, Thanh Hien Pham, Christopher A. Bail, Odd Erik Gundersen, Jake M. Hofman, Jessica Hullman, Michael A. Lones, Momin M. Malik, Priyanka Nanayakkara, Russell A. Poldrack, Inioluwa Deborah Raji, Michael Roberts, Matthew J. Salganik, Marta Serra-Garcia, Brandon M. Stewart, Gilles Vandewiele, and Arvind Narayanan. 2024. Reforms: Consensus-based recommendations for machine-learning-based science. Science Advances 10 (18): eadk3452. doi: 10.1126/sciadv.adk3452. [Science link]

Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear recommendations for conducting and reporting ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (recommendations for machine-learning-based science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed on the basis of a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.

Raphael Frankfurter, Maya Malik, Sahr David Kpakiwa, Timothy McGinnis, Momin M. Malik, Smit Chitre, Mohamed Bailor Barrie, Yusupha Dibba, Lulwama Mulalu, Raquel Baldwinson, Mosoka Fallah, Ismail Rashid, J. Daniel Kelly, and Eugene T. Richardson. 2024. Representations of an Ebola ‘outbreak’ through story technologies. BMJ Global Health 9 (2): e013210. doi: 10.1136/bmjgh-2023-013210. [BMJ link]

Background: Attempts to understand biosocial phenomena using scientific methods are often presented as value-neutral and objective; however, when used to reduce the complexity of open systems such as epidemics, these forms of inquiry necessarily entail normative considerations and are therefore fashioned by political worldviews (ideologies). From the standpoint of poststructural theory, the character of these representations is at most limited and partial. In addition, these modes of representation (as stories) do work (as technologies) in the service of, or in resistance to, power.

Methods: We focus on a single Ebola case cluster from the 2013–2016 outbreak in West Africa and examine how different disciplinary forms of knowledge production (including outbreak forecasting, active epidemiological surveillance, post-outbreak serosurveys, political economic analyses, and ethnography) function as Story Technologies. We then explore how these technologies are used to curate ‘data,’ analysing the erasures, values, and imperatives evoked by each.

Results: We call attention to the instrumental—in addition to the descriptive—role Story Technologies play in ordering contingencies and establishing relationships in the wake of health crises.

Discussion: By connecting each type of knowledge production with the systems of power it reinforces or disrupts, we illustrate how Story Technologies do ideological work. These findings encourage research from pluriversal perspectives and advocacy for measures that promote more inclusive modes of knowledge production.

Young J. Juhn, Momin M. Malik, Euijung Ryu, Chung-Il Wi, and John D. Halamka. 2024. Chapter 47 - Socioeconomic bias in applying artificial intelligence models to health care. In Artificial Intelligence in clinical practice: How AI technologies impact medical research and clinics, edited by Chayakrit Krittanawong, 413–435. Academic Press. doi: 10.1016/B978-0-443-15688-5.00044-9. [Elsevier link (paywall)]

Socioeconomic status (SES) is a key dimension along which artificial intelligence (AI) models can be “biased”, or more specifically, along which AI models can exhibit disparate performance across demographic subgroups. However, measuring SES in ways usable for healthcare research is a challenge. We present the HOUSES index, an extensively validated way of measuring SES for healthcare applications, and show its use in detecting and measuring performance disparities in AI models. Going beyond measurement and theorizing about causal mechanisms, we take an understanding of AI as a form of causality-agnostic statistical modeling that automatically finds optimal correlations. With this, we present a hypothesis and supporting evidence that a lack of healthcare access among those of lower SES propagates through to these patients having lower-quality healthcare data, which leaves less of a valid signal for AI models to pick up, ultimately resulting in lower performance for those of lower SES.

Tracey A. Brereton, Momin M. Malik, Mark A. Lifson, Jason D. Greenwood, Kevin J. Peterson, and Shauna M. Overgaard. 2023. The role of AI model documentation in translational science: A scoping review. Interactive Journal of Medical Research 12: e45903. doi: 10.2196/45903.

Background: Despite the touted potential of artificial intelligence (AI) and machine learning (ML) to revolutionize health care, clinical decision support tools, herein referred to as medical modeling software (MMS), have yet to realize the anticipated benefits. One proposed obstacle is the acknowledged gaps in AI translation. These gaps stem partly from the fragmentation of processes and resources to support MMS transparent documentation. Consequently, the absence of transparent reporting hinders the provision of evidence to support the implementation of MMS in clinical practice, thereby serving as a substantial barrier to the successful translation of software from research settings to clinical practice.

Objective: This study aimed to scope the current landscape of AI- and ML-based MMS documentation practices and elucidate the function of documentation in facilitating the translation of ethical and explainable MMS into clinical workflows.

Methods: A scoping review was conducted in accordance with PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. PubMed was searched using Medical Subject Headings key concepts of AI, ML, ethical considerations, and explainability to identify publications detailing AI- and ML-based MMS documentation, in addition to snowball sampling of selected reference lists. To include the possibility of implicit documentation practices not explicitly labeled as such, we did not use documentation as a key concept but as an inclusion criterion. A 2-stage screening process (title and abstract screening and full-text review) was conducted by 1 author. A data extraction template was used to record publication-related information; barriers to developing ethical and explainable MMS; available standards, regulations, frameworks, or governance strategies related to documentation; and recommendations for documentation for papers that met the inclusion criteria.

Results: Of the 115 papers retrieved, 21 (18.3%) papers met the requirements for inclusion. Ethics and explainability were investigated in the context of AI- and ML-based MMS documentation and translation. Data detailing the current state and challenges and recommendations for future studies were synthesized. Notable themes defining the current state and challenges that required thorough review included bias, accountability, governance, and explainability. Recommendations identified in the literature to address present barriers call for a proactive evaluation of MMS, multidisciplinary collaboration, adherence to investigation and validation protocols, transparency and traceability requirements, and guiding standards and frameworks that enhance documentation efforts and support the translation of AI- and ML-based MMS.

Conclusions: Resolving barriers to translation is critical for MMS to deliver on expectations, including those barriers identified in this scoping review related to bias, accountability, governance, and explainability. Our findings suggest that transparent strategic documentation, aligning translational science and regulatory science, will support the translation of MMS by coordinating communication and reporting and reducing translational barriers, thereby furthering the adoption of MMS.

The Avant-Garde Health and Codman Shoulder Society Value Based Care Group, Adam Z. Khan, Matthew J. Best, Catherine J. Fedorka, Robert M. Belniak, Derek A. Haas, Xiaoran Zhang, April D. Armstrong, Andrew Jawa, Evan A. O’Donnell, Jason E. Simon, Eric R. Wagner, Momin Malik, Michael B. Gottschalk, Gary F. Updegrove, Eric C. Makhni, Jon J. P. Warner, Uma Srikumaran, and Joseph A. Abboud. 2022. Impact of the COVID-19 pandemic on shoulder arthroplasty: Surgical trends and postoperative care pathway analysis. Journal of Shoulder and Elbow Surgery 31 (12): 2457–2464. doi: 10.1016/j.jse.2022.07.020. [JSES link]

Background: COVID-19 triggered disruption in the conventional care pathways for many orthopedic procedures. The current study aims to quantify the impact of the COVID-19 pandemic on shoulder arthroplasty hospital surgical volume, trends in surgical case distribution, length of hospitalization, posthospital disposition, and 30-day readmission rates.

Methods: This study queried all Medicare (100% sample) fee-for-service beneficiaries who underwent a shoulder arthroplasty procedure (Diagnosis-Related Group code 483, Current Procedural Terminology code 23472) from January 1, 2019, to December 18, 2020. Fracture cases were separated from nonfracture cases, which were further subdivided into anatomic or reverse arthroplasty. Volume per 1000 Medicare beneficiaries was calculated from April to December 2020 and compared to the same months in 2019. Length of stay (LOS), discharged-home rate, and 30-day readmission for the same period were obtained. The yearly difference adjusted for age, sex, race (white vs. nonwhite), Centers for Medicare & Medicaid Services Hierarchical Condition Category risk score, month fixed effects, and Core-Based Statistical Area fixed effects, with standard errors clustered at the provider level, was calculated using a multivariate analysis (P < .05).

Results: A total of 49,412 and 41,554 total shoulder arthroplasty (TSA) cases were observed April through December for 2019 and 2020, respectively. There was an overall decrease in shoulder arthroplasty volume per 1000 Medicare beneficiaries by 14% (19% reduction in anatomic TSA, 13% reduction in reverse shoulder arthroplasty, and 3% reduction in fracture cases). LOS for all shoulder arthroplasty cases decreased by 16% (−0.27 days, P < .001) when adjusted for confounders. There was a 5% increase in the discharged-home rate (88.0% to 92.7%, P < .001), which was most prominent in fracture cases, with a 20% increase in discharged-home cases (65.0% to 73.4%, P < .001). There was no significant change in 30-day hospital readmission rates overall (P = .20) or when broken down by individual procedures.

Conclusions: There was an overall decrease in shoulder arthroplasty volume per 1000 Medicare beneficiaries by 14% during the COVID-19 pandemic. A decrease in LOS and an increase in the discharged-home rate were also observed with no significant change in 30-day hospital readmission, indicating that a shift toward an outpatient surgical model can be performed safely and efficiently and has the potential to provide value.

Angelina Mooseder, Momin M. Malik, Hemank Lamba, Earth Erowid, Sylvia Thyssen, and Jürgen Pfeffer. 2022. Glowing experience or bad trip? A quantitative analysis of user reported drug experiences on Erowid.org. In Proceedings of the Sixteenth International AAAI Conference on Web and Social Media (ICWSM-2022), 675–689. [AAAI Digital Library] [arXiv preprint (includes supplementary appendix)]

Erowid.org is a website dedicated to documenting information about psychoactive substances, with over 36,000 user-submitted drug Experience Reports. We study the potential of these reports to provide information about characteristic experiences with drugs. First, we assess different kinds of drug experiences, such as ‘addiction’ or ‘bad trips’. We quantitatively analyze how such experiences are related to substances and user variables. Furthermore, we classify positive and negative experiences as well as reported addiction using information about the consumer, substance, context and location of the drug experience. While variables based only on objective characteristics yield poor predictive performance for subjective experiences, we find subjective user reports can help to identify new patterns and impact factors on drug experiences. In particular, we found a positive association between addiction experiences and dextromethorphan, a substance with largely unknown withdrawal effects. Our research can help to gain a deeper sociological understanding of drug consumption and to identify relationships which may have clinical relevance. Moreover, it can show how non-mainstream social media platforms can be utilized to study characteristics of human behavior and how this can be done in an ethical way in collaboration with the platform providers.

Young J. Juhn, Euijung Ryu, Chung-Il Wi, Katherine S. King, Momin Malik, Santiago Romero-Brufau, Chunhua Weng, Sunghwan Sohn, Richard R. Sharp, and John D. Halamka. 2022. Assessing socioeconomic bias in machine learning algorithms in health care: A case study of the HOUSES index. Journal of the American Medical Informatics Association 29 (7): 1142–1151. doi: 10.1093/jamia/ocac052. [AMIA link (paywall)]

Objective: Artificial intelligence (AI) models may propagate harmful biases in performance and hence negatively affect the underserved. We aimed to assess the degree to which data quality of electronic health records (EHRs), affected by inequities related to low socioeconomic status (SES), results in differential performance of AI models across SES.

Materials and Methods: This study utilized existing machine learning models for predicting asthma exacerbation in children with asthma. We compared balanced error rate (BER) against different SES levels measured by the HOUsing-based SocioEconomic Status measure (HOUSES) index. As a possible mechanism for differential performance, we also compared incompleteness of EHR information relevant to asthma care by SES.

Results: Asthmatic children with lower SES had larger BER than those with higher SES (eg, ratio = 1.35 for HOUSES Q1 vs Q2-Q4) and had a higher proportion of missing information relevant to asthma care (eg, 41% vs 24% for missing asthma severity and 12% vs 9.8% for undiagnosed asthma despite meeting asthma criteria).

Discussion: Our study suggests that lower SES is associated with worse predictive model performance. It also highlights the potential role of incomplete EHR data in this differential performance and suggests a way to mitigate this bias.

Conclusion: The HOUSES index allows AI researchers to assess bias in predictive model performance by SES. Although our case study was based on a small sample size and a single-site study, the study results highlight a potential strategy for identifying bias by using an innovative SES measure.
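
As a rough illustration of the comparison described in this abstract, the sketch below computes a balanced error rate separately for each group and then takes the ratio between two groups. It assumes the common definition of balanced error rate as the mean of the false-negative and false-positive rates for binary labels; the function names, group labels, and toy data are hypothetical and are not code or data from the study.

```python
from collections import defaultdict


def balanced_error_rate(y_true, y_pred):
    """Mean of the false-negative and false-positive rates (binary 0/1 labels)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return 0.5 * (fn / pos + fp / neg)


def ber_by_group(y_true, y_pred, groups):
    """Compute the balanced error rate separately within each group label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: balanced_error_rate(ts, ps) for g, (ts, ps) in buckets.items()}


# Hypothetical toy labels and predictions, four cases per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]
groups = ["lower_ses"] * 4 + ["higher_ses"] * 4

rates = ber_by_group(y_true, y_pred, groups)
# Ratio above 1 means the model errs more often in the lower-SES group.
ratio = rates["lower_ses"] / rates["higher_ses"]
```

A ratio computed this way is only as reliable as the group sizes allow; with few positive or negative cases in a group, the per-group rates become unstable.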

Nicole C. Nelson, Kelsey Ichikawa, Julie Chung, and Momin M. Malik. 2022. Psychology exceptionalism and the multiple discovery of the replication crisis. Review of General Psychology 26 (2): 184–198. doi: 10.1177/10892680211046508. [SAGE link (paywall)] [MetaArXiv:sbv3q (preprint)]

This article outlines what we call the “narrative of psychology exceptionalism” in commentaries on the replication crisis: many thoughtful commentaries link the current crisis to the specificity of psychology’s history, methods, and subject matter, but explorations of the similarities between psychology and other fields are comparatively thin. Historical analyses of the replication crisis in psychology further contribute to this exceptionalism by creating a genealogy of events and personalities that shares little in common with other fields. We aim to rebalance this narrative by examining the emergence and evolution of replication discussions in psychology alongside their emergence and evolution in biomedicine. Through a mixed-methods analysis of commentaries on replication in psychology and the biomedical sciences, we find that these conversations have, from the early years of the crisis, shared a common core that centers on concerns about the effectiveness of traditional peer review, the need for greater transparency in methods and data, and the perverse incentive structure of academia. Drawing on Robert Merton’s framework for analyzing multiple discovery in science, we argue that the nearly simultaneous emergence of this narrative across fields suggests that there are shared historical, cultural, or institutional factors driving disillusionment with established scientific practices.

Maya Malik and Momin M. Malik. 2021. Critical technical awakenings. Journal of Social Computing 2 (4): 365–384. doi: 10.23919/JSC.2021.0035. [IEEE link]

Starting with Philip E. Agre’s 1997 essay on “critical technical practice”, we consider examples of writings from computer science where authors describe “waking up” from a previously narrow technical approach to the world, enabling them to recognize how their previous efforts towards social change had been ineffective. We use these examples first to talk about the underlying assumptions of a technology-centric approach to social problems, and second to theorize these awakenings in terms of Paulo Freire’s idea of critical consciousness. Specifically, understanding these awakenings among technical practitioners as examples of this more general phenomenon gives guidance for how we might encourage and guide critical awakenings in order to get more technologists working effectively towards positive social change.

Nicole C. Nelson, Kelsey Ichikawa, Julie Chung, and Momin M. Malik. 2021. Mapping the discursive dimensions of the reproducibility crisis: A mixed methods analysis. PLOS ONE 16 (7): e0254090. doi: 10.1371/journal.pone.0254090. [PLOS ONE link] [MetaArXiv:sbv3q (preprint)]

To those involved in discussions about rigor, reproducibility, and replication in science, conversations about the “reproducibility crisis” appear ill-structured. Seemingly very different issues concerning the purity of reagents, accessibility of computational code, or misaligned incentives in academic research writ large are all collected up under this label. Prior work has attempted to address this problem by creating analytical definitions of reproducibility. We take a novel empirical, mixed methods approach to understanding variation in reproducibility discussions, using a combination of grounded theory and correspondence analysis to examine how a variety of authors narrate the story of the reproducibility crisis. Contrary to expectations, this analysis demonstrates that there is a clear thematic core to reproducibility discussions, centered on the incentive structure of science, the transparency of methods and data, and the need to reform academic publishing. However, we also identify three clusters of discussion that are distinct from the main body of articles: one focused on reagents, another on statistical methods, and a final cluster focused on the heterogeneity of the natural world. Although there are discursive differences between scientific and popular articles, we find no strong differences in how scientists and journalists write about the reproducibility crisis. Our findings demonstrate the value of using qualitative methods to identify the bounds and features of reproducibility discourse, and identify distinct vocabularies and constituencies that reformers should engage with to promote change.

Hal Roberts, Rahul Bhargava, Linas Valiukas, Dennis Jen, Momin M. Malik, Cindy Bishop, Emily Ndulue, Aashka Dave, Justin Clark, Bruce Etling, Rob Faris, Anushka Shah, Jasmin Rubinovitz, Alexis Hope, Catherine D’Ignazio, Fernando Bermejo, Yochai Benkler, and Ethan Zuckerman. 2021. Media Cloud: Massive open source collection of global news on the open web. In Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM-2021), 1034–1045. [AAAI Digital Library] [Preprint (with appendix)]

We present the first full description of Media Cloud, an open source platform based on crawling hyperlink structure in operation for over 10 years, that for many uses will be the best way to collect data for studying the media ecosystem on the open web. We document the key choices behind what data Media Cloud collects and stores, how it processes and organizes these data, and open API access as well as user-facing tools. We also highlight the strengths and limitations of the Media Cloud collection strategy compared to relevant alternatives. We give an overview of two sample datasets generated using Media Cloud and discuss how researchers can use the platform to create their own datasets.

Eugene T. Richardson, Momin M. Malik, William A. Darity, Jr., A. Kirsten Mullen, Michelle E. Morse, Maya Malik, Adia Benton, Mary T. Bassett, Paul E. Farmer, Lee Worden, and James Holland Jones. 2021. Reparations for Black American descendants of persons enslaved in the U.S. and their potential impact on SARS-CoV-2 transmission. Social Science & Medicine 276: 113741. doi: 10.1016/j.socscimed.2021.113741. [Science Direct link] [Supplementary Material]

Background: In the United States, Black Americans are suffering from a significantly disproportionate incidence of COVID-19. Going beyond mere epidemiological tallying, the potential for actual racial-justice interventions, including reparations payments, to ameliorate these disparities has not been adequately explored.

Methods: We compared the COVID-19 time-varying Rt curves of relatively disparate polities in terms of social equity (South Korea vs. Louisiana). Next, we considered a range of reproductive ratios to back-calculate the transmission rates βij for 4 cells of the simplified next-generation matrix (from which R0 is calculated for structured models) for the outbreak in Louisiana. Lastly, we modeled the effect that monetary payments as reparations for Black American descendants of persons enslaved in the U.S. would have had on pre-intervention βij and consequently R0.

Results: Once their respective epidemics begin to propagate, Louisiana displays Rt values with an absolute difference of 1.3 to 2.5 compared to South Korea. It also takes Louisiana more than twice as long to bring Rt below 1. Reasoning through the consequences of increased equity via matrix transmission models, we demonstrate how the benefits of a successful reparations program (reflected in the ratio βbb / βww) could reduce R0 by 31 to 68%.

Discussion: While there are compelling moral and historical arguments for racial injustice interventions such as reparations, our study describes potential health benefits in the form of reduced SARS-CoV-2 transmission risk. A restitutive program targeted towards Black individuals would not only decrease COVID-19 risk for recipients of the wealth redistribution; the mitigating effects would be distributed across racial groups, benefitting the population at large.
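
To make the modeling step in the Methods concrete: for a population split into two groups, R0 in a structured model is the dominant eigenvalue of the 2×2 next-generation matrix. The sketch below illustrates only that standard definition. The matrix entries are hypothetical, the function names are invented for illustration, and nothing here reproduces the paper's back-calculation of βij or its reparations scenario.

```python
import math


def r0_two_group(k):
    """Dominant eigenvalue of a non-negative 2x2 next-generation matrix.

    k[i][j] is read here as the expected number of new infections in
    group i caused by one infectious individual in group j.
    """
    (a, b), (c, d) = k
    if min(a, b, c, d) < 0:
        raise ValueError("next-generation matrix entries must be non-negative")
    # For a 2x2 matrix the eigenvalues are (trace +/- sqrt(disc)) / 2, where
    # disc = (a - d)^2 + 4bc is never negative when all entries are >= 0.
    disc = (a - d) ** 2 + 4 * b * c
    return 0.5 * (a + d + math.sqrt(disc))


def relative_reduction(before, after):
    """Fractional drop from `before` to `after` (0.25 means a 25% reduction)."""
    return 1 - after / before


# Hypothetical matrices: only within-group transmission in group 0 changes.
baseline = [[3.0, 1.0], [1.0, 1.0]]
scenario = [[1.0, 1.0], [1.0, 1.0]]

r0_before = r0_two_group(baseline)
r0_after = r0_two_group(scenario)
reduction = relative_reduction(r0_before, r0_after)
```

Because R0 depends on the whole matrix, lowering transmission within one group lowers the population-level value, which is the intuition behind the cross-group benefit described in the Discussion.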

Diego Alburez-Gutierrez, Eshwar Chandrasekharan, Rumi Chunara, Sofia Gil-Clavel, Aniko Hannak, Roberto Interdonato, Kenneth Joseph, Kyriaki Kalimeri, Momin M. Malik, Katja Mayer, Yelena Mejova, Daniela Paolotti, and Emilio Zagheni. 2019. Reports of the workshops held at the 2019 International AAAI Conference on Web and Social Media. AI Magazine 40 (4): 78–82. doi: 10.1609/aimag.v40i4.5287. [AAAI Digital Library]

The workshop program of the Association for the Advancement of Artificial Intelligence’s 13th International Conference on Web and Social Media was held at the Bavarian School of Public Policy in Munich, Germany on June 11, 2019. There were five full-day workshops, one half-day workshop, and the annual evening Science Slam in the program. The proceedings of the workshops were published in a Research Topic of Frontiers in Big Data. This report contains summaries of those workshops.

Kar-Hai Chu, Jason Colditz, Momin M. Malik, Tabitha Yates, and Brian Primack. 2019. Identifying key target audiences for public health campaigns: Leveraging machine learning in the case of hookah tobacco smoking. Journal of Medical Internet Research 21 (7): e12443. doi: 10.2196/12443. [JMIR link]

Background: Hookah tobacco smoking (HTS) is a particularly important issue for public health professionals to address owing to its prevalence and deleterious health effects. Social media sites can be a valuable tool for public health officials to conduct informational health campaigns. Current social media platforms provide researchers with opportunities to better identify and target specific audiences and even individuals. However, we are not aware of systematic research attempting to identify audiences with mixed or ambivalent views toward HTS.

Objective: The objective of this study was to (1) confirm previous research showing positively skewed HTS sentiment on Twitter using a larger dataset by leveraging machine learning techniques and (2) systematically identify individuals who exhibit mixed opinions about HTS via the Twitter platform and therefore represent key audiences for intervention.

Methods: We prospectively collected tweets related to HTS from January to June 2016. We double-coded a subset of approximately 5000 randomly sampled tweets for sentiment toward HTS and used these data to train a machine learning classifier to assess the remaining approximately 556,000 HTS-related Twitter posts. Natural language processing software was used to extract linguistic features (ie, language-based covariates). The data were processed by machine learning tools and algorithms using R. Finally, we used the results to identify individuals who, because they had consistently posted both positive and negative content, might be ambivalent toward HTS and represent an ideal audience for intervention.

Results: There were 561,960 HTS-related tweets: 373,911 were classified as positive and 183,139 were classified as negative. A set of 12,861 users met a priori criteria indicating that they posted both positive and negative tweets about HTS.

Conclusions: Sentiment analysis can allow researchers to identify audience segments on social media that demonstrate ambiguity toward key public health issues, such as HTS, and therefore represent ideal populations for intervention. Using large social media datasets can help public health officials to preemptively identify specific audience segments that would be most receptive to targeted campaigns.

Momin M. Malik. 2018. Bias and beyond in digital trace data. PhD dissertation, Carnegie Mellon University School of Computer Science. [SCS Technical Report Collection] [Defense slides]

Large-scale digital trace data from sources such as social media platforms, emails, purchase records, browsing behavior, and sensors in mobile phones are increasingly used for business decision-making, scientific research, and even public policy. However, these data do not give an unbiased picture of underlying phenomena. In this thesis, I demonstrate some of the ways in which large-scale digital trace data, despite its richness, has biases in who is represented, what sorts of actions are represented, and what sorts of behaviors are captured. I present three critiques, demonstrating respectively that geotagged tweets exhibit heavy geographic and demographic biases, that social media platforms’ attempts to guide user behavior are successful and have implications for the behavior we think we observe, and that sensors built into mobile phones like Bluetooth and WiFi measure proximity and co-location but not necessarily interaction as has been claimed.

In response to these biases, I suggest shifting the scope of research done with digital trace data away from attempts at large-sample statistical generalizability and towards studies that situate knowledge in the contexts in which the data were collected. Specifically, I present two studies demonstrating alternatives to complement each of the critiques. In the first, I work with public health researchers to use Twitter as a means of public outreach and intervention. In the second, I design a study using mobile phone sensors in which I use sensor data and survey data to respectively measure proximity and sociometric choice, and model the relationship between the two.

Committee: Jürgen Pfeffer (co-chair), Institute for Software Research; Anind K. Dey (co-chair), Human-Computer Interaction Institute; Cosma Rohilla Shalizi, Department of Statistics & Data Science; and David Lazer, Northeastern University.

Jürgen Pfeffer and Momin M. Malik. 2017. Simulating the dynamics of socio-economic systems. In Networked governance: New research perspectives, edited by Betina Hollstein, Wenzel Matiaske, and Kai-Uwe Schnapp, 143–161. Cham, Switzerland: Springer. doi: 10.1007/978-3-319-50386-8_9. [Springer link (paywall)] [Authors’ copy (contains minor corrections)] [Full-sized vector image of my recreation of the World3 diagram] [BibTeX]

Excerpt: To the two traditional modes of doing science, in vivo (observation) and in vitro (experimentation), has been added “in silico”: computer simulation. It has become routine in the natural sciences, as well as in systems planning and business process management (Baines et al. 2004; Laguna and Marklund 2013; Paul et al. 1999) to recreate the dynamics of physical systems in computer code. The code is then executed to give outputs that describe how a system evolves from given inputs. Simulation models of simple physical processes, like boiling water or materials rupturing, give precise outputs that reliably match the outcomes of the actual physical system. However, as Winsberg (2010, p. 71) argues, scientists who rely on simulations do so because they “assume as background knowledge that we already know a great deal about how to build good models of the very features of the target system that we are interested in learning about.” This is not the case with social simulation. It is often done precisely to try and discover the important features of the target system when those features are unknown or uncertain. Social simulation is a kind of computer-aided thought experiment (Di Paolo et al. 2000) and as such, it is most appropriate to use as a “method of theory development” (Gilbert and Troitzsch 2005). Unlike in the natural sciences, uncertainty and the impossibility of verification are the rule rather than the exception, and so it is rare to find attempts to use social simulation for prediction and forecasting (Feder 2002).

Momin M. Malik and Jürgen Pfeffer. 2016. Identifying platform effects in social media data. In Proceedings of the Tenth International AAAI Conference on Web and Social Media (ICWSM-16), 241–249. May 18–20, 2016, Cologne, Germany. [Updated version, Chapter 2 from dissertation] [AAAI Digital Library] [ICWSM slides] [IC2S2 slides] [Sunbelt slides] [BibTeX]

Even when external researchers have access to social media data, they are not privy to decisions that went into platform design—including the measurement and testing that goes into deploying new platform features, such as recommender systems, that seek to shape user behavior towards desirable ends. Finding ways to identify platform effects is thus important both for generalizing findings, as well as understanding the nature of platform usage. One approach is to find temporal data covering the introduction of a new feature; observing differences in behavior before and after allows us to estimate the effect of the change. We investigate platform effects using two such datasets, the Netflix Prize dataset and the Facebook New Orleans data, in which we observe seeming discontinuities in user behavior but that we know or suspect are the result of a change in platform design. For the Netflix Prize, we estimate user ratings changing by an average of about 3% after the change, and in Facebook New Orleans, we find that the introduction of the ‘People You May Know’ feature locally nearly doubled the average number of edges added daily, and increased by 63% the average proportion of triangles created by each new edge. Our work empirically verifies several previously expressed theoretical concerns, and gives insight into the magnitude and variety of platform effects.

Momin M. Malik and Jürgen Pfeffer. 2016. A macroscopic analysis of news in Twitter. Digital Journalism 4 (8), 955–979. doi: 10.1080/21670811.2015.1133249. [Taylor & Francis link (paywall)] [Preprint] [BibTeX]

Previous literature has considered the relevance of Twitter to journalism, for example as a tool for reporters to collect information and for organizations to disseminate news to the public. We consider the reciprocal perspective, carrying out a survey of news media-related content within Twitter. Using a random sample of 1.8 billion tweets over four months in 2014, we look at the distribution of activity across news media and the relative dominance of certain news organizations in terms of relative share of content, the Twitter behavior of news media, the hashtags used in news content versus Twitter as a whole, and the proportion of Twitter activity that is news media-related. We find a small but consistent proportion of Twitter is news media-related (0.8 percent by volume); that news media-related tweets focus on a different set of hashtags than Twitter as a whole, with some hashtags such as those of countries of conflict (Arab Spring countries, Ukraine) reaching over 15 percent of tweets being news media-related; and we find that news organizations’ accounts, across all major organizations, largely use Twitter as a professionalized, one-way communication medium to promote their own reporting. Using Latent Dirichlet Allocation topic modeling, we also examine how the proportion of news content varies across topics within 100,000 #Egypt tweets, finding that the relative proportion of news media-related tweets varies vastly across different subtopics. Over-time analysis reveals that news media were among the earliest adopters of certain #Egypt subtopics, providing a necessary (although not sufficient) condition for influence.

Hemank Lamba, Momin M. Malik, and Jürgen Pfeffer. 2015. A tempest in a teacup? Analyzing firestorms on Twitter. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (ASONAM 2015), 17–24. August 25–28, 2015, Paris, France. doi: 10.1145/2808797.2808828. Best student paper award. [ACM link] [BibTeX]

‘Firestorms,’ sudden bursts of negative attention in cases of controversy and outrage, are seemingly widespread on Twitter and are an increasing source of fascination and anxiety in the corporate, governmental, and public spheres. Using media mentions, we collect 80 candidate events from January 2011 to September 2014 that we would term ‘firestorms.’ Using data from the Twitter decahose (or gardenhose), a 10% random sample of all tweets, we describe the size and longevity of these firestorms. We take two firestorm exemplars, #myNYPD and #CancelColbert, as case studies to describe more fully. Then, taking the 20 firestorms with the most tweets, we look at the change in mention networks of participants over the course of the firestorm as one method of testing for possible impacts of firestorms. We find that the mention networks before and after the firestorms are more similar to each other than to those of the firestorms, suggesting that firestorms neither emerge from existing networks, nor do they result in lasting changes to social structure. To verify this, we randomly sample users and generate mention networks for baseline comparison, and find that the firestorms are not associated with a greater than random amount of change in mention networks.

Momin M. Malik, Hemank Lamba, Constantine Nakos, and Jürgen Pfeffer. 2015. Population bias in geotagged tweets. In Papers from the 2015 ICWSM Workshop on Standards and Practices in Large-Scale Social Media Research (ICWSM-15 SPSM), 18–27. May 26, 2015, Oxford, UK. [Updated version, Chapter 1 from dissertation] [AAAI Digital Library] [Slides] [BibTeX]

Geotagged tweets are an exciting and increasingly popular data source, but like all social media data, they potentially have biases in who are represented. Motivated by this, we investigate the question, ‘are users of geotagged tweets randomly distributed over the US population’? We link approximately 144 million geotagged tweets within the US, representing 2.6m unique users, to high-resolution Census population data and carry out a statistical test by which we answer this question strongly in the negative. We utilize spatial models and integrate further Census data to investigate the factors associated with this nonrandom distribution. We find that, controlling for other factors, population has no effect on the number of geotag users, and instead it is predicted by a number of factors including higher median income, being in an urban area, being further east or on a coast, having more young people, and having high Asian, Black or Hispanic/Latino populations.

Reports and blogging

Christelle Tessono, Yuan Stevens, Momin M. Malik, Supriya Dwivedi, Sonja Solomun, and Sam Andrey. 2022. AI oversight, accountability and protecting human rights: Comments on Canada’s proposed Artificial Intelligence and Data Act. November 2. Cybersecure Policy Exchange, Center for Information Technology Policy at Princeton University, and Centre for Media, Technology and Democracy at McGill University. [Report website]

Momin M. Malik. 2019. Can algorithms themselves be biased? Medium, Berkman Klein Center Collection. April 24, 2019. [Medium link] [Mobile-friendly PDF]

Io Flament, Cristina Lozano, and Momin M. Malik. 2017. Data-driven planning for sustainable tourism in Tuscany. Cascais, Portugal: Data Science for Social Good Europe. [Report]


Sayash Kapoor, Emily Cantrell, Kenny Peng, Thanh Hien Pham, Christopher A. Bail, Odd Erik Gundersen, Jake M. Hofman, Jessica Hullman, Michael A. Lones, Momin M. Malik, Priyanka Nanayakkara, Russell A. Poldrack, Inioluwa Deborah Raji, Michael Roberts, Matthew J. Salganik, Marta Serra-Garcia, Brandon M. Stewart, Gilles Vandewiele, and Arvind Narayanan. 2023. REFORMS: Reporting Standards for Machine Learning Based Science. [arXiv:2308.07832]

Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear reporting standards for ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (Reporting Standards For Machine Learning Based Science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed based on a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.

Momin M. Malik, Afsaneh Doryab, Michael Merrill, Jürgen Pfeffer, and Anind K. Dey. 2020. Can smartphone co-locations detect friendship? It depends how you model it. [arXiv:2008.02919]

We present a study to detect friendship, its strength, and its change from smartphone location data collected among members of a fraternity. We extract a rich set of co-location features and build classifiers that detect friendships and close friendship at 30% above a random baseline. We design cross-validation schema to test our model performance in specific application settings, finding it robust to seeing new dyads and to temporal variance.

Momin M. Malik. 2020. A hierarchy of limitations in machine learning. [arXiv:2002.05193]

“All models are wrong, but some are useful,” wrote George E. P. Box (1979). Machine learning has focused on the usefulness of probability models for prediction in social systems, but is only now coming to grips with the ways in which these models are wrong—and the consequences of those shortcomings. This paper attempts a comprehensive, structured overview of the specific conceptual, procedural, and statistical limitations of models in machine learning when applied to society. Machine learning modelers themselves can use the described hierarchy to identify possible failure points and think through how to address them, and consumers of machine learning models can know what to question when confronted with the decision about if, where, and how to apply machine learning. The limitations go from commitments inherent in quantification itself, through to showing how unmodeled dependencies can lead to cross-validation being overly optimistic as a way of assessing model performance.

Other research

I have worked on projects outside of my main focus, contributing data analysis and/or theory.

Gabriel Ferreira, Momin Malik, Christian Kästner, Jürgen Pfeffer, and Sven Apel. 2016. Do #ifdefs influence the occurrence of vulnerabilities? An empirical study of the Linux Kernel. In Proceedings of the 20th International Systems and Software Product Line Conference (SPLC ’16), 65–73. September 19–23, 2016, Beijing, China. doi: 10.1145/2934466.2934467. Nominated for Best Paper Award. [ACM link] [arXiv preprint] [BibTeX]

Kathleen M. Carley, Momin Malik, Peter M. Landwehr, Jürgen Pfeffer, and Michael Kowalchuck. 2016. Crowd sourcing disaster management: The complex nature of Twitter usage in Padang Indonesia. Safety Science 90, 48–61. doi: 10.1016/j.ssci.2016.04.002. [ScienceDirect link (paywall)]

Previous works

These are works done before my PhD. I am still proud of them, but they are quite different from my subsequent research.

Urs Gasser, Momin Malik, Sandra Cortesi, and Meredith Beaton. 2013. Mapping approaches to news literacy curriculum development: A navigation aid. Berkman Center Research Publication No. 2013-25. [SSRN link]

Momin Malik, Sandra Cortesi, and Urs Gasser. 2013. The challenges of defining ‘news literacy’. Berkman Center Research Publication No. 2013-20. [SSRN link]

Momin M. Malik. 2013. The role of incumbency in field emergence: The case of Internet studies. Poster presented at the Science of Team Science (SciTS) Conference 2013, Northwestern University, Evanston, IL, June 24–27, 2013. [PDF]
(Note that this is a poster version of my MSc thesis, adapted for the topic of SciTS. Also, I have since realized the error of a non-statistical approach to significance claims.)

Momin M. Malik. 2012. Networks of collaboration and field emergence in ‘Internet Studies’. Thesis submitted in partial fulfillment of the degree of MSc in Social Science of the Internet at the Oxford Internet Institute at the University of Oxford. Oxford Internet Institute, University of Oxford, Oxford, UK. [PDF]

Urs Gasser, Sandra Cortesi, Momin Malik, and Ashley Lee. 2012. Youth and digital media: From credibility to information quality. Berkman Center Research Publication No. 2012-1. [SSRN link]

Urs Gasser, Sandra Cortesi, Momin Malik, and Ashley Lee. 2010. Information quality, youth, and media: A research update. Youth Media Reporter. [Online]

Momin M. Malik. 2009. Survey of state initiatives for conservation of coastal habitats from sea-level rise. Rhode Island Coastal Resources Management Council. [PDF]

Momin M. Malik. 2008. Rediscovering Ramanujan. Thesis submitted in partial fulfillment for an honors degree in History and Science. The Department of the History of Science, Harvard University, Cambridge, MA. [PDF]


“AI translation for healthcare: Ethics & bias.” 2023 FDA Experiential Learning Program, virtual site visit to Software as a Medical Device (SaMD) Regulatory group. Center for Digital Health, Mayo Clinic. Rochester, MN [held online], September 20, 2023.

“What AI ethics ought to be about: A framework for healthcare based on AI as correlation-only modeling” (invited talk). AI Speaker Series. Center for Digital Health, Mayo Clinic. Rochester, MN [held online], June 19–21, 2023.

“Don’t trust explainable AI: Proper validation is what matters” (poster presentation). AI Summit. Department of Artificial Intelligence & Informatics, Mayo Clinic. Rochester, MN, June 19–21, 2023. [Poster]

“Critique and quantitative methods in the case for reparations.” The FXB Center’s Making the Public Health Case for Reparations Methods Workshop. François-Xavier Bagnoud Center for Health and Human Rights at Harvard University. Boston, MA, June 5, 2023. [Slides]

“Generalizability, meaningfulness, and meaning: Machine learning in the social world” (invited talk). Seminario Conjunto de Estadística y Ciencia de Datos [Joint Seminar on Statistics and Data Science], Centro de Investigación en Matemáticas (CIMAT). Guanajuato, Mexico [held online], May 10, 2023. [Slides]

“Conceptualizing progress: Beyond the ‘common task framework’.” Part of “Beyond the data and model: Integration, enrichment, and progress”, Webinar 3 in support of the NIH/NCATS Bias Detection Tools for Clinical Decision Making Challenge. With Shauna M. Overgaard, Young J. Juhn, and Chung Il Wi. National Center for Advancing Translational Sciences, National Institutes of Health. [online], February 17, 2023. [Slides] [Video]

Invited panelist, “Incorporating ethical thinking into research & innovation through education, planning, conduct, and communication.” Chaired by Michael Hawes, with session organizers Jing Cao and Stephanie Shipp, and co-panelists James Giordano, Jeri Mulrow, Katie Shay, and Nathan Colaner. Sponsored by Committee on Professional Ethics (primary), Statistics Without Borders, Committee on Scientific Freedom and Human Rights, and Committee on Funded Research. JSM 2022 Invited Session for ASA COPE, Joint Statistical Meeting 2022. Washington, DC, August 7, 2022.

“When (and why) we shouldn’t expect reproducibility in machine learning-based science: Culture, causality, and metrics as estimators” (invited talk). The Reproducibility Crisis in ML-based Science [workshop]. Center for Statistics and Machine Learning, Princeton University. [online], July 28, 2022. [Slides] [Video]

“Ethical considerations for measuring impact to health care” (invited guest lecture). AIHC 5030: Introduction to Deployment, Adoption & Maintenance of Artificial Intelligence Models/Algorithms, Spring 2022 (Instructors: Dr. Shauna Overgaard, PhD, and Dr. Chris Aakre, MD). Artificial Intelligence in Health Care Track, Mayo Clinic Graduate School of Biomedical Sciences. Rochester, MN [remote], June 2, 2022.

“Ethics in the lifecycle of AI: From research and development to clinical implementation” (invited guest lecture). CTSC 5350: Ethical Issues in Artificial Intelligence and Information Technologies, Spring 2022 (Instructors: Dr. Richard Sharp, PhD and Dr. Barbara Barry, PhD). Clinical and Translational Sciences Track, Mayo Clinic Graduate School of Biomedical Sciences. Rochester, MN [remote], May 31, 2022.

“A critical perspective on measurement in digital trace data and machine learning, and implications for demography” (invited talk). Max Planck Institute for Demographic Research Seminar Series, Rostock, Germany, April 26, 2022. [Slides]

Invited panelist, “Predictive justice.” With co-panelist Safiya Noble. AERA Presidential Session, “Expansive futures for disability intersectional learning research: Braiding culture, history, equity, and enabling technologies.” 2022 American Educational Research Association Annual Meeting. San Diego, CA, Saturday, April 23, 2022.

“The technical perspective on ethics: An overview and critique” (invited talk). Center for Digital Ethics and Policy 2022 Annual International Symposium: Digital Ethics for a Sustainable Society. School of Communication, Loyola University Chicago. [held online], March 29, 2022. [Slides]

“Critical approaches to machine learning” (invited session). ICQCM Summit, Baltimore, MD, Sunday, March 21, 2022. [Slides]

Invited panelist, “AI, race, and algorithmic justice in research.” Moderated by Ezekiel Dixon-Román, with co-panelists Meredith Broussard and Kadija Ferryman. ICQCM Summit, Baltimore, MD, Saturday, March 22, 2022.

Invited panelist, “Approaches to managing trustworthy AI.” Moderated by Maggie Little, with co-panelists Ashley Casovan, Jacob Metcalf, Rayid Ghani, and Ram Kumar. Panel 5 at Kicking off NIST AI Risk Management Framework workshop. National Institute of Standards and Technology, U.S. Department of Commerce. [held online], October 20, 2021.

“Machine learning in the hierarchy of methodological limitations” (invited talk). TILT Seminar Series 2021, Tilburg Institute for Law, Technology, and Society, Tilburg University. Tilburg, Netherlands [held online], September 21, 2021. [Slides]

“Networks and graphical models: A survey.” Networks 2021. July 6, 2021 [delivered online]. [Slides]

“Defining critical quantitative and computational methodologies” (invited session). Moderated by Ezekiel Dixon-Román. William T. Grant AQC SCHOLARS Virtual Seminar Series, Institute in Critical Quantitative, Computational, & Mixed Methodologies, Johns Hopkins University. May 27, 2021 [delivered online]. [Slides]

“Media Cloud: Massive open source collection of global news on the open Web.” Fifteenth International AAAI Conference on Web and Social Media (ICWSM-2021). [held online], June 10, 2021.

“Critical theory and quantification” (invited session). With Maya Malik and Ezekiel Dixon-Román. Histories of Artificial Intelligence: A Genealogy of Power, Mellon Sawyer Seminar, University of Cambridge. January 20, 2021 [delivered online]. [Slides]

“A hierarchy of limitations in machine learning” (invited talk). Math and Democracy Seminar Series, Center for Data Science, New York University. October 5, 2020, New York, New York [delivered online]. [Slides]

“A hierarchy of limitations in machine learning: Data biases and the social sciences” (invited webinar). Webinar Series: Data Cultures in Higher Education, Faculty of Psychology and Education, Universitat Oberta de Catalunya (Open University of Catalonia). September 29, 2020, Barcelona, Spain [online]. [Slides] [Video, with Spanish subtitles (Una jerarquía de limitaciones en el Machine Learning. Sesgos en su uso en investigación social)]

“Machine learning won’t save us: Dependencies bias cross-validation estimates of model performance.” 2020 Sunbelt Virtual Conference of the International Network for Social Network Analysis. July 17, 2020. [Slides]

“Anti-racism and COVID-19” (invited talk). With Eugene T. Richardson, William A. Darity, Jr., James Holland Jones, A. Kirsten Mullen, and Paul E. Farmer. Global Health and Social Medicine Seminar Series, Department of Global Health & Social Medicine, Harvard Medical School. June 3, 2020, Cambridge, Massachusetts [delivered online].

“Antiracism and COVID-19” (invited talk). With Eugene T. Richardson. Antiracism & Technology Design Seminar, Space Enabled research group, MIT Media Lab. May 13, 2020, Cambridge, Massachusetts [delivered online].

“Critical technical practice revisited: Towards ‘analytic actors’ in data science” (invited talk). STS Circle, Program on Science, Technology & Society, Harvard Kennedy School. March 5, 2020, Cambridge, Massachusetts. [Slides]

“Revisiting ‘all models are wrong’: Addressing limitations in big data, machine learning, and computational social science” (invited talk). Wednesdays@NICO Seminar Speaker Series, Northwestern Institute on Complex Systems, Northwestern University. February 5, 2020, Evanston, Illinois. [Slides] [Video]

“How STS can improve data science” (invited talk). Science, Technology and Society Lunch Seminar, Tufts University. January 23, 2020, Medford, Massachusetts. [Slides]

“A hierarchy of limitations in machine learning” (invited talk). Microsoft Research New England. December 3, 2019, Cambridge, Massachusetts. [Slides]

“Correlates of oppression: Machine learning and society” (invited talk). Guest lecture in MIT CMS.701/CMS.901: Current Debates in Media, Fall 2019 (Instructor: Dr. Sasha Costanza-Chock). Comparative Media Studies, Massachusetts Institute of Technology. October 30, 2019, Cambridge, Massachusetts. [Slides]

“Statistics and machine learning: Foundations, limitations, and ethics” (invited talk). Colby College Department of Mathematics and Statistics, Colloquium Fall 2019, Colby College. October 7, 2019, Waterville, Maine. [Slides]

“A critical introduction to machine learning.” 2019 ACM Richard Tapia Celebration of Diversity in Computing Conference. September 19, 2019, Marriott Marquis San Diego Marina, San Diego, California. [Slides]

“Everything you ever wanted to know about network statistics but were afraid to ask.” XXXIX Sunbelt Social Networks Conference of the International Network for Social Network Analysis. June 18, 2019, UQAM, Montreal, Quebec. [Slides] [R script]

“Three open problems for historians of AI.” Towards a History of Artificial Intelligence, Columbia University. May 24, 2019, New York, New York. [Slides] [Video]

“Interpretability is a red herring: Grappling with ‘prediction policy problems.’” 17th Annual Information Ethics Roundtable: Justice and Fairness in Data Use and Machine Learning. April 5, 2019, Northeastern University, Boston, Massachusetts. [Slides and draft] [Draft only]

“What can AI do with copyrighted data?” (invited talk). Bracing for Impact – The Artificial Intelligence Challenge: A Roadmap for AI Governance in Canada. Part II: Data, Policy & Innovation. IP Osgoode, Osgoode Hall Law School, York University. March 21, 2019, Toronto Reference Library, Toronto, Canada.

“The ethical implications of technical limitations” (invited talk). Fairness, Accountability & Transparency/Asia, Digital Asia Hub and ACM/FAT*. January 12, 2019, Shun Hing College, University of Hong Kong, Hong Kong.

“Machine learning for social scientists.” Fairness, Accountability & Transparency/Asia, Digital Asia Hub and ACM/FAT*. January 11, 2019, Shun Hing College, University of Hong Kong, Hong Kong. [Slides]

“‘AI’ is a lie: Getting to the real issues.” AGTech Forum, Berkman Klein Center for Internet & Society at Harvard University. December 13, 2018, Cambridge, Massachusetts. [Slides]

“Theorizing sensors for social network research” (invited talk). Computational Social Science Institute, UMass Amherst. December 7, 2018, Amherst, Massachusetts. [Slides]

“What everyone needs to know about ‘prediction’ in machine learning” (invited talk). Leverhulme Centre for the Future of Intelligence, University of Cambridge. December 3, 2018, Cambridge, UK. [Slides]

“Anxiety, crisis, and a computational future for journalism.” Philip Merrill College of Journalism / College of Information Studies, University of Maryland. November 27, 2018, College Park, Maryland.

“Networks, yeah! The representation of relations” (invited talk). Data & Donuts, DigitalHKS, Harvard Kennedy School, Harvard University. November 2, 2018, Cambridge, Massachusetts.

“Demystifying AI: Terms of disservice.” AI Working Group, Berkman Klein Center for Internet & Society. October 28, 2018, Cambridge, Massachusetts.

“Surprising aspects of ‘prediction’ in data science.” 0213eight, Harvard Alumni Association. October 13, 2018, Cambridge, Massachusetts.

“From the forest to the swamp: Modeling vs. implementation in data science” (invited talk). Techtopia @ Harvard University. October 2, 2018, Cambridge, Massachusetts.

Thesis defense: Bias and beyond in digital trace data. Institute for Software Research, School of Computer Science, Carnegie Mellon University. August 9, 2018, Pittsburgh, Pennsylvania. [Slides]

“Friendship and proximity in a fraternity cohort with mobile phone sensors.” XXXVIII Sunbelt Conference of the International Network for Social Network Analysis. Modeling network dynamics (ses15.05). July 1, 2018, Utrecht, Netherlands. [Slides]

“A critical introduction to statistics and machine learning.” Cascais Data Science for Social Good Europe Fellowship, Nova School of Business and Economics, Universidade NOVA de Lisboa. August 15, 2017, Cascais/Lisbon, Portugal. [Part I Slides] [Part II Slides]

“A social scientist’s guide to network statistics” (guest lecture). 70/73-449: Social, Economic and Information Networks, Fall 2016 (Instructor: Dr. Katharine Anderson). Undergraduate Economics, Tepper School of Business, Carnegie Mellon University. November 10, 2016, Pittsburgh, Pennsylvania. [Slides]

“Platform effects in social media networks.” 2nd Annual International Conference on Computational Social Science. Social Networks 1. June 24, 2016, Evanston, Illinois. [Slides]

“Identifying platform effects in social media data.” Tenth International AAAI Conference on Web and Social Media (ICWSM-16). Session I: Biases and Inequalities. May 18, 2016, Cologne, Germany. [Slides]

“Social media data and computational models of mobility: A review for demography.” 2016 ICWSM Workshop on Social Media and Demographic Research (ICWSM-16 SMDR). May 17, 2016, Cologne, Germany. [Slides]

“Platform effects in social media networks.” XXXVI Sunbelt Conference of the International Network for Social Network Analysis. Social Media Networks: Challenges and Solutions (Sunday AM2). April 10, 2016, Newport Beach, California. [Slides]

“A social scientist’s guide to network statistics (presented to statisticians).” stat-network seminar, Department of Statistics, Carnegie Mellon University. March 25, 2016, Pittsburgh, Pennsylvania. [Slides not public, see these slides for the same content.]

“Ethical and policy issues in predictive modeling” (guest lecture). 08-200/08-630/19-211: Ethics and Policy Issues in Computing, Spring 2016 (Instructor: Professor James Herbsleb). Institute for Software Research, School of Computer Science, Carnegie Mellon University. March 1, 2016, Pittsburgh, Pennsylvania. [Slides]

“Population bias in geotagged tweets.” 2015 ICWSM Workshop on Standards and Practices in Social Media Research (ICWSM-15 SPSM). May 26, 2015, Oxford, UK. [Slides]

“Inferring social networks from sensor data.” XXXIV Sunbelt Conference of the International Network for Social Network Analysis. Network Data Collection (Saturday AM2). February 22, 2014, St Pete Beach, Florida. [Slides]

Acknowledged in

I try to properly acknowledge people who contribute to my work, and conversely am proud to be found in the acknowledgements of the following works:

Apryl Williams. 2024. Not my type: Automating sexual racism in online dating. Stanford, CA: Stanford University Press.

Sireesh Gururaja, Amanda Bertsch, Clara Na, David Gray Widder, and Emma Strubell. 2023. To build our future, we must know our past: Contextualizing paradigm shifts in Natural Language Processing. [Preprint]

Barbara Kiviat. 2023. The moral affordances of construing people as cases: How algorithms and the data they depend on obscure narrative and noncomparative justice. Sociological Theory. doi: 10.1177/07352751231186797. [IEEE link]

Ben Green. 2021. Data science as political action: Grounding data science in a politics of justice. Journal of Social Computing 2 (3): 249–265. doi: 10.23919/JSC.2021.0029. [IEEE link]

Jonnie Penn. 2021. Algorithmic silence: A call to decomputerize. Journal of Social Computing 2 (4): 337–356. doi: 10.23919/JSC.2021.0023. [IEEE link]

Chelsea Barabas, Audrey Beard, Theodora Dryer, Beth Semel, and Sonja Solomun. 2020. Abolish the #TechToPrisonPipeline. Coalition for Critical Technology, June 22. [Letter website]

Dariusz Jemielniak. 2019. Socjologia Internetu (in Polish). Warszawa: Wydawnictwo Naukowe Scholar. [Publisher website] [Sample content and reference list from author]

Keiki Hinami, Michael J. Ray, Kruti Doshi, Maria Torres, Steven Aks, John J. Shannon, and William E. Trick. 2019. Prescribing associated with high-risk opioid exposures among non-cancer chronic users of opioid analgesics: A social network analysis. Journal of General Internal Medicine 34: 2443–2450. doi: 10.1007/s11606-019-05114-3. [Springer link (paywall)] [PubMed record (abstract only)]

Viktor Mayer-Schönberger and Kenneth Cukier. 2013. Big Data: A revolution that will transform how we live, work, and think. Boston and New York: Eamon Dolan/Houghton Mifflin Harcourt. [Book website]

Mary Madden, Amanda Lenhart, Sandra Cortesi, Urs Gasser, Maeve Duggan, Aaron Smith, and Meredith Beaton. 2013. Teens, social media, and privacy. Pew Internet & American Life Project. [Report website]

Press, quotes, and commentaries/editorials

Quotes from me, or notable coverage/mentions of my work:

Will Knight. 2022. Sloppy use of machine learning is causing a ‘reproducibility crisis’ in science. Wired, August 10. [Wired link]

Elizabeth Gibney. 2022. Could machine learning fuel a reproducibility crisis in science? ‘Data leakage’ threatens the reliability of machine-learning use across disciplines, researchers warn. Nature 608: 250–251. doi: 10.1038/d41586-022-02035-w. [Nature link]

Scottie Andrew. 2021. Reparations for slavery could have reduced Covid-19 transmission and deaths in the US, Harvard study says. CNN, February 16. [CNN link]

Wendy Hui Kyong Chun and Jorge Cottemay. 2020. Reimagining Networks: An interview with Wendy Hui Kyong Chun. The New Inquiry, May 12. [New Inquiry link]

Susan Cassels and Sigrid Van Den Abbeele. 2021. A call for epidemic modeling to examine historical and structural drivers of racial disparities in infectious disease [Commentary on “Reparations for Black American descendants of persons enslaved in the U.S. and their potential impact on SARS-CoV-2 transmission”]. Social Science & Medicine 276: 113833. doi: 10.1016/j.socscimed.2021.113833. [ScienceDirect link]

Bob Franklin. 2016. The future of journalism: Risks, threats, and opportunities [Mention of “A macroscopic analysis of news content on Twitter”]. Journalism Practice 10 (7): 805–807. doi: 10.1080/17512786.2016.1197640. [Taylor & Francis link]

Reviewing, organizing, and program committees

I am a Senior PC member for the International AAAI Conference on Web and Social Media (ICWSM), 2020–present.

I was Sponsorship Chair for the 14th International Conference on Web and Social Media (ICWSM-2020), Atlanta, Georgia, June 8–June 11, 2020.

I was an Editorial Board member for the 2019 special issue on “Critical Data and Algorithms Studies” in Frontiers in Big Data, Data Mining and Management (Frontiers Media S.A.).

I was co-organizer of the Workshop on Critical Data Science at the 13th International Conference on Web and Social Media (ICWSM-2019), Munich, Germany, June 11, 2019.

I was posters co-chair for the 11th International ACM Web Science Conference 2019 (WebSci ’19), Boston, Massachusetts, June 30–July 3, 2019.

I have done peer review for:


I may be reached at gmail (my first name dot my last name).

This website is my primary online presence, but I maintain profiles elsewhere as well: