
Six years of SafeGraph for Academic Researchers: What 700+ Published Papers Tell Us About The Value of Data Access

March 19, 2026
By Dewey Data

In March 2020, as the pandemic brought the world to a standstill, SafeGraph did something unusual: it opened its data to academic researchers for free. The timing was urgent. Researchers needed granular location data to understand how people were moving, where they were gathering, and how policy interventions were changing behavior in real time. SafeGraph provided it, and researchers responded immediately.

Six years later, the scale of what followed is worth pausing on. Dewey's published research database now tracks 720 papers that cite SafeGraph as a data source. Research topics span economics, public health, urban planning, finance, GIS, marketing, computer science, and public policy. Papers have appeared in Nature Communications, PNAS, Management Science, Journal of Health Economics, and Journal of Urban Economics, alongside hundreds of working papers on NBER and SSRN. Graduate students have used the data for dissertations. Urban planners have used it to study the recovery of downtown economies. And that body of work is still growing.

This matters beyond SafeGraph. For data providers considering whether to make their data available to academic researchers, and for librarians and research administrators evaluating data investments, the SafeGraph case offers a clear-eyed lesson about research timelines and what genuine long-term impact looks like. Academic research doesn't move fast. A dataset opened in 2020 produced papers in every following year, each wave building on the methods and findings of the last. Data that earns trust in the academic community keeps generating citations for years.

SafeGraph data usage across academic disciplines

With SafeGraph, every point of interest (POI) in the dataset is individually identifiable: each has a persistent Placekey identifier, precise coordinates, a NAICS industry code, and a brand affiliation. This means researchers can track what happens to a specific Walmart, a specific urgent care clinic, or a specific downtown coffee shop across months and years, and compare it to every other establishment of the same type across the country.

The result, in the Dewey research database, looks like this across disciplines:

  • Urban Planning & Transportation: The largest cluster by far, covering downtown recovery, transportation access, neighborhood change, and the geography of services
  • Economics: COVID-19 economic impacts, local labor markets, consumer behavior, retail dynamics, and place-based policy evaluation
  • GIS: Spatial methodology papers and environmental health research
  • Public Policy: Policing, school closures, social movement geography, disaster response, and public health interventions
  • Healthcare: Pandemic transmission modeling, substance use research, healthcare access and equity
  • Finance: Asset pricing, supply chain disruption, and real estate impacts
  • Marketing: Omnichannel retail behavior, brand loyalty, advertising response, and media influence
  • Computer Science: Mobility modeling, machine learning applications, network analysis

A single dataset can cross departmental boundaries in ways that are easy to underestimate.

SafeGraph data in practice: Recent published examples

To make that breadth concrete, here are a handful of papers from the database that show the range of questions SafeGraph has made answerable.

Multifaceted urban resilience after disruption (Texas A&M, University of South Florida, Grand Valley State University, and Emory University): This paper used a heterogeneous graph neural network trained on SafeGraph Places data to analyze how different types of urban locations recover from disruptions. The Places dataset provided the POI-level structure needed to build the graph network, linking establishments by type, proximity, and visitor overlap.

COVID-19 and travel to major landmarks (University of Florida): Researchers combined SafeGraph Places data with Google Trends, Twitter/X, and Tripadvisor to analyze how pandemic conditions affected the travel distances and search behavior associated with prominent landmarks. The POI data established the precise locations and footprints of destinations at scale.

Stadium construction and neighborhood change (University of Pennsylvania): Using micro-level data including SafeGraph, this paper evaluates whether NFL stadium construction generates lasting agglomeration effects in surrounding neighborhoods, examining housing prices, business activity, and demographic composition before and after stadium openings.

EV charging stations and local business spending (MIT): Using SafeGraph Spend data across 4,000+ EV charging stations and 140,000 California businesses, researchers showed that installing a single charging station increased nearby establishment spending by 1.4% in 2019, translating to $6.7 million in incremental economic activity, and that the effect was concentrated in food, entertainment, and retail. This paper could not have been written without establishment-level transaction data.

SafeGraph Spend Data: What it contains and what research it can support

SafeGraph currently offers two primary datasets through Dewey: Places (global POI data updated monthly) and Spend (aggregated, anonymized credit and debit card transactions tied to individual POIs, available back to January 2019). 

For researchers whose questions involve economic activity at the place level, Spend is the dataset to know about, and it remains underutilized relative to what it can support.

What's in it: Each monthly record in Spend corresponds to a single POI and includes total spend, transaction count, unique customer count, median spend per transaction and per customer, spending by day of week, and modeled demographic breakdowns by customer income and loyalty. Crucially, it pairs with Places through the Placekey identifier, so every Spend record inherits the full context of the POI it's attached to: location, industry, brand, open/close history, and building geometry.
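As a rough illustration of that pairing, here is a minimal pandas sketch that joins monthly Spend rows to their Places attributes on Placekey. The two data frames are toy stand-ins, and every column name other than `placekey` (for example `naics_code`, `brand`, and `raw_total_spend`) is an assumption for illustration; check the current schema before relying on any of them.

```python
import pandas as pd

# Toy stand-ins for the Places and Spend files. The placekey values and all
# column names here are illustrative assumptions, not the documented schema.
places = pd.DataFrame({
    "placekey": ["aaa-222@abc", "bbb-222@abc", "ccc-222@abc"],
    "naics_code": ["722511", "445110", "722515"],
    "brand": ["Diner Co", None, "Bean House"],
})
spend = pd.DataFrame({
    "placekey": ["aaa-222@abc", "aaa-222@abc", "ccc-222@abc"],
    "month": ["2023-01", "2023-02", "2023-01"],
    "raw_total_spend": [12000.0, 13500.0, 4200.0],
})


def attach_place_context(spend_df: pd.DataFrame, places_df: pd.DataFrame) -> pd.DataFrame:
    """Left-join Spend rows to POI attributes.

    validate="many_to_one" raises if Places has duplicate placekeys, which
    would otherwise silently duplicate Spend rows.
    """
    return spend_df.merge(places_df, on="placekey", how="left", validate="many_to_one")


spend_with_context = attach_place_context(spend, places)
print(spend_with_context[["placekey", "month", "naics_code", "raw_total_spend"]])
```

A left join keeps every Spend row even if a POI is absent from the Places snapshot being used, which makes unmatched records easy to spot afterward.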

What it can answer that other data can't: Because Spend is anchored to individually identified establishments rather than aggregated to ZIP codes or metro areas, it supports research designs that require a clear treated and control group of businesses, the kind of event-study and difference-in-differences frameworks that dominate applied microeconomics. If you want to know what happens to nearby restaurant revenue when a park opens, when a subway station closes, when a competitor arrives, or when a disaster strikes, you need establishment-level transaction data with geographic precision. Spend provides that.
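To make the treated-versus-control idea concrete, here is a minimal sketch of the simplest two-group, two-period difference-in-differences calculation on an establishment-level panel. The data are invented, and the `treated`, `post`, and `spend` columns are hypothetical names for this example. A real study would typically use a regression with establishment and time fixed effects and clustered standard errors rather than raw group means.

```python
import pandas as pd

# Invented panel: two treated and two control establishments, each observed
# once before and once after some event (for example, a nearby park opening).
panel = pd.DataFrame({
    "placekey": ["t1", "t1", "t2", "t2", "c1", "c1", "c2", "c2"],
    "treated":  [1, 1, 1, 1, 0, 0, 0, 0],
    "post":     [0, 1, 0, 1, 0, 1, 0, 1],
    "spend":    [100.0, 130.0, 200.0, 250.0, 100.0, 110.0, 200.0, 210.0],
})


def did_estimate(df: pd.DataFrame, outcome: str = "spend") -> float:
    """Return (treated post - treated pre) - (control post - control pre)."""
    means = df.groupby(["treated", "post"])[outcome].mean()
    treated_change = means.loc[(1, 1)] - means.loc[(1, 0)]
    control_change = means.loc[(0, 1)] - means.loc[(0, 0)]
    return float(treated_change - control_change)


print(did_estimate(panel))  # 30.0 for the toy data above
```

In the toy data, average spend at treated establishments rises by 40 while control establishments rise by 10, so the estimate attributes 30 to the event, under the usual parallel-trends assumption.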

Here are several research directions that are directly tractable with SafeGraph Spend today:

Retail competition and market structure: How do new entrants (a dollar store, a pharmacy, a fast-casual chain) affect spending at neighboring businesses? The establishment-level transaction data allows researchers to construct a matched comparison group of similar businesses that did not receive a nearby entrant, and measure spending changes before and after the opening. The median_spend_per_transaction and raw_num_customers fields separately identify whether effects work through basket size, customer traffic, or both.

Infrastructure and local economic development: The MIT EV charging paper is a template for a larger class of questions. What does a new transit stop, a new park, a new hospital, or a new broadband tower do to the businesses around it? Spend enables researchers to measure these spillovers with precision that aggregate economic statistics cannot achieve.

Policy evaluation in public finance: Any policy that operates on the physical retail environment can be evaluated using Spend as the outcome measure, including local tax holidays, zoning changes, enterprise zone designations, and business improvement districts. Researchers can use the NAICS codes from the linked Places data to define which business types should theoretically be affected, and test treatment effects with appropriate controls.

Natural disasters and economic resilience: Spend data going back to January 2019 means researchers have access to pre-disaster baselines for any major weather event or disaster since 2020. Tracking spending recovery at affected establishments, comparing recovery rates by business type and neighborhood income, and quantifying the economic cost of disruption are all direct applications.

How to access SafeGraph Spend Data through Dewey

A few things worth knowing before working with Spend for the first time:

Panel normalization matters: SafeGraph publishes a Spend Transaction Panel file each month, which tracks the size and composition of the underlying transaction panel over time. Because the panel fluctuates, longitudinal analyses should normalize raw spend figures against the panel to avoid conflating changes in coverage with changes in actual economic activity.
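One simple way to do this, sketched below, is to rescale each month's raw spend by the ratio of a baseline month's panel size to that month's panel size. This is only one possible approach, not SafeGraph's prescribed method, and the `panel_customers` column is a hypothetical name for whatever panel-size measure the Transaction Panel file provides.

```python
import pandas as pd

# Toy Spend rows for one POI and a toy panel summary. Column names are
# assumptions for illustration; check the actual files for the real fields.
spend = pd.DataFrame({
    "placekey": ["p1", "p1", "p1"],
    "month": ["2021-01", "2021-02", "2021-03"],
    "raw_total_spend": [1000.0, 1100.0, 1320.0],
})
panel = pd.DataFrame({
    "month": ["2021-01", "2021-02", "2021-03"],
    "panel_customers": [100000, 110000, 120000],
})


def normalize_to_panel(spend_df: pd.DataFrame, panel_df: pd.DataFrame, base_month: str) -> pd.DataFrame:
    """Scale raw spend so every month is expressed at the base month's panel size."""
    base = panel_df.loc[panel_df["month"] == base_month, "panel_customers"].iloc[0]
    out = spend_df.merge(panel_df, on="month", how="left", validate="many_to_one")
    out["normalized_spend"] = out["raw_total_spend"] * base / out["panel_customers"]
    return out


normalized = normalize_to_panel(spend, panel, base_month="2021-01")
print(normalized[["month", "raw_total_spend", "normalized_spend"]])
```

In the toy data, raw spend rises 10% from January to February, but so does the panel, so normalized spend is flat. Only the March increase survives the adjustment.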

Missing rows, not zeroes: If a POI has fewer than four unique customers in a given month, it is excluded from that month's file entirely. This means a missing row represents suppressed data, not zero activity. Researchers modeling small or specialized businesses should account for this in their sample construction and be aware of potential selection bias if suppressions are non-random.
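A minimal sketch of one way to handle this in pandas: expand the data to a full POI-by-month grid, leave the gaps as missing values instead of filling them with zero, and flag them explicitly. The column names are illustrative assumptions, and a gap can also mean the POI was not yet open or had already closed, so the flag is deliberately named to cover both cases.

```python
import pandas as pd

# Toy data: POI "p2" has no row for 2022-02.
spend = pd.DataFrame({
    "placekey": ["p1", "p1", "p1", "p2", "p2"],
    "month": ["2022-01", "2022-02", "2022-03", "2022-01", "2022-03"],
    "raw_total_spend": [500.0, 520.0, 480.0, 90.0, 75.0],
})


def to_balanced_panel(df: pd.DataFrame) -> pd.DataFrame:
    """Reindex to every (placekey, month) pair, keeping gaps as NaN, not 0."""
    full_index = pd.MultiIndex.from_product(
        [df["placekey"].unique(), sorted(df["month"].unique())],
        names=["placekey", "month"],
    )
    balanced = df.set_index(["placekey", "month"]).reindex(full_index).reset_index()
    # True where the source file had no row for that POI in that month.
    balanced["suppressed_or_absent"] = balanced["raw_total_spend"].isna()
    return balanced


balanced = to_balanced_panel(spend)
print(balanced)
```

Keeping the gaps as missing values means downstream averages skip them by default, and the flag column makes it straightforward to test whether gaps cluster in particular business types or neighborhoods.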

Placekey enables cross-dataset joining: Because Spend uses Placekey as its primary identifier, and SafeGraph has published code notebooks for joining Placekey to common research identifiers like CRSP, Compustat tickers, and Census geographies, it can be linked to a wide range of other data sources. Dewey's documentation includes sample Python code for mapping to Census block groups, tracts, and CBSAs.
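As a small illustration of the Census side of that linkage (this is not Dewey's sample code), the sketch below assumes each POI has already been assigned a 12-digit Census block group GEOID, for example through a point-in-polygon join on its coordinates. Because the GEOID is hierarchical, the tract and county codes can be read off as prefixes. Mapping to CBSAs requires a separate county-to-CBSA delineation file and is not shown.

```python
def split_block_group_geoid(geoid: str) -> dict:
    """Split a 12-digit block group GEOID into its nested Census geographies.

    Layout: 2-digit state + 3-digit county + 6-digit tract + 1-digit block group.
    """
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"expected a 12-digit block group GEOID, got {geoid!r}")
    return {
        "state_fips": geoid[:2],
        "county_fips": geoid[:5],   # state + county
        "tract_geoid": geoid[:11],  # state + county + tract
        "block_group_geoid": geoid,
    }


# Example with an illustrative GEOID string.
print(split_block_group_geoid("060750201001"))
```

GEOIDs should be kept as strings throughout, since reading them as integers drops the leading zero for states such as California (06).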

Start with the seminar: SafeGraph's team recorded a deep-dive seminar with Dewey covering both Places and Spend, walking through methodology, schema, and common analytical patterns. It's available on the Dewey seminars page and offers an efficient 60-minute orientation before diving into the data.

SafeGraph has supported more academic research than perhaps any other commercial data provider in Dewey's catalog, and the Spend dataset represents the most immediate opportunity for researchers whose questions involve economic activity, consumer behavior, or place-based policy. Access it, along with all of SafeGraph's current datasets, through the Dewey platform at app.deweydata.io.