Washington, DC has a gunshot detector network. In technical terms, a system of 300 acoustic sensors mounted on buildings with fast enough networking that sounds can be triangulated using precise timing.*
The Washington Post broke the story to most residents. Gun violence in cities is an unquestionable scourge and the detector system is a clear step toward a solution, so the story was entirely positive.
ShotSpotter, the company* that installed DC’s system, offers a cloud service in which its own personnel monitor the sensors, listen to recordings, and pass the results to police officers. The police provided data to the media - the raw data for the WP’s story - but what’s the raw product? After all, a network of gunshot detectors is a network of microphones, installed throughout a city and running 24/7. The detectors don’t just detect high-pressure sounds: SST advertises them as having subsonic to supersonic range. The implication should be clear: gunshot detectors are microphones. That is certainly useful for their main purpose, since many shooters will say their own or the victim’s name right before or after the crime. The applications outside of this purpose, like the ability to record private conversations, are more troublesome.4
ShotSpotter’s UI, from SST via Richmond Confidential
ShotSpotter retains the data to let police review incidents. The database is geographically distributed, just like any good web service, and one can only assume that it runs on public TCP.
Above are Capital Bikeshare stations - 337 locations, including a few that have closed or haven’t opened yet. The Bikeshare system publishes trip data every quarter - start & end locations, with timestamps accurate to the minute.
The District Department of Transportation maintains cameras that remotely monitor traffic and occasionally prove useful in criminal proceedings. They license the data from these cameras to a private company called TrafficLand.com, which then resells the data to online and television news.
While some municipalities do this themselves - for instance, the NYC DOT’s system - TrafficLand* has similar contracts with 50+ departments and handles 18,000 cameras. The cameras in DC update every two seconds or less.
Cities are sensors.
I’ll remove my tin-foil hat and add a few thoughts.
Given the sensor infrastructure that’s public and obvious in operation, the most powerful technique is cross-referencing. That is, when data is released with to-the-minute precision or a few extra decimal places of latitude and longitude, the potential hacks multiply. Which is to say,
DC traffic camera on TrafficLand.com
Several cameras include Capital Bikeshare racks in their field of view. Given that the cameras refresh every 2 seconds and Bikeshare trips are timestamped to the minute, it’s reasonable to assume that by recording all cameras and cross-referencing bike trips one could de-anonymize the dataset.
Deanonymization is a trick in which you can recover personally identifiable information from supposedly anonymized datasets. The first popular example was Netflix’s dataset, which in 2006 was successfully remapped to specific members. Later, New York City’s weakly anonymized taxi data was quickly decoded to reveal exact license plate numbers.anon
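To make the camera-and-bike cross-referencing concrete, here is a minimal sketch of the matching step. It is not a working attack, just the join: given a trip's start minute and station, which recorded frames could show the rider? The station and camera names, the `CAMERA_FOR_STATION` lookup, and the assumption that trip times are truncated (not rounded) to the minute are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical lookup: which camera has a given station in view.
CAMERA_FOR_STATION = {"station-a": "camera-1"}


def candidate_frames(trip_start, station, frame_times,
                     camera_for_station=CAMERA_FOR_STATION):
    """Return (camera_id, frames) that fall inside the trip's start minute.

    trip_start is assumed to be truncated to the minute, so the real
    departure lies somewhere in [trip_start, trip_start + 1 minute).
    """
    camera = camera_for_station.get(station)
    if camera is None:
        return None, []  # no camera sees this station
    window_end = trip_start + timedelta(minutes=1)
    frames = [t for t in frame_times.get(camera, [])
              if trip_start <= t < window_end]
    return camera, frames


# Made-up data: five minutes of frames, one every 2 seconds.
first_frame = datetime(2014, 10, 1, 8, 30, 0)
frame_times = {
    "camera-1": [first_frame + timedelta(seconds=2 * i) for i in range(150)]
}

camera, frames = candidate_frames(
    datetime(2014, 10, 1, 8, 31), "station-a", frame_times
)
print(camera, len(frames))
```

With a frame every 2 seconds, a one-minute window narrows a trip down to 30 frames from a single camera. Picking the right person out of those frames is the hard part, and this sketch doesn't attempt it.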
Storing one day of one camera’s footage takes about 200MB. The roughly 175 cameras that cover the DMV would consume 35GB/day, or about 3.1 terabytes of storage per quarter. That’s around $90 to store three months of data on Amazon’s S3 service.s3
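The arithmetic behind those numbers is short enough to write down. This is a back-of-the-envelope sketch: the 200MB and 175-camera figures come from the paragraph above, while the 90-day quarter and the price of $0.03 per GB-month are my assumptions, so check current S3 pricing before trusting the dollar figure.

```python
MB_PER_CAMERA_DAY = 200       # from the text
CAMERAS = 175                 # from the text
DAYS_PER_QUARTER = 90         # assumption: a quarter is about 90 days
USD_PER_GB_MONTH = 0.03       # assumption: flat storage price, not a quoted rate


def storage_estimate(cameras=CAMERAS, mb_per_camera_day=MB_PER_CAMERA_DAY,
                     days=DAYS_PER_QUARTER, usd_per_gb_month=USD_PER_GB_MONTH):
    """Return (GB per day, TB per quarter, USD per month to hold a full quarter)."""
    gb_per_day = cameras * mb_per_camera_day / 1000
    gb_per_quarter = gb_per_day * days
    tb_per_quarter = gb_per_quarter / 1000
    usd_per_month = gb_per_quarter * usd_per_gb_month
    return gb_per_day, tb_per_quarter, usd_per_month


gb_day, tb_quarter, usd_month = storage_estimate()
print(f"{gb_day:.0f} GB/day, {tb_quarter:.2f} TB/quarter, ${usd_month:.2f}/month")
```

Under these assumptions the estimate lands in the same ballpark as the figures above: 35 GB/day, a bit over 3 TB per quarter, and somewhere in the $90s per month to keep a full quarter of footage around.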
These sensor systems are large investments, made over long periods of time, with support of the government and often the community. Capital Bikeshare’s data releases are well-known and traffic cameras are an expected utility. The MPD’s relationship with ShotSpotter has been relatively quiet, but is casually mentioned in annual reports. On the other coast, Oakland, CA residents opposed the removal of ShotSpotter, saying it did good for their community.
But traffic cameras are cameras, gunshot detectors are microphones.* Some seem innocuous, some creepy, some cross into the public domain, and some don’t. The gap between stated purpose and usage is real: in Boston, a license plate reader system that was supposed to be tracking stolen cars seemed to do everything but.
And so sensor data feels different. Letting everyone listen through a gunshot detector would expose not just crime but police brutality. Open access to high-resolution cameras throughout the city will show traffic and also presidential convoys.
Every new eye added to the network, every new connection from the world to a database, has mixed consequences. In Ferguson, both volunteers and police are using body cameras to establish wrongdoing and record encounters. It’s remarkable that surveillance could be a tool for activism.
Unfiltered, sensor data shows everything, reveals everything. Or rather, everything where the sensors are. In DC, that’s where the police chose to install the sensors, which initially excluded the far northwest - so a gunshot-detected map of the city is more a map of where gunshot detectors are than gunshots.
Commuting to the old office, I would pass three cameras two times a day.
This is one of them, after being split into 2-second chunks and run through simple and stupid processing to show only foreground, moving objects.
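For the curious, one simple and stupid way to keep only moving objects is median background subtraction. The sketch below is an assumption about how such processing could work, not a record of what produced the image: frames are plain grayscale grids (lists of lists of 0-255 values), and the function names and the threshold of 30 are mine.

```python
from statistics import median


def background_model(frames):
    """Per-pixel median across frames: anything that stays put is background."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def foreground(frame, background, threshold=30):
    """Keep pixels that differ from the background by more than threshold."""
    return [
        [px if abs(px - bg) > threshold else 0 for px, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]


# Made-up 2x3 frames: one bright "car" moving left to right on a dark road.
frames = [
    [[200, 10, 10], [10, 10, 10]],
    [[10, 200, 10], [10, 10, 10]],
    [[10, 10, 200], [10, 10, 10]],
]
bg = background_model(frames)
moving = [foreground(f, bg) for f in frames]
print(moving[1])
```

The median works here because a moving object only occupies any given pixel for a small share of the frames, so the road wins the vote. It falls apart when something parks in view or the light changes, which is where real background-subtraction methods earn their keep.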
Thoughts and prior art are greatly appreciated - let me know at @tmcw.
Thanks to Eric Mill for reviewing drafts of this article. Thanks to the satellite crew at Mapbox for image-processing advice.