The city of Dallas provides a real-time snapshot of active police calls on Dallas Open Data. The published data set shows the block, location, nature of call, etc. for each active call. It is updated every two minutes. These are calls where the unit assigned to the call has arrived and is currently working the call. Calls for service that are not releasable due to privacy laws are not included.
The primary objective of this project is to capture these events in real time and display them on a map of the city of Dallas. The map should update automatically as new calls arrive and older calls fall off. Server-sent events (SSE) will deliver these updates to the front end.
Instead of geographic coordinates, the source data set provides an optional block field and a location, which may be a cross street or an actual address. To visualize these calls on a map, latitude and longitude will be determined by forward geocoding the location information. The project will:
The data pipeline is an event-driven workflow that is triggered whenever a new JSON file is downloaded from Dallas Open Data.
The workflow is divided into parallel branches. One branch identifies additions, updates and deletions by comparing the current Open Data API response with the previous API response. Another branch attempts to forward geocode each call's location information. Calls and geographic data are cached in DynamoDB. DynamoDB Streams are used to push updates to an API and to persist updates in S3 for further analysis. Call and address events are combined on the front end.
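The comparison step in the first branch can be illustrated with a small sketch. This is not the project's actual code: it assumes each snapshot has already been turned into a dictionary keyed by some unique call identifier (the `incident_number` key and the field names below are hypothetical), and it treats any change in a call's fields as an update.

```python
from typing import Any

Call = dict[str, Any]


def diff_snapshots(
    previous: dict[str, Call], current: dict[str, Call]
) -> tuple[dict[str, Call], dict[str, Call], dict[str, Call]]:
    """Return (added, updated, removed) calls between two snapshots.

    Both arguments map a unique call key (hypothetical) to the call record.
    """
    prev_keys = previous.keys()
    curr_keys = current.keys()

    # Calls present now but absent from the previous response.
    added = {k: current[k] for k in curr_keys - prev_keys}
    # Calls that disappeared since the previous response.
    removed = {k: previous[k] for k in prev_keys - curr_keys}
    # Calls present in both responses whose fields changed.
    updated = {
        k: current[k]
        for k in curr_keys & prev_keys
        if current[k] != previous[k]
    }
    return added, updated, removed


# Hypothetical records, shaped loosely after the fields described above.
previous = {
    "A1": {"incident_number": "A1", "block": "100", "location": "Main St"},
    "B2": {"incident_number": "B2", "block": None, "location": "Elm St / Oak St"},
}
current = {
    "B2": {"incident_number": "B2", "block": "200", "location": "Elm St / Oak St"},
    "C3": {"incident_number": "C3", "block": "300", "location": "Pine St"},
}

added, updated, removed = diff_snapshots(previous, current)
```

With these sample snapshots, `C3` is reported as an addition, `B2` as an update (its block changed), and `A1` as a deletion.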
The API prototype was built using Python's FastAPI framework. It runs in a Docker container hosted on fly.io. Its methods allow a client to fetch the most recent active calls with enriched location details and to stream real-time updates. When a new client connects, it downloads all current calls and then subscribes to /get-events/ to receive streaming updates for calls and address information.
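The streaming half of such an API might look roughly like the sketch below, which uses only the standard library to show the server-sent events wire format. Everything named here is an assumption for illustration rather than the prototype's real code: `format_sse`, `event_stream`, the in-memory queue, the `None` shutdown sentinel, and the `call` and `address` event names.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Optional

Update = tuple[str, dict[str, Any]]


def format_sse(event: str, data: dict[str, Any]) -> str:
    """Encode one message in the text/event-stream format.

    Each frame carries an event name and a JSON payload, and is
    terminated by a blank line.
    """
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def event_stream(
    queue: "asyncio.Queue[Optional[Update]]",
) -> AsyncIterator[str]:
    """Yield SSE frames for queued updates until a None sentinel arrives."""
    while True:
        item = await queue.get()
        if item is None:  # hypothetical shutdown signal
            break
        event, payload = item
        yield format_sse(event, payload)


async def demo() -> list[str]:
    """Push two hypothetical updates through the stream and collect frames."""
    queue: "asyncio.Queue[Optional[Update]]" = asyncio.Queue()
    await queue.put(("call", {"incident_number": "C3", "location": "Pine St"}))
    await queue.put(("address", {"incident_number": "C3", "lat": 32.78, "lon": -96.8}))
    await queue.put(None)
    return [frame async for frame in event_stream(queue)]


frames = asyncio.run(demo())
```

In a FastAPI route, a generator like this could be returned as `StreamingResponse(event_stream(queue), media_type="text/event-stream")`. Using separate event names for calls and addresses is one way to let the front end combine the two kinds of update as they arrive.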