Exploratory data analysis of public Chicago transportation datasets from the Chicago Data Portal.
The Chicago Data Portal provides traffic crash report information from all police districts from September 2017 to the present, with data updated daily. Traffic Crashes data from the electronic crash reporting system (E-Crash) are available for some police districts in 2015, but citywide data are not available until September 2017.
Crash data show information about each traffic crash on city streets within the City of Chicago limits and under the jurisdiction of the Chicago Police Department (CPD). About half of all crash reports, mostly minor crashes, are self-reported at the police district by the driver(s) involved, and the other half are recorded at the scene by the police officer responding to the crash.
Many of the crash parameters, including street condition data, weather condition, and posted speed limits, are recorded by the reporting officer based on the best available information at the time, but may disagree with posted information or other assessments of road conditions. Traffic crashes within the city limits for which CPD is not the responding police agency, typically crashes on interstate highways, freeway ramps, and local roads along the City boundary, are excluded from this dataset.
All crashes are recorded in the format specified in the Traffic Crash Report, SR1050, of the Illinois Department of Transportation. The crash data published on the Chicago Data Portal mostly follow the data elements in the SR1050 form. The current version of the SR1050 instructions manual with detailed information on each data element is available here.
As per Illinois statute, only crashes with a property damage value of $1,500 or more or involving bodily injury to any person(s) and that happen on a public roadway and that involve at least one moving vehicle, except bike dooring, are considered reportable crashes. However, CPD records every reported traffic crash event, regardless of the statute of limitations, and hence any formal Chicago crash dataset released by Illinois Department of Transportation may not include all the crashes listed here.
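The reportable-crash rule described above can be written as a small predicate. This is only a sketch for reasoning about the rule, not part of the dataset or the SR1050 form. The function and parameter names are hypothetical, and it assumes that bike dooring is exempt only from the moving-vehicle requirement, which is one reading of the wording above.

```python
# Illustrative sketch: paraphrases the reportable-crash rule quoted above.
REPORTABLE_DAMAGE_USD = 1500  # property damage threshold from the text


def is_reportable(damage_usd, has_injury, on_public_roadway,
                  involves_moving_vehicle, is_bike_dooring=False):
    """Return True if a crash meets the reportable-crash criteria."""
    # Either enough property damage or any bodily injury.
    meets_severity = damage_usd >= REPORTABLE_DAMAGE_USD or has_injury
    # Assumption: a bike-dooring crash does not need a moving vehicle.
    meets_vehicle_rule = involves_moving_vehicle or is_bike_dooring
    return bool(meets_severity and on_public_roadway and meets_vehicle_rule)


print(is_reportable(2000, False, True, True))  # damage above the threshold
print(is_reportable(500, False, True, True))   # minor damage, no injury
```

Because CPD records crashes that fail this test as well, a filter like this would only approximate which rows a state-level dataset might omit.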
You can download Chicago Traffic Crashes data in CSV format from the Export menu on the crashes web page:
Note: as of 2023-01-31 the Chicago Traffic Crashes dataset CSV is about 362 MB and contains over 691K traffic crash reports with 49 columns describing crash details, road and weather conditions, injuries and damages.
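To check the row and column counts of your own download against the figures above, a minimal sketch using only the Python standard library is shown here. The function name is hypothetical, and the file is streamed row by row so the full CSV never has to fit in memory.

```python
import csv


def summarize_csv(path):
    """Return (data_row_count, column_count) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])      # first row holds the column names
        rows = sum(1 for _ in reader)  # stream the remaining rows one by one
    return rows, len(header)
```

Calling `summarize_csv` on the downloaded file should report counts that match or exceed the numbers in the note, since the dataset is updated daily.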
This data and SQL scripts repository was created to demonstrate our new DuckDB Sql Tools VSCode IDE extension and other basic data tools available to developers, data analysts, and data scientists in Visual Studio Code for exploratory data analysis (EDA).
To experiment with the Traffic Crashes data, DuckDB, the SQL scripts, and our DuckDB Sql Tools extension, first clone this chicago-transport repository:
$ git clone https://github.com/RandomFractals/chicago-transport
This transport demo repository contains a /data folder with SQL scripts to create a Traffic Crashes DuckDB in-memory instance, export the traffic crashes database in .parquet format, and run a simple select SQL query that shows the last 10K crash reports with most of the data columns from the Chicago Traffic Crashes dataset to get you started:
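The shape of a "most recent N crash reports" query can be sketched as follows. This is not the repository's script: it uses Python's built-in sqlite3 as a stand-in for DuckDB so that it runs without extra installs, and the table name, column names, and sample rows are hypothetical. It also assumes timestamps stored in ISO format so that text ordering matches time ordering.

```python
import sqlite3

# In-memory stand-in database with a few made-up rows.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE traffic_crashes ("
    "crash_record_id TEXT, crash_date TEXT, weather_condition TEXT)"
)
con.executemany(
    "INSERT INTO traffic_crashes VALUES (?, ?, ?)",
    [
        ("a1", "2023-01-29 08:15:00", "CLEAR"),
        ("b2", "2023-01-31 17:40:00", "SNOW"),
        ("c3", "2023-01-30 12:05:00", "RAIN"),
    ],
)


def latest_crashes(con, limit=10_000):
    """Return the most recent crash rows, newest first."""
    query = """
        SELECT crash_record_id, crash_date, weather_condition
        FROM traffic_crashes
        ORDER BY crash_date DESC
        LIMIT ?
    """
    return con.execute(query, (limit,)).fetchall()


for row in latest_crashes(con, limit=2):
    print(row)
```

The same ORDER BY ... DESC LIMIT pattern carries over to DuckDB SQL, though the actual column list in the repository's query may differ.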
Note: due to the 100 MB GitHub file size limit, please download the raw Chicago Traffic Crashes CSV data using the download link and instructions above.
After downloading the CSV data, copy it to the /data folder of your local chicago-transport project and rename it to traffic-crashes.csv.
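The copy-and-rename step can also be scripted. The sketch below is one possible approach using the Python standard library. The function name is hypothetical, and the name of the downloaded file is whatever your browser saved, so it is passed in as an argument.

```python
import shutil
from pathlib import Path


def install_crashes_csv(downloaded, repo_dir):
    """Copy a downloaded CSV to <repo_dir>/data/traffic-crashes.csv."""
    target = Path(repo_dir) / "data" / "traffic-crashes.csv"
    # The data folder already exists in a real clone; create it otherwise.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(downloaded, target)
    return target
```

For example, `install_crashes_csv("Downloads/crashes-export.csv", "chicago-transport")` would place the file where the SQL scripts expect it; both paths here are placeholders.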
Install Visual Studio Code IDE, Node.js runtime, and DuckDB Sql Tools extension to load and view Chicago Traffic Crashes data with DuckDB.
Demonstration of loading 362 MB of all recorded Chicago Traffic Crash reports from saved data/traffic-crashes.csv into DuckDB and querying it with DuckDB Sql Tools extension:
Note: to run create-traffic-crashes-duckdb-table.sql with DuckDB Sql Tools, change the traffic-crashes.csv data file path in the CREATE TABLE SQL statement to an absolute path pointing to the data folder in your local copy of the chicago-transport repository.
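If you prefer not to edit the path by hand, the substitution can be sketched in a few lines of Python. The CREATE TABLE statement below is a hypothetical example, not the actual content of create-traffic-crashes-duckdb-table.sql, and the function name is illustrative. It assumes the script refers to the file by the relative path data/traffic-crashes.csv.

```python
from pathlib import Path


def use_absolute_csv_path(sql, repo_dir, relative="data/traffic-crashes.csv"):
    """Replace the relative CSV path in a SQL string with an absolute one."""
    # as_posix() yields forward slashes, which also work inside SQL
    # string literals on Windows.
    absolute = (Path(repo_dir) / relative).resolve().as_posix()
    return sql.replace(relative, absolute)


# Hypothetical statement; the repository's script may be written differently.
example_sql = (
    "CREATE TABLE traffic_crashes AS "
    "SELECT * FROM read_csv_auto('data/traffic-crashes.csv');"
)
print(use_absolute_csv_path(example_sql, "."))
```

Run it from the root of your local clone, or pass the clone's location as `repo_dir`, then paste the printed statement into the script.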
You can also explore our Observable and Quarto Chicago Transportation Notebooks 📚 collection, created in 2022, on GitHub or on the Observable site.