And just as OSS is a community, it is also an ethos and a mindset. And from that mindset, a broader “open” movement has emerged, one that includes Open Data and Open Processes. Today, let’s talk about one of the most impactful offshoots of this movement: Open Data.
What Is Open Data?
I first encountered the concept of open data back in the 1980s while working for NASA at the Marshall Space Flight Center. My mentors explained to me that, as a public institution supporting scientific discovery, much of the data that we collected would be made available to the broader public.
All those years ago, the “broader public” was assumed to be primarily other scientists and academics around the world. Since then, broadband has become widespread, and distributing data is neither expensive nor difficult. Over time, data creators beyond the world of science and academia have also become big supporters of open data.
So, what qualifies as open data? According to the Open Knowledge Foundation, a dataset must be:
- Freely available for anyone to use, reuse, and redistribute.
- Accessible in a widely usable format (e.g., CSV, JSON, APIs).
- Licensed to allow open use (e.g., Creative Commons or Open Data Commons licenses, as used by projects such as Wikidata).
- Non-proprietary and not restricted by copyright, patents, or legal requirements.
(For a deeper dive, check out the Open Definition at https://opendefinition.org/ and the Open Data Handbook at https://opendatahandbook.org/guide/en/what-is-open-data/.)
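To make the "widely usable format" criterion a little more concrete, here is a minimal Python sketch that reads the same two records from CSV and from JSON using only the standard library. The sample data (city names and populations) is invented for illustration and does not come from any real dataset.

```python
import csv
import io
import json

# Hypothetical sample data, invented for this illustration.
CSV_TEXT = "city,population\nSpringfield,30000\nShelbyville,25000\n"
JSON_TEXT = (
    '[{"city": "Springfield", "population": 30000},'
    ' {"city": "Shelbyville", "population": 25000}]'
)


def load_csv(text):
    """Parse CSV text into a list of dicts, converting population to int."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"city": r["city"], "population": int(r["population"])} for r in rows]


def load_json(text):
    """Parse JSON text into a list of dicts."""
    return json.loads(text)


csv_records = load_csv(CSV_TEXT)
json_records = load_json(JSON_TEXT)

# Both formats carry the same information once parsed.
print(csv_records == json_records)
```

The point is not the specific code but that no special software or license is needed: formats like these can be opened with tools that ship with almost any programming language.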
Where Can I Find Open Data?
When I first heard about open data, I thought, “Oh, that makes sense. The US government funds this research. It’s owned by the people of the USA.” Since then, most government agencies at the federal, state, and local levels have begun publishing data that is not otherwise restricted from public view: budgets, census results, crime statistics, weather, public transport, property sales, and more.
Academic institutions also share many open data sets, including entire museum collections, augmented reality (AR) museum walkthroughs, archives, curated manuscripts, and data sets from their researchers.
But open data is also available from all sorts of businesses and for-profit enterprises. For example, you can access massive data sets covering equities markets, financial reports, supply chain activities, and much more. Other commercial enterprises enjoy community support for their open data sets. For example, Amazon provides a massive data set from its IMDb website covering more than 50,000 movies and TV shows, actors, crew members, and ratings. There’s even a weekly newsletter by Jeremy Singer-Vine with loads of links to interesting data sets. Here are some other top sources:
- Data.gov: The U.S. government’s open data portal, launched under President Obama.
- Kaggle: A treasure trove of community-contributed datasets for data science and machine learning.
- OpenStreetMap: A collaborative project to create a free editable map of the world.
- World Bank Open Data: Global development data across sectors and countries.
- While known for code, GitHub also hosts thousands of open data projects.
Why Open Data Matters
Open data is more than just a transparency tool—it’s a catalyst for innovation. Here’s how:
- Transparency: Citizens can hold institutions accountable.
- Innovation: Developers can build apps and services without licensing fees.
- Collaboration: Researchers can validate and build on each other’s work.
- Efficiency: Organizations avoid duplicating data collection efforts.
For example, open transit data powers real-time navigation apps. Open health data supports pandemic modeling. And open financial data fuels fintech innovation.
How You Can Get Involved
Just like contributing to OSS, there are many ways to support open data:
- Build apps using public APIs (e.g., weather, transport, health).
- Analyze datasets to uncover insights or create visualizations.
- Clean and enrich data to improve usability.
- Package datasets into accessible formats for others.
- Contribute to civic tech or help maintain open data portals.
If you’re looking to sharpen your skills or build a portfolio, open data projects are a great place to start.
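As a small illustration of the "clean" and "package" ideas from the list above, here is a hedged Python sketch using only the standard library. The station names, rider counts, and the cleaning rules (trim whitespace, normalize capitalization, drop incomplete rows and duplicates) are all assumptions made up for this example; a real dataset would need rules suited to its own quirks.

```python
import csv
import io
import json

# Hypothetical raw export, invented for this illustration: note the stray
# whitespace, inconsistent capitalization, a missing value, and a duplicate.
RAW_CSV = """station, riders
 Central ,1200
central,1200
Harbor,
Airport,850
"""


def clean_rows(text):
    """Return de-duplicated rows with trimmed names and integer rider counts."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    seen = set()
    cleaned = []
    for row in reader:
        if len(row) != 2:
            continue  # skip malformed lines
        name = row[0].strip().title()
        count = row[1].strip()
        if not name or not count.isdigit():
            continue  # drop rows with missing or non-numeric values
        if name in seen:
            continue  # drop duplicates after normalization
        seen.add(name)
        cleaned.append({"station": name, "riders": int(count)})
    return cleaned


def package_as_json(rows):
    """Serialize cleaned rows as pretty-printed JSON for others to reuse."""
    return json.dumps(rows, indent=2)


cleaned = clean_rows(RAW_CSV)
print(package_as_json(cleaned))
```

Dropping incomplete rows is only one possible choice. Depending on the dataset, it may be better to keep them and flag the gaps so that downstream users can decide for themselves.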
Open data is a powerful force for good. It democratizes access to information, fuels innovation, and strengthens civic engagement. Whether you're a developer, analyst, or just curious, there’s never been a better time to explore the world of open data.
Who knows? Maybe your next big idea will come from a dataset just waiting to be discovered.