Creating Infographics: Free Datasets

Google Public Data provides 103 public international and national datasets that can be explored and illustrated with a variety of graphs. Personal datasets can also be loaded, and graphs can be shared by URL.

 

 

National Climatic Data Center

Curated by: National Centers for Environmental Information (NCEI), part of NOAA
Example data set: Local Climatological Data (LCD)

If weather and climate science is your thing, you can’t get much more detailed than the National Climatic Data Center. They’ve done a little rebranding, merging the National Oceanic and Atmospheric Administration (NOAA) data centers to become the National Centers for Environmental Information (NCEI).

Here you can find an archive of climate and weather data sets from across the US, which is also the largest archive of environmental data in the world. It is a huge resource for all kinds of environmental data, including meteorological, oceanic, climate, atmospheric, and geophysical data.

Global Health Observatory data

Curated by: World Health Organization (WHO)
Example data set: Universal access to reproductive health

As part of their core goal for better health information worldwide, the World Health Organization makes their data on global health publicly available through the Global Health Observatory (GHO). The GHO acts as a portal with which to access and analyze health situations and important themes.

The various data sets are organized according to themes, such as mortality, health systems, communicable and non-communicable diseases, medicines and vaccines, health risks, and so on. The WHO’s health statistics are the go-to source for global health information and are also used in the work of the US Centers for Disease Control and Prevention.

Data.gov.sg

Curated by: Singaporean government
Example data set: Singapore Residents By Age Group, Ethnic Group And Gender, End June, Annual (2017)

There are actually a lot of great government data websites on the internet, and most of them hold an incredible wealth of data and information. The US has one of the best known at data.gov, and the UK and Australia also have great corresponding sites. With all of those, and with large population samples, we have a lot of data to access. So why Singapore?

Frankly, Singapore’s government data website is just so visually accessible. The homepage is full of small visualizations telling stories about each data set. Part of data visualization is making sure that it not only displays information in an accurate and relevant format, but is also appealing enough to catch interest. Most government data sites are utilitarian and simple, enough to get the data across in an easy-to-understand way. Singapore, however, brightens it up with colorful visualizations, splashes of color in the graphs, and a “Similar Datasets” section at the bottom of every data set to encourage readers to explore.

Amazon Web Services Open Data Registry

Curated by: Amazon
Example data set: 1000 Genomes Project

As more organizations make their data available for public access, Amazon has created a registry to find and share those various data sets. There are over 50 public data sets supported through Amazon’s registry, ranging from IRS filings to NASA satellite imagery to DNA sequencing to web crawling. The data sets also include usage examples, showing what other organizations and groups have done with the data.

Pew Internet

Curated by: Pew Research Center
Example data set: Teens, Social Media & Technology 2018

The Pew Research Center’s mission is to collect and analyze data from all over the world. They cover all sorts of topics like politics, social media, journalism, the economy, online privacy, religion, and demographic trends. While they do their own nonpartisan, non-advocacy research and analysis, they also offer their raw data for public access. Access simply requires a brief registration on the site and credit to Pew Research Center as the source of the data, with a waiver that Pew is not responsible for alternative data conclusions.

In a way, making data accessible is also another research project for Pew. They already have all the information about how they use the data in their research and they are interested in learning how others use their data as well. They have one request — to contact them by email if anything is published as a result of the data acquired.

Google Trends

Curated by: Google
Example data set: "Cupcake" search results

This is one of the broadest and most interesting public data sets to analyze. Google’s vast search engine tracks search term data to show us what people are searching for and when. You can explore statistics on search volume for almost any search term since 2004. Enter any search term, or a handful of search terms, and click the download button to analyze the data outside of the Trends website.

There are a variety of filters to narrow down trends according to location (worldwide or by country), various time ranges, categories, or even specific search types (web vs image vs YouTube search results). You can easily see what topics are popular at the moment and what is currently trending on the Trends homepage. Google also highlights several interesting examples of trends with data visuals on that homepage.
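Once a search term's data has been downloaded, it can be analyzed with a few lines of code. The following is a minimal Python sketch, not an official Google method. It assumes the export is a CSV with a category line, a blank line, and then a header row followed by one row per time period, and that very low values may appear as "<1". The sample text and its numbers are made-up placeholders, and the function names are hypothetical.

```python
import csv

# Hypothetical excerpt shaped like a Google Trends "interest over time" export.
# The layout is an assumption about the download format, and the numbers are
# made-up placeholders, not real search data.
SAMPLE_EXPORT = """Category: All categories

Month,cupcake: (Worldwide)
2004-01,10
2004-02,12
2004-03,<1
"""


def parse_trends_export(text):
    """Return a list of (period, interest) pairs from a Trends-style CSV."""
    lines = text.splitlines()
    # Skip the preamble: the table starts at the first line containing a comma.
    start = next(i for i, line in enumerate(lines) if "," in line)
    reader = csv.reader(lines[start:])
    next(reader)  # drop the header row
    rows = []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        period, value = row[0], row[1].strip()
        # Assumption: values such as "<1" are treated as 0 for analysis.
        interest = 0 if value.startswith("<") else int(value)
        rows.append((period, interest))
    return rows


def peak_period(rows):
    """Return the (period, interest) pair with the highest interest."""
    return max(rows, key=lambda pair: pair[1])


series = parse_trends_export(SAMPLE_EXPORT)
peak = peak_period(series)
```

To use this with a real download, read the file's contents into a string and pass it to parse_trends_export in place of SAMPLE_EXPORT.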

If you’re interested in more Google data, check out Google Finance, Google Public Data, and Google Scholar.

Earthdata

Curated by: NASA
Example data set: Atmospheric Electricity (Lightning)

Earthdata is part of NASA’s Earth Science Data Systems Program, specifically the Earth Observing System Data and Information System (EOSDIS). EOSDIS acts as a means to process and distribute Earth science data from the Earth observation satellites, aircraft, and field measurements.

Via Earthdata, the public can access NASA’s data, news, and event information. It covers data from Earth’s atmosphere, solar radiance, the cryosphere (arctic/frozen areas), the ocean, land surface (gravity, geomagnetism, tectonics), and human environments.

 

Kaggle

Inside Kaggle you’ll find all the code & data you need to do your data science work. Use over 50,000 public datasets and 400,000 public notebooks to conquer any analysis in no time.

 

United States Census Data

The U.S. Census Bureau publishes reams of demographic data at the state, city, and even ZIP code level. It is a fantastic data set for students interested in creating geographic data visualizations and can be accessed on the Census Bureau website. Alternatively, the data can be accessed via an API. One convenient way to use that API is through choroplethr, an R package. In general, this data is very clean, very comprehensive and nuanced, and a good choice for data visualization projects as it does not require you to manually clean it.
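For readers who would rather call the API directly than go through a package, here is a minimal Python sketch of what a request and its response might look like. The endpoint path, the variable name P1_001N, and the response layout (a JSON array whose first row is the header) are assumptions to check against the Census Bureau's API documentation. The sample response is a placeholder, not real census data, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the 2020 Decennial Census redistricting data;
# verify the path and variable names in the official API documentation.
BASE_URL = "https://api.census.gov/data/2020/dec/pl"


def build_query_url(variables, geography):
    """Build a query URL asking for the given variables and geography."""
    params = {"get": ",".join(variables), "for": geography}
    # Keep commas, colons, and asterisks readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


def rows_to_records(payload):
    """Convert a header-plus-rows JSON array into a list of dictionaries."""
    header, *rows = json.loads(payload)
    return [dict(zip(header, row)) for row in rows]


url = build_query_url(["NAME", "P1_001N"], "state:*")

# Placeholder response with made-up values, shaped like the assumed layout.
SAMPLE_RESPONSE = '[["NAME", "P1_001N", "state"], ["Example State", "12345", "00"]]'
records = rows_to_records(SAMPLE_RESPONSE)
```

The sketch stops short of sending the request so that it runs offline. In practice the URL would be fetched with an HTTP client and the response body passed to rows_to_records.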

FBI Crime Data

The FBI crime data is fascinating and one of the most interesting data sets on this list. If you’re interested in analyzing time series data, you can use it to chart changes in crime rates at the national level over a 20-year period. Alternatively, you can look at the data geographically.
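As a starting point for a time series analysis, the year-over-year percent change in a rate can be computed before charting it. The Python sketch below assumes the data has already been reduced to one rate per year. The values shown are made-up placeholders rather than FBI figures, and the function name is hypothetical.

```python
# Placeholder rates per 100,000 people; these are NOT real FBI figures.
rates_by_year = {2000: 500.0, 2001: 490.0, 2002: 441.0}


def year_over_year_change(series):
    """Return the percent change from each year to the next, keyed by the later year."""
    years = sorted(series)
    changes = {}
    for prev, curr in zip(years, years[1:]):
        changes[curr] = (series[curr] - series[prev]) / series[prev] * 100
    return changes


changes = year_over_year_change(rates_by_year)
```

Negative values indicate a falling rate. The resulting dictionary can be passed to any plotting library to chart the trend.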

CDC Cause of Death

The Centers for Disease Control and Prevention maintains a database on cause of death. The data can be segmented in almost every way imaginable: age, race, year, and so on. Since this is such a massive data set, it’s good to use for data processing projects.

Medicare Hospital Quality

The Centers for Medicare & Medicaid Services maintains a database on quality of care at more than 4,000 Medicare-certified hospitals across the U.S., providing for interesting comparisons. Since this data will be spread over multiple files and might take a bit of research to fully understand, this could be a good data cleaning project.

SEER Cancer Incidence

The U.S. government also has data about cancer incidence, again segmented by age, race, gender, year, and other factors. It comes from the National Cancer Institute’s Surveillance, Epidemiology, and End Results Program. The data goes back to 1975 and has 18 databases, so you’ll have plenty of options for analysis.

Bureau of Labor Statistics

Many important economic indicators for the United States (like unemployment and inflation) can be found on the Bureau of Labor Statistics website. Most of the data can be segmented both by time and by geography. This large data set can be used for data processing and data visualization projects.

Bureau of Economic Analysis

The Bureau of Economic Analysis also has national and regional economic data, including gross domestic product and exchange rates. There’s a huge range in the different groups of data found here—you can browse by place, economic accounts, and topics—and these groups are organized into even smaller subsets throughout.