Finding Datasets

This guide explains how to find and access existing datasets.

Imagine an Ideal Dataset

There are many datasets out there. To figure out which ones fit your topic or interest, it can help to ask what your ideal dataset would look like:

  • Who or what is being studied? (e.g., people, towns, countries, etc.)
  • What about them is being studied? (e.g., people's height, towns' populations, countries' exports, etc.)
  • Where? (e.g., people in Iowa, towns in Ethiopia, countries in Southern Asia)
  • When? (e.g., the 1990s, the Middle Ages, during World War II, etc.)
  • How often is the data collected? (e.g., once, weekly, monthly, yearly, etc.) Note: in dataset speak this can involve some special jargon:
    • Cross-sectional data is collected at one particular point in time across a group of individuals (for instance, if you measured the height of 10 5th grade children on April 10th, 2022) (Jupp, 2011, p. 53).
    • Longitudinal data is collected at multiple times across the same group of individuals (for instance, if you followed the same 10 children from 5th grade to 6th grade and measured their heights every month) (Jupp, 2011, p. 165).
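    • Panel data is a form of longitudinal data in which the same group of individuals (the "panel") is measured repeatedly (Jupp, 2011, p. 212). By contrast, measuring a different group each time (for instance, if every year you measured the height of 10 children from the incoming 5th grade class) produces what is usually called repeated cross-sectional data.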
  • Format? What dataset file formats (e.g., an Excel file, CSV, etc.) work with the statistical software (e.g., SPSS, Stata) you're using or have access to through TCNJ?
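The collection patterns described above differ on two points: how many time points there are, and whether the same individuals show up at each one. The following is a minimal Python sketch of that check, not part of the original guide. The function name `describe_design`, the record layout (person, date, height), and all names and heights are invented for illustration.

```python
from collections import defaultdict


def describe_design(records):
    """Classify (person_id, date, value) records by how they were collected."""
    # Group the people who were measured by the date of measurement.
    people_by_date = defaultdict(set)
    for person_id, date, _value in records:
        people_by_date[date].add(person_id)

    if len(people_by_date) == 1:
        return "one time point"

    groups = list(people_by_date.values())
    if all(group == groups[0] for group in groups):
        return "same individuals at every time point"
    if all(a.isdisjoint(b) for i, a in enumerate(groups) for b in groups[i + 1:]):
        return "different individuals at each time point"
    return "mixed"


# Invented heights in centimeters, for illustration only.
one_day = [("ana", "2022-04-10", 140), ("ben", "2022-04-10", 138)]
followed = [
    ("ana", "2022-04", 140), ("ben", "2022-04", 138),
    ("ana", "2022-05", 141), ("ben", "2022-05", 139),
]
new_class_each_year = [
    ("ana", "2022", 140), ("ben", "2022", 138),
    ("cy", "2023", 142), ("dee", "2023", 137),
]

print(describe_design(one_day))              # one time point
print(describe_design(followed))             # same individuals at every time point
print(describe_design(new_class_each_year))  # different individuals at each time point
```

When a dataset's documentation does not say how it was collected, a check like this on its ID and date columns can give a rough idea of which pattern you are looking at.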

Try to be specific about your topic/question! For example:

      | Less specific | More specific
Topic | The health of U.S. adults over time. | Records of heart disease among U.S. adults in the last 10 years, broken down by year.
What  | "Health" is too broad. What's an example of health? | Heart disease
When  | "Over time" is too vague. What time period are you interested in? | Last 10 years

Jupp, V. (2011). The SAGE dictionary of social research methods. London, England: SAGE Publications, Ltd. doi: 10.4135/9780857020116

Data or Statistics?

Another helpful question to ask is whether you are looking for data or statistics because the two are often found in different places:

  • Data are individual observations, the raw output of research, often collected into a table of some sort. For example, the height and weight of each individual participant in a study.
  • Statistics interpret data by providing sums, averages, counts, etc. If you're looking for a quick number (for instance, the average height or weight of U.S. adults), you're looking for statistics.

Image: an Excel spreadsheet. Each row includes information about a car or truck, including miles travelled and gallons of gas used. These rows are labeled "Data". Below them is a row with the average of all the miles travelled and gallons of gas used. It is labeled "Statistics".
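The same distinction can be shown in a few lines of Python, as a rough sketch rather than anything taken from the spreadsheet itself. The individual rows are the data, and the averages computed from them are the statistics. The vehicle names and numbers are invented for illustration.

```python
from statistics import mean

# Data: one row per vehicle (values invented for illustration).
trips = [
    {"vehicle": "car A", "miles": 120.0, "gallons": 4.0},
    {"vehicle": "truck B", "miles": 90.0, "gallons": 6.0},
    {"vehicle": "car C", "miles": 150.0, "gallons": 5.0},
]

# Statistics: summary numbers calculated from the rows above.
average_miles = mean(row["miles"] for row in trips)
average_gallons = mean(row["gallons"] for row in trips)

print(average_miles)    # 120.0
print(average_gallons)  # 5.0
```

If you only need the two averages, a source of statistics will do. If you want to calculate something new, such as miles per gallon for each vehicle, you need the underlying rows of data.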

Starting Your Search

Click here for a list of places to start searching for datasets.