Dataset Overview

Learn how to build the datasets that you'll connect to visualization elements.

Creating Datasets

Datasets contain the data that are displayed in visualization elements like charts, maps, tables, and filters. You can create a new dataset by selecting New Dataset in the Project panel and then selecting from the following:

  • Blank Dataset: Creates a new empty dataset.
  • Upload Dataset: Upload your own dataset in CSV or TSV format. First, select a file to upload and give the new dataset a name. On the next screen, use the checkboxes to select which columns from the uploaded file to include in the new dataset, and which Type to assign to each. On the Starter or Classroom plan, the uploaded dataset must be less than 5 MB when stored in JSON format. On the Pro plan, it must be less than 20 MB.
Tip

Dataset size limits are based on the stored JSON size, which is shown in the bottom right corner of an open dataset, or in the bottom left corner during the upload sequence. A dataset stored in JSON is approximately 2-3 times larger than the uploaded CSV file.

  • New dataset from Basemap: Create a dataset with records that exactly match the features of a basemap. Begin by selecting a basemap that has been added to the project. Next, use the checkboxes to select which basemap properties to add as dataset fields (columns), and what Type to assign to each. All features (geographic shapes) in the basemap will be added as dataset records (rows). Creating a dataset in this way can be helpful when building a choropleth or any other map where there should be one dataset record for every map feature.
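If you want a rough idea of whether a CSV file will fit within the upload size limits before you upload it, you can estimate its JSON size locally. The sketch below assumes the stored JSON is an array with one object per record, keyed by field name. That layout is an assumption for illustration, not a description of the actual storage format, so treat the result as a ballpark figure. The function name estimate_json_size is hypothetical.

```python
import csv
import io
import json


def estimate_json_size(csv_text: str, delimiter: str = ",") -> int:
    """Estimate the UTF-8 byte size of CSV data once serialized as JSON.

    Assumption: one JSON object per record, keyed by field name.
    The real stored format may differ.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text), delimiter=delimiter))
    return len(json.dumps(rows, ensure_ascii=False).encode("utf-8"))


sample_csv = "Date,Germany,Sweden\n1990,22304,30594\n2020,46749,52838\n"
csv_bytes = len(sample_csv.encode("utf-8"))
json_bytes = estimate_json_size(sample_csv)

# Compare the two sizes and the growth factor.
print(csv_bytes, json_bytes, round(json_bytes / csv_bytes, 1))
```

For a TSV file, pass delimiter="\t". The size shown in the dataset editor remains the authoritative number.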

Datasets Overview

Datasets consist of Fields (columns) and Records (rows). You can add new fields by selecting Add Field on the right of the dataset. You can also duplicate an existing field by selecting the field menu (three dots) in any field heading and choosing Duplicate Field. Records can be added by selecting Add Record at the bottom of the dataset.

Click any field header to sort the dataset's records by that field. The first click sorts the records in ascending order, the second sorts them in descending order, and the third clears the sorting.

Use the toolbar at the top of the screen to access the following dataset features:

  • Default Font: Choose whether to display the dataset using a serif or monotype font.
  • Row Height: Choose whether to increase row height from the default option.
  • Filter: Enter a keyword to search the dataset for a term. Records that don't include the keyword are temporarily hidden. Choose a match algorithm: "Contains" (the keyword can appear anywhere in the record), "Fuzzy" (a term similar to the keyword can appear anywhere in the record), or "Exact" (the keyword must match the record exactly, with no additional text in the cell).
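The three Filter match options can be illustrated with a short sketch. The code below is not the filter's actual implementation: the case-insensitive comparison, the word-by-word similarity check, and the 0.8 threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def cell_matches(cell: str, keyword: str, mode: str = "contains",
                 threshold: float = 0.8) -> bool:
    """Return True if a single cell matches the keyword under the given mode."""
    cell_l, key_l = cell.lower(), keyword.lower()  # assumption: case-insensitive
    if mode == "exact":
        # The whole cell must equal the keyword, with no additional text.
        return cell_l == key_l
    if mode == "contains":
        # The keyword may appear anywhere in the cell.
        return key_l in cell_l
    if mode == "fuzzy":
        # Assumption: a word in the cell only needs to be similar to the keyword.
        return any(
            SequenceMatcher(None, key_l, word).ratio() >= threshold
            for word in cell_l.split()
        )
    raise ValueError(f"unknown mode: {mode}")


def filter_records(records, keyword, mode="contains"):
    """Keep records in which at least one cell matches the keyword."""
    return [
        r for r in records
        if any(cell_matches(str(v), keyword, mode) for v in r.values())
    ]


records = [
    {"Country": "Germany", "Note": "GDP per capita"},
    {"Country": "Sweden", "Note": "Nordic"},
]
print(filter_records(records, "German", "contains"))
print(filter_records(records, "German", "exact"))
print(filter_records(records, "Germny", "fuzzy"))
```

In this sketch, "German" matches the Germany record under "contains" but not under "exact", and the misspelled "Germny" still matches under "fuzzy".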

Dataset Format

Datasets can store data in either wide data or long data format, depending on what you need to create your visualization. Read our Choosing a Format documentation to better understand how to select the right data format.

In wide data format, you will have individual fields (columns) for every series. For instance, in this dataset containing GDP per capita records for Germany and Sweden, the two series (Germany and Sweden) are added as individual fields:

Date    Germany    Sweden
1990    22304      30594
2020    46749      52838

In long data format, you will have one column containing the series name (the "Country" field), and another containing the values (the "GDP per capita" field):

Date    Country    GDP per capita
1990    Germany    22304
1990    Sweden     30594
2020    Germany    46749
2020    Sweden     52838

You do not need to specify in the dataset editor which data format you have chosen, but you do need to ensure that you have used a data structure that corresponds to the data structure of any visualization elements (charts, maps, etc.) that are connected to the dataset.
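If your source data is in wide format and a visualization element expects long format, you can reshape it before uploading. The following is a minimal sketch using the GDP per capita example above. The function name wide_to_long and its parameters are hypothetical and not part of the dataset editor.

```python
def wide_to_long(rows, id_field, series_name, value_name):
    """Turn one wide record per date into one long record per (date, series)."""
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field == id_field:
                continue  # the identifier is repeated on each long record
            long_rows.append({
                id_field: row[id_field],
                series_name: field,   # former column name becomes the series
                value_name: value,
            })
    return long_rows


wide_rows = [
    {"Date": 1990, "Germany": 22304, "Sweden": 30594},
    {"Date": 2020, "Germany": 46749, "Sweden": 52838},
]

for record in wide_to_long(wide_rows, "Date", "Country", "GDP per capita"):
    print(record)
```

Each wide record produces one long record per series, so the two wide records here become four long records.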

Data Types

Each field (column) can be assigned a data type, which ensures that the field is displayed correctly and consistently whenever it is used in a visualization element. Data types are specified by opening the field menu (three dots) in any field heading. The following data types are available:

  • Text: Use this for any categorical data (e.g., country names, product types, or education levels).
  • Number: Use this for numeric data that isn't better suited to the percent, currency, or measurement types listed below. Choose a Format to display values as plain numbers, fixed decimals (e.g., 1.0, 1.00), or compact notation with suffixes (e.g., 1m, 1.0m).
  • Percent: Use this for numeric data that is in percentage form. The percentage sign (%) is always appended to values in this field. Note that values are stored in their raw decimal form. For instance, if you enter the value 0.05, it is displayed as 5%, and if you enter the value 5, it is displayed as 500%. If you paste a value with the percentage sign already appended (e.g., 5%), it is stored as 0.05 and displayed as 5%. Choose a Format to apply to values; the options are similar to those for Number, but include a % sign.
  • Currency: Use this for numeric data representing monetary values. Specify a Currency Symbol, which will be displayed before the value unless the After Text option is toggled on. Choose a Format to apply to values; the options are similar to those for Number, but include the currency symbol.
  • Measurement: Use this for numeric data representing a particular measurement. Choose a Measurement Symbol from a list of about 80 options in the following categories: acceleration, angles, area, concentration, data storage, density, energy/power, flow rate, frequency, length/distance, power/intensity, pressure, radiation, speed, temperature, time, volume, and weight/mass. Choose a Format to apply to values; the options are similar to those for Number. The measurement symbol is always displayed immediately after the formatted value.
  • Coordinates: Use this for a field that will store latitude or longitude values.
  • Date: Use this for a field that will contain date values. Select a Format, which includes a number of commonly used formats as well as the option for a completely custom date format.
Tip

The rounding selected for any numeric field (in the Format option) will always be used in values displayed in tables, tooltips, or annotations in the visualization. However, chart axes are always rounded optimally based on their range and adjacent values.
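The way the Percent type stores and displays values can be sketched as follows. This only illustrates the behavior described for the Percent type above. The function names are hypothetical, and the editor's own parsing may handle more input forms than this sketch does.

```python
def parse_percent(entered: str) -> float:
    """Convert an entered value to the raw decimal that is stored."""
    text = entered.strip()
    if text.endswith("%"):
        # A pasted "5%" is stored as 0.05.
        return float(text[:-1]) / 100
    # A plain number is stored as entered.
    return float(text)


def display_percent(stored: float, decimals: int = 0) -> str:
    """Format a stored decimal as a percentage string."""
    return f"{stored * 100:.{decimals}f}%"


for entered in ["0.05", "5", "5%"]:
    stored = parse_percent(entered)
    print(entered, "->", stored, "->", display_percent(stored))
```

If your source data holds percentages as whole numbers (5 meaning 5%), divide them by 100 before entering them, or paste them with the % sign attached.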

Data Aggregation

For the number, percent, currency, and measurement data types, you can optionally choose a data aggregation method, which determines how multiple values in the field are combined when needed. View the Data Aggregation page for more details about this functionality.
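As a rough illustration of what combining multiple values means, the sketch below groups records by one field and aggregates another. The method names used here (sum, average, min, max) are assumptions for illustration only. Refer to the Data Aggregation page for the methods that are actually available.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical method names; the real options may differ.
AGGREGATORS = {"sum": sum, "average": mean, "min": min, "max": max}


def aggregate(records, group_field, value_field, method="sum"):
    """Combine the values of value_field for records sharing a group_field value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_field]].append(record[value_field])
    combine = AGGREGATORS[method]
    return {key: combine(values) for key, values in groups.items()}


long_rows = [
    {"Date": 1990, "Country": "Germany", "GDP per capita": 22304},
    {"Date": 1990, "Country": "Sweden", "GDP per capita": 30594},
    {"Date": 2020, "Country": "Germany", "GDP per capita": 46749},
    {"Date": 2020, "Country": "Sweden", "GDP per capita": 52838},
]

print(aggregate(long_rows, "Date", "GDP per capita", "sum"))
print(aggregate(long_rows, "Date", "GDP per capita", "average"))
```

Here the two values for each date are combined into one, which is the kind of situation where an aggregation method is needed.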