We are starting a data science corner on GovData360. Send us an email at contact email to tell us what you would like to see here. Meanwhile, here are a few starters.

An R package to query TCdata360 and GovData360 data, metadata, and more

We developed data360r, an R package that lets users query our data, metadata, and more with single-line functions powered by the TCdata360 and GovData360 APIs. We designed the package to be easy to use for the full range of R users, from beginners to experts.

Go here for data360r installation, examples, and use cases in just 3 lines of R code. You can also read our blog for an overview of the benefits of using data360r.

Interactive exploratory tool featuring GovData360 data

Explore GovData360 data in a different way. We put together the Open GovData360 Topology tool, a cloud visualization that groups countries according to the main GovData360 indicators across time. This is an innovative way to look at big datasets. The tool is written in R and reads directly from the GovData360 API. This blog describes a few analytical possibilities of the tool.

Data Resources



Suggested Peers algorithm

Suggested Peers measures how similar countries are by computing the distance between them in an embedded country space produced by the t-SNE algorithm.

For each country, values are found for the following indicators:

  • Export Basket Composition
      • Export Product Share - WITS
      • Import Product Share - WITS
      • ICT goods exports (% of total goods exports) - WDI
      • Agricultural raw materials exports (% of merchandise exports) - WDI
  • Human Capital
      • Adult literacy rate, population 15+ years, both sexes (%) - WDI
      • Current education expenditure, total (% of total expenditure in public institutions) - WDI
      • Labor force, total - WDI
      • Unemployment, total (% of total labor force) - WDI
  • Physical Capital
      • Gross fixed capital formation (current US$) - WDI
      • Gross capital formation (current US$) - WDI
  • GDP per capita
      • GDP per capita (US$) - GCI
      • GDP per capita (constant 2005 US$) - WDI
      • GDP per capita, PPP (current international $) - WDI
  • Population
      • Population - IMF WEO
      • Population, total - WDI

Indicators that are reported by product are split into a separate 'indicator' for each product.

From these values, a data matrix A is constructed, where A_{ij} is the value of the jth indicator for the ith country. Not all indicators have values for all countries, so each missing value is filled in with the mean of all present values for that indicator.
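The mean-filling step described above can be illustrated with a minimal Python sketch. It is not the code used by Suggested Peers: the function name `impute_column_means`, the use of `None` to mark a missing value, and the decision to raise an error when an indicator has no values at all are assumptions made for illustration.

```python
def impute_column_means(matrix):
    """Fill missing entries (None) with the mean of the present values
    in the same column.

    matrix[i][j] holds the value of indicator j for country i,
    mirroring A_{ij} in the text. Returns a new list of rows.
    """
    n_cols = len(matrix[0])
    means = []
    for j in range(n_cols):
        present = [row[j] for row in matrix if row[j] is not None]
        if not present:
            # Assumption: the source does not say what happens when an
            # indicator has no values for any country, so we stop here.
            raise ValueError(f"indicator column {j} has no values")
        means.append(sum(present) / len(present))
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in matrix
    ]


# Toy matrix: 3 countries (rows) by 2 indicators (columns).
A = [
    [1.0, None],
    [3.0, 4.0],
    [None, 8.0],
]
A_filled = impute_column_means(A)
```

In this toy matrix, the missing entry in the first column is replaced by the mean of 1.0 and 3.0, and the missing entry in the second column by the mean of 4.0 and 8.0.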

t-SNE is then run on A with a Euclidean metric (with a perplexity of 40, an early exaggeration of 4, and a learning rate of 1000) to create a 2D embedding space.

A k-d tree is then created from the embedded countries, allowing us to perform (Euclidean) nearest-neighbor searches efficiently. For a given country, the similar countries are then defined to be:

  • The 4 nearest neighbors
  • The next 16 nearest neighbors, provided each is closer than a specified threshold (currently 1000)

This gives between 4 and 20 similar countries.
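The selection rule above can be sketched in a few lines of Python. This is a hedged illustration rather than the production implementation: it replaces the k-d tree with a brute-force sort over all countries, which returns the same neighbors but less efficiently. The function name `similar_countries`, its parameter names, and the dictionary layout of the embedding are hypothetical.

```python
import math


def similar_countries(embedding, country, always=4, extra=16, threshold=1000.0):
    """Return the peers of `country` in a 2D embedding.

    embedding: dict mapping a country code to its (x, y) coordinates.
    The `always` nearest neighbors are always returned; the next `extra`
    nearest are kept only if their Euclidean distance is below `threshold`.
    """
    origin = embedding[country]
    # Brute-force stand-in for the k-d tree query: rank every other
    # country by Euclidean distance to the query country.
    ranked = sorted(
        (math.dist(origin, point), code)
        for code, point in embedding.items()
        if code != country
    )
    peers = [code for _, code in ranked[:always]]
    peers += [
        code
        for dist, code in ranked[always:always + extra]
        if dist < threshold
    ]
    return peers
```

With the default arguments the result contains between 4 and 20 countries, provided the embedding holds at least 5 countries. Whether the threshold comparison is strict or inclusive is not stated in the text, so the strict `<` here is an assumption.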

Measuring Export Competitiveness Algorithm

The Measuring Export Competitiveness Country Comparator uses a methodology developed at the World Bank. Read more.

Available Indicators By Country