An R Package to query TCdata360 and GovData360 data, metadata, and more
We developed data360r, an R package that lets users query our data, metadata, and more with single-line functions powered by the TCdata360 and GovData360 APIs. We designed the package to be easy to use for the full range of R users, from beginners to experts.
Go here for data360r installation, examples, and use cases in just 3 lines of R code. You can also read our blog for an overview of the benefits of using data360r.
Interactive exploratory tool featuring GovData360 data
Explore GovData360 data in a different way. We put together the Open GovData360 Topology tool, a cloud visualization that groups countries according to the main GovData360 indicators across time. This is an innovative way to look at big datasets. The tool is written in R and reads directly from the GovData360 API. This blog describes a few analytical possibilities of the tool.
Packages and Libraries
Suggested Peers algorithm
Suggested Peers uses countries' similarities calculated by computing the distance between countries in an embedded country space following the t-SNE algorithm.
For each country, values are found for the following indicators:
Export Basket Composition
- Export Product Share - WITS
- Import Product Share - WITS
- ICT goods exports (% of total goods exports) - WDI
- Agricultural raw materials exports (% of merchandise exports) - WDI
Human Capital
- Adult literacy rate, population 15+ years, both sexes (%) - WDI
- Current education expenditure, total (% of total expenditure in public institutions) - WDI
- Labor force, total - WDI
- Unemployment, total (% of total labor force) - WDI
Physical Capital
- Gross fixed capital formation (current US$) - WDI
- Gross capital formation (current US$) - WDI
GDP per capita
- GDP per capita (US$) - GCI
- GDP per capita (constant 2005 US$) - WDI
- GDP per capita, PPP (current international $) - WDI
Population
- Population - IMF WEO
- Population, total - WDI
Indicators that are reported by product (such as Export Product Share) are split into one 'indicator' for each product.
From these values, a data matrix $X$ is constructed, where $X_{ij}$ is the value of the $j$th indicator for the $i$th country. Missing values (not every indicator has a value for every country) are replaced with the mean of all present values for that indicator.
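The mean-fill step can be illustrated with a minimal sketch. It assumes the matrix is held as a list of rows (one per country) with `None` marking a missing value. The function name `impute_column_means` and the sample numbers are hypothetical, not taken from the Suggested Peers code.

```python
def impute_column_means(matrix):
    """Return a copy of matrix with None replaced by the column (indicator) mean."""
    n_cols = len(matrix[0])
    means = []
    for j in range(n_cols):
        # Collect the values that are actually present for indicator j.
        present = [row[j] for row in matrix if row[j] is not None]
        if not present:
            # Assumption: an indicator with no values at all is treated as an error.
            raise ValueError(f"indicator column {j} has no observed values")
        means.append(sum(present) / len(present))
    # Build new rows so the input matrix is left unchanged.
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in matrix]


# Hypothetical 3 countries x 3 indicators, with one gap in each column.
X = [
    [1.0, None, 10.0],
    [3.0, 4.0, None],
    [None, 8.0, 30.0],
]
X_filled = impute_column_means(X)
print(X_filled)  # [[1.0, 6.0, 10.0], [3.0, 4.0, 20.0], [2.0, 8.0, 30.0]]
```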
t-SNE is then run on the data matrix $X$ with a Euclidean metric (perplexity of 40, early exaggeration of 4, and learning rate of 1000) to create a 2D embedding space.
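As an illustration of these settings, the sketch below passes them to scikit-learn's `TSNE`. This is an assumption: the text does not say which t-SNE implementation is used, and the random matrix here only stands in for the real country-by-indicator data.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical stand-in data: 60 "countries" x 12 "indicators", no missing values.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 12))

tsne = TSNE(
    n_components=2,          # 2D embedding space
    metric="euclidean",
    perplexity=40.0,         # must be smaller than the number of rows
    early_exaggeration=4.0,
    learning_rate=1000.0,
    random_state=0,          # fixed seed so repeated runs are comparable
)
embedding = tsne.fit_transform(X)  # one (x, y) point per country
print(embedding.shape)  # (60, 2)
```

t-SNE is stochastic, so without a fixed seed the embedded coordinates (and therefore the distances between countries) differ from run to run.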
A k-d tree is then created from the embedded countries, allowing us to efficiently perform (Euclidean) nearest neighbors search. For a given country, the similar countries are then defined to be:
- The 4 nearest neighbours
- And those of the 16 next nearest neighbours that are closer than a specified threshold (currently 1000)
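The selection rule above can be sketched as follows. To stay dependency-free, this sketch ranks neighbours with a brute-force sort rather than a k-d tree; the selected peers are the same, only the search is slower. It also assumes one reading of the rule: the 4 nearest are always kept, and the next 16 are kept only if their distance is strictly below the threshold. The function name, parameter names, and sample coordinates are hypothetical.

```python
import math


def suggested_peers(embedding, country, always=4, extra=16, threshold=1000.0):
    """Pick peers for `country` from a dict of name -> (x, y) embedded points."""
    origin = embedding[country]
    # Rank every other country by Euclidean distance in the embedding space.
    ranked = sorted(
        (math.dist(origin, point), name)
        for name, point in embedding.items()
        if name != country
    )
    # The nearest `always` neighbours are kept regardless of distance.
    peers = [name for _, name in ranked[:always]]
    # The next `extra` neighbours are kept only if closer than the threshold.
    peers += [name for dist, name in ranked[always:always + extra]
              if dist < threshold]
    return peers


# Hypothetical embedded points; distances from "A" are 100, 200, 300, 400, 800, 1500.
embedding = {
    "A": (0.0, 0.0),
    "B": (100.0, 0.0),
    "C": (0.0, 200.0),
    "D": (-300.0, 0.0),
    "E": (0.0, -400.0),
    "F": (800.0, 0.0),
    "G": (0.0, 1500.0),
}
print(suggested_peers(embedding, "A"))  # ['B', 'C', 'D', 'E', 'F']
```

With many countries, a k-d tree (for example `scipy.spatial.KDTree` and its `query` method) would return the same ranked neighbours without sorting every pairwise distance.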