DataKind harnesses the power of data science to help social sector experts move the needle on seemingly insurmountable issues such as poverty, healthcare access, and education. Modern scientific research for Sustainable Development depends on the availability of large amounts of relevant real-world data. However, no extensive global database currently associates existing data sets with the research domains they cover.
Microsoft AI for Humanitarian Action brought in Valorem Reply, a trusted Microsoft Azure Infrastructure and Data & AI Designated Solutions Partner, to help DataKind combine 23 different data sets into a single relational database for analyzing food security and insecurity. Beginning with Somalia data sets, Valorem’s team was tasked with building a repeatable model that DataKind could replicate and scale for use in other countries. The resulting SDG Data Catalogue can now extract and organize detailed knowledge about data sets that would otherwise stay hidden in plain sight within the continuous stream of research generated by the scientific community.
"We built the SDG Data Catalogue as an innovative solution to a critical problem facing modern research: the lack of a comprehensive, global database of datasets and their associated research domains. We are excited to see how DataKind uses the system to leverage data science for the greater good."
- Steve Cummings, VP Customer Success, Valorem Reply
CHALLENGE
- The existing environment is fragmented and difficult to manage and develop.
- Enhance the interoperability and utility of publicly available data to create tools and insights to improve targeting of communities in need.
- Measure the impact of food availability on population density and poverty in Africa using public data.
SOLUTION
- Upgrade and modernize DataKind’s infrastructure using an Azure Landing Zone.
- Ingest 13 different public data sources.
- Build a relational database in PostgreSQL and transform the data for DataKind and Microsoft research scientists to use in their modeling.
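The ingest-and-relate step above can be illustrated with a minimal sketch. It is not Valorem's or DataKind's actual design: the table names, column names, source names, and values are all hypothetical placeholders, and SQLite (from the Python standard library) stands in for PostgreSQL so that the example runs without a database server. It shows one common pattern for combining several public sources into one relational store: normalize each source's region labels to a shared key, then store every measurement as a row that references both its region and its source.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions, not the real
# SDG Data Catalogue design. SQLite stands in for PostgreSQL here.
SCHEMA = """
CREATE TABLE region (
    region_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE source (
    source_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE observation (
    region_id INTEGER NOT NULL REFERENCES region(region_id),
    source_id INTEGER NOT NULL REFERENCES source(source_id),
    indicator TEXT    NOT NULL,
    year      INTEGER NOT NULL,
    value     REAL    NOT NULL,
    PRIMARY KEY (region_id, source_id, indicator, year)
);
"""


def get_or_create(conn, table, name):
    """Return the id for `name` in a lookup table, inserting it if missing."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(
        f"SELECT {table}_id FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def ingest(conn, source_name, rows):
    """Load (region, indicator, year, value) rows from one source."""
    source_id = get_or_create(conn, "source", source_name)
    for region, indicator, year, value in rows:
        # Normalize region labels so differently formatted sources line up.
        region_id = get_or_create(conn, "region", region.strip().title())
        conn.execute(
            "INSERT OR REPLACE INTO observation VALUES (?, ?, ?, ?, ?)",
            (region_id, source_id, indicator, year, value),
        )
    conn.commit()


def build_demo():
    """Create an in-memory database with two placeholder sources."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Placeholder rows only; these are not real measurements.
    ingest(conn, "placeholder_price_survey", [
        ("region a", "staple_price", 2020, 1.0),
        ("Region B ", "staple_price", 2020, 2.0),
    ])
    ingest(conn, "placeholder_population_grid", [
        ("REGION A", "population_density", 2020, 10.0),
    ])
    return conn


def indicators_by_region(conn, region):
    """Return (indicator, value) pairs for one region across all sources."""
    query = """
        SELECT o.indicator, o.value
        FROM observation o
        JOIN region r ON r.region_id = o.region_id
        WHERE r.name = ?
        ORDER BY o.indicator
    """
    return conn.execute(query, (region,)).fetchall()


if __name__ == "__main__":
    demo = build_demo()
    print(indicators_by_region(demo, "Region A"))
```

Keeping one long `observation` table, rather than one wide table per source, is only one possible design choice. It makes adding a new source or a new country a matter of inserting rows rather than altering the schema, which fits the stated goal of a repeatable model.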
RESULT
- Evidence-based decision making.
- Increased efficiency.
- Enhanced data literacy.
- Repeatable data science model for food security, ready for multi-country expansion.