Data for Canada

Bridging the gap between Canadian open data availability and usability with high-performance, analysis-ready assets.

www.dataforcanada.org

Service Status

Check the current status of all our services here: 👉 status.dataforcanada.org

Mission

Data for Canada exists to bridge the gap between open data availability, resiliency, and usability. We curate, clean, and re-engineer high-value Canadian datasets into high-performance, analysis-ready formats for data engineers, researchers/scientists, developers, and systems.

The Problem

Canada produces vast amounts of open data, from foundational road networks to federal census statistics and orthoimagery. However, these datasets are often locked in legacy formats, scattered across fragmented portals, or published in structures that require significant engineering effort to normalize. For our target audience, "time-to-insight" is often bottlenecked by data preparation.

Data Stability: Beyond technical barriers, open data can be ephemeral. Links break, portals are reorganized, and priorities shift, causing valuable datasets to vanish from the public web. This instability makes it risky to build long-term research or software directly on top of the original data providers.

What Guides Us

We prioritize our work in a utilitarian manner, aiming to provide the greatest good to the greatest number of people. Our approach is guided by the principles of digital preservation and the need to keep public information accessible over the long term.

The Solution

We act as the transformation layer. We aggregate datasets with permissive licenses and process them into "digestible" standards optimized for modern downstream applications.

  • For Data Engineers, Researchers/Scientists, and Developers: Skip the cleaning phase. Access normalized, documented data ready for analysis.
  • For Systems: Standardized data structures designed to feed directly into pipelines, data warehouses, and downstream services.

Our Stewardship: Data for Canada takes ownership of the datasets we create, from start to finish. We ensure that data remains consistent and available, acting as a stable foundation for your work, and allowing for reliable analysis across time and space. By decoupling access from the original source, we ensure your pipelines don't break even if the upstream location changes or expires.
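
As a rough illustration of what decoupling access from the original source can look like on the consumer side, the sketch below tries an ordered list of locations and returns the first one that responds. It is a minimal example under stated assumptions, not Data for Canada's actual client or API: the function names and the `example.org` URLs are hypothetical placeholders.

```python
import urllib.request
from typing import Callable, Iterable


def fetch_url(url: str, timeout: float = 10.0) -> bytes:
    """Download a URL with the standard library; raises OSError on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_first_available(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_url,
) -> tuple[str, bytes]:
    """Try each location in order; return (url, payload) for the first that works."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except (OSError, ValueError) as exc:
            # Network errors (URLError is an OSError) and malformed URLs
            # are recorded, then the next location is tried.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("no location succeeded:\n" + "\n".join(errors))


if __name__ == "__main__":
    # Hypothetical locations, listed in order of preference.
    locations = [
        "https://mirror.example.org/dataset.parquet",
        "https://upstream.example.org/dataset.parquet",
    ]
    used_url, payload = fetch_first_available(locations)
    print(f"fetched {len(payload)} bytes from {used_url}")
```

Passing the `fetch` callable as a parameter keeps the fallback logic independent of the transport, so the same loop could wrap an object-store client or a peer-to-peer fetcher instead of plain HTTP.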

Target Software Ecosystem

We adopt an open-source-first approach, while supporting proprietary solutions to the best of our ability to ensure maximum accessibility. We target the latest versions of these software packages (e.g., modern GDAL/OGR) to take advantage of their newest improvements.

Our data is optimized for:

| Category | Recommended Stack & Libraries |
|---|---|
| Core & Desktop | GDAL/OGR, QGIS, QField |
| Python & Data | GeoPandas, Lonboard, DuckDB, SedonaDB |
| Database | PostgreSQL with PostGIS and pg_mooncake extensions |
| Serving | GeoServer, Martin, ZOO-Project |
| Serverless | Cloudflare Workers, AWS Lambda, Google Cloud Run functions |
| Enterprise | ArcGIS Pro, ArcGIS Enterprise |

Explore Sample Datasets

See our processing pipeline in action. View samples and documentation for our current priority processes:

High-Level Overview

Note: The data sources in the diagram below are prioritized from left to right, reflecting our current focus on processing high-value statistical, foundational, and orthoimagery datasets first.

[Diagram: High-Level Overview]

Get Involved

We are actively looking for new members and partners to help shape this project.

🇨🇦 Infrastructure Support: Selective Mirroring

To support data sovereignty, safeguard against data loss, and improve local access speeds, we are currently seeking partners for selective mirroring in Canada. See our Infrastructure.

We are looking for academic institutions, research organizations, or infrastructure partners interested in hosting mirrors of specific, high-value dataset subsets. If you have bandwidth and storage capacity to spare for the Canadian open data ecosystem, please contact us.

Contributing & Feedback

Right now, we primarily need feedback on our file naming conventions, our datasets and their underlying processes, and the infrastructure used to generate them. If you have thoughts on data quality, format optimization, or pipeline improvements, we want to hear from you.

  • Discussions: Head over to #dataforcanada:matrix.org to chat, or go to the individual process GitHub repos to comment on specific issues.

License

This project is licensed under the MIT License.

Pinned

  1. www.dataforcanada.org (Public, Mermaid): Source code for the main Data for Canada website.

  2. metadata-labs (Public): A place to experiment with metadata dissemination and its associated processes.

  3. process-statcan-data-labs (Public, Jupyter Notebook): Process various Statistics Canada datasets.

  4. process-foundation-labs (Public, Shell)

  5. process-orthoimagery-labs (Public, Shell): Turn raw raster data into production-ready maps.

  6. process-field-imagery-labs (Public): Open, privacy-first pipeline for field and oblique imagery of Canada across time and space, with time-aware metadata.

Repositories

Showing 10 of 18 repositories
  • .github (Public). Updated Feb 18, 2026.
  • www.dataforcanada.org (Public): Source code for the main Data for Canada website. Mermaid, MIT. Updated Feb 18, 2026.
  • status.dataforcanada.org (Public): Status page for Data for Canada websites. Markdown, MIT. Updated Feb 18, 2026.
  • decentralized-distribution-labs (Public): Resilient, decentralized data distribution via Source Cooperative, Zenodo, Internet Archive, Data for Canada infrastructure, and community, using peer-to-peer delivery and FAIR metadata. Shell, MIT. Updated Feb 17, 2026.
  • dataforcanadapkgs-labs. Nix, MIT. Updated Feb 12, 2026.
  • dataforcanadapkgs. MIT. 1 issue needs help. Updated Feb 12, 2026.
  • process-statcan-data-labs (Public): Process various Statistics Canada datasets. Jupyter Notebook, MIT. Updated Feb 9, 2026.
  • static.dataforcanada.org. Updated Feb 9, 2026.
  • process-orthoimagery-labs (Public): Turn raw raster data into production-ready maps. Shell, MIT. 1 issue needs help. Updated Feb 9, 2026.
  • process-elevation-labs. MIT. Updated Feb 6, 2026.