Data Sources

GIS building data

In France, the GIS building data is taken from the ‘Bâtiment’ table of the ‘Bâti’ section of the BDTOPO V3 database produced by IGN. This table contains building footprint polygons and heights obtained from aerial images, as well as descriptive attributes obtained by merging the GIS data with cadastre and tax registers. The following attributes are available:

  • NATURE : architectural type of the building

  • USAGE1 : main use of the building (residential, services, industrial, agriculture, etc.)

  • USAGE2 : secondary use of the building

  • DATE_APP : construction date

  • LEGER : set to True if the building does not have foundations or is open on one side

  • NB_LOGTS : number of dwellings

  • NB_ETAGES : number of floors

  • MAT_MURS : wall material

  • MAT_TOITS : roof material

  • HAUTEUR : height of the building (between ground and bottom of the roof)

  • Z_MIN_SOL : minimum altitude of the ground

  • Z_MAX_SOL : maximum altitude of the ground

  • Z_MIN_TOIT : minimum altitude of the roof

  • Z_MAX_TOIT : maximum altitude of the roof

The datasets can be downloaded by department from the IGN website.

../_images/bdtopo_capture.png

A view of BDTOPO footprint data with an OpenStreetMap layer

This dataset plays a central role in the buildingmodel methodology: it provides a simplified geometry (3D right prism) of the buildings to simulate, including the adjacencies between them and the number of floors. In addition, the descriptive data is used to pair the buildings with consistent district-level census data and energy diagnosis samples.
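To make the right-prism simplification concrete, the sketch below derives a few quantities from a footprint polygon, HAUTEUR and NB_ETAGES. It is an illustration only, not buildingmodel's implementation: the `Building` class, the function names and the sample rectangle are hypothetical, coordinates are assumed to be in metres in a projected coordinate system, and adjacencies between buildings are ignored (the wall area is the gross lateral area of the prism).

```python
import math
from dataclasses import dataclass


@dataclass
class Building:
    """Hypothetical container for the fields used in this sketch."""
    footprint: list   # exterior ring as (x, y) tuples, in metres
    hauteur: float    # HAUTEUR attribute, in metres
    nb_etages: int    # NB_ETAGES attribute


def footprint_area(ring):
    """Polygon area by the shoelace formula (ring may be open or closed)."""
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def footprint_perimeter(ring):
    """Sum of edge lengths, closing the ring back to its first vertex."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))


def prism_summary(building):
    """Quantities of the right prism extruded from the footprint."""
    area = footprint_area(building.footprint)
    perimeter = footprint_perimeter(building.footprint)
    return {
        "footprint_area": area,
        "volume": area * building.hauteur,
        "floor_area": area * building.nb_etages,
        "gross_wall_area": perimeter * building.hauteur,
    }


# Hypothetical 10 m x 8 m building, 6 m high, with 2 floors.
example = Building(
    footprint=[(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)],
    hauteur=6.0,
    nb_etages=2,
)
summary = prism_summary(example)
```

Walls shared with adjacent buildings would have to be subtracted from the gross wall area before any heat loss calculation.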

Census data

Countries regularly conduct population and housing censuses to obtain accurate data on the characteristics, location and housing conditions of their population. Of particular interest for buildingmodel is the data on housing conditions at the local level. In France, this is the “Fichier Détail Logements Ordinaires”, which contains a description of each dwelling, anonymized at the district level. The following dwelling attributes are relevant to buildingmodel:

  • IRIS : the administrative code of the district

  • ACHL : period of construction of the building

  • CATL : category of occupancy of the dwelling (main residence, secondary home, vacant, etc.)

  • CHFL : building-level heating system

  • CMBL : main heating fuel

  • INPER : number of occupants of the dwelling

  • SURF : area class of the dwelling

  • TYPL : type of dwelling (individual house, apartment, etc.)

The dataset can be downloaded by department from the INSEE website.

../_images/census_construction_year_residential_type.png

The number of dwellings in the census dataset by construction year class and residential type
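A tabulation like the one in the figure above can be sketched with the standard library alone. The example below counts rows by ACHL and TYPL. The sample content, the semicolon delimiter and the placeholder values are assumptions made for illustration: the real file uses its own coded values, and this sketch counts raw rows and ignores any statistical weight the file may carry.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: the IRIS, ACHL and TYPL values are placeholders,
# not actual INSEE codes.
SAMPLE = """IRIS;ACHL;TYPL
IRIS_A;P1;house
IRIS_A;P1;house
IRIS_A;P2;apartment
IRIS_B;P1;apartment
IRIS_B;P2;apartment
"""


def count_dwellings(lines, delimiter=";"):
    """Count dwelling rows by (construction period, dwelling type)."""
    counts = Counter()
    for row in csv.DictReader(lines, delimiter=delimiter):
        counts[(row["ACHL"], row["TYPL"])] += 1
    return counts


counts = count_dwellings(io.StringIO(SAMPLE))
for (period, dwelling_type), n in sorted(counts.items()):
    print(period, dwelling_type, n)
```

Passing an open file object instead of `io.StringIO` applies the same function to a downloaded file, provided the delimiter matches.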

Energy diagnosis data

Energy diagnoses are performed to estimate the energy consumption related to heating, cooling, ventilation and domestic hot water in buildings. Two main methodologies are used, which differ in how the energy consumption is calculated:

  • Bill-based : energy bills are collected, analyzed and corrected to estimate the energy consumption

  • Audit-based : an energy audit is performed to determine the characteristics of the dwelling envelope, ventilation, heating and domestic hot water systems. The result of the audit is then used as input to a conventional calculation method based on static thermal loss modelling to estimate the energy consumption of the dwelling.

Additional data describing the dwelling (date of construction, number of occupants, living area, address) are collected and will be used in buildingmodel to match the energy diagnosis records with similar buildings derived from GIS and census data.
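As an illustration of what such a matching step could look like, the sketch below selects, for one dwelling, the diagnosis record with the same construction period and the closest living area, using the number of occupants as a tie-breaker. This is not buildingmodel's matching procedure: the field names, the fallback rule and the sample records are all hypothetical.

```python
def match_diagnosis(dwelling, records):
    """Return the diagnosis record most similar to the dwelling.

    Records from the same construction period are preferred. If none
    exist, all records are considered (an assumption of this sketch).
    """
    if not records:
        raise ValueError("no diagnosis records to match against")
    same_period = [
        r for r in records
        if r["construction_period"] == dwelling["construction_period"]
    ]
    candidates = same_period or records
    return min(
        candidates,
        key=lambda r: (
            abs(r["living_area"] - dwelling["living_area"]),
            abs(r["occupants"] - dwelling["occupants"]),
        ),
    )


# Hypothetical diagnosis records and dwelling.
records = [
    {"id": "d1", "construction_period": "1946-1970", "living_area": 62.0, "occupants": 2},
    {"id": "d2", "construction_period": "1946-1970", "living_area": 85.0, "occupants": 4},
    {"id": "d3", "construction_period": "1991-2005", "living_area": 70.0, "occupants": 3},
]
dwelling = {"construction_period": "1946-1970", "living_area": 70.0, "occupants": 3}
best = match_diagnosis(dwelling, records)
```

Here the record `d3` has exactly the right area but belongs to another construction period, so `d1` is selected.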

In France, energy diagnoses have been mandatory since 2006 for dwellings that are sold or rented. ADEME, an agency in charge of improving energy efficiency, collects the records and makes them available on the ADEME website.

While the availability of more than 5 million energy diagnosis records is welcome, great care must be taken when using this database, for the following reasons:

  • Bill-based diagnoses cannot distinguish between occupant behavior (such as heating setpoint, domestic hot water use, presence duration) and the dwelling's intrinsic energy performance

  • Audit-based diagnoses require the collection of dozens of characteristics that can be hard to obtain, and auditors are allowed to use default values when they judge that the data cannot be retrieved

  • Unlike GIS and census data collection, which is supervised by public entities, energy diagnoses are performed by companies commissioned by dwelling owners. This creates a conflict of interest, as the result of the energy diagnosis has a direct impact on the market value of the dwelling.

  • A selection bias arises because the energy diagnosis is mandatory only when the dwelling is sold or rented

../_images/dpe_consumption_by_class.png

The distribution of primary energy consumption per unit of living area for the complete energy diagnosis database of residential dwellings.

The large discontinuities around the limits of each energy class illustrate the conflict of interest described above: when the estimated consumption is close to but above the lower bound of an energy class, a significant share of auditors will falsify the results to fall just below that bound and gain an energy class.
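One simple way to quantify such discontinuities is to compare the number of records just below and just above each class limit. The sketch below does this for a list of consumption values. The class limits used here (50, 90, 150, 230, 330 and 450 kWh/m²/year) are an assumption and should be replaced by the limits that apply to the records being analysed. The window width and the sample values are hypothetical.

```python
# Assumed class limits in kWh/m2/year; adapt to the dataset being analysed.
ASSUMED_LIMITS = (50, 90, 150, 230, 330, 450)


def threshold_counts(consumptions, limits=ASSUMED_LIMITS, window=5.0):
    """Count records within `window` below and above each class limit.

    A value equal to the limit is counted on the "below" side, i.e. in
    the better class. Returns {limit: (n_below, n_above)}.
    """
    result = {}
    for limit in limits:
        below = sum(1 for c in consumptions if limit - window < c <= limit)
        above = sum(1 for c in consumptions if limit < c <= limit + window)
        result[limit] = (below, above)
    return result


# Hypothetical consumption values bunched just below the 150 limit.
sample = [88, 92, 146, 148, 149, 150, 150, 151, 153]
counts = threshold_counts(sample, limits=(90, 150))
for limit, (below, above) in counts.items():
    print(f"limit {limit}: {below} just below, {above} just above")
```

For a smooth distribution the two counts should be of similar size, so a large excess on the "below" side is the signature of the bunching visible in the figure.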

../_images/dpe_sample_selection.png

The number of energy diagnosis records in the ADEME dataset by construction year class and residential type. The comparison with the equivalent data from the census dataset illustrates the selection bias in energy diagnosis.
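A common way to compensate for this kind of selection bias is post-stratification: each diagnosis record receives a weight equal to the share of its stratum in the census divided by the share of that stratum in the diagnosis dataset. The sketch below is offered as an illustration of that general technique, not as a description of what buildingmodel does. The strata labels and counts are hypothetical.

```python
def poststratification_weights(census_counts, diagnosis_counts):
    """Weight per stratum = census share / diagnosis share.

    Strata with no diagnosis record are skipped, because no weight can
    make an absent stratum representative.
    """
    census_total = sum(census_counts.values())
    diagnosis_total = sum(diagnosis_counts.values())
    weights = {}
    for stratum, n_diag in diagnosis_counts.items():
        if n_diag == 0:
            continue
        n_census = census_counts.get(stratum, 0)
        # Same as (n_census / census_total) / (n_diag / diagnosis_total).
        weights[stratum] = (n_census * diagnosis_total) / (n_diag * census_total)
    return weights


# Hypothetical counts by (construction year class, residential type).
census = {
    ("before 1946", "house"): 300,
    ("before 1946", "apartment"): 200,
    ("after 1990", "house"): 500,
}
diagnosis = {
    ("before 1946", "house"): 10,
    ("before 1946", "apartment"): 40,
    ("after 1990", "house"): 50,
}
weights = poststratification_weights(census, diagnosis)
```

A weight above 1 marks a stratum that is under-represented in the diagnosis dataset relative to the census, and a weight below 1 marks one that is over-represented.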

Climate data

The following climate measurements at a 1-hour resolution are used as inputs to buildingmodel:

  • air temperature

  • direct normal radiation

  • diffuse horizontal radiation

  • dew point temperature

  • opaque sky cover

Buildingmodel includes a wrapper around pvlib's read_epw function to allow easy use of the EPW file format.
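For reference, the five measurements above can be read directly with pvlib, as sketched below. This calls `pvlib.iotools.read_epw` itself and is not buildingmodel's wrapper, whose name and signature are not shown here. The function name `load_climate` and the file path in the comment are hypothetical. The column names are those returned by `read_epw`.

```python
# Column names used by pvlib's read_epw for the measurements listed above.
CLIMATE_COLUMNS = {
    "air temperature": "temp_air",
    "direct normal radiation": "dni",
    "diffuse horizontal radiation": "dhi",
    "dew point temperature": "temp_dew",
    "opaque sky cover": "opaque_sky_cover",
}


def load_climate(epw_path):
    """Read an EPW file and keep only the five hourly series of interest.

    Returns a (DataFrame, metadata dict) pair.
    """
    # Imported here so the mapping above can be reused without pvlib.
    from pvlib.iotools import read_epw

    data, metadata = read_epw(epw_path)
    return data[list(CLIMATE_COLUMNS.values())], metadata


# Example call (hypothetical path):
# climate, metadata = load_climate("path/to/weather_file.epw")
```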

Note

When estimating average energy consumption of buildings over long periods of time, it is recommended to use synthetic weather data sets designed to encapsulate the wide range of conditions in one typical or reference year. Such data sets are made available for hundreds of locations at Climate.OneBuilding.Org.