Since its inception, CASD has been committed to opening transnational access to confidential data for research purposes. With this in mind, access has always been possible from countries of the European Union and its associated countries under the same conditions as for researchers resident in France. Access is also now open from North America, subject to some specific provisions. CASD has actively participated in European and international projects aimed at facilitating the use of confidential data across national borders.
Secure access from European Union countries and associated countries
For residents of the European Union and its associated countries, the accreditation criteria and procedures are identical to those for residents of France (authorization procedure). Accreditation with the Statistical Confidentiality Committee and enrollment at CASD require a short trip to France. Secure access is then provided from the applicant’s host institution through an SD-Box sent by post.
INSEE data now accessible from North America
To support research, INSEE has also authorized access to its data through CASD SD-Boxes located in the United States and Canada, for the following categories of researchers:
- Researchers who are citizens of a Member State of the European Union while working at a North American university or research center;
- North American researchers working as part of a research project in partnership with a center or university in a Member State of the European Union.
CASD strives to make it easier for researchers in Europe and internationally to access confidential data across national boundaries. It has participated in two European projects on the conditions and feasibility of a network of secure centers for both national and European data (DwB, an FP7 project, for national data and ESSnet, funded by Eurostat, for European data). CASD is coordinating IDAN (International Data Access Network), which builds cooperation between 6 secure centers in 4 countries (France, Germany, the Netherlands, and the United Kingdom) so that researchers can more easily use confidential data from these centers within the same project. Partnerships are gradually being set up, such as the one with the German Federal Employment Agency FDZ / IAB.
CASD is a partner in the ESSnet SCFE project (Sharing Common Functionalities in the ESS), coordinated by INSEE and funded by Eurostat through the European Commission. This project aims to promote the sharing of experience and the development of common solutions among statistical institutes and statistical authorities within the ESS, with a view to a “customer-oriented” architecture.
CASD works with INSEE on reusing INSEE’s RDF (Resource Description Framework) resources to retrieve the main metadata from the source so that it can be made available to users.
CESSDA (Consortium of European Social Sciences Data Archives) has a mission: to coordinate access to data in Europe for SSH research through best practices in archiving, documentation and access to databases for member countries. It is one of five large SSH research infrastructures recognized by the ESFRI (European Strategy Forum on Research Infrastructures) process.
CASD contributes to CESSDA developments and activities through PROGEDO (PROduction et GEstion des DOnnées en SHS, in English: Production and Management of SSH Data).
France, founding member of the European Network
Dating back to 1976, CESSDA has been an official European research infrastructure since 2013 and was granted the status of ERIC (European Research Infrastructure Consortium) in 2017, involving ministries or Research Councils in its governance.
France is a founding member of CESSDA ERIC with its PROGEDO infrastructure, whose CESSDA France-Réseau Quetelet department is to act as the French National Service Provider for CESSDA.
Transnational data access
Few CESSDA partner databanks currently host detailed data that require secure access, such as data from public statistics and public administrations. Given the research potential of such data, making them available will be of the utmost importance for CESSDA once its initial construction phase (creating a catalog and a European database) is complete.
CASD involvement and long-term interest
CASD is currently involved in two CESSDA activities within the scope of its construction phase. CASD thus follows the work on the metadata standards that will be recommended to CESSDA partners, and on the establishment of a single access point (one-stop shop) enabling researchers to quickly access every available resource.
In the long run, CASD is interested in CESSDA’s role in deploying a distributed secure infrastructure as outlined in the DwB project for confidential national data and in the ESSnet DARA project for European confidential data (Eurostat), two projects for which CASD has carried out a Proof of Concept.
Metadata for national data in Europe…
CIMES (Centralising and Integrating Metadata from European Statistics) provides an overview of official microdata made available for statistical research in Europe.
This database describes access procedures, data documentation (public files, scientific files and secure files), access conditions and links to data supplier websites. CIMES does not provide access to the data; it only contains metadata. CIMES thus gathers information currently dispersed all over Europe, held either by national statistical institutes (NSIs) or by data archives, and stores it in a structured database compliant with DDI.
…Available thanks to controlled and specific documentation
The documentation distinguishes between series, studies and data sets. A series is a group of studies that reflects a process of collecting cross-sectional, longitudinal, or repeated data. It represents a continuous data collection process carried out by an NSI, in which each collection is a study. A data set is a group of files made available. For example, the Labour Force Survey (LFS) is a series, the 2007 data collection is a study, and the data set is a set of available files, such as the scientific use file of the Labour Force Survey (EPA-SUF) 2007.
Each of these levels is documented by a list of fields and a controlled vocabulary. CIMES currently covers 277 series, comprising 1,893 studies and 2,244 data sets in 31 European countries, and also supplies links to the main integrated microdata in Europe.
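The three documentation levels can be pictured as a simple nested structure. The following is a minimal sketch in Python, not the actual CIMES schema: the class names, field names and the access-type values are assumptions chosen for illustration, and the real fields and controlled vocabulary are defined by CIMES and DDI.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataSet:
    """A group of files made available together (hypothetical fields)."""
    title: str
    access_type: str  # assumed values: "public", "scientific", "secure"


@dataclass
class Study:
    """One data collection within a series, for example one survey year."""
    year: int
    data_sets: List[DataSet] = field(default_factory=list)


@dataclass
class Series:
    """A continuous data collection process; each collection is a study."""
    name: str
    studies: List[Study] = field(default_factory=list)

    def count_data_sets(self) -> int:
        # Total number of data sets across all studies in this series.
        return sum(len(study.data_sets) for study in self.studies)


# The example from the text: the LFS is a series, the 2007 collection
# is a study, and the scientific file is one data set of that study.
lfs = Series(
    name="Labour Force Survey (LFS)",
    studies=[
        Study(year=2007, data_sets=[DataSet("EPA-SUF 2007", "scientific")]),
    ],
)
```

Nesting data sets inside studies and studies inside series mirrors the way each level is documented separately while remaining linked to its parent.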
CIMES is an FP7 DwB product aiming to increase the use of official microdata in Europe. The Web application for CIMES is developed by CASD.
ADP, CED, CNRS-Réseau Quetelet, FORS, GESIS, RODA, ONS and UL have been involved in CIMES, collecting and structuring metadata during DwB. The work is still under way, supported by CNRS and CASD.
Facilitating transnational access to confidential data
Researchers face numerous obstacles when they want to work on detailed (confidential) data from a country in which they do not reside. Authorization procedures and access modalities vary, and only a small number of countries authorize access from another country. With 29 partners in 14 European countries, involving national statistical institutes and social science databanks, the Data without Boundaries (2011-2015) FP7 project laid the foundations for cooperation to ease transnational access to confidential data for research purposes.
A Proof of Concept for a secure distributed network
CASD, a partner in the project alongside INSEE, worked with DESTATIS (the German federal statistics institute) and GESIS to build a proof of concept for a secure distributed network, which helps authorized users work from a one-stop shop on transnational data held in various secure centers.
This proof of concept shows that it is possible to set up an integrated, highly secure infrastructure that is flexible enough to take into account the various national legislations and the resulting constraints. At a minimum, it enables everyone to use the same equipment, the same procedures, and the same access point. Such a network could help a research team spread across many countries to work together on multiple sources and combine analysis time.
Towards a step-by-step construction
DwB partners have also worked concretely on solutions to immediately improve the information and documentation on national data in Europe. In particular, CASD built the CIMES tool (Centralising and Integrating Metadata from European Statistics), a catalog of basic metadata in English and in the international DDI format. It covers the data available in the various countries and the different types of files (public, de facto anonymized, confidential), with their procedures and links to data producers.
CASD continues to build this research infrastructure step by step, with partners and as close as possible to researchers’ projects and needs, developing bilateral and multilateral cooperation to make concrete progress.
CASD built an infrastructure for secure remote access (proof of concept) for the DARA project with Germany, the UK, Hungary and Portugal.
CASD presented the DARA European access centre pilot at the WGSC (Working Group on Statistical Confidentiality) in Luxembourg (Eurostat). The pilot was praised, and project partners were highly satisfied with it.
[Schematics of the DARA European pilot]