As a meeting point between data producers and the research community, CASD has developed solid expertise in security, in scientific processing that requires substantial computing power and, more generally, in making sensitive data available. The dedicated, patented technological solution it created more than 15 years ago now opens up many potential uses.
CASD offers four main services built around its secure data distribution technology.
Secure bubble for health data
CASD provides a secure service for making data from the National Health Data System (SNDS) available, in line with the requirements of the General Data Protection Regulation (GDPR) and the Référentiel de Sécurité des Données de Santé (RSDS).
For any use, the CASD can provide a dedicated secure bubble (see prices) with a customised software environment, agreed with the users, to enable them to carry out their processing in the best possible conditions while guaranteeing data protection and traceability. This system allows users to free themselves from the constraint of setting up their own environment and to limit administrative procedures by delegating part of the responsibilities to a trusted third party.
Secure external access to data
Strong and scalable processing power in constant evolution
CASD has over 15 years of experience in providing computing services for scientific research. Competition is fierce in this sector, and every day counts in a researcher’s career. Some computations take weeks, so even a 20% efficiency gain matters: a 20% shorter runtime on a five-week computation, for example, saves a full week. CASD operates a cluster for complex processing of large datasets.
CASD’s experience has been fully leveraged in the design of its hosted computing servers. Each component of the server infrastructure is selected with great care to match the system and the scientific data processing software as closely as possible:
- Hard drives (including SSD integration) and their setup,
- Interface cards for direct attachment of storage bays,
- Network connections (10 Gb/s),
- Graphics processing units (GPUs),
- Motherboards and accelerator cards.
The OS is optimized for processing large datasets, with a configuration that reduces disk access latency and speeds up transfers from disk to processor.
For several years now, advances in virtualization have made it possible to integrate this technology into CASD’s computing servers, thus reinforcing, when necessary, the possibilities for expansion and custom power allocation.
The architecture of the platform continues to evolve, adapting its computing power to each project’s requirements.
Secure Datalab for Proof of Concept
This DataLab can be set up for experimental projects carried out by internal or external staff. It can lead to one or more proofs of concept and lets an entity experiment before rolling out a new architecture more widely.
A data scientist with an SD-Box and a biometric card can access a DataLab to process large datasets. This DataLab relies on:
- An efficient Windows environment
- A Hadoop cluster with at least 4 physical nodes, quickly and easily extensible
- An up-to-date set of software: Spark, TensorFlow, R, RStudio, Dataiku DataScience Studio, QGIS, Python, SAS, Stata, SPSS, etc.
- Optionally, the following can also be set up:
  - Deep Learning servers (graphics processors)
  - Dedicated OpenStreetMap server
  - SQL Server
  - TeraMemory Server
Trusted third party for data matching
A trusted third party is an independent entity with no stake in the use of either the source data or the resulting data, and which guarantees the confidentiality of directly identifying data (data that can be used to identify a person).
CASD was selected under Equipex (a French public research funding scheme). Six lines of work were submitted, including one on matching methodology and on the trusted third party role CASD could play. The international jury emphasized this aspect and deemed it the most crucial of all.
To match data from different producers while ensuring confidentiality, a trusted third party is necessary.
Diagram of a standard data matching operation on hashed identifying data
CASD aims to guarantee the data confidentiality demanded by producers and by CNIL (the French data protection authority) when matching data.
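The idea of matching on hashed identifying data can be illustrated with a minimal sketch. It assumes, purely for illustration, that each producer replaces the direct identifier with a keyed hash (HMAC-SHA-256) computed with a secret agreed between the producers, and that the trusted third party joins records on that hash alone. The function names, sample identifiers and key handling are hypothetical and do not describe CASD's actual procedure.

```python
import hashlib
import hmac


def pseudonymize(records, secret_key):
    """Producer side: replace each direct identifier with a keyed hash."""
    hashed = {}
    for identifier, payload in records.items():
        # Normalize so that trivial formatting differences do not prevent a match.
        normalized = identifier.strip().upper().encode("utf-8")
        digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
        hashed[digest] = payload
    return hashed


def match(hashed_a, hashed_b):
    """Third-party side: join two sources on the hashed identifier only."""
    common = hashed_a.keys() & hashed_b.keys()
    # Output rows carry no identifier, hashed or not: only the merged payloads.
    return [{**hashed_a[h], **hashed_b[h]} for h in sorted(common)]


# Hypothetical key agreed between producers (an assumption of this sketch).
SHARED_KEY = b"example-key-for-illustration-only"

# Hypothetical source data keyed by a made-up direct identifier.
producer_a = {"ID-001": {"income": 32000}, "ID-002": {"income": 41000}}
producer_b = {"id-002 ": {"region": "North"}, "ID-003": {"region": "South"}}

matched = match(pseudonymize(producer_a, SHARED_KEY),
                pseudonymize(producer_b, SHARED_KEY))
print(matched)
```

In this sketch only the individual present in both sources appears in the result, and the party performing the join never handles the identifier in clear text.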
CASD’s main mission is to make confidential data available in a highly secure manner. Other CASD missions include data documentation, research support, training and data matching.
To make data available, CASD hosts its own servers, ensuring that data cannot be retrieved from this secure environment. Users must connect from a CASD-authorized IP address and use both an SD-Box and a biometric card (both supplied by CASD).
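The combination of access conditions described above (authorized IP address, SD-Box and biometric card) can be pictured as a check in which every condition must hold at once. The sketch below is only an illustration: the class, the function and the address ranges (taken from the ranges reserved for documentation) are hypothetical and say nothing about how CASD actually enforces access.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ConnectionAttempt:
    source_ip: str
    uses_sd_box: bool
    biometric_card_verified: bool


def is_allowed(attempt, authorized_networks):
    """Return True only if all access conditions are satisfied together."""
    address = ipaddress.ip_address(attempt.source_ip)
    ip_ok = any(address in ipaddress.ip_network(net)
                for net in authorized_networks)
    return ip_ok and attempt.uses_sd_box and attempt.biometric_card_verified


# Hypothetical allow-list (192.0.2.0/24 is reserved for documentation).
AUTHORIZED = ["192.0.2.0/24"]

ok = is_allowed(ConnectionAttempt("192.0.2.10", True, True), AUTHORIZED)
wrong_ip = is_allowed(ConnectionAttempt("198.51.100.7", True, True), AUTHORIZED)
no_card = is_allowed(ConnectionAttempt("192.0.2.10", True, False), AUTHORIZED)
print(ok, wrong_ip, no_card)
```

A connection that fails any single condition is refused, regardless of the others.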
CASD and the prerequisites defining a trusted third party
CASD has a staff of 27, organized into three business units:
- A Data Science and IT Infrastructure department responsible for day-to-day operations, data science R&D and IT security.
- A Statistics department specialized in collecting, formatting and documenting data, and in ensuring that the French statistical secrecy rules are respected (output checking).
- A Project Management Service responsible for all organizational and administrative issues.
All processing operations linked to data matching are carried out within the highly secure IT infrastructure, to which even CASD staff must connect through an SD-Box.
CASD staff are not researchers and have no vested interest in the data, its stakes, or research results.
All CNIL prerequisites to qualify as a trusted third party are met: independence, no conflict of interest, etc.
For any extra information regarding trusted third party operations: firstname.lastname@example.org
Finally, the CASD also offers other complementary services (such as the creation of a secure Dataroom), based on the expertise of its employees.
Numerous private players in banking, insurance, energy, transport and other sectors (see our references) have placed their trust in CASD to carry out their projects.
Do not hesitate to contact us:
- by phone at +33 (0)1 70 26 69 32
- or by e-mail at email@example.com