- Bachelor's degree (BAC+3) in data engineering or computer science
- Master's degree in computer science or equivalent
- Admission based on application review
- Understand the analysis, design, implementation & monitoring of IT & Big Data architectures;
- Leverage the most prevalent programming languages and their libraries for applied machine and deep learning;
- Learn how to architect and deploy highly distributed data and computation clusters such as Hadoop, Spark or Microsoft Orleans;
- Discover the DevOps world and set up a continuous integration architecture;
- Train for and take two enterprise-level certification examinations:
- Amazon AWS *Cloud-Computing DSTI Chair*: Preparation for AWS Certified Solutions Architect – Associate
- Preparation for Cloudera Certified Data Engineer
- IT project director
- Head of IT development
- Enterprise architect – functional architect of the information system
- Database administrator
- Project management officer
- Data miner – Data scientist
- Chief Data Officer
- Data Protection Officer
- Business intelligence and big data consultant
Data Management
- Databases
  - Relational Database Management Systems: using MySQL & Microsoft SQL Server (stand-alone and cluster deployments, integration in software, ETL, persistence frameworks)
  - Advanced SQL for Data Wrangling: complex joins & subqueries, stored procedures & triggers
  - NoSQL Databases: key-value stores, document stores, graph databases, hybrid approaches with Apache Cassandra
- Big Data
  - The Hadoop Ecosystem: HDFS, MapReduce, YARN, Spark
  - Data Pipeline: Classic ETL solutions – Cloud-based solutions with AWS Data Pipeline & AWS Kinesis – Open-source solutions with Apache Kafka & Beam
Machine Learning
- Foundations of Statistical Analysis & Machine Learning: Distributions – Descriptive & Inferential Statistics – Classification & Regression Trees
Data Science
- Machine Learning with Python: language fundamentals & common frameworks for machine learning (NumPy, SciPy, scikit-learn)
- Machine Learning with R: language fundamentals, recursive and functional programming, data frames, common machine learning packages
- Deep Learning
  - Deep Learning on GPU: Recurrent Neural Networks, LSTM, Residual Networks
- Distributed & Performance Programming
- Operational Methodologies
- Programming Languages for Data Engineering
  - C & C++ for Distributed Computing: portable and scalable large-scale parallel applications using OpenMP & Open MPI
  - Java & Scala Programming: Java for MapReduce in Hadoop & Scala for Spark
  - Microsoft .NET for Distributed Computing: Task Parallel Library – Asynchronous programming – Orleans framework for distributed systems
  - Scientific Programming: fundamentals in Fortran & MATLAB, Fortran for R packages, MATLAB with C/C++
- Information Systems
  - Design of Information Systems: algorithmic approaches to relational data modelling and object-oriented programming
DevOps
- Software Engineering Project Management & Quality: PMBOK (PMI) – Agile Approaches – Kanban – Quality Metrics – Unit & Integration testing
- DevOps & Continuous Integration: The DevOps toolbox (Nagios, Consul, Docker, Ansible, GitHub) – Leveraging Visual Studio for DevOps – Continuous Integration with Jenkins & Kubernetes
Cybersecurity
- Cybersecurity: System Security Design Patterns – Network security – Encryption of data at rest and in transit – Code safety – Application to blockchain technologies
Cloud & IT
- Cloud Computing
  - Amazon AWS & Microsoft Azure: Preparation for the AWS Certified Solutions Architect – Associate certification – Comparative overview of Microsoft Azure
- IT Fundamentals
  - Semantic Web: representing and querying web-rich data (RDF, SPARQL), introducing semantics in data (RDFS, ontologies), tracing and following data history (VoID, DCAT, PROV-O)
  - IT Foundations for Data Engineering: Computer Architecture – Operating Systems & Virtualisation – Networking