Specialized Master in Data Engineering and Big Data (Master Spécialisé en Ingénierie de données et Big Data)

  • Understand the analysis, design, implementation & monitoring of IT & Big Data architectures;
  • Leverage the most prevalent programming languages and their libraries for applied machine and deep learning;
  • Learn how to architect and deploy highly distributed data and computation clusters such as Hadoop, Spark or Microsoft Orleans;
  • Discover the DevOps world and set up a continuous integration architecture;
  • Train for and take two enterprise-level certification examinations.
  • IT Project Director
  • Director of IT Studies
  • Enterprise Architect – Functional Architect of Information Systems
  • Database Administrator
  • Project Management Officer
  • Data Miner – Data Scientist
  • Chief Data Officer
  • Data Protection Officer
  • Business Intelligence Consultant – Big Data

Data Management

Data Science

  • Databases
    • Relational Database Management Systems
      Using MySQL & Microsoft SQL Server: stand-alone and cluster deployments, integration in software, ETL, persistence frameworks
    • Advanced SQL for Data Wrangling
      Complex joins & subqueries, stored procedures & triggers
    • NoSQL databases
      Key-value store, Document store, Graph database, hybrid approaches with Apache Cassandra
  • Big Data
    • The Hadoop Ecosystem
      HDFS, MapReduce, YARN, Spark
    • Data Pipeline
      Classic ETL solutions – Cloud-based solutions with AWS Data Pipeline & AWS Kinesis – Open-source solutions with Apache Kafka & Beam
  • Machine learning
    • Foundations of Statistical Analysis & Machine Learning
      Distributions – Descriptive & Inferential Statistics – Classification & Regression Trees
    • Machine Learning with Python
      Language fundamentals & common frameworks for machine learning: NumPy, SciPy, scikit-learn
    • Machine Learning with R
      Language fundamentals, recursive and functional programming, data frames, common machine learning packages
  • Deep Learning
    • Deep Learning on GPU
      Recurrent Neural Networks, LSTM, Residual Networks

Distributed & Performance Programming

Operational Methodologies

  • Programming languages for Data Engineering
    • C & C++ for Distributed Computing
      Portable and scalable large-scale parallel applications using OpenMP & Open MPI
    • Java & Scala programming
      Java for MapReduce in Hadoop & Scala for Spark
    • Microsoft .NET for Distributed Computing
      Task Parallel Library – Asynchronous programming – Orleans framework for distributed systems
    • Scientific Programming
      Fundamentals in Fortran & MATLAB, Fortran for R packages, MATLAB with C/C++
  • Information Systems
    • Design of Information Systems
      Algorithmic approaches to relational data modelling and object-oriented programming
  • DevOps
    • Software Engineering Project Management & Quality
      PMBOK (PMI) – Agile Approaches – Kanban – Quality Metrics – Unit & Integration testing
    • DevOps & Continuous Integration
      The DevOps toolbox: Nagios, Consul, Docker, Ansible, GitHub – Leveraging Visual Studio for DevOps – Continuous Integration with Jenkins & Kubernetes
  • Cybersecurity
    • Cybersecurity
      System Security Design Patterns – Network security – Data at-rest and in-transit encryption – Code safety – Application to blockchain technologies

Cloud & IT

  • Cloud Computing
    • Amazon AWS & Microsoft Azure
      Preparation for the AWS Certified Solutions Architect – Associate certification – Comparative overview of Microsoft Azure
  • IT Fundamentals
    • Semantic Web
      Representing and querying web-rich data (RDF, SPARQL), Introducing Semantics in Data (RDFS, Ontologies), Tracing and following data history (VoID, DCAT, PROV-O)
    • IT Foundations for Data Engineering
      Computer Architecture – Operating Systems & Virtualisation – Networking