Work Experience

  • Data Scientist
    CEA Paris-Saclay, France (Sep 2021-Present)

    • Built a county-level analysis of natural gas consumption in the US to find correlations between temperature and consumption, and their dependence on socio-economic variables such as income, employment, and housing

    • Developed interactive visualization tools and deployed them as a cloud application (using Plotly, Streamlit, Voila, and Heroku)

    • Designing methods to quantify regional budgets of anthropogenic CO2 emissions using ground-based near real-time activity data on energy and mobility. Carbon Monitor: https://carbonmonitor.org

  • Machine Learning Intern
    Orange Labs, Cesson-Sévigné, France (Feb 2021-Aug 2021)

    • Validated and enhanced data within a knowledge graph in the Internet of Things domain using state-of-the-art algorithms for deep graph learning and knowledge graph embeddings

    • Achieved 98% ROC-AUC and > 80% average precision in predicting missing links in the graph from the learned embeddings, using an end-to-end machine learning pipeline

  • Machine Learning Intern
    Laboratoire Hubert Curien, Saint-Étienne, France (Apr 2020-Jul 2020)

    • Generated dynamic graph embeddings (numerical representations of proteins) using deep autoencoders and predicted missing protein interactions in the network with an accuracy of 80% using an SVM

    • Identified important protein clusters (ranging from 1 to 24 clusters depending on the cell type) from dynamic graph embeddings using K-NN and clustering algorithms

Education

Master's in Machine Learning and Data Mining
Université Jean Monnet (Average: 15.6/20) (2019-2021)

Languages and Tools

  • Python, R, MATLAB, Git, SQL, LaTeX, HTML

  • Power BI, Docker, PySpark, Linux, Cloud Computing

  • TensorFlow, PyTorch, Keras, scikit-learn, Matplotlib, Seaborn, Plotly, NumPy, pandas, NetworkX, DGL, PyTorch Geometric, SciPy, BERT, NLTK, spaCy