A seasoned Senior Machine Learning Engineer specializing in Natural Language Processing (NLP) with over six years of hands-on experience designing and implementing AI/ML solutions across industries including aviation, technology, retail, and energy. Expertise spans building MLOps architectures, fine-tuning Large Language Models (LLMs), automating workflows, and developing business-critical systems that drive measurable impact, such as revenue optimization and operational efficiency.
● Implementation of Natural Language Processing features for the internal applicant tracking system, consisting of a text parser that builds tabular data from candidates' CVs and a text evaluation tool that scores the relevance of the experience described in each CV against the job description provided in the job offer.
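A minimal sketch of the relevance-scoring idea, assuming a TF-IDF representation and cosine similarity rather than the actual production code:

```python
# Illustrative sketch only: scores a CV's text against a job description
# using TF-IDF vectors and cosine similarity (assumed approach, not the
# actual production implementation).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def relevance_score(cv_text: str, job_description: str) -> float:
    """Return a 0-1 relevance score between a CV and a job description."""
    vectorizer = TfidfVectorizer(stop_words="english")
    vectors = vectorizer.fit_transform([cv_text, job_description])
    return float(cosine_similarity(vectors[0], vectors[1])[0, 0])

print(relevance_score("5 years building NLP pipelines in Python",
                      "Looking for an NLP engineer with Python experience"))
```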
● Design and implementation of an MLOps architecture to handle multiple Large Language Model providers, including OpenAI, Gemini (Google), and Groq, along with a common interface to create custom prompts, test model performance with a Needle-in-a-Haystack approach, and measure the usage of each API.
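A condensed sketch of what a provider-agnostic interface of this kind can look like; the class and method names are hypothetical, and the real architecture also covers prompt templating, Needle-in-a-Haystack evaluation, and per-provider usage metering:

```python
# Hypothetical sketch of a provider-agnostic LLM interface; class and
# method names are illustrative, not the actual internal architecture.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class UsageLog:
    calls: int = 0
    prompt_chars: int = 0

class LLMProvider(ABC):
    def __init__(self):
        self.usage = UsageLog()

    @abstractmethod
    def _complete(self, prompt: str) -> str:
        """Provider-specific API call (OpenAI, Gemini, Groq, ...)."""

    def complete(self, prompt: str) -> str:
        # Common entry point: every call is metered the same way.
        self.usage.calls += 1
        self.usage.prompt_chars += len(prompt)
        return self._complete(prompt)

class OpenAIProvider(LLMProvider):
    def _complete(self, prompt: str) -> str:
        # The OpenAI API call would go here; stubbed for the sketch.
        return f"[openai] {prompt[:40]}..."
```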
● Implementation of an LLM-based POC to parse experiment proposal documents into tabular data outlining specific entities such as the metric of interest, domain, sample size, unit of measurement of the metric, and time scope.
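A small sketch of the extraction approach, assuming a JSON-returning prompt; the field names come from the entities listed above, while the prompt wording and parsing helper are illustrative:

```python
# Sketch of LLM-based extraction of experiment-proposal fields into a
# tabular row; the exact prompt and model client are assumptions.
import json

FIELDS = ["metric_of_interest", "domain", "sample_size",
          "unit_of_measurement", "time_scope"]

def build_extraction_prompt(document: str) -> str:
    return (
        "Extract the following fields from the experiment proposal below "
        f"and answer with a JSON object with exactly these keys: {FIELDS}.\n\n"
        f"Proposal:\n{document}"
    )

def parse_llm_answer(answer: str) -> dict:
    row = json.loads(answer)
    # Keep only the expected columns so the output stays tabular.
    return {key: row.get(key) for key in FIELDS}
```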
● Design and implementation of several components of an internal measurement and experimentation platform that follows the life cycle of all experiments within the company, from proposal, audit, and scheduling through execution and conclusions. The platform is built on GCP using Cloud Run, Firestore, BigQuery, GCS, GCP Workflows, and Python.
● Refactoring and upgrade of business-critical Natural Language Processing models related to domain valuation and the main recommendation engine that suggests domains to customers.
● Optimization of MLOps workflows during a migration from on-premises servers to AWS, implementing SageMaker, Step Functions, and EMR, along with GitHub Actions for CI/CD operations.
● Design and implementation of a POC system intended as an alternative to the legacy solution for proposing alternative domain names when the searched domain is not available for sale. The system used a fine-tuned Llama 2 LLM implemented as a sequence-to-sequence model, with a list of already available domain names as the knowledge base for Retrieval Augmented Generation, built with Chroma DB, Hugging Face Transformers, and LangChain.
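A condensed sketch of the retrieval side of that POC using the chromadb client directly; the collection name and sample domains are placeholders, and the prompt assembly and fine-tuned Llama 2 call are omitted:

```python
# Illustrative sketch of the retrieval step: index available domain names
# in Chroma and fetch candidates to ground the fine-tuned Llama 2 model.
# Collection and variable names are assumptions, not the real system.
import chromadb

client = chromadb.Client()
collection = client.create_collection(name="available_domains")

available = ["quickcoffee.shop", "coffeecorner.store", "dailybrew.co"]
collection.add(documents=available, ids=[f"d{i}" for i in range(len(available))])

query = "coffee shop website"
results = collection.query(query_texts=[query], n_results=2)
candidate_domains = results["documents"][0]
# candidate_domains would then be injected into the LLM prompt as context.
print(candidate_domains)
```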
● Implementation of automation systems for performance testing and metrics evaluation of NLP models covering domain name valuation and appraisal, user recommendation, and named entity recognition.
● Design and implementation of a POC chatbot that enforced security policies on users' prompts and queries by leveraging OpenAI's GPT-3.5-turbo LLM. Retrieval Augmented Generation was also implemented to reduce API usage costs and context length while improving model performance.
● Increase in earnings from the valuation system of approximately +2M USD/month thanks to the performance upgrades.
● Implementation of custom NLP models on the main data stream for entity deduplication, Named Entity Recognition, and Topic Modeling using Spark and spaCy on AWS.
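For reference, the Named Entity Recognition step reduces to spaCy usage along these lines; the model name is illustrative, and in production the pipeline ran inside Spark jobs over the main data stream:

```python
# Minimal spaCy NER example; the model name and the Spark wiring around it
# are illustrative, not the production pipeline.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def extract_entities(text: str) -> list[tuple[str, str]]:
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]

print(extract_entities("Acme Corp opened a new office in Bogotá in 2021."))
```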
● Principal Data Scientist on a Globant ventures initiative that implements chatbots for companies in the retail sector, using the RASA Framework for training, testing, and deployment of NLP/NLU models and chatbot systems.
● Implementation of an RPA/OCR system that used a pre-trained NLP model for text classification and summarization, and a custom implementation of AWS Textract for OCR.
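A minimal sketch of the OCR step with AWS Textract via boto3; the bucket and key arguments are placeholders:

```python
# Sketch of the OCR step with AWS Textract via boto3; bucket and key
# names are placeholders, not the real pipeline configuration.
import boto3

textract = boto3.client("textract")

def extract_text(bucket: str, key: str) -> str:
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    lines = [block["Text"] for block in response["Blocks"]
             if block["BlockType"] == "LINE"]
    return "\n".join(lines)
```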
● Implementation of a Time Series Forecasting model to predict the output in watts of several renewable energy plants, including data consolidation, feature engineering, and real-time metrics evaluation.
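A brief sketch of a lag-feature forecasting setup of this kind; the placeholder series, lag window, and model choice are assumptions rather than the delivered solution:

```python
# Sketch of a lag-feature forecasting setup for plant output in watts;
# the placeholder data, lag window, and model choice are illustrative.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

def make_lag_features(series: pd.Series, lags: int = 24) -> pd.DataFrame:
    frame = pd.DataFrame({"target": series})
    for lag in range(1, lags + 1):
        frame[f"lag_{lag}"] = series.shift(lag)
    return frame.dropna()

# Hourly output in watts (placeholder data).
output = pd.Series(range(200), dtype=float)
data = make_lag_features(output)
model = GradientBoostingRegressor().fit(data.drop(columns="target"), data["target"])
```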
● Definition of the quality requirements and overall architecture of AI and Advanced Analytics solutions across all accounts with deliverables related to AI/Machine Learning implementations.
● Processing and extraction of 10 TB of data from the database of the client, the biggest energy utility company in South America, to implement a smart pricing solution offered to 50+ million customers.
● Quality check and migration of an HPC system to PySpark for a major risk assessment company in the financial sector.
● Implementation of Data Science workflows for several AI projects, including churn prediction, client segmentation, dashboard construction, and insight extraction.
● Development and implementation of several components of the company's SaaS catalog, including a churn prediction API, a search/recommendation engine, and an NLP engine for entity extraction and text classification. These components were later used by companies such as Claro, Rappi, and Movistar, reaching 50+ million customers.
● Development of an NLP-based text classification API to tag assets in Workep’s application according to their contents.
● Implementation of an internal Data Analytics Suite describing churn prediction and customer behaviour metrics.
● Database design, API implementation, and entity definition for the initial version of a peer-to-peer services application with a Django-based backend.
● Design and implementation of a custom-made search engine based on word similarity of NLP embeddings using spaCy.
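A minimal sketch of embedding-similarity search with spaCy; it assumes a model with word vectors (e.g. en_core_web_md) and illustrative documents:

```python
# Minimal sketch of embedding-similarity search with spaCy; requires a
# model with word vectors (e.g. en_core_web_md), which is an assumption.
import spacy

nlp = spacy.load("en_core_web_md")

def search(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    query_doc = nlp(query)
    ranked = sorted(documents,
                    key=lambda text: query_doc.similarity(nlp(text)),
                    reverse=True)
    return ranked[:top_k]

print(search("cheap flights", ["discount air tickets", "hotel deals", "car rental"]))
```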
● Definition of coursework and lectures for the 2nd Seminar on Artificial Intelligence Applied to Science and Engineering, aimed at students of the Master’s degree in Mechanical Engineering.
● Development and implementation of a PCA-based Dimensionality Reduction model to assess the risk of developing type 2 diabetes, using a curated database of patients provided by one of the clients.
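A short sketch of the PCA-based reduction step with scikit-learn; the random placeholder features and number of components are illustrative only:

```python
# Sketch of the dimensionality-reduction step with scikit-learn PCA;
# the placeholder features and number of components are illustrative.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.rand(100, 12)           # stand-in for the curated patient features
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)
components = pca.fit_transform(X_scaled)
print(pca.explained_variance_ratio_)  # how much variance each component keeps
```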
● Implementation of an anonymization engine to pre-process sensitive patient data in order to comply with GDPR.
● Development and implementation of tools for automating several processes using Python, including generating Excel reports, updating inventory, and creating product catalogs.
● Building and maintenance of the e-commerce sales channel using Django, MySQL, and an AWS-based deployment.
● Relevant Coursework: Python, Scikit-Learn, Pandas, NumPy, SQL, NLP, Applied Statistics, Data Visualization, Anaconda, Tree-Based Models, Unsupervised Learning.
● Relevant Coursework: Neural Networks, Convolutional Neural Networks, Sequence Models, NLP, Transformers.
● Relevant Coursework: Applied Physics, Thermal / Fluid Systems Simulation, Python Programming, High Performance Computing, Embedded Systems, Finite Numeric Methods.