Hi, I’m Muno.
I am a Senior Associate Data Scientist at the Bank of New York (BNY).
M.Sc. in Artificial Intelligence & Innovation, Carnegie Mellon University ’24.
B.Sc. in Electrical Engineering (Machine Learning & Controls), University of California San Diego ’22.
I’m passionate about AI, data science, product management, and business development, with a focus on developing and evaluating technology innovations. My work bridges strategy, investment, and market research to support data-driven decision-making and impactful outcomes.
Prioritizing Challenges and Identifying Solutions for an Early-Stage Marketplace
In this fictional case study with $500K in pre-seed funding, I faced the challenge of balancing the two sides of a marketplace (cat sitters and owners) while building toward a billion-dollar business. Core issues included payment delays, last-minute cancellations, high churn rates, and inefficient booking processes. I proposed a dual payment verification system and operational improvements to boost user trust, reduce churn, and enhance platform reliability.
Optimizing Revenue Growth Through Strategic New Business Acquisition
This project optimized revenue growth for Plantiva, a fictional company, by allocating 20 employees across acquisition, account management, and support roles over 24 months. Key takeaways include balancing acquisition and retention, leveraging compounding revenue from account management, and reducing churn through better customer satisfaction.
Investment Memo: Sourcing Companies for a VC Investment Thesis
This project involved identifying three startups aligned with a venture capital thesis focused on visual technology and AI at the pre-seed and seed stages. After evaluating key metrics such as innovation, scalability, and market potential, Mixedbread.ai emerged as the strongest fit due to its disruptive embedding generation technology and alignment with high-growth semantic search markets.
IRGraph: Leveraging NLP, LLMs, and Knowledge Graphs for Investor Relations
Building on EDAHub, IRGraph automates the analysis of earnings calls to provide insights into executive-analyst interactions based on topics, sentiment, emotion, and stock market dynamics. The project integrates upstream data enrichment tasks using Neo4j, OpenAI and FinBERT and downstream visualization using NeoDash.
Generative AI for Venture Capital Due Diligence
As a VC Associate, I explore the use of Large Language Models (LLMs) to enhance the evaluation of startups and corporate ventures by focusing on key metrics such as market potential, technical feasibility, and business model viability. Through the development of structured prompts and rubrics, I demonstrate how LLMs can deliver consistent, scalable, and data-driven insights while reducing subjectivity.
EDAhub: Data Analysis for Investor Relations Communications
This project spans Oct 2023 - May 2024. The focus is to build a multi-modal and multi-media AI system with a voice/text UI to extract relevant data from financial statements, create summaries, perform Exploratory Data Analysis (EDA), create charts/graphs, and analyze differences across financial documents.
BurgerBot: GPT-4, Segmentation, and Manipulation
The goal of our study is to develop "BurgerBot," a framework combining GPT-4 and a segmentation model to guide a robotic arm in assembling a plastic burger toy. The objective is to understand real-time human-robot conversational interactions, focusing on the robot's adaptability to diverse instructions.
Scientific Named Entity Recognition
The goal is to build an end-to-end NLP system: collect our own data and train a model to identify specific entities (method names, task names, dataset names, metric names and their values, and hyperparameter names and their values) in scientific publications from recent NLP conferences (ACL, EMNLP, and NAACL).
Building My Own BERT
This project develops a minimalist version of BERT (Bidirectional Encoder Representations from Transformers), implementing key components of the model (self-attention, layers, model, optimizer, and classifier) to perform sentence classification on the SST and CFIMDB datasets.
Machine-Generated Text Detection
This project leverages the Multi-Scale Feature Fusion Network (MSFFN), a CNN-based model that extracts and fuses features at different scales for nuanced classification. With just 1.1 million parameters, it achieves 80.64% accuracy and outperforms transformer-based baselines, making it lightweight and cost-effective. The ensemble variant, MSFFN-Ensemble, improves on this with 86.82% accuracy, showcasing the power of domain-specific and generator-specific learning.
Corporate Strategy and Product Management
In this course, I worked in a group of students with Philips to help the company become more entrepreneurial. As part of the corporate strategy team, we evaluated new business ideas to boost Philips' market share and revenue. This involved analyzing customer personas, market size, MAP planning, and value innovation, as well as developing skills for pitching to the C-suite.
Movie Streaming Recommendation Service
The focus of this group project is to implement, evaluate, operate and monitor a movie streaming recommendation service in production, which entails many concerns, including deployment, scaling, reliability, drift and feedback loops. The streaming service has about 1 million customers and 27,000 movies.
Attention-based Speech-to-Text Deep Neural Network
In this Kaggle competition, I learned how to build an encoder to effectively extract features from a speech signal, how to construct a decoder to sequentially spell out the transcription of the audio, and how to implement an attention mechanism between the decoder and the encoder.
Face Classification and Verification using CNNs
In this Kaggle competition, the task was to build a face classifier that can extract feature vectors from face images and a face verification system that computes the similarity between feature vectors of images. I used a CNN architecture to build this model in order to achieve high accuracy on this classification and verification task.
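The verification step described above can be sketched as follows. This is a minimal illustration rather than the competition code: it assumes cosine similarity as the similarity measure (the description does not name one), and the function names and the 0.5 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def same_identity(feat_a, feat_b, threshold=0.5):
    """Declare a match when similarity reaches the threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on a validation set of known matching and non-matching pairs.
    """
    return cosine_similarity(feat_a, feat_b) >= threshold
```

In a real system, the feature vectors would come from the CNN's embedding layer rather than being written by hand.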
Frame Level Classification of Speech
In this Kaggle competition, the task was to predict the phoneme label for each frame of the test-set speech recordings, which are provided as raw mel spectrogram frames. I used a multi-layer perceptron and explored various hyperparameters to improve the accuracy of the frame-level phoneme state predictions.
Language Modeling using RNNs
I implemented an RNN-based language model, text prediction and degeneration, greedy decoding, and regularization techniques for RNNs. Here, I go over these concepts and a demonstration of language modeling for machine translation (English to French).
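Greedy decoding, mentioned above, can be illustrated with a small sketch. The RNN is replaced here by a hand-written lookup table so the example stays self-contained; `TOY_MODEL`, `toy_next_token_probs`, and the `<s>`/`</s>` tokens are hypothetical stand-ins, not part of the project.

```python
def greedy_decode(next_token_probs, start_token, end_token, max_len=20):
    """Repeatedly append the single most probable next token.

    next_token_probs: callable mapping the tokens so far to a dict of
    {candidate_token: probability}. Stops at end_token or after max_len steps.
    """
    tokens = [start_token]
    for _ in range(max_len):
        probs = next_token_probs(tokens)
        best = max(probs, key=probs.get)  # argmax over the candidates
        tokens.append(best)
        if best == end_token:
            break
    return tokens


# Hypothetical stand-in for a trained language model: the next-token
# distribution depends only on the previous token.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.7, "dog": 0.3},
    "a": {"dog": 0.9, "cat": 0.1},
    "cat": {"sleeps": 0.8, "</s>": 0.2},
    "dog": {"sleeps": 0.6, "</s>": 0.4},
    "sleeps": {"</s>": 0.9, "the": 0.1},
}


def toy_next_token_probs(tokens):
    return TOY_MODEL[tokens[-1]]


print(greedy_decode(toy_next_token_probs, "<s>", "</s>"))
# prints ['<s>', 'the', 'cat', 'sleeps', '</s>']
```

Because greedy decoding commits to the locally best token at every step, it never revisits a choice, which is one reason generated text can become repetitive or otherwise degenerate.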
RNNs, GRUs, and CTC (MyTorch ep.3)
I implemented RNNs, GRUs, and CTC from scratch. Here, I go over a binary sentiment prediction task using the TensorFlow IMDB dataset, give an overview of these architectures and compare RNNs, GRUs, and LSTMs.
Convolutional Neural Networks (MyTorch ep.2)
I implemented convolutional neural networks (CNNs) from scratch. Here, I go over a traffic sign identification task and describe the significance of convolutional layers, pooling layers, downsampling and upsampling layers, and the classification layer, as well as the forward pass and backpropagation through these layers.
Multi-Layer Perceptron (MyTorch ep.1)
I implemented an MLP from scratch with 0, 1, and 4 hidden layers. Here, I go over an MLP sentiment analysis task and describe the significance of linear layers, activation functions, forward inference, criterion functions (MSE and CELoss), SGD optimization, batch normalization regularization, and backpropagation.
Deep Learning and Sentiment Analysis to Forecast Stock Market Volatility
This project aims to investigate the effectiveness of using sentiment analysis techniques on data obtained from multiple news sources, namely Webull, Twitter, and Reddit, to predict stock price differences during the COVID-19 pandemic. We focus on two companies, one large-cap (Zoom) and one small-cap (AMC), to explore the correlation between market sentiment and stock price movement.