A Guide to Machine Learning Research

About the writer

Joshua Adrian Cahyono

Data Science and Artificial Intelligence

Batch 2026

A*STAR

Continental

Temasek

Hi, I'm Joshua, an undergraduate studying Data Science and Artificial Intelligence (Batch 2026) at NTU Singapore. I'm passionate about applying AI to solve real-world problems and create meaningful impact. My research interests lie in both Generative AI and Reinforcement Learning, and I've published papers in both areas. I'm currently working as a Machine Learning Research Engineer Intern at Temasek Laboratories, NTU.

What is AI?

At this point, most people have a rough idea of what Artificial Intelligence is. But to keep it simple:

Artificial Intelligence is a technology that allows machines to learn from data and make decisions based on that learning.

So no, it's not magic — it's just math and patterns. A machine trained on data learns to recognize patterns and use them to make predictions, classify stuff, or even generate new content.

AI has come a long way — from science fiction to something you interact with daily. Think Siri, Alexa, Google Translate, Spotify recommendations, ChatGPT — AI is shaping how we use tech.

Types of AI:

Narrow AI: Focused on a specific task (like playing chess, image recognition, or translation).
General AI: A system that can perform well across many domains, with intelligence comparable to a human expert.
Superintelligence: A hypothetical system that surpasses human intelligence, theoretically capable of replacing most if not all economic jobs we have today.

Right now, many argue we've reached General AI — or at least something close — with Large Language Models like ChatGPT that can reason, code, write, and communicate like humans.

Why is AI important?

We're on the brink of one of the biggest technological revolutions in history. For the first time, intelligence itself is becoming a commodity. Businesses are automating workflows, making smarter decisions, and cutting costs using AI. At the same time, new startups are emerging as AI companies, offering tools powered entirely by machine learning.

In short — AI is becoming the brain of the digital economy. Ignore it, and you fall behind.

What kinds of companies or industries need AI researchers?

Right now, most AI-related jobs are still centered in the tech industry — especially in companies that are building AI products or infrastructure (e.g., OpenAI, Google DeepMind, Meta, Microsoft, etc.).

But we're at the tipping point. As AI gets more integrated into everyday systems, we'll start to see AI roles across every industry — not just in building AI, but in applying it.

Industries already hiring AI engineers and researchers:

Healthcare: AI for diagnosis, drug discovery, and hospital management
Finance: Fraud detection, algorithmic trading, risk modeling
Manufacturing: Predictive maintenance, quality control
Retail & E-commerce: Recommendation systems, demand forecasting, customer insights
Autonomous Vehicles & Robotics: Perception, control, planning
Energy & Sustainability: Smart grid management, climate modeling, energy optimization

In Singapore, agencies like AISG (AI Singapore) and GovTech are pushing AI forward through national initiatives. Plus, startups and unicorns in ASEAN are increasingly building AI into their core products.

Over the next 5–10 years, AI won't just be a niche skill — it will be a foundational skill across sectors. From optimizing supply chains to personalizing education, AI will power the next wave of digital transformation.

In short: AI engineers will be everywhere.

What are some available roles within this field?

There are quite a few directions you can go in the AI/ML space — depending on your interest (research vs engineering vs ops) and skillset (theory, software, data, systems, etc). Here are some of the most common and emerging roles:

ML Ops Engineer
Responsible for setting up and maintaining the infrastructure for training and deploying ML models. They handle CI/CD for models, monitor model drift, and ensure models can scale and perform in production environments.
AI/ML Engineer
Builds and integrates machine learning models into products. They bridge the gap between research and production — taking ideas from notebooks to actual systems. Needs strong coding, systems thinking, and a good understanding of ML techniques.
AI/ML Researcher
Works on developing new models, algorithms, or training techniques. May focus on areas like NLP, computer vision, reinforcement learning, or foundational model research. Typically seen in academia, big tech research labs, or AI startups.
Data Scientist
Uses data analysis, statistics, and machine learning to extract insights and build models. Often works on business problems like churn prediction, pricing, or recommendation systems.
Applied Scientist / Research Engineer
Somewhere between research and engineering. Translates cutting-edge papers into practical solutions, often collaborating with both researchers and engineers. Common in big tech and deep-tech startups.
Prompt Engineer / LLM Specialist
Newer role focused on designing effective prompts, fine-tuning large language models (LLMs), or building applications on top of models like GPT, Claude, etc. Think chatbots, AI copilots, internal tools.
AI Product Manager
Oversees the development and deployment of AI products. Needs to understand both the technical and business side of things. Works closely with engineers, designers, and stakeholders.
AI Ethics / Safety Specialist
Focuses on the robustness, alignment, fairness, and safety of AI systems. Important in the development of trustworthy AI, especially as models become more powerful. Involves both technical and philosophical thinking.

This is just a small sample of the roles available. As this is a growing field, there will be more roles in the future, just like how we have more tech roles in the industry today.

Personal Journey

Where it all began...

The term "AI" has been around for decades, but only recently has it become a game changer. My interest in AI sparked during high school after watching the AlphaGo documentary. As a chess and Go player, I was fascinated by how a machine could defeat top human players in a game long thought too complex for computers. That moment stuck with me for a long time, and I knew I wanted to be part of this journey.

When I got into NTU's Data Science and AI program, I started learning even before university began—taking the DeepLearning.ai course. I loved the math behind neural networks and seeing it applied in real life, like building my first cat vs dog classifier from scratch. This was the start of my AI journey.

The Start of My AI Journey

From there I started to learn more about AI and the subfields of AI, such as Computer Vision, Natural Language Processing, RL, and more. After ChatGPT's release, I also started to learn about the latest research in AI, such as Generative AI, and the latest breakthroughs in the field. I will list more of my learning resources in the Learning Resources section.

Internship and Research Experiences

I have been fortunate to have many opportunities to work on AI projects, and I would like to share my experiences here. I will keep it short here and share the takeaways from each experience.

1. MLDA (Machine Learning Data Analytics) Research Group (Year 1) - Visual Retrieval

I joined the Machine Learning Data Analytics group (MLDA) in my first year, working on a visual retrieval project with Huawei. Despite being new to research and PyTorch, I dove into literature reviews, idea brainstorming, and implementation. The project was a struggle but a huge learning milestone—it deepened my interest in research. Here, I learned how research is often messy, nonlinear, and driven by curiosity. Struggling through that first project taught me how to navigate ambiguity, read papers effectively, and implement ideas from scratch—skills that became foundational.

2. A*STAR Internship (Year 2) - Offline RL for Medical Recommendation

I landed a summer internship at A*STAR by cold-emailing and applying early. I worked on offline reinforcement learning (RL) for medical recommendation, implementing data pipelines with BigQuery and modeling RL agents in PyTorch. It was a great blend of data engineering and ML modeling, and I discovered I preferred building models over data cleaning and data scientist work. We eventually published a software IP and submitted a paper.

3. NTU URECA (Year 2) - LMMs Benchmarks and Evaluations Pipeline

In parallel, I joined Asst. Prof. Ziwei Liu's lab through NTU's URECA program to work on benchmarking and evaluation pipelines for large multimodal models (LMMs). Though I initially hoped to work on CV or RL, I learned a lot about LLMs. Collaborating closely with PhD mentors also gave me a glimpse into rigorous academic research and teamwork.

This resulted in co-authorship on two papers, including the Otter-2 model and the LMMS-eval pipeline.

4. Continental Internship (Year 3) - Autonomous Mobile Robot Navigation

In Year 3, I interned at Continental on autonomous navigation using Diffusion Planners for the Jackal robot. The role combined research and engineering: reading papers, adapting algorithms, and testing them on real hardware. Here I realized the difficulty of translating research to real-world applications, and the importance of having a good understanding of the practical side of AI.

Conclusion and Key Lessons

What a journey all this has been! As I write this, I realize that I have been through quite a lot in the past 3 years, and I can feel the growth in my skills and knowledge. I also realize that there was a lot of "luck" involved in my journey, where a lot of things seemed to fall into place. To me, I am sure this is all part of God's plan for me, and I am grateful for the opportunities and experiences I have been given.

Some of the key lessons that I have learned as an AI researcher are:

Learn not to be intimidated by the math: It is scary, but it is not as scary as you think. Learn to understand the intuition behind it and why it is used.
Be a Jack of All Trades, Master of One: This is more of a personal opinion and might be controversial. However, I think it is important to understand the big picture of the field and of Computer Science in general. A lot of skills and concepts are transferable, and many breakthroughs in AI build on ideas borrowed from other fields. It is also good to have skills in software engineering (web dev, mobile dev, etc.), SQL and database systems, classical algorithms, distributed systems, and more. Soft skills like collaboration, communication, writing, presentation, and team management are also important, so do not neglect them! These are also useful for career pivots if you decide research is not for you.
Be patient and persistent: AI research is a long journey, and it is not always linear. There will be times when you feel like you are not making any progress, and that is okay. Just be patient and persistent, and you will get there.
Be open to new ideas and concepts: AI is a field that is constantly evolving, and there is always something new to learn. Stay open to it, and you will be able to pick up new things quickly.
You cannot do everything at once: It is important to prioritize and focus on a few projects. This comes from personal experience, where I tried to do too many things at once and ended up not doing any of them well.
Don't burn out and spend all your time stuck on a problem: Find time to rest and refresh your mind. It is usually difficult to stay productive if you are too tired.
Don't be afraid to ask for help or advice: This will take you a long way. If possible, connect with more experienced people in the field and listen to their advice. Sometimes these connections will also bring you opportunities in the future!

Recommended Paths

Fundamentals (For beginners who have no prior experience)

Just like any other field, getting the fundamentals right is key. Luckily, for AI, there are many resources online to get you started. I would recommend taking online courses to get a good foundation in ML/AI. Some of my recommendations are:

Deep Learning Specialization is a good starting point for beginners in Deep Learning.
Machine Learning Specialization is a good way to learn the more traditional ML concepts.
RL Specialization is an amazing course for beginners in Reinforcement Learning.
Mathematics for Machine Learning is a good course if you are not comfortable with the math or want to refresh your knowledge. Calculus, Linear Algebra, and Probability are used all the time in AI research.
All the other courses offered by DeepLearning.ai are also great, and I highly recommend them!
LLM Agents is a more recent course that touches on agentic AI systems. In general, it is useful to keep up with the latest trends in AI.

These courses are highly recommended for beginners in AI, as they provide a solid foundation in the mathematical and theoretical aspects of the field. They serve as an excellent starting point for AI enthusiasts and typically include a capstone project, allowing learners to apply their knowledge in a practical setting. While beneficial for AI engineers, mastering these concepts is essential for aspiring AI researchers.

Practical for Beginners (For those who understand Basic ML Theory)

After you have a good foundation in the fundamentals, I would recommend understanding the practical side of AI. This is where you can apply your knowledge to real-world problems. These courses are more practical and applied, and are a good next step once the fundamentals are in place.

Worldquant Applied Data Science Lab is perfect for those more into the Data Science part, teaching all the necessary libraries and common techniques.
Huggingface Courses are good for beginners. The NLP and Computer Vision courses are especially recommended!
Kaggle Courses are good for beginners. The courses are project based and integrate well with the Kaggle platform and datasets.

Diving Deeper (For those who want to be an expert in AI)

Join AI Challenges and Competitions:

A good way to practice your skills and apply your knowledge to real-world problems is to participate in AI challenges and competitions. Here are some of the ones I recommend to start with:

Kaggle Competitions: A website that lists out many AI challenges and competitions. I would recommend starting with the Kaggle Playground, which is a great way to understand the basics of ML.
AI Crowd: A website that lists out many AI challenges and competitions. These are usually more niche and fundamental AI research, but still a good way to learn and apply your knowledge.
ML Contest: A website that lists out many upcoming and ongoing AI challenges and competitions.

Paper Reading and Conferences:

After you have a good foundation in the fundamental courses, you can and should explore more about the latest research in AI. Be comfortable with diving into papers and the latest research. Also, not all papers are math heavy, so do not be afraid to read them! Some of the famous ones are also worth implementing and playing with. Here are some resources that I recommend:

Papers with Code is a great resource to find the latest research in AI.
Huggingface Daily Papers is a great resource to find the latest research in NLP and Computer Vision.
Go to NeurIPS or any of the top AI conferences and watch their presentations. They provide the slides and videos of the presentations online for free after the conference!

Discussing and Sharing with AI communities:

There are many communities and forums that you can join to learn and talk with others about AI. Here are some of the ones I recommend to start with:

Subfields of AI

Pre-Generative AI

These are the foundations of AI and the most important topics for beginners to learn. An AI researcher should be comfortable with these topics before diving into research on more specialized and recent fields.

Traditional ML

Machine learning models that use statistical methods to learn patterns from data. Even though the field is now dominated by deep learning models, traditional ML is still a good starting point for beginners in AI, and is still used in many real-world applications. List of topics to start learning:

Linear Regression
Logistic Regression
Decision Trees
Random Forest
Gradient Boosting
Support Vector Machines
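
To make the first item on this list concrete, here is a minimal sketch of linear regression with one input feature, fitted with the closed-form least-squares formulas. The function name `fit_line` and the toy data are made up for illustration; in practice you would likely use a library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised,
    # since the 1/n factors cancel in the ratio).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Because the toy data lies exactly on a line, the fit recovers the slope and intercept used to generate it. Real data is noisy, and the fitted line is then the one with the smallest squared error.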

Neural Networks

Neural networks are machine learning models made of layers of connected artificial neurons that learn patterns from data. They are at the core of all the recent deep learning breakthroughs, and you should master them before diving into the deep learning topics that follow. List of topics to start learning:

Classification
Regression
Perceptron
Multi-Layer Perceptron (MLP)
Backpropagation
Activation Functions
Loss Functions
Regularization
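
Backpropagation is the topic on this list that most people find intimidating, so here is a rough sketch of it on the smallest possible network: one tanh hidden unit and one linear output, with a squared-error loss. The parameter values and the single training example are arbitrary assumptions. The sketch compares the chain-rule gradients against finite-difference estimates, which is a common way to check a from-scratch implementation.

```python
import math


def forward(params, x):
    """One hidden tanh unit followed by a linear output unit."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y_hat = w2 * h + b2
    return h, y_hat


def loss_fn(params, x, y):
    """Squared-error loss for a single example."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2


def backward(params, x, y):
    """Backpropagation: chain rule from the loss back to each parameter."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_out = y_hat - y               # dL/dy_hat
    grad_w2 = d_out * h
    grad_b2 = d_out
    d_hidden = d_out * w2           # dL/dh
    d_pre = d_hidden * (1 - h * h)  # tanh'(z) = 1 - tanh(z)^2
    grad_w1 = d_pre * x
    grad_b1 = d_pre
    return [grad_w1, grad_b1, grad_w2, grad_b2]


def numerical_grad(params, x, y, eps=1e-5):
    """Central finite differences, used only to check the analytic gradients."""
    grads = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grads.append((loss_fn(plus, x, y) - loss_fn(minus, x, y)) / (2 * eps))
    return grads


params = [0.5, -0.3, 0.8, 0.1]  # w1, b1, w2, b2 (arbitrary values)
x, y = 1.5, 2.0                 # a single made-up training example
analytic = backward(params, x, y)
numeric = numerical_grad(params, x, y)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, max_gap)
```

If the two sets of gradients agree to several decimal places, the backward pass is very likely correct. Deep learning frameworks automate this backward pass for you, but writing it once by hand helps the intuition.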

Time Series

Analysis and prediction of sequential data points indexed by time. Powers forecasting, anomaly detection, and pattern recognition in financial markets, weather, and IoT systems. List of topics to start learning:

Autoregressive Integrated Moving Average (ARIMA)
Recurrent Neural Networks (RNNs)
Long Short-Term Memory (LSTMs)
Gated Recurrent Units (GRUs)
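
As a small taste of the "autoregressive" part of ARIMA, here is a sketch that fits the simplest such model, y[t] = phi * y[t-1], by least squares and uses it to forecast. This is not a full ARIMA (no differencing, no moving-average terms, no intercept), and the series is a made-up example.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in y[t] = phi * y[t-1]."""
    prev = series[:-1]
    curr = series[1:]
    num = sum(p * c for p, c in zip(prev, curr))
    den = sum(p * p for p in prev)
    return num / den


def forecast(series, phi, steps):
    """Roll the fitted model forward from the last observed value."""
    out = []
    last = series[-1]
    for _ in range(steps):
        last = phi * last
        out.append(last)
    return out


# Toy series in which each value is half of the previous one.
series = [8.0, 4.0, 2.0, 1.0, 0.5]
phi = fit_ar1(series)
print(phi, forecast(series, phi, 2))
```

RNNs, LSTMs, and GRUs follow the same idea of predicting the next value from the past, but they replace the single coefficient with a learned nonlinear state update.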

Computer Vision

Systems that interpret and understand visual information from images or videos. Powers facial recognition, object detection, and image segmentation. List of topics to start learning:

Image Classification
Object Detection
Semantic Segmentation
Convolutional Neural Networks (CNNs)
Residual Networks (ResNet)
You Only Look Once (YOLO)
U-Net
Vision Transformers (ViTs)
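
CNNs are built around the convolution operation, so it is worth seeing it once in plain code. Below is a minimal sketch of a 2D convolution (strictly speaking a cross-correlation, which is what deep learning libraries usually compute) with stride 1 and no padding. The tiny image and the edge-detecting kernel are assumptions chosen for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


# A 3x4 image that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # right pixel minus left pixel: responds to vertical edges
edges = conv2d(image, kernel)
print(edges)
```

The output is nonzero only where the brightness changes, which is the intuition behind learned filters in a CNN: each kernel responds to one kind of local pattern.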

Natural Language Processing

Enables machines to understand, interpret, and generate human language. Drives chatbots, translation services, and sentiment analysis. List of topics to start learning:

Text Classification
Question Answering
Language Translation
LSTM for Language Modeling
Transformers
Bidirectional Encoder Representations from Transformers (BERT)
Generative Pre-trained Transformer (GPT)
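
Transformers, BERT, and GPT are all built around attention. As a rough sketch, assuming single-head scaled dot-product attention with no masking and no learned projections, the core computation looks like this. The vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
queries = [[10.0, 0.0]]  # points strongly in the direction of the first key
out = attention(queries, keys, values)
print(out)
```

The query matches the first key far better than the second, so the output is almost entirely the first value vector. A real Transformer adds learned projections for queries, keys, and values, plus multiple heads.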

Reinforcement Learning

Training agents to make decisions by rewarding desired behaviors. Used in robotics, game AI, and autonomous systems. List of topics to start learning:

Markov Decision Processes (MDPs)
Agent-Environment Interaction
Q-learning
Deep Q Network (DQN)
Actor-Critic (A2C, A3C)
Proximal Policy Optimization (PPO)
Monte Carlo Tree Search (MCTS)
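
To see the agent-environment loop and Q-learning together, here is a minimal sketch of tabular Q-learning on a hypothetical 4-state chain where the agent is rewarded for reaching the rightmost state. The environment, the hyperparameters, and the purely random behaviour policy are all assumptions for illustration. Q-learning is off-policy, so it can still learn the greedy policy from random exploration.

```python
import random

N_STATES = 4      # states 0..3; state 3 is the terminal goal
ACTIONS = [0, 1]  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain: reward 1 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = rng.choice(ACTIONS)  # random behaviour policy
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q = q_learning()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

DQN keeps the same update rule but replaces the table `q` with a neural network, which is what makes it scale to large state spaces.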

Unsupervised and Self-Supervised Learning

Self-supervised learning is a type of unsupervised learning where the model learns to predict some aspect of the input data. List of topics to start learning:

Clustering and Dimensionality Reduction
Contrastive Learning
Autoencoders
Generative Adversarial Networks (GANs)
Variational Autoencoders (VAEs)
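
Clustering is the easiest of these topics to try by hand. Below is a minimal sketch of k-means on one-dimensional data, assuming fixed starting centers so the result is reproducible. The data points and the starting centers are made up.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

No labels were given, yet the two groups are recovered from the structure of the data alone, which is the core idea of unsupervised learning.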

Generative AI

These are the most recent fields in AI, all of which emerged from 2017 onwards. They are mainly driven by the development of Transformer models and Diffusion models. Most researchers and industries are focusing on these fields, so it is important to learn these after the fundamental topics in AI.

Large Language Models

Neural networks trained on vast text datasets to understand and generate human-like text. Examples include GPT-4, Claude, and Llama.

Diffusion Models

Generate high-quality images by gradually transforming noise into coherent content. Power popular image generators like DALL-E and Stable Diffusion.
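
The generation step described above reverses a noising process, so it helps to see that forward process first. The sketch below follows the DDPM-style formulation, where a clean sample is mixed with Gaussian noise according to a schedule. The linear schedule, its start and end values, and the tiny 3-number "image" are assumptions for illustration, not the settings of any particular model.

```python
import math
import random


def alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        bars.append(running)
    return bars


def add_noise(x0, t, bars, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, 1)."""
    signal = math.sqrt(bars[t])
    noise = math.sqrt(1.0 - bars[t])
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


bars = alpha_bars(1000)
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]                          # stand-in for a clean image
slightly_noisy = add_noise(x0, 0, bars, rng)   # almost identical to x0
mostly_noise = add_noise(x0, 999, bars, rng)   # almost pure Gaussian noise
print(bars[0], bars[-1])
```

At the first step nearly all of the signal survives, and at the last step almost none does. A diffusion model is trained to undo this corruption one step at a time, which is how it turns noise into an image.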

Multimodal Models

AI systems that process and combine multiple types of data (text, images, audio). Can generate and understand across different modalities.

Agentic AI and Systems

AI systems that act autonomously to achieve goals, combining reasoning, planning, and execution capabilities.

Technical AI Interviews

I have been through some technical interviews, and most of the time, as long as you have a good foundation in ML/AI, you should be fine. I have some additional tips that I think are useful for those who are preparing for technical interviews:

Practice LeetCode and other coding platforms: Surprisingly, this is still used even for ML research roles, so it is important to be comfortable with it. But I do not recommend spending too much time on it, as it is not the main focus of AI research. Sometimes you will also be asked to implement ML algorithms from scratch, so practice that as well.
Review Math, Statistics and Probability: This is the heart of AI, so it is important to be comfortable with it. You probably won't be asked to calculate probabilities or statistical values, but understanding the concepts is crucial.
Review basics of ML and AI: This is a must, as it is the foundation of AI research. You should not only memorize algorithms and techniques, but also understand the concepts and the why behind them. (Example: Why do we need to normalize the data? Why do we need to add L2 regularization? Why is MSE loss used?)
Prepare to share your projects and past experiences: In almost every interview of mine, I was asked to share my projects or past experiences and explain the details. If you don't have one, try to create one. It does not have to be super innovative or new, but try to avoid copying from others. The first time I got an internship, I was asked about my personal Mario RL project, and that really impressed them. It is important to be able to talk about your projects and share your learnings. Make it clear what you have done and what you have learned. Review the details of the project beforehand, so that you don't look like you were only vibe coding and cannot explain it.
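
For "why" questions like the L2 regularization one mentioned above, a tiny worked example often helps more than a memorized definition. Here is a sketch, assuming a one-weight linear model with no intercept, where the L2-regularized (ridge) solution has a simple closed form. The data and the penalty strengths are made up.

```python
def ridge_slope(xs, ys, lam):
    """Minimise sum((y - w * x)^2) + lam * w^2 for a single weight w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


# Toy data from y = 2x, so the unregularised answer is w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
for lam in [0.0, 1.0, 14.0]:
    print(lam, ridge_slope(xs, ys, lam))
```

As the penalty `lam` grows, the weight shrinks toward zero. That shrinkage is what makes the model less sensitive to noise in the training data, at the cost of some bias.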
