Lessons learned after 6.5 years of machine learning



I started learning machine learning more than six years ago, and the field has gained enormous traction since. In 2018, when I took my first university course on classical machine learning, the groundwork for the AI boom of the early 2020s was already being laid behind the scenes. The first GPT model had been published, and other companies followed, pushing the limits of both performance and parameter count. For me, it was the perfect time to get into machine learning: the field was moving so fast that there was always something new.

Every six to twelve months, I look back at the past year, mentally fast-forwarding from university lectures to doing commercial AI research. In these reviews, I keep finding principles that have accompanied me while studying ML. One stands out: working deeply on a single narrow topic has been essential to my progress over the past few years. Beyond deep work, I have identified three further principles. They are not so much technical insights as patterns of thinking and working.

The importance of deep work

Winston Churchill is famous not only for his eloquence, but also for his incredible quickness of wit. There is a well-known story about a verbal duel between him and Lady Astor, the first woman to sit in the British Parliament. Trying to end a discussion with him, she said:

If I were your wife, I would poison your tea.

Churchill replied with his trademark sharpness:

And if I were your husband, I would drink it.

Delivering such a resourceful retort is rightly admired as an unusual skill, and not everyone is born with that kind of reflexive brilliance. Fortunately, in our domain of ML research and engineering, quick wit is not the superpower that sets you apart. What does is the ability to concentrate deeply.

Machine learning work, particularly on the research side, is not fast-paced in the traditional sense. It requires long stretches of uninterrupted, intense thinking. Coding ML algorithms, debugging ambiguous data problems, forming hypotheses: all of it requires deep work.

“Deep work” means two things:

  • The skill of focusing deeply for extended periods
  • An environment that allows and encourages such focus

Over the past two or three years, I have come to view deep work as essential to making meaningful progress. A few hours of intense immersion, several times a week, have been far more productive than fragmented blocks of distracted "productivity". And, thankfully, your environment can be set up to support deep work.

For me, the most fulfilling periods have always been the run-ups to paper submission deadlines. These are times when you can laser-focus: the world narrows down to your project, and you are in flow. Richard Feynman put it well:

To do real good physics work, you do need absolutely solid lengths of time… it needs a lot of concentration.

Swap “physics” for “machine learning”, and the point still holds.

You should (almost) ignore the trends

Have you heard of large language models? Of course you have; names like Llama, Gemini, Claude, and Bard fill the technology news cycle. They are the cool kids of AI, or “GenAI”, as it is now stylishly called.

But here's the catch: when you're just starting out, you can hardly gain momentum by chasing trends.

I used to work alongside a fellow researcher; let's call him John. We both started out "doing ML". For his research, John headed straight for the then-hot topic of retrieval-augmented generation (RAG), hoping to improve a language model's output by integrating external document retrieval. He also wanted to analyze the emergent capabilities of LLMs, distilling them into smaller models, even though those models had never been explicitly trained for such abilities.

John's problem? The models he based his work on evolved too quickly. It took him a few weeks to get a new, cutting-edge model running, and by the time he did, newer, better models were already out. That pace of change, coupled with the unclear evaluation criteria of his niche, left him little control over his ongoing research, a problem felt especially by newcomers to research, as John and I were at the time.

This is not a criticism of John (I would probably have failed too). Rather, I bring it up to make you reflect: does your progress rely on continuously surfing the biggest waves of the latest trends?

Do the boring data analysis (multiple times)

Every time I get to train a model, I breathe a mental sigh of relief.

Why? Because it means I have completed the hidden hard part: data analysis.

This is the usual sequence:

  1. You have a project.
  2. You get some (real-world) datasets.
  3. You want to train an ML model.
  4. But first… you need to prepare the data.

A lot can go wrong in that last step.

Let me illustrate this with a mistake I made while working with ERA5 weather data. We wanted to use historical weather patterns in the ERA5 data to predict the NDVI (normalized difference vegetation index), which indicates vegetation density.

My project required combining the ERA5 weather data with NDVI satellite data from NOAA, the US weather agency. Eager to train my vision transformer, I regridded the NDVI data to the ERA5 resolution, added it as a separate layer, and ironed out the shape mismatches.

A few days later, I visualized the model's predictions and… surprise! The model thought the Earth was upside down. Literally: my input data showed a normally oriented world, but my vegetation data was flipped at the equator.

What went wrong? I had overlooked that the regridding step reversed the orientation of the NDVI data.
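A cheap guard against exactly this class of bug is to normalize the latitude orientation right after regridding. The sketch below is not my original pipeline; it assumes the data is a NumPy array whose first axis is latitude, with a matching coordinate vector, and enforces a north-to-south convention:

```python
import numpy as np

def align_latitude(grid: np.ndarray, lat: np.ndarray):
    """Ensure the latitude axis runs north-to-south (descending values).

    Assumes axis 0 of `grid` is latitude. If the coordinate vector is
    ascending, both the grid and the coordinates are flipped.
    """
    if lat[0] < lat[-1]:  # ascending latitudes: flipped relative to our convention
        return grid[::-1], lat[::-1]
    return grid, lat

# Hypothetical toy example: a 3x2 "NDVI" grid stored south-to-north
lat = np.array([-60.0, 0.0, 60.0])
ndvi = np.arange(6, dtype=float).reshape(3, 2)
fixed, fixed_lat = align_latitude(ndvi, lat)  # fixed_lat now starts at 60.0
```

Calling this after every resampling step (and asserting on the result) would have caught my upside-down Earth days earlier.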

Why did I miss it? Simple: I didn't want to do data engineering, so I skipped straight to the machine learning. But the reality is this: in practical ML work, getting the data right is the job.

Yes, academic research often uses curated datasets such as ImageNet, CIFAR, and SQuAD. But for a real project? You need to:

  1. Clean, align, normalize, and validate the data
  2. Debug strange edge cases
  3. Visually inspect intermediate data

And repeat all of this until the data is truly ready.
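Much of the validation step can be automated so it runs before every training job. Here is a minimal sketch of such a check; the array shapes are hypothetical, and the value-range check uses the fact that NDVI is defined on [-1, 1]:

```python
import numpy as np

def sanity_check(features: np.ndarray, targets: np.ndarray) -> None:
    """Cheap assertions that catch common data-preparation bugs early."""
    assert features.shape[0] == targets.shape[0], "sample counts differ"
    assert not np.isnan(features).any(), "NaNs left in features"
    assert not np.isnan(targets).any(), "NaNs left in targets"
    # NDVI lives on [-1, 1]; values outside that hint at a scaling bug
    assert targets.min() >= -1.0 and targets.max() <= 1.0, "targets out of range"

# Hypothetical toy data: 8 samples with 3 weather features, 1 NDVI target each
features = np.random.rand(8, 3)
targets = np.random.uniform(-1.0, 1.0, size=(8,))
sanity_check(features, targets)  # passes silently when the data looks sane
```

A handful of assertions like these is no substitute for visual inspection, but it turns "silent garbage in" into a loud failure at load time.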

I learned this the hard way, by skipping steps I thought my data didn't need. Don't do the same.

(Machine Learning) Research is a specific type of trial and error

From the outside, scientific advances always seem elegantly smooth:

Problem → Hypothesis → Experiment → Solution

But in reality, it is much messier. You make mistakes: some small, some worthy of a facepalm. (For example, an upside-down Earth.) That's fine. What matters is how you handle those mistakes.

Bad mistakes just happen. Insightful mistakes tell you something.

I now keep a simple lab notebook to help me learn from setbacks faster. Before running an experiment, I write down:

  1. My hypothesis
  2. What I expect to happen
  3. Why I expect it

Then, when the experimental results come back (often as "nope, didn't work"), I can look back at why it failed and what that says about my assumptions.
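My notebook is just an append-only text file, but the same pre-registration idea can be sketched in a few lines of code. The field names and the example entry below are hypothetical, not my actual notebook format:

```python
import datetime
import json
import tempfile
from pathlib import Path

def log_experiment(path: Path, hypothesis: str, expectation: str, reasoning: str) -> dict:
    """Append one pre-registered experiment entry to a JSON-lines notebook."""
    entry = {
        "time": datetime.datetime.now().isoformat(timespec="seconds"),
        "hypothesis": hypothesis,
        "expectation": expectation,
        "reasoning": reasoning,
        "result": None,  # filled in once the run has finished
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

# Hypothetical entry, written to a throwaway file for the example
notebook = Path(tempfile.mkdtemp()) / "notebook.jsonl"
entry = log_experiment(
    notebook,
    hypothesis="A larger patch size hurts NDVI prediction",
    expectation="Validation loss rises noticeably",
    reasoning="Coarser patches blur vegetation boundaries",
)
```

The crucial part is writing the entry before the run: once the results are in, it is too tempting to retrofit what you "expected".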

This turns errors into feedback, and feedback into learning. As the saying goes:

An expert is someone who has made all the mistakes that can be made in a very narrow field.

That's research.

Final Thoughts

Six and a half years in, I have realized that doing well in machine learning has little to do with flashy trends or tuning (large language) models. In hindsight, I think it comes down to:

  • Creating time and space for deep work
  • Choosing depth over hype
  • Taking data analysis seriously
  • Accepting the messiness of trial and error

Whether you're just starting out or already a few years in, these lessons are worth internalizing. They don't show up in conference keynotes, but they do show up as actual progress.


  • The Feynman quote is from the book Deep Work by Cal Newport
  • Churchill's quote exists in several variations, some with coffee, some with tea, all with poison


