In our Author Spotlight series, TDS editors chat with community members about career paths, writing, and sources of inspiration in data science and AI. Today we are pleased to share our conversation with Vyacheslav Efimov.
Vyacheslav is a senior machine learning engineer specializing in NLP and computer vision. One of his passions is creating educational content. Vyacheslav has published over 60 data science articles that explain complex concepts in simple terms and provide rich visualizations.
You've written many beginner-friendly and explanatory articles on TDS. Has teaching the fundamentals changed the way you design or debug real systems at work?
I have noticed a correlation: the more you teach something, the deeper your understanding of it becomes. In practice, when I write a new article, I try to go into the details while keeping things simple for my readers. Presenting information in this way gives me a deeper understanding of the algorithm's workflow.
In that sense, if you encounter an error in an algorithm you have written about in the past, you are more likely to find a solution quickly on your own. From another perspective, writing an article on an unfamiliar topic and researching it yourself increases your confidence in applying that particular algorithm at work, since you already know its scope, advantages, disadvantages, and specific details and limitations.
This allows you to come up with unique solutions that may not be obvious to others, and to justify your choices to teammates, managers, or stakeholders. That knowledge is valuable to me.
With so many new models coming out every day, it's easy to feel completely overwhelmed. How do you decide what's worth a “deep dive” and what's just “getting the gist”? Has your strategy for managing this changed a bit lately?
Today, a huge variety of models and tools are emerging every day. It's easy to feel lost when you don't know what to pursue next.
Since time is limited, I usually dig deeper into topics that I feel can be applied to work or personal projects. That way, I feel more confident when presenting or explaining my results.
Typically, companies want to see results as quickly as possible. This is also one of the reasons why I focus on theoretical concepts in my articles, since I don't have time to dig deep into theory at work.
In this way, I can effectively combine practical experience from work with theoretical insights from my blog. Both of these components are important for experienced data scientists.
Have you ever participated in an AI hackathon? What did you learn from having such tight deadlines? Did you need to get better at scoping or choosing a model for your project? Do you find yourself using these “hackathon lessons” when conceptualizing new ideas from scratch?
Hackathons typically last from a few hours to two days. That is a very short time in which to develop a fully functional product, so you have to prioritize which features to focus on. In general, time management is a valuable skill to have. If there are multiple ways to address a problem, you need to choose the one that best fits the business needs while also considering time constraints.
What's also great is that after each hackathon, you can evaluate yourself on how long it took you to implement a particular feature. For example, let's say you had to develop a RAG pipeline for the first time and it took about 4 hours to implement. The next time you face a similar problem at work or at a hackathon, you can accurately estimate in advance how much time it would take if you decide to use the same method. In that sense, hackathon experience allows you to better define time limits for the methods you want to implement in your project.
For me, the biggest lesson from the hackathon was not to focus on perfection when creating an MVP. An MVP is important, but you also need to present your product in a compelling way to clients and investors, explaining its business value, the problem it solves, and why your product is better than existing solutions on the market. In this regard, hackathons teach you to come up with better ideas that solve real problems, while at the same time quickly delivering an MVP with the most important features.
For readers considering a career path: your “Roadmap to becoming a data scientist” series ranges from basic to advanced ML. If you were to rewrite this series now, which topics would be promoted, demoted, or removed entirely, and why?
I wrote this article series a year ago. To me, all of the concepts and topics I listed are still relevant for aspiring data scientists. The mathematics, computer science, and machine learning topics covered there are all essential foundations for machine learning engineers.
As we are now in the second half of 2025, I would also add to the requirements at least minimal experience with prompt engineering and familiarity with generative AI tools such as GitHub Copilot, Gemini CLI, and Cursor, which can improve your work efficiency.
It should be noted that IT companies now have higher demands and expectations than in previous years for young engineers entering the data science field. This makes sense: modern AI tools can perform junior-level tasks so well that many companies prefer to rely on them over entry-level engineers, since they get the same results without having to pay a salary.
So, if a machine learning engineer has the strong basic skills discussed in that series of articles, it becomes much easier to tackle more complex topics autonomously.
Your background blends software engineering and ML. How does that foundation shape your writing?
Having strong software engineering skills is one of the biggest benefits you can get as a data scientist.
- You understand the importance of creating well-structured software documentation and reproducible ML pipelines.
- You have a better understanding of how to make your code clean and readable for others.
- You understand algorithm constraints and which data structures to choose for specific tasks based on the needs of your system.
- You can more easily collaborate with backend and DevOps engineers on integrating code modules.
- You don't have to rely on others to write SQL queries to get the data you need from your database.
The list could go on and on…
As for my articles, few of them contain large amounts of code. When they do, I always try to make the code easy for others to read and understand. I put myself in the reader's shoes and ask how the text and code examples would be perceived, and how easily they could be reproduced. My software engineering background has made this awareness even more important to me, and I follow established best practices when delivering the final product.
Looking at your portfolio and GitHub, you've combined software engineering fundamentals with ML from the beginning. What is one engineering habit you wish more aspiring data scientists would adopt early on?
Many engineers, especially juniors, tend to underestimate the importance of creating good documentation and reproducible pipelines. This has also happened to me in the past when I was focused on developing robust models and conducting research.
As it turned out, when I had to switch context and return to a previous project a few weeks later, I spent a lot of time figuring out how to run my old code in a messy Jupyter notebook or how to reinstall the required libraries. I should have spent a little more time back then on writing a well-documented README.md that explains all the steps required to run the pipeline from scratch.
Since it was almost impossible to rerun the pipeline from scratch, I also couldn't experiment with other input parameters, which made the situation even more frustrating.
It was a painful experience for me, but one of the most valuable lessons I learned. So, if I were to give advice to aspiring data scientists about one particular habit, it would be:
“Always make sure your machine learning pipeline is reusable and well-documented.”
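To make that habit more concrete, here is a minimal sketch of what a rerunnable pipeline could look like. It is not taken from Vyacheslav's projects: the `PipelineConfig` fields, the `run_pipeline` function, and the toy "pipeline" itself are hypothetical and only illustrate two ideas, keeping every input parameter in one place and saving the configuration next to the results.

```python
import json
import random
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class PipelineConfig:
    """Hypothetical config: all input parameters live in one place."""
    seed: int = 42
    n_samples: int = 100
    test_fraction: float = 0.2


def run_pipeline(config: PipelineConfig, output_dir: Path) -> dict:
    """Run a toy pipeline and store the config next to its results."""
    # A local, seeded generator: the same seed always yields the same data.
    rng = random.Random(config.seed)
    data = [rng.random() for _ in range(config.n_samples)]

    # Simple split into test and train parts.
    n_test = int(config.n_samples * config.test_fraction)
    train, test = data[n_test:], data[:n_test]
    if not train or not test:
        raise ValueError("n_samples and test_fraction must give non-empty splits")

    # Stand-in for real training and evaluation.
    results = {
        "train_mean": sum(train) / len(train),
        "test_mean": sum(test) / len(test),
    }

    # Saving the exact parameters makes the run repeatable weeks later.
    output_dir.mkdir(parents=True, exist_ok=True)
    (output_dir / "config.json").write_text(json.dumps(asdict(config), indent=2))
    (output_dir / "results.json").write_text(json.dumps(results, indent=2))
    return results
```

With a structure like this, trying other input parameters is a one-line change, for example `run_pipeline(PipelineConfig(seed=7), Path("runs/seed7"))`, and the README only needs to document a single entry point.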
Over the past year, has AI changed the way you work day-to-day as an ML engineer? What has become easier, what has become harder, and what has remained the same?
In recent years, we have observed a significant increase in powerful AI engineering tools.
- LLMs can answer almost any question, give advice, and find bugs in the software.
- Cursor, Lovable, and Bolt serve as AI-powered IDEs for developers.
- AI agents can complete multi-step tasks.
As a machine learning engineer, regularly adapting to these tools is essential to using them effectively.
What has become easier?
As of 2025, I see the following positive impacts on my work:
- It has become easy for me to quickly test ideas and prototypes. For example, at work I was sometimes given a computer vision problem that was outside the scope of my knowledge. In such cases, I could ask ChatGPT to suggest some ideas for solving the problem. There were times when ChatGPT generated code that I tried to run without understanding how it worked under the hood.
Then there were two possible cases:
- If the code ran successfully and solved the initial problem, I looked further into the OpenCV documentation to understand how it actually worked.
- If the code did not solve the problem, I either discarded it, reported the error to ChatGPT, or tried to find a solution myself.
As you can see, I was able to quickly test a working solution without taking any risks, saving hours of research time.
- Another great use case for me was pasting error messages directly into ChatGPT instead of searching the internet for a solution. Although this worked well most of the time, it occasionally struggled with errors related to library installation, system issues, and pipeline deployment in the cloud.
- Finally, I'm a big fan of AI hackathons. Having a tool that can generate both the frontend and backend of a system makes a big difference to me. You can now quickly prototype and test your MVP in hours. What I can now build during a one-day hackathon would previously have taken a full week of work.
What has become more difficult/riskier?
- When using AI to write code, there is a considerable risk that sensitive data will be exposed. Imagine you have a file or code fragment that contains important credentials, and you accidentally pass it to an AI model. A third-party tool then has access to those credentials. This can happen especially easily if you use tools such as Cursor and store credentials in a regular file rather than in .env. Therefore, you must always be extremely careful.
- Another risk is not properly testing AI-generated code and not knowing how to roll it back. AI tools can introduce subtle errors into your code, especially when used to modify or refactor existing code. To make sure AI-generated changes do not degrade your codebase, you must thoroughly review and test the generated portions, and save your changes in a way that lets you roll back to the previous correct version at any time.
- If you rely too much on generative AI tools, you run the risk of ending up with code that is unreadable, repetitive, full of overly long functions, or simply not working properly. Therefore, it's important to understand that AI tools work better for prototyping than for maintaining high-quality production code.
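As an illustration of the first risk above, the sketch below shows one common way to keep credentials out of source files that an AI tool might read: loading them from environment variables at runtime. The names `get_secret`, `mask`, `MissingSecretError`, and `EXAMPLE_API_KEY` are hypothetical, and the approach assumes the variables are set in your shell or in a .env file that is excluded from version control and from your tools' context.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required credential is not set in the environment."""


def get_secret(name: str) -> str:
    """Read a credential from the environment instead of from source code."""
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(
            f"Environment variable {name!r} is not set; "
            "define it in your shell or in an untracked .env file."
        )
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Return a log-safe version of a secret showing only its last characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Application code would then call something like `get_secret("EXAMPLE_API_KEY")` and log only `mask(...)` of the result, so the raw value never appears in the files, prompts, or logs you share with a third-party tool.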
What hasn't changed?
What remains consistent for me is the importance of understanding the internal workflow of the algorithms you use, maintaining a strong computer science foundation, and writing high-quality code, among other important skills. In other words, the basic principles of software development are always needed to use AI tools effectively.
In that sense, I like to compare the available AI tools to a junior developer on the team to whom you can delegate less important tasks. You can ask them anything you want, but there is no 100% guarantee that the task will be performed correctly. This is where having strong fundamental expertise becomes important.
To learn more about Vyacheslav's work and keep up with his latest articles, follow him on TDS or LinkedIn.
