How to learn the mathematics needed for machine learning



The maths behind machine learning can be a scary topic for many people.

Many of you want to work in machine learning, but the mathematics skills you need may seem overwhelming.

I'm here to tell you that it's not as intimidating as you think, and to give you a roadmap, resources, and advice on how to learn mathematics effectively.

Let's get into it!

Do you need maths for machine learning?

I'm often asked:

Do you need to know mathematics to work in machine learning?

The short answer is generally yes. But the depth and scope of the mathematics you need to know depend on the type of role you are aiming for.

Research-based roles require particularly strong mathematics skills. These include:

  • Research Engineer – engineers who run experiments based on research ideas.
  • Research Scientist – full-time researchers working on cutting-edge models.
  • Applied Research Scientist – somewhere between research and industry.

It also depends on the company you work for. If you are a machine learning engineer or data scientist at a tech company or lab such as:

  • DeepMind
  • Microsoft AI
  • Meta Research
  • Google Research

you will also need strong mathematical skills, as these companies operate much like university research labs.

In fact, most machine learning and AI research is now carried out in large corporations rather than universities, because of the cost of training models on large datasets, which can reach millions of pounds.

For the roles and companies I mentioned, your mathematics skills should be at least at the level of a bachelor's degree in a subject such as mathematics, physics, computer science, statistics, or engineering.

However, ideally you would have a master's degree or PhD in one of these subjects, as these degrees teach the research skills that research-based roles and companies require.

This may sound discouraging to some of you, but it is simply what the statistics show.

According to the 2021 Kaggle Machine Learning & Data Science Survey notebook, the research scientist role is especially common among respondents with a doctoral degree.


And generally, the higher your level of education, the more money you earn, and education level correlates with mathematics knowledge.


However, if you want to work in industry on production projects, you will need significantly less mathematics. Many of the people who work as machine learning engineers and data scientists do not have a “target” background.

This is because industry is less research-focused. The work is often about determining the best business strategy or decision and then implementing it in a machine learning model.

Sometimes we need a simple decision-making engine and machine learning is overkill.

High school mathematics knowledge is usually sufficient for these roles. Still, you may need to brush up on key areas, particularly for interviews or for specific specialisms such as reinforcement learning and time series.

To be honest, the majority of roles are in the industry, so the mathematics skills most people need are not at PhD or Master's level.

But I would be lying if I said these qualifications don't give you any advantage.

There are three core areas you need to know.

  • Statistics
  • Calculus
  • Linear algebra

Statistics

I may be a bit biased, but statistics is the most important area to know and the one most worth the effort of understanding.

Because most machine learning comes from statistical learning theory, learning statistics means essentially learning machine learning or its fundamentals.

These are the areas you need to study:

  • Descriptive statistics – Useful for general analysis and model diagnostics. The aim is to summarise and present the data in the best way.
    • Averages: mean, median, mode
    • Spread: standard deviation, variance, covariance
    • Plots: bar charts, line charts, pie charts, histograms, error bars
  • Probability distributions – These are at the heart of statistics, as a distribution describes how likely each outcome of an event is. There are many distributions, and I mean many, but you certainly don't need to learn all of them.
    • Normal
    • Binomial
    • Gamma
    • Log-normal
    • Poisson
    • Geometric
  • Probability theory – As I said earlier, machine learning is based on statistical learning, which comes from understanding how probability works. The most important concepts are:
    • Maximum likelihood estimation
    • Central limit theorem
    • Bayesian statistics
  • Hypothesis testing – Most real-world use cases of data and machine learning revolve around testing, whether you are testing your model in production or running A/B tests on your customers. Therefore, it is extremely important to understand how to run hypothesis tests.
    • Significance level
    • Z-test
    • T-test
    • Chi-square test
    • Sampling
  • Modelling and inference – Models such as linear regression, logistic regression, polynomial regression, and other regression algorithms originally came from statistics rather than machine learning.
    • Linear regression
    • Logistic regression
    • Polynomial regression
    • Model residuals
    • Model uncertainty
    • Generalised linear models
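
To make the hypothesis testing items a little more concrete, here is a minimal sketch of a two-sided, two-proportion z-test for an A/B test, using only the Python standard library. The function name `two_proportion_z_test` and the conversion numbers are made up for illustration, and the pooled-variance form is just one common textbook choice, not the only way to run such a test.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of "no difference"
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_error
    # Standard normal CDF evaluated at |z|, written with the error function
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical A/B test: 200/2000 conversions in control, 250/2000 in variant
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
alpha = 0.05  # significance level chosen before running the test
print(f"z = {z:.2f}, p-value = {p_value:.4f}, significant: {p_value < alpha}")
```

If the p-value falls below the significance level you chose in advance, you reject the null hypothesis that both groups convert at the same rate.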

Calculus

Most machine learning algorithms learn, in one way or another, through gradient descent. And gradient descent has its roots in calculus.

There are two main areas of calculus that you need to cover.

Differentiation

  • What are derivatives?
  • Derivatives of common functions.
  • Turning points: maxima, minima, and saddle points.
  • Partial derivatives and multivariable calculus.
  • Chain and product rules.
  • Convex vs. non-convex differentiable functions.
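
To show how derivatives connect to gradient descent, here is a minimal sketch that minimises a simple convex function, f(x) = (x - 3)^2, by repeatedly stepping against its derivative. The function names, learning rate, and number of steps are arbitrary choices for illustration; real models have many parameters and use partial derivatives instead of a single derivative.

```python
def loss(x):
    """A simple convex function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def loss_derivative(x):
    """Derivative of (x - 3)^2 with respect to x, obtained with the chain rule."""
    return 2.0 * (x - 3.0)


def numerical_derivative(func, x, h=1e-6):
    """Central finite difference, handy for checking a hand-derived derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)


def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly move x a small step in the direction of steepest descent."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


x_min = gradient_descent(loss_derivative, x0=0.0)
print(f"x_min = {x_min:.6f}, loss = {loss(x_min):.2e}")
print(f"derivative check at x = 1: {numerical_derivative(loss, 1.0):.4f}")
```

Because the function is convex, the single minimum found here is the global one. On non-convex functions, the same procedure can end up in a local minimum or stall near a saddle point.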

Integration

  • What is integration?
  • Integration by parts and by substitution.
  • Integration of common functions.
  • Area and volume integration.
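
Integration also links back to statistics: the probability that a continuous variable falls in a range is the area under its density curve. As a rough sketch, the code below approximates that area for a standard normal distribution with the trapezoidal rule. The helper names and the number of slices are my own choices for illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density function of a normal distribution."""
    scale = sigma * math.sqrt(2 * math.pi)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / scale


def trapezoid(func, a, b, n=10_000):
    """Approximate the integral of func over [a, b] using n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


# Area under the standard normal curve between -1.96 and 1.96
area = trapezoid(normal_pdf, -1.96, 1.96)
print(f"P(-1.96 < Z < 1.96) is approximately {area:.4f}")
```

The result should be close to 0.95, which is where the familiar 95% interval for a normal distribution comes from.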

Linear algebra

Linear algebra is used everywhere in machine learning, and especially in deep learning. Most models represent data and functions as matrices and vectors.

  • Vectors
    • What is a vector?
    • Magnitude, direction
    • Dot product
    • Cross product
    • Vector operations (addition, subtraction, etc.)
  • Matrices
    • What is a matrix?
    • Trace
    • Inverse
    • Transpose
    • Determinant
    • Dot product
    • Matrix decomposition
  • Eigenvalues and eigenvectors
    • Finding eigenvectors
    • Eigenvalue decomposition
    • Spectrum analysis
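
As a small illustration of these ideas, here is a sketch in plain Python with no libraries. It computes a linear model's predictions as a matrix-vector product, then estimates the dominant eigenvalue and eigenvector of a small symmetric matrix with power iteration. The data, weights, and helper names are invented for the example, and in practice you would reach for a library such as NumPy instead of hand-written loops.

```python
import math


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Matrix-vector product: one dot product per row of the matrix."""
    return [dot(row, vector) for row in matrix]


def power_iteration(matrix, steps=100):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix."""
    v = [1.0] + [0.0] * (len(matrix) - 1)  # arbitrary starting vector
    for _ in range(steps):
        w = matvec(matrix, v)
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]  # rescale to unit length
    eigenvalue = dot(v, matvec(matrix, v))  # Rayleigh quotient for a unit vector
    return eigenvalue, v


# Three data points with two features each, and a made-up weight vector
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [0.5, -1.0]
predictions = matvec(X, weights)
print("predictions:", predictions)

# A symmetric 2x2 matrix whose eigenvalues are 3 and 1
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A)
print(f"dominant eigenvalue is approximately {eigenvalue:.4f}")
print("eigenvector:", [round(x, 4) for x in eigenvector])
```

Power iteration is only one simple way to find an eigenvector, and it only recovers the dominant one. Full eigenvalue decomposition routines return all of them.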

There are plenty of resources and it really comes down to your learning style.

If you are after textbooks, you can't go wrong with the following, which cover pretty much everything you need:

  • Practical Statistics for Data Scientists – I always recommend this book, and for good reason. It is the only textbook you realistically need to learn statistics for data science and machine learning.
  • Mathematics for Machine Learning – As the name suggests, this textbook teaches the mathematics behind machine learning. Much of the material in this book may be overkill, but if you study all of it, your maths skills will be excellent.

If you prefer online courses, I have heard good things about the following courses.

Study advice

The amount of mathematics content you need to learn may seem overwhelming, but don't worry.

The main thing is to break it down in stages.

Choose one of three: Statistics, Linear Algebra, or Calculus.

Then look at what I wrote above about the things you need to know, and pick a resource. It doesn't have to be one of those I recommended above.

That's the first task. Don't overcomplicate it by searching for the “best resource”, because no such thing exists.

Next, start working through the resource, but don't just passively read the text or watch the videos.

Take active notes and document your understanding. I personally write blog posts, which essentially applies the Feynman technique, because I am “teaching” others what I know.

Blogging may be too much for some people, so make sure you have good notes, physically or digitally, in your own words, so that you can refer to them later.

The learning process is generally quite simple, and there is research on how to do it effectively. The key points are:

  • Do a little every day
  • Review old concepts frequently (spaced repetition)
  • Document your learning

It's all about the process. Follow it and you will learn!


Join my free newsletter, Dishing the Data, where I share weekly tips, insights, and advice from my experience as a practising machine learning engineer. Plus, as a subscriber, you will get my free data science/machine learning resume template!

Dishing the Data | Egor Howell | Substack
Advice and learnings on data science, tech and entrepreneurship. newsletter.egorhowell.com


