How to get into ML as a developer?
I see two variations of this question a lot:
- I'm a developer. I see that ML/AI is really popular these days. How do I become an ML specialist and learn how to train LLMs?
- I'm a SysOp. How do I learn about training neural networks and building LLMs?
The best answer for your career? Don't waste time on becoming an ML specialist. Don't learn how to train neural networks from scratch.
Everybody else will be doing that, going through courses like "Create LLM in PyTorch in 10 days" or "TensorFlow in 30 days". It feels like the intuitive way to get into the industry.
While starting data science and machine learning departments in companies, I've observed the following pattern. It is already fairly easy to hire data scientists or ML specialists. They cost money, but there is a steady supply of them. Everybody is getting into the field this way.
ML specialists can build a convincing prototype that the business really likes. However, here is what happens next. The prototype goes into production and suddenly strange things start happening: messy code, broken APIs, out-of-memory errors (OOMs), performance problems.
There is a big gap between building a prototype and developing a robust production system. Here are a few things that come as a surprise to ML-only teams:
- schema versioning (APIs and data schemas)
- code quality and patterns
- uptime
- throughput and latency
- deployments and scaling
- integration
- CI/CD and process automation
- A/B testing
- telemetry, monitoring, logging and observability
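To make two of these items concrete, here is a minimal sketch of schema versioning and latency telemetry wrapped around a model call. Everything in it is hypothetical: `fake_model`, the `schema_version` field, the `SUPPORTED_VERSIONS` set and the field rename between versions are illustrative assumptions, not part of any particular library or of a real system.

```python
import logging
import time

logger = logging.getLogger("inference")

# Hypothetical: request schema versions this service knows how to read.
SUPPORTED_VERSIONS = {1, 2}


def fake_model(text: str) -> float:
    """Stand-in for a real model call; returns a dummy score in [0, 1]."""
    return min(len(text) / 100.0, 1.0)


def parse_request(payload: dict) -> str:
    """Validate the schema version and extract the input text."""
    version = payload.get("schema_version")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported schema_version: {version!r}")
    # Assumed change between versions: v1 used "text", v2 renamed it to "input".
    key = "text" if version == 1 else "input"
    if key not in payload:
        raise ValueError(f"missing field {key!r} for schema_version {version}")
    return payload[key]


def handle(payload: dict) -> dict:
    """Run one prediction and record how long it took."""
    started = time.perf_counter()
    text = parse_request(payload)
    score = fake_model(text)
    latency_ms = (time.perf_counter() - started) * 1000.0
    logger.info("prediction served in %.2f ms", latency_ms)
    return {
        "schema_version": payload["schema_version"],
        "score": score,
        "latency_ms": latency_ms,
    }
```

The point is not the specific checks but that old clients keep working when the schema changes, and that every request leaves a trace you can monitor.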
Companies start scrambling at this point, trying to hire ML Engineers/ML Operations:
- people that don't necessarily know how to train a model;
- people that know enough about models to be able to integrate them into business systems (ML Engineers) and operate them in production (ML Ops);
- people that can support ML teams and teach them good engineering practices as needed.
The more people become ML-only specialists, the higher the demand for well-rounded engineers that can support these specialists.
So, long story short, if you are an engineer with experience in building or operating software systems, don't waste your talent and time on becoming an ML specialist from scratch. Just add a little bit of awareness of ML to make your skills immediately applicable in this domain.
You can start by investing your time in a more practical way:
- take an existing model and learn how to execute a request against it;
- put it behind an API of your choice;
- deploy all that inside a Docker container (ideally, try renting a GPU-based machine for a few hours);
- repeat that for a few different models, including LLMs;
- repeat that inside a Kubernetes cluster for extra bonus points;
- put all the related code on GitHub.
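The first two steps above can be sketched with nothing but the Python standard library. This is a minimal illustration, not a production setup: `predict` is a hypothetical stand-in for whatever model you pick, and the `/predict` path, the `{"text": ...}` request shape and the `make_server` helper are assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def predict(text: str) -> dict:
    """Hypothetical stand-in for a real model; swap in your own inference call."""
    words = text.split()
    return {"label": "long" if len(words) > 5 else "short", "words": len(words)}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self._send(404, {"error": "not found"})
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length))
            text = body["text"]
            if not isinstance(text, str):
                raise TypeError("text must be a string")
        except (ValueError, KeyError, TypeError):
            self._send(400, {"error": 'expected JSON body like {"text": "..."}'})
            return
        self._send(200, predict(text))

    def _send(self, status: int, payload: dict) -> None:
        """Serialize payload as JSON and write a complete HTTP response."""
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; use real logging in production


def make_server(host: str = "127.0.0.1", port: int = 8000) -> ThreadingHTTPServer:
    """Build the server without starting it, so the caller decides how to run it."""
    return ThreadingHTTPServer((host, port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the service; a POST to `/predict` with a JSON body such as `{"text": "hello world"}` returns the model output as JSON. When you move this into a container, bind to `0.0.0.0` instead of `127.0.0.1` so the port is reachable from outside.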
This will give you 70% of the skills needed to support ML teams in bringing their services to production.
This might not look like much, but I've been on the hiring side for this kind of role. It is nearly impossible to find somebody that can do that.
Also go through this checklist to catch up with LLMs from a practical standpoint: ChatGPT quickstart for developers
If you get there and need more guidance, don't hesitate to reach out to me in the newsletter comments.
Published: May 04, 2023.