Posts ML Devlog, Episode 1
Post
Cancel

ML Devlog, Episode 1

In my search to understand Machine Learning, I decided to start a devlog, where I will be logging my progress building a full ML stack from scratch using C++.
My choice of using C++ for this task might be dubious as most mainstream ML projects are built in python; plus, with the current build-fast/ship-fast startup model, more than once we are encouraged to not reinvent the wheel; and when we do, reinventing it in the most low-effort manner possible

After reading Andrej Karparthy’s “Yes you should understand backprop”, I realized that there is a lot of ML theory that I need to understand so that I can master the full breath of the field, before diving deep into the fancy libraries and cool research. I am pretty confident this field is going to explode in the next 5 years, and that a good grasp of the fundamentals will be critical in thriving in a post-AGI world.
So far, I have followed some very useful tuitorials on how to build a custom NN:

For the past two months, I have been learning, building and sharpening my skills and understanding of DNNs. Now, I believe it is time I put my skills and understanding to the test by writing and documenting a neural network in which I can fit a custom word2vec model.
From there, I plan to move-on and build popular DNN architectures such as LSTM, CNN, RNN, GRU, and BIGRU. Afterwards, I plan to build a video-captioning tool, similar to Karpathy’s Char NN.

First, I will need ot implement a good numpy-esque library that will fit my needs. I plan to build naively the first few versions of that library, then slowly optimize as I learn more.

This post is licensed under CC BY 4.0 by the author.