Sanity of mindset to explore AI/ Machine / Deep Learning things
Along with my many projects either at work or hobbies for exploring (let's say, try to learn) so many new AI/ML/DL (Artificial Intelligence, Machine Learning, Deep Learning) papers, algorithms, libraries, and tools popping up like the greens growing out of melting ice glaciers due to the rising temperature in polar zones. Mostly, I either use either Google CoLab (time limits in timeout) or my DrSnowbird's (my open source) jupyter ML container (at your machine no-time-out) creating many Jupyter notebooks for trying out many of those new ML/DL algorithms, frameworks, papers, etc. At some point, I will share those hundreds of Jupyter notebooks that I created or transcribed other papers or algorithms. Let me stay focus on the topic, for now, I am just like many of my fellow human engineers and scientists in trying to either keep up or grow my "state-of-the-art" knowledge/skills in AI/ML/DL to "try not to be obsolete" or perhaps, out of curiosity of "what's this new tools/algorithms actually about?".
Well, often I lost myself in the exploding galaxy of AI/ML/DL new things popping up daily and I always have to come back to frame my mind to normal sanity after a while in chasing the endless learning of new AI/ML/DL algorithms, papers, libraries. I eventually build up my "guidelines" to frame my "base mindset" to "explore" any of those new things not just ML/DL ones as my dumb-and-simple mindset to not to lost along my "exploration" ranging from NLP (natural language processing, e.g., language understanding, semantics, question, and answering, etc.), CV (computer vision, object-tracking / recognition, large-scale processing), Graph-based deep learning (to do similar to NLP sub-words (as Transformer does) and image learning in DL CNN algorithms to apply features to create abstraction as our human vision perception understanding to try to learn and predict pattern in networks e.g., social networks, messaging flows, or some call deep graph learning). Sorry, let me back to the topic again.
Including myself like many other fellow human scientists and engineers, I will first maintain the basic camp of knowledge such as algebra (equations, matrices), numerical analysis (your old pales like finite difference maths, Newton, 1st / 2nd order differential/derivatives), and basic probability concepts. When in doubt, you can use those online mathematics tools to break down the fancy complicated mathematical equations in the papers to plot it out or try it out with dump-and-simple just a few actual numbers in variables. The most important for me to learn is actually not to remember the equations in the algorithms or papers, I always try to understand the fancy equations in simple "common sense". For example, as Deep Learning commonly used "discrete item Entropy Loss", I will use a simple pen and paper with 3 number to actually try it manually. And, then, I will just build a simple python, java, Excel spreadsheet to try out with just a few numbers (human common is good in comprehending 3~5 numbers unless your brain's specialty more). There are many good blogs explaining AI/ML/DL terms, say entropy loss with even visual explanation. Still, once you can comprehend the "common sense" you can just drop the specific equations since you comprehend, again, "in common-sense interpretation" with the conceptual flow in your mind about how and why it works the way it should.
This principle is the most important in learning any algorithm besides AI/ML/DL algorithms. If you understand the common-sense concept of why and how it works, you will be able to go into the algorithms to change or customize them later when the algorithms are not working as you want to. For example, when applying some pre-trained Deep Learning models to your own data and it doesn't work as it should. Now, it is time to recall your "common sense understanding" as your weapon to figure out why and how to fix the problems. Knowing how it works and why it works or why it should not is the most important principle and not how to use ready-to-use code or ML/DL code you created. Yet another example, when your ML/DL algorithms don't converge or reach higher precisions after all your trials in changing hyperparameters, it is time to apply the basic understanding to each part of algorithms or computations and use your common-sense understanding to help you to figure out why not working.
Well, not meant to overwhelm some of you with terms, in short, I always use simple-and-dumb small examples with my knowledge of basic maths I already master (or know). Humans are very good at "induction thinking". Once you comprehend the "common sense" of how it really works, your amazing (as your human fellow engineers and scientists) will naturally induct it to generalize equations as any fancy greeks in those technical papers. With this simple mindset as 1st principal, at least, for me, I was able to ride through one-by-one new technical algorithms even though I can't remember all those fancy complex greeks and equations. Once you build the habit as I did use the "dumb-and-simple" first principal, you will not be intimidated by any of those new paper's fancy greek equations.
With that mindset and your basic camp of "foundation mathematics from your college or even high-school (depending on the things you try to comprehend)", the next principle for me is to "identify what this thing (new algorithm, technical paper) is really about". For example, there are many pre-trained DL models, before I invest my time to learn the details and coding in how to use it, I controlled my "impulse in trying to load it and learn how to use it". I will apply my simple first level buckets of AI/ML/DL catalog "type" buckets, e.g., what this "new thing" is about, e.g., NLP, Image, Data (general and broad) - binary, text (unstructured, semi-structured, fully structured (very raw!)), graph (network style, one thing having some link or relation to other, etc.), voice, signal stream (musical), video stream, etc. as my first level buckets. The next mental bucket is "what this new thing is about from ML/DL aspect". For example, the buckets like features processing, time-series related classification, precision, ML/DL of some kind of algorithms, etc., precision improvements of some type of ML/DL algorithms, turning of pre-trained, or application to some specific domains, etc. When in doubt, I use something like Google ML/DL online terms catalog to guide me to create my own multiple indexing tree with "buckets in label what the new thing is about".
Thirdly, I will ask myself, if possible, to identify when I should use or when not to, what the limitations of the new thing, what the assumptions are. So, this will help me to have some knowledge when explaining what and why suitable ML/DL algorithms solve my ML/DL problems.
In all, for me, I found that the above simple-and-dumb mindset to navigate the exploration of the ever-growing exponentially exploding sky of new AI/ML/DL algorithms, libraries, papers, etc. Actually, I apply these simple-and-dumb principles before starting my journey in exploring any other knowledge areas to avoid push myself into insanity. Some of you may already have a similar mindset in handling the exploration journey as I am doing facing daily. Hopefully, this will make you feel your journey more enjoyable in exploring new AI/ML/DL knowledge.
Cheer! Enjoy your journey in navigating through the new galaxy of exploding AI/ML/DL things.
DrSnowbird.