A big part of the utility of math (especially in ML) is having breadth rather than depth. The strategy of picking out specific things you don't know from papers and looking them up is only effective if you have the breadth in your background to understand the answers you find.

Broad knowledge is also what helps you manage the exponential tree of complexity you're encountering.

You won't have seen all the things you come across, but you'll develop the ability to make good judgements about what you need to read to achieve your goals. You'll learn how to recognize when a reference you're reading is more (or less) technical than you need, and how to search for something more appropriate. You'll also learn how and when you can use results without understanding the details.

Finally, as a general grad student strategy, trying to learn everything just in time is not a path to success. Even if you had the perfect math oracle you want, it would be setting you up to be left behind. All the oracle gives you is the ability to catch up quickly to the ideas of others. Your job as a grad student is to generate new knowledge, and to do that you need to seek things out on your own, not just follow the latest trend. Part of your job is to go out hunting for ideas that your peers haven't found yet and bring them back to your field.

AI doesn't need to follow the human model, just as planes don't need to flap their wings like birds. For most jobs, AI will be very different from humans. Even when AI acts human for entertainment, I would imagine it being very different internally, since its job is to mimic aspects of human behavior, not to be a human as a whole.

Almost all of machine learning is about representing data as vectors and performing linear and non-linear transformations in order to perform classification, regression, etc.
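As a rough illustration of that description, here is a minimal NumPy sketch of a tiny classifier's forward pass. The data, the layer sizes, and the random (untrained) weights are all made-up assumptions for the example; it only shows the vector-in, transform, class-out shape of the computation, not a trained model.

```python
import numpy as np

# Hypothetical toy data: 4 samples, each represented as a 3-dimensional vector.
X = np.array([
    [0.5, 1.0, -1.5],
    [2.0, 0.0, 0.3],
    [-1.0, 0.5, 0.8],
    [0.0, -2.0, 1.0],
])

# Randomly initialised (untrained) parameters, sizes chosen arbitrarily.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 5))  # linear map: 3 features -> 5 hidden units
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 2))  # linear map: 5 hidden units -> 2 class scores
b2 = np.zeros(2)

def relu(z):
    """Element-wise non-linearity."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

hidden = relu(X @ W1 + b1)         # linear transformation, then non-linear
probs = softmax(hidden @ W2 + b2)  # second linear transformation, then softmax
labels = probs.argmax(axis=1)      # predicted class index per sample
```

Training would consist of adjusting `W1`, `b1`, `W2`, and `b2` so that `labels` matches known targets.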

Most of ML is fitting models to data. To fit a model, you minimize some error measure as a function of its real-valued parameters, e.g. the weights of the connections in a neural network. The algorithms that do the minimization are based on gradient descent, which depends on derivatives, i.e. differential calculus.
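The fitting procedure described above can be sketched for the simplest case, a straight line fitted with plain gradient descent on mean squared error. The synthetic data, the learning rate, and the step count are assumptions chosen for illustration, and the gradients are written out by hand to show where the calculus comes in.

```python
import numpy as np

# Hypothetical noiseless data drawn from the line y = 3x + 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 0.5

# Model: y_hat = w * x + b, with two real-valued parameters.
w, b = 0.0, 0.0
lr = 0.1  # learning rate (step size), picked by hand for this toy problem

for _ in range(2000):
    err = w * x + b - y              # residuals of the current model
    # Error measure: E = mean(err ** 2). Its partial derivatives are:
    grad_w = 2.0 * np.mean(err * x)  # dE/dw
    grad_b = 2.0 * np.mean(err)      # dE/db
    # Step against the gradient to reduce the error.
    w -= lr * grad_w
    b -= lr * grad_b

mse = np.mean((w * x + b - y) ** 2)
```

After the loop, `w` and `b` should be close to the values used to generate the data. Deep learning libraries apply the same idea, but compute the derivatives automatically and usually update on small batches of data.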

Deep Learn - Implementation of research papers on Deep Learning, NLP, and CV in Python using Keras, TensorFlow and scikit-learn.

Machine Learning for Humans - Great article.

Kubeflow - Machine Learning Toolkit for Kubernetes.

TL-GAN: transparent latent-space GAN - Use supervised learning to illuminate the latent space of GAN for controlled generation and editing.

Grokking Deep Learning - Repository accompanying "Grokking Deep Learning" book.

Grenade - Deep Learning in Haskell.

Deep Learning Book Chapter Summaries - Attempting to make the Deep Learning Book easier to understand.

PracticalAI - Practical approach to learning machine learning.

RLgraph - Flexible computation graphs for deep reinforcement learning.

Nevergrad - Gradient-free optimization platform.

Convolution arithmetic - Technical report on convolution arithmetic in the context of deep learning.

FloydHub - Managed cloud platform for data scientists.

Style Transfer as Optimal Transport - Algorithm that transfers the distribution of visual characteristics, or style, of a reference image onto a subject image via an Optimal Transport plan.

Recommenders - Examples and best practices for building recommendation systems, provided as Jupyter notebooks.

AdaNet - Lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention.

DAWNBench - Benchmark suite for end-to-end deep learning training and inference.

Interpretable machine learning book (2018) - Explaining the decisions and behavior of machine learning models.

Kubeflow - Machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.

Machine Learning Feynman Experience - Collection of concepts I tried to implement using only Python, NumPy and SciPy on Google Colaboratory.

Tensor2Tensor - Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

Deep learning drizzle - Various ML, reinforcement learning video lectures.

Xfer - Transfer Learning library for Deep Neural Networks.

Learning to Discover Efficient Mathematical Identities - Exploring how machine learning techniques can be applied to the discovery of efficient mathematical identities.

CleverHans - Adversarial example library for constructing attacks, building defenses, and benchmarking both.

Google AI Research - Contains code released by Google AI Research.

Deploying Deep Learning - Training guide for inference and deep vision runtime library for NVIDIA DIGITS and Jetson Xavier/TX1/TX2.

fairseq - Sequence-to-sequence learning toolkit for Torch from Facebook AI Research tailored to Neural Machine Translation (NMT).

TinyFlow - Tutorial code on how to build your own Deep Learning System in 2k Lines.

Deep Learning Models - Collection of various deep learning architectures, models, and tips.

Multi-Level Intermediate Representation Overview - MLIR project aims to define a common intermediate representation (IR) that will unify the infrastructure required to execute high performance machine learning models in TensorFlow and similar ML frameworks.

PySparNN - Approximate Nearest Neighbor Search for Sparse Data in Python.

ICML - International Conference on Machine Learning.

Differentiation for Hackers - The goal of this handbook is to demystify algorithmic differentiation, the tool that underlies modern machine learning.

ML and DS Applications in Industry - Curated list of applied machine learning and data science notebooks and libraries across different industries.

HoloClean - Machine Learning System for Data Enrichment.

Snorkel - System for quickly generating training data with weak supervision.

RAdam - On The Variance Of The Adaptive Learning Rate And Beyond.