5 minutes of Data Science

September 12, 2022

Week 36

5 Minutes of Data Science - week 36

Highlights from September 05 to September 11

Foreword

Hi folks!

Second week of the new format. A quick recap: GitHub's trending repositories feature a lot of diffusion-related content. It is also exciting to see that DeepMind is exploring language models with confirmed veracity, that is, models that only output content that is true.

Enjoy and see you next week.

Pedro


Blogs

  • My journey from DeepMind intern to mentor, by DeepMind
  • In conversation with AI: building better language models, by DeepMind
  • Learning to Walk in the Wild from Terrain Semantics, by Google AI
  • A Multi-Axis Approach for Vision Transformer and MLP Models, by Google AI
  • Digitizing Smell: Using Molecular Maps to Understand Odor, by Google AI
  • Master’s student uses SURE opportunity to explore impact of machine learning, by Amazon Science
  • Automatically optimizing execution of dynamic tensor operations, by Amazon Science
  • Pinch-grasping robot handles items with precision, by Amazon Science
  • A quick guide to Amazon’s 40-plus papers at Interspeech 2022, by Amazon Science
  • Interspeech 2022, by Apple Machine Learning

Podcasts

  • Zero-Cost Proxies: How to find the best neural network without training (Ep. 201), by Data Science At Home
  • Fairness in e-Commerce Search, by Data Skeptic
  • Ryan Fedasiuk - Can the U.S. and China collaborate on AI safety?, by Towards Data Science
  • Licensing & automating creativity, by Practical AI
  • Understanding Collective Insect Communication with ML, w/ Orit Peleg - #590, by The TWIML AI

Reddit

  • Big brain time, at r/Data Science (💬69)
  • Happy meme Monday, at r/Data Science (💬32)
  • Here are the questions I was asked for my entry level DS job!, at r/Data Science (💬209)
  • [P] Simple fastai based face restoration project, GitHub link in comments., at r/Machine Learning (💬34)
  • [R] SIMPLERECON — 3D Reconstruction without 3D Convolutions — 73ms per frame !, at r/Machine Learning (💬24)
  • [P] pytorch's Newest nvFuser, on Stable Diffusion to make your favorite diffusion model sample 2.5 times faster (compared to full precision) and 1.5 times faster (compared to half-precision), at r/Machine Learning (💬13)
  • Fellow statisticians, how do you develop your reading comprehension in statistics? what are your learning strategies?, at r/Ask Statistics (💬17)
  • What is the best way to explain the difference between Standard Deviation and Mean Absolute Deviation?, at r/Ask Statistics (💬19)
  • Modeling for causal inference vs prediction, at r/Ask Statistics (💬29)
  • AI Turns my Drawings into Pure Art || Stable Diffusion Drawing App, at r/Latest in ML (💬0)
  • General Video Recognition with AI (How AI Understands Videos), at r/Latest in ML (💬1)

GitHub Jupyter Notebook trends

  • CompVis/stable-diffusion: A latent text-to-image diffusion model
  • openai/CLIP: Contrastive Language-Image Pretraining
  • alexeygrigorev/mlbookcamp-code: The code from the Machine Learning Bookcamp book and a free course based on the book
  • altryne/sd-webui-colab: A repo for the maintenance of the Colab version of stable-diffusion-webui repo
  • alembics/disco-diffusion
  • CompVis/latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models
  • meituan/YOLOv6: YOLOv6: a single-stage object detection framework dedicated to industrial applications.
  • CompVis/taming-transformers: Taming Transformers for High-Resolution Image Synthesis
  • rinongal/textual_inversion
  • Stability-AI/stability-sdk: SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
  • kaieye/2022-Machine-Learning-Specialization
  • dennybritz/reinforcement-learning: Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
  • rasbt/python-machine-learning-book-3rd-edition: The "Python Machine Learning (3rd edition)" book code repository
  • google-research/google-research: Google Research
  • DataTalksClub/mlops-zoomcamp: Free MLOps course from DataTalks.Club
  • fivethirtyeight/data: Data and code behind the articles and graphics at FiveThirtyEight
  • xuebinqin/DIS: This is the repo for our new project Highly Accurate Dichotomous Image Segmentation
  • ageron/handson-ml2: A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
  • WongKinYiu/yolov7: Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
  • jakevdp/PythonDataScienceHandbook: Python Data Science Handbook: full text in Jupyter Notebooks
  • nianticlabs/monodepth2: [ICCV 2019] Monocular depth estimation from a single image
  • MorvanZhou/PyTorch-Tutorial: Build your neural network easy and fast, 莫烦Python中文教学

GitHub Python trends

  • AUTOMATIC1111/stable-diffusion-webui: Stable Diffusion web UI
  • sd-webui/stable-diffusion-webui: Stable Diffusion web UI
  • xinntao/Real-ESRGAN: Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
  • hpcaitech/ColossalAI: Colossal-AI: A Unified Deep Learning System for Big Model Era
  • vinta/awesome-python: A curated list of awesome Python frameworks, libraries, software and resources
  • python-poetry/poetry: Python dependency management and packaging made easy.
  • TencentARC/GFPGAN: GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
  • karpathy/minGPT: A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
  • impira/docquery: An easy way to extract information from documents
  • microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
  • kakaobrain/coyo-dataset: COYO-700M: Large-scale Image-Text Pair Dataset
  • alibaba/EasyCV: An all-in-one toolkit for computer vision
  • tiangolo/fastapi: FastAPI framework, high performance, easy to learn, fast to code, ready for production
  • eloialonso/iris: Transformers are Sample Efficient World Models
  • geohot/tinygrad: You like pytorch? You like micrograd? You love tinygrad!
  • WZMIAOMIAO/deep-learning-for-image-processing: deep learning for image processing including classification and object-detection etc.
  • PaddlePaddle/PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
  • apache/airflow: Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

See you next week!
