D4S Sunday Briefing #157
A weekly newsletter with the latest developments in Data Science, Machine Learning, and Artificial Intelligence.
May 29, 2022
Welcome to the May 29th edition of the Sunday Briefing.
This week we’re on hiatus from blogging, but you can still catch up on our recent posts. First, we celebrate World Goth Day with the latest V4Sci post “Unknown Pleasures Plot: Recreating an iconic album cover and your favorite 80s t-shirt”. Over at Medium, we have “Exploring the Dow-Jones Industrial Average using Linear Regression”. You can also catch up on the latest post in the G4Sci series: “Prim’s Minimum Spanning Tree Algorithm: Finding the shortest path to every node”.
In our regularly scheduled content, we explore some Lessons Learned From Running Apache Airflow at Scale, the physics origins of the most important statistical ideas of recent times, and the end of Big Data.
On the more academic front, we explore belief propagation for permutations, rankings, and partial orders, how group mixing drives inequality in face-to-face gatherings, and whether or not prompt-based models are clueless.
This week’s ‘Data Science Book’ highlight is “The Practitioner’s Guide to Graph Data” by D. K. Gosnell and M. Broecheler. As always, you can find all the previous book recommendations on our website. In the video of the week, we have a lecture by Leslie Lamport: Thinking Above the Code.
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, just go ahead and forward this email to them. This will help us spread the word!
The D4S Team
The latest post on the Visualization for Data Science substack: Unknown Pleasures Plot: Recreating an iconic album cover and your favorite 80s t-shirt is now out. Don’t forget to Subscribe so you’re first in line to receive every post.
The latest post on the Graphs for Data Science substack: Prim’s Minimum Spanning Tree Algorithm: Finding the shortest path to every node is now out. You should Sign Up to make sure you never miss a post!
The latest post in the CoVID-19 series, ‘How to model the effects of vaccination’, takes a look at how simple modifications of the SIR model can help us better understand how vaccines work. As usual, all the code is available on GitHub: http://github.com/DataForScience/Epidemiology101
The latest post in the Causality series covers section 3.7, ‘Mediation’, a recipe to calculate the controlled direct effect. The code for each blog post in this series is hosted in a dedicated GitHub repository: https://github.com/DataForScience/Causality
Data Science Book:
This week’s Data Science Book is “The Practitioner’s Guide to Graph Data” by D. K. Gosnell and M. Broecheler. Graph Thinking and Graph Data are topics near and dear to our hearts here at D4Sci (check out G4Sci if you haven’t yet), and this book does an excellent job of introducing both fundamental and advanced topics and techniques using practical real-world datasets and state-of-the-art graph databases. The book is exceptionally well written and easy to follow, with “rules of thumb” generously sprinkled throughout, along with practical examples that you can use to grok the various concepts as they are introduced. A must-have for anyone interested in Graph Thinking and Graph Databases.
Tutorials and blog posts that came across our desk this week.
- Lessons Learned From Running Apache Airflow at Scale [shopify.engineering]
- Physics origins of the most important statistical ideas of recent times [science-memo.blogspot.com]
- (How to Write a (Lisp) Interpreter (in Python)) [norvig.com]
- Useful Python decorators for Data Scientists [bytepawn.com]
- The end of Big Data [benn.substack.com]
- Why unprecedented bird flu outbreaks sweeping the world are concerning scientists [nature.com]
- Complete Detailed Tutorial on Linear Regression in Python for Beginners [pub.towardsai.net]
- Artificial intelligence is breaking patent law [nature.com]
Fresh From The Press:
Some of the most interesting academic papers published recently
- Belief propagation for permutations, rankings, and partial orders (G. T. Cantwell, C. Moore)
- Group mixing drives inequality in face-to-face gatherings (M. Oliveira, F. Karimi, M. Zens, J. Schaible, M. Génois, M. Strohmaier)
- Addressing the socioeconomic divide in computational modeling for infectious diseases (M. Tizzoni, E. O. Nsoesie, L. Gauvin, M. Karsai, N. Perra, S. Bansal)
- Are Prompt-based Models Clueless? (P. Kavumba, R. Takahashi, Y. Oda)
- FLiB: Fair Link Prediction in Bipartite Network (P. Kansal, N. Kumar, S. Verma, K. Singh, P. Pouduval)
- Autonomous graph mining algorithm search with best performance trade-off (M. Yoon, T. Gervet, B. Hooi, C. Faloutsos)
- Neighbourhood matching creates realistic surrogate temporal networks (A. Longa, G. Cencetti, S. Lehmann, A. Passerini, B. Lepri)
Video of the Week:
Interesting discussions, ideas or tutorials that came across our desk.
Leslie Lamport: Thinking Above the Code
All the videos of the week are now available in our YouTube playlist.
Opportunities to learn from us:
- Jun 07, 2022 — Graphs and Network Algorithms for Everyone [Register] 🆕
- Jun 23, 2022 — Why and What If — Causal Analysis for Everyone [Register] 🆕
Long form tutorials:
- Natural Language Processing: 5.5h, covering basic and advanced techniques using NLTK and Keras
- Time Series Analysis for Everyone: 6h, covering data pre-processing, visualization, ARIMA, ARCH and Deep Learning models
Thank you for subscribing to our weekly newsletter with a quick overview of the world of Data Science and Machine Learning. Please share with your contacts to help us grow!
Publishes on Sunday.
Read all stories on Medium: https://bgoncalves.medium.com/membership