Picture this: generating web or phone apps is no longer a daunting task – you can simply describe your desired functionality in plain English and watch as lines of high-quality code are generated before your eyes. The ability to understand, learn and create code using cutting-edge code Generative AI (GenAI) tools has far-reaching implications, such […]
A Machine Learning Engineer’s Top 5 Predictions for the Future of Generative AI
What is GenAI? Generative AI (GenAI) empowers end-users to generate content, such as images and text, quickly and easily. Entrepreneurs are taking advantage of this technology to create a growing number of startups that utilize GenAI models for various aspects of content creation. In the coming year, we can expect to see a proliferation of […]
Staying Up-To-Date on AI/ML
Great Email Newsletters on AI/ML All newsletters are released weekly. Import AI – AI newsletter that summarizes recent news articles and research; I enjoy how honest and succinct this newsletter is; I also like that Jack, who is an advocate for improved ML model explicability and data […], always discusses the implications of new algorithms
Considerations for building a rules engine in Python
I recently looked into how to implement a deterministic rule-based model on batches of data in Python and was surprised by the complexity of the potential solutions I found. I want to implement a set of rules that trigger an alert when they are not obeyed. It is basically a framework for applying a glorified set of if-else/switch […]
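To make the idea in this excerpt concrete, here is a minimal sketch of a rule set applied to a batch of records, where each broken rule produces an alert. It is an illustration only, not the approach the post settles on: the `Rule` type, the `find_alerts` function, the example rules and the record fields (`amount`, `currency`) are all hypothetical.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: a name plus a check that returns True when obeyed."""
    name: str
    check: Callable[[dict], bool]


# Example rules, assumed purely for illustration.
RULES = [
    Rule("non_negative_amount", lambda record: record["amount"] >= 0),
    Rule("known_currency", lambda record: record["currency"] in {"USD", "EUR"}),
]


def find_alerts(batch, rules=RULES):
    """Return (record index, rule name) for every rule a record breaks."""
    alerts = []
    for index, record in enumerate(batch):
        for rule in rules:
            if not rule.check(record):
                alerts.append((index, rule.name))
    return alerts


batch = [
    {"amount": 10, "currency": "USD"},
    {"amount": -5, "currency": "GBP"},
]
print(find_alerts(batch))
```

Keeping the rules in a plain list means adding a rule is a one-line change, and the evaluation loop stays deterministic because rules run in list order.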
The next coding frontier: comparing Julia, Go & Rust with Python
Currently, Python is the dominant programming language of data science and machine learning and is popular for more general scripting. It’s pretty awesome compared to its predecessors like C/C++ and FORTRAN, due to its ease of use, flexibility and readability. Python also has an active and robust library culture after over 30 years of existence. However, […]
Integrating Both Python & R into Data Science Workflows
These days, I strongly prefer coding in Python over other languages that I previously used, like Matlab or R. However, I have always wondered when data science teams should use one programming language over another for certain tasks. If all team members know R and Python equally well and need to train a […]
Top 10 mistakes to avoid when using Hive/Impala on Hadoop
I recently took a deep dive into Hadoop for a project where I needed to automate the population of tables using JSONs and CSVs. Inevitably, I made some mistakes along the way and would like to share the lessons learned. By sharing them, I hope to save you some time! Here are 10 mistakes to […]
Software Engineering as a Data Scientist
Many of us in Data Science come from math, biology, chemistry, engineering or other non-Computer Science backgrounds, which may mean that we don’t have much experience writing and maintaining large code bases. Recently, I found myself getting frustrated with the structure of some of my code and searching for a better way to structure […]
Using Decorators in Python
In Python, decorators allow Data Scientists to extend and modify callables, such as functions, methods and classes, without explicitly changing the callable. Using decorators can improve the readability of your code as well as its flexibility and modularity. In this article, we’ll discuss why we would use decorators, how to implement decorators and give a few […]
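As a small illustration of extending a callable without changing it, the sketch below wraps a function so that each call is counted. The names `count_calls` and `add` are hypothetical and are not taken from the article.

```python
import functools


def count_calls(func):
    """Hypothetical decorator: counts calls without editing func itself."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.call_count += 1
        return func(*args, **kwargs)

    wrapper.call_count = 0
    return wrapper


@count_calls
def add(a, b):
    """Add two numbers."""
    return a + b


result = add(2, 3)
print(result, add.call_count)
```

The body of `add` is untouched; the counting behaviour lives entirely in the decorator, so it can be reused on any other function.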
Binder & Repl.it
I recently discovered two great tools for easily creating interactive coding environments without installing a thing. These tools facilitate sharing of code in multiple languages and are wonderful resources for demonstrating programming concepts when teaching a course. Binder The first tool, Binder, is open-source and was released in 2017. It is awesome because […]
Creating Projects from Cookiecutter Templates
Ever want to generate a new repo based on a predefined template? Now you can, using Cookiecutter! I will show you how to easily spin up a fresh Cookiecutter repo for your latest data science project in Python. Cookiecutter is an awesome command-line tool and Python package that creates projects (aka populates repo folders) based […]