Google Colab - a must-have tool for Developers and Data Scientists
- Valeria Aynbinder
- Data science, Coding
- 15 Oct, 2024
13 amazing features and usage tips.
What is Google Colab?
Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis and education.
Google Colab is based on Jupyter Notebook, a web application for creating “computational documents” (according to the Jupyter Notebook creators). In simple words, Jupyter Notebook lets you write a kind of streamlined document (or notebook) that includes code snippets, outputs, text, images and much more. In addition to reading and writing this beautifully formatted document, you can also execute its code and examine the outputs in the same document.
If you are unfamiliar with Jupyter Notebooks, please take a look at a Jupyter Notebook example (a picture is worth a thousand words…)
For those who are familiar with Jupyter Notebook, I would define Google Colab as: “hosted Jupyter Notebook on steroids” 💪
Google Colab is a 🆓 web application that is available to you with your regular Google account (just like Google Drive or Gmail).
Use cases — when you should consider using Google Colab
- Educational: if you are new to Python and are looking for a quick way to follow a tutorial, or to write and try simple Python code, Colab is a great option! All you need to do is open your browser and start coding; no installation is needed!
- Sandbox: even if you are an experienced developer, and you are working on a complex project, you may find Google Colab extremely useful if you just want to try unknown (and potentially heavy) packages without the need to install them on your local environment!
- Presentations / talks: if you need to give a talk that requires you to present and/or run Python code, Google Colab is your best friend! First of all, it lets you write and run your code seamlessly during the talk, in a very presentable way (check out my tips below). In addition, if you give hands-on sessions or courses for corporate clients, you probably know what a headache 🤕 it is to convince them to install something on their company’s laptops. Now imagine what it takes to convince corporate IT to install everything needed for a one-day deep learning seminar, for example. The installation requirements will probably include Python, pip, dozens of Python packages, Jupyter, etc. There is a good chance they will give up on your session rather than invest dozens of IT hours installing everything on the laptops of 30 employees. Since Colab runs everything needed for deep learning in the cloud through your browser, and the only thing you need is a Google account, I was able to conduct my courses easily in many big corporations (and their IT was happy and endlessly grateful).
- Data Science: if you are a data scientist, or if you are working on a personal project that involves data analysis and/or machine learning, not only will Google Colab provide you with a convenient and interactive environment with all the most popular packages pre-installed, but it will also provide you with a free GPU to train your models!
Main advantages
- No downloads or installations needed! Really, not at all! Google Colab is based on Python 3 and comes with the main data science, machine learning, deep learning, image processing and NLP packages pre-installed: numpy, pandas, sklearn, scipy, matplotlib, seaborn, tensorflow, pytorch, nltk, spacy, and many more are instantly available to you.
- You get compute resources free of charge! The resources you get are roughly as follows:
  - 12 GB of RAM
  - 100 GB of disk space
  - GPU/TPU ❗️
  Let’s admit it: not many of us have a GPU in our personal laptops, so if you work with heavy models, the ability to use a GPU is an incredibly important advantage!
- Collaboration: Google Colab allows you to write code in an interactive and convenient way, share it with others and collaborate on it, just as if it were a regular document.
13 amazing features and usage tips
- Files upload / download: Just click the folder 📁 icon in the left menu pane to access the files of the current runtime session. From here you can upload and download files as well:
After a simple drag-and-drop, your uploaded files are accessible from Python code by their path in the session's file system.
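As a rough sketch of that access: files uploaded through the file pane normally land in the session's working directory, `/content`. The snippet below assumes a hypothetical upload named `data.csv`; the helper `read_uploaded_csv` and its `base_dir` parameter are illustrative names, not part of Colab.

```python
import csv
from pathlib import Path


def read_uploaded_csv(filename, base_dir="/content"):
    """Return the rows of an uploaded CSV file as a list of dicts.

    base_dir defaults to /content, assumed here to be the directory
    where Colab places files uploaded through the file pane.
    """
    path = Path(base_dir) / filename
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical usage inside Colab, after uploading data.csv:
# rows = read_uploaded_csv("data.csv")
# print(rows[:3])
```

Any other file type works the same way: build the path under `/content` and open it with the library of your choice.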
Note that uploaded files are only available while the current session is active. After the session disconnects, all of them are lost. Hence I suggest mounting your Google Drive in Colab: files that your Python code writes to the mounted Drive folder are stored safely in your Drive.
- Mounting your Google Drive: Open the left-pane menu and click the Drive icon:
Follow the instructions to authorize access to your Drive. Within a few seconds, all the files from your Google Drive will be available to your Python code without the need to upload or download them:
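The same mount can also be requested from code with `google.colab.drive.mount`. The sketch below is a minimal example under a few assumptions: the mount point `/content/drive` and the `MyDrive` folder name follow Colab's usual defaults, and `mount_drive` and `drive_file` are hypothetical helper names. The import is guarded so the cell also runs outside Colab.

```python
from pathlib import Path

MOUNT_POINT = "/content/drive"  # assumed default mount location


def mount_drive(mount_point=MOUNT_POINT):
    """Mount Google Drive when running inside Colab.

    Returns True if the mount was requested, False outside Colab.
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)  # opens the authorization prompt
    return True


def drive_file(relative_path, mount_point=MOUNT_POINT):
    """Build the path of a file stored in the mounted Drive."""
    return Path(mount_point) / "MyDrive" / relative_path


# Hypothetical usage inside Colab:
# if mount_drive():
#     print(drive_file("datasets/train.csv").read_text()[:200])
```

Writing results to a path returned by `drive_file` keeps them after the session ends, unlike files left in `/content`.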
- Interactive tables for data analysis: Interactive tables allow you to view / filter / sort your tabular data not only from code, but also using interactive user-friendly UI:
Just click on the icon under the table, and you’ll get a full search, filter and sort functionality:
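Interactive tables can also be switched on for the whole notebook from code. The sketch below uses `data_table.enable_dataframe_formatter()` from the `google.colab` package; the wrapper name `enable_interactive_tables` is a hypothetical helper, and the guarded import simply makes the cell harmless outside Colab.

```python
def enable_interactive_tables():
    """Turn on interactive rendering of pandas DataFrames in Colab.

    Returns True inside Colab, False when google.colab is unavailable.
    """
    try:
        from google.colab import data_table  # only importable inside Colab
    except ImportError:
        return False
    data_table.enable_dataframe_formatter()
    return True


# After this call, a pandas DataFrame shown as the last expression
# of a cell is rendered as an interactive table in Colab.
interactive = enable_interactive_tables()
print("Interactive tables enabled:", interactive)
```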
- Variable Inspector: Try the Colab Variable Inspector (click the x icon in the left-pane menu). It displays all the variables currently defined in your notebook and allows you to examine them more closely. The variables are automatically updated, added and removed as you execute your code.
- Notebook collaboration with comments: Comments are available for each code cell in a Colab notebook, and the functionality is similar to that of a regular Google Doc. Great collaboration feature!
- Playground Mode: 🤾‍♀️ In Playground Mode, changes you make to the notebook are not saved. Very useful for presentations and for playing with the code! Access Playground Mode from the upper menu: File => Open in playground mode
Now you can safely play with your code without changing the original notebook.
- Table of contents: Use the automatically created table of contents for your notebook, and you’ll never get lost as the notebook gets longer and longer. Click the top icon in the left-pane menu, and you’ll get a navigable table of contents based on the section headings and titles you wrote.
- Forms: Forms allow you to get inputs from users in an interactive and user-friendly way, store their input values in variables and continue working with these variables in your code. The input forms in your notebook will look like this:
The following code will create the first form presented above:
Check out the official Forms Colab for more options and code snippets.
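For a feel of the syntax: a form cell is ordinary Python annotated with `#@title` and `#@param` comments, which Colab turns into input widgets. The sketch below uses hypothetical field names and values and is not necessarily the form shown in the screenshot above.

```python
#@title Training settings (hypothetical example form)
model_name = "baseline"  #@param {type:"string"}
learning_rate = 0.001  #@param {type:"number"}
optimizer = "adam"  #@param ["adam", "sgd", "rmsprop"]
use_augmentation = True  #@param {type:"boolean"}

# The form fields are ordinary Python variables,
# so the rest of the notebook can use them directly.
print(
    f"Training {model_name} with {optimizer}, "
    f"lr={learning_rate}, augmentation={use_augmentation}"
)
```

Outside Colab the annotations are plain comments, so the same cell still runs with the default values.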
- GitHub integration: You can easily save all your notebooks to your GitHub account. In the upper menu, navigate to File => Save a copy in GitHub, and after a short authentication process you will be able to sync your Colab notebooks with your GitHub:
- Useful Magics: Magic commands in Jupyter notebooks are special commands that add extra functionality to your notebook. They are written in a regular code cell, and the command name is preceded by % (applies to a single line) or %% (applies to the whole cell). Below are a couple of very useful magic commands I use on a daily basis:
- %%time: displays how long the cell's code takes to run
- %%html: renders the HTML code written in the cell
- %who: displays currently defined variables
You can find more magic commands in the Jupyter documentation. Take a look, there are many interesting magics!
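Magic commands only work inside IPython-based notebooks such as Colab; there, `%%time` goes on the first line of a cell, above the code to be timed. For intuition about what it reports, the sketch below measures comparable wall-clock and CPU times in plain Python. The helper `timed` is hypothetical and only approximates the magic; it is not how IPython implements it.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


total, wall, cpu = timed(sum, range(1_000_000))
print(f"sum={total}  wall={wall:.4f}s  cpu={cpu:.4f}s")
```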
- Running regular terminal commands: If you need to run a good old terminal command like ls, cp, pip install, or whatnot, you can do it right from a regular code cell by prefixing the command with an exclamation mark (!).
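In a Colab cell this looks like `!ls -l /content` or `!pip install <package>`. The `!` prefix is notebook syntax rather than Python, so for code that should also run as a plain script, a rough equivalent is the standard `subprocess` module. The helper `run_command` below is a hypothetical name used for illustration.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments and return its stdout."""
    completed = subprocess.run(
        args, capture_output=True, text=True, check=True
    )
    return completed.stdout


# Roughly what `!python --version` would show in a notebook cell.
print(run_command([sys.executable, "--version"]))
```

With `check=True`, a failing command raises `subprocess.CalledProcessError` instead of failing silently.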
- Use GPU / TPU for 🆓: Enable GPU/TPU for your current session from the top menu: Edit => Notebook settings
You should avoid using a GPU unless you really need one, since the GPU resources available to you are limited, though you usually get enough for personal use. Read more about GPU resource limits here.
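To confirm that the session really has a GPU attached, one option is to query the `nvidia-smi` tool, which is normally present on NVIDIA GPU runtimes. This is a minimal sketch using only the standard library; `gpu_summary` is a hypothetical helper, and it assumes an NVIDIA GPU (it says nothing about TPUs).

```python
import shutil
import subprocess


def gpu_summary():
    """Return a short description of the attached NVIDIA GPU(s), or None."""
    if shutil.which("nvidia-smi") is None:
        return None  # tool not installed: most likely a CPU-only runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # tool present but no usable GPU
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected; running on CPU.")
```

Checking first makes it easy to fall back to the CPU and save the limited GPU quota for the runs that need it.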
- Command Palette: There are many more great features and options in Google Colab. Check them out in the Command Palette. In the top menu, go to Tools => Command palette, and continue exploring!
I hope you found something new and useful in this article.
Thanks for reading!