Method 1: Using Google Account
You can register a DataCabinet account using your Google account at datacabinet systems. The first time you log in or register, it takes about 5 minutes to provision your account, after which you can log in using the same method you registered with.
After login, your hard drive is not available if another session is still running. You can force the hard drive to connect to your current session with a button that also reminds you when another session is running. Clicking it disconnects the other session. When this button is green, the hard drive is correctly attached to the current session. After your trial runs out, you will need to enter an access code to run Jupyter notebooks.
Go to datacabinet systems to log in. You can log in either with your DataCabinet credentials or with your Google account.
A DataCabinet project consists of a set of user code files and one conda environment. The conda environment can contain package dependencies (such as tensorflow or keras), notebook extensions (nbgrader, nbpresent), language kernels for notebooks, and so on.
To add a project, first sign in using your credentials or Google account. Then choose a project name and a password. Environments come with a default Python version, but you can choose the version you prefer. For other languages, install the corresponding Jupyter kernel; most of them are available as PyPI or conda packages. See install packages for details.
Launching a notebook is easy: click the play button in the project manager. It prompts you for the password that was entered when the project was created (added or imported).
The notebook is the UI used to access projects in DataCabinet. A DataCabinet project consists of:
- User notebooks: With notebooks you can create many kinds of content, from simple notebooks to math-heavy presentations and autograded assignments. To get started with notebooks, go to Notebook Basics. To create a notebook, click the New button on the Files tab. Notebooks are composed of cells. A cell is a text area that is run as one unit. Some things worth noting about cells:
  - A cell has a type (code or markdown). Code cells can be run, whereas markdown cells only render rich text in the browser. You can set the cell type using Cell -> Cell Type in the main menu bar.
  - A cell can have its own toolbar, which is displayed on top of the cell. You can use it to edit the cell metadata or to set the assignment cell type.
  - Markdown cells have an edit mode and a display mode. Pressing "Run" switches from edit mode to display mode, and double-clicking switches from display mode back to edit mode.
- Integrated terminal: Launch a terminal using New -> Terminal.
- A default kernel: Every DataCabinet project has a kernel named after the project itself. This kernel is unique to the project and uses the Python version that was chosen when the project was built. More about kernels:
  - A kernel is the process that takes the code from a cell and runs it incrementally using a REPL (Read-Evaluate-Print Loop).
  - Kernels are available for various languages in the notebook (even for compiled languages like C++ and Java). The best thing about a kernel is that it keeps the values of objects for as long as it is not interrupted or restarted. See packages for details.
  - You can associate environment variables with kernels. Read more about kernels here: Kernels
- A HOME directory: You can keep any type of file in the HOME directory.
- User packages
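The kernel behaviour described in the list above (values persist between cells until the kernel is restarted) can be illustrated with a small sketch. This is not how a Jupyter kernel is implemented: `run_cell` and `kernel_state` are hypothetical names, and a plain dictionary stands in for the kernel's memory.

```python
def run_cell(source, namespace):
    """Run one cell's source code inside the given namespace."""
    exec(source, namespace)

# Hypothetical stand-in for a running kernel: one namespace shared by all cells.
kernel_state = {}
run_cell("x = 21", kernel_state)
run_cell("y = x * 2", kernel_state)  # a later cell can still see x
y_value = kernel_state["y"]
print(y_value)  # 42

# A restart is modelled here as starting over with an empty namespace.
kernel_state = {}
restart_failed = False
try:
    run_cell("y = x * 2", kernel_state)  # x no longer exists
except NameError:
    restart_failed = True
print(restart_failed)  # True
```

In a real notebook the same effect shows up when a cell that depends on an earlier cell is run right after a kernel restart.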
Every project you create in DataCabinet has a corresponding Conda environment with the same name as the project.
You can install additional packages for a project using either conda install packagename or pip install packagename.
On the DataCabinet platform, some packages interfere with the platform's functioning and should not be pip installed (Jupyter, zmq, etc.). Sometimes these are pulled in as dependencies as well. For example, the conda r-essentials package has Jupyter as a dependency. If a wrong version of one of these packages is installed and Jupyter does not start, you may lose direct access to the project. It is a good idea to export your project at regular intervals so that you can retrieve it. Please write to us if you get into this situation and we can fix it. We are working on a way to fix it properly through versions.
Note: pip packages installed from source cannot be exported. See Export Project for how to handle these.
For example, to install the numpy package, run conda install numpy or pip install numpy for the project.
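After installing a package such as numpy, one quick way to confirm from a notebook cell that the project's kernel can see it is to ask Python whether the module can be found. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of DataCabinet.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found by the current kernel."""
    return importlib.util.find_spec(module_name) is not None

# Prints True only if numpy is available in this project's environment.
print(is_installed("numpy"))
```

If this prints False right after an install, restarting the kernel and checking that the install ran in the project's own environment are reasonable first steps.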
DataCabinet allows you to export your project at any point. A common workflow is to create a project, write some code, install some packages and do some more initialization steps. At this point, you may want to export the project to share it with other people. The export command is built to replicate the current state of your project for anyone.
For Python projects installed using pip from sources (using pip install -e <dir>), you need to make sure the directory is not exported by listing it in a .export_ignore file. The name of each such package needs to be specified on its own line in the .export_ignore file.
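As a rough sketch of what such a file could look like, the snippet below writes a .export_ignore file with one package name per line. The package names are made up, and placing the file in the project's base directory is an assumption; the text above does not say where the file has to live.

```python
import tempfile
from pathlib import Path

def write_export_ignore(project_dir, package_names):
    """Write one package name per line to <project_dir>/.export_ignore."""
    ignore_file = Path(project_dir) / ".export_ignore"
    ignore_file.write_text("".join(name + "\n" for name in package_names))
    return ignore_file

# Demo in a throwaway directory; "mypackage" and "otherpackage" are made-up names.
with tempfile.TemporaryDirectory() as demo_dir:
    path = write_export_ignore(demo_dir, ["mypackage", "otherpackage"])
    print(path.read_text())
```

The same file can of course be created by hand in any text editor; the point is only the one-name-per-line layout.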
Also, please do not use any conda channel other than the default channel. Projects that use other channels will not import properly.
Exporting a project exports the code (and other files) and the conda/pip packages. You can also write a script that does further setup when a user imports the project. See more about this script in the import section.
Other users can import that project (using the share ID provided) and get a copy of the project exactly at the point the original user exported it.
You can share this ID with anyone you want to share the project with.
To export a project, you need to log in using your credentials or your Google account.
Import project takes a project ID and does the following:
- Copies all the files
- Installs all packages
- If there is an executable file called import_init.sh in the base directory of the project, it runs that file.
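The last step above can be pictured with the following sketch. It is an illustration of the described behaviour, not DataCabinet's actual import code: `run_import_init` is a hypothetical function, and running the script through `sh` from the project directory is an assumption.

```python
import os
import subprocess
import tempfile
from pathlib import Path

def run_import_init(project_dir):
    """Run import_init.sh from the project's base directory if it is executable.

    Returns True if the script was run, False if it was skipped.
    """
    script = (Path(project_dir) / "import_init.sh").resolve()
    if script.is_file() and os.access(script, os.X_OK):
        # Assumption: the script is a shell script and is run from the project directory.
        subprocess.run(["sh", str(script)], cwd=project_dir, check=True)
        return True
    return False

# Demo: a freshly created directory has no import_init.sh, so nothing is run.
with tempfile.TemporaryDirectory() as demo_dir:
    print(run_import_init(demo_dir))  # False
```

A practical consequence is that the script needs its executable bit set (for example with chmod +x import_init.sh) before the project is exported, otherwise it would be skipped under this reading.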
There are two ways to import a project from someone else:
Method 1: Directly using the project ID.
Get the ID of the project from the sharer and use the import button. You will have to choose a name for the project and a password for the Jupyter notebook in Step 2. The import project button is disabled until the hard drive is connected. There is currently a regression where, after you click the "Import Project" button, the dialog closes but the project tile does not appear for about 20 seconds. Importing twice will lead to an error, so please wait 20 seconds after importing.
When you finish writing code in the created file, or just want to check your current progress, go to Cell > Run Cells. The result will be displayed in the new dialog.
DataCabinet provides you with 2 GB of space on the NFS share, which allows you to publish your code using Unix authorization mechanisms.