Working in a Session¶
Sessions combine interactive data science tools with packages and compute resources. Sessions are perfect for iterative analytical work, such as exploratory data analysis or feature engineering. The Platform currently supports Jupyter and RStudio, with Zeppelin coming soon.
When you launch a session, you may select from the default set of
environments created by your administrator. You can install additional
libraries once inside a session, just like you would on a regular laptop
(for example, using pip in Python). To learn more, see the Environments and Dependencies page.
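As a minimal sketch of installing an extra library from inside a running Python session, the snippet below invokes pip through the interpreter that is executing the code, so the package is installed for that same interpreter. The helper names `pip_install_command` and `install_package` and the example package `requests` are illustrative assumptions, not part of the Platform.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the argv list that installs `package` with the running interpreter's pip."""
    # `sys.executable -m pip` targets the same Python that runs this code.
    return [sys.executable, "-m", "pip", "install", package]


def install_package(package):
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # Only print the command here; call install_package("requests") to run it.
    # "requests" is just an example package name.
    print(" ".join(pip_install_command("requests")))
```

In a Jupyter notebook you would typically call `install_package(...)` in a cell and then import the library in a later cell.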
The DataScience.com Platform currently supports two interactive session tools:
- Jupyter: Jupyter is a staple in the Python open source data community, and it also has kernels for R and many other languages. For more resources, see the Project Jupyter community page.
- RStudio: RStudio is a fully featured development environment primarily for R programmers. The DataScience.com Platform supports the open source version of RStudio. For more information, see the RStudio documentation.
Launch a session¶
To start a session, select “Launch a Session” from the project actions button, then configure the following options:
- Branch: Choose the branch of your repo to work on. The files from the most recent commit on that branch will be available in the session.
- Name: Optionally name the session to help you keep track of multiple concurrent sessions.
- Tool: Choose an interactive tool to use in your session.
- Compute Resource: Select from a list of machine sizes specified by your administrator.
- Environment: Choose a set of pre-installed libraries. For more on environments, see the Environments and Dependencies page.
- Additional Requirements: Install additional dependencies at runtime from a text file. For more on additional requirements, see the Environments and Dependencies page.
You can navigate back to a running session from your project’s Activity tab or from the Running Resources menu.
As in a traditional Git workflow on a personal computer, a session clones from a branch, the Sync menu stages your changes automatically, and you push those changes back to the Git remote with a commit message.
After you’ve made some changes to your files in a session, you can save them by syncing back to the Git repo. In the Platform bar at the top of your session, open the Session menu and select Sync.
On the Sync menu, you’ll see which files have been added, deleted, or modified. Using the checkboxes, you can select which files you would like to sync. You can enter an optional commit message and then sync your changes back to the Git repo.
Be mindful of file sizes. Most Git providers limit the size of the files you can store. For example, GitHub limits files to 100 MB. The DataScience.com Platform web app also has an upload/download limit of 200 MB, which affects downloading files from the Jupyter file browser.
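One way to catch oversized files before you sync is to scan the working directory from inside the session. The following is a rough sketch, not a Platform feature: the function name `find_large_files` is hypothetical, the two limits are the figures quoted above, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import os

MB = 1024 * 1024  # assumption: "MB" is treated as 1024 * 1024 bytes
GIT_FILE_LIMIT = 100 * MB           # example GitHub per-file limit quoted above
PLATFORM_TRANSFER_LIMIT = 200 * MB  # web app upload/download limit quoted above


def find_large_files(root, limit_bytes):
    """Return sorted (relative_path, size_in_bytes) pairs for files larger than limit_bytes."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow or measure symlinks
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((os.path.relpath(path, root), size))
    return sorted(large)


if __name__ == "__main__":
    for rel_path, size in find_large_files(".", GIT_FILE_LIMIT):
        print(f"{rel_path}: {size / MB:.1f} MB exceeds the Git file limit")
```

Files reported here are candidates to leave unchecked in the Sync menu or to store outside the repo.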
If the file changes you’ve made don’t conflict with changes your team has made since you started your session, the Platform will push all your files as a new commit to the active branch.
If there are conflicts, you’ll have two choices:
- Cancel: this option reverts your Git status back to the moment you hit Sync. You may keep working and manually resolve conflicts using the Jupyter or RStudio file editors.
- Create Branch: this option creates a new branch and pushes your changes to that branch. The parent of the new branch will be the commit that was originally loaded into your session.
Git commands behind the scenes¶
Below are the exact commands that run for each Sync feature.
Loading the Sync menu:
git add .
git commit -m <message you provide>
git fetch
git merge <branch you chose when launching> --no-commit --no-ff
Cancelling a Sync after a conflict:
Creating a new branch after a conflict:
git branch <name you provide>
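To make the order of operations concrete, here is a rough Python sketch that assembles the first command sequence listed above (add, commit, fetch, merge) as argument lists and can optionally run them with `subprocess`. The function names `sync_commands` and `run_commands` are hypothetical, as are the example commit message and branch name. The Platform runs these commands for you, and this sketch does not reproduce its conflict handling.

```python
import shlex
import subprocess


def sync_commands(message, branch):
    """Return the Git commands listed above, in order, as argv lists."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "fetch"],
        # --no-commit stops before recording a merge commit; --no-ff rules out a fast-forward.
        ["git", "merge", branch, "--no-commit", "--no-ff"],
    ]


def run_commands(commands, repo_dir):
    """Run each command in repo_dir, stopping at the first failure.

    Returns the failing argv list, or None if every command succeeded.
    """
    for argv in commands:
        result = subprocess.run(argv, cwd=repo_dir)
        if result.returncode != 0:
            return argv  # for example, a merge that reports conflicts
    return None


if __name__ == "__main__":
    # Print the commands rather than running them against a real repo.
    for argv in sync_commands("Update analysis notebook", "master"):
        print(shlex.join(argv))
```

Note that `git commit` exits with a non-zero status when there is nothing to commit, so `run_commands` would stop at that step in a clean working tree.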
Shut down a session¶
A session will run and consume compute resources until you stop it. To shut down a session, navigate to the Session menu in the top bar and select Shutdown.
You can’t recover unsaved changes from a session after shutting down. If you want to save the work you have done, make sure to sync your files before shutting down.