Best Practices: Migrating Existing Work¶
In this article you will learn how to migrate existing work onto the Platform. There are three different example cases:
- Case 1: You want to move existing GitHub/GitLab/Bitbucket repositories onto the Platform
- Case 2: You want to copy files from your local environment into an existing project on the Platform
- Case 3: Your work is not in a version-controlled repository. Where do you start?
Moving Existing GitHub/GitLab/Bitbucket Repositories onto the Platform¶
This is the easiest case. Make sure that under Settings (accessible via your avatar drop-down in the top right) you have your Git provider credentials in place.
Once you have verified your credentials, go back to the Projects page, click New Project, choose your Git provider, and enter the name of the repo you want to migrate to the Platform.
Copying Files from Your Local Environment into a Project¶
If you have a notebook or a script on your laptop that you want to move into an existing project, there are two methods you can follow:
- Method 1: Clone the project's repository to your machine, git add the files, git commit them, and push your branch to the remote. Then open a session on the Platform under that project; the new files should be accessible there.
- Method 2: You can add files to your project by using the Upload button within your Jupyter session.
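The git steps in Method 1 can be sketched as a short Python helper. This is only an illustration under stated assumptions: the repository is already cloned, the files are already copied into the clone, the remote is named origin, and the file names, branch name, and commit message are hypothetical. The first function only builds the commands so they can be reviewed before anything is run.

```python
import shlex
import subprocess
from pathlib import Path


def build_git_commands(files, branch, message):
    """Return the git commands for Method 1 as argument lists (nothing is run)."""
    return [
        ["git", "checkout", "-b", branch],        # create a working branch
        ["git", "add", "--", *map(str, files)],   # stage the copied files
        ["git", "commit", "-m", message],         # record them in history
        ["git", "push", "-u", "origin", branch],  # publish the branch (assumes remote "origin")
    ]


def run_git_commands(commands, repo_dir):
    """Run each command inside the local clone, stopping at the first failure."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        subprocess.run(cmd, cwd=repo_dir, check=True)


# Hypothetical file names, branch name, and commit message.
commands = build_git_commands(
    files=[Path("analysis.ipynb"), Path("train.py")],
    branch="add-local-work",
    message="Add notebook and training script",
)
for cmd in commands:
    print(shlex.join(cmd))  # review the commands before running them
```

Once the printed commands look right, calling run_git_commands(commands, repo_dir) with the path of the local clone executes them in order.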
If you have multiple files that you want to move to an existing project, create a file archive (tar) and upload it to the Platform. From a Python Jupyter notebook, enter the following command to unpack the file:
!tar -xvf filename.tar
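The archive itself has to be created on your local machine before you upload it. One way to do that, sketched here with Python's standard tarfile module, is a small helper; the function name make_archive and its parameters are illustrative, not part of the Platform.

```python
import tarfile
from pathlib import Path


def make_archive(paths, archive_name, compress=False):
    """Bundle files or folders into one tar archive and return its path."""
    mode = "w:gz" if compress else "w"  # "w:gz" also gzip-compresses the archive
    with tarfile.open(archive_name, mode) as archive:
        for path in map(Path, paths):
            # arcname keeps only the final path component, so the archive
            # unpacks into the current directory rather than a deep local path.
            archive.add(path, arcname=path.name)
    return Path(archive_name)
```

For example, make_archive(["analysis.ipynb", "train.py"], "filename.tar") would bundle two (hypothetical) files into an archive with the name used in the command above.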
If you have compressed the file with gzip, you can unpack and decompress the file with a single command:
!tar -xzvf filename.tar.gz
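As an alternative to choosing the right compression flag by hand, the archive can be unpacked from a notebook cell with Python's standard tarfile module, which detects the compression on its own. The sketch below assumes the archive has already been uploaded; the function name unpack_archive is illustrative.

```python
import tarfile


def unpack_archive(archive_path, dest="."):
    """Extract a tar archive, plain or compressed, and return its member names."""
    # Mode "r:*" lets tarfile detect gzip, bzip2, or xz compression itself.
    with tarfile.open(archive_path, "r:*") as archive:
        names = archive.getnames()
        # Newer Python versions provide extraction filters that reject unsafe
        # members such as absolute paths; use one when it is available.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest, **kwargs)
    return names
```

Calling unpack_archive("filename.tar.gz") extracts into the current directory and returns the list of extracted names, which is a quick way to confirm that everything arrived.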
Migrating Work That is Not in a Version-Controlled Repository¶
In this case, your files are in a folder, either local or in a remote environment, that is not under version control.
- Avoid copying or moving large data files into your project. If your team works in the cloud, keep these files in shared storage such as Amazon S3 or Microsoft Azure Blob Storage. The Docker containers have a finite size, and large data files should not be version-controlled. GitHub, for example, limits individual files to 100 MB. Keep your repository under 1 GB in size.
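Before putting a folder under version control, it can help to check which files would exceed the size limit. The following is a minimal sketch that walks a folder and lists oversized files. The 100 MB default mirrors the GitHub limit mentioned above, and the function name find_large_files is illustrative.

```python
import os
from pathlib import Path

FILE_LIMIT_BYTES = 100 * 1024 * 1024  # per-file limit used as the default (100 MB)


def find_large_files(root, limit=FILE_LIMIT_BYTES):
    """Return (path, size) pairs for files under root larger than limit."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own object store; it is not part of the working files.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            size = path.lstat().st_size  # lstat avoids errors on broken symlinks
            if size > limit:
                large.append((path, size))
    # Largest files first, so the worst offenders are easy to spot.
    return sorted(large, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, size in find_large_files("."):
        print(f"{size / 1024**2:8.1f} MB  {path}")
```

Files reported by this check are candidates for shared cloud storage rather than the repository.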