Compute Resources

Before you run an analysis, you can select a compute resource. This option specifies the RAM and CPUs available to your analysis. For example, if you choose 16 GB/2 CPU from the Compute Resource dropdown, your analysis will have 16 GB of RAM and 2 CPUs available.

These standard resource sizes represent available space on the servers that your admin provisioned to run the Platform. These long-lived servers are often referred to as the “standing pool.” Since these servers are always on and ready to accept a new analysis, you can expect your analysis to launch in a matter of seconds.

When the cluster starts to fill up with requests for resources, some of the compute resource options will become unavailable. A warning icon next to a resource option indicates that it is unavailable. In that case, you can ask your administrator to increase the standing pool by adding nodes to the cluster, wait for running services to end and free up resources, or use on-demand compute resources (if this feature is enabled for your installation). On-demand compute resources are the topic of the next section.

../_images/Screen_Shot_2017-12-07_at_1.40.07_PM.png

On-demand Compute Resources for VPC Installations

Warning

The following feature is applicable only to Amazon Web Services (AWS) installations.

Installations on Amazon Web Services (AWS) may take advantage of on-demand compute. In the image below, there are a number of AWS EC2 instance sizes available. When you choose one of the on-demand options, the Platform will create a new EC2 instance and run your analysis on it. When your analysis is finished, the Platform will destroy the EC2 instance and you’ll only be billed (by AWS) for the time you used it.

A brand new EC2 instance can take several minutes to start. While the server is being created, you will see a run status of Provisioning: Creating followed by Provisioning: Queued.

../_images/Screen_Shot_2017-12-07_at_1.40.22_PM.png

Tag Management for On-Demand Resources

AWS allows metadata to be assigned to EC2 instances via tags. Tags are useful for resource categorization, tracking, and management. For more information, see the Amazon documentation.

The DataScience.com Platform allows users to pass in custom tags via project-level environment variables. To add a custom tag, navigate to the Settings tab in a project and click Environment Variables in the left bar. Enter a variable with the key ON_DEMAND_TAG. The value of this variable should be a list of key-value pairs separated by semicolons, with the key and value of each pair separated by a comma.
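As an illustration of the value format, the following minimal Python sketch builds an ON_DEMAND_TAG value from a dictionary. It assumes the format reads as key,value pairs joined by semicolons. The function name build_on_demand_tag and the sample tags are hypothetical, and rejecting commas or semicolons inside a key or value is an assumption, since the Platform's handling of those characters is not described here.

```python
def build_on_demand_tag(tags):
    """Join tags as 'key,value;key,value' (assumed ON_DEMAND_TAG format)."""
    parts = []
    for key, value in tags.items():
        for text in (key, value):
            # Assumption: the separators cannot be escaped, so refuse them.
            if "," in text or ";" in text:
                raise ValueError(f"separator character found in {text!r}")
        parts.append(f"{key},{value}")
    return ";".join(parts)


# Hypothetical example tags; paste the printed string into the variable's value.
print(build_on_demand_tag({"team": "analytics", "cost-center": "1234"}))
# prints: team,analytics;cost-center,1234
```

Building the value from a dictionary also guarantees that each key appears only once.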

../_images/Project-environment-variables-2.png

When an on-demand instance is provisioned, the value of the project variable with the key ON_DEMAND_TAG is passed into the tags of the EC2 instance.

Note

Amazon has a number of restrictions for tags. Consult their documentation for more information.
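A few of the AWS restrictions can be checked before you save the variable. The sketch below is not part of the Platform. It covers only some of the limits AWS documents for tags (at most 50 tags per resource, keys up to 128 characters, values up to 256 characters, and the reserved aws: prefix), and the function name check_tags is hypothetical. Tags added by the Platform or your admin also count toward the per-resource limit, which this check cannot see.

```python
MAX_TAGS = 50         # AWS limit on tags per resource
MAX_KEY_LEN = 128     # AWS limit on tag key length
MAX_VALUE_LEN = 256   # AWS limit on tag value length


def check_tags(tags):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append(f"{len(tags)} tags exceeds the limit of {MAX_TAGS}")
    for key, value in tags.items():
        if not 1 <= len(key) <= MAX_KEY_LEN:
            problems.append(f"key {key!r} must be 1 to {MAX_KEY_LEN} characters")
        if len(value) > MAX_VALUE_LEN:
            problems.append(f"value for {key!r} is longer than {MAX_VALUE_LEN} characters")
        if key.lower().startswith("aws:"):
            problems.append(f"key {key!r} uses the reserved aws: prefix")
    return problems


# Hypothetical example: the second key uses the reserved prefix.
for problem in check_tags({"team": "analytics", "aws:owner": "me"}):
    print(problem)
```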

Warning

Duplicate tags cause provisioning errors

Tags with duplicate keys will be de-duplicated, keeping the first occurrence of each key that is received.

Be sure that each key in your environment variable’s value appears only once. Additionally, ensure that your admin has not added other fixed tags with any of the same keys as your custom tags.
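One way to check for repeated keys before saving the variable is sketched below in Python. It assumes the value is a semicolon-separated list of key,value pairs. The function name find_conflicting_keys and the fixed_keys parameter (the keys of any tags your admin has configured, which you would need to obtain from them) are hypothetical.

```python
from collections import Counter


def find_conflicting_keys(value, fixed_keys=()):
    """Return keys that repeat in `value` or that also appear in `fixed_keys`."""
    # Take the text before the first comma of each non-empty pair as the key.
    keys = [pair.split(",", 1)[0].strip()
            for pair in value.split(";") if pair.strip()]
    counts = Counter(keys)
    repeated = {key for key, n in counts.items() if n > 1}
    clashes = set(keys) & set(fixed_keys)
    return sorted(repeated | clashes)


# Hypothetical example: "team" appears twice, "env" clashes with an admin tag.
print(find_conflicting_keys("team,a;env,prod;team,b", fixed_keys=["env"]))
# prints: ['env', 'team']
```

An empty list means no repeated or clashing keys were found in the value.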