title: Create Apache Hadoop clusters using PowerShell - Azure HDInsight
description: Learn how to create Apache Hadoop, Apache HBase, or Apache Spark clusters on Linux for HDInsight by using Azure PowerShell.
ms.service: azure-hdinsight
ms.topic: how-to
ms.tool: azure-powershell
ms.custom: hdinsightactive, devx-track-azurepowershell, linux-related-content
author: hareshg
ms.author: hgowrisankar
ms.reviewer: nijelsf
ms.date: 01/02/2025

Create Linux-based clusters in HDInsight using Azure PowerShell

[!INCLUDE selector]

Azure PowerShell is a scripting environment that you can use to control and automate the deployment and management of your workloads in Microsoft Azure. This article shows how to create a Linux-based HDInsight cluster by using Azure PowerShell and includes an example script.

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

[!INCLUDE updated-for-az]

Azure PowerShell Az module.
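
As a quick check before you continue, the following sketch tests whether the HDInsight cmdlets are available and installs the Az module if they aren't. It assumes that the PowerShell Gallery is reachable from your machine and that a per-user installation is acceptable; adjust the scope if your environment requires something else.

```powershell
# Az.HDInsight ships as part of the Az rollup module.
if (-not (Get-Module -ListAvailable -Name Az.HDInsight)) {
    # Assumption: installing from the PowerShell Gallery for the current user is acceptable.
    Install-Module -Name Az -Scope CurrentUser -Repository PSGallery
}

# Sign in to Azure before creating any resources.
Connect-AzAccount
```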

Create cluster

[!INCLUDE delete-cluster-warning]

To create an HDInsight cluster by using Azure PowerShell, you must complete the following procedures:

  • Create an Azure resource group
  • Create an Azure Storage account
  • Create an Azure Blob container
  • Create an HDInsight cluster
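
The first three procedures in this list can be sketched as follows. This is a minimal illustration, not a definitive script: every name and the region are hypothetical placeholders, and the storage SKU is an assumption you should adapt to your own requirements.

```powershell
# Hypothetical placeholder values; replace them with your own.
$resourceGroupName  = "myhdi-rg"
$location           = "eastus"
$storageAccountName = "myhdistorage123"   # must be globally unique: 3-24 lowercase letters and digits
$containerName      = "myhdicluster"      # assumption: container named after the planned cluster

# 1. Create the resource group.
New-AzResourceGroup -Name $resourceGroupName -Location $location

# 2. Create the storage account (Standard_LRS is an illustrative choice).
New-AzStorageAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName `
    -Location $location `
    -SkuName Standard_LRS

# 3. Create the blob container that will hold the cluster's default data store.
$storageAccountKey = (Get-AzStorageAccountKey `
    -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName)[0].Value
$storageContext = New-AzStorageContext `
    -StorageAccountName $storageAccountName `
    -StorageAccountKey $storageAccountKey
New-AzStorageContainer -Name $containerName -Context $storageContext
```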

Note

Using PowerShell to create an HDInsight cluster with Azure Data Lake Storage Gen2 is not currently supported.

The following script demonstrates how to create a new cluster:

[!Code-powershellmain]

The values you specify for the cluster login are used to create the Hadoop user account for the cluster. Use this account to connect to services hosted on the cluster such as web UIs or REST APIs.

The values you specify for the SSH user are used to create the SSH user for the cluster. Use this account to start a remote SSH session on the cluster and run jobs. For more information, see Use SSH with HDInsight.
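
One way to collect both sets of credentials interactively is sketched below. The user names "admin" and "sshuser" are placeholder defaults chosen for illustration, not values the service requires.

```powershell
# Cluster login: used for HTTPS services such as web UIs and REST APIs.
$httpCredential = Get-Credential -Message "Enter cluster login credentials" -UserName "admin"

# SSH user: used for remote SSH sessions on the cluster.
$sshCredential = Get-Credential -Message "Enter SSH user credentials" -UserName "sshuser"

# Both PSCredential objects are later passed to New-AzHDInsightCluster
# through its -HttpCredential and -SshCredential parameters.
```

After the cluster exists, an SSH session is typically opened with a command of the form `ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.net`, where CLUSTERNAME stands for your cluster's name.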

Important

If you plan to use more than 32 worker nodes (either at cluster creation or by scaling the cluster after creation), you must also specify a head node size with at least 8 cores and 14 GB of RAM.

For more information on node sizes and associated costs, see HDInsight pricing.

It can take up to 20 minutes to create a cluster.

Create cluster: Configuration object

You can also create an HDInsight configuration object by using the New-AzHDInsightClusterConfig cmdlet. You can then modify this configuration object to enable additional configuration options for your cluster. Finally, pass the object to the -Config parameter of the New-AzHDInsightCluster cmdlet to apply the configuration.
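
The following sketch shows what this flow might look like. It assumes that the resource group and storage account already exist, and that a recent Az.HDInsight version is installed (one where the storage parameters are named -StorageAccountResourceId, -StorageAccountKey, and -StorageContainer). All names, the region, and the node sizes are hypothetical; Standard_D13_v2 is used only as an example of a size with 8 cores, so check which sizes are available in your region.

```powershell
# Hypothetical placeholder values; replace them with your own.
$resourceGroupName  = "myhdi-rg"
$location           = "eastus"
$clusterName        = "myhdicluster"
$storageAccountName = "myhdistorage123"

# Look up the existing storage account and its primary key.
$storageAccount = Get-AzStorageAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName
$storageAccountKey = (Get-AzStorageAccountKey `
    -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName)[0].Value

# Collect the cluster login and SSH user credentials.
$httpCredential = Get-Credential -Message "Enter cluster login credentials" -UserName "admin"
$sshCredential  = Get-Credential -Message "Enter SSH user credentials" -UserName "sshuser"

# Create the configuration object; node sizes are illustrative assumptions.
$config = New-AzHDInsightClusterConfig `
    -ClusterType Hadoop `
    -HeadNodeSize "Standard_D13_v2" `
    -WorkerNodeSize "Standard_D13_v2"

# Create the cluster from the configuration object.
New-AzHDInsightCluster `
    -Config $config `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $clusterName `
    -Location $location `
    -ClusterSizeInNodes 4 `
    -OSType Linux `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential `
    -StorageAccountResourceId $storageAccount.Id `
    -StorageAccountKey $storageAccountKey `
    -StorageContainer $clusterName
```

Keeping the optional settings on the configuration object leaves the final New-AzHDInsightCluster call limited to identity, location, credentials, and storage.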

Customize clusters

Delete the cluster

[!INCLUDE delete-cluster-warning]
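
When you no longer need the cluster, one way to remove it is sketched below. The resource group and cluster names are hypothetical placeholders. The commented-out alternative removes every resource in the group, including the storage account and its data, so use it only if nothing else in the group is needed.

```powershell
# Hypothetical placeholder values; replace them with your own.
$resourceGroupName = "myhdi-rg"
$clusterName       = "myhdicluster"

# Delete only the cluster; the storage account and its data remain.
Remove-AzHDInsightCluster -ResourceGroupName $resourceGroupName -ClusterName $clusterName

# Alternative: delete the whole resource group and everything in it.
# Remove-AzResourceGroup -Name $resourceGroupName
```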

Troubleshoot

If you run into issues with creating HDInsight clusters, see access control requirements.

Next steps

Now that you've successfully created an HDInsight cluster, use the following resources to learn how to work with your cluster.

Apache Hadoop clusters

Apache HBase clusters

Apache Spark clusters
