

- #Pip install jupyter notebook libraries how to
- #Pip install jupyter notebook libraries driver
- #Pip install jupyter notebook libraries portable
For more information about the dataset, see Amazon Customer Reviews Dataset on the Registry of Open Data for AWS. Open your notebook and make sure the kernel is set to PySpark, then run your commands from a notebook cell.

After you close your notebook, the Pandas and Matplotlib libraries that you installed on the cluster using the install_pypi_package API are garbage collected from the cluster.

Using local Python libraries in EMR Notebooks

The notebook-scoped libraries require your EMR cluster to have access to a PyPI repository. If you cannot connect your EMR cluster to a repository, use the Python libraries pre-packaged with EMR Notebooks to analyze and visualize your results locally within the notebook. Unlike the notebook-scoped libraries, these local libraries are only available to the Python kernel and are not available to the Spark environment on the cluster. To use these local libraries, export your results from your Spark driver on the cluster to your notebook and use the notebook magic to plot your results locally.


Because the libraries are installed from notebook code, you can recreate the library environment when you switch the notebook to a different cluster by re-executing the notebook code. At the end of the notebook session, the libraries you install through EMR Notebooks are automatically removed from the hosting EMR cluster.

To use this feature in EMR Notebooks, you need a notebook attached to a cluster running EMR release 5.26.0 or later. The cluster should have access to the public or private PyPI repository from which you want to import the libraries. For more information, see Creating a Notebook. There are different ways to configure your VPC networking to allow clusters inside the VPC to connect to an external repository. For more information, see Scenarios and Examples in the Amazon VPC User Guide.

This post demonstrates the notebook-scoped libraries feature of EMR Notebooks by analyzing the publicly available Amazon customer reviews dataset for books.
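As a small illustration of the release requirement mentioned above, the following sketch checks whether a cluster's release label meets the 5.26.0 minimum. It assumes labels follow the emr-x.y.z pattern, and the function names are hypothetical helpers, not part of any EMR API.

```python
import re

# Minimum EMR release stated in the text for notebook-scoped libraries.
MIN_RELEASE = (5, 26, 0)


def parse_release_label(label):
    """Turn a label such as 'emr-5.26.0' into a tuple of ints, e.g. (5, 26, 0).

    Assumption: the label has the form 'emr-<major>.<minor>.<patch>'.
    """
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label.strip())
    if match is None:
        raise ValueError(f"unrecognized release label: {label!r}")
    return tuple(int(part) for part in match.groups())


def supports_notebook_scoped_libraries(label):
    """Return True if the release is 5.26.0 or later, as the text requires."""
    return parse_release_label(label) >= MIN_RELEASE
```

Comparing tuples of integers avoids the pitfall of comparing version strings, where "5.9.0" would sort after "5.26.0".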

Notebook-scoped libraries take precedence over bootstrapped libraries. Multiple notebook users can import their preferred version of a library and use it without dependency clashes on the same cluster.
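The following is a minimal sketch of how a notebook might pin its own library versions for the session. It assumes the sc object that the PySpark kernel provides on EMR and its install_pypi_package API. The helper name, the package versions, and the optional repository URL argument are illustrative assumptions, so check them against the EMR documentation for your release.

```python
def install_pinned_packages(sc, packages, repository_url=None):
    """Install each pinned package on the cluster for this notebook session.

    sc             -- the SparkContext supplied by the PySpark kernel on EMR
                      (assumed to expose install_pypi_package)
    packages       -- dict mapping package name to the version to pin
    repository_url -- optional PyPI repository URL (assumption: passed as the
                      second argument to install_pypi_package)

    Returns the list of requirement strings that were requested.
    """
    requested = []
    for name, version in packages.items():
        spec = f"{name}=={version}"
        if repository_url is None:
            sc.install_pypi_package(spec)
        else:
            sc.install_pypi_package(spec, repository_url)
        requested.append(spec)
    return requested
```

In a notebook cell you could then call install_pinned_packages(sc, {"pandas": "0.25.1"}), where the version shown is only an example. Keeping the pinned versions in notebook code means another user on the same cluster can pin a different version in their own notebook.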
Last year, AWS introduced EMR Notebooks, a managed notebook environment based on the open-source Jupyter notebook application. This post discusses installing notebook-scoped libraries on a running cluster directly via an EMR Notebook. Before this feature, you had to rely on bootstrap actions or use a custom AMI to install additional libraries that are not pre-packaged with the EMR AMI when you provision the cluster. This post also discusses how to use the pre-installed Python libraries available locally within EMR Notebooks to analyze and plot your results.
