There are several scenarios in which your Python code or ETL scripts may need different sets of Python libraries. With a development endpoint, you have two options:
a. Set up a separate development endpoint for each set of libraries.
b. Overwrite the library .zip file(s) that your development endpoint loads every time you switch scripts.
In the console, you can specify one or more library .zip files for a development endpoint when you create it. Zip all the libraries and keep the archive at an S3 path (for example, s3://freshers-in-bucket/prefix/site-packages.zip). If you need to point to multiple .zip files, list all of them separated by commas, with no spaces in between (s3://freshers-in-bucket-A/prefix/libA.zip,s3://freshers-in-bucket-B/prefix/libB.zip). If you update the libraries later, you can use the console to re-import them into your development endpoint.
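The zipping step can be sketched in plain Python. This is a minimal sketch, not an official AWS Glue tool: the helper names build_library_zip and join_s3_paths are hypothetical, and it assumes your package is an ordinary directory that should sit at the root of the archive. Uploading the resulting file to S3 is left out.

```python
import zipfile
from pathlib import Path


def build_library_zip(package_dir, zip_path):
    """Zip a package directory so the package folder sits at the archive root."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            # Archive names are relative to the parent, e.g. mylib/__init__.py
            relative = path.relative_to(package_dir.parent)
            if path.is_file() and "__pycache__" not in relative.parts:
                archive.write(path, relative.as_posix())
    return zip_path


def join_s3_paths(paths):
    """Join several S3 URIs into one comma-separated string without spaces."""
    return ",".join(p.strip() for p in paths)
```

For example, join_s3_paths(["s3://freshers-in-bucket-A/prefix/libA.zip", "s3://freshers-in-bucket-B/prefix/libB.zip"]) gives the comma-separated form used for multiple .zip files.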
You can also specify library files using the AWS Glue APIs, as shown below.
import boto3

glue = boto3.client("glue")

dep = glue.create_dev_endpoint(
    EndpointName="freshers_in_DevEndpoint",
    RoleArn="arn:aws:iam::42398602034423",  # placeholder; use the full ARN of your IAM role
    SecurityGroupIds=["in-dfr5gdddreww"],  # must be a list of security group IDs
    SubnetId="subnet-f4234ddgd",
    PublicKey="ssh-rsa ASSDFEeerwTFJKTDSQWEQWFDGHGHGy...",
    NumberOfNodes=2,
    # Comma-separated S3 paths, no spaces
    ExtraPythonLibsS3Path="s3://freshers-in-bucket-A/prefix/libA.zip,s3://freshers-in-bucket-B/prefix/libB.zip",
)
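Re-importing updated libraries can also be done through the API instead of the console. The following is a minimal sketch that assumes a boto3 Glue client is passed in; the wrapper name refresh_endpoint_libraries is hypothetical, while update_dev_endpoint with CustomLibraries and UpdateEtlLibraries is the boto3 call it relies on.

```python
def refresh_endpoint_libraries(glue_client, endpoint_name, s3_paths):
    """Point an existing development endpoint at a new set of library .zip files.

    glue_client is assumed to be a boto3 Glue client, e.g. boto3.client("glue").
    """
    return glue_client.update_dev_endpoint(
        EndpointName=endpoint_name,
        # Comma-separated S3 paths, no spaces
        CustomLibraries={"ExtraPythonLibsS3Path": ",".join(s3_paths)},
        # Ask Glue to reload the libraries on the endpoint
        UpdateEtlLibraries=True,
    )
```

A call might look like refresh_endpoint_libraries(glue, "freshers_in_DevEndpoint", ["s3://freshers-in-bucket-A/prefix/libA.zip"]).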
For a Zeppelin notebook
Call the following PySpark function before importing a package or packages from your .zip file:
sc.addPyFile("/home/glue/downloads/python/freshers-in-packages.zip")
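To see why the call has to come before the import, it helps to try the underlying mechanism locally. As far as I understand it, addPyFile distributes the archive and makes it importable; the import itself is standard Python zip import. The sketch below imitates that without Spark, and the archive name demo-packages.zip and module demo_lib are made up for illustration.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Build a throwaway archive holding one hypothetical module, demo_lib.py.
work_dir = Path(tempfile.mkdtemp())
zip_path = work_dir / "demo-packages.zip"
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("demo_lib.py", "def greet(name):\n    return 'hello ' + name\n")

# Local stand-in for sc.addPyFile(...): put the archive on the import path.
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()

import demo_lib  # resolved from inside the zip, so this must come after the step above

message = demo_lib.greet("glue")
```

If the import ran before the archive was on the path, Python would raise ModuleNotFoundError, which is the same failure you would see in the notebook when the order is reversed.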
CreateJob: When you create a job, pass the library paths through the --extra-py-files default argument.
# glue is a boto3 Glue client, e.g. boto3.client("glue")
job = glue.create_job(
    Name='freshersSampleJob',
    Role='Glue_Freshers_Role',
    # 'glueetl' is the command name for a Spark ETL job
    Command={'Name': 'glueetl', 'ScriptLocation': 's3://freshers_bucket/scripts/freshers_sample_script.py'},
    DefaultArguments={'--extra-py-files': 's3://freshers-in-bucket-A/prefix/libA.zip,s3://freshers-in-bucket-B/prefix/libB.zip'},
)
For a job run, you can override the default library setting with a different one.
# JobName must match the name used in create_job
runId = glue.start_job_run(JobName='freshersSampleJob', Arguments={'--extra-py-files': 's3://freshers-in-bucket-A/prefix/libA.zip'})
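The override behaviour can be pictured as a simple dictionary merge in which run-level arguments win over the job's default arguments. This is only a local model of that rule, not AWS Glue code, and the helper name effective_arguments is hypothetical.

```python
def effective_arguments(default_arguments, run_arguments=None):
    """Return the arguments a run would see: run values win over job defaults."""
    merged = dict(default_arguments)
    merged.update(run_arguments or {})
    return merged


# Defaults as set on the job, and the override passed for a single run.
job_defaults = {
    "--extra-py-files": "s3://freshers-in-bucket-A/prefix/libA.zip,s3://freshers-in-bucket-B/prefix/libB.zip"
}
run_overrides = {"--extra-py-files": "s3://freshers-in-bucket-A/prefix/libA.zip"}

resolved = effective_arguments(job_defaults, run_overrides)
```

Here resolved["--extra-py-files"] holds only the libA.zip path, while a run started without Arguments would keep both default libraries.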