Downloading data from Azure cloud to spark executor pod running in kubernetes is getting stuck #45406

@gagandhakar-gh

We are using the following Azure Python SDK packages:

azure-common=1.1.28=pyhd8ed1ab_0
azure-identity=1.7.1=pyhd8ed1ab_0
azure-nspkg=3.0.2=py_0
azure-storage-blob=12.9.0=pyhd8ed1ab_0
azure-storage-common=2.1.0=py310hff52083_7
msrestazure=0.6.4=pyhd8ed1ab_0
azure-core==1.26.4
azure-mgmt-core==1.3.0
azure-mgmt-storage==16.0.0

While downloading data from Azure Blob Storage to a Spark executor pod running in Kubernetes, the Python code below occasionally gets stuck (perhaps once every 10 days):

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# storage_url, azure_folders, bucket_sub_folder and input_file are defined elsewhere in the job.
credential = DefaultAzureCredential()

blob_service_client = BlobServiceClient(account_url=storage_url, credential=credential)

# azure_folders holds the container name.
client = blob_service_client.get_container_client(azure_folders)

# List only the blobs whose names start with the given prefix.
blobs = client.list_blobs(name_starts_with=bucket_sub_folder)

for blob in blobs:
    stream = client.download_blob(blob.name)
    # Note: every blob is written to the same input_file path.
    with open(input_file, mode="wb") as download_file_handle:
        for chunk in stream.chunks():  # Use default chunk size
            download_file_handle.write(chunk)
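One thing that may be worth trying for an intermittent hang like this is to set the socket timeouts explicitly when the client is built, so that a stalled connection raises an error instead of blocking forever. The following is only a sketch under assumptions, not a confirmed fix: `make_container_client` and `download_prefix` are hypothetical helper names, the timeout values are arbitrary, and it assumes each blob should land in its own file under a destination directory. `connection_timeout` and `read_timeout` are keyword arguments that the `azure-storage-blob` client constructors pass on to the HTTP transport; please check them against the documentation for the installed version.

```python
import os
from pathlib import Path


def make_container_client(storage_url, container_name,
                          connection_timeout=20, read_timeout=60):
    """Build a container client with explicit socket timeouts (values are assumptions)."""
    # Imported lazily so download_prefix below can be exercised without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        account_url=storage_url,
        credential=DefaultAzureCredential(),
        connection_timeout=connection_timeout,  # seconds allowed to establish a connection
        read_timeout=read_timeout,              # seconds to wait for data on an open socket
    )
    return service.get_container_client(container_name)


def download_prefix(container_client, prefix, dest_dir):
    """Download every blob whose name starts with `prefix` into its own file under `dest_dir`."""
    dest_dir = Path(dest_dir)
    written = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        target = dest_dir / blob.name
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so a failed download never leaves
        # a truncated file under the final name.
        tmp = target.with_name(target.name + ".part")
        stream = container_client.download_blob(blob.name)
        with open(tmp, "wb") as fh:
            for chunk in stream.chunks():  # default chunk size
                fh.write(chunk)
        os.replace(tmp, target)
        written.append(str(target))
    return written
```

Usage would look like `download_prefix(make_container_client(storage_url, azure_folders), bucket_sub_folder, "/tmp/input")`, where the destination path is again just a placeholder. If the hang persists with timeouts in place, the resulting exception and its traceback should at least show which call was blocked.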

Labels: customer-reported, needs-triage, question
