Difference between revisions of "Blockdownload"
(Created page with "=OsProject= {{OsProject |id=blockdownload |state=active |owner=WolfgangFahl |title=blockdownload |url=https://github.com/WolfgangFahl/blockdownload |version=0.0.1 |descriptio...") |
|||
Line 7: | Line 7: | ||
|title=blockdownload | |title=blockdownload | ||
|url=https://github.com/WolfgangFahl/blockdownload | |url=https://github.com/WolfgangFahl/blockdownload | ||
− | |version=0.0. | + | |version=0.0.2 |
|description=segmented/blockwise download of large files | |description=segmented/blockwise download of large files | ||
− | |date=2025-05- | + | |date=2025-05-08 |
|since=2025-05-05 | |since=2025-05-05 | ||
|storemode=property | |storemode=property | ||
}} | }} | ||
{{pip|blockdownload}} | {{pip|blockdownload}} | ||
+ | = blockdownload = | ||
+ | A tool that downloads large files in parallel chunks and reassembles them - perfect for improving download speeds and handling interruptions. | ||
+ | |||
+ | [https://pypi.org/project/blockdownload/ pypi] | ||
+ | [https://github.com/WolfgangFahl/blockdownload/actions/workflows/build.yml Github Actions Build] | ||
+ | [https://pypi.python.org/pypi/blockdownload/ PyPI Status] | ||
+ | [https://github.com/WolfgangFahl/blockdownload/issues GitHub issues] | ||
+ | [https://github.com/WolfgangFahl/blockdownload/issues/?q=is%3Aissue+is%3Aclosed GitHub closed issues] | ||
+ | [https://WolfgangFahl.github.io/blockdownload/ API Docs] | ||
+ | [https://www.apache.org/licenses/LICENSE-2.0 License] | ||
+ | |||
+ | == Overview == | ||
+ | |||
+ | '''blockdownload''' is a Python-based utility that divides large file downloads into smaller, manageable blocks. This approach offers several advantages: | ||
+ | |||
+ | * '''Parallel downloading''': Download multiple blocks simultaneously for improved speed | ||
+ | * '''Resume capability''': Continue interrupted downloads without starting over | ||
+ | * '''Integrity verification''': MD5 checksums for each block ensure data integrity | ||
+ | * '''Cross-platform''': Works on Linux, macOS, and potentially other platforms | ||
+ | |||
+ | === Documentation === | ||
+ | |||
+ | [http://wiki.bitplan.com/index.php/blockdownload Wiki] | ||
+ | |||
+ | === Usage === | ||
+ | <pre> | ||
+ | usage: blockdownload [-h] --name NAME [--blocksize BLOCKSIZE] | ||
+ | [--unit {KB,MB,GB}] [--from-block FROM_BLOCK] | ||
+ | [--to-block TO_BLOCK] [--boost BOOST] [--progress] | ||
+ | [--yaml YAML] [--force] [--output OUTPUT] | ||
+ | url target | ||
+ | |||
+ | Segmented file downloader using HTTP range requests. | ||
+ | |||
+ | positional arguments: | ||
+ | url URL to download from | ||
+ | target Target directory to store .part files | ||
+ | |||
+ | options: | ||
+ | -h, --help show this help message and exit | ||
+ | --name NAME Name for the download session (used for .yaml control | ||
+ | file) | ||
+ | --blocksize BLOCKSIZE | ||
+ | Block size (default: 10) | ||
+ | --unit {KB,MB,GB} Block size unit (default: MB) | ||
+ | --from-block FROM_BLOCK | ||
+ | First block index | ||
+ | --to-block TO_BLOCK Last block index (inclusive) | ||
+ | --boost BOOST Number of concurrent download threads (default: 1) | ||
+ | --progress Show tqdm progress bar | ||
+ | --yaml YAML Path to the YAML metadata file (for standalone | ||
+ | reassembly) | ||
+ | --force Overwrite output file if it exists | ||
+ | --output OUTPUT Path where the final target file will be saved | ||
+ | </pre> | ||
+ | |||
+ | === Example Usage === | ||
+ | |||
+ | The example below downloads a | ||
+ | [https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/ Debian netinst ISO] image (633 MB) in 32 MB blocks from | ||
+ | <code>https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso</code> | ||
+ | |||
+ | ==== scripts/debian12 ==== | ||
+ | <pre> | ||
+ | blockdownload \ | ||
+ | https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso \ | ||
+ | /tmp/debian12 \ | ||
+ | --name debian12 \ | ||
+ | --blocksize 32 \ | ||
+ | --unit MB \ | ||
+ | --boost 1 \ | ||
+ | --progress \ | ||
+ | --output /tmp/debian12/debian12.iso | ||
+ | </pre> | ||
+ | |||
+ | ==== How It Works ==== | ||
+ | |||
+ | # '''Chunking''': The target file is split into blocks (32MB in the example) | ||
+ | # '''Parallel Download''': Each block is downloaded as a separate <code>.part</code> file | ||
+ | # '''Progress Tracking''': Real-time progress is displayed during download | ||
+ | # '''Assembly''': After downloading, blocks are reassembled into the final file | ||
+ | # '''Verification''': MD5 checksums verify the integrity of the final file | ||
+ | |||
+ | ==== Output Files ==== | ||
+ | |||
+ | The tool creates several files in the output directory: | ||
+ | |||
+ | * '''Block files''': <code>{name}-{block_number}.part</code> - Each downloaded block | ||
+ | * '''YAML metadata''': <code>{name}.yaml</code> - Contains block information, checksums, and offsets | ||
+ | * '''Final file''': The reassembled target file | ||
+ | * '''MD5 checksum''': <code>{name}.md5</code> - Verification of the final file | ||
+ | |||
+ | ==== Parameters ==== | ||
+ | |||
+ | {| class="wikitable" | ||
+ | ! Parameter !! Description | ||
+ | |- | ||
+ | | <code>--name</code> || Base name for download files | ||
+ | |- | ||
+ | | <code>--blocksize</code> || Size of each block | ||
+ | |- | ||
+ | | <code>--unit</code> || Unit for blocksize (KB, MB, GB) | ||
+ | |- | ||
+ | | <code>--boost</code> || Number of concurrent download threads (default: 1) | ||
+ | |- | ||
+ | | <code>--progress</code> || Show download progress | ||
+ | |- | ||
+ | | <code>--output</code> || Path for the final assembled file | ||
+ | |} | ||
+ | |||
+ | ==== Example Output ==== | ||
+ | |||
+ | <pre> | ||
+ | Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.45GB/s] | ||
+ | created /tmp/debian12/debian12.iso - 633.00 MB | ||
+ | File reassembled successfully: /tmp/debian12/debian12.iso | ||
+ | Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.43GB/s] | ||
+ | 6b6604d894b6d861e357be1447b370db /tmp/debian12/debian12.iso | ||
+ | </pre> | ||
+ | |||
+ | ==== Metadata Structure ==== | ||
+ | |||
+ | The tool maintains a YAML file with detailed information about each block: | ||
+ | |||
+ | * Block number | ||
+ | * File path | ||
+ | * Offset position | ||
+ | * MD5 checksums (both for the entire block and block header) | ||
+ | |||
+ | This metadata allows for robust resumption of interrupted downloads and verification of data integrity. | ||
+ | |||
+ | <source lang='yaml'> | ||
+ | name: debian12 | ||
+ | url: https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso | ||
+ | blocksize: 32 | ||
+ | size: 663748608 | ||
+ | unit: MB | ||
+ | chunk_size: 8192 | ||
+ | md5: 6b6604d894b6d861e357be1447b370db | ||
+ | blocks: | ||
+ | - block: 0 | ||
+ | path: debian12-0000.part | ||
+ | offset: 0 | ||
+ | md5: b94ec0df4842a004dee7eea04eb91119 | ||
+ | md5_head: c1708f385846feeb96eea7646e1cc14d | ||
+ | - block: 1 | ||
+ | path: debian12-0001.part | ||
+ | offset: 33554432 | ||
+ | md5: f51efe8ee2ccf11745f2e0814afac48e | ||
+ | md5_head: e4e89d6ece01b898440ef7c5ec03938c | ||
+ | - block: 2 | ||
+ | path: debian12-0002.part | ||
+ | offset: 67108864 | ||
+ | md5: 4628498be89b83486c7a9ac34feb5662 | ||
+ | md5_head: 8c18d6a5313f84920599ab8869e99ad4 | ||
+ | - block: 3 | ||
+ | path: debian12-0003.part | ||
+ | offset: 100663296 | ||
+ | md5: 9ac5ade845864965088f11425da79733 | ||
+ | md5_head: 7200bfb69f3d8855ce36a08b40241ace | ||
+ | - block: 4 | ||
+ | path: debian12-0004.part | ||
+ | offset: 134217728 | ||
+ | md5: 651d1afdeb96bfbdb2f0d1ff4eb7dc9f | ||
+ | md5_head: 67836f61ac788d996931ec50d8cfe702 | ||
+ | - block: 5 | ||
+ | path: debian12-0005.part | ||
+ | offset: 167772160 | ||
+ | md5: b0a5944367270d4d0f460439e0515970 | ||
+ | md5_head: b781097a91c20a1c5c275c814b54e153 | ||
+ | - block: 6 | ||
+ | path: debian12-0006.part | ||
+ | offset: 201326592 | ||
+ | md5: 8609b0ab9534397f60d45953a51e1a7f | ||
+ | md5_head: 2784576959977c5d6af30c9644704e6e | ||
+ | - block: 7 | ||
+ | path: debian12-0007.part | ||
+ | offset: 234881024 | ||
+ | md5: e9de5729504f0275e546937630f158a8 | ||
+ | md5_head: 4a372b296c1dae51661a72e36c120e5c | ||
+ | - block: 8 | ||
+ | path: debian12-0008.part | ||
+ | offset: 268435456 | ||
+ | md5: be8f6cdea534cb690aa061394ad6f77c | ||
+ | md5_head: 770786dc612d1c319b1b3c9b2da44011 | ||
+ | - block: 9 | ||
+ | path: debian12-0009.part | ||
+ | offset: 301989888 | ||
+ | md5: 357ff7e961026d0722674ccf8e8cedc1 | ||
+ | md5_head: fbcdc0f7c0c4d6ef061bae4780d6d026 | ||
+ | - block: 10 | ||
+ | path: debian12-0010.part | ||
+ | offset: 335544320 | ||
+ | md5: 3e5a63a397ac81fe862848833ff6fde0 | ||
+ | md5_head: 85b1e258da5d1db4852def86a4664b8f | ||
+ | - block: 11 | ||
+ | path: debian12-0011.part | ||
+ | offset: 369098752 | ||
+ | md5: a3cf4ee62f28dde9e1997f1166d2915d | ||
+ | md5_head: 7e1f35e49cb45836614befb30e9a4ad5 | ||
+ | - block: 12 | ||
+ | path: debian12-0012.part | ||
+ | offset: 402653184 | ||
+ | md5: 2a6877a426d2a4a9b00e0416087d5920 | ||
+ | md5_head: fc081369872dfab4c7391d9d9c62491c | ||
+ | - block: 13 | ||
+ | path: debian12-0013.part | ||
+ | offset: 436207616 | ||
+ | md5: 9bfd396bb5c7492174e233c23df17516 | ||
+ | md5_head: 8e0338b0d2ccba220da4cdc5f6811d6d | ||
+ | - block: 14 | ||
+ | path: debian12-0014.part | ||
+ | offset: 469762048 | ||
+ | md5: 97774ebf805893e6fffc1d783b15e350 | ||
+ | md5_head: 3cc273002414d80e342afaafd94f8fc8 | ||
+ | - block: 15 | ||
+ | path: debian12-0015.part | ||
+ | offset: 503316480 | ||
+ | md5: 46c66aa5751a9c195073c5c42e6da7b5 | ||
+ | md5_head: b9f1d528ac7ce3c862e64ca055e23faf | ||
+ | - block: 16 | ||
+ | path: debian12-0016.part | ||
+ | offset: 536870912 | ||
+ | md5: 271308910a719094c68ab605e134f1b7 | ||
+ | md5_head: 4c70a88e4e53ec232347b12670ca29d4 | ||
+ | - block: 17 | ||
+ | path: debian12-0017.part | ||
+ | offset: 570425344 | ||
+ | md5: 4a5813a5aa4c9f6aab172606283fa6af | ||
+ | md5_head: f28a14c700238cf056c0f054f56de40d | ||
+ | - block: 18 | ||
+ | path: debian12-0018.part | ||
+ | offset: 603979776 | ||
+ | md5: 53b21c20c2b9f85fd6ab14b6d8b50650 | ||
+ | md5_head: 25112155d35d8568338c99ba318b92a1 | ||
+ | - block: 19 | ||
+ | path: debian12-0019.part | ||
+ | offset: 637534208 | ||
+ | md5: 88c0e81466d1538b869fd5b8c3449a97 | ||
+ | md5_head: 97b251252405b0ee3e6d04e0499b2c1e | ||
+ | </source> | ||
+ | |||
+ | |||
+ | === Use Cases === | ||
+ | |||
+ | * Downloading large ISO images | ||
+ | * Handling unstable internet connections | ||
+ | * Improving download speeds through parallelization | ||
+ | * Verifying integrity of large downloads | ||
+ | |||
+ | === Authors === | ||
+ | * [http://www.bitplan.com/Wolfgang_Fahl Wolfgang Fahl] |
Revision as of 07:41, 8 May 2025
OsProject
OsProject | |
---|---|
id | blockdownload |
state | active |
owner | WolfgangFahl |
title | blockdownload |
url | https://github.com/WolfgangFahl/blockdownload |
version | 0.0.2 |
description | segmented/blockwise download of large files |
date | 2025-05-08 |
since | 2025-05-05 |
until |
Installation
pip install blockdownload
# alternatively if your pip is not a python3 pip
pip3 install blockdownload
# local install from source directory of blockdownload
pip install .
upgrade
pip install blockdownload -U
# alternatively if your pip is not a python3 pip
pip3 install blockdownload -U
blockdownload
A tool that downloads large files in parallel chunks and reassembles them - perfect for improving download speeds and handling interruptions.
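- pypi: https://pypi.org/project/blockdownload/
- GitHub Actions Build: https://github.com/WolfgangFahl/blockdownload/actions/workflows/build.yml
- PyPI Status: https://pypi.python.org/pypi/blockdownload/
- GitHub issues: https://github.com/WolfgangFahl/blockdownload/issues
- GitHub closed issues: https://github.com/WolfgangFahl/blockdownload/issues/?q=is%3Aissue+is%3Aclosed
- API Docs: https://WolfgangFahl.github.io/blockdownload/
- License: https://www.apache.org/licenses/LICENSE-2.0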
Overview
blockdownload is a Python-based utility that divides large file downloads into smaller, manageable blocks. This approach offers several advantages:
- Parallel downloading: Download multiple blocks simultaneously for improved speed
- Resume capability: Continue interrupted downloads without starting over
- Integrity verification: MD5 checksums for each block ensure data integrity
- Cross-platform: Works on Linux, macOS, and potentially other platforms
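The per-block integrity check mentioned above can be illustrated with a streaming MD5 helper. This is only a minimal sketch, not the tool's own code: the function name `file_md5` is hypothetical, and the default chunk size of 8192 bytes is borrowed from the `chunk_size` value shown in the YAML metadata further down.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 8192) -> str:
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use constant, which matters when a single block is tens of megabytes or more.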
Documentation
Wiki: http://wiki.bitplan.com/index.php/blockdownload
Usage
usage: blockdownload [-h] --name NAME [--blocksize BLOCKSIZE]
                     [--unit {KB,MB,GB}] [--from-block FROM_BLOCK]
                     [--to-block TO_BLOCK] [--boost BOOST] [--progress]
                     [--yaml YAML] [--force] [--output OUTPUT]
                     url target

Segmented file downloader using HTTP range requests.

positional arguments:
  url                   URL to download from
  target                Target directory to store .part files

options:
  -h, --help            show this help message and exit
  --name NAME           Name for the download session (used for .yaml control
                        file)
  --blocksize BLOCKSIZE
                        Block size (default: 10)
  --unit {KB,MB,GB}     Block size unit (default: MB)
  --from-block FROM_BLOCK
                        First block index
  --to-block TO_BLOCK   Last block index (inclusive)
  --boost BOOST         Number of concurrent download threads (default: 1)
  --progress            Show tqdm progress bar
  --yaml YAML           Path to the YAML metadata file (for standalone
                        reassembly)
  --force               Overwrite output file if it exists
  --output OUTPUT       Path where the final target file will be saved
Example Usage
The example below downloads a Debian netinst ISO image (633 MB) in 32 MB blocks from:
https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso
scripts/debian12
blockdownload \
  https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso \
  /tmp/debian12 \
  --name debian12 \
  --blocksize 32 \
  --unit MB \
  --boost 1 \
  --progress \
  --output /tmp/debian12/debian12.iso
How It Works
- Chunking: The target file is split into blocks (32MB in the example)
- Parallel Download: Each block is downloaded as a separate .part file
- Progress Tracking: Real-time progress is displayed during download
- Assembly: After downloading, blocks are reassembled into the final file
- Verification: MD5 checksums verify the integrity of the final file
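The chunking step can be sketched as a small calculation that turns a file size and block size into byte ranges suitable for HTTP range requests. This is an illustration under assumptions rather than the tool's implementation: the names `block_ranges` and `range_header` are hypothetical, and the unit table assumes binary multiples (1 MB = 1024 * 1024 bytes), which is inferred from the block offsets in the metadata example.

```python
from typing import Dict, List, Tuple

# Assumption: units are binary multiples, as the offsets in the metadata suggest
UNIT_BYTES = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def block_ranges(size: int, blocksize: int, unit: str = "MB") -> List[Tuple[int, int, int]]:
    """Return (index, first_byte, last_byte) per block; last_byte is inclusive."""
    step = blocksize * UNIT_BYTES[unit]
    ranges = []
    for index, start in enumerate(range(0, size, step)):
        # The final block is usually shorter than the others
        end = min(start + step, size) - 1
        ranges.append((index, start, end))
    return ranges


def range_header(start: int, end: int) -> Dict[str, str]:
    """Build the HTTP Range header for an inclusive byte range."""
    return {"Range": f"bytes={start}-{end}"}


if __name__ == "__main__":
    ranges = block_ranges(663748608, 32, "MB")
    print(len(ranges))                     # 20
    print(ranges[19])                      # (19, 637534208, 663748607)
    print(range_header(*ranges[1][1:]))    # {'Range': 'bytes=33554432-67108863'}
```

For the Debian image (663748608 bytes) and 32 MB blocks this yields 20 blocks, with the last one starting at offset 637534208, matching the metadata shown later on this page.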
Output Files
The tool creates several files in the output directory:
- Block files: {name}-{block_number}.part - Each downloaded block
- YAML metadata: {name}.yaml - Contains block information, checksums, and offsets
- Final file: The reassembled target file
- MD5 checksum: {name}.md5 - Verification of the final file
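The naming scheme above also suggests how a resumed run could decide which blocks still need downloading. The following sketch assumes four-digit zero padding for the block number (as in `debian12-0000.part` in the metadata example) and uses hypothetical helper names; the real tool may additionally compare sizes or checksums before skipping a block.

```python
from pathlib import Path
from typing import List


def part_name(name: str, block_index: int) -> str:
    """Build a block file name such as 'debian12-0003.part'."""
    return f"{name}-{block_index:04d}.part"


def missing_blocks(target_dir: Path, name: str, block_count: int) -> List[int]:
    """Return the indices of blocks that have no .part file in target_dir yet."""
    return [
        index
        for index in range(block_count)
        if not (target_dir / part_name(name, index)).exists()
    ]
```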
Parameters
Parameter | Description |
---|---|
--name | Base name for download files |
--blocksize | Size of each block |
--unit | Unit for blocksize (KB, MB, GB) |
--boost | Number of concurrent download threads (default: 1) |
--progress | Show download progress |
--output | Path for the final assembled file |
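Since the usage text describes --boost as the number of concurrent download threads, its effect can be pictured with a thread pool. This is a sketch under assumptions, not the tool's code: `download_blocks` is a hypothetical name, and `fetch` stands in for whatever function retrieves a single block (for example via an HTTP range request).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def download_blocks(
    indices: Iterable[int],
    fetch: Callable[[int], bytes],
    boost: int = 1,
) -> Dict[int, bytes]:
    """Fetch the given blocks with up to `boost` worker threads."""
    indices = list(indices)
    with ThreadPoolExecutor(max_workers=boost) as pool:
        # map() yields results in input order, even if blocks finish out of order
        payloads = pool.map(fetch, indices)
        return dict(zip(indices, payloads))
```

With `boost=1` the blocks are fetched one after another; larger values only help if the server and the network allow several simultaneous range requests.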
Example Output
Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.45GB/s]
created /tmp/debian12/debian12.iso - 633.00 MB
File reassembled successfully: /tmp/debian12/debian12.iso
Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.43GB/s]
6b6604d894b6d861e357be1447b370db  /tmp/debian12/debian12.iso
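The output shows both 664M and 633.00 MB for the same file. The two figures are consistent if the progress bar counts in decimal units (powers of 1000) and the "created" line in binary units (powers of 1024). That reading is an assumption, but the arithmetic for the file size recorded in the metadata (663748608 bytes) fits it:

```python
size = 663748608  # bytes, the "size" value from the YAML metadata

decimal_mb = size / 1000**2  # megabytes, powers of 1000
binary_mib = size / 1024**2  # mebibytes, powers of 1024

print(f"{decimal_mb:.0f} MB (decimal)")  # 664 MB (decimal)
print(f"{binary_mib:.2f} MiB (binary)")  # 633.00 MiB (binary)
```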
Metadata Structure
The tool maintains a YAML file with detailed information about each block:
- Block number
- File path
- Offset position
- MD5 checksums (both for the entire block and block header)
This metadata allows for robust resumption of interrupted downloads and verification of data integrity.
name: debian12
url: https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso
blocksize: 32
size: 663748608
unit: MB
chunk_size: 8192
md5: 6b6604d894b6d861e357be1447b370db
blocks:
  - block: 0
    path: debian12-0000.part
    offset: 0
    md5: b94ec0df4842a004dee7eea04eb91119
    md5_head: c1708f385846feeb96eea7646e1cc14d
  - block: 1
    path: debian12-0001.part
    offset: 33554432
    md5: f51efe8ee2ccf11745f2e0814afac48e
    md5_head: e4e89d6ece01b898440ef7c5ec03938c
  - block: 2
    path: debian12-0002.part
    offset: 67108864
    md5: 4628498be89b83486c7a9ac34feb5662
    md5_head: 8c18d6a5313f84920599ab8869e99ad4
  - block: 3
    path: debian12-0003.part
    offset: 100663296
    md5: 9ac5ade845864965088f11425da79733
    md5_head: 7200bfb69f3d8855ce36a08b40241ace
  - block: 4
    path: debian12-0004.part
    offset: 134217728
    md5: 651d1afdeb96bfbdb2f0d1ff4eb7dc9f
    md5_head: 67836f61ac788d996931ec50d8cfe702
  - block: 5
    path: debian12-0005.part
    offset: 167772160
    md5: b0a5944367270d4d0f460439e0515970
    md5_head: b781097a91c20a1c5c275c814b54e153
  - block: 6
    path: debian12-0006.part
    offset: 201326592
    md5: 8609b0ab9534397f60d45953a51e1a7f
    md5_head: 2784576959977c5d6af30c9644704e6e
  - block: 7
    path: debian12-0007.part
    offset: 234881024
    md5: e9de5729504f0275e546937630f158a8
    md5_head: 4a372b296c1dae51661a72e36c120e5c
  - block: 8
    path: debian12-0008.part
    offset: 268435456
    md5: be8f6cdea534cb690aa061394ad6f77c
    md5_head: 770786dc612d1c319b1b3c9b2da44011
  - block: 9
    path: debian12-0009.part
    offset: 301989888
    md5: 357ff7e961026d0722674ccf8e8cedc1
    md5_head: fbcdc0f7c0c4d6ef061bae4780d6d026
  - block: 10
    path: debian12-0010.part
    offset: 335544320
    md5: 3e5a63a397ac81fe862848833ff6fde0
    md5_head: 85b1e258da5d1db4852def86a4664b8f
  - block: 11
    path: debian12-0011.part
    offset: 369098752
    md5: a3cf4ee62f28dde9e1997f1166d2915d
    md5_head: 7e1f35e49cb45836614befb30e9a4ad5
  - block: 12
    path: debian12-0012.part
    offset: 402653184
    md5: 2a6877a426d2a4a9b00e0416087d5920
    md5_head: fc081369872dfab4c7391d9d9c62491c
  - block: 13
    path: debian12-0013.part
    offset: 436207616
    md5: 9bfd396bb5c7492174e233c23df17516
    md5_head: 8e0338b0d2ccba220da4cdc5f6811d6d
  - block: 14
    path: debian12-0014.part
    offset: 469762048
    md5: 97774ebf805893e6fffc1d783b15e350
    md5_head: 3cc273002414d80e342afaafd94f8fc8
  - block: 15
    path: debian12-0015.part
    offset: 503316480
    md5: 46c66aa5751a9c195073c5c42e6da7b5
    md5_head: b9f1d528ac7ce3c862e64ca055e23faf
  - block: 16
    path: debian12-0016.part
    offset: 536870912
    md5: 271308910a719094c68ab605e134f1b7
    md5_head: 4c70a88e4e53ec232347b12670ca29d4
  - block: 17
    path: debian12-0017.part
    offset: 570425344
    md5: 4a5813a5aa4c9f6aab172606283fa6af
    md5_head: f28a14c700238cf056c0f054f56de40d
  - block: 18
    path: debian12-0018.part
    offset: 603979776
    md5: 53b21c20c2b9f85fd6ab14b6d8b50650
    md5_head: 25112155d35d8568338c99ba318b92a1
  - block: 19
    path: debian12-0019.part
    offset: 637534208
    md5: 88c0e81466d1538b869fd5b8c3449a97
    md5_head: 97b251252405b0ee3e6d04e0499b2c1e
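Given metadata like the above, reassembly amounts to writing each part file at its recorded offset and then hashing the result. The sketch below is an assumption about how this could be done, not the tool's actual code: `reassemble` is a hypothetical function, and it expects the `blocks` list as plain Python dictionaries with the `path` and `offset` keys shown above (for instance as loaded by a YAML parser).

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def reassemble(blocks: List[Dict], parts_dir: Path, output: Path, chunk_size: int = 8192) -> str:
    """Write every part at its recorded offset and return the MD5 of the result."""
    with open(output, "wb") as target:
        for block in sorted(blocks, key=lambda b: b["offset"]):
            target.seek(block["offset"])
            with open(parts_dir / block["path"], "rb") as part:
                # Copy the part in small chunks to keep memory use flat
                for chunk in iter(lambda: part.read(chunk_size), b""):
                    target.write(chunk)

    digest = hashlib.md5()
    with open(output, "rb") as result:
        for chunk in iter(lambda: result.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned digest can then be compared with the top-level `md5` value in the metadata to confirm that the assembled file is complete.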
Use Cases
- Downloading large ISO images
- Handling unstable internet connections
- Improving download speeds through parallelization
- Verifying integrity of large downloads