Difference between revisions of "Blockdownload"

From BITPlan Wiki
Jump to navigation Jump to search
(Created page with "=OsProject= {{OsProject |id=blockdownload |state=active |owner=WolfgangFahl |title=blockdownload |url=https://github.com/WolfgangFahl/blockdownload |version=0.0.1 |descriptio...")
 
Line 7: Line 7:
 
|title=blockdownload
 
|title=blockdownload
 
|url=https://github.com/WolfgangFahl/blockdownload
 
|url=https://github.com/WolfgangFahl/blockdownload
|version=0.0.1
+
|version=0.0.2
 
|description=segmented/blockwise download of large files
 
|description=segmented/blockwise download of large files
|date=2025-05-05
+
|date=2025-05-08
 
|since=2025-05-05
 
|since=2025-05-05
 
|storemode=property
 
|storemode=property
 
}}
 
}}
 
{{pip|blockdownload}}
 
{{pip|blockdownload}}
 +
= blockdownload =
 +
A tool that downloads large files in parallel chunks and reassembles them - perfect for improving download speeds and handling interruptions.
 +
 +
[https://pypi.org/project/blockdownload/ pypi]
 +
[https://github.com/WolfgangFahl/blockdownload/actions/workflows/build.yml Github Actions Build]
 +
[https://pypi.python.org/pypi/blockdownload/ PyPI Status]
 +
[https://github.com/WolfgangFahl/blockdownload/issues GitHub issues]
 +
[https://github.com/WolfgangFahl/blockdownload/issues/?q=is%3Aissue+is%3Aclosed GitHub closed issues]
 +
[https://WolfgangFahl.github.io/blockdownload/ API Docs]
 +
[https://www.apache.org/licenses/LICENSE-2.0 License]
 +
 +
== Overview ==
 +
 +
'''blockdownload''' is a Python-based utility that divides large file downloads into smaller, manageable blocks. This approach offers several advantages:
 +
 +
* '''Parallel downloading''': Download multiple blocks simultaneously for improved speed
 +
* '''Resume capability''': Continue interrupted downloads without starting over
 +
* '''Integrity verification''': MD5 checksums for each block ensure data integrity
 +
* '''Cross-platform''': Works on Linux, macOS, and potentially other platforms
 +
 +
=== Documentation ===
 +
 +
[http://wiki.bitplan.com/index.php/blockdownload Wiki]
 +
 +
=== Usage ===
 +
<pre>
 +
usage: blockdownload [-h] --name NAME [--blocksize BLOCKSIZE]
 +
                    [--unit {KB,MB,GB}] [--from-block FROM_BLOCK]
 +
                    [--to-block TO_BLOCK] [--boost BOOST] [--progress]
 +
                    [--yaml YAML] [--force] [--output OUTPUT]
 +
                    url target
 +
 +
Segmented file downloader using HTTP range requests.
 +
 +
positional arguments:
 +
  url                  URL to download from
 +
  target                Target directory to store .part files
 +
 +
options:
 +
  -h, --help            show this help message and exit
 +
  --name NAME          Name for the download session (used for .yaml control
 +
                        file)
 +
  --blocksize BLOCKSIZE
 +
                        Block size (default: 10)
 +
  --unit {KB,MB,GB}    Block size unit (default: MB)
 +
  --from-block FROM_BLOCK
 +
                        First block index
 +
  --to-block TO_BLOCK  Last block index (inclusive)
 +
  --boost BOOST        Number of concurrent download threads (default: 1)
 +
  --progress            Show tqdm progress bar
 +
  --yaml YAML          Path to the YAML metadata file (for standalone
 +
                        reassembly)
 +
  --force              Overwrite output file if it exists
 +
  --output OUTPUT      Path where the final target file will be saved
 +
</pre>
 +
 +
=== Example Usage ===
 +
 +
The example below demonstrates downloading a
 +
[https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/ Debian netinst ISO] image 633MB with 32MB blocks
 +
<code>https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso</code>
 +
 +
==== scripts/debian12 ====
 +
<pre>
 +
blockdownload \
 +
  https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso \
 +
  /tmp/debian12 \
 +
  --name debian12 \
 +
  --blocksize 32 \
 +
  --unit MB \
 +
  --boost 1 \
 +
  --progress \
 +
  --output /tmp/debian12/debian12.iso
 +
</pre>
 +
 +
==== How It Works ====
 +
 +
# '''Chunking''': The target file is split into blocks (32MB in the example)
 +
# '''Parallel Download''': Each block is downloaded as a separate <code>.part</code> file
 +
# '''Progress Tracking''': Real-time progress is displayed during download
 +
# '''Assembly''': After downloading, blocks are reassembled into the final file
 +
# '''Verification''': MD5 checksums verify the integrity of the final file
 +
 +
==== Output Files ====
 +
 +
The tool creates several files in the output directory:
 +
 +
* '''Block files''': <code>{name}-{block_number}.part</code> - Each downloaded block
 +
* '''YAML metadata''': <code>{name}.yaml</code> - Contains block information, checksums, and offsets
 +
* '''Final file''': The reassembled target file
 +
* '''MD5 checksum''': <code>{name}.md5</code> - Verification of the final file
 +
 +
==== Parameters ====
 +
 +
{| class="wikitable"
 +
! Parameter !! Description
 +
|-
 +
| <code>--name</code> || Base name for download files
 +
|-
 +
| <code>--blocksize</code> || Size of each block
 +
|-
 +
| <code>--unit</code> || Unit for blocksize (KB, MB, GB)
 +
|-
 +
| <code>--boost</code> || Factor to improve download speed
 +
|-
 +
| <code>--progress</code> || Show download progress
 +
|-
 +
| <code>--output</code> || Path for the final assembled file
 +
|}
 +
 +
==== Example Output ====
 +
 +
<pre>
 +
Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.45GB/s]
 +
created /tmp/debian12/debian12.iso - 633.00 MB
 +
File reassembled successfully: /tmp/debian12/debian12.iso
 +
Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.43GB/s]
 +
6b6604d894b6d861e357be1447b370db  /tmp/debian12/debian12.iso
 +
</pre>
 +
 +
==== Metadata Structure ====
 +
 +
The tool maintains a YAML file with detailed information about each block:
 +
 +
* Block number
 +
* File path
 +
* Offset position
 +
* MD5 checksums (both for the entire block and block header)
 +
 +
This metadata allows for robust resumption of interrupted downloads and verification of data integrity.
 +
 +
<source lang='yaml'>
 +
name: debian12
 +
url: https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso
 +
blocksize: 32
 +
size: 663748608
 +
unit: MB
 +
chunk_size: 8192
 +
md5: 6b6604d894b6d861e357be1447b370db
 +
blocks:
 +
- block: 0
 +
  path: debian12-0000.part
 +
  offset: 0
 +
  md5: b94ec0df4842a004dee7eea04eb91119
 +
  md5_head: c1708f385846feeb96eea7646e1cc14d
 +
- block: 1
 +
  path: debian12-0001.part
 +
  offset: 33554432
 +
  md5: f51efe8ee2ccf11745f2e0814afac48e
 +
  md5_head: e4e89d6ece01b898440ef7c5ec03938c
 +
- block: 2
 +
  path: debian12-0002.part
 +
  offset: 67108864
 +
  md5: 4628498be89b83486c7a9ac34feb5662
 +
  md5_head: 8c18d6a5313f84920599ab8869e99ad4
 +
- block: 3
 +
  path: debian12-0003.part
 +
  offset: 100663296
 +
  md5: 9ac5ade845864965088f11425da79733
 +
  md5_head: 7200bfb69f3d8855ce36a08b40241ace
 +
- block: 4
 +
  path: debian12-0004.part
 +
  offset: 134217728
 +
  md5: 651d1afdeb96bfbdb2f0d1ff4eb7dc9f
 +
  md5_head: 67836f61ac788d996931ec50d8cfe702
 +
- block: 5
 +
  path: debian12-0005.part
 +
  offset: 167772160
 +
  md5: b0a5944367270d4d0f460439e0515970
 +
  md5_head: b781097a91c20a1c5c275c814b54e153
 +
- block: 6
 +
  path: debian12-0006.part
 +
  offset: 201326592
 +
  md5: 8609b0ab9534397f60d45953a51e1a7f
 +
  md5_head: 2784576959977c5d6af30c9644704e6e
 +
- block: 7
 +
  path: debian12-0007.part
 +
  offset: 234881024
 +
  md5: e9de5729504f0275e546937630f158a8
 +
  md5_head: 4a372b296c1dae51661a72e36c120e5c
 +
- block: 8
 +
  path: debian12-0008.part
 +
  offset: 268435456
 +
  md5: be8f6cdea534cb690aa061394ad6f77c
 +
  md5_head: 770786dc612d1c319b1b3c9b2da44011
 +
- block: 9
 +
  path: debian12-0009.part
 +
  offset: 301989888
 +
  md5: 357ff7e961026d0722674ccf8e8cedc1
 +
  md5_head: fbcdc0f7c0c4d6ef061bae4780d6d026
 +
- block: 10
 +
  path: debian12-0010.part
 +
  offset: 335544320
 +
  md5: 3e5a63a397ac81fe862848833ff6fde0
 +
  md5_head: 85b1e258da5d1db4852def86a4664b8f
 +
- block: 11
 +
  path: debian12-0011.part
 +
  offset: 369098752
 +
  md5: a3cf4ee62f28dde9e1997f1166d2915d
 +
  md5_head: 7e1f35e49cb45836614befb30e9a4ad5
 +
- block: 12
 +
  path: debian12-0012.part
 +
  offset: 402653184
 +
  md5: 2a6877a426d2a4a9b00e0416087d5920
 +
  md5_head: fc081369872dfab4c7391d9d9c62491c
 +
- block: 13
 +
  path: debian12-0013.part
 +
  offset: 436207616
 +
  md5: 9bfd396bb5c7492174e233c23df17516
 +
  md5_head: 8e0338b0d2ccba220da4cdc5f6811d6d
 +
- block: 14
 +
  path: debian12-0014.part
 +
  offset: 469762048
 +
  md5: 97774ebf805893e6fffc1d783b15e350
 +
  md5_head: 3cc273002414d80e342afaafd94f8fc8
 +
- block: 15
 +
  path: debian12-0015.part
 +
  offset: 503316480
 +
  md5: 46c66aa5751a9c195073c5c42e6da7b5
 +
  md5_head: b9f1d528ac7ce3c862e64ca055e23faf
 +
- block: 16
 +
  path: debian12-0016.part
 +
  offset: 536870912
 +
  md5: 271308910a719094c68ab605e134f1b7
 +
  md5_head: 4c70a88e4e53ec232347b12670ca29d4
 +
- block: 17
 +
  path: debian12-0017.part
 +
  offset: 570425344
 +
  md5: 4a5813a5aa4c9f6aab172606283fa6af
 +
  md5_head: f28a14c700238cf056c0f054f56de40d
 +
- block: 18
 +
  path: debian12-0018.part
 +
  offset: 603979776
 +
  md5: 53b21c20c2b9f85fd6ab14b6d8b50650
 +
  md5_head: 25112155d35d8568338c99ba318b92a1
 +
- block: 19
 +
  path: debian12-0019.part
 +
  offset: 637534208
 +
  md5: 88c0e81466d1538b869fd5b8c3449a97
 +
  md5_head: 97b251252405b0ee3e6d04e0499b2c1e
 +
</source>
 +
 +
 +
=== Use Cases ===
 +
 +
* Downloading large ISO images
 +
* Handling unstable internet connections
 +
* Improving download speeds through parallelization
 +
* Verifying integrity of large downloads
 +
 +
=== Authors ===
 +
* [http://www.bitplan.com/Wolfgang_Fahl Wolfgang Fahl]

Revision as of 07:41, 8 May 2025

OsProject

OsProject
edit
id  blockdownload
state  active
owner  WolfgangFahl
title  blockdownload
url  https://github.com/WolfgangFahl/blockdownload
version  0.0.2
description  segmented/blockwise download of large files
date  2025-05-08
since  2025-05-05
until  


Installation

pip install blockdownload
# alternatively if your pip is not a python3 pip
pip3 install blockdownload 
# local install from source directory of blockdownload 
pip install .

upgrade

pip install blockdownload  -U
# alternatively if your pip is not a python3 pip
pip3 install blockdownload -U


blockdownload

A tool that downloads large files in parallel chunks and reassembles them - perfect for improving download speeds and handling interruptions.

pypi Github Actions Build PyPI Status GitHub issues GitHub closed issues API Docs License

Overview

blockdownload is a Python-based utility that divides large file downloads into smaller, manageable blocks. This approach offers several advantages:

  • Parallel downloading: Download multiple blocks simultaneously for improved speed
  • Resume capability: Continue interrupted downloads without starting over
  • Integrity verification: MD5 checksums for each block ensure data integrity
  • Cross-platform: Works on Linux, macOS, and potentially other platforms

Documentation

Wiki

Usage

usage: blockdownload [-h] --name NAME [--blocksize BLOCKSIZE]
                     [--unit {KB,MB,GB}] [--from-block FROM_BLOCK]
                     [--to-block TO_BLOCK] [--boost BOOST] [--progress]
                     [--yaml YAML] [--force] [--output OUTPUT]
                     url target

Segmented file downloader using HTTP range requests.

positional arguments:
  url                   URL to download from
  target                Target directory to store .part files

options:
  -h, --help            show this help message and exit
  --name NAME           Name for the download session (used for .yaml control
                        file)
  --blocksize BLOCKSIZE
                        Block size (default: 10)
  --unit {KB,MB,GB}     Block size unit (default: MB)
  --from-block FROM_BLOCK
                        First block index
  --to-block TO_BLOCK   Last block index (inclusive)
  --boost BOOST         Number of concurrent download threads (default: 1)
  --progress            Show tqdm progress bar
  --yaml YAML           Path to the YAML metadata file (for standalone
                        reassembly)
  --force               Overwrite output file if it exists
  --output OUTPUT       Path where the final target file will be saved

Example Usage

The example below demonstrates downloading a Debian netinst ISO image 633MB with 32MB blocks https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso

scripts/debian12

blockdownload \
  https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso \
  /tmp/debian12 \
  --name debian12 \
  --blocksize 32 \
  --unit MB \
  --boost 1 \
  --progress \
  --output /tmp/debian12/debian12.iso

How It Works

  1. Chunking: The target file is split into blocks (32MB in the example)
  2. Parallel Download: Each block is downloaded as a separate .part file
  3. Progress Tracking: Real-time progress is displayed during download
  4. Assembly: After downloading, blocks are reassembled into the final file
  5. Verification: MD5 checksums verify the integrity of the final file

Output Files

The tool creates several files in the output directory:

  • Block files: {name}-{block_number}.part - Each downloaded block
  • YAML metadata: {name}.yaml - Contains block information, checksums, and offsets
  • Final file: The reassembled target file
  • MD5 checksum: {name}.md5 - Verification of the final file

Parameters

Parameter Description
--name Base name for download files
--blocksize Size of each block
--unit Unit for blocksize (KB, MB, GB)
--boost Factor to improve download speed
--progress Show download progress
--output Path for the final assembled file

Example Output

Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.45GB/s]
created /tmp/debian12/debian12.iso - 633.00 MB
File reassembled successfully: /tmp/debian12/debian12.iso
Creating target: 100%|███████████████████████| 664M/664M [00:00<00:00, 1.43GB/s]
6b6604d894b6d861e357be1447b370db  /tmp/debian12/debian12.iso

Metadata Structure

The tool maintains a YAML file with detailed information about each block:

  • Block number
  • File path
  • Offset position
  • MD5 checksums (both for the entire block and block header)

This metadata allows for robust resumption of interrupted downloads and verification of data integrity.

name: debian12
url: https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.10.0-amd64-netinst.iso
blocksize: 32
size: 663748608
unit: MB
chunk_size: 8192
md5: 6b6604d894b6d861e357be1447b370db
blocks:
- block: 0
  path: debian12-0000.part
  offset: 0
  md5: b94ec0df4842a004dee7eea04eb91119
  md5_head: c1708f385846feeb96eea7646e1cc14d
- block: 1
  path: debian12-0001.part
  offset: 33554432
  md5: f51efe8ee2ccf11745f2e0814afac48e
  md5_head: e4e89d6ece01b898440ef7c5ec03938c
- block: 2
  path: debian12-0002.part
  offset: 67108864
  md5: 4628498be89b83486c7a9ac34feb5662
  md5_head: 8c18d6a5313f84920599ab8869e99ad4
- block: 3
  path: debian12-0003.part
  offset: 100663296
  md5: 9ac5ade845864965088f11425da79733
  md5_head: 7200bfb69f3d8855ce36a08b40241ace
- block: 4
  path: debian12-0004.part
  offset: 134217728
  md5: 651d1afdeb96bfbdb2f0d1ff4eb7dc9f
  md5_head: 67836f61ac788d996931ec50d8cfe702
- block: 5
  path: debian12-0005.part
  offset: 167772160
  md5: b0a5944367270d4d0f460439e0515970
  md5_head: b781097a91c20a1c5c275c814b54e153
- block: 6
  path: debian12-0006.part
  offset: 201326592
  md5: 8609b0ab9534397f60d45953a51e1a7f
  md5_head: 2784576959977c5d6af30c9644704e6e
- block: 7
  path: debian12-0007.part
  offset: 234881024
  md5: e9de5729504f0275e546937630f158a8
  md5_head: 4a372b296c1dae51661a72e36c120e5c
- block: 8
  path: debian12-0008.part
  offset: 268435456
  md5: be8f6cdea534cb690aa061394ad6f77c
  md5_head: 770786dc612d1c319b1b3c9b2da44011
- block: 9
  path: debian12-0009.part
  offset: 301989888
  md5: 357ff7e961026d0722674ccf8e8cedc1
  md5_head: fbcdc0f7c0c4d6ef061bae4780d6d026
- block: 10
  path: debian12-0010.part
  offset: 335544320
  md5: 3e5a63a397ac81fe862848833ff6fde0
  md5_head: 85b1e258da5d1db4852def86a4664b8f
- block: 11
  path: debian12-0011.part
  offset: 369098752
  md5: a3cf4ee62f28dde9e1997f1166d2915d
  md5_head: 7e1f35e49cb45836614befb30e9a4ad5
- block: 12
  path: debian12-0012.part
  offset: 402653184
  md5: 2a6877a426d2a4a9b00e0416087d5920
  md5_head: fc081369872dfab4c7391d9d9c62491c
- block: 13
  path: debian12-0013.part
  offset: 436207616
  md5: 9bfd396bb5c7492174e233c23df17516
  md5_head: 8e0338b0d2ccba220da4cdc5f6811d6d
- block: 14
  path: debian12-0014.part
  offset: 469762048
  md5: 97774ebf805893e6fffc1d783b15e350
  md5_head: 3cc273002414d80e342afaafd94f8fc8
- block: 15
  path: debian12-0015.part
  offset: 503316480
  md5: 46c66aa5751a9c195073c5c42e6da7b5
  md5_head: b9f1d528ac7ce3c862e64ca055e23faf
- block: 16
  path: debian12-0016.part
  offset: 536870912
  md5: 271308910a719094c68ab605e134f1b7
  md5_head: 4c70a88e4e53ec232347b12670ca29d4
- block: 17
  path: debian12-0017.part
  offset: 570425344
  md5: 4a5813a5aa4c9f6aab172606283fa6af
  md5_head: f28a14c700238cf056c0f054f56de40d
- block: 18
  path: debian12-0018.part
  offset: 603979776
  md5: 53b21c20c2b9f85fd6ab14b6d8b50650
  md5_head: 25112155d35d8568338c99ba318b92a1
- block: 19
  path: debian12-0019.part
  offset: 637534208
  md5: 88c0e81466d1538b869fd5b8c3449a97
  md5_head: 97b251252405b0ee3e6d04e0499b2c1e


Use Cases

  • Downloading large ISO images
  • Handling unstable internet connections
  • Improving download speeds through parallelization
  • Verifying integrity of large downloads

Authors