# Streams API (High Performance I/O)

For extremely large datasets, where allocating millions of `Cell` objects (or even Python lists) at once would consume too much memory, `pyopenxlsx` provides `XLStreamWriter` and `XLStreamReader`.

These classes bypass the standard document model and write/read directly to/from the underlying XML streams on disk, offering the lowest possible memory footprint.
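To make the memory argument concrete, here is a minimal, library-independent sketch of the streaming idea: each row is serialized and flushed as soon as it is produced, so memory use does not grow with the row count. The XML shape below is only illustrative; it is not `pyopenxlsx`'s actual output format.

```python
import io
from xml.sax.saxutils import escape

def write_rows_streaming(stream, rows):
    """Serialize rows one at a time; memory use is independent of row count."""
    stream.write("<sheetData>")
    for r, row in enumerate(rows, start=1):
        cells = "".join(
            f'<c t="inlineStr"><is><t>{escape(str(v))}</t></is></c>' for v in row
        )
        stream.write(f'<row r="{r}">{cells}</row>')
    stream.write("</sheetData>")

buf = io.StringIO()
# A generator expression keeps only one row in memory at a time
write_rows_streaming(buf, ([i, f"User_{i}"] for i in range(3)))
```

Because each row is written and then forgotten, the same loop scales to millions of rows with a flat memory profile.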

## Stream Writer

`stream_writer` allows you to append rows sequentially. Important: while a stream writer is active, do not use standard cell assignment methods on that worksheet until `writer.close()` has been called.

```python
from pyopenxlsx import Workbook, XLFont

with Workbook("large_output.xlsx") as wb:
    ws = wb.active

    # Open a stream writer for this worksheet
    writer = ws.stream_writer()

    # 1. Append rows of plain values
    writer.append_row(["ID", "Name", "Score"])

    # 2. Append rows with styles (using tuples: (value, style_index))
    bold_style = wb.add_style(font=wb.styles.fonts().create(XLFont(bold=True)))
    writer.append_row([
        (1, bold_style),
        ("Alice", bold_style),
        99.9,  # Plain value, inherits the default style
    ])

    for i in range(1_000_000):
        # Appends a row immediately to the XML stream
        writer.append_row([i, f"User_{i}", 99.9])

    # Close the stream before touching the worksheet again
    writer.close()
```

Recommended: use the stream writer as a context manager, which closes it automatically:

```python
from pyopenxlsx import Workbook

with Workbook("large_output.xlsx") as wb:
    ws = wb.active
    with ws.stream_writer() as writer:
        writer.append_row([1, 2, 3])
```
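Because the writer only ever needs one row at a time, it pairs naturally with a database cursor: iterating the cursor streams rows without materializing the whole result set. The sketch below uses `sqlite3` and a plain list as a stand-in sink so it runs anywhere; with `pyopenxlsx` you would call `writer.append_row(row)` in the same loop.

```python
import sqlite3

# In-memory database standing in for a real data source
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER, name TEXT, score REAL)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(i, f"User_{i}", 99.9) for i in range(5)],
)

sink = []  # stand-in for writer.append_row; swap in the stream writer here
sink.append(["ID", "Name", "Score"])  # header row

cursor = conn.execute("SELECT id, name, score FROM scores ORDER BY id")
for row in cursor:  # the cursor yields rows lazily, one at a time
    sink.append(list(row))
```

The key point is that neither the cursor loop nor the row-appending writer ever holds more than one row of the result set.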

## Stream Reader

`stream_reader` allows you to iterate through rows sequentially without loading the entire worksheet into memory.

```python
from pyopenxlsx import Workbook

with Workbook("large_input.xlsx") as wb:
    ws = wb.active
    
    # Open a stream reader for this worksheet
    reader = ws.stream_reader()
    
    # Iterate through rows sequentially using Python iterator
    for row_data in reader:
        current_row_idx = reader.current_row_index
        # Process row_data... (a list of values)
        # print(f"Row {current_row_idx}: {row_data}")
```

## Use Cases

- Exporting database query results directly to Excel.
- Parsing multi-gigabyte `.xlsx` files where loading the DOM would trigger out-of-memory errors.
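The second use case relies on the same principle in reverse: pull-parse the sheet XML and discard each element as soon as it has been processed. Stream readers over `.xlsx` are typically built on something like `xml.etree.ElementTree.iterparse`; the sketch below demonstrates that technique on a tiny hand-made XML document (it is not `pyopenxlsx`'s internals).

```python
import io
import xml.etree.ElementTree as ET

xml_doc = (
    b"<sheetData>"
    b'<row r="1"><c><v>1</v></c><c><v>2</v></c></row>'
    b'<row r="2"><c><v>3</v></c><c><v>4</v></c></row>'
    b"</sheetData>"
)

rows = []
# iterparse yields each element when its closing tag arrives; clearing a
# processed <row> keeps memory flat regardless of file size.
for event, elem in ET.iterparse(io.BytesIO(xml_doc), events=("end",)):
    if elem.tag == "row":
        rows.append([v.text for v in elem.iter("v")])
        elem.clear()
```

After the loop, `rows` holds the cell values as strings, two per row, while the parser never kept more than one `<row>` element alive.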