feat(c++/python): support stream deserialization for c++ and python #3307

Draft
chaokunyang wants to merge 1 commit into apache:main from chaokunyang:cpp_stream_deserialization

chaokunyang (Collaborator) commented Feb 6, 2026

Why?

C++ and Python deserialization currently assumes data is already materialized in memory-backed buffers. This PR adds stream-backed deserialization support so payloads can be read incrementally from input streams while preserving existing serialization behavior and error handling.

What does this PR do?

  • Adds stream infrastructure in C++ (ForyInputStreamBuf/ForyInputStream) and integrates it with Buffer so reads can request more bytes on demand.
  • Adds Python-to-C++ stream bridge (Fory_PyCreateBufferFromStream) so pyfory.buffer.Buffer can be constructed from Python objects that implement read(size).
  • Makes deserialization paths stream-safe: header, string, and fixed-field reads now go through ensure_size checks, and stream-backed buffers skip the batched varint read path.
  • Extends build rules to compile/link new stream components and adds stream-focused C++ tests (stream_test.cc, buffer stream tests).
  • Adds Python stream tests (python/pyfory/tests/test_stream.py) and buffer stream coverage in python/pyfory/tests/test_buffer.py.
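
To make the on-demand read model in the list above concrete, here is a minimal Python sketch. It is not the pyfory API: `StreamBackedBuffer`, `OneByteStream`, `read_varuint32`, and `chunk_size` are hypothetical names, and the varint layout (7-bit little-endian groups with a continuation bit) is assumed purely for illustration. The only things taken from the description are the idea of an `ensure_size` check before each read and a source object that implements `read(size)`.

```python
import io


class StreamBackedBuffer:
    """Hypothetical reader that pulls bytes from a read(size) source on demand."""

    def __init__(self, stream, chunk_size=4096):
        self._stream = stream
        self._chunk_size = chunk_size
        self._data = bytearray()
        self._pos = 0  # read cursor into self._data

    def ensure_size(self, n):
        """Pull from the stream until n unread bytes are buffered."""
        while len(self._data) - self._pos < n:
            chunk = self._stream.read(self._chunk_size)
            if not chunk:
                raise EOFError(f"need {n} bytes, stream ended early")
            self._data += chunk

    def read_bytes(self, n):
        self.ensure_size(n)
        out = bytes(self._data[self._pos:self._pos + n])
        self._pos += n
        return out

    def read_varuint32(self):
        """Byte-at-a-time varint decode: never touches bytes it has not ensured."""
        result = 0
        for shift in range(0, 35, 7):  # at most 5 bytes
            self.ensure_size(1)
            byte = self._data[self._pos]
            self._pos += 1
            result |= (byte & 0x7F) << shift
            if byte < 0x80:  # continuation bit clear: last byte
                return result
        raise ValueError("varint is longer than 5 bytes")


class OneByteStream:
    """Test source that hands back at most one byte per read call."""

    def __init__(self, payload):
        self._inner = io.BytesIO(payload)

    def read(self, size=-1):
        return self._inner.read(1)


payload = bytes([0xAC, 0x02]) + b"hello"  # varint 300, then 5 raw bytes
buf = StreamBackedBuffer(OneByteStream(payload))
length = buf.read_varuint32()
text = buf.read_bytes(5)
```

A batched varint decode typically loads several bytes at once and then works out how many belonged to the value. On a stream-backed buffer those extra bytes may not have arrived yet, which is presumably why a per-byte path is needed there. `OneByteStream` exercises the worst case, where every read returns a single byte.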

Related issues

N/A

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

N/A

chaokunyang marked this pull request as draft on February 6, 2026, 15:43

Zakir032002 added a commit to Zakir032002/fory that referenced this pull request on Feb 15, 2026:
Implements apache#3300 aligned with C++ PR apache#3307 stream model.

- Add ForyStreamBuf: growable buffer wrapping dyn Read, no compaction
- Make Reader stream-aware: ensure_readable before reads, sync_stream_pos after
- Add byte-at-a-time varint fallbacks for stream-backed readers
- Fix deserialize_from to transfer stream state via take/restore pattern
- Preserve zero-overhead in-memory fast path (branch-light)
- Add 12 comprehensive stream tests (primitives, structs, strings,
  sequential decode, truncated stream errors, Vec, regression)

Closes apache#3300

Zakir032002 added a commit to Zakir032002/fory that referenced this pull request on Feb 16, 2026:
- Add standalone ForyStreamBuf with growable buffer
- Implements fill_buffer for on-demand reading from std::io::Read
- Buffer grows monotonically without compaction
- No integration with Reader yet (zero impact on existing code)
- Includes 4 unit tests for basic functionality

Design follows C++ PR apache#3307 and addresses apache#3300.
Part 1 of 3-phase implementation.
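
The "grows monotonically without compaction" property in the commit message above can be sketched in a few lines. The commit itself targets Rust and `std::io::Read`. The Python version below is only an analogue, and `GrowableStreamBuf`, `remaining`, `consume`, and `chunk_size` are hypothetical names. `fill_buffer` is borrowed from the commit message, but its signature and boolean return value here are assumptions.

```python
import io


class GrowableStreamBuf:
    """Hypothetical Python analogue of a growable, never-compacted stream buffer."""

    def __init__(self, source, chunk_size=8):
        self.source = source         # any object with read(size)
        self.chunk_size = chunk_size
        self.data = bytearray()      # only ever grows; consumed bytes are kept
        self.read_pos = 0

    def remaining(self):
        return len(self.data) - self.read_pos

    def fill_buffer(self, min_bytes):
        """Read until min_bytes are unread; return False if the source ends first."""
        while self.remaining() < min_bytes:
            want = max(self.chunk_size, min_bytes - self.remaining())
            chunk = self.source.read(want)
            if not chunk:
                return False
            self.data.extend(chunk)  # append only, never shift or drop earlier bytes
        return True

    def consume(self, n):
        """Return the next n bytes; the caller is expected to call fill_buffer first."""
        start = self.read_pos
        self.read_pos += n
        return bytes(self.data[start:self.read_pos])


buf = GrowableStreamBuf(io.BytesIO(b"0123456789abcdef"), chunk_size=4)
ok_first = buf.fill_buffer(6)
first = buf.consume(6)
size_after_first = len(buf.data)
ok_second = buf.fill_buffer(10)
second = buf.consume(10)
ok_third = buf.fill_buffer(1)  # source is exhausted at this point
```

Because nothing is ever shifted or dropped, an absolute offset into `data` stays valid for the lifetime of the buffer. The cost is memory that grows with the total number of bytes read.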