compression-experiment-64KB#4554

Open
minal-kyada wants to merge 7716 commits into apache:trunk from minal-kyada:zstd-64KB-chunk-compression-experiment

@minal-kyada

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

smiklosovic and others added 30 commits August 20, 2025 10:53
Adds setNativeTransportMaxConcurrentConnectionsPerIp and
getNativeTransportMaxConcurrentConnectionsPerIp to StorageProxyMBean so
these methods can be used from JMX like the
*NativeTransportMaxConcurrentConnections methods.

patch by Andy Tolbert; reviewed by Chris Lohfink for CASSANDRA-20642
* cassandra-4.0:
  Add NativeTransportMaxConcurrentConnectionsPerIp to StorageProxyMBean
* cassandra-4.1:
  Add NativeTransportMaxConcurrentConnectionsPerIp to StorageProxyMBean
* cassandra-5.0:
  Add NativeTransportMaxConcurrentConnectionsPerIp to StorageProxyMBean
As per CASSANDRA-20045, we want to prevent full repair against
disk full scenarios. Current protection exists only for incremental
repair. This change updates the config name to not be
incremental repair specific, using the Replace annotation.

patch by Himanshu Jindal; reviewed by David Capwell, Jaydeepkumar Chovatia for CASSANDRA-20045
patch by Stefan Miklosovic; reviewed by Brandon Williams for CASSANDRA-20849
patch by Stefan Miklosovic; reviewed by Brandon Williams for CASSANDRA-20848
…ple partitions

patch by David Capwell; reviewed by Ariel Weisberg for CASSANDRA-20857
…mbstone is a max clustering value

patch by Dmitry Konstantinov; reviewed by Stefan Miklosovic for CASSANDRA-20855
…slWithDeprecatedSslStoragePort failing on missing stdout content

patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-20698
.build/run-ci is a Python script used to create and interact with k8s-provisioned ci-cassandra.apache.org clones.
See .build/run-ci.d/README.md for docs on usage.

.jenkins/k8s/ contains the k8s jenkins helm chart, JSaC configuration, and docker image for provisioning.

.build/run-ci.d/  contains python requirements and tests for .build/run-ci

 patch by Richa Mishra, Nishant Barola, Aleksandr Volochnev, Mick Semb Wever; reviewed by Richa Mishra, Nishant Barola, Aleksandr Volochnev, Brandon Hsieh, Mick Semb Wever, Brandon Williams for CASSANDRA-18145

Co-authored-by: Richa Mishra <richa.mishra@infracloud.io>
Co-authored-by: Nishant Barola <nishant.barola@infracloud.io>
Co-authored-by: Aleksandr Volochnev <a.volochnev@gmail.com>
Co-authored-by: Mick Semb Wever <mck@apache.org>
* cassandra-5.0:
  K8s immutable provisioning of ci-cassandra.apache.org jenkins instances
The javadoc target behaves unpredictably: sometimes it fails, sometimes it does not.
I strongly suspect that it simply does not have enough memory available.
This is currently a blocker for releases. My empirical testing shows that the more memory
we assign to javadoc generation, the less likely it is to fail.

patch by Stefan Miklosovic for CASSANDRA-20868
dcapwell and others added 29 commits December 5, 2025 16:14
…ted then the shared Row.Builder is not freed causing all future mutations to be rejected

patch by David Capwell; reviewed by Bernardo Botella Corbi, Caleb Rackliffe, Dmitry Konstantinov for CASSANDRA-21055
Patch by Yifan Cai; Reviewed by Stefan Miklosovic for CASSANDRA-21047
 - DurabilityQueue/ShardScheduler deadlock
 - MemtableCleanerThread.Cleanup assumes Boolean parameter is non-null, which is invalid if an exception has been thrown
 - AccordDurableOnFlush may be invoked while Accord is starting up, so should use AccordService.unsafeInstance
 - AccordCache shrink without lock regression
 - Cleanup system_accord compaction leftovers before starting up
 - system_accord_debug.txn order
 - system_accord_debug.txn_blocked_by order
 - system_accord_debug.shard_epochs order
Improve:
 - Set DefaultProgressLog.setMode(Catchup) during Catchup
 - IdentityAccumulators only need to readLast, not readAll
 - Limit number of static segments we compact at once to sstable
 - If too many static segments on startup, wait for them to be compacted

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-21053
patch by Stefan Miklosovic; reviewed by Brad Schoening for CASSANDRA-21037
patch by Isaac Reath; reviewed by Jyothsna Konisa, Stefan Miklosovic for CASSANDRA-21057
Converting collections or UDTs to raw bytes is nonsensical - it
allows reading raw serialized bytes which have no meaningful
interpretation.

patch by Mikołaj Diakowski; reviewed by Stefan Miklosovic, Brandon Williams for CASSANDRA-20982
Patch by Arra Praveen; reviewed by Jyothsna Konisa, Brad Schoening for CASSANDRA-20800
patch by Sunil Ramchandra Pawar; reviewed by Caleb Rackliffe and David Capwell for CASSANDRA-20949
patch by Alan Wang; reviewed by David Capwell, Marcus Eriksson for CASSANDRA-21048
… them to ArrayCell instead of BufferCell.

Additionally, replace List with array for bind values (we know the size in advance during decoding), so in total List<List> is replaced with byte[][]. QueryOptions classes now support both ways to get values: the old API with ByteBuffer and a new API with byte[].

Patch by Dmitry Konstantinov; reviewed by Michael Semb Wever for CASSANDRA-20166
 - DefaultLocalListeners.ComplexListeners iterator IndexOutOfBoundsException
 - Race condition initialising empty ActiveEpochs, when minimum pending epoch can move backwards
 - SyncPoints must be declared in an epoch containing the ranges, and PENDING_REMOVAL ranges will reject non-syncpoint transactions
 - AccordExecutorMetrics is now registered on startup
 - getRecentValues for non-cumulative histogram should not subtract prior values
Improve:
 - Report ephemeral read, epoch waits and timeout metrics
 - Remove Topologies.SelectNodeOwnership, as no need to SLICE anymore
 - Introduce SystemEventListener for epoch waiting and timeout metrics
 - No-op but log if gcBefore provided to CFK is in the past

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-21076
patch by Rishabh Saraswat; reviewed by Jyothsna Konisa, Brad Schoening for CASSANDRA-21007
…rather than INVALID and TxnReferenceOperation didn't handle all collections properly

patch by David Capwell; reviewed by Benedict Elliott Smith, Caleb Rackliffe, Jyothsna Konisa for CASSANDRA-21061
patch by Brad Schoening; reviewed by Jyothsna Konisa for CASSANDRA-20405
patch by Brad Schoening; reviewed by Jyothsna Konisa for CASSANDRA-20405
Patch by Dmitry Konstantinov; reviewed by Benedict Elliott Smith, Jyothsna Konisa for CASSANDRA-21080
Use a plain loop to check whether each symbol is ASCII before falling back to the more complicated UTF-8 parsing.
Avoid ValueAccessor to get an extra boost for the ASCII check, especially in non-monomorphic cases.

Patch by Dmitry Konstantinov; reviewed by Jyothsna Konisa, Stefan Miklosovic for CASSANDRA-21075
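
For readers unfamiliar with the ASCII fast-path idea in the CASSANDRA-21075 commit message above, here is a minimal sketch. It is not the patch itself: the class and method names (AsciiFastPath, isAscii, isValidUtf8, slowValidate) are hypothetical, and the slow path uses the JDK's CharsetDecoder as a stand-in for Cassandra's own UTF-8 validation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; names are not taken from the Cassandra code base.
final class AsciiFastPath
{
    /** Returns true if every byte has its high bit clear, i.e. is 7-bit ASCII. */
    static boolean isAscii(byte[] bytes)
    {
        // Plain indexed loop over a raw array: no accessor abstraction in the hot path.
        for (int i = 0; i < bytes.length; i++)
        {
            // Java bytes are signed, so a set high bit shows up as a negative value.
            if (bytes[i] < 0)
                return false;
        }
        return true;
    }

    /** Checks the cheap ASCII case first and only then runs full UTF-8 validation. */
    static boolean isValidUtf8(byte[] bytes)
    {
        if (isAscii(bytes))
            return true; // pure ASCII is always valid UTF-8
        return slowValidate(bytes);
    }

    /** Stand-in slow path (an assumption): strict decoding with the JDK decoder. */
    static boolean slowValidate(byte[] bytes)
    {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                                                       .onMalformedInput(CodingErrorAction.REPORT)
                                                       .onUnmappableCharacter(CodingErrorAction.REPORT);
        try
        {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        }
        catch (CharacterCodingException e)
        {
            return false;
        }
    }
}
```

The sketch works on a raw byte[] so the loop stays a simple array scan, which is the kind of code the JIT tends to optimise well. Whether the actual patch is structured this way is not stated in the commit message.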
…n races

Also Fix Cassandra:
 - In memory size calculation for CommandsForKey include Unmanaged
 - Accord load out-of-band cleanup should use SafeRedundantBefore
Also Improve Cassandra:
 - Report replay information on begin replay
 - Improve AccordService shutdown
 - Log command store RedundantBefore on shutdown
 - Segment compaction should wait for readOrder barrier to replace segments, for additional safety
 - Journal segments should share readOrder with sstables
Also Improve Accord:
 - Iterate LocalListeners in order, so can query more effectively on node
 - Refine AbstractReplay.minReplay/shouldReplay

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-21804
…le splits is added) and for test-burn to 4

Patch by Dmitry Konstantinov; reviewed by Michael Semb Wever, Jyothsna Konisa for CASSANDRA-21082
@minal-kyada minal-kyada force-pushed the zstd-64KB-chunk-compression-experiment branch from 96eed40 to 40bcdbb Compare February 20, 2026 18:29