Class BroadcastableClusterInfo
java.lang.Object
    org.apache.cassandra.spark.bulkwriter.BroadcastableClusterInfo
- All Implemented Interfaces:
java.io.Serializable, IBroadcastableClusterInfo
public final class BroadcastableClusterInfo extends java.lang.Object implements IBroadcastableClusterInfo
Broadcastable wrapper for a single cluster with ZERO transient fields to optimize Spark broadcasting. Only essential fields are broadcast; executors reconstruct CassandraClusterInfo to fetch other data from Sidecar.
Why ZERO transient fields matters:
Spark's SizeEstimator uses reflection to estimate object sizes before broadcasting. Each transient field forces SizeEstimator to inspect the field's type hierarchy, which is expensive. Logger references are particularly costly due to their deep object graphs (appenders, layouts, contexts). By eliminating ALL transient fields and Logger references, we:
- Minimize SizeEstimator reflection overhead during broadcast preparation
- Reduce broadcast variable serialization size
- Avoid accidental serialization of non-serializable objects
- See Also:
Serialized Form
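The zero-transient-field rule described above can be checked mechanically. The following is a minimal sketch, not the project's own code: `SlimClusterSnapshot`, `CachedClusterSnapshot`, and `TransientFieldCheck` are hypothetical names introduced for illustration. It counts transient instance fields by reflection and confirms that the slim holder survives a plain Java serialization round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for a broadcastable holder: only final,
// non-transient instance fields and no Logger field.
final class SlimClusterSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String clusterId;              // may be null for a single cluster
    private final String lowestCassandraVersion;

    SlimClusterSnapshot(String clusterId, String lowestCassandraVersion) {
        this.clusterId = clusterId;
        this.lowestCassandraVersion = lowestCassandraVersion;
    }

    String clusterId() { return clusterId; }
    String lowestCassandraVersion() { return lowestCassandraVersion; }
}

// Hypothetical counter-example: one transient field that the check should flag.
final class CachedClusterSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient Object cachedTopology;     // illustrative only, never used
}

final class TransientFieldCheck {
    // Counts transient instance fields declared on the type and its superclasses.
    static int countTransientFields(Class<?> type) {
        int count = 0;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int modifiers = field.getModifiers();
                if (Modifier.isTransient(modifiers) && !Modifier.isStatic(modifiers)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Serializes and deserializes a value with plain Java serialization.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countTransientFields(SlimClusterSnapshot.class));
        System.out.println(countTransientFields(CachedClusterSnapshot.class));
        System.out.println(roundTrip(new SlimClusterSnapshot(null, "4.0.0")).lowestCassandraVersion());
    }
}
```

A check like `countTransientFields` could be placed in a unit test so that a transient field added later fails the build rather than silently slowing down broadcast preparation.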
Method Summary
Modifier and Type | Method | Description
java.lang.String | clusterId() | ID string that can uniquely identify a cluster.
static BroadcastableClusterInfo | from(ClusterInfo source, BulkSparkConf conf) | Creates a BroadcastableCluster from a CassandraClusterInfo by extracting essential fields.
BulkSparkConf | getConf() |
java.lang.String | getLowestCassandraVersion() |
org.apache.cassandra.spark.data.partitioner.Partitioner | getPartitioner() |
ClusterInfo | reconstruct() | Reconstructs a full ClusterInfo instance from this broadcastable data on executors.
-
Method Detail
-
from
public static BroadcastableClusterInfo from(@NotNull ClusterInfo source, @NotNull BulkSparkConf conf)
Creates a BroadcastableCluster from a CassandraClusterInfo by extracting essential fields. Executors will reconstruct CassandraClusterInfo to fetch other data from Sidecar.
- Parameters:
source - the source ClusterInfo (typically CassandraClusterInfo)
conf - the BulkSparkConf needed to connect to Sidecar on executors
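The "extracting essential fields" step can be pictured as an eager copy performed on the driver. The sketch below is illustrative only: `LiveClusterSource` and `ExtractedClusterFields` are hypothetical stand-ins, not the real ClusterInfo or BroadcastableClusterInfo types. The factory reads the values it needs once and keeps no reference to the heavyweight source, so only plain data is broadcast.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical driver-side source; in practice it may hold connections or caches.
interface LiveClusterSource {
    String partitionerName();
    String lowestVersion();
}

// Hypothetical broadcastable holder built by an eager copy of the source's values.
final class ExtractedClusterFields implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String partitionerName;
    private final String lowestVersion;

    private ExtractedClusterFields(String partitionerName, String lowestVersion) {
        this.partitionerName = partitionerName;
        this.lowestVersion = lowestVersion;
    }

    // Reads each value once on the driver; the source itself is not retained.
    static ExtractedClusterFields from(LiveClusterSource source) {
        Objects.requireNonNull(source, "source");
        return new ExtractedClusterFields(source.partitionerName(), source.lowestVersion());
    }

    String partitionerName() { return partitionerName; }
    String lowestVersion() { return lowestVersion; }
}
```

Rejecting a null source up front mirrors the @NotNull annotations on the real method's parameters; how the real implementation enforces them is not stated in this page.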
-
getConf
public BulkSparkConf getConf()
- Specified by:
getConf in interface IBroadcastableClusterInfo
- Returns:
the BulkSparkConf configuration needed to reconstruct ClusterInfo on executors
-
getLowestCassandraVersion
public java.lang.String getLowestCassandraVersion()
- Specified by:
getLowestCassandraVersion in interface IBroadcastableClusterInfo
- Returns:
the lowest Cassandra version in the cluster
-
getPartitioner
public org.apache.cassandra.spark.data.partitioner.Partitioner getPartitioner()
- Specified by:
getPartitioner in interface IBroadcastableClusterInfo
- Returns:
the partitioner used by the cluster
-
clusterId
@Nullable public java.lang.String clusterId()
Description copied from interface: IBroadcastableClusterInfo
ID string that can uniquely identify a cluster. When writing to a single cluster, this may be null. When in coordinated write mode (writing to multiple clusters), this must return a unique string.
- Specified by:
clusterId in interface IBroadcastableClusterInfo
- Returns:
cluster id string, null if absent
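The nullability contract above (null allowed for a single cluster, unique non-null IDs required in coordinated write mode) could be enforced by a small guard. This is a sketch under assumptions: `ClusterIdRules` and its `validate` method are hypothetical and do not appear in the library.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical guard for the clusterId contract; not part of the real API.
final class ClusterIdRules {
    // In single-cluster mode any value (including null) is accepted.
    // In coordinated write mode every ID must be non-null and unique.
    static void validate(List<String> clusterIds, boolean coordinatedWrite) {
        if (!coordinatedWrite) {
            return;
        }
        Set<String> seen = new HashSet<>();
        for (String id : clusterIds) {
            if (id == null) {
                throw new IllegalArgumentException(
                    "clusterId must not be null in coordinated write mode");
            }
            if (!seen.add(id)) {
                throw new IllegalArgumentException("duplicate clusterId: " + id);
            }
        }
    }
}
```

Callers that consume `clusterId()` in single-cluster mode still need to handle a null return value.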
-
reconstruct
public ClusterInfo reconstruct()
Description copied from interface: IBroadcastableClusterInfo
Reconstructs a full ClusterInfo instance from this broadcastable data on executors. Each implementation knows how to reconstruct itself into the appropriate ClusterInfo type. This allows adding new broadcastable types without modifying the reconstruction logic in AbstractBulkWriterContext.
- Specified by:
reconstruct in interface IBroadcastableClusterInfo
- Returns:
reconstructed ClusterInfo (CassandraClusterInfo or CassandraClusterInfoGroup)
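The self-reconstruction design described above can be sketched with plain polymorphism. All type names below (`SketchClusterInfo`, `SketchBroadcastable`, `SingleClusterData`, `ClusterGroupData`, `ExecutorSideRebuild`) are hypothetical stand-ins for illustration and are not the library's classes. The executor-side caller invokes `reconstruct()` without inspecting the concrete type, so a new broadcastable variant needs no change to the caller.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the full, executor-side cluster view.
interface SketchClusterInfo {
    String describe();
}

// Hypothetical stand-in for the broadcastable contract.
interface SketchBroadcastable extends Serializable {
    SketchClusterInfo reconstruct();
}

// Single-cluster variant: rebuilds a single-cluster view.
final class SingleClusterData implements SketchBroadcastable {
    private static final long serialVersionUID = 1L;
    private final String clusterId;

    SingleClusterData(String clusterId) {
        this.clusterId = clusterId;
    }

    @Override
    public SketchClusterInfo reconstruct() {
        String id = clusterId;
        return () -> "single:" + id;
    }
}

// Multi-cluster variant: rebuilds a group view from its members.
final class ClusterGroupData implements SketchBroadcastable {
    private static final long serialVersionUID = 1L;
    private final ArrayList<SingleClusterData> members;

    ClusterGroupData(List<SingleClusterData> members) {
        this.members = new ArrayList<>(members);
    }

    @Override
    public SketchClusterInfo reconstruct() {
        StringBuilder text = new StringBuilder("group:");
        for (SingleClusterData member : members) {
            text.append('[').append(member.reconstruct().describe()).append(']');
        }
        String description = text.toString();
        return () -> description;
    }
}

// Executor-side caller: it never switches on the concrete broadcastable type.
final class ExecutorSideRebuild {
    static SketchClusterInfo rebuild(SketchBroadcastable data) {
        return data.reconstruct();
    }
}
```

Adding a third variant in this sketch means writing one more `SketchBroadcastable` implementation; `ExecutorSideRebuild.rebuild` stays unchanged, which is the property the description attributes to the real interface.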