syft.frameworks.torch.linalg.lr¶
Module Contents¶
-
class
syft.frameworks.torch.linalg.lr.EncryptedLinearRegression(crypto_provider: BaseWorker, hbc_worker: BaseWorker, precision_fractional: int = 6, fit_intercept: bool = True)¶ Multi-Party Linear Regressor based on Jonathan Bloom’s algorithm. It performs linear regression using Secure Multi-Party Computation (SMPC). Training is performed in SMPC, but the final regression coefficients are public at the end, and predictions are made in the clear on local or pointer Tensors.
Reference: Section 2 of https://arxiv.org/abs/1901.09531
- Parameters
crypto_provider – a BaseWorker providing crypto elements for AdditiveSharingTensors (used for SMPC) such as Beaver triples
hbc_worker – the “Honest but Curious” BaseWorker. SMPC operations in PySyft use SecureNN protocols, which are based on 3-party computation. To apply them with more than 3 parties, an “Honest but Curious” worker is needed. To perform the Encrypted Linear Regression, the algorithm randomly chooses one of the workers in the pool and secret-shares all tensors with the chosen worker, the crypto provider and the “Honest but Curious” worker. The main role of this worker is to prevent the collusion between two pool workers that would be possible if the algorithm instead secret-shared the tensors with two randomly chosen workers and the crypto provider. An “Honest but Curious” worker is a legitimate participant in a communication protocol who will not deviate from the defined protocol but will attempt to learn all possible information from legitimately received messages.
precision_fractional – precision chosen for FixedPrecisionTensors
fit_intercept – whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered)
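To make the roles of precision_fractional and of additive sharing more concrete, here is a minimal pure-Python sketch of fixed-precision encoding followed by additive secret sharing among three parties. It does not use PySyft: the helper names (encode, decode, share, reconstruct, add_shared) and the ring size Q are hypothetical and chosen only for illustration.

```python
import random

PRECISION_FRACTIONAL = 6           # same default as in the class signature
SCALE = 10 ** PRECISION_FRACTIONAL
Q = 2 ** 62                        # hypothetical ring size for this sketch


def encode(x: float) -> int:
    """Map a float to a fixed-precision integer in the ring Z_Q."""
    return int(round(x * SCALE)) % Q


def decode(v: int) -> float:
    """Map a ring element back to a float; the upper half of the ring is negative."""
    if v > Q // 2:
        v -= Q
    return v / SCALE


def share(v: int, n_parties: int = 3) -> list:
    """Split v into n_parties additive shares that sum to v modulo Q."""
    shares = [random.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((v - sum(shares)) % Q)
    return shares


def reconstruct(shares: list) -> int:
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % Q


def add_shared(a: list, b: list) -> list:
    """Addition is local: each party adds the two shares it holds."""
    return [(x + y) % Q for x, y in zip(a, b)]


a_shares = share(encode(1.25))
b_shares = share(encode(-3.5))
result = decode(reconstruct(add_shared(a_shares, b_shares)))
print(result)
```

Addition of shared values needs no communication, as shown. Multiplication does, and that is where the Beaver triples supplied by the crypto provider come in; this sketch leaves that part out.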
-
coef¶ torch.Tensor of shape (n_features, ). Estimated coefficients for the linear regression problem.
-
intercept¶ torch.Tensor of shape (1, ) if fit_intercept is set to True, None otherwise. Estimated intercept for the linear regression.
-
pvalue_coef¶ numpy.array of shape (n_features, ). Two-sided p-values for hypothesis tests whose null hypothesis is that each coefficient is zero.
-
pvalue_intercept¶ numpy.array of shape (1, ) if fit_intercept is set to True, None otherwise. Two-sided p-value for a hypothesis test whose null hypothesis is that the intercept is zero.
-
fit(self, X_ptrs: List[torch.Tensor], y_ptrs: List[torch.Tensor])¶ Fits the linear model using Secure Multi-Party Linear Regression. The final results (i.e. coefficients and p-values) will be public.
-
predict(self, X: torch.Tensor)¶ Performs prediction with the linear model on X, which can be a local torch.Tensor or a wrapped PointerTensor. The result is either a local torch.Tensor or a wrapped PointerTensor, depending on the nature of X.
-
summarize(self)¶ Prints a summary of the coefficients and their statistics. This method should be called only after the model has been trained.
-
_check_ptrs(self, X_ptrs, y_ptrs)¶ Method that checks whether the lists of pointers corresponding to the explanatory and explained variables have the expected elements. Along the way, it also computes some of the Regressor’s attributes, such as the number of features and the total sample size.
-
static
_add_intercept(X_ptrs)¶ Adds a column vector of 1’s at the beginning of each tensor in X_ptrs.
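As a rough illustration of what prepending a column of 1’s means, the sketch below applies it to plain nested lists instead of remote tensors. The function name add_intercept_column is hypothetical and is not part of the module.

```python
def add_intercept_column(X_parts):
    """Prepend a constant 1.0 to every row of every local design matrix."""
    return [[[1.0] + list(row) for row in X] for X in X_parts]


# Two hypothetical workers, each holding a few rows with one feature
X_parts = [
    [[0.0], [1.0]],
    [[2.0], [3.0]],
]
X_with_intercept = add_intercept_column(X_parts)
print(X_with_intercept)
```

With this column in place, the first fitted coefficient plays the role of the intercept.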
-
static
_get_workers(ptrs)¶ Method that returns the pool of workers in a tuple
-
static
_remote_dot_products(X_ptrs, y_ptrs)¶ Computes the aggregated dot products remotely. This corresponds to the Compression stage (or “Compression within”) of Bloom’s algorithm.
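The idea behind compressing with dot products can be sketched in plain Python: each worker reduces its rows to small summaries, and only those summaries are combined. The sketch assumes the summaries are the normal-equation terms X^T X and X^T y, and it works entirely in the clear, whereas the class combines them under SMPC. All function names here are hypothetical.

```python
def local_products(X, y):
    """Compression on one worker: summarise its rows as (X^T X, X^T y)."""
    d = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    return xtx, xty


def aggregate(parts):
    """Sum the per-worker summaries element-wise."""
    d = len(parts[0][1])
    xtx = [[sum(p[0][i][j] for p in parts) for j in range(d)] for i in range(d)]
    xty = [sum(p[1][i] for p in parts) for i in range(d)]
    return xtx, xty


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


# Two hypothetical workers holding rows of y = 1 + 2 * x (first column is the intercept)
worker_a = ([[1.0, 0.0], [1.0, 1.0]], [1.0, 3.0])
worker_b = ([[1.0, 2.0], [1.0, 3.0]], [5.0, 7.0])

xtx, xty = aggregate([local_products(*worker_a), local_products(*worker_b)])
coef = solve(xtx, xty)
print(coef)
```

The combined summaries have size n_features × n_features regardless of how many rows each worker holds, so no raw rows leave a worker.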
Method that secret-shares a list of remote tensors between a worker of the pool and the “Honest but Curious” worker, using a crypto_provider worker.
-
_compute_pvalues(self)¶ Computes the p-values of the coefficients (and of the intercept if fit_intercept==True).
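For readers unfamiliar with two-sided p-values, the following sketch computes one from a coefficient and its standard error. It deliberately uses a standard normal approximation, available in the Python standard library, rather than a Student t distribution; that substitution is an assumption of this sketch and is only close when the degrees of freedom are large. The function name two_sided_pvalue is hypothetical.

```python
import math


def two_sided_pvalue(coef: float, std_err: float) -> float:
    """P(|Z| >= |coef / std_err|) for a standard normal Z."""
    z = coef / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the normal CDF Phi
    return math.erfc(abs(z) / math.sqrt(2.0))


p_null = two_sided_pvalue(0.0, 1.0)      # statistic of zero: no evidence against the null
p_border = two_sided_pvalue(1.96, 1.0)   # the familiar 5% threshold
print(p_null, p_border)
```

A small p-value indicates that a coefficient this far from zero would be unlikely if the null hypothesis (coefficient equal to zero) were true.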
-
class
syft.frameworks.torch.linalg.lr.DASH(crypto_provider: BaseWorker, hbc_worker: BaseWorker, precision_fractional: int = 6)¶ Distributed Association Scan Hammer (DASH) algorithm, based on Jonathan Bloom’s algorithm. It uses Secure Multi-Party Computation (SMPC) in the combine phase. Training is performed in SMPC, but the final regression coefficients are public at the end.
Reference: Section 2 of https://arxiv.org/abs/1901.09531
- Parameters
crypto_provider – a BaseWorker providing crypto elements for AdditiveSharingTensors (ASTs), such as Beaver triples
hbc_worker – the “Honest but Curious” BaseWorker. SMPC operations in PySyft use SecureNN protocols, which are based on 3-party computation. To apply them with more than 3 parties, an “Honest but Curious” worker is needed. To perform the DASH algorithm, one of the workers in the pool is chosen at random and all tensors are secret-shared with the chosen worker, the crypto provider and the “Honest but Curious” worker. The main role of this worker is to prevent the collusion between two pool workers that would be possible if the algorithm instead secret-shared the tensors with two randomly chosen workers and the crypto provider. An “Honest but Curious” worker is a legitimate participant in a communication protocol who will not deviate from the defined protocol but will attempt to learn all possible information from legitimately received messages.
precision_fractional – precision chosen for FixedPrecisionTensors
-
coef¶ torch.Tensor of shape (n_features, ). Estimated coefficients for the DASH algorithm.
-
pvalue¶ numpy.array of shape (n_features, ). Two-sided p-values for hypothesis tests whose null hypothesis is that each coefficient is zero.
-
fit(self, X_ptrs: List[torch.Tensor], C_ptrs: List[torch.Tensor], y_ptrs: List[torch.Tensor])¶
-
get_coeff(self)¶
-
get_standard_errors(self)¶
-
get_p_values(self)¶
-
_check_ptrs(self, X_ptrs, C_ptrs, y_ptrs)¶ Method that checks whether the lists of pointers corresponding to the response vector, transient covariate vectors and independent permanent covariate vectors have the expected elements. Along the way, it also computes some of the Regressor’s attributes, such as the degrees of freedom and the total sample size.
-
static
_get_workers(ptrs)¶ Method that returns the pool of workers in a tuple
-
static
_remote_dot_products(X_ptrs, C_ptrs, y_ptrs)¶ Computes the aggregated dot products remotely. This corresponds to the Compression stage (or “Compression within”) of the DASH algorithm.
-
static
_remote_qr(C_ptrs)¶ Performs the QR decompositions of the permanent covariate matrices remotely. It returns a list with the upper triangular matrices (the R factors) located on each worker.
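As background on what an R factor is, the sketch below computes the R factor of a small matrix with modified Gram-Schmidt in plain Python. It is a local, in-the-clear illustration under the assumption that C has full column rank; it is not the routine the module runs on the workers, and the name qr_r is hypothetical.

```python
import math


def qr_r(C):
    """Return the upper triangular factor R of C = Q R (modified Gram-Schmidt)."""
    n_rows, n_cols = len(C), len(C[0])
    # Work on a column-major copy so each column can be updated in place
    cols = [[C[i][j] for i in range(n_rows)] for j in range(n_cols)]
    R = [[0.0] * n_cols for _ in range(n_cols)]
    for j in range(n_cols):
        R[j][j] = math.sqrt(sum(v * v for v in cols[j]))
        q = [v / R[j][j] for v in cols[j]]
        # Remove the component along q from every remaining column
        for k in range(j + 1, n_cols):
            R[j][k] = sum(a * b for a, b in zip(q, cols[k]))
            cols[k] = [b - R[j][k] * a for a, b in zip(q, cols[k])]
    return R


C = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
R = qr_r(C)

# R^T R reproduces C^T C, so R summarises C as far as dot products are concerned
gram = [[sum(R[k][i] * R[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(R, gram)
```

Because R has only n_cols × n_cols entries, it is a compact summary of a covariate matrix with many rows.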
-
static
_inv_upper(R)¶ Inverts an upper triangular matrix (2-dim tensor) in MPC by solving the linear system R * R_inv = I with back substitution.
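Back substitution for R * R_inv = I can be sketched in plain Python as follows. The sketch operates on ordinary floats in the clear, whereas the method above performs the same kind of computation on secret-shared values; the function name inv_upper_clear is hypothetical.

```python
def inv_upper_clear(R):
    """Invert an upper triangular matrix column by column via back substitution."""
    n = len(R)
    R_inv = [[0.0] * n for _ in range(n)]
    for j in range(n):                   # solve R x = e_j for column j of the identity
        for i in reversed(range(n)):     # start at the last row and move upwards
            rhs = 1.0 if i == j else 0.0
            known = sum(R[i][k] * R_inv[k][j] for k in range(i + 1, n))
            R_inv[i][j] = (rhs - known) / R[i][i]
    return R_inv


R = [[2.0, 1.0], [0.0, 4.0]]
R_inv = inv_upper_clear(R)
print(R_inv)
```

Each entry needs only one division by a diagonal element plus sums of products, which keeps the number of operations that must run under MPC small. The diagonal of R must be non-zero for the inverse to exist.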
Method that secret-shares a list of remote tensors between a worker of the pool and the “Honest but Curious” worker, using a crypto_provider worker.
-
_compute_pvalues(self)¶ Computes the p-values of the coefficients.