syft.workers.base¶
Module Contents¶
-
syft.workers.base.logger¶
-
class
syft.workers.base.BaseWorker(hook: FrameworkHook, id: Union[int, str] = 0, data: Union[List, tuple] = None, is_client_worker: bool = False, log_msgs: bool = False, verbose: bool = False, auto_add: bool = True)¶ Bases:
syft.workers.abstract.AbstractWorker, syft.generic.object_storage.ObjectStorage
Contains functionality common to all workers.
Other workers will extend this class to inherit all functionality necessary for PySyft’s protocol. Extensions of this class override two key methods, _send_msg() and _recv_msg(), which define the procedure for sending a binary message to another worker and for receiving one.
At its core, BaseWorker (like every worker) is a collection of objects owned by a certain machine. Each worker defines how it interacts with objects on other workers as well as how other workers interact with objects owned by itself. Objects are either tensors or of any type supported by the PySyft protocol.
- Parameters
hook – A reference to the TorchHook object which is used to modify PyTorch with PySyft’s functionality.
id – An optional string or integer unique id of the worker.
known_workers – An optional dictionary of all known workers on a network which this worker may need to communicate with in the future. Each key should be a worker’s unique ID and each value should be an instance of a worker class which extends BaseWorker. Extensions of BaseWorker will include advanced functionality for adding to this dictionary (node discovery). In some cases, one can initialize this with known workers to help bootstrap the network.
data – An optional list or tuple of data with which to initialize the worker when it is created.
is_client_worker – An optional boolean parameter to indicate whether this worker is associated with an end user client. If so, it assumes that the client will maintain control over when variables are instantiated or deleted as opposed to handling tensor/variable/model lifecycle internally. Set to True if this object is not where the objects will be stored, but is instead a pointer to a worker that exists elsewhere.
log_msgs – An optional boolean parameter to indicate whether all messages should be saved into a log for later review. This is primarily a development/testing feature.
auto_add – Determines whether to automatically add this worker to the list of known workers.
-
abstract
_send_msg(self, message: bin, location: BaseWorker)¶ Sends message from one worker to another.
As the name BaseWorker implies, you should never instantiate this class by itself. Instead, you should extend BaseWorker in a new class which implements _send_msg and _recv_msg, each of which should specify the exact way in which two workers communicate with each other. The easiest example to study is VirtualWorker.
- Parameters
message – A binary message to be sent from one worker to another.
location – A BaseWorker instance that lets you provide the destination to send the message.
- Raises
NotImplementedError – Method not implemented error.
-
abstract
_recv_msg(self, message: bin)¶ Receives the message.
As the name BaseWorker implies, you should never instantiate this class by itself. Instead, you should extend BaseWorker in a new class which implements _send_msg and _recv_msg, each of which should specify the exact way in which two workers communicate with each other. The easiest example to study is VirtualWorker.
- Parameters
message – The binary message being received.
- Raises
NotImplementedError – Method not implemented error.
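The extension pattern described for _send_msg and _recv_msg can be sketched without PySyft itself. The following is a minimal standalone illustration, not PySyft's implementation: ToyBaseWorker and ToyVirtualWorker are hypothetical names, and the message is a plain bytes value delivered in-process.

```python
from abc import ABC, abstractmethod


class ToyBaseWorker(ABC):
    """Hypothetical stand-in for BaseWorker (not the PySyft class)."""

    def __init__(self, id):
        self.id = id

    @abstractmethod
    def _send_msg(self, message: bytes, location: "ToyBaseWorker") -> bytes:
        """Deliver a binary message to `location` and return its binary reply."""
        raise NotImplementedError

    @abstractmethod
    def _recv_msg(self, message: bytes) -> bytes:
        """Handle an incoming binary message and return a binary reply."""
        raise NotImplementedError


class ToyVirtualWorker(ToyBaseWorker):
    """In-process transport: sending is just calling the receiver directly."""

    def _send_msg(self, message, location):
        return location._recv_msg(message)

    def _recv_msg(self, message):
        # Echo the payload back, prefixed with this worker's id.
        return f"{self.id}:".encode() + message


alice = ToyVirtualWorker("alice")
bob = ToyVirtualWorker("bob")
reply = alice._send_msg(b"ping", bob)
print(reply)
```

Because both methods are declared abstract in this sketch, instantiating ToyBaseWorker directly fails, which mirrors the advice never to instantiate the base class by itself.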
-
registration_enabled(self)¶
-
remove_worker_from_registry(self, worker_id)¶ Removes a worker from the dictionary of known workers.
- Parameters
worker_id – The id of the worker to be removed.
-
remove_worker_from_local_worker_registry(self)¶ Removes itself from the registry of hook.local_worker.
-
load_data(self, data: List[Union[FrameworkTensorType, AbstractTensor]])¶ Allows workers to be initialized with data when created
The method registers the individual tensor objects.
- Parameters
data – A list of tensors
-
send_msg(self, message: Message, location: BaseWorker)¶ Implements the logic to send messages.
The message is serialized and sent to the specified location. The response from the location (remote worker) is deserialized and returned back.
Every message uses this method.
- Parameters
msg_type – An integer representing the message type.
message – A Message object
location – A BaseWorker instance that lets you provide the destination to send the message.
- Returns
The deserialized form of the response from the worker at the specified location.
-
recv_msg(self, bin_message: bin)¶ Implements the logic to receive messages.
The binary message is deserialized and routed to the appropriate function. The response is then serialized and returned.
Every message uses this method.
- Parameters
bin_message – A binary serialized message.
- Returns
A binary message response.
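The round trip described for send_msg and recv_msg (serialize, transmit, deserialize, route, serialize the response) can be sketched as follows. This is an illustration under stated assumptions, not PySyft's code: pickle stands in for PySyft's serializer, ToyWorker is a hypothetical class, and the message is assumed to be a (type, payload) pair routed through an illustrative table of handlers.

```python
import pickle


class ToyWorker:
    """Hypothetical worker; pickle stands in for PySyft's serializer."""

    def __init__(self, id):
        self.id = id
        # Routing table: message type -> handler (names are illustrative).
        self._routes = {
            "echo": lambda payload: payload,
            "length": lambda payload: len(payload),
        }

    def send_msg(self, message, location):
        bin_message = pickle.dumps(message)            # serialize
        bin_response = location.recv_msg(bin_message)  # in-process transport
        return pickle.loads(bin_response)              # deserialize the response

    def recv_msg(self, bin_message):
        msg_type, payload = pickle.loads(bin_message)  # deserialize
        response = self._routes[msg_type](payload)     # route to a handler
        return pickle.dumps(response)                  # serialize the response


me, bob = ToyWorker("me"), ToyWorker("bob")
result = me.send_msg(("length", [1, 2, 3]), bob)
print(result)
```

In this sketch only bytes cross the boundary between the two workers, which is the property that lets a subclass swap the in-process call for a real transport.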
-
send(self, obj: Union[FrameworkTensorType, AbstractTensor], workers: BaseWorker, ptr_id: Union[str, int] = None, garbage_collect_data=None, **kwargs)¶ Sends tensor to the worker(s).
Send a syft or torch tensor/object and its child, sub-child, etc (all the syft chain of children) to a worker, or a list of workers, with a given remote storage address.
- Parameters
obj – A syft/framework tensor/object to send.
workers – A BaseWorker object representing the worker(s) that will receive the object.
ptr_id – An optional string or integer indicating the remote id of the object on the remote worker(s).
local_autograd – Use autograd system on the local machine instead of PyTorch’s autograd on the workers.
preinitialize_grad – Initialize gradient for AutogradTensors to a tensor
garbage_collect_data – argument passed down to create_pointer()
Example
>>> import torch
>>> import syft as sy
>>> hook = sy.TorchHook(torch)
>>> bob = sy.VirtualWorker(hook)
>>> x = torch.Tensor([1, 2, 3, 4])
>>> x.send(bob, 1000)
Will result in bob having the tensor x with id 1000
- Returns
A PointerTensor object representing the pointer to the remote worker(s).
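To make the relationship between ptr_id and the returned pointer concrete, here is a small standalone sketch. It is not how PySyft implements send: ToyWorker and ToyPointer are hypothetical, objects are stored in a plain dictionary, and the random id fallback is an assumption borrowed from the register_obj description.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyPointer:
    """Hypothetical pointer record: where the object lives and under which id."""
    location: "ToyWorker"
    id_at_location: int


class ToyWorker:
    def __init__(self, id):
        self.id = id
        self._objects = {}  # storage on this worker: id -> object

    def send(self, obj, worker, ptr_id=None):
        # Assumption: pick a random remote id when the caller gives none.
        if ptr_id is None:
            ptr_id = random.randint(0, int(1e10))
        worker._objects[ptr_id] = obj
        return ToyPointer(location=worker, id_at_location=ptr_id)


me, bob = ToyWorker("me"), ToyWorker("bob")
ptr = me.send([1, 2, 3, 4], bob, ptr_id=1000)
print(ptr.id_at_location, bob._objects[1000])
```

The caller keeps only the lightweight pointer, while the data itself now lives in the receiving worker's storage under the chosen id.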
-
execute_command(self, message: tuple)¶ Executes commands received from other workers.
- Parameters
message – A tuple specifying the command and the args.
- Returns
A pointer to the result.
-
execute_plan_command(self, message: tuple)¶ Executes commands related to plans.
This method is intended to execute all commands related to plans and to avoid having several new message types specific to plans.
- Parameters
message – A tuple specifying the command and args.
-
send_command(self, recipient: BaseWorker, message: tuple, return_ids: str = None)¶ Sends a command through a message to a recipient worker.
- Parameters
recipient – A recipient worker.
message – A tuple representing the message being sent.
return_ids – A list of strings indicating the ids of the tensors that should be returned as response to the command execution.
- Returns
A list of PointerTensors or a single PointerTensor if just one response is expected.
-
get_obj(self, obj_id: Union[str, int])¶ Returns the object from registry.
Look up an object from the registry using its ID.
- Parameters
obj_id – A string or integer id of an object to look up.
-
respond_to_obj_req(self, request_msg: tuple)¶ Returns the deregistered object from registry.
- Parameters
request_msg (tuple) – Tuple containing object id, user credentials and reason.
-
register_obj(self, obj: object, obj_id: Union[str, int] = None)¶ Registers the specified object with the current worker node.
Selects an id for the object, assigns a list of owners, and establishes whether it’s a pointer or not. This method is generally not used by the client and is instead used by internal processes (hooks and workers).
- Parameters
obj – A torch Tensor or Variable object to be registered.
obj_id (int or string) – random integer between 0 and 1e10 or string uniquely identifying the object.
-
de_register_obj(self, obj: object, _recurse_torch_objs: bool = True)¶ De-registers the specified object with the current worker node.
- Parameters
obj – the object to deregister
_recurse_torch_objs – A boolean indicating whether the object is more complex and needs to be explored.
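The register, look-up, and de-register life cycle described above can be sketched with a plain dictionary. This is a simplified illustration, not PySyft's ObjectStorage: ToyRegistry is a hypothetical class, and for brevity its de_register_obj takes an id rather than the object itself.

```python
import random


class ToyRegistry:
    """Hypothetical object store keyed by id (not PySyft's ObjectStorage)."""

    def __init__(self):
        self._objects = {}

    def register_obj(self, obj, obj_id=None):
        # Assumption: draw a random integer id when none is supplied.
        if obj_id is None:
            obj_id = random.randint(0, int(1e10))
        self._objects[obj_id] = obj
        return obj_id

    def get_obj(self, obj_id):
        return self._objects[obj_id]

    def de_register_obj(self, obj_id):
        # Removing an unknown id is treated as a no-op in this sketch.
        self._objects.pop(obj_id, None)


registry = ToyRegistry()
auto_id = registry.register_obj("weights")
registry.register_obj("bias", obj_id="bias-0")
print(registry.get_obj("bias-0"))
```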
-
send_obj(self, obj: object, location: BaseWorker)¶ Send a torch object to a worker.
- Parameters
obj – A torch Tensor or Variable object to be sent.
location – A BaseWorker instance indicating the worker which should receive the object.
-
request_obj(self, obj_id: Union[str, int], location: BaseWorker, user=None, reason: str = '')¶ Returns the requested object from specified location.
- Parameters
obj_id (int or string) – A string or integer id of an object to look up.
location (BaseWorker) – A BaseWorker instance that lets you provide the lookup location.
user (object, optional) – user credentials to perform user authentication.
reason (string, optional) – a description of why the data scientist wants to see it.
- Returns
A torch Tensor or Variable object.
-
get_worker(self, id_or_worker: Union[str, int, 'BaseWorker'], fail_hard: bool = False)¶ Returns the worker id or instance.
Allows for resolution of worker ids to workers to happen automatically while also making the current worker aware of new ones when discovered through other processes.
If you pass in an ID, it will try to find the worker object reference within self._known_workers. If you instead pass in a reference, it will save that as a known_worker if it does not exist as one.
This method is useful because often tensors have to store only the ID to a foreign worker which may or may not be known by the worker that is de-serializing it at the time of deserialization.
- Parameters
id_or_worker – A string or integer id of the object to be returned or the BaseWorker object itself.
fail_hard (bool) – A boolean parameter indicating whether we want to throw an exception when a worker is not registered at this worker or we just want to log it.
- Returns
A string or integer id of the worker or the BaseWorker instance representing the worker.
Example
>>> import syft as sy
>>> hook = sy.TorchHook(verbose=False)
>>> me = hook.local_worker
>>> bob = sy.VirtualWorker(id="bob", hook=hook, is_client_worker=False)
>>> me.add_worker([bob])
>>> bob
<syft.core.workers.virtual.VirtualWorker id:bob>
>>> # we can get the worker using its id
>>> me.get_worker('bob')
<syft.core.workers.virtual.VirtualWorker id:bob>
>>> # or we can get the worker by passing in the worker
>>> me.get_worker(bob)
<syft.core.workers.virtual.VirtualWorker id:bob>
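The resolution logic described for get_worker (id in, instance out where possible, with fail_hard choosing between raising and logging) can be sketched as below. This is a standalone approximation, not the PySyft source: ToyWorker and WorkerNotFound are hypothetical names, and returning the bare id for an unknown worker is an assumption based on the documented return value.

```python
import logging

logger = logging.getLogger("toy_worker")


class WorkerNotFound(Exception):
    """Hypothetical error type used only in this sketch."""


class ToyWorker:
    def __init__(self, id):
        self.id = id
        self._known_workers = {id: self}

    def get_worker(self, id_or_worker, fail_hard=False):
        if isinstance(id_or_worker, ToyWorker):
            # A reference was passed: remember it if it is new, then return it.
            self._known_workers.setdefault(id_or_worker.id, id_or_worker)
            return self._known_workers[id_or_worker.id]
        # An id was passed: look it up among the known workers.
        if id_or_worker in self._known_workers:
            return self._known_workers[id_or_worker]
        if fail_hard:
            raise WorkerNotFound(id_or_worker)
        logger.warning("Worker %s is unknown to worker %s", id_or_worker, self.id)
        return id_or_worker  # fall back to the bare id


me, bob = ToyWorker("me"), ToyWorker("bob")
unknown = me.get_worker("bob")   # not known yet: the bare id comes back
known = me.get_worker(bob)       # passing the instance registers it
resolved = me.get_worker("bob")  # now the id resolves to the instance
```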
-
_get_worker(self, worker: AbstractWorker)¶
-
_get_worker_based_on_id(self, worker_id: Union[str, int], fail_hard: bool = False)¶
-
add_worker(self, worker: BaseWorker)¶ Adds a single worker.
Adds a worker to the list of _known_workers internal to the BaseWorker. Endows this class with the ability to communicate with the remote worker being added, such as sending and receiving objects, commands, or information about the network.
- Parameters
worker (BaseWorker) – A BaseWorker object representing the pointer to a remote worker, which must have a unique id.
Example
>>> import torch
>>> import syft as sy
>>> hook = sy.TorchHook(verbose=False)
>>> me = hook.local_worker
>>> bob = sy.VirtualWorker(id="bob", hook=hook, is_client_worker=False)
>>> me.add_worker([bob])
>>> x = torch.Tensor([1,2,3,4,5])
>>> x
 1
 2
 3
 4
 5
[syft.core.frameworks.torch.tensor.FloatTensor of size 5]
>>> x.send(bob)
FloatTensor[_PointerTensor - id:9121428371 owner:0 loc:bob id@loc:47416674672]
>>> x.get()
 1
 2
 3
 4
 5
[syft.core.frameworks.torch.tensor.FloatTensor of size 5]
-
add_workers(self, workers: List['BaseWorker'])¶ Adds several workers in a single call.
- Parameters
workers – A list of BaseWorker representing the workers to add.
-
__str__(self)¶ Returns the string representation of BaseWorker.
A to-string method for all classes that extend BaseWorker.
- Returns
The Type and ID of the worker
Example
A VirtualWorker instance with id ‘bob’ would return the following string value:
>>> import syft as sy
>>> bob = sy.VirtualWorker(id="bob")
>>> bob
<syft.workers.virtual.VirtualWorker id:bob>
Note
__repr__ calls this method by default.
-
__repr__(self)¶ Returns the official string representation of BaseWorker.
-
__getitem__(self, idx)¶
-
static
is_tensor_none(obj)¶
-
request_is_remote_tensor_none(self, pointer: PointerTensor)¶ Sends a request to the remote worker that holds the target of a pointer, asking whether the value of the remote tensor is None. Note that the pointer must be valid: if there is no target (which is different from having a target equal to None), it will return an error.
- Parameters
pointer – The pointer on which we want to get information.
- Returns
A boolean stating if the remote value is None.
-
static
get_tensor_shape(tensor: FrameworkTensorType)¶ Returns the shape of a tensor cast into a list, to bypass the serialization of a torch.Size object.
- Parameters
tensor – A torch.Tensor.
- Returns
A list containing the tensor shape.
-
request_remote_tensor_shape(self, pointer: PointerTensor)¶ Sends a request to the remote worker that holds the target of a pointer to obtain the target’s shape.
- Parameters
pointer – A pointer on which we want to get the shape.
- Returns
A torch.Size object for the shape.
-
fetch_plan(self, plan_id: Union[str, int], location: BaseWorker, copy: bool = False)¶ Fetches a copy of the plan with the given plan_id from the worker registry.
This method is executed for local execution.
- Parameters
plan_id – A string indicating the plan id.
- Returns
A plan if a plan with the given plan_id exists. Returns None otherwise.
-
_fetch_plan_remote(self, plan_id: Union[str, int], copy: bool)¶ Fetches a copy of the plan with the given plan_id from the worker registry.
This method is executed for remote execution.
- Parameters
plan_id – A string indicating the plan id.
- Returns
A plan if a plan with the given plan_id exists. Returns None otherwise.
-
fetch_protocol(self, protocol_id: Union[str, int], location: BaseWorker, copy: bool = False)¶ Fetches a copy of the protocol with the given protocol_id from the worker registry.
This method is executed for local execution.
- Parameters
protocol_id – A string indicating the protocol id.
- Returns
A protocol if a protocol with the given protocol_id exists. Returns None otherwise.
-
_fetch_protocol_remote(self, protocol_id: Union[str, int], copy: bool)¶ Target function of fetch_protocol; finds and returns a protocol.
-
search(self, query: Union[List[Union[str, int]], str, int])¶ Search for a match between the query terms and a tensor’s Id, Tag, or Description.
Note that the query is an AND query meaning that every item in the list of strings (query*) must be found somewhere on the tensor in order for it to be included in the results.
- Parameters
query – A list of strings to match against.
me – A reference to the worker calling the search.
- Returns
A list of PointerTensors.
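The AND semantics of the query can be illustrated with a short standalone sketch. It is not PySyft's search implementation: ToyTensor is a hypothetical stand-in that carries an id, tags, and a description, and the function returns the matched objects themselves rather than PointerTensors.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTensor:
    """Hypothetical stand-in carrying the searchable attributes."""
    id: int
    tags: set = field(default_factory=set)
    description: str = ""


def search(objects, query):
    # Accept a single term or a list of terms.
    terms = query if isinstance(query, list) else [query]
    results = []
    for obj in objects:
        def matches(term):
            return (
                term == obj.id
                or term in obj.tags
                or (isinstance(term, str) and term in obj.description)
            )
        # AND semantics: every term must match somewhere on the object.
        if all(matches(term) for term in terms):
            results.append(obj)
    return results


store = [
    ToyTensor(1, {"#mnist", "#train"}, "MNIST training images"),
    ToyTensor(2, {"#mnist", "#test"}, "MNIST test images"),
]
hits = search(store, ["#mnist", "#train"])
print([t.id for t in hits])
```

Adding a term can only narrow the result set in this sketch, which is the practical consequence of an AND query.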
-
request_search(self, query: List[str], location: BaseWorker)¶
-
_get_msg(self, index)¶ Returns a decrypted message from msg_history. Mostly useful for testing.
- Parameters
index – the index of the message you’d like to receive.
- Returns
A decrypted messaging.Message object.
-
static
create_message_execute_command(command_name: str, command_owner=None, return_ids=None, *args, **kwargs)¶ Helper function that creates a message tuple for the execute_command call.
- Parameters
command_name – name of the command that shall be called
command_owner – owner of the function (None for torch functions, “self” for classes derived from workers.base or ptr_id for remote objects)
return_ids – optionally set the ids of the return values (for remote objects)
*args – will be passed to the call of command_name
**kwargs – will be passed to the call of command_name
- Returns
(command_name, command_owner, args, kwargs), return_ids
- Return type
tuple
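The documented return shape, (command_name, command_owner, args, kwargs), return_ids, can be reproduced with a few lines of plain Python. The function below is a sketch that mirrors the documented signature only; the actual PySyft helper may normalize return_ids or wrap the result differently.

```python
def create_message_execute_command(
    command_name, command_owner=None, return_ids=None, *args, **kwargs
):
    """Sketch only: pack a command and its arguments into the documented tuple."""
    return (command_name, command_owner, args, kwargs), return_ids


message, return_ids = create_message_execute_command(
    "add", None, ["result-0"], 1, 2, alpha=0.5
)
print(message)
print(return_ids)
```

Because *args follows return_ids in the signature, command_owner and return_ids have to be passed positionally (even as None) before any positional arguments meant for the command itself.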
-
property
serializer(self, workers=None)¶ Define the serialization strategy to adopt depending on the workers it’s connected to. This is relevant in particular for Tensors which can be serialized in an efficient way between workers which share the same Deep Learning framework, but must be converted to lists or json-like objects in other cases.
- Parameters
workers – (Optional) the list of workers involved in the serialization. If not provided, self._known_workers is used.
- Returns
‘all’: serialization must be compatible with all kinds of workers
‘torch’: serialization will only work between workers that support PyTorch
(more to come: ‘tensorflow’, ‘numpy’, etc)
- Return type
A str code
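One way to picture the choice between the two codes is the sketch below. It is an assumption-laden illustration, not the PySyft property: ToyWorker and its framework attribute are hypothetical, and treating an empty worker list as ‘all’ is a conservative choice made only for this example.

```python
class ToyWorker:
    def __init__(self, id, framework):
        self.id = id
        # Hypothetical attribute; real workers may expose this differently.
        self.framework = framework


def choose_serializer(workers):
    """Return 'torch' only if every worker supports PyTorch, otherwise 'all'."""
    if workers and all(w.framework == "torch" for w in workers):
        return "torch"
    return "all"


torch_only = [ToyWorker("alice", "torch"), ToyWorker("bob", "torch")]
mixed = torch_only + [ToyWorker("carol", "tensorflow")]
print(choose_serializer(torch_only), choose_serializer(mixed))
```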
-
static
simplify(_worker: AbstractWorker, worker: AbstractWorker)¶
-
static
detail(worker: AbstractWorker, worker_tuple: tuple)¶ This function reconstructs a worker given its attributes in the form of a tuple.
- Parameters
worker – the worker doing the deserialization
worker_tuple – a tuple holding the attributes of the worker
- Returns
A worker id or worker instance.
-
static
force_simplify(_worker: AbstractWorker, worker: AbstractWorker)¶
-
static
force_detail(worker: AbstractWorker, worker_tuple: tuple)¶