Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Websocket for Client Service API #506

Merged
merged 23 commits into from
Mar 1, 2023
Merged

Conversation

tegefaulkes
Copy link
Contributor

@tegefaulkes tegefaulkes commented Feb 15, 2023

Description

This PR is for implementing a Websocket client and server pair for polykey client communication.

Issues Fixed

Tasks

Looking at about 5 days of work.

  • 1. Create a ClientServer for websocket based communication.
    • Create a ClientServer class for handling life-cycle of the server.
    • Add Secure connection with TLS, use encryption when writing files and reading them with the server. 0.5 days - Addressed later in issue Encrypted private key PEM format #508
    • Apply back-pressure propagation to the streams. 0.5 days
    • Support half open connections by only closing the websocket when both streams have closed.
    • Handle abruptly dropped connections. 0.5 days
    • Refuse any non-websocket communication. trivial
    • Support IPv6
  • 2. Create a ClientClient for websocket based communication.
    • Create a ClientClient class for handling life-cycle of the client. 0.5 days
    • Make the connection secure using TLS. 0.5 days
    • Add custom authentication logic for checking the server's certificate against a NodeId. 0.5 days
    • Apply back-pressure propagation to the streams. 1 days
    • Support half open connections by only closing the websocket when both streams have closed. 0.5 days
    • Handle abruptly dropped connections. 0.5 days
    • Connection timeout logic
    • Support IPv6

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@ghost
Copy link

ghost commented Feb 15, 2023

👇 Click on the image for a new way to code review

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map legend

@tegefaulkes
Copy link
Contributor Author

I'm looking over how to use uWebsocket and there are some things that may be issues.

  1. SSL certs are provided as files, we can't provide the certificate from memory.
  2. Certs are set when creating the server. I don't see a way to update them without restarting the server.
  3. There is no websocket client provided. So if we use this for the server side we still need something else for the client side of the connection.

@tegefaulkes
Copy link
Contributor Author

I think we can use this PR to update the names for the client and server. Before this was referred as the ClientClient and ClientServer. I feel this is a little confusing on a cold read. Maybe we should rename it to ControlClient and ControlServer? Or anything besides client to refer to the client communication.

@tegefaulkes
Copy link
Contributor Author

I have a really basic server working now. I need to make it much more robust and make the connection secure.

@CMCDragonkai CMCDragonkai changed the title Websocket based client communication Websocket for Client Service API Feb 15, 2023
@CMCDragonkai
Copy link
Member

It's the ClientService API. You can expand the name and say ClientServiceClient. But ClientClient is preferred to be short. Let's not change it right now.

@CMCDragonkai
Copy link
Member

@tegefaulkes you also need to consider how TLS will work here. Especially with uWebsockets, the TLS may be something that is built in, meaning the native library ships with OpenSSL. I see that it says it can build with openssl or wolfssl. Nodejs has its own TLS library of course, I wonder if that can be integrated, so it can use the existing openssl library in nodejs, but of course on non-node platforms the SSL library has to come from elsewhere.

Are you using the JS wrapper they are supplying or are you creating your own native bridge? If you are doing the native bridge you'll need to use the C++ style I'm using in our js-db which binds to rocksdb.

There is another framework, a bit more advanced called node-addon-api. I mentioned upgrading to that properly here: MatrixAI/js-db#37

For the purposes of speed though, using their existing wrapper will be sufficient, we can improve on it later.

@CMCDragonkai
Copy link
Member

@tegefaulkes can you expand the tasks here in this PR based on the issue spec.

@tegefaulkes
Copy link
Contributor Author

I've implemented server side writable back-pressure. ws.send() returns a number based on the result. the 3 results are 1: success, 0: backpressure and 2: message dropped. So to apply back-pressure I'm using a promise as a semaphore, in the writable stream write() method i'm awaiting the back-pressure before writing. There is a drain event that resolves the promise when back-pressure is released.

I'm looking into applying back-pressure to the readable side but I can't find a mechanism for applying it. I'd expect some way to pause any messages from being emitted for a websocket. There is a cork() method but that seems to be used for grouping outgoing messages?. Their example only shows writing back-pressure with ws.send(). I'm still looking into it.

@tegefaulkes
Copy link
Contributor Author

I can't find any way to implement back-pressure for receiving data. These two issues seem to support that pausing reads is not allowed.

uNetworking/uWebSockets.js#661
uNetworking/uWebSockets#1020

This means I don't have a way of propagating back-pressure from server reading to the client writing. The best I can do is throw an error if we exceed a buffered amount.

This would only be a problem if the stream handler can't keep up with reading. This could be likely if we do client stream where we stream something like a file. I'm not sure what to do really.

@CMCDragonkai
Copy link
Member

With respect to TLS, the encrypted TLS certificate should be using one of the PKCS standards. You should be able to use webcrypto to encrypt the file with a password. The libsodium used in keys domain won't have that supported. In the future we can build our own UWS JS binding and we can compile it in different way, or add in a JS method to the C++ code if it supports TLS certificate structure.

@CMCDragonkai
Copy link
Member

Actually we have keys/x509. You maybe be able to encrypt the x509 cert using the functions there.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Feb 17, 2023

Based on the linked issues, the pasuse/resume does exist in UWS, but perhaps not UWS.js.

There is a comment saying that pausing a node readable stream just does infinite buffering. Not sure if that is true.

But in the QUIC, pausing is a proper pause as it signals the other side to stop sending data, and shouldn't be acknowledging any data being sent.

@CMCDragonkai
Copy link
Member

You should create an issue upstream on UWSJS, and refer to those issues you already found, and ask for this feature.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Feb 17, 2023

If the underlying UWS can't do it, then all you can do is buffering.

The chrome is introducing WebSocketStream (https://developer.chrome.com/en/articles/websocketstream/) which is supposed to have backpressure on both ends.

Reference this discussion: uNetworking/uWebSockets.js#501

@CMCDragonkai
Copy link
Member

While the upstream issue is being awaited on, we can go with an unbound buffer for the client service. Clients are expected to be trusted.

@CMCDragonkai
Copy link
Member

According to uNetworking/uWebSockets#1034 no way to get the remote port right now, but not relevant to client service.

@CMCDragonkai
Copy link
Member

The design of WebSocketStream might be useful to follow for our adaptation of WS into a valid stream. See the chrome page on it.

@CMCDragonkai
Copy link
Member

I read through the SO posts on this, by design the WS protocol doesn't have backpressure, at least on the client side. The only thing bringing this in is the WebSocketStream that chrome is supporting, but no other platforms atm. Unbounded buffering is the only option right now that will work across the board.

This should be sufficient for now, as secrets are unlikely to be LARGE data anyway.

@tegefaulkes
Copy link
Contributor Author

I'm looking at the byob readable stream. Oddly the API seems to be different from the documentation. I'm expecting to be provided a controller.byobRequest that contains a view that I can write into, But the controller.byobRequest has the type undefined. It seems like the usual API here isn't supported? I'm not sure what's going on.

The only available usage is identical to a normal ReadableStream<Uint8Array>. En-queuing data and reading it is identcal. The only difference I can find is when I pass the message to enqueue, the original reference is detached? zeroed? replaced? for the BYOB stream.

That is to say. For a regular stream when you enqueue the data it is not actually copied. So you can modify the message afterwards and have the read result changed. For the BYOB stream the enqueue message can't be modified in this way. I can't say that the BYOB version is zero copy in this case. but I know that the regular readable stream is.

Now not having the expected interface for the BYOB streams is not actually a problem. I don't think they're needed for our use case. It seems that the BYOB stream is used for when you need to read the bytes into a buffer you have. That is to say, the buffer referred to in the BYOB acronym is not internal to the readable stream but provided by whatever is consuming the reader. The usage mimics how you're read a file into a buffer. The expected usage is as follows.

const readableByteStream = new ReadableStream({...});
const reader = readablebyteStream({mode: 'byob'})'
const buffer = Buffer.alloc(100);
// Our buffer is provided to the stream internals to be written to
const {value, done} = await reader.read(buffer); 
// value is the bytes written
// done is if the stream ended

Since we don't consume the streams in this way I think it's unnecessary to implement the readable stream as a BYOB. But like I said. this interface isn't actually provided in our version despite all the docs saying it's available?

Unless I'm reading into this wrong or I've misunderstood something. I'm not going to implement the readable as a BYOB stream as it doesn't fit our usage.

@CMCDragonkai
Copy link
Member

I believe the purpose of BYOB stream is to be able to have fixed size buffer. This is essential for us.

And yes it's not zero copy, it's supposed to copy whatever was written in to an intermediate fixed size buffer.

A fixed size buffer is necessary as otherwise your data transfer is objects not bytes.

TS API may be out of date. Ensure you're using the latest typescript version. Use like 4.8 or 4.9 or whatever it is.

@CMCDragonkai
Copy link
Member

Or if you're talking about node, even node types can be upgraded.

@tegefaulkes
Copy link
Contributor Author

That's the thing, I don't think it's used for having a fixed sized buffer for the readable stream. I think it's just to support a BYOB reader so it can write into a provided buffer without extra copying. See the example I gave above.

We can enforce a fixed size buffering with the regular readable stream already. We just need to use the queuing strategy options to set the high water mark and a size function to map chunks to their actual size. I've used this to enforce a maximum buffered bytes for the readable stream in the ClientServer.

Besides that. I had a look a the @types/node where the interfaces seem to be defined. Even the latest version of this doesn't have the expected interface with the BYOB. The lib.dom types do have the expected interface.

@CMCDragonkai
Copy link
Member

That's the thing, I don't think it's used for having a fixed sized buffer for the readable stream. I think it's just to support a BYOB reader so it can write into a provided buffer without extra copying. See the example I gave above.

We can enforce a fixed size buffering with the regular readable stream already. We just need to use the queuing strategy options to set the high water mark and a size function to map chunks to their actual size. I've used this to enforce a maximum buffered bytes for the readable stream in the ClientServer.

Besides that. I had a look a the @types/node where the interfaces seem to be defined. Even the latest version of this doesn't have the expected interface with the BYOB. The lib.dom types do have the expected interface.

I'm confused, every enqueue operation takes an entire buffer object, which can be or arbitrary size. Thus it's not a fixed size buffer queue.

https://github.com/whatwg/streams/blob/main/byte-streams-explainer.md

Still it seems like a good idea to use it anyway. Even if the TS interface is out of date, you can just do whatever the current node runtime supports. Just ignore the types?

@tegefaulkes
Copy link
Contributor Author

Each enqueued buffer can be an arbitrary size. but using the queuing strategy we can enforce an overall queue limit. The strategy defaults to a limit of 1 chunk. But I can tell it to queue up a certain amount of bytes. This is done by specifying the high water mark in bytes and a function that returns the number of bytes each chunk contains. Then inside the stream the controller.desiredSize is the amount of free space in the queue in bytes. I can use that to respect the back-pressure or throw an error.

The example you provided doesn't show it using a single internal buffer for buffering the data. It shows it directly writing the data into the buffer provided as a view. I can go over this in more detail in a meeting, I'm focusing on other parts right now.

@CMCDragonkai
Copy link
Member

I see, but there are other benefits to using BYOB mentioned in the link. I haven't fully understood what the implementation complexity with this yet, so we can go through it next meeting.

@tegefaulkes
Copy link
Contributor Author

Just something to keep note of. The ClientClient doesn't have a persistent connection between stream creations. Each new stream is technical a new connection. So while I can support providing multiple NodeIds we authenticate the server against. I can guarantee that it's the same node for each created stream.

@CMCDragonkai
Copy link
Member

Every stream could be a new connection. But on QUIC, every stream is multiplexed onto underlying connections.

For websockets, every stream is indeed a separate connection, unless the underlying websocket protocol is on HTTP2, in which case streams are in fact multiplexed on multiple HTTP2 connections.

The RPC isn't aware of connections anymore, it only considers streams.

@tegefaulkes
Copy link
Contributor Author

Added timeout logic to the ClientClient. We can provide the timeout in milliseconds when creating the ClientClient or to each connection as a Timer. If none is provided we depend on the websocket's timeout behaviour.

This one is hard to test since the websocket is pretty responsive to bad connections. Giving it a port without anything running gives a connection refused error. Using a random IP address with nothing running gives a different error immediately. So far as I can tell, if we can't establish a connection we'd end up with an error very quickly with websockets. But we have the option of our own timeout now.

@tegefaulkes
Copy link
Contributor Author

For the secure TLS certs I need to create a encrypted private key PEM. This is pretty similar to the current private key PEM that we can create using the keys utils privateKeyToPem utilities. There are two differences though. The PEM has a different header and footer for encrypted files and the private key is encrypted using PKCS#5 standard? The whole thing is a headache...

I need to do more digging.

@tegefaulkes
Copy link
Contributor Author

Cleaned up history and re-based on staging.

@tegefaulkes
Copy link
Contributor Author

Created a new issue for the private key problem at #508. I'll add it to the epic and remove it from the tasks here.

@tegefaulkes
Copy link
Contributor Author

This is ready for review. We'll likely go over it on monday.

@CMCDragonkai
Copy link
Member

Hey @tegefaulkes what about the final checklist?

Yes we can go through meeting today about this.

Copy link
Member

@CMCDragonkai CMCDragonkai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with a new domain src/websockets and tests/websockets.

src/clientRPC/ClientClient.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientServer.ts Outdated Show resolved Hide resolved
src/clientRPC/ClientClient.ts Outdated Show resolved Hide resolved
src/clientRPC/utils.ts Outdated Show resolved Hide resolved
@tegefaulkes
Copy link
Contributor Author

Should I add events to the WebSocketStreams as well?

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Feb 28, 2023

Using an EventTarget is fine and all for internal use. But it doesn't exactly have a nice interface for external use. For example, say we have a connection event. we'd handle it using

webSocketServer.addEventListener('connection', (event) => {
  // do things with event
  // `event` has `Event` type, we'd have to convert it to use it properly.
});

// For convenience we want to use it like this with implicit typing.
webSocketServer.addEventListener('connection', (event) => {
  // do things with event
  //`event` has `ConnectionEvent` type
});

// Ideally we have something closer to the EventEmitter interface
webSocketServer.addEventListener('connection', (streamPair, connectionInfo) => {
  // The handler has the args provided directly with the event abstracted away.
  // Would need some complex logic for removing event listeners to work properly.
});

The usage is fine but there is no type data along with it. if we're using this there will be no type hints that the connection event exists and the event object has the Event type which hides any internal detail we'll need. There are ways to fix this by wrapping the EventTarget methods. But that a bit of logic that will need to be replicated across domains so we'd want to standardise it between the websocket and quic system at a minimum.

If we want this to work anything like the EventEmitter interface node has then it will take a lot of work. If we want the EventEmitter interface and the portability then we can make a wrapper class that uses the EventTarget internally and implements the EventEmitter interface. Is this something we'd want?

I'm looking into using this with the WebSocketServer and WebSocketClient but i'm not sure I should continue until we have a standard way of using it. I've been referring to the Quic implementation of it but I don't think it's usage is set in stone yet. Our ways of implementing the convenience may differ.

So I think we should at least discuss this before I continue because I am almost certain whatever I implement will have to change.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Feb 28, 2023

Using an EventTarget is fine and all for internal use. But it doesn't exactly have a nice interface for external use. For example, say we have a connection event. we'd handle it using

webSocketServer.addEventListener('connection', (event) => {
  // do things with event
  // `event` has `Event` type, we'd have to convert it to use it properly.
});

// For convenience we want to use it like this with implicit typing.
webSocketServer.addEventListener('connection', (event) => {
  // do things with event
  //`event` has `ConnectionEvent` type
});

// Ideally we have something closer to the EventEmitter interface
webSocketServer.addEventListener('connection', (streamPair, connectionInfo) => {
  // The handler has the args provided directly with the event abstracted away.
  // Would need some complex logic for removing event listeners to work properly.
});

The usage is fine but there is no type data along with it. if we're using this there will be no type hints that the connection event exists and the event object has the Event type which hides any internal detail we'll need. There are ways to fix this by wrapping the EventTarget methods. But that a bit of logic that will need to be replicated across domains so we'd want to standardise it between the websocket and quic system at a minimum.

If we want this to work anything like the EventEmitter interface node has then it will take a lot of work. If we want the EventEmitter interface and the portability then we can make a wrapper class that uses the EventTarget internally and implements the EventEmitter interface. Is this something we'd want?

I'm looking into using this with the WebSocketServer and WebSocketClient but i'm not sure I should continue until we have a standard way of using it. I've been referring to the Quic implementation of it but I don't think it's usage is set in stone yet. Our ways of implementing the convenience may differ.

So I think we should at least discuss this before I continue because I am almost certain whatever I implement will have to change.

I'm not using event emitter style at all. It's EventTarget all the way.

About the types. Yes you have to assert type in the handler parameter itself. This is normal. It should work I believe. See the src/bin/server.ts.

In the js-quic there's a bunch of event classes, I think if you just use that it will be fine.

@CMCDragonkai
Copy link
Member

Btw TS is correct not to give you a specific type in this instance. This is because dispatching events is actually fully generic. There's no limitation on dispatching a connection event with a completely different subclass.

This is why the resolution of the genericity happens in handler. In Haskell the handler would pattern match the constructors. Here we just assert the type, or a runtime check can be done.

I prefer not to bother with a runtime check when it's obvious (programmer's guarantee) that it would not be dispatched with a different subclass. Kind of like how I originally used ! in other places.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Websocket Client and Server for Client Service API
2 participants